---
license: apache-2.0
language:
- en
pipeline_tag: fill-mask
widget:
- text: >-
    The Standard Model (SM) of [MASK] physics has been tested by many
    experiments over the last four decades and has been shown to successfully
    describe high energy particle interactions.
  example_title: particle physics
- text: >-
    Clear evidence for the production of a neutral boson with a measured mass of
    [MASK].0 ± 0.4 (stat) ± 0.4 (sys) GeV is presented.
  example_title: 126.0 ± 0.4 (stat) ± 0.4 (sys) GeV
- text: >-
    An excess of [MASK] is observed above the expected background, with a local
    significance of 5.0 standard deviations, at a mass near 125 GeV, signalling
    the production of a new particle.
  example_title: excess of events
- text: >-
    On September 14, 2015 at 09:50:45 UTC the two [MASK] of the Laser
    Interferometer Gravitational-Wave Observatory simultaneously observed a
    transient gravitational-wave signal.
  example_title: two detectors
- text: >-
    These first images from the EHT achieve the highest [MASK] resolution in the
    history of ground-based VLBI.
  example_title: angular resolution
- text: >-
    We propose a comprehensive theory of [MASK] matter that explains the recent
    proliferation of unexpected observations in high-energy astrophysics.
  example_title: dark matter
- text: >-
    Formation of galaxy clusters corresponds to the collapse of the largest
    gravitationally bound overdensities in the initial [MASK] field and is
    accompanied by the most energetic phenomena since the Big Bang and by the
    complex interplay between gravity-induced dynamics of collapse and baryonic
    processes associated with galaxy formation.
  example_title: initial density field
- text: >-
    The Event [MASK] Telescope (EHT) has led to the first images of a
    supermassive black hole, revealing the central compact objects in the
    elliptical galaxy M87 and the Milky Way.
  example_title: Event Horizon Telescope
datasets:
- wikipedia
- bookcorpus
- arnosimons/astro-hep-corpus
tags:
- arXiv
- astrophysics
- conceptual analysis
- epistemic change
- high-energy physics (HEP)
- history of science
- semantic shift detection
- sociology of science
- philosophy of science
- physics
- word embeddings
---
# Model Card for Astro-HEP-BERT
**Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built on Google's `bert-base-uncased`, the model was trained for three further epochs on 21.84 million paragraphs from more than 600,000 scholarly arXiv articles on astrophysics and/or HEP. The sole training objective was masked language modeling.
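Since contextualized word embeddings are the model's main purpose, here is a minimal sketch of how one might extract the embedding of a single word in context with the `transformers` library. The repository id `arnosimons/astro-hep-bert`, the helper names `find_subtoken_span` and `word_embedding`, and the choice to average last-layer hidden states over a word's subtokens are assumptions made for illustration. They are not taken from the Astro-HEP-BERT paper and may differ from the pooling strategy used in the project.

```python
from typing import List, Optional, Tuple

# Assumed Hub repository id; adjust if the model is hosted under another name.
MODEL_ID = "arnosimons/astro-hep-bert"


def find_subtoken_span(
    sequence: List[int], pattern: List[int]
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first occurrence of `pattern` in `sequence`,
    with `end` exclusive, or None if the pattern does not occur."""
    n = len(pattern)
    if n == 0:
        return None
    for start in range(len(sequence) - n + 1):
        if sequence[start:start + n] == pattern:
            return start, start + n
    return None


def word_embedding(sentence: str, word: str, model_id: str = MODEL_ID):
    """Hypothetical helper: mean of the last-layer hidden states over the
    WordPiece tokens of the first occurrence of `word` in `sentence`."""
    # Imported lazily so the span helper above works without torch installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    encoded = tokenizer(sentence, return_tensors="pt")
    # Tokenize the target word on its own, without [CLS]/[SEP].
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    span = find_subtoken_span(encoded["input_ids"][0].tolist(), word_ids)
    if span is None:
        raise ValueError(f"{word!r} not found in the tokenized sentence")

    with torch.no_grad():
        hidden = model(**encoded).last_hidden_state[0]  # (seq_len, hidden_size)
    start, end = span
    return hidden[start:end].mean(dim=0)  # one vector for the whole word
```

Calling `word_embedding("We propose a comprehensive theory of dark matter.", "matter")` should return a single vector for that occurrence of "matter". Vectors for the same word in different contexts can then be compared, for example with cosine similarity. Loading the checkpoint through `AutoModel` drops the masked-language-modeling head, so `transformers` may print a warning about unused weights.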
The Astro-HEP-BERT project demonstrates that training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science is feasible as an open-source endeavor without a substantial budget. Using only freely available code, weights, and text inputs, the entire training process ran on a single MacBook Pro laptop (M2, 96 GB).
For further insights into the model, the corpus, and the underlying research project (<a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932" >Network Epistemology in Practice</a>), please refer to the Astro-HEP-BERT paper [link coming soon].
<!-- <a target="_blank" rel="noopener noreferrer" href="">Astro-HEP-BERT paper</a>. -->
## Model Details
- **Developer:** <a target="_blank" rel="noopener noreferrer" href="https://www.tu.berlin/en/hps-mod-sci/arno-simons">Arno Simons</a>
- **Funded by:** The European Union under Grant agreement ID: <a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932" >101044932</a>
- **Language (NLP):** English
- **License:** apache-2.0
- **Parent model:** Google's <a target="_blank" rel="noopener noreferrer" href="https://github.com/google-research/bert">`bert-base-uncased`</a>
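
Because the model was trained with masked language modeling only, a quick way to try it is the `fill-mask` pipeline from the `transformers` library. The sketch below assumes the model is published on the Hub as `arnosimons/astro-hep-bert`. The helper `top_predictions` is a hypothetical name introduced here, and limiting the input to one `[MASK]` token is a simplification for this example, not a restriction of the model.

```python
from typing import List, Tuple

# Assumed Hub repository id; adjust if the model is hosted under another name.
MODEL_ID = "arnosimons/astro-hep-bert"


def top_predictions(
    text: str, k: int = 5, model_id: str = MODEL_ID
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return the k most likely fillers for the single
    [MASK] token in `text` as (token, score) pairs."""
    # With several masks the pipeline returns nested lists, so this sketch
    # deliberately handles exactly one mask.
    if text.count("[MASK]") != 1:
        raise ValueError("text must contain exactly one [MASK] token")

    # Imported lazily so the input check above works without transformers.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

For example, `top_predictions("We propose a comprehensive theory of [MASK] matter.")` returns candidate tokens with their scores, ordered from most to least likely.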
<!---
## How to Get Started with the Model
Use the code below to get started with the model.
[Coming soon]
## Citation
**BibTeX:**
[Coming soon]
**APA:**
[Coming soon]
-->