arnosimons committed on
Commit 7dbd128 · verified · 1 Parent(s): 28a2355

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -24,24 +24,24 @@ datasets:
  - wikipedia
  - bookcorpus
  tags:
- - physics
+ - arXiv
  - astrophysics
+ - conceptual analysis
+ - epistemic change
  - high-energy physics (HEP)
  - history of science
- - philosophy of science
+ - semantic shift detection
  - sociology of science
+ - philosophy of science
+ - physics
  - word embeddings
- - semantic shift detection
- - conceptual change
- - epistemic change
- - arXiv
  ---

  # Model Card for Astro-HEP-BERT

  **Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built upon Google's `bert-base-uncased`, the model underwent additional training for three epochs using 21.84 million paragraphs found in more than 600,000 scholarly articles sourced from arXiv, all pertaining to astrophysics and/or high-energy physics (HEP). The sole training objective was masked language modeling.

- The Astro-HEP-BERT project embodies the spirit of a tabletop experiment or grassroots scientific effort. It exclusively utilized open-source inputs during training, and the entire training process was completed on a single MacBook Pro M2/96GB in 48 days for 3 epochs. This project stands as a proof of concept, showcasing the viability of employing a bidirectional transformer for research ventures in the history, philosophy, and sociology of science (HPSS) even with limited financial resources.
+ The Astro-HEP-BERT project demonstrates the general feasibility of training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science as an open-source endeavor that does not require a substantial budget. Leveraging only freely available code, weights, and text inputs, the entire training process was conducted on a single MacBook Pro Laptop (M2/96GB).

  For further insights into the model, the corpus, and the underlying research project (<a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932" >Network Epistemology in Practice</a>) please refer to the Astro-HEP-BERT paper [link coming soon].