---
license: llama2
language:
- en
---

# Aegolius Acadicus

# Prompting

## Prompt Template (Alpaca style)

```
### Instruction:

<prompt> (without the <>)

### Response:
```
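
For convenience, here is a minimal helper that wraps a plain instruction in the template above. The function name is illustrative, not part of the model's API:

```python
def alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style template this model expects."""
    return f"### Instruction:\n\n{instruction}\n\n### Response:\n"

# Example:
print(alpaca_prompt("Summarize the plot of Hamlet in one sentence."))
```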

## Sample Code

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Keep new tensors on the GPU by default.
torch.set_default_device("cuda")

model = AutoModelForCausalLM.from_pretrained(
    "ibivibiv/aegolius-acadicus-30b",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ibivibiv/aegolius-acadicus-30b")

inputs = tokenizer(
    "### Instruction: Who would win in an arm wrestling match between Abraham Lincoln and Chuck Norris?\n### Response:\n",
    return_tensors="pt",
    return_attention_mask=False,
)

outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
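
At full precision a ~30B-parameter model may not fit on a single consumer GPU. As a hedged sketch (not a recipe from this card), the same model can be loaded in 4-bit with `bitsandbytes` to reduce memory use:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumption: the `bitsandbytes` package is installed; these are common
# defaults, not settings recommended by this model card.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ibivibiv/aegolius-acadicus-30b",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ibivibiv/aegolius-acadicus-30b")
```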

# Model Details
* **Trained by**: [ibivibiv](https://huggingface.co/ibivibiv)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **Model type:** **aegolius-acadicus-30b** is an auto-regressive mixture-of-experts (MoE) language model built from Llama 2 and Mistral transformer models.
* **Language(s)**: English
* **Purpose**: An attempt at an MoE model that covers multiple disciplines, using fine-tuned Llama 2 and Mistral models as base models (see the configuration sketch after this list).
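
As a quick sanity check of the expert layout, the configuration can be inspected without downloading the weights. This is a minimal sketch assuming a Mixtral-style MoE config; the attribute names below are an assumption, not something stated by this card:

```python
from transformers import AutoConfig

# Load only the configuration (no weights) to inspect the MoE layout.
config = AutoConfig.from_pretrained("ibivibiv/aegolius-acadicus-30b")

print(config.model_type)
# Mixtral-style configs expose expert counts under these names; they are
# assumptions here and will print "n/a" if absent.
print(getattr(config, "num_local_experts", "n/a"))
print(getattr(config, "num_experts_per_tok", "n/a"))
```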

# Benchmark Scores

Pending.

## Citations

```
@misc{open-llm-leaderboard,
  author = {Edward Beeching and Clémentine Fourrier and Nathan Habib and Sheon Han and Nathan Lambert and Nazneen Rajani and Omar Sanseviero and Lewis Tunstall and Thomas Wolf},
  title = {Open LLM Leaderboard},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard}}
}
```

```
@software{eval-harness,
  author = {Gao, Leo and Tow, Jonathan and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and McDonell, Kyle and Muennighoff, Niklas and Phang, Jason and Reynolds, Laria and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},
  title = {A framework for few-shot language model evaluation},
  month = sep,
  year = 2021,
  publisher = {Zenodo},
  version = {v0.0.1},
  doi = {10.5281/zenodo.5371628},
  url = {https://doi.org/10.5281/zenodo.5371628}
}
```

```
@misc{clark2018think,
  title = {Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  author = {Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  year = {2018},
  eprint = {1803.05457},
  archivePrefix = {arXiv},
  primaryClass = {cs.AI}
}
```

```
@misc{zellers2019hellaswag,
  title = {HellaSwag: Can a Machine Really Finish Your Sentence?},
  author = {Rowan Zellers and Ari Holtzman and Yonatan Bisk and Ali Farhadi and Yejin Choi},
  year = {2019},
  eprint = {1905.07830},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```

```
@misc{hendrycks2021measuring,
  title = {Measuring Massive Multitask Language Understanding},
  author = {Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
  year = {2021},
  eprint = {2009.03300},
  archivePrefix = {arXiv},
  primaryClass = {cs.CY}
}
```

```
@misc{lin2022truthfulqa,
  title = {TruthfulQA: Measuring How Models Mimic Human Falsehoods},
  author = {Stephanie Lin and Jacob Hilton and Owain Evans},
  year = {2022},
  eprint = {2109.07958},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```

```
@misc{DBLP:journals/corr/abs-1907-10641,
  title = {{WINOGRANDE:} An Adversarial Winograd Schema Challenge at Scale},
  author = {Keisuke Sakaguchi and Ronan Le Bras and Chandra Bhagavatula and Yejin Choi},
  year = {2019},
  eprint = {1907.10641},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```

```
@misc{DBLP:journals/corr/abs-2110-14168,
  title = {Training Verifiers to Solve Math Word Problems},
  author = {Karl Cobbe and Vineet Kosaraju and Mohammad Bavarian and Mark Chen and Heewoo Jun and Lukasz Kaiser and Matthias Plappert and Jerry Tworek and Jacob Hilton and Reiichiro Nakano and Christopher Hesse and John Schulman},
  year = {2021},
  eprint = {2110.14168},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```