Pushed by DataDreamer
README.md CHANGED
@@ -1,199 +1,40 @@
---
---
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
---
+base_model: google/t5-v1_1-base
+tags:
+- datadreamer
+- datadreamer-0.38.0
+- synthetic
+- gpt-4
+- gpt-4
+- text2text-generation
+widget:
+- text: "In the ever-growing field of Natural Language Processing (NLP), understanding the nuances and depth of human expression and delivering contextualized outputs is an essential yet challenging task. The contribution of Deep Learning and Machine Learning methods toward tackling complex language processing tasks necessitates ongoing research. This paper outlines a novel architecture accounting for semantic bridges in the realm of NLP, utilizing sophisticated RNN and LSTM models. We connect phrase-level and sentence-level semantics under a unified framework, contributing towards generating better contextual understanding of textual data and providing detailed insights for tasks such as sentiment analysis and topic modeling. Our architecture outperforms most known models in these tasks due to its ability to consider longer textual context while simultaneously avoiding complications arising from language ambiguity. Our results provide inspiring indications on the benefits of capturing semantic bridges for more robust language models. We carry rigorous evaluations impinging both qualitative and quantitative insights, thereby showcasing our model's impressive generalizability to real-world applications."
+  example_title: "Example 1"
+- text: "Automatic Natural Language Processing technologies have rapidly evolved in recent years, enabling diverse real-life applications and unveiling new challenging aspects. Considerable recognition should be attributed to neural network architectures such as the transformer and several learning techniques. \r\n\r\nIn this paper, we delve deep into an unexplored paradigm: grounding transformer-based Natural Language Processing in external knowledge bases. While recent efforts have shown significant successes topped with the emerging and rekindled interest in the potential neuro-symbolic connection, several research questions conveniently lurk around practical employment, scalability and explainability.\r\n\r\nSpecifically, we introduce and experimentally validate three algorithms to enhance the knowledge-grounded transformer. Each method encompasses the essence of grounding in external knowledge bases and evolves by saturating this groundedness; scaling across tasks, domains and languages. We believe, with evidence from detailed analysis on performance benchmarks and qualitative evaluation, that our work makes a step towards setting up a novel avenue for scientific researchers. Significantly, we posit that shallow grounding may tackle practical NLP employment, feasible algorithms for vertical scaling loosen up constraints on computational resources, while the Chen\u2019s failure analysis exposes room for future improved models.\n\nBy concluding our results and proposals, we create a vibrant snapshot of the current progress in the research for grounding Transformer models in external knowledge, contributing clearer solutions for scalability issue in neural-based NLP, and knownledge transferable abilities in different tasks and languages. Postulation that our methods can provide vital insight into why some transformer models fail at understanding natural language may offer unique insight to Conversie AI scientists. Our propositions for further exploiting of this neuro-symbolic connection hold promise to further navigation in the realm of explainable artificial intelligence failing to leave out calls to attention towards ensuring ethical AI applications."
+  example_title: "Example 2"
+- text: "In this paper, we explore the latest advancements in Natural Language Processing (NLP) capacities using deep learning. The research focusses on understanding the interaction dynamics between syntactic comprehension and semantic prediction. Initial results identify intriguing checkpoint stages that internally modulate systems engaged in semantic prediction, hinting towards possible bi-dimensional processing mechanisms, broaching deeper parallelisms to cognitive hierarchical structures. Neural network tests using transformer models, particularly BERT and GPT-3 further elucidate, how such models react to complex multi-layered sentence structures, deconstructing their strategical use of syntactic information and projectional planning abilities in generating dependable language constructs. Ab initio transformations in joint paraphrasing and entity substitution procedures enabled optimization in performance when dealing with nuanced distinctions in language representation. Recognizing the limitations with available reference corpora, careful data augmentation techniques were applied to ensure comprehensive coverage and interpretations of language structures. Our research supports a more-rounded comprehension of how pre-training influences a model's linguistic understanding and establishes preliminary steps towards more intentional, rationalized decisions while model synthesis. Future work would aim at adapting these insights in designing new self-supervised learning technologies while deeply benefiting disparate domains, including data querying and humanoid artificial intelligence."
+  example_title: "Example 3"
++pipeline_tag: text2text-generation
---
+# Model Card

+[Add more information here](https://huggingface.co/templates/model-card-example)

+## Example Usage

+```python3
+from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

+tokenizer = AutoTokenizer.from_pretrained('dansul/datadreamer-dev-abstracts_to_tweet_model', revision=None) # Load tokenizer
+model = AutoModelForSeq2SeqLM.from_pretrained('dansul/datadreamer-dev-abstracts_to_tweet_model', revision=None) # Load model
+pipe = pipeline('text2text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id)

+inputs = ["In the ever-growing field of Natural Language Processing (NLP), understanding the nuances and depth of human expression and delivering contextualized outputs is an essential yet challenging task. The contribution of Deep Learning and Machine Learning methods toward tackling complex language processing tasks necessitates ongoing research. This paper outlines a novel architecture accounting for semantic bridges in the realm of NLP, utilizing sophisticated RNN and LSTM models. We connect phrase-level and sentence-level semantics under a unified framework, contributing towards generating better contextual understanding of textual data and providing detailed insights for tasks such as sentiment analysis and topic modeling. Our architecture outperforms most known models in these tasks due to its ability to consider longer textual context while simultaneously avoiding complications arising from language ambiguity. Our results provide inspiring indications on the benefits of capturing semantic bridges for more robust language models. We carry rigorous evaluations impinging both qualitative and quantitative insights, thereby showcasing our model's impressive generalizability to real-world applications."]
+print(pipe(inputs, max_length=512, do_sample=False))
+```

+---
+This model was trained with a synthetic dataset with [DataDreamer 🤖💤](https://datadreamer.dev). The synthetic dataset card and model card can be found [here](datadreamer.json). The training arguments can be found [here](training_args.json).
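
For reference, the `datadreamer.json` and `training_args.json` files linked in the new card can be fetched from the model repository with `huggingface_hub`. This is a minimal sketch, assuming the repo id from the usage example above; it is not part of the committed card itself.

```python
# Sketch: download and inspect the DataDreamer metadata files linked in the card.
# The repo id is taken from the usage example above; adjust if the model lives elsewhere.
import json

from huggingface_hub import hf_hub_download

repo_id = "dansul/datadreamer-dev-abstracts_to_tweet_model"

# Fetch the training arguments recorded alongside the model.
training_args_path = hf_hub_download(repo_id=repo_id, filename="training_args.json")
with open(training_args_path) as f:
    print(json.load(f))

# Fetch the synthetic dataset / model card metadata produced by DataDreamer.
datadreamer_path = hf_hub_download(repo_id=repo_id, filename="datadreamer.json")
with open(datadreamer_path) as f:
    print(json.load(f))
```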