tiiuae/falcon-rw-1b

slippylolo committed on May 25, 2023

Commit 1a8b638 • 1 Parent(s): d3685ba

Create README.md

Files changed (1)

README.md +51 -0

README.md ADDED

	@@ -0,0 +1,51 @@

+---
+datasets:
+- tiiuae/falcon-refinedweb
+language:
+- en
+---
+# Falcon-RW-1B
+**Falcon-RW-1B is a 1B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on 350B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). It is made available under the [TII Falcon LLM License](https://huggingface.co/tiiuae/falcon-rw-1b/blob/main/LICENSE.txt).**
+RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-1B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data.
+This model is intended for use as a research artifact, to study the influence of training on appropriately filtered web data alone.
+# Model Card for Falcon-RW-1B
+## Model Details
+### Model Description
+- **Developed by:** [TII](https://www.tii.ae)
+- **Model type:** Causal decoder-only
+- **Language(s) (NLP):** English
+- **License:** TII Falcon LLM License
+### Model Source
+- **Paper:** coming soon
+- **Demo:** coming soon
+## Uses
+### Direct Use
+Research on large language models, in particular the influence of adequately filtered and deduplicated web data on their properties (fairness, safety, limitations, capabilities, etc.).
+### Out-of-Scope Use
+Production use without adequate assessment of risks and mitigation; any use case that may be considered irresponsible or harmful.
+## Bias, Risks, and Limitations
+Falcon-RW models are trained on English data only and will not generalize appropriately to other languages. Furthermore, as they are trained on large-scale corpora representative of the web, they will carry the stereotypes and biases commonly encountered online.
+## Paper
+More details coming soon in the paper.