---
license: mit
datasets:
- HuggingFaceFW/fineweb
language:
- en
pipeline_tag: text-generation

widget:
- text: "He is a doctor. His main goal is"
  example_title: " to help people."
- text: "My name is Merve and my favorite"
  example_title: "activity is reading."
---

# GPT3

Welcome to the GPT3 repository! This project is an attempt to recreate the architecture and approach from the original OpenAI GPT-3 paper. The repository includes scripts for training, fine-tuning, and inference of a GPT-3-like model using PyTorch and the Hugging Face Transformers library.

This page hosts the weights of development checkpoints of my models. You can download a checkpoint folder, paste its path into inference.py, and chat with the model.
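
If you would rather load a checkpoint by hand than edit inference.py, the following is a minimal sketch of what that could look like. It assumes the downloaded folder is saved in the standard Transformers format (a `config.json`, weight files, and tokenizer files), which is not confirmed here; if the checkpoint uses a custom model class, use inference.py from the GitHub repository instead. The names `load_checkpoint` and `complete` and the sampling settings are illustrative and do not come from this project.

```python
from pathlib import Path


def load_checkpoint(checkpoint_dir):
    """Load a tokenizer and model from a locally downloaded checkpoint folder.

    Assumption: the folder is in standard Transformers format.
    """
    path = Path(checkpoint_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"Checkpoint folder not found: {path}")

    # Imported after the path check so a wrong path fails fast.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(str(path))
    model = AutoModelForCausalLM.from_pretrained(str(path))
    model.eval()  # inference only: disable dropout
    return tokenizer, model


def complete(prompt, tokenizer, model, max_new_tokens=30):
    """Sample a continuation of `prompt` (sampling settings are illustrative)."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_p=0.9,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To try it, call `tokenizer, model = load_checkpoint("path/to/checkpoint")` with the folder you downloaded, then `print(complete("He is a doctor. His main goal is", tokenizer, model))`.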

# **You can find all code on [GitHub](https://github.com/krll-corp/GPT3)**

# Note: This is a very small model (17M parameters), so do not expect very good results. I'm getting a bigger checkpoint ready that may behave better. Benchmarks are also on their way.

## Contributing

Contributions are welcome! I'm just a student who is interested in AI, so my code may be incorrect or have logical issues. Please open an issue or submit a pull request for any improvements or bug fixes; I will be happy to review them.

## License

This project is licensed under the MIT License. See the LICENSE file for details. Everyone can use and modify this code at their discretion.

## Acknowledgements

Thanks to OpenAI, Hugging Face, and PyTorch for making this project possible!

- [OpenAI GPT-3 Paper](https://arxiv.org/abs/2005.14165)
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [PyTorch](https://pytorch.org/)