# LLAMA-3 8B Chat Nuclear Model
- Developed by: inetnuc
- License: apache-2.0
- Finetuned from model: unsloth/llama-3-8b-bnb-4bit
This LLAMA-3 model was finetuned to enhance its text-generation capabilities on nuclear-related topics. Training was accelerated with Unsloth and Hugging Face's TRL library, achieving roughly 2x faster training.
## Finetuning Process
The model was finetuned using the Unsloth library, leveraging its efficient training capabilities. The process included the following steps (a sketch of the full pipeline follows the list):
- Data Preparation: Loaded and preprocessed nuclear-related data.
- Model Loading: Utilized `unsloth/llama-3-8b-bnb-4bit` as the base model.
- LoRA Patching: Applied LoRA (Low-Rank Adaptation) for efficient training.
- Training: Finetuned the model using Hugging Face's TRL library with optimized hyperparameters.
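A minimal sketch of this pipeline, assuming the standard Unsloth quickstart workflow with an older TRL `SFTTrainer` signature; the dataset file, LoRA rank, and training hyperparameters below are illustrative assumptions, not the exact values used for this model:

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit quantized base model through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Patch the model with LoRA adapters so only small low-rank matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # illustrative LoRA rank, not the model's actual setting
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# "nuclear_data.jsonl" is a hypothetical placeholder for the nuclear-related corpus.
dataset = load_dataset("json", data_files="nuclear_data.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # column holding the training text
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=60,  # illustrative; tune for the real dataset
        output_dir="outputs",
    ),
)
trainer.train()
```

LoRA keeps the 4-bit base weights frozen and trains only small low-rank adapter matrices, which is what makes finetuning an 8B model feasible on a single GPU.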
## Model Details
- Base Model: `unsloth/llama-3-8b-bnb-4bit`
- Language: English (`en`)
- License: Apache-2.0
## Files and Versions

| File Name | Description |
|---|---|
| .gitattributes | Initial commit |
| README.md | Model description and usage |
| adapter_config.json | Configuration for the LoRA adapter |
| adapter_model.safetensors | Finetuned LoRA adapter weights |
| config.json | Configuration for the base model |
| generation_config.json | Generation configuration for the model |
| model-00001-of-00007.safetensors | Part of the base model weights |
| model-00002-of-00007.safetensors | Part of the base model weights |
| model-00003-of-00007.safetensors | Part of the base model weights |
| model-00004-of-00007.safetensors | Part of the base model weights |
| model-00005-of-00007.safetensors | Part of the base model weights |
| model-00006-of-00007.safetensors | Part of the base model weights |
| model-00007-of-00007.safetensors | Part of the base model weights |
| model.safetensors.index.json | Index for the sharded model weights |
| special_tokens_map.json | Special tokens mapping |
| tokenizer.json | Tokenizer data |
| tokenizer_config.json | Configuration for the tokenizer |
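Individual artifacts can be fetched without cloning the whole repository using `huggingface_hub`; a minimal sketch (the filenames come from the table above):

```python
from huggingface_hub import hf_hub_download

# Download only the LoRA adapter files listed in the table above.
adapter_config = hf_hub_download(
    repo_id="inetnuc/llama-3-8b-chat-nuclear",
    filename="adapter_config.json",
)
adapter_weights = hf_hub_download(
    repo_id="inetnuc/llama-3-8b-chat-nuclear",
    filename="adapter_model.safetensors",
)
print(adapter_config, adapter_weights)  # local cache paths
```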
## Model Card Authors
MUSTAFA UMUT OZBEK
## Contact

- LinkedIn: https://www.linkedin.com/in/mustafaumutozbek/
- X: https://x.com/m_umut_ozbek
## Usage

### Loading the Model
You can load the model and tokenizer using the following code snippet:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")
model = AutoModelForCausalLM.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")

# Example of generating text
inputs = tokenizer("what is the iaea approach for cyber security?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
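Because the repository also ships the LoRA adapter (`adapter_config.json`, `adapter_model.safetensors`), an alternative is to attach the adapter to the 4-bit base checkpoint with PEFT. A minimal sketch, assuming the adapter is compatible with the `unsloth/llama-3-8b-bnb-4bit` base:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 4-bit base checkpoint (requires bitsandbytes); device placement is illustrative.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8b-bnb-4bit",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")

# Attach the finetuned LoRA adapter published in this repository.
model = PeftModel.from_pretrained(base, "inetnuc/llama-3-8b-chat-nuclear")

inputs = tokenizer("what is the iaea approach for cyber security?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This route avoids downloading the full merged weights when the base model is already cached locally.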
## Model Tree

- Base model: meta-llama/Meta-Llama-3-8B
- Quantized: unsloth/llama-3-8b-bnb-4bit