
LLAMA-3 8B Chat Nuclear Model

  • Developed by: inetnuc
  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This LLAMA-3 model was finetuned to improve its text generation on nuclear-related topics. Training was accelerated with Unsloth and Hugging Face's TRL library, making it roughly 2x faster.

Finetuning Process

The model was finetuned with the Unsloth library, taking advantage of its efficient training path. The process comprised the following steps (a code sketch follows the list):

  1. Data Preparation: Loaded and preprocessed nuclear-related data.
  2. Model Loading: Utilized unsloth/llama-3-8b-bnb-4bit as the base model.
  3. LoRA Patching: Applied LoRA (Low-Rank Adaptation) for efficient training.
  4. Training: Finetuned the model using Hugging Face's TRL library with optimized hyperparameters.
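The snippet below is a minimal sketch of this pipeline, not the exact script used for this model: the dataset file name is hypothetical, the hyperparameters are illustrative, and the SFTTrainer arguments vary slightly across TRL versions.

from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# 1. Data preparation: hypothetical corpus of nuclear-related text
dataset = load_dataset("json", data_files="nuclear_corpus.jsonl", split="train")

# 2. Model loading: 4-bit quantized Llama-3 8B base
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# 3. LoRA patching: train low-rank adapters instead of the full weights
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# 4. Training with TRL's SFTTrainer (illustrative hyperparameters)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=60,
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()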

Model Details

  • Base Model: unsloth/llama-3-8b-bnb-4bit
  • Language: English (en)
  • License: Apache-2.0
  • Model Size: 8.03B parameters (safetensors, F32)

Files and Versions

File Name                         Description
.gitattributes                    Initial commit
README.md                         Model description and usage
adapter_config.json               Configuration for the LoRA adapter
adapter_model.safetensors         Finetuned LoRA adapter weights
config.json                       Configuration for the base model
generation_config.json            Generation configuration for the model
model-00001-of-00007.safetensors  Part of the merged model weights
model-00002-of-00007.safetensors  Part of the merged model weights
model-00003-of-00007.safetensors  Part of the merged model weights
model-00004-of-00007.safetensors  Part of the merged model weights
model-00005-of-00007.safetensors  Part of the merged model weights
model-00006-of-00007.safetensors  Part of the merged model weights
model-00007-of-00007.safetensors  Part of the merged model weights
model.safetensors.index.json      Index for the sharded model weights
special_tokens_map.json           Special tokens mapping
tokenizer.json                    Tokenizer data
tokenizer_config.json             Configuration for the tokenizer
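Because the repository ships both the merged model shards and the standalone LoRA adapter (adapter_config.json, adapter_model.safetensors), the adapter can also be applied on top of the base model instead of loading the merged weights. A minimal sketch, assuming the peft library is installed:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the finetuned LoRA adapter from this repo
base = AutoModelForCausalLM.from_pretrained("unsloth/llama-3-8b-bnb-4bit")
model = PeftModel.from_pretrained(base, "inetnuc/llama-3-8b-chat-nuclear")
tokenizer = AutoTokenizer.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")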

Model Card Authors

MUSTAFA UMUT OZBEK

Contact

LinkedIn: https://www.linkedin.com/in/mustafaumutozbek/
X: https://x.com/m_umut_ozbek

Usage

Loading the Model

You can load the model and tokenizer using the following code snippet:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")
model = AutoModelForCausalLM.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")

# Example of generating text
inputs = tokenizer("what is the iaea approach for cyber security?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
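The checkpoint is stored in F32, so loading it in full precision needs roughly 32 GB of memory. For single-GPU inference you can quantize the weights on the fly; a minimal sketch, assuming the bitsandbytes and accelerate packages are installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantize the F32 weights to 4-bit at load time to fit on a single GPU
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "inetnuc/llama-3-8b-chat-nuclear",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("inetnuc/llama-3-8b-chat-nuclear")

inputs = tokenizer("what is the iaea approach for cyber security?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))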

