---
language: en
tags:
- text-generation
- YouTube-scripts
- fine-tuned
- causal-lm
datasets:
- custom
license: mit
model_name: Gemma 2 Scripter
---

# Gemma 2 Scripter

**Gemma 2 Scripter** is a fine-tuned version of the Gemma 2 2B instruct model that generates YouTube scripts from a list of keywords. It was fine-tuned with LoRA on a custom dataset of keyword/script pairs.

## Model Details

- **Model Name**: `Sidharthan/gemma2_scripter`
- **Architecture**: Causal Language Model
- **Base Model**: Gemma 2 2B (instruct variant)
- **Fine-tuning Objective**: Generating a script from keywords supplied in the prompt.

## How to Use

### Installation

Ensure you have the following dependencies installed:

```bash
pip install torch transformers peft
```

### Code Sample

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "Sidharthan/gemma2_scripter"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model = AutoPeftModelForCausalLM.from_pretrained(
    model_name,
    device_map=None,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
    low_cpu_mem_usage=True
).to(device)

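# Generate a script from a comma-separated keyword string
def generate_script(prompt):
    formatted_prompt = f"<bos><start_of_turn>keywords\n{prompt}<end_of_turn>\n<start_of_turn>script\n"
    # The template already contains <bos>, so keep the tokenizer from adding a second one
    inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False)
    inputs = {key: value.to(device) for key, value in inputs.items()}

    outputs = model.generate(
        **inputs,
        max_length=1024,  # total length: prompt tokens plus generated tokens
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        top_k=50,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()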

# Example usage
prompt = "crosshatch waffle texture, dark chocolate, four bar crispy wafers, kat, milk chocolate"
response = generate_script(prompt)
print(f"Generated Script:\n{response}")
```

### Input Format

The model expects prompts in the following format:

```
<bos><start_of_turn>keywords
<your_keywords_here><end_of_turn>
<start_of_turn>script

```

Example:
```
<bos><start_of_turn>keywords
crosshatch waffle texture, dark chocolate, four bar crispy wafers, kat, milk chocolate<end_of_turn>
<start_of_turn>script

```
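The template above can be assembled programmatically. The following is a minimal sketch, not part of the released code: the helper name `build_prompt` and its `include_bos` parameter are hypothetical, introduced here only for illustration.

```python
def build_prompt(keywords, include_bos=True):
    """Wrap a list of keywords in the keywords/script turn template.

    `build_prompt` and `include_bos` are illustrative names, not part of
    the model's released code.
    """
    # Drop surrounding whitespace and skip empty entries
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")

    body = ", ".join(cleaned)
    prefix = "<bos>" if include_bos else ""
    # The trailing newline after "script" matters: generation starts on the next line
    return f"{prefix}<start_of_turn>keywords\n{body}<end_of_turn>\n<start_of_turn>script\n"


prompt = build_prompt(["dark chocolate", "four bar crispy wafers", "milk chocolate"])
print(prompt)
```

If your tokenizer prepends `<bos>` on its own, either call `build_prompt(..., include_bos=False)` or tokenize with `add_special_tokens=False` so the sequence starts with a single `<bos>`.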

### Output

The output is a YouTube script generated based on the keywords provided.

### Performance

- CPU: The code sample loads the model in FP32; inference is slower.
- GPU: The code sample loads the model in FP16 on CUDA devices; inference is faster.

### Applications

- Generating structured scripts for video content
- Keyword-based text generation for creative tasks

## Training Details

### Training Data

The model was fine-tuned on a custom dataset of YouTube scripts paired with their corresponding keywords.

### Training Procedure

- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Optimization**: AdamW optimizer
- **Learning Rate**: 2e-4
- **Batch Size**: 4
- **Training Steps**: 1000
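As a rough sense of scale, the batch size and step count above determine how many training sequences the model processed. The sketch below assumes the listed batch size is the effective batch size per optimizer step, that is, a gradient accumulation factor of 1, which the card does not state. The function name is hypothetical.

```python
def sequences_processed(batch_size, training_steps, grad_accum_steps=1):
    """Number of sequences consumed during training.

    grad_accum_steps=1 is an assumption; the model card does not say
    whether gradient accumulation was used.
    """
    return batch_size * training_steps * grad_accum_steps


# Values from the Training Procedure list above
total = sequences_processed(batch_size=4, training_steps=1000)
print(f"Sequences processed during fine-tuning: {total}")
```

This counts sequences processed, not unique examples: if the dataset is smaller than this number, some examples were seen more than once.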

## Limitations

- The model's output quality depends on the clarity and relevance of input keywords
- May occasionally generate repetitive content
- Inference speed depends on the available hardware
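One way to notice repetitive output is to measure how many word n-grams in a generated script are duplicates. This is a minimal sketch of such a check; the function name `repeated_ngram_ratio` and the choice of trigrams are assumptions for illustration, not something the model ships with.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of word n-grams that duplicate an earlier n-gram in `text`.

    Returns 0.0 for text with no repetition (or fewer than n words),
    and approaches 1.0 as the text becomes more repetitive.
    """
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


sample = "welcome back to the channel welcome back to the channel"
print(f"Repeated trigram ratio: {repeated_ngram_ratio(sample):.2f}")
```

If the ratio is high for your outputs, raising `repetition_penalty` or setting `no_repeat_ngram_size` in `model.generate` may help, at some cost to fluency.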

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{gemma2_scripter,
  author = {Sidharthan},
  title = {Gemma 2 Scripter: Fine-tuned YouTube Script Generator},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Sidharthan/gemma2_scripter}}
}
```

## License

This model is released under the MIT License.