---
language: en
tags:
- text-generation
- YouTube-scripts
- fine-tuned
- causal-lm
datasets:
- custom
license: mit
model_name: Gemma 2 Scripter
---

# Gemma 2 Scripter

**Gemma 2 Scripter** is a fine-tuned version of the Gemma 2 2B instruct model that generates YouTube scripts from a list of keywords. You supply comma-separated keywords, and the model writes a script built around them.

## Model Details

- **Model Name**: `Sidharthan/gemma2_scripter`
- **Architecture**: Causal Language Model
- **Base Model**: Gemma 2 2B (instruct)
- **Fine-tuning Objective**: Generating scripts from keyword prompts.

## How to Use

### Installation

Install the following dependencies:

```bash
pip install torch transformers peft
```

### Code Sample

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "Sidharthan/gemma2_scripter"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Use FP16 on GPU and FP32 on CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model = AutoPeftModelForCausalLM.from_pretrained(
    model_name,
    device_map=None,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
    low_cpu_mem_usage=True
).to(device)

# Generate a script
def generate_script(prompt):
    # Wrap the keywords in the prompt layout the model was fine-tuned on
    formatted_prompt = f"keywords\n{prompt}\nscript\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt")
    inputs = {key: value.to(device) for key, value in inputs.items()}

    outputs = model.generate(
        **inputs,
        max_length=1024,  # counts prompt tokens plus generated tokens
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        top_k=50,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

    # outputs[0] holds the prompt tokens followed by the generated tokens
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()

# Example usage
prompt = "crosshatch waffle texture, dark chocolate, four bar crispy wafers, kat, milk chocolate"
response = generate_script(prompt)
print(f"Generated Script:\n{response}")
```

### Input Format

The model expects prompts in the following format, where the middle line is your comma-separated keyword list:

```
keywords
<comma-separated keywords>
script
```

Example:

```
keywords
crosshatch waffle texture, dark chocolate, four bar crispy wafers, kat, milk chocolate
script
```

### Output

The output is a YouTube script based on the keywords provided. Because `generate` returns the prompt tokens along with the continuation, the decoded text starts with the formatted prompt.

### Performance

- CPU: Runs in FP32; inference is noticeably slower.
- GPU: Runs in FP16 for faster inference.

### Applications

- Generating structured scripts for video content
- Keyword-based text generation for creative tasks

## Training Details

### Training Data

The model was fine-tuned on a custom dataset of YouTube scripts paired with their corresponding keywords.

### Training Procedure

- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Optimization**: AdamW optimizer
- **Learning Rate**: 2e-4
- **Batch Size**: 4
- **Training Steps**: 1000

## Limitations

- Output quality depends on the clarity and relevance of the input keywords
- The model may occasionally generate repetitive content
- Inference speed varies with the available hardware

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{gemma2_scripter,
  author = {Sidharthan},
  title = {Gemma 2 Scripter: Fine-tuned YouTube Script Generator},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Sidharthan/gemma2_scripter}}
}
```

### License

This model is released under the MIT License.
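
## Appendix: Prompt Helpers

The prompt layout described under Input Format can be wrapped in two small helpers. This is a minimal sketch, not part of the model's own code: the names `build_prompt` and `extract_script` are hypothetical, and `extract_script` assumes the decoded text may begin with an echo of the prompt, as is usual when decoding the full output of a causal language model.

```python
def build_prompt(keywords):
    """Join keywords into the keywords/script prompt layout (hypothetical helper)."""
    # Drop surrounding whitespace and skip empty entries
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    return "keywords\n" + ", ".join(cleaned) + "\nscript\n"


def extract_script(decoded, prompt):
    """Remove the echoed prompt from decoded output, if present (hypothetical helper)."""
    text = decoded.strip()
    head = prompt.strip()
    # Only strip the prefix when the decoded text really starts with the prompt
    if text.startswith(head):
        text = text[len(head):]
    return text.strip()
```

To use these with the code sample, pass the comma-joined keywords to `generate_script` as before, then call `extract_script` on the returned string together with the result of `build_prompt` for the same keywords. If the tokenizer round trip alters whitespace so that the prefix no longer matches, `extract_script` returns the text unchanged.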