---
license: mit
datasets:
- fka/awesome-chatgpt-prompts
- gopipasala/fka-awesome-chatgpt-prompts
metrics:
- character
base_model:
- meta-llama/Llama-3.2-11B-Vision-Instruct
new_version: meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: summarization
library_name: diffusers
tags:
- legal
---

# Model Card for Rithu Paran's Summarization Model

## Model Details

### Model Description

**Purpose:** This model is designed for text summarization: condensing long-form content into concise, meaningful summaries.

- **Developed by:** Rithu Paran
- **Model type:** Transformer-based language model for summarization
- **Base model:** meta-llama/Llama-3.2-11B-Vision-Instruct
- **Fine-tuned model version:** meta-llama/Llama-3.1-8B-Instruct
- **Language(s):** Primarily English, with limited support for other languages
- **License:** MIT

### Model Sources

- **Repository:** Available on the Hugging Face Hub under Rithu Paran
- **Datasets used:** fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts

## Uses

### Direct Use

The model can be used directly to summarize content such as news articles, reports, and other informational documents.

### Out-of-Scope Use

It is not recommended for highly technical or specialized documents without additional fine-tuning or adaptation.

## Bias, Risks, and Limitations

Although the model is intended to be general-purpose, it may carry biases inherited from its training data. Exercise caution when applying it to sensitive content or in applications where accuracy is critical.

## How to Get Started with the Model

Here's a quick example of using the model for summarization:

```python
from transformers import pipeline

# Load the summarization pipeline with this model
summarizer = pipeline("summarization", model="rithu-paran/your-summarization-model")

text = "Insert long-form text here."
summary = summarizer(text, max_length=100, min_length=30)
print(summary)
```

## Training Details

### Training Data

- **Datasets:** fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts
- **Preprocessing:** Data was tokenized and normalized to improve model performance.

### Training Procedure

- **Hardware:** Trained on GPUs using Hugging Face API resources
- **Precision:** Mixed precision (fp16) was used to improve training efficiency

#### Training Hyperparameters

- **Batch size:** 16
- **Learning rate:** 5e-5
- **Epochs:** 3
- **Optimizer:** AdamW

## Evaluation

- **Metrics used:** ROUGE score, BLEU score
- **Evaluation data:** A subset of fka/awesome-chatgpt-prompts, used to assess summarization performance

## Technical Specifications

### Model Architecture

Based on the Llama 3 architecture, optimized for summarization through attention-based mechanisms.

### Compute Infrastructure

- **Hardware:** NVIDIA A100 GPUs
- **Software:** Hugging Face's transformers library, along with the diffusers library

## Environmental Impact

- **Hardware type:** NVIDIA A100 GPUs
- **Training duration:** ~10 hours
- **Estimated carbon emissions:** Approximated using the Machine Learning Impact calculator

## Contact

For questions or issues, please reach out to Rithu Paran via the Hugging Face Forum.
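## Appendix: Illustrative Sketches

The preprocessing described in the training details (tokenization and normalization) can be sketched roughly as below. This is an illustrative, hypothetical pass only; the actual pipeline would use the model's own subword tokenizer from the transformers library, and the `normalize`/`simple_tokenize` names are assumptions, not part of the released code:

```python
import re
import unicodedata

def normalize(text: str) -> str:
    """Normalize raw text before tokenization: canonicalize Unicode
    forms and collapse runs of whitespace into single spaces."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()

def simple_tokenize(text: str) -> list[str]:
    """A naive lowercase whitespace tokenizer, standing in for the
    model's real subword tokenizer."""
    return normalize(text).lower().split()
```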
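The training hyperparameters listed above map naturally onto the Hugging Face Trainer API. A sketch of an equivalent configuration, assuming the `transformers.TrainingArguments` interface (the output path is hypothetical):

```python
from transformers import TrainingArguments

# Hypothetical fine-tuning configuration mirroring the hyperparameters
# reported in this card; not the exact script used for training.
training_args = TrainingArguments(
    output_dir="summarization-finetune",  # hypothetical output path
    per_device_train_batch_size=16,       # batch size 16
    learning_rate=5e-5,                   # learning rate 5e-5
    num_train_epochs=3,                   # 3 epochs
    optim="adamw_torch",                  # AdamW optimizer
    fp16=True,                            # mixed-precision (fp16) training
)
```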
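The ROUGE metric used in evaluation measures n-gram overlap between a candidate summary and a reference. Below is a minimal, self-contained ROUGE-1 F1 sketch for intuition only; the actual evaluation presumably used a standard implementation such as the Hugging Face `evaluate` library:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall
    between a candidate summary and a reference summary."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Clipped unigram overlap: each reference token counts at most
    # as many times as it appears in the reference.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```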