---
license: mit
datasets:
  - fka/awesome-chatgpt-prompts
  - gopipasala/fka-awesome-chatgpt-prompts
metrics:
  - character
base_model:
  - meta-llama/Llama-3.2-11B-Vision-Instruct
new_version: meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: summarization
library_name: diffusers
tags:
  - legal
---

# Model Card for Rithu Paran's Summarization Model

## Model Details

### Model Description

- **Purpose:** This model is designed for text summarization: it condenses long-form content into concise, meaningful summaries.
- **Developed by:** Rithu Paran
- **Model type:** Transformer-based language model for summarization
- **Base Model:** meta-llama/Llama-3.2-11B-Vision-Instruct
- **Finetuned Model Version:** meta-llama/Llama-3.1-8B-Instruct
- **Language(s):** Primarily English, with limited support for other languages.
- **License:** MIT License

### Model Sources

- **Repository:** Available on the Hugging Face Hub under Rithu Paran
- **Datasets Used:** fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts

## Uses

### Direct Use

This model can be used directly to summarize various types of content, such as news articles, reports, and other informational documents.

### Out-of-Scope Use

It is not recommended for highly technical or specialized documents without additional fine-tuning or adaptation.

## Bias, Risks, and Limitations

Although this model was designed to be general-purpose, it may carry biases inherited from its training data. Users should be cautious when applying it to sensitive content or in applications where accuracy is crucial.

## How to Get Started with the Model

Here's a quick example of how to start using the model for summarization:

```python
from transformers import pipeline

# Load a summarization pipeline; the model ID is the placeholder from this card.
summarizer = pipeline("summarization", model="rithu-paran/your-summarization-model")

text = "Insert long-form text here."

# max_length and min_length bound the length of the generated summary.
summary = summarizer(text, max_length=100, min_length=30)
print(summary)
```

## Training Details

### Training Data

- **Datasets:** fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts
- **Preprocessing:** Data was tokenized and normalized for better model performance.

### Training Procedure

- **Hardware:** Trained on GPUs with Hugging Face API resources.
- **Precision:** Mixed-precision (fp16) was utilized to enhance training efficiency.

#### Training Hyperparameters

- **Batch Size:** 16
- **Learning Rate:** 5e-5
- **Epochs:** 3
- **Optimizer:** AdamW

## Evaluation

### Metrics

- **Metrics Used:** ROUGE Score, BLEU Score
- **Evaluation Datasets:** Evaluated on a subset of fka/awesome-chatgpt-prompts for summarization performance.

## Technical Specifications

### Model Architecture

Based on Llama-3 architecture, optimized for summarization through attention-based mechanisms.

### Compute Infrastructure

- **Hardware:** Nvidia A100 GPUs were used for training.
- **Software:** Hugging Face’s transformers library along with the diffusers library.

## Environmental Impact

- **Hardware Type:** Nvidia A100 GPUs
- **Training Duration:** ~10 hours
- **Estimated Carbon Emission:** Approximate emissions calculated using the Machine Learning Impact calculator.

## Contact

For any questions or issues, please reach out to Rithu Paran via the Hugging Face Forum.
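
As a supplementary note on the ROUGE score listed under the evaluation metrics: ROUGE-N measures n-gram overlap between a generated summary and a reference summary. The snippet below is a minimal, simplified sketch of that idea using only the standard library. It is not the evaluation code used for this model; the function name `rouge_n`, the whitespace tokenization, and the example sentences are all assumptions made for illustration, and a real evaluation would normally rely on an established ROUGE implementation with stemming and proper tokenization.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N (hypothetical helper, not this model's eval code).

    Tokenizes on whitespace, lowercases, and counts clipped n-gram matches.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)

    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((cand & ref).values())

    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative sentences only; not taken from the evaluation dataset.
scores = rouge_n("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

Here five of the six unigrams in each sentence match, so precision, recall, and F1 all come out to 5/6. Passing `n=2` compares bigrams instead, which rewards summaries that preserve word order as well as vocabulary.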