---
license: mit
datasets:
- fka/awesome-chatgpt-prompts
- gopipasala/fka-awesome-chatgpt-prompts
metrics:
- character
base_model:
- meta-llama/Llama-3.2-11B-Vision-Instruct
new_version: meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: summarization
library_name: transformers
tags:
- legal
---
Model Card for Rithu Paran's Summarization Model

Model Details

Model Description

Purpose: This model is designed for text summarization, condensing long-form content into concise, meaningful summaries.

Developed by: Rithu Paran

Model type: Transformer-based language model for summarization

Base Model: meta-llama/Llama-3.2-11B-Vision-Instruct

Newer Version: meta-llama/Llama-3.1-8B-Instruct

Language(s): Primarily English, with limited support for other languages.

License: MIT License

Model Sources

Repository: Available on the Hugging Face Hub under Rithu Paran

Datasets Used: fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts

Uses

Direct Use

This model can be used directly to summarize various types of content, such as news articles, reports, and other informational documents.

Out-of-Scope Use

The model is not recommended for highly technical or specialized documents without additional fine-tuning or adaptation.

Bias, Risks, and Limitations

While this model was designed to be general-purpose, it may carry biases inherited from its training data. Users should be cautious when applying it to sensitive content or in applications where accuracy is critical.

How to Get Started with the Model

Here's a quick example of how to start using the model for summarization:

```python
from transformers import pipeline

# Load the summarization pipeline with this model
summarizer = pipeline("summarization", model="rithu-paran/your-summarization-model")

text = "Insert long-form text here."
summary = summarizer(text, max_length=100, min_length=30)

# The pipeline returns a list of dicts such as [{"summary_text": "..."}]
print(summary[0]["summary_text"])
```
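The pipeline returns a list of dictionaries, each containing a "summary_text" field. For documents longer than the model's maximum input length, one option is to summarize in chunks; the helper below is a minimal sketch of that idea (the summarize_long_text name, the word-based splitting, and the chunk_words value are illustrative assumptions, and it reuses the summarizer defined above):

```python
def summarize_long_text(text, chunk_words=500):
    # Split the document into word-based chunks (a simple illustrative heuristic)
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    # Summarize each chunk with the pipeline defined above, then join the partial summaries
    partial = [summarizer(chunk, max_length=100, min_length=30)[0]["summary_text"] for chunk in chunks]
    return " ".join(partial)
```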
Training Details

Training Data

Datasets: fka/awesome-chatgpt-prompts, gopipasala/fka-awesome-chatgpt-prompts

Preprocessing: Data was tokenized and normalized for better model performance.
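The exact preprocessing script is not included in this card. The snippet below is a minimal sketch of what the tokenization step could look like, assuming the base model's tokenizer, the dataset's "prompt" column, simple whitespace normalization, and an illustrative max_length of 1024:

```python
from transformers import AutoTokenizer

# Assumption: preprocessing reused the base model's tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")

def preprocess(example):
    # Collapse whitespace as a simple normalization step, then tokenize with truncation
    text = " ".join(example["prompt"].split())
    return tokenizer(text, truncation=True, max_length=1024)
```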
Training Procedure

Hardware: Trained on GPUs using Hugging Face resources.

Precision: Mixed precision (fp16) was used to improve training efficiency.

Training Hyperparameters

Batch Size: 16

Learning Rate: 5e-5

Epochs: 3

Optimizer: AdamW
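These hyperparameters map directly onto Hugging Face TrainingArguments; the configuration below is a minimal sketch, in which the output directory and any settings not listed above are assumptions rather than values from the original training run:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./summarization-finetune",  # illustrative path, not the original
    per_device_train_batch_size=16,         # Batch Size: 16
    learning_rate=5e-5,                     # Learning Rate: 5e-5
    num_train_epochs=3,                     # Epochs: 3
    optim="adamw_torch",                    # Optimizer: AdamW
    fp16=True,                              # mixed-precision training, as noted above
)
```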
Evaluation

Metrics

Metrics Used: ROUGE score, BLEU score

Evaluation Datasets: Evaluated on a subset of fka/awesome-chatgpt-prompts for summarization performance.
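The resulting scores are not reported in this card. As an illustration, ROUGE and BLEU can be computed with Hugging Face's evaluate library roughly as follows (the predictions and references shown are placeholders, not outputs of this model):

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

# Placeholder examples; replace with model summaries and reference summaries
predictions = ["The report outlines quarterly revenue growth."]
references = ["The report describes revenue growth over the quarter."]

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
```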
Technical Specifications

Model Architecture

Based on the Llama 3 transformer architecture, fine-tuned for summarization.

Compute Infrastructure

Hardware: NVIDIA A100 GPUs were used for training.

Software: Hugging Face's transformers library.

Environmental Impact

Hardware Type: NVIDIA A100 GPUs

Training Duration: ~10 hours

Estimated Carbon Emissions: Approximate emissions were calculated using the Machine Learning Impact calculator.

Contact

For any questions or issues, please reach out to Rithu Paran via the Hugging Face Forum.