Model Card for PiyushLavaniya/Finetuned_Llama2-Summarizer

A language model fine-tuned to specialize in summarization, enabling it to condense texts efficiently. The finetuning process refined its ability to generate concise and accurate summaries, improving its effectiveness at information compression.

Model Details

Fine-tuning a language model specifically for summarization optimized its ability to comprehend and distill lengthy texts into succinct, coherent summaries. Through this process, the model was trained to prioritize important information, compress sentences, and improve overall summary quality, enabling it to generate concise and informative summaries across a range of content domains.

  • Finetuned from model: meta-llama/Llama-2-7b-chat-hf

Uses

This fine-tuned language model specialized in summarization serves a range of practical purposes. It can condense lengthy articles, documents, or other texts into shorter, more manageable summaries, aiding information retrieval and comprehension. It is also useful for generating abstracts, helping researchers, journalists, and other professionals quickly grasp the core of voluminous content, saving time and supporting better decision-making.

How to Get Started with the Model

Use the code below to get started with the model.

Use a pipeline as a high-level helper

from transformers import pipeline

pipe = pipeline("text-generation", model="PiyushLavaniya/Finetuned_Llama2-Summarizer")
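
A minimal usage sketch for the pipeline. The prompt format below is an assumption (the exact instruction template used during finetuning is not documented here), and the generation settings are illustrative:

article = "..."  # replace with the article text you want to summarize

# Hypothetical prompt; adjust to match the template used during finetuning.
prompt = f"Summarize the following article.\n\n{article}\n\nSummary:"

result = pipe(prompt, max_new_tokens=200, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])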

Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("PiyushLavaniya/Finetuned_Llama2-Summarizer")
model = AutoModelForCausalLM.from_pretrained("PiyushLavaniya/Finetuned_Llama2-Summarizer")
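
A minimal generation sketch using the directly loaded model. The prompt wording and generation settings are assumptions, not documented choices:

article = "..."  # replace with the article text you want to summarize

prompt = f"Summarize the following article.\n\n{article}\n\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a summary and decode only the newly generated tokens (skipping the prompt).
output_ids = model.generate(**inputs, max_new_tokens=200)
summary = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(summary)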

Training Details

Training Data

The model was fine-tuned on the "gopalkalpande/bbc-news-summary" dataset.

Finetuning an LLM on the "gopalkalpande/bbc-news-summary" dataset, which comprises article summaries from BBC News, equips the model to excel in summarizing news content. This specialized training enhances its ability to comprehend and condense news articles effectively, making it particularly adept at distilling key information, events, and nuances from news stories. As a result, the finetuned model becomes a valuable tool for swiftly generating accurate and concise summaries of news articles, aiding in information extraction and digesting news content more efficiently.
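
A short sketch of how the dataset can be loaded and inspected with the Hugging Face datasets library; the split name and column layout are assumptions and should be checked with print(dataset) before use:

from datasets import load_dataset

dataset = load_dataset("gopalkalpande/bbc-news-summary", split="train")
print(dataset)     # shows column names and number of rows
print(dataset[0])  # shows one article/summary example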

Training Hyperparameters

The model was trained with the following configuration (values taken from the training script):

  • per_device_train_batch_size = 4
  • gradient_accumulation_steps = 4
  • optim = 'paged_adamw_32bit'
  • logging_steps = 20
  • learning_rate = 2e-4
  • fp16 = True
  • max_grad_norm = 0.3
  • max_steps = 109
  • warmup_ratio = 0.05
  • save_strategy = 'epoch'
  • group_by_length = True
  • output_dir = OUTPUT_DIR
  • report_to = 'tensorboard'
  • save_safetensors = True
  • lr_scheduler_type = 'cosine'
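
These hyperparameters map onto a transformers TrainingArguments object roughly as sketched below. This is a reconstruction for readability: OUTPUT_DIR is a placeholder, and the surrounding trainer (for example trl's SFTTrainer with a LoRA/QLoRA setup) is an assumption, not a documented detail:

from transformers import TrainingArguments

OUTPUT_DIR = "llama2-summarizer"  # placeholder; the original output directory is not documented

training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    logging_steps=20,
    learning_rate=2e-4,
    fp16=True,
    max_grad_norm=0.3,
    max_steps=109,
    warmup_ratio=0.05,
    save_strategy="epoch",
    group_by_length=True,
    report_to="tensorboard",
    save_safetensors=True,
    lr_scheduler_type="cosine",
)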