---
license: creativeml-openrail-m
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
pipeline_tag: text-generation
tags:
- triangulum_10b
- sft
- chain_of_thought
- ollama
- text-generation-inference
- llama_for_causal_lm
library_name: transformers
---
# Triangulum 10B: Multilingual Large Language Models (LLMs)
Triangulum 10B is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.
## Key Features
- Foundation Model: Built upon LLaMA's autoregressive language model, leveraging an optimized transformer architecture for enhanced performance.
- Instruction Tuning: Includes supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align model outputs with human preferences for helpfulness and safety.
- Multilingual Support: Designed to handle multiple languages, ensuring broad applicability across diverse linguistic contexts.
## Training Approach
- Synthetic Datasets: Utilizes long chain-of-thought synthetic data to enhance reasoning capabilities.
- Supervised Fine-Tuning (SFT): Aligns the model to specific tasks through curated datasets (a data-formatting sketch follows this list).
- Reinforcement Learning with Human Feedback (RLHF): Ensures the model adheres to human values and safety guidelines through iterative training.
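To illustrate the data-formatting side of SFT, the sketch below renders a hypothetical chain-of-thought example with the tokenizer's chat template, assuming the checkpoint ships one. The conversation content is invented for illustration and is not the actual Triangulum training data.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Triangulum-10B")

# A hypothetical long chain-of-thought training example (illustrative only).
example = [
    {"role": "user", "content": "What is 17 * 23?"},
    {"role": "assistant", "content": "Let's reason step by step: 17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391."},
]

# Render the conversation into the single string a supervised fine-tuning step would train on.
text = tokenizer.apply_chat_template(example, tokenize=False)
print(text)
```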
## How to use with transformers
Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function. Make sure to update your installation via `pip install --upgrade transformers`.
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Triangulum-10B"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are the kind and tri-intelligent assistant helping people to understand complex concepts."},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
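With chat-style input, the pipeline returns the running conversation under `generated_text`; the final list element is the newly generated assistant message (a dict with `role` and `content` keys), which is what the `print` call above displays.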
## Demo Inference with LlamaForCausalLM
```python
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Triangulum-10B", trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained(
    "prithivMLmods/Triangulum-10B",
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=True,  # deprecated shorthand; see the BitsAndBytesConfig sketch below
    attn_implementation="flash_attention_2",  # requires the flash-attn package
)

# Define a list of system and user prompts (ChatML format)
prompts = [
    """<|im_start|>system
You are the kind and tri-intelligent assistant helping people to understand complex concepts.<|im_end|>
<|im_start|>user
Can you explain the concept of eigenvalues and eigenvectors in a simple way?<|im_end|>
<|im_start|>assistant"""
]

# Generate a response for each prompt
for chat in prompts:
    print(f"Prompt:\n{chat}\n")
    input_ids = tokenizer(chat, return_tensors="pt").input_ids.to("cuda")
    generated_ids = model.generate(
        input_ids,
        max_new_tokens=750,
        temperature=0.8,
        repetition_penalty=1.1,
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,
    )
    response = tokenizer.decode(
        generated_ids[0][input_ids.shape[-1]:],
        skip_special_tokens=True,
        clean_up_tokenization_spaces=True,
    )
    print(f"Response:\n{response}\n{'-'*80}\n")
```
### Key Adjustments
- System Prompts: Each prompt defines the role or persona the assistant adopts.
- User Prompts: These specify the task or question for the assistant, from explaining a concept to storytelling or career advice.
- Looping Through Prompts: Each prompt in the list is processed in turn, showcasing the model's versatility.
You can expand the list of prompts to explore a variety of scenarios and responses.
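For example, a second ChatML-formatted prompt could be appended to `prompts` before the loop runs; the persona and question here are purely illustrative:

```python
# Illustrative extra prompt in the same ChatML format (append before the loop).
prompts.append("""<|im_start|>system
You are a patient career-advice assistant.<|im_end|>
<|im_start|>user
How should a junior developer prepare for a first technical interview?<|im_end|>
<|im_start|>assistant""")
```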
## Use Cases
- Multilingual content generation (see the example after this list)
- Question answering and dialogue systems
- Text summarization and analysis
- Translation and localization tasks
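For instance, multilingual generation reuses the same pipeline API as above; the German prompt below is purely illustrative (German is among the listed supported languages):

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="prithivMLmods/Triangulum-10B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# German prompt ("Explain in two sentences what an eigenvalue is."); illustrative only.
messages = [{"role": "user", "content": "Erkläre in zwei Sätzen, was ein Eigenwert ist."}]
print(pipe(messages, max_new_tokens=128)[0]["generated_text"][-1])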
## Technical Details
Triangulum 10B employs a state-of-the-art autoregressive architecture inspired by LLaMA. The optimized transformer framework ensures both efficiency and scalability, making it suitable for a variety of use cases.
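For reference, the concrete architecture hyperparameters (layer count, hidden size, attention heads) can be read straight from the checkpoint's config; a minimal sketch:

```python
from transformers import AutoConfig

# Read the architecture hyperparameters shipped with the checkpoint.
config = AutoConfig.from_pretrained("prithivMLmods/Triangulum-10B")
print(config.model_type, config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```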