	---
	license: llama3.1
	pipeline_tag: text-generation
	tags:
	- facebook
	- meta
	- pytorch
	- llama
	- llama-3
	datasets:
	- Kushtrim/alpaca-cleaned-sq
	language:
	- sq
	---

	# Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip

	## Model Overview

	Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip is a fine-tuned version of [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) adapted for Albanian-language natural language processing tasks. The weights are quantized to 4-bit precision, which reduces memory use and makes inference more efficient.

	## Model Details

	- Model Name: Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip
	- Base Model: Meta-Llama-3.1-8B-Instruct
	- Model Size: 8 billion parameters
	- Quantization: 4-bit precision (bitsandbytes, "bnb")
	- Language: Albanian
	- License: [llama3.1](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/resolve/main/LICENSE)
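
To give a rough sense of what 4-bit quantization means for an 8-billion-parameter model, the sketch below estimates the memory needed for the weights at different precisions. It is back-of-the-envelope arithmetic only: `weight_memory_gb` is an illustrative helper, not part of any library, and the estimate ignores quantization metadata, layers that may stay in higher precision, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# "8 billion parameters" is taken from the model details; the rest is arithmetic.
NUM_PARAMS = 8e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB of weights")
```

Under these assumptions, the 4-bit weights take roughly a quarter of the memory of a 16-bit copy.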

	## Limitations

	- Representation of Harms & Stereotypes: Potential for biased outputs reflecting real-world societal biases.
	- Inappropriate or Offensive Content: Risk of generating content that may be offensive or inappropriate in certain contexts.
	- Information Reliability: Possibility of producing inaccurate or outdated information.
	- Dataset Size: The Albanian fine-tuning dataset is relatively small, which may limit the model's performance and coverage.

	## Intended Use

	- Intended Use Cases: This model is suitable for various NLP tasks in Albanian, including conversational AI, text generation, and language understanding.
	- Out-of-scope Use: This model should not be used in ways that violate laws, regulations, or ethical guidelines. It is also not intended for use in languages other than Albanian unless appropriately fine-tuned.

	## Responsible AI Considerations

	Developers using this model should:

	- Evaluate and mitigate risks related to accuracy, safety, and fairness.
	- Ensure compliance with applicable laws and regulations.
	- Implement additional safeguards for high-risk scenarios and sensitive contexts.
	- Inform end-users that they are interacting with an AI system.
	- Use feedback mechanisms and retrieval-augmented generation (RAG), which grounds answers in supplied context, to improve output reliability.
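
As one possible way to apply the grounding and disclosure points above, the sketch below builds a chat message list that asks the model to answer only from supplied passages, and appends an AI disclosure to the final answer. Everything here is an assumption for illustration: `build_grounded_messages`, `with_disclosure`, the prompt wording, and the disclosure text are hypothetical, and the retrieval step that would produce `passages` is left out. An Albanian system prompt may work better in practice.

```python
from typing import Dict, List

# Hypothetical disclosure text shown to end users.
AI_DISCLOSURE = "Note: this answer was generated by an AI system."


def build_grounded_messages(question: str, passages: List[str]) -> List[Dict[str, str]]:
    """Build chat messages that ask the model to answer from the given passages."""
    # Number the passages so the model (and a reviewer) can refer to them.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    system = (
        "Answer in Albanian using only the numbered context passages. "
        "If the context does not contain the answer, say so."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


def with_disclosure(answer: str) -> str:
    """Append the AI disclosure to a model answer before showing it to a user."""
    return f"{answer}\n\n{AI_DISCLOSURE}"


messages = build_grounded_messages(
    "Ku ka lindur Majlinda Kelmendi?",
    ["Majlinda Kelmendi (lindi më 9 maj 1991), është një xhudiste shqiptare nga Peja, Kosovë."],
)
print(messages[1]["content"])
```

The resulting `messages` list has the same shape as the one passed to `tokenizer.apply_chat_template` in the usage code.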

	## Usage

	```python
	# Install the dependencies first (in a notebook, prefix the command with "!"):
	#   pip3 install -U transformers peft accelerate bitsandbytes

	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

	# Hugging Face access token (placeholder; substitute your own).
	hf_token = "hf_...."

	# Fix the seed so that sampled outputs are reproducible.
	torch.random.manual_seed(0)

	model_id = "Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip"

	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    device_map="cuda",
	    torch_dtype="auto",
	    trust_remote_code=True,
	    token=hf_token,
	)

	tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)

	messages = [
	    {"role": "system", "content": "Je një asistent inteligjent shumë i dobishëm."},
	    {"role": "user", "content": "Identifiko emrat e personave në këtë artikull 'Majlinda Kelmendi (lindi më 9 maj 1991), është një xhudiste shqiptare nga Peja, Kosovë.'"},
	]

	pipe = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)

	generation_args = {
	    "max_new_tokens": 2048,
	    "return_full_text": False,  # return only the newly generated text
	    "temperature": 0.9,
	    "do_sample": True,
	}

	# add_generation_prompt=True appends the assistant header so the model
	# answers the user turn instead of continuing the user's message.
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	output = pipe(prompt, **generation_args)
	print(output[0]["generated_text"])
	```
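
For extraction-style prompts such as the name-identification example above, sampling at temperature 0.9 can give different answers from run to run. One possible variation, sketched below, keeps two presets: the sampling settings from above and greedy decoding. `make_generation_args` is a hypothetical helper, not part of transformers; the keys it returns are the same ones used in `generation_args` above.

```python
from typing import Any, Dict


def make_generation_args(deterministic: bool, max_new_tokens: int = 2048) -> Dict[str, Any]:
    """Return keyword arguments for the text-generation pipeline call."""
    args: Dict[str, Any] = {
        "max_new_tokens": max_new_tokens,
        "return_full_text": False,
    }
    if deterministic:
        # Greedy decoding: always pick the most likely next token.
        args["do_sample"] = False
    else:
        # Sampling settings copied from the usage example.
        args["do_sample"] = True
        args["temperature"] = 0.9
    return args


print(make_generation_args(deterministic=True))
```

With the pipeline from the usage example, this would be called as `pipe(prompt, **make_generation_args(deterministic=True))`.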

	## Acknowledgements

	This model builds on Meta-Llama-3.1-8B-Instruct, further fine-tuned for Albanian-language tasks. Special thanks to the developers and researchers who contributed to the original Llama 3.1.