s3nh/TinyLLama-4x1.1B-MoE

Example usage:

import torch
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

# Use a GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
# Total length (prompt + generated tokens). 256 is an arbitrary choice, adjust as needed.
max_length = 256

tokenizer = AutoTokenizer.from_pretrained("s3nh/TinyLLama-4x1.1B-MoE")
model = AutoModelForCausalLM.from_pretrained("s3nh/TinyLLama-4x1.1B-MoE").to(device)

input_text = """
###Input: You are a pirate. Tell me a story about a wrecked ship.
###Response:
"""

input_ids = tokenizer.encode(input_text, return_tensors='pt').to(device)
output = model.generate(inputs=input_ids,
                        max_length=max_length,
                        do_sample=True,
                        top_k=10,
                        temperature=0.7,
                        pad_token_id=tokenizer.eos_token_id,
                        attention_mask=input_ids.new_ones(input_ids.shape))
print(tokenizer.decode(output[0], skip_special_tokens=True))
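
To try other instructions, the prompt layout from the example can be wrapped in a small helper. This is only a minimal sketch: `build_prompt` is a hypothetical name, and it assumes the `###Input:` / `###Response:` layout shown above is what the model expects.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the ###Input / ###Response layout used in the example."""
    # Leading and trailing newlines mirror the triple-quoted string in the example.
    return f"\n###Input: {instruction.strip()}\n###Response:\n"


prompt = build_prompt("You are a pirate. Tell me a story about a wrecked ship.")
print(prompt)
```

The returned string can be passed to `tokenizer.encode` in place of `input_text`.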

This model was made possible by the tremendous work of the mergekit developers. I merged four TinyLlama models into a mixture of experts. The config used is shown below:

"""base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
experts:
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
  - source_model: 78health/TinyLlama_1.1B-function-calling
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
  - source_model: phanerozoic/Tiny-Pirate-1.1b-v0.1
    positive_prompts:
    - "storywriting"
    - "write"
    - "scene"
    - "story"
    - "character"
  - source_model: Tensoic/TinyLlama-1.1B-3T-openhermes
    positive_prompts:
    - "reason"
    - "provide"
    - "instruct"
    - "summarize"
    - "count"
"""
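
The config only lists the experts and the prompts associated with each one. It does not show how a token gets routed at inference time. As a rough illustration of the general mixture-of-experts idea, and not a description of how mergekit or this model actually implements it, the sketch below picks the top-k experts from a list of gate scores and normalises their weights with a softmax. The function name `top_k_route`, the choice of `k=2`, and the gate scores are all assumptions made up for the example.

```python
import math

# Expert names copied from the config; everything else here is illustrative.
EXPERTS = [
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "78health/TinyLlama_1.1B-function-calling",
    "phanerozoic/Tiny-Pirate-1.1b-v0.1",
    "Tensoic/TinyLlama-1.1B-3T-openhermes",
]


def top_k_route(gate_logits, k=2):
    """Pick the k experts with the largest logits and softmax-normalise their weights."""
    if not 1 <= k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k highest-scoring experts, best first.
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Subtract the largest selected logit for numerical stability.
    peak = gate_logits[top[0]]
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical gate scores for one token of a story-like prompt.
routes = top_k_route([0.2, -1.0, 2.0, 0.5], k=2)
for index, weight in routes:
    print(f"{EXPERTS[index]}: {weight:.2f}")
```

In a real MoE layer the outputs of the selected experts would then be combined using these weights.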

