MixtureOfPhi3
MixtureOfPhi3 is a Mixture of Experts (MoE) made with mergekit from two copies of microsoft/Phi-3-mini-128k-instruct.
This has been created using LazyMergekit-Phi3.
This run is for development purposes only: merging 2 identical models does not bring any performance benefit. Once specialized finetunes of Phi3 become available, it can serve as a starting point for creating a MoE from them.
©️ Credits
- mlabonne's phixtral, whose inference code I adapted to Phi3's architecture.
- mergekit, whose code I tweaked to merge Phi3 models.
The experts have been merged using the cheap_embed gate mode, where each expert's router gate is initialized from the raw token embeddings of its positive prompts. This lets you steer experts toward areas such as scientific work, reasoning, or math.
Try your own with the link above!
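To make the idea behind embedding-based gating more concrete, here is a toy sketch in plain Python. It is not mergekit's code: the 3-dimensional embeddings, the `gate_vector` and `route` helpers, and the single-word prompts are all made up for illustration, assuming that a gate is the average of its prompt token embeddings and that routing is a softmax over dot products.

```python
import math

# Hypothetical 3-dimensional token embeddings (values invented for illustration).
TOY_EMBEDDINGS = {
    "math": [1.0, 0.0, 0.0],
    "science": [0.8, 0.2, 0.0],
    "art": [0.0, 1.0, 0.0],
    "creative": [0.1, 0.9, 0.0],
}

def gate_vector(prompt_tokens):
    """Average the embeddings of the prompt tokens into one gate vector."""
    vectors = [TOY_EMBEDDINGS[token] for token in prompt_tokens]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def route(token_vector, gates):
    """Softmax over the dot products between a token vector and each gate."""
    scores = [sum(a * b for a, b in zip(token_vector, gate)) for gate in gates]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]

gates = [
    gate_vector(["math", "science"]),  # stands in for "research, logic, math, science"
    gate_vector(["creative", "art"]),  # stands in for "creative, art"
]

# A math-like token gets more routing weight on expert 0 than on expert 1.
weights = route(TOY_EMBEDDINGS["math"], gates)
```

In this model both experts are the same checkpoint, so the routing weights do not change output quality; the sketch only shows why distinct prompts would matter once the experts differ.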
🧩 Configuration
base_model: microsoft/Phi-3-mini-128k-instruct
gate_mode: cheap_embed
dtype: float16
experts:
  - source_model: microsoft/Phi-3-mini-128k-instruct
    positive_prompts: ["research, logic, math, science"]
  - source_model: microsoft/Phi-3-mini-128k-instruct
    positive_prompts: ["creative, art"]
💻 Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "paulilioaica/MixtureOfPhi3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code=True lets transformers load the custom inference code from the model repo
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)

prompt = "How many continents are there?"
# Phi-3 chat layout: every turn is closed with <|end|>
text = f"<|system|>\nYou are a helpful AI assistant.<|end|>\n<|user|>\n{prompt}<|end|>\n<|assistant|>\n"
input_ids = tokenizer.encode(text, return_tensors="pt")
# Sample up to 128 new tokens
outputs = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(tokenizer.decode(outputs[0]))
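For multi-turn conversations, writing the prompt string by hand gets error-prone. The sketch below builds it from a list of messages, assuming the Phi-3 chat layout in which each turn is a role tag, a newline, the content, and an `<|end|>` marker. The `build_phi3_prompt` helper is hypothetical and not part of this repo or of transformers; if the tokenizer ships a chat template, `tokenizer.apply_chat_template` may do the same job.

```python
def build_phi3_prompt(messages):
    """Render a list of {"role", "content"} dicts in the assumed Phi-3 chat layout."""
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    # A trailing assistant tag asks the model to write the next turn.
    parts.append("<|assistant|>\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How many continents are there?"},
]
prompt_text = build_phi3_prompt(messages)
```

The resulting `prompt_text` can be passed to `tokenizer.encode(prompt_text, return_tensors="pt")` in place of a hand-written string.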