- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 39
- microsoft/phi-1_5
  Text Generation • Updated • 133k • 1.32k
- Language models scale reliably with over-training and on downstream tasks
  Paper • 2403.08540 • Published • 14
- Akashpb13/Swahili_xlsr
  Automatic Speech Recognition • Updated • 14 • 8
Wambugu Muchemi
FrankXII
AI & ML interests
None yet
Recent Activity
liked a model 28 days ago
ibm-granite/granite-geospatial-biomass
liked a model about 1 month ago
meta-llama/Llama-3.2-90B-Vision-Instruct
liked a model about 1 month ago
google/datagemma-rag-27b-it
Organizations
Collections
1
models
None public yet
datasets
None public yet