Fine-tuning the Llama-3.2-3B-Instruct foundation model on medical Q&A using differential attention (in progress). Paper: https://arxiv.org/pdf/2410.05258
Ali Janati
Na0s
AI & ML interests
NLP, Speech Recognition, Computer Vision, Time Series Forecasting.
Recent Activity
Updated a collection 4 days ago: Pruned MoEs (Mixtral-8x7B-Instruct-v0.1)
Collections
4
- Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-3.0
  Text Generation • Updated • 23 • 3
- Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-2.0
  Text Generation • Updated • 41 • 1
- Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT-1.0
  Text Generation • Updated • 14
- Na0s/Llama-3.1-8B-Pruned-4-Layers_LoRA-PEFT
  Text Generation • Updated • 21
Models
21
- Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-1-expert
  Text Generation • Updated • 10
- Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-2-experts
  Text Generation • Updated • 10
- Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-3-experts
  Text Generation • Updated • 8
- Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-4-experts
  Text Generation • Updated • 12
- Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA
  Text Generation • Updated • 16
- Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-3-experts
  Text Generation • Updated • 7
- Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-4-experts
  Text Generation • Updated • 16
- Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-2-experts
  Text Generation • Updated • 7
- Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-1-experts
  Text Generation • Updated • 11
- Na0s/Mixtral-8x7B-v0.1-instruct-l2-norm-post-Gates-SFT-pruned-1-experts
  Text Generation • Updated • 8
Datasets
37
- Na0s/sft-ready-garage-bAInd-Open-Platypus
  Viewer • Updated • 24.9k • 50
- Na0s/Next_Token_Prediction_dataset
  Viewer • Updated • 5.5M • 108
- Na0s/sft-ready-neulab-conala
  Viewer • Updated • 2.38k • 48
- Na0s/sft-ready-HuggingFaceH4-ultrachat-200k
  Viewer • Updated • 658k • 48
- Na0s/sft-ready-Text-Generation-Augmented-Data
  Viewer • Updated • 7.67M • 82
- Na0s/sft-ready-Teknium-OpenHermes
  Viewer • Updated • 243k • 42
- Na0s/sft-ready-google-boolq
  Viewer • Updated • 9.43k • 43
- Na0s/sft-ready-allenai-WildChat-1M
  Viewer • Updated • 1.96M • 50
- Na0s/sft-ready-toughdata-quora-question-answer-dataset
  Viewer • Updated • 56.4k • 41
- Na0s/sft-ready-nvidia-HelpSteer2
  Viewer • Updated • 10.2k • 42