data update
README.md
CHANGED
@@ -303,7 +303,7 @@ Granite Language Instruct models are trained on a collection of publicly availab
 
 * English datasets: [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus), [WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub), [OASST-OctoPack](https://huggingface.co/datasets/bigcode/oasst-octopack), [Daring-Anteater](https://huggingface.co/datasets/nvidia/Daring-Anteater), [SoftAge-Multiturn](https://huggingface.co/datasets/SoftAge-AI/multi-turn_dataset), [Glaive-RAG-v1 ](https://huggingface.co/datasets/glaiveai/RAG-v1 ), [EvolKit-20k](https://huggingface.co/datasets/arcee-ai/EvolKit-20k ), [Magpie-Phi3-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Phi3-Pro-300K-Filtered).
 * Multilingual datasets: [Aya Dataset](https://huggingface.co/datasets/CohereForAI/aya_dataset) and IBM Synthetic datasets (e.g., Blue Multilingual, Daring Anteater Translated).
-* Code datasets: [Glaive Code Assistant V3](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v3), [SQL Create Context Instruction](https://huggingface.co/datasets/bugdaryan/sql-create-context-instruction), and [Self-OSS-Instruct-SC2](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k).
+* Code datasets: [Glaive Code Assistant V3](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v3), [SQL Create Context Instruction](https://huggingface.co/datasets/bugdaryan/sql-create-context-instruction), and [Self-OSS-Instruct-SC2](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k). Single and multi-turn IBM synthetic datasets, including a set of datasets generated via the evol-instruct method.
 * Math: [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), [StackMathQA](https://huggingface.co/datasets/math-ai/StackMathQA ), and [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)
 * Tools: [xlam-function-calling](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k), [Glaive Function Calling V2](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2), [Hermes Function Calling V1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1), and IBM Synthetic API data.
 * Safety: [SimpleSafetyTests](https://huggingface.co/datasets/Bertievidgen/SimpleSafetyTests), [HarmBench Behaviors](https://github.com/centerforaisafety/HarmBench/blob/main/data/behavior_datasets/harmbench_behaviors_text_all.csv), [Strong Reject](https://github.com/alexandrasouly/strongreject/blob/main/strongreject_dataset/strongreject_dataset.csv), [AdvBench](https://huggingface.co/datasets/walledai/AdvBench), [MistralGuard](https://huggingface.co/datasets/natolambert/xstest-v2-copy), [Do-Not-Answer](https://huggingface.co/datasets/LibrAI/do-not-answer), and IBM Synthetic data for safety.