|
--- |
|
license: openrail |
|
datasets: |
|
- the_pile_openwebtext2 |
|
- semeru/code-code-CodeCompletion-TokenLevel-Python |
|
- pacovaldez/stackoverflow-questions |
|
- AhmedSSoliman/CodeSearchNet-py |
|
- irds/codesearchnet |
|
- bigscience-catalogue-data-dev/lm_code_github-eval_subset |
|
- codeparrot/github-code |
|
- nchen909/bigclonebench-processed |
|
- Open-Orca/OpenOrca |
|
- fka/awesome-chatgpt-prompts |
|
- openchat/openchat_sharegpt4_dataset |
|
- bookcorpus |
|
- bookcorpusopen |
|
- nRuaif/OpenOrca-GPT3.5 |
|
- giganticode/java-cmpx-v1 |
|
- nickrosh/Evol-Instruct-Code-80k-v1 |
|
- bigcode/starcoderdata |
|
- bigcode/the-stack |
|
- bigcode/the-stack-smol |
|
- Cdaprod/AI-Developer-Prompts |
|
- code_x_glue_ct_code_to_text |
|
- codeparrot/github-code-clean |
|
- code_x_glue_cc_code_completion_line |
|
- >- |
|
autoevaluate/autoeval-eval-jeffdshen__inverse_superglue_mixedp1-jeffdshen__inverse-63643c-1665558893 |
|
- bentrevett/multi30k |
|
- edbeeching/decision_transformer_gym_replay |
|
- psyche/common_crawl |
|
- Birchlabs/openai-prm800k-solutions-only |
|
- cjvt/slownet |
|
- para_crawl |
|
- zeroshot/twitter-financial-news-sentiment |
|
- laugustyniak/political-advertising-pl |
|
- code_search_net |
|
- sukaka/novelai-webui |
|
- P1ayer-1/chatgpt-conversations-chatlogs.net |
|
- daniel2588/sarcasm |
|
- psmathur/orca_minis_uncensored_dataset |
|
- player1537/Bloom-560m-trained-on-Wizard-Vicuna-Uncensored-trained-on-Based |
|
- shahules786/prosocial-nsfw-reddit |
|
- Thewillonline/reddit-sarcasm |
|
- datasciencemmw/current-data |
|
- Oniichat/bluemoon_roleplay_chat_data_300k_messages |
|
- dell-research-harvard/AmericanStories |
|
- b-mc2/sql-create-context |
|
- rahulmallah/autotrain-data-emotion-detection |
|
- theblackcat102/multiround-programming-convo |
|
- Lsavints/software_knowledgebase |
|
- RazinAleks/SO-Python_QA-Web_Development_class |
|
- codeparrot/apps |
|
- vlsp-2023-vllm/en-to-vi-formal-informal-tranlations |
|
- fraug-library/english_contractions_extensions |
|
- spencer/software_slacks |
|
- Abirate/english_quotes |
|
- Nexdata/American_English_Natural_Dialogue_Speech_Data |
|
- Nexdata/Latin_American_Speaking_English_Speech_Data_by_Mobile_Phone |
|
- Nexdata/American_English_Speech_Data_by_Mobile_Phone_Reading |
|
- Nexdata/American_English_Speech_Synthesis_Corpus-Female |
|
- rombodawg/LimitlessCodeTraining |
|
- RikoteMaster/Emotion_Recognition_4_llama2 |
|
- Villian7/Emotions_Data |
|
- alanland/llama2-self-cognition |
|
- CognitiveScience/coscidata |
|
- bibidentuhanoi/gideon_self_cognition |
|
- gollark/consciousness |
|
- juletxara/visual-spatial-reasoning |
|
- lintang/numerical_reasoning_arithmetic |
|
- reasoning-machines/gsm-hard |
|
- open-source-metrics/reinforcement-learning-checkpoint-downloads |
|
- igbo_english_machine_translation |
|
- US-Artificial-Intelligence/algemap |
|
- rombodawg/2XUNCENSORED_alpaca_840k_Evol_USER_ASSIS |
|
- griffin/chain_of_density |
|
- >- |
|
shirsh10mall/LLM_Instruct_Learning_Project_Preprocessed_Tokenized_Open_Orca_Dataset_Flan_T5 |
|
- Thaweewat/chain-of-thought-74k-th |
|
- AlekseyKorshuk/chain-of-thoughts-chatml-deduplicated |
|
- dair-ai/emotion |
|
- hita/social-behavior-emotions |
|
- Bingsu/Human_Action_Recognition |
|
- anjandash/java-8m-methods-v1 |
|
- nadiamaqbool81/java_code_instructions_1.178k_alpaca |
|
- DavidMOBrien/8000-java |
|
- rombodawg/LimitlessCodeTraining_1k-Python-Javascript_GuanacoFormat |
|
- angie-chen55/javascript-github-code |
|
- kye/all-lucidrain-python-3 |
|
- Fraser/python-state-changes |
|
- ammarnasr/the-stack-ruby-clean |
|
- ammarnasr/the-stack-rust-clean |
|
- seyyedaliayati/solidity-dataset |
|
- jkhedri/psychology-dataset |
|
- KonradSzafer/stackoverflow_linux |
|
- vikp/textbook_quality_programming |
|
- rombodawg/LosslessMegaCodeTrainingV3_MINI |
|
- BelleGroup/multiturn_chat_0.8M |
|
- smangrul/code-chat-assistant-v1 |
|
- goendalf666/sales-textbook_for_convincing_and_selling |
|
- readerbench/ConversationalAgent-Ro |
|
- beurkinger/autotrain-data-human-action-recognition |
|
- jpwahle/autoencoder-paraphrase-dataset |
|
- jpwahle/autoregressive-paraphrase-dataset |
|
- teknium/GPT4-LLM-Cleaned |
|
- Anthropic/model-written-evals |
|
- openai_humaneval |
|
- kye/all-google-ai-python-code |
|
- kye/all-openai-github-code |
|
- EleutherAI/lambada_openai |
|
- CShorten/ML-ArXiv-Papers |
|
- WaltonFuture/InstructionGPT-4 |
|
- open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B |
|
- seansullivan/INT-Business-Syllabus |
|
- theoldmandthesea/17k_business_book |
|
- SunRise228/business-doc |
|
- gauravshrm211/VC-startup-evaluation-for-investment |
|
- TuningAI/Startups_V1 |
|
- TuningAI/Startups_V2 |
|
- AdiOO7/llama-2-finance |
|
- scillm/scientific_papers |
|
- gokuls/wiki_book_corpus_complete_processed_bert_dataset |
|
- the_pile_books3 |
|
- go_emotions |
|
- yizhongw/self_instruct |
|
- codeparrot/self-instruct-starcoder |
|
- Amani27/massive_translation_dataset |
|
- huggingface/transformers-metadata |
|
- hf-internal-testing/transformers-metadata |
|
- commonsense_qa |
|
- nlplabtdtu/test-edu-crawl |
|
- kernelmachine/open-license-corpus |
|
- BDas/EnglishNLPDataset |
|
- CyberNative/github_cybersecurity_READMEs |
|
- thomwolf/github-python |
|
- CM/codexglue_code2text_java |
|
- autoevaluate/autoeval-staging-eval-project-glue-f16e6c43-14015917 |
|
- lemonteaa/algorithmic-reasoning-seed |
|
- EmpathyFirstMedia/algolia |
|
- vicgalle/alpaca-gpt4 |
|
- pariajm/sharif_emotional_speech_dataset |
|
- lighteval/synthetic_reasoning_natural |
|
- jxu124/llava_complex_reasoning_77k |
|
- bibidentuhanoi/gideon_self_cognition_text |
|
- ohilikeit/empathetic_dialogues_mutli_turn_ko |
|
- KevinZ/psycholinguistic_eval |
|
- fiveflow/psychology-dataset |
|
- shahidul034/text_generation_model_data |
|
- qwedsacf/story-generation |
|
- EnigmaOfTheWorld/b-mc2-sql-create-context |
|
- HuggingFaceH4/testing_self_instruct_small |
|
- RUCAIBox/Data-to-text-Generation |
|
- Fhrozen/AudioSet2K22 |
|
- Chr0my/Epidemic_sounds |
|
- ChristophSchuhmann/lyrics-index |
|
- Cropinky/rap_lyrics_english |
|
- tsterbak/eurovision-lyrics-1956-2023 |
|
- brunokreiner/genius-lyrics |
|
- google/MusicCaps |
|
- ccmusic-database/music_genre |
|
- Hyeon2/riffusion-musiccaps-dataset |
|
- SamAct/autotrain-data-musicprompt |
|
- Chr0my/Epidemic_music |
|
- juliensimon/autonlp-data-song-lyrics |
|
- Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC |
|
- Chr0my/freesound.org |
|
- teticio/audio-diffusion-256 |
|
- KELONMYOSA/dusha_emotion_audio |
|
- Ar4ikov/iemocap_audio_text_splitted |
|
- flexthink/ljspeech |
|
- mozilla-foundation/common_voice_13_0 |
|
- facebook/voxpopuli |
|
- SocialGrep/one-million-reddit-jokes |
|
- breadlicker45/human-midi-rlhf |
|
- breadlicker45/midi-gpt-music-small |
|
- projectlosangeles/Los-Angeles-MIDI-Dataset |
|
- huggingartists/epic-rap-battles-of-history |
|
- SocialGrep/one-million-reddit-confessions |
|
- autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606 |
|
- lmsys/chatbot_arena_conversations |
|
- mozilla-foundation/common_voice_11_0 |
|
- mozilla-foundation/common_voice_4_0 |
|
- zZWipeoutZz/insane_style |
|
- mu-llama/MusicQA |
|
- RaphaelOlivier/whisper_adversarial_examples |
|
- huggingartists/metallica |
|
- vldsavelyev/guitar_tab |
|
- NLPCoreTeam/humaneval_ru |
|
- seungheondoh/audioset-music |
|
- gary109/onset-singing3_corpora_parliament_processed_MIR-ST500 |
|
- LDD5522/Rock_Vocals |
|
- huggingartists/rage-against-the-machine |
|
- huggingartists/chester-bennington |
|
- huggingartists/logic |
|
- cmsolson75/artist_song_lyric_dataset |
|
- BhavyaMuni/artist-lyrics |
|
- vjain/emotional_intelligence |
|
- mhenrichsen/context-aware-splits |
|
language: |
|
- en |
|
- es |
|
- it |
|
- ru |
|
- la |
|
metrics: |
|
- accuracy |
|
- bertscore |
|
- code_eval |
|
- f1 |
|
- bleu |
|
- perplexity |
|
- mean_iou |
|
- hyperml/balanced_accuracy |
|
tags: |
|
- code |
|
- music |
|
library_name: transformers |
|
--- |
|
# Model Card for Model ID |
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
This model card is intended as a base template for new models. It was generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1). |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
|
|
|
|
- **Developed by:** [More Information Needed] |
|
- **Shared by [optional]:** [More Information Needed] |
|
- **Model type:** [More Information Needed] |
|
- **Language(s) (NLP):** [More Information Needed] |
|
- **License:** [More Information Needed] |
|
- **Finetuned from model [optional]:** [More Information Needed] |
|
|
|
### Model Sources [optional] |
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
- **Repository:** [More Information Needed] |
|
- **Paper [optional]:** [More Information Needed] |
|
- **Demo [optional]:** [More Information Needed] |
|
|
|
## Uses |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
### Direct Use |
|
|
|
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
|
|
[More Information Needed] |
|
|
|
### Downstream Use [optional] |
|
|
|
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app --> |
|
|
|
[More Information Needed] |
|
|
|
### Out-of-Scope Use |
|
|
|
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. --> |
|
|
|
[More Information Needed] |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
<!-- This section is meant to convey both technical and sociotechnical limitations. --> |
|
|
|
[More Information Needed] |
|
|
|
### Recommendations |
|
|
|
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
|
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed before further recommendations can be made. |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
[More Information Needed] |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
|
[More Information Needed] |
|
|
|
### Training Procedure |
|
|
|
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. --> |
|
|
|
#### Preprocessing [optional] |
|
|
|
[More Information Needed] |
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision --> |
|
|
|
#### Speeds, Sizes, Times [optional] |
|
|
|
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. --> |
|
|
|
[More Information Needed] |
|
|
|
## Evaluation |
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
#### Testing Data |
|
|
|
<!-- This should link to a Data Card if possible. --> |
|
|
|
[More Information Needed] |
|
|
|
#### Factors |
|
|
|
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. --> |
|
|
|
[More Information Needed] |
|
|
|
#### Metrics |
|
|
|
<!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
|
|
[More Information Needed] |
|
|
|
### Results |
|
|
|
[More Information Needed] |
|
|
|
#### Summary |
|
|
|
|
|
|
|
## Model Examination [optional] |
|
|
|
<!-- Relevant interpretability work for the model goes here --> |
|
|
|
[More Information Needed] |
|
|
|
## Environmental Impact |
|
|
|
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly --> |
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
- **Hardware Type:** [More Information Needed] |
|
- **Hours used:** [More Information Needed] |
|
- **Cloud Provider:** [More Information Needed] |
|
- **Compute Region:** [More Information Needed] |
|
- **Carbon Emitted:** [More Information Needed] |
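
As a rough illustration of how the fields above combine into an estimate, the sketch below multiplies energy use by grid carbon intensity. It is a simplified back-of-the-envelope calculation, not the calculator's own implementation. The function name, the parameters, the optional PUE (data-center overhead) factor, and every number in the usage example are assumptions made for illustration; none of them describe this model.

```python
def estimate_co2eq_grams(
    hardware_power_watts: float,
    hours_used: float,
    grid_intensity_g_per_kwh: float,
    pue: float = 1.0,
) -> float:
    """Estimate emissions in grams of CO2eq.

    Assumed simplified model (not taken from the model card):
        energy_kwh = power_watts * hours / 1000 * PUE
        emissions  = energy_kwh * grid carbon intensity (g CO2eq per kWh)
    """
    if hardware_power_watts < 0 or hours_used < 0 or grid_intensity_g_per_kwh < 0:
        raise ValueError("power, hours, and grid intensity must be non-negative")
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")

    # Convert watt-hours to kilowatt-hours, then scale by data-center overhead.
    energy_kwh = hardware_power_watts * hours_used / 1000.0 * pue
    return energy_kwh * grid_intensity_g_per_kwh


if __name__ == "__main__":
    # Hypothetical inputs: one 300 W accelerator running for 100 hours
    # on a grid emitting 400 g CO2eq per kWh.
    grams = estimate_co2eq_grams(300, 100, 400)
    print(f"Estimated emissions: {grams / 1000:.1f} kg CO2eq")
```

Real figures for hardware type, hours, provider, and region should replace the hypothetical inputs once they are known; the calculator linked above also looks up region-specific carbon intensity, which this sketch leaves to the caller.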
|
|
|
## Technical Specifications [optional] |
|
|
|
### Model Architecture and Objective |
|
|
|
[More Information Needed] |
|
|
|
### Compute Infrastructure |
|
|
|
[More Information Needed] |
|
|
|
#### Hardware |
|
|
|
[More Information Needed] |
|
|
|
#### Software |
|
|
|
[More Information Needed] |
|
|
|
## Citation [optional] |
|
|
|
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. --> |
|
|
|
**BibTeX:** |
|
|
|
[More Information Needed] |
|
|
|
**APA:** |
|
|
|
[More Information Needed] |
|
|
|
## Glossary [optional] |
|
|
|
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. --> |
|
|
|
[More Information Needed] |
|
|
|
## More Information [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Authors [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Contact |
|
|
|
[More Information Needed] |