---
license: openrail
datasets:
- the_pile_openwebtext2
- semeru/code-code-CodeCompletion-TokenLevel-Python
- pacovaldez/stackoverflow-questions
- AhmedSSoliman/CodeSearchNet-py
- irds/codesearchnet
- bigscience-catalogue-data-dev/lm_code_github-eval_subset
- codeparrot/github-code
- nchen909/bigclonebench-processed
- Open-Orca/OpenOrca
- fka/awesome-chatgpt-prompts
- openchat/openchat_sharegpt4_dataset
- bookcorpus
- bookcorpusopen
- nRuaif/OpenOrca-GPT3.5
- giganticode/java-cmpx-v1
- nickrosh/Evol-Instruct-Code-80k-v1
- bigcode/starcoderdata
- bigcode/the-stack
- bigcode/the-stack-smol
- Cdaprod/AI-Developer-Prompts
- code_x_glue_ct_code_to_text
- codeparrot/github-code-clean
- code_x_glue_cc_code_completion_line
- >-
  autoevaluate/autoeval-eval-jeffdshen__inverse_superglue_mixedp1-jeffdshen__inverse-63643c-1665558893
- bentrevett/multi30k
- edbeeching/decision_transformer_gym_replay
- psyche/common_crawl
- Birchlabs/openai-prm800k-solutions-only
- cjvt/slownet
- para_crawl
- zeroshot/twitter-financial-news-sentiment
- laugustyniak/political-advertising-pl
- code_search_net
- sukaka/novelai-webui
- P1ayer-1/chatgpt-conversations-chatlogs.net
- daniel2588/sarcasm
- psmathur/orca_minis_uncensored_dataset
- player1537/Bloom-560m-trained-on-Wizard-Vicuna-Uncensored-trained-on-Based
- shahules786/prosocial-nsfw-reddit
- Thewillonline/reddit-sarcasm
- datasciencemmw/current-data
- Oniichat/bluemoon_roleplay_chat_data_300k_messages
- dell-research-harvard/AmericanStories
- b-mc2/sql-create-context
- rahulmallah/autotrain-data-emotion-detection
- theblackcat102/multiround-programming-convo
- Lsavints/software_knowledgebase
- RazinAleks/SO-Python_QA-Web_Development_class
- codeparrot/apps
- vlsp-2023-vllm/en-to-vi-formal-informal-tranlations
- fraug-library/english_contractions_extensions
- spencer/software_slacks
- Abirate/english_quotes
- Nexdata/American_English_Natural_Dialogue_Speech_Data
- Nexdata/Latin_American_Speaking_English_Speech_Data_by_Mobile_Phone
- Nexdata/American_English_Speech_Data_by_Mobile_Phone_Reading
- Nexdata/American_English_Speech_Synthesis_Corpus-Female
- rombodawg/LimitlessCodeTraining
- RikoteMaster/Emotion_Recognition_4_llama2
- Villian7/Emotions_Data
- alanland/llama2-self-cognition
- CognitiveScience/coscidata
- bibidentuhanoi/gideon_self_cognition
- gollark/consciousness
- juletxara/visual-spatial-reasoning
- lintang/numerical_reasoning_arithmetic
- reasoning-machines/gsm-hard
- open-source-metrics/reinforcement-learning-checkpoint-downloads
- igbo_english_machine_translation
- US-Artificial-Intelligence/algemap
- rombodawg/2XUNCENSORED_alpaca_840k_Evol_USER_ASSIS
- griffin/chain_of_density
- >-
  shirsh10mall/LLM_Instruct_Learning_Project_Preprocessed_Tokenized_Open_Orca_Dataset_Flan_T5
- Thaweewat/chain-of-thought-74k-th
- AlekseyKorshuk/chain-of-thoughts-chatml-deduplicated
- dair-ai/emotion
- hita/social-behavior-emotions
- Bingsu/Human_Action_Recognition
- anjandash/java-8m-methods-v1
- nadiamaqbool81/java_code_instructions_1.178k_alpaca
- DavidMOBrien/8000-java
- rombodawg/LimitlessCodeTraining_1k-Python-Javascript_GuanacoFormat
- angie-chen55/javascript-github-code
- kye/all-lucidrain-python-3
- Fraser/python-state-changes
- ammarnasr/the-stack-ruby-clean
- ammarnasr/the-stack-rust-clean
- seyyedaliayati/solidity-dataset
- jkhedri/psychology-dataset
- KonradSzafer/stackoverflow_linux
- vikp/textbook_quality_programming
- rombodawg/LosslessMegaCodeTrainingV3_MINI
- BelleGroup/multiturn_chat_0.8M
- smangrul/code-chat-assistant-v1
- goendalf666/sales-textbook_for_convincing_and_selling
- readerbench/ConversationalAgent-Ro
- beurkinger/autotrain-data-human-action-recognition
- jpwahle/autoencoder-paraphrase-dataset
- jpwahle/autoregressive-paraphrase-dataset
- teknium/GPT4-LLM-Cleaned
- Anthropic/model-written-evals
- openai_humaneval
- kye/all-google-ai-python-code
- kye/all-openai-github-code
- EleutherAI/lambada_openai
- CShorten/ML-ArXiv-Papers
- WaltonFuture/InstructionGPT-4
- open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B
- seansullivan/INT-Business-Syllabus
- theoldmandthesea/17k_business_book
- SunRise228/business-doc
- gauravshrm211/VC-startup-evaluation-for-investment
- TuningAI/Startups_V1
- TuningAI/Startups_V2
- AdiOO7/llama-2-finance
- scillm/scientific_papers
- gokuls/wiki_book_corpus_complete_processed_bert_dataset
- the_pile_books3
- go_emotions
- yizhongw/self_instruct
- codeparrot/self-instruct-starcoder
- Amani27/massive_translation_dataset
- huggingface/transformers-metadata
- hf-internal-testing/transformers-metadata
- commonsense_qa
- nlplabtdtu/test-edu-crawl
- kernelmachine/open-license-corpus
- BDas/EnglishNLPDataset
- CyberNative/github_cybersecurity_READMEs
- thomwolf/github-python
- CM/codexglue_code2text_java
- autoevaluate/autoeval-staging-eval-project-glue-f16e6c43-14015917
- lemonteaa/algorithmic-reasoning-seed
- EmpathyFirstMedia/algolia
- vicgalle/alpaca-gpt4
- pariajm/sharif_emotional_speech_dataset
- lighteval/synthetic_reasoning_natural
- jxu124/llava_complex_reasoning_77k
- bibidentuhanoi/gideon_self_cognition_text
- ohilikeit/empathetic_dialogues_mutli_turn_ko
- KevinZ/psycholinguistic_eval
- fiveflow/psychology-dataset
- shahidul034/text_generation_model_data
- qwedsacf/story-generation
- EnigmaOfTheWorld/b-mc2-sql-create-context
- HuggingFaceH4/testing_self_instruct_small
- RUCAIBox/Data-to-text-Generation
- Fhrozen/AudioSet2K22
- Chr0my/Epidemic_sounds
- ChristophSchuhmann/lyrics-index
- Cropinky/rap_lyrics_english
- tsterbak/eurovision-lyrics-1956-2023
- brunokreiner/genius-lyrics
- google/MusicCaps
- ccmusic-database/music_genre
- Hyeon2/riffusion-musiccaps-dataset
- SamAct/autotrain-data-musicprompt
- Chr0my/Epidemic_music
- juliensimon/autonlp-data-song-lyrics
- Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC
- Chr0my/freesound.org
- teticio/audio-diffusion-256
- KELONMYOSA/dusha_emotion_audio
- Ar4ikov/iemocap_audio_text_splitted
- flexthink/ljspeech
- mozilla-foundation/common_voice_13_0
- facebook/voxpopuli
- SocialGrep/one-million-reddit-jokes
- breadlicker45/human-midi-rlhf
- breadlicker45/midi-gpt-music-small
- projectlosangeles/Los-Angeles-MIDI-Dataset
- huggingartists/epic-rap-battles-of-history
- SocialGrep/one-million-reddit-confessions
- autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606
- lmsys/chatbot_arena_conversations
- mozilla-foundation/common_voice_11_0
- mozilla-foundation/common_voice_4_0
- zZWipeoutZz/insane_style
- mu-llama/MusicQA
- RaphaelOlivier/whisper_adversarial_examples
- huggingartists/metallica
- vldsavelyev/guitar_tab
- NLPCoreTeam/humaneval_ru
- seungheondoh/audioset-music
- gary109/onset-singing3_corpora_parliament_processed_MIR-ST500
- LDD5522/Rock_Vocals
- huggingartists/rage-against-the-machine
- huggingartists/chester-bennington
- huggingartists/logic
- cmsolson75/artist_song_lyric_dataset
- BhavyaMuni/artist-lyrics
- vjain/emotional_intelligence
- mhenrichsen/context-aware-splits
language:
- en
- es
- it
- ru
- la
metrics:
- accuracy
- bertscore
- code_eval
- f1
- bleu
- perplexity
- mean_iou
- hyperml/balanced_accuracy
tags:
- code
- music
library_name: transformers
---
# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This model card aims to be a base template for new models. It was generated from [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed to make further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
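
Until the author supplies official instructions, the following is a minimal sketch based only on the `library_name: transformers` entry in this card's metadata. The repository ID `your-namespace/your-model-id` is a hypothetical placeholder (the card does not state one), and the generic `AutoModel` class is an assumption, since the model type is not documented here; a task-specific class may be more appropriate.

```python
from typing import Any, Tuple

# Hypothetical placeholder: this card does not state the repository ID.
MODEL_ID = "your-namespace/your-model-id"


def load_tokenizer_and_model(model_id: str) -> Tuple[Any, Any]:
    """Load a tokenizer and a model from the Hugging Face Hub.

    Assumes the repository ships a tokenizer and a config that the generic
    Auto* classes can resolve; this is not confirmed by the model card.
    """
    # Imported inside the function so that defining it does not require
    # transformers to be installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_tokenizer_and_model(MODEL_ID)` downloads the files on first use, so it needs network access and a real repository ID in place of the placeholder.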

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

[More Information Needed]


#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Data Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary



## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
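
As a rough illustration of the kind of estimate the calculator produces, the sketch below multiplies hardware power draw by training time and by the carbon intensity of the local grid. This is a simplified approximation, not the calculator's exact method, and every number in the usage example is hypothetical; none comes from this model's training run.

```python
def estimate_emissions_g(
    power_watts: float,
    hours: float,
    intensity_kg_per_kwh: float,
) -> float:
    """Estimate emissions in grams of CO2eq.

    energy (kWh) = power (W) / 1000 * time (h)
    emissions (g) = energy (kWh) * grid intensity (kg CO2eq / kWh) * 1000
    """
    if power_watts < 0 or hours < 0 or intensity_kg_per_kwh < 0:
        raise ValueError("all inputs must be non-negative")
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * intensity_kg_per_kwh * 1000.0


# Hypothetical inputs: one 250 W accelerator running for 100 hours
# on a grid emitting 0.5 kg CO2eq per kWh.
example_g = estimate_emissions_g(250, 100, 0.5)
print(f"{example_g:.0f} g CO2eq")  # 25 kWh * 0.5 kg/kWh = 12.5 kg = 12500 g
```

The result is reported in grams to match the unit this section asks for; the fields above (hardware type, hours used, compute region) are the inputs needed to replace the hypothetical values.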

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]