SquanchNastyAI / README.md

	---
	license: openrail
	datasets:
	- the_pile_openwebtext2
	- semeru/code-code-CodeCompletion-TokenLevel-Python
	- pacovaldez/stackoverflow-questions
	- AhmedSSoliman/CodeSearchNet-py
	- irds/codesearchnet
	- bigscience-catalogue-data-dev/lm_code_github-eval_subset
	- codeparrot/github-code
	- nchen909/bigclonebench-processed
	- Open-Orca/OpenOrca
	- fka/awesome-chatgpt-prompts
	- openchat/openchat_sharegpt4_dataset
	- bookcorpus
	- bookcorpusopen
	- nRuaif/OpenOrca-GPT3.5
	- giganticode/java-cmpx-v1
	- nickrosh/Evol-Instruct-Code-80k-v1
	- bigcode/starcoderdata
	- bigcode/the-stack
	- bigcode/the-stack-smol
	- Cdaprod/AI-Developer-Prompts
	- code_x_glue_ct_code_to_text
	- codeparrot/github-code-clean
	- code_x_glue_cc_code_completion_line
	- autoevaluate/autoeval-eval-jeffdshen__inverse_superglue_mixedp1-jeffdshen__inverse-63643c-1665558893
	- bentrevett/multi30k
	- edbeeching/decision_transformer_gym_replay
	- psyche/common_crawl
	- Birchlabs/openai-prm800k-solutions-only
	- cjvt/slownet
	- para_crawl
	- zeroshot/twitter-financial-news-sentiment
	- laugustyniak/political-advertising-pl
	- code_search_net
	- sukaka/novelai-webui
	- P1ayer-1/chatgpt-conversations-chatlogs.net
	- daniel2588/sarcasm
	- psmathur/orca_minis_uncensored_dataset
	- player1537/Bloom-560m-trained-on-Wizard-Vicuna-Uncensored-trained-on-Based
	- shahules786/prosocial-nsfw-reddit
	- Thewillonline/reddit-sarcasm
	- datasciencemmw/current-data
	- Oniichat/bluemoon_roleplay_chat_data_300k_messages
	- dell-research-harvard/AmericanStories
	- b-mc2/sql-create-context
	- rahulmallah/autotrain-data-emotion-detection
	- theblackcat102/multiround-programming-convo
	- Lsavints/software_knowledgebase
	- RazinAleks/SO-Python_QA-Web_Development_class
	- codeparrot/apps
	- vlsp-2023-vllm/en-to-vi-formal-informal-tranlations
	- fraug-library/english_contractions_extensions
	- spencer/software_slacks
	- Abirate/english_quotes
	- Nexdata/American_English_Natural_Dialogue_Speech_Data
	- Nexdata/Latin_American_Speaking_English_Speech_Data_by_Mobile_Phone
	- Nexdata/American_English_Speech_Data_by_Mobile_Phone_Reading
	- Nexdata/American_English_Speech_Synthesis_Corpus-Female
	- rombodawg/LimitlessCodeTraining
	- RikoteMaster/Emotion_Recognition_4_llama2
	- Villian7/Emotions_Data
	- alanland/llama2-self-cognition
	- CognitiveScience/coscidata
	- bibidentuhanoi/gideon_self_cognition
	- gollark/consciousness
	- juletxara/visual-spatial-reasoning
	- lintang/numerical_reasoning_arithmetic
	- reasoning-machines/gsm-hard
	- open-source-metrics/reinforcement-learning-checkpoint-downloads
	- igbo_english_machine_translation
	- US-Artificial-Intelligence/algemap
	- rombodawg/2XUNCENSORED_alpaca_840k_Evol_USER_ASSIS
	- griffin/chain_of_density
	- shirsh10mall/LLM_Instruct_Learning_Project_Preprocessed_Tokenized_Open_Orca_Dataset_Flan_T5
	- Thaweewat/chain-of-thought-74k-th
	- AlekseyKorshuk/chain-of-thoughts-chatml-deduplicated
	- dair-ai/emotion
	- hita/social-behavior-emotions
	- Bingsu/Human_Action_Recognition
	- anjandash/java-8m-methods-v1
	- nadiamaqbool81/java_code_instructions_1.178k_alpaca
	- DavidMOBrien/8000-java
	- rombodawg/LimitlessCodeTraining_1k-Python-Javascript_GuanacoFormat
	- angie-chen55/javascript-github-code
	- kye/all-lucidrain-python-3
	- Fraser/python-state-changes
	- ammarnasr/the-stack-ruby-clean
	- ammarnasr/the-stack-rust-clean
	- seyyedaliayati/solidity-dataset
	- jkhedri/psychology-dataset
	- KonradSzafer/stackoverflow_linux
	- vikp/textbook_quality_programming
	- rombodawg/LosslessMegaCodeTrainingV3_MINI
	- BelleGroup/multiturn_chat_0.8M
	- smangrul/code-chat-assistant-v1
	- goendalf666/sales-textbook_for_convincing_and_selling
	- readerbench/ConversationalAgent-Ro
	- beurkinger/autotrain-data-human-action-recognition
	- jpwahle/autoencoder-paraphrase-dataset
	- jpwahle/autoregressive-paraphrase-dataset
	- teknium/GPT4-LLM-Cleaned
	- Anthropic/model-written-evals
	- openai_humaneval
	- kye/all-google-ai-python-code
	- kye/all-openai-github-code
	- EleutherAI/lambada_openai
	- CShorten/ML-ArXiv-Papers
	- WaltonFuture/InstructionGPT-4
	- open-llm-leaderboard/details_AIDC-ai-business__Marcoroni-70B
	- seansullivan/INT-Business-Syllabus
	- theoldmandthesea/17k_business_book
	- SunRise228/business-doc
	- gauravshrm211/VC-startup-evaluation-for-investment
	- TuningAI/Startups_V1
	- TuningAI/Startups_V2
	- AdiOO7/llama-2-finance
	- scillm/scientific_papers
	- gokuls/wiki_book_corpus_complete_processed_bert_dataset
	- the_pile_books3
	- go_emotions
	- yizhongw/self_instruct
	- codeparrot/self-instruct-starcoder
	- Amani27/massive_translation_dataset
	- huggingface/transformers-metadata
	- hf-internal-testing/transformers-metadata
	- commonsense_qa
	- nlplabtdtu/test-edu-crawl
	- kernelmachine/open-license-corpus
	- BDas/EnglishNLPDataset
	- CyberNative/github_cybersecurity_READMEs
	- thomwolf/github-python
	- CM/codexglue_code2text_java
	- autoevaluate/autoeval-staging-eval-project-glue-f16e6c43-14015917
	- lemonteaa/algorithmic-reasoning-seed
	- EmpathyFirstMedia/algolia
	- vicgalle/alpaca-gpt4
	- pariajm/sharif_emotional_speech_dataset
	- lighteval/synthetic_reasoning_natural
	- jxu124/llava_complex_reasoning_77k
	- bibidentuhanoi/gideon_self_cognition_text
	- ohilikeit/empathetic_dialogues_mutli_turn_ko
	- KevinZ/psycholinguistic_eval
	- fiveflow/psychology-dataset
	- shahidul034/text_generation_model_data
	- qwedsacf/story-generation
	- EnigmaOfTheWorld/b-mc2-sql-create-context
	- HuggingFaceH4/testing_self_instruct_small
	- RUCAIBox/Data-to-text-Generation
	- Fhrozen/AudioSet2K22
	- Chr0my/Epidemic_sounds
	- ChristophSchuhmann/lyrics-index
	- Cropinky/rap_lyrics_english
	- tsterbak/eurovision-lyrics-1956-2023
	- brunokreiner/genius-lyrics
	- google/MusicCaps
	- ccmusic-database/music_genre
	- Hyeon2/riffusion-musiccaps-dataset
	- SamAct/autotrain-data-musicprompt
	- Chr0my/Epidemic_music
	- juliensimon/autonlp-data-song-lyrics
	- Datatang/North_American_English_Speech_Data_by_Mobile_Phone_and_PC
	- Chr0my/freesound.org
	- teticio/audio-diffusion-256
	- KELONMYOSA/dusha_emotion_audio
	- Ar4ikov/iemocap_audio_text_splitted
	- flexthink/ljspeech
	- mozilla-foundation/common_voice_13_0
	- facebook/voxpopuli
	- SocialGrep/one-million-reddit-jokes
	- breadlicker45/human-midi-rlhf
	- breadlicker45/midi-gpt-music-small
	- projectlosangeles/Los-Angeles-MIDI-Dataset
	- huggingartists/epic-rap-battles-of-history
	- SocialGrep/one-million-reddit-confessions
	- autoevaluate/autoeval-eval-futin__guess-vi-4200fb-2012366606
	- lmsys/chatbot_arena_conversations
	- mozilla-foundation/common_voice_11_0
	- mozilla-foundation/common_voice_4_0
	- zZWipeoutZz/insane_style
	- mu-llama/MusicQA
	- RaphaelOlivier/whisper_adversarial_examples
	- huggingartists/metallica
	- vldsavelyev/guitar_tab
	- NLPCoreTeam/humaneval_ru
	- seungheondoh/audioset-music
	- gary109/onset-singing3_corpora_parliament_processed_MIR-ST500
	- LDD5522/Rock_Vocals
	- huggingartists/rage-against-the-machine
	- huggingartists/chester-bennington
	- huggingartists/logic
	- cmsolson75/artist_song_lyric_dataset
	- BhavyaMuni/artist-lyrics
	- vjain/emotional_intelligence
	- mhenrichsen/context-aware-splits
	language:
	- en
	- es
	- it
	- ru
	- la
	metrics:
	- accuracy
	- bertscore
	- code_eval
	- f1
	- bleu
	- perplexity
	- mean_iou
	- hyperml/balanced_accuracy
	tags:
	- code
	- music
	library_name: transformers
	---
	# Model Card for SquanchNastyAI

	<!-- Provide a quick summary of what the model is/does. -->

	This model card is a base template for new models. It was generated from [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->



	- Developed by: [More Information Needed]
	- Shared by [optional]: [More Information Needed]
	- Model type: [More Information Needed]
	- Language(s) (NLP): English, Spanish, Italian, Russian, Latin (from the `language` field of the card metadata)
	- License: openrail (from the `license` field of the card metadata)
	- Finetuned from model [optional]: [More Information Needed]

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: [More Information Needed]
	- Paper [optional]: [More Information Needed]
	- Demo [optional]: [More Information Needed]

	## Uses

	<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

	### Direct Use

	<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

	[More Information Needed]

	### Downstream Use [optional]

	<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

	[More Information Needed]

	### Out-of-Scope Use

	<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

	[More Information Needed]

	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	[More Information Needed]

	### Recommendations

	<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

	Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed before further recommendations can be made.

	## How to Get Started with the Model

	Use the code below to get started with the model.

	[More Information Needed]
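
The card does not yet include an official snippet. As a minimal sketch only: the metadata sets `library_name: transformers`, so the checkpoint may be loadable through the generic `Auto*` classes. The repository ID below is a placeholder (the card does not state the Hub namespace), and the architecture is not documented, so `AutoModel` is an assumption rather than the author's recommended class.

```python
# Hypothetical repository ID: the card does not state the Hub namespace,
# so replace "<namespace>" with the real owner before running.
REPO_ID = "<namespace>/SquanchNastyAI"


def load_model(repo_id: str = REPO_ID):
    """Load a tokenizer and model from the Hugging Face Hub.

    Assumes the repository holds a transformers-compatible checkpoint.
    The card does not confirm the architecture, so the generic AutoModel
    class is used instead of a task-specific head.
    """
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` downloads the files on first use. If the checkpoint turns out to be a text-generation model, `AutoModelForCausalLM` would likely be the better choice.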

	## Training Details

	### Training Data

	<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	[More Information Needed]

	### Training Procedure

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	#### Preprocessing [optional]

	[More Information Needed]


	#### Training Hyperparameters

	- Training regime: [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

	#### Speeds, Sizes, Times [optional]

	<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

	[More Information Needed]

	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->

	### Testing Data, Factors & Metrics

	#### Testing Data

	<!-- This should link to a Data Card if possible. -->

	[More Information Needed]

	#### Factors

	<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

	[More Information Needed]

	#### Metrics

	<!-- These are the evaluation metrics being used, ideally with a description of why. -->

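	The card metadata lists the following metrics: accuracy, BERTScore, code_eval, F1, BLEU, perplexity, mean IoU, and balanced accuracy (`hyperml/balanced_accuracy`). [More Information Needed] on how and why each is used.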

	### Results

	[More Information Needed]

	#### Summary



	## Model Examination [optional]

	<!-- Relevant interpretability work for the model goes here -->

	[More Information Needed]

	## Environmental Impact

	<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: [More Information Needed]
	- Hours used: [More Information Needed]
	- Cloud Provider: [More Information Needed]
	- Compute Region: [More Information Needed]
	- Carbon Emitted: [More Information Needed]
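
Once the fields above are known, a rough figure can be computed with the usual energy-times-carbon-intensity estimate. The sketch below is not the calculator's own code. The function name, the optional `pue` (data-center overhead) factor, and the example numbers are all illustrative assumptions.

```python
def estimate_co2eq_grams(power_watts: float, hours: float,
                         grid_g_per_kwh: float, pue: float = 1.0) -> float:
    """Estimate emissions in grams of CO2eq.

    power_watts    -- average hardware power draw (assumed, e.g. the GPU's TDP)
    hours          -- total hours the hardware was used
    grid_g_per_kwh -- carbon intensity of the compute region's grid
    pue            -- power usage effectiveness of the data center (1.0 = no overhead)
    """
    if min(power_watts, hours, grid_g_per_kwh) < 0 or pue < 1.0:
        raise ValueError("inputs must be non-negative and pue must be >= 1.0")
    # Energy consumed in kWh, scaled by data-center overhead.
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * grid_g_per_kwh


# Illustrative numbers only: one 300 W accelerator for 10 hours
# on a grid emitting 400 gCO2eq per kWh.
example_grams = estimate_co2eq_grams(300, 10, 400)
```

With these made-up inputs the estimate is about 1200 g CO2eq. Real values depend on the hardware, provider, and region reported in the list above.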

	## Technical Specifications [optional]

	### Model Architecture and Objective

	[More Information Needed]

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	[More Information Needed]

	#### Software

	[More Information Needed]

	## Citation [optional]

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

	[More Information Needed]

	## More Information [optional]

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]