xlm-mlm-en-2048
Table of Contents
- Model Details
- Uses
- Bias, Risks, and Limitations
- Training
- Evaluation
- Environmental Impact
- Citation
- Model Card Authors
- How to Get Started with the Model
Model Details
The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. It’s a transformer pretrained with one of three objectives: causal language modeling (CLM; next-token prediction), masked language modeling (MLM; BERT-like), or translation language modeling (TLM; an extension of BERT’s MLM to multiple language inputs). This model was trained with the masked language modeling objective on English text.
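To make the BERT-like MLM objective concrete, here is a minimal sketch of how training inputs can be corrupted. It assumes the usual BERT-style recipe (about 15% of positions are selected; of those, 80% are replaced by a mask token, 10% by a random token, and 10% are left unchanged). This card does not state the exact procedure used for this checkpoint, and the function name, the `[MASK]` placeholder, and the toy vocabulary are illustrative only.

```python
import random


def mask_tokens(tokens, vocab, mask_token="[MASK]", select_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels) for BERT-style masked language modeling.

    labels[i] holds the original token at positions the model must predict
    and None everywhere else. `mask_token` is a placeholder, not the real
    mask token of any particular tokenizer.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for token in tokens:
        if rng.random() >= select_prob:
            # Position not selected: keep the token, nothing to predict.
            corrupted.append(token)
            labels.append(None)
            continue
        labels.append(token)
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(mask_token)         # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            corrupted.append(token)              # 10%: keep the original token
    return corrupted, labels
```

During training, the model would see `corrupted_tokens` and be scored only on the positions where `labels` is not `None`.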
Model Description
- Developed by: Researchers affiliated with Facebook AI, see associated paper and GitHub Repo
- Model type: Language model
- Language(s) (NLP): English
- License: CC-BY-NC-4.0
- Related Models: Other XLM models
- Resources for more information:
- Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau (2019)
- Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al. (2020)
- GitHub Repo
- Hugging Face XLM docs
Uses
Direct Use
The model is a language model that can be used for masked language modeling, i.e., predicting tokens that have been masked out of English text.
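As a rough illustration of direct use, the sketch below runs the Transformers `fill-mask` pipeline on this checkpoint. It assumes the checkpoint loads under that pipeline in your installed version of `transformers`. The helper names `insert_mask` and `fill_mask` and the `{mask}` placeholder convention are made up for this example. The mask token is read from the tokenizer rather than hardcoded.

```python
def insert_mask(template, mask_token, slot="{mask}"):
    """Replace the single `slot` placeholder in `template` with `mask_token`."""
    if template.count(slot) != 1:
        raise ValueError(f"template must contain exactly one {slot!r} placeholder")
    return template.replace(slot, mask_token)


def fill_mask(template, model_name="xlm-mlm-en-2048", top_k=5):
    """Return the top_k (token, score) predictions for the masked slot."""
    # Imported lazily so insert_mask stays usable without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_name)
    # Read the mask token from the tokenizer instead of hardcoding it.
    text = insert_mask(template, unmasker.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in unmasker(text, top_k=top_k)]
```

Calling `fill_mask("Hello, my {mask} is cute.")` downloads the checkpoint on first use and returns candidate tokens with their scores.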
Downstream Use
To learn more about this task and potential downstream uses, see the Hugging Face fill mask docs and the Hugging Face Multilingual Models for Inference docs. Also see the associated paper.
Out-of-Scope Use
The model should not be used to intentionally create hostile or alienating environments for people.
Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)).
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
Training
More information needed. See the associated GitHub Repo.
Evaluation
More information needed. See the associated GitHub Repo.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: More information needed
- Hours used: More information needed
- Cloud Provider: More information needed
- Compute Region: More information needed
- Carbon Emitted: More information needed
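Once the fields above are known, a back-of-the-envelope estimate can be computed as energy consumed times the carbon intensity of the local grid. The sketch below is generic arithmetic of that kind, not the calculator's actual implementation. The function name, its parameters, and every number in the usage note are hypothetical placeholders, since this card does not report the real values.

```python
def estimate_emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh,
                          num_devices=1, pue=1.0):
    """Estimate emissions in kg CO2eq as energy (kWh) times grid carbon intensity.

    power_watts: assumed average draw of one device (from the hardware type)
    hours: training time per device
    carbon_intensity_kg_per_kwh: depends on the cloud provider and compute region
    pue: data-center power usage effectiveness (1.0 means no overhead)
    """
    energy_kwh = power_watts / 1000.0 * hours * num_devices * pue
    return energy_kwh * carbon_intensity_kg_per_kwh
```

For example, with made-up inputs of 8 devices drawing 300 W each for 100 hours on a grid emitting 0.4 kg CO2eq per kWh, `estimate_emissions_kg(300, 100, 0.4, num_devices=8)` gives 240 kWh × 0.4 = 96 kg CO2eq.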
Citation
BibTeX:
@article{lample2019cross,
title={Cross-lingual language model pretraining},
author={Lample, Guillaume and Conneau, Alexis},
journal={arXiv preprint arXiv:1901.07291},
year={2019}
}
APA:
- Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.
Model Card Authors
This model card was written by the team at Hugging Face.
How to Get Started with the Model
Use the code below to get started with the model. See the Hugging Face XLM docs for more examples.
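import torch
from transformers import XLMTokenizer, XLMModel

# Load the pretrained tokenizer and base model (no language modeling head).
tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
model = XLMModel.from_pretrained("xlm-mlm-en-2048")

# Tokenize a sentence into PyTorch tensors and run a forward pass without gradients.
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Final-layer hidden states, shape (batch_size, sequence_length, hidden_size).
last_hidden_states = outputs.last_hidden_state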