Token Classification · Transformers · Safetensors · bert · Inference Endpoints
Xmm committed
Commit b07e637
1 parent: ece567b

Update README.md

Files changed (1)
  1. README.md +21 -47
README.md CHANGED
@@ -107,6 +107,18 @@ language:
 license: apache-2.0
 datasets:
 - wikipedia
+examples:
+widget:
+- text: "মারভিন দি মারসিয়ান"
+  example_title: "Sentence_1"
+- text: "লিওনার্দো দা ভিঞ্চি"
+  example_title: "Sentence_2"
+- text: "বসনিয়া ও হার্জেগোভিনা"
+  example_title: "Sentence_3"
+- text: "সাউথ ইস্ট ইউনিভার্সিটি"
+  example_title: "Sentence_4"
+- text: "মানিক বন্দ্যোপাধ্যায় লেখক"
+  example_title: "Sentence_5"
 ---
 
 # BERT multilingual base model (cased)
@@ -151,55 +163,17 @@ generation you should look at model like GPT2.
 
 ### How to use
 
-You can use this model directly with a pipeline for masked language modeling:
+You can use this model directly with a pipeline for named entity recognition:
 
 ```python
->>> from transformers import pipeline
->>> unmasker = pipeline('fill-mask', model='bert-base-multilingual-cased')
->>> unmasker("Hello I'm a [MASK] model.")
-
-[{'sequence': "[CLS] Hello I'm a model model. [SEP]",
-  'score': 0.10182085633277893,
-  'token': 13192,
-  'token_str': 'model'},
- {'sequence': "[CLS] Hello I'm a world model. [SEP]",
-  'score': 0.052126359194517136,
-  'token': 11356,
-  'token_str': 'world'},
- {'sequence': "[CLS] Hello I'm a data model. [SEP]",
-  'score': 0.048930276185274124,
-  'token': 11165,
-  'token_str': 'data'},
- {'sequence': "[CLS] Hello I'm a flight model. [SEP]",
-  'score': 0.02036019042134285,
-  'token': 23578,
-  'token_str': 'flight'},
- {'sequence': "[CLS] Hello I'm a business model. [SEP]",
-  'score': 0.020079681649804115,
-  'token': 14155,
-  'token_str': 'business'}]
-```
-
-Here is how to use this model to get the features of a given text in PyTorch:
-
-```python
-from transformers import BertTokenizer, BertModel
-tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
-model = BertModel.from_pretrained("bert-base-multilingual-cased")
-text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='pt')
-output = model(**encoded_input)
-```
-
-and in TensorFlow:
-
-```python
-from transformers import BertTokenizer, TFBertModel
-tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
-model = TFBertModel.from_pretrained("bert-base-multilingual-cased")
-text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='tf')
-output = model(encoded_input)
+from transformers import AutoTokenizer, AutoModelForTokenClassification
+from transformers import pipeline
+tokenizer = AutoTokenizer.from_pretrained("orgcatorg/bert-base-multilingual-cased-ner")
+model = AutoModelForTokenClassification.from_pretrained("orgcatorg/bert-base-multilingual-cased-ner")
+nlp = pipeline("ner", model=model, tokenizer=tokenizer)
+example = "মারভিন দি মারসিয়ান"
+ner_results = nlp(example)
+ner_results
 ```
 
 ## Training data
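The `ner` pipeline added in this commit returns one prediction dict per token, with B-/I- prefixed labels. As a minimal sketch of what consuming that output looks like, the snippet below groups contiguous tokens of the same entity type into spans. The `sample` list is hand-written illustrative data in the shape the pipeline returns (real labels, scores, and offsets come from the model, which would need to be downloaded to run the snippet above).

```python
# Illustrative, hand-written sample in the shape returned by the "ner"
# pipeline; a real run of nlp(example) produces this structure.
sample = [
    {"word": "মারভিন", "entity": "B-PER", "score": 0.99, "start": 0, "end": 6},
    {"word": "দি", "entity": "I-PER", "score": 0.98, "start": 7, "end": 9},
    {"word": "মারসিয়ান", "entity": "I-PER", "score": 0.97, "start": 10, "end": 19},
]

def group_entities(tokens):
    """Merge consecutive B-/I- tokens of the same type into entity spans."""
    groups = []
    for tok in tokens:
        tag, _, etype = tok["entity"].partition("-")
        # A B- tag, or a type change, starts a new span; an I- tag of the
        # same type extends the current one.
        if tag == "B" or not groups or groups[-1]["type"] != etype:
            groups.append({"type": etype, "words": [tok["word"]],
                           "start": tok["start"], "end": tok["end"]})
        else:
            groups[-1]["words"].append(tok["word"])
            groups[-1]["end"] = tok["end"]
    return [{"type": g["type"], "text": " ".join(g["words"]),
             "start": g["start"], "end": g["end"]} for g in groups]

print(group_entities(sample))  # one PER span covering the whole example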