This is a BERT-based model fine-tuned for Named Entity Recognition (NER) tasks in Indonesian.
The model is trained to identify and classify named entities such as persons, organizations, locations, and other relevant entities in Indonesian text.
Base model: cahya/bert-base-indonesian-1.5G
The base model is BERT Base Indonesian (uncased). Full details of its pre-training data are available on its model card.
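This fine-tuned model is intended for named entity recognition on Indonesian text. The following example loads the model and prints a predicted label for each token: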
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "nahiar/BERT-NER"  # replace with your Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

text = "Presiden Joko Widodo berkunjung ke Jakarta untuk bertemu dengan Gubernur Anies Baswedan."

# Tokenize the text and run inference without tracking gradients
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring label for each token position
predictions = torch.argmax(outputs.logits, dim=2)

# Map token IDs and label IDs back to strings (the batch holds one sentence)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
labels = [model.config.id2label[label_id] for label_id in predictions[0].tolist()]

print("Tokens:", tokens)
print("Labels:", labels)
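The tokens and labels printed above are per sub-word, so a name split into several WordPiece pieces shows up as several entries. The sketch below shows one possible way to merge them back into whole entities. It assumes the model emits BIO-style tags such as `B-PER` / `I-PER`, which this card does not confirm (check `model.config.id2label`). The helper names `merge_wordpieces` and `group_entities` are hypothetical, and the sample tokens and labels are illustrative, not real model output.

```python
def merge_wordpieces(tokens, labels, special_tokens=("[CLS]", "[SEP]", "[PAD]")):
    """Merge WordPiece pieces ("##...") into words, keeping the first piece's label."""
    words, word_labels = [], []
    for token, label in zip(tokens, labels):
        if token in special_tokens:
            continue  # skip BERT's special tokens
        if token.startswith("##") and words:
            words[-1] += token[2:]  # continuation piece: glue onto the previous word
        else:
            words.append(token)
            word_labels.append(label)
    return words, word_labels


def group_entities(words, word_labels):
    """Group consecutive B-/I- tagged words into (text, type) pairs (assumes BIO tags)."""
    entities = []
    current_words, current_type = [], None
    for word, label in zip(words, word_labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts; flush the one in progress, if any
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], entity_type
        elif prefix == "I":
            current_words.append(word)  # same entity continues
        else:
            # "O" or an unrecognised tag ends the current entity
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


# Illustrative input only: a hypothetical tokenization and labeling of one sentence
sample_tokens = ["[CLS]", "presiden", "joko", "wi", "##dodo", "ke", "jakarta", "[SEP]"]
sample_labels = ["O", "O", "B-PER", "I-PER", "I-PER", "O", "B-LOC", "O"]

words, word_labels = merge_wordpieces(sample_tokens, sample_labels)
print(group_entities(words, word_labels))
```

Keeping only the first piece's label is a simple convention; averaging scores across pieces is another option. The `transformers` token-classification pipeline offers built-in grouping through its `aggregation_strategy` argument if a hand-rolled helper is not needed.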