---
language: en
tags:
- exbert
license: apache-2.0
datasets:
- bookcorpus
- wikipedia
---

# VGCN-BERT (DistilBERT based, uncased)

This is a VGCN-BERT model based on [DistilBERT-base-uncased](https://huggingface.co/distilbert-base-uncased). The original paper is [VGCN-BERT](https://arxiv.org/abs/2004.05707).

### How to use

- First, prepare the WGraph (a symmetric adjacency matrix):

```python
import transformers as tfr
from transformers.models.vgcn_bert.modeling_graph import WordGraph

tokenizer = tfr.AutoTokenizer.from_pretrained(
    "zhibinlu/vgcn-bert-distilbert-base-uncased"
)

# 1st method: build the graph with the NPMI statistical method from the training corpus
wgraph = WordGraph(rows=train_valid_df["text"], tokenizer=tokenizer)

# 2nd method: build the graph from pre-defined entity-relation tuples with weights
entity_relations = [
    ("dog", "labrador", 0.6),
    ("cat", "garfield", 0.7),
    ("city", "montreal", 0.8),
    ("weather", "rain", 0.3),
]
wgraph = WordGraph(rows=entity_relations, tokenizer=tokenizer)
```
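
For intuition, the 1st method scores word pairs by NPMI (normalized pointwise mutual information) and keeps the strongest links as symmetric edges. Below is a rough, self-contained sketch of that idea over document-level co-occurrence; it is *not* `WordGraph`'s actual implementation (which works on tokenizer ids and sparse matrices), and the helper name is hypothetical:

```python
# Hypothetical illustration of NPMI-based graph building (not the library code).
import math
from collections import Counter
from itertools import combinations

def npmi_adjacency(docs, threshold=0.0):
    """Return a symmetric {(w1, w2): weight} adjacency from co-occurrence NPMI."""
    word_counts = Counter()
    pair_counts = Counter()
    n_docs = len(docs)
    for doc in docs:
        words = set(doc.lower().split())          # document-level co-occurrence
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    adj = {}
    for pair, c_ij in pair_counts.items():
        w1, w2 = sorted(pair)
        p_ij = c_ij / n_docs
        p_i = word_counts[w1] / n_docs
        p_j = word_counts[w2] / n_docs
        if p_ij < 1.0:                            # NPMI undefined when log(p_ij) == 0
            npmi = math.log(p_ij / (p_i * p_j)) / -math.log(p_ij)
            if npmi > threshold:                  # keep only positively associated pairs
                adj[(w1, w2)] = adj[(w2, w1)] = npmi
    return adj

docs = ["rain in montreal", "montreal rain today", "sunny day today"]
adj = npmi_adjacency(docs)
# "montreal" and "rain" always co-occur here, so their NPMI is 1.0
```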

- Then instantiate the VGCN-BERT model with your WGraphs (multiple graphs are supported):

```python
from transformers.models.vgcn_bert.modeling_vgcn_bert import VGCNBertModel

model = VGCNBertModel.from_pretrained(
    "zhibinlu/vgcn-bert-distilbert-base-uncased",
    trust_remote_code=True,
    wgraphs=[wgraph.to_torch_sparse()],
    wgraph_id_to_tokenizer_id_maps=[wgraph.wgraph_id_to_tokenizer_id_map],
)

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)
```

## Fine-tune model

It's better to fine-tune the VGCN-BERT model for your specific downstream task.
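
One common pattern is to put a small classification head on the encoder's output and train end to end. The sketch below shows only that head and one training step, with a random tensor standing in for `model(**encoded_input).last_hidden_state`; the head architecture, hidden size (768, as in DistilBERT), and hyperparameters are illustrative assumptions, not the paper's recipe:

```python
# Minimal fine-tuning sketch (assumptions: hidden size 768, first token pooled).
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Pools the first-token position and maps it to class logits."""
    def __init__(self, hidden_size: int = 768, num_labels: int = 2):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        cls = last_hidden_state[:, 0]            # [batch, hidden]
        return self.classifier(self.dropout(cls))

head = ClassificationHead()
optimizer = torch.optim.AdamW(head.parameters(), lr=5e-5)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for `model(**encoded_input).last_hidden_state`:
hidden = torch.randn(4, 16, 768)
labels = torch.tensor([0, 1, 0, 1])

logits = head(hidden)                            # [4, 2]
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In practice you would pass the real encoder output through the head, include `model.parameters()` in the optimizer so the encoder is updated too, and loop over batches of your labeled data.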
|