Lukekim committed on
Commit 46e4632 • 1 Parent(s): 0494871

Update README.md

  1. README.md +26 -0
README.md CHANGED
@@ -1,3 +1,29 @@
  ---
  license: apache-2.0
  ---
+
+ This is a DeBERTa-v2 model pre-trained on a large corpus of Korean patent data.
+
+ Pre-training used text drawn mainly from the abstract, claims, and description sections of patent documents.
+
+ It is a Korean language model that can be used for tasks such as computing patent document embeddings or classifying patent documents.
+
+ ## Patent Text Embedding Computation Example
+
+ ```
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ # Sample Korean patent abstract (a patent search system and method), followed by its keywords.
+ patent_abstract = '''본 발명은 특허 검색 시스템 및 검색 방법에 관한 것으로, 보다 자세하게는 입력한 검색어의 동의어를 제공, 검색어를 자동으로 번역하여 국가에 상관없이 검색을 가능토록 하거나 대분류, 중분류, 소분류 등 분류한 검색어를 조합하여 검색을 행함으로써, 효율적인 선행기술을 검색할 수 있도록 하는 특허 검색 시스템 및 검색 방법에 관한 것이다.
+ 특허 검색, 유사도, 키워드 추출, 검색식 '''
+
+ tokenizer = AutoTokenizer.from_pretrained("axiomlabs/KR-patent-deberta-large")
+
+ encoded_inputs = tokenizer(patent_abstract, max_length=512, truncation=True, padding="max_length", return_tensors="pt")
+
+ model = AutoModel.from_pretrained("axiomlabs/KR-patent-deberta-large")
+
+ model.eval()
+
+ with torch.no_grad():
+     outputs = model(**encoded_inputs)[0][:, 0, :]  # CLS pooling: hidden state of the first token
+     print(outputs.shape)  # [1, 2048]
+ ```
+
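
The README lists embedding computation as an intended use, and the sample abstract mentions patent search and similarity. As a rough illustration of how such embeddings might be compared, the sketch below ranks candidate vectors by cosine similarity to a query vector. It is not part of this commit or of the `transformers` API: the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the short toy vectors stand in for real model outputs (which could presumably be converted to plain lists with `outputs[0].tolist()`).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real patent embeddings (assumption).
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_by_similarity(query, candidates)
print(ranking)  # [1, 2, 0]
```

Cosine similarity ignores vector length, so a candidate pointing in the same direction as the query ranks first regardless of its magnitude. For large collections, a batched tensor operation or a vector index would likely be more practical than this pure-Python loop.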