IsmaelMousa committed on
Commit d926cf1
1 Parent(s): 81a551d

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
  ## Byte Level (BPE) Tokenizer for Arabic
 
  Byte Level Tokenizer for Arabic, a robust tokenizer designed to handle Arabic text with precision and efficiency.
- This tokenizer utilizes a `Byte-Pair Encoding (BPE)` approach to create a vocabulary of `32,000` tokens, catering specifically to the intricacies of the Arabic language.
+ This tokenizer utilizes a `Byte-Pair Encoding (BPE)` approach to create a vocabulary of `50,000` tokens, catering specifically to the intricacies of the Arabic language.
 
  ### Goal
 
@@ -25,7 +25,7 @@ While there are Arabic-only tokenizers and multilingual BPE tokenizers, a dedica
  ### Checkpoint Information
 
  - **Name**: `IsmaelMousa/arabic-bpe-tokenizer`
- - **Vocabulary Size**: `32,000`
+ - **Vocabulary Size**: `50,000`
 
  ### Overview
 