Commit 0575d5b by 1-800-BAD-CODE
Parent(s): 0c649bf
Update README.md
README.md
CHANGED
@@ -108,7 +108,7 @@ This scheme captures acronyms, e.g., "NATO", as well as bi-capitalized words, e.
 This model predicts the following set of "post" punctuation tokens:
 
 | Token | Description | Relavant Languages |
-|
+| ---: | :---------- | :----------- |
 | . | Latin full stop | Many |
 | , | Latin comma | Many |
 | ? | Latin question mark | Many |
@@ -117,7 +117,7 @@ This model predicts the following set of "post" punctuation tokens:
 | 。 | Full-width full stop | Chinese, Japanese |
 | 、 | Ideographic comma | Chinese, Japanese |
 | ・ | Middle dot | Japanese |
-| । | Danda | Hindi |
+| । | Danda | Hindi, Bengali, Oriya |
 | ؟ | Arabic question mark | Arabic |
 | ; | Greek question mark | Greek |
 | ። | Ethiopic full stop | Amharic |
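
The first hunk replaces the line under the table header with a delimiter row (`| ---: | :---------- | :----------- |`). GitHub Flavored Markdown only recognizes a table when the header is followed by such a row, and the colons set the column alignment. The sketch below shows how that row could be interpreted. It is a minimal illustration, not the parser of any particular renderer, and the helper names `split_row` and `column_alignments` are hypothetical.

```python
import re
from typing import List, Optional

# A delimiter cell is one or more hyphens with an optional colon on either side.
_DELIM_CELL = re.compile(r"^:?-+:?$")


def split_row(line: str) -> List[str]:
    """Split a pipe-delimited table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def column_alignments(delimiter_row: str) -> Optional[List[str]]:
    """Return the alignment of each column, or None if this is not a delimiter row."""
    cells = split_row(delimiter_row)
    if not all(_DELIM_CELL.match(cell) for cell in cells):
        return None
    alignments = []
    for cell in cells:
        if cell.startswith(":") and cell.endswith(":"):
            alignments.append("center")
        elif cell.endswith(":"):
            alignments.append("right")
        elif cell.startswith(":"):
            alignments.append("left")
        else:
            alignments.append("default")
    return alignments


if __name__ == "__main__":
    print(column_alignments("| ---: | :---------- | :----------- |"))
    print(column_alignments("|"))
```

Under these assumptions the added row right-aligns the Token column and left-aligns the other two, while a line holding only `|` is not a delimiter row at all.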