fix tabs
app.py CHANGED
@@ -50,11 +50,10 @@ tokenizer_names_to_test = [
 
 with st.sidebar:
 
-
-
+    st.header('All languages are NOT created (tokenized) equal!')
+    link="This project compares the tokenization length for different languages. For some tokenizers, tokenizing a message in one language may result in 10-20x more tokens than a comparable message in another language (e.g. try English vs. Burmese). This is part of a larger project of measuring inequality in NLP. See the original article: [All languages are NOT created (tokenized) equal](https://www.artfish.ai/p/all-languages-are-not-created-tokenized) on [Art Fish Intelligence](https://www.artfish.ai/)."
     st.markdown(link)
-
-	st.divider()
+    st.divider()
 
     st.subheader('Tokenizer')
     # TODO multi-select tokenizers
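The sidebar text added above claims a 10-20x token-count gap between comparable messages in different languages (e.g. English vs. Burmese). A minimal sketch of that comparison, not taken from the Space's code: the tokenizer name, the Burmese sample sentence, and the use of transformers' AutoTokenizer are illustrative assumptions.

# Sketch (assumed, not from app.py): count tokens for roughly equivalent messages
# in two languages and compare the lengths.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")  # assumed example model

english = "Hello, how are you today?"
burmese = "မင်္ဂလာပါ၊ ဒီနေ့ နေကောင်းလား။"  # roughly the same greeting in Burmese

for label, text in [("English", english), ("Burmese", burmese)]:
    n_tokens = len(tokenizer.encode(text, add_special_tokens=False))
    print(f"{label}: {n_tokens} tokens")

For tokenizers whose vocabularies are dominated by Latin-script text, the Burmese message tends to split into many more tokens than the English one, which is the disparity the Space visualizes.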
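The hunk's "# TODO multi-select tokenizers" presumably means letting users pick several entries from the tokenizer_names_to_test list named in the hunk header. A hypothetical sketch using Streamlit's st.multiselect; the list contents and the default selection below are placeholders, not the Space's actual values.

# Hypothetical sketch of the TODO item; tokenizer names are placeholders.
import streamlit as st

tokenizer_names_to_test = [
    "gpt2",
    "xlm-roberta-base",
    "facebook/nllb-200-distilled-600M",
]

with st.sidebar:
    selected_tokenizers = st.multiselect(
        "Tokenizers to compare",
        options=tokenizer_names_to_test,
        default=tokenizer_names_to_test[:1],
    )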