---
license: apache-2.0
base_model: t5-small
language:
- en
- fr
- de
- es
---
**Topical** is a small language model specialized for topic extraction. Given a document, Pleias-Topic-Deduction returns a main topic that can be used for downstream tasks such as annotation or embedding indexing.

Like other models from the PleIAs Bad Data Toolbox, Topical has been deliberately trained on 70,000 documents extracted from Common Corpus with a wide range of digitization artifacts.

Topical is a lightweight model (70 million parameters) that is especially suited to classification at scale on a large corpus.

## Example
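A minimal sketch of corpus-scale topic annotation, under several assumptions that this card does not confirm: the repository ID `PleIAs/Topical` is a placeholder, the checkpoint is assumed to load as a standard T5-style seq2seq model through `transformers`, and the raw document is assumed to be passed without any task prefix. The helper names (`batched`, `load_topic_generator`, `annotate_corpus`) and the length limits are illustrative only.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `batch_size` documents."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_topic_generator(
    model_id: str = "PleIAs/Topical",  # ASSUMPTION: placeholder repository ID
    max_new_tokens: int = 32,          # ASSUMPTION: topics are short strings
) -> Callable[[List[str]], List[str]]:
    """Return a function mapping a batch of documents to generated topics."""
    # Imported lazily so the batching helpers work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    def generate(texts: List[str]) -> List[str]:
        # ASSUMPTION: no task prefix; long inputs are truncated to 512 tokens.
        inputs = tokenizer(
            texts,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=512,
        )
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)

    return generate


def annotate_corpus(
    documents: Iterable[str],
    generate: Callable[[List[str]], List[str]],
    batch_size: int = 16,
) -> Iterator[Tuple[str, str]]:
    """Stream (document, topic) pairs, processing the corpus batch by batch."""
    for batch in batched(documents, batch_size):
        for doc, topic in zip(batch, generate(batch)):
            yield doc, topic
```

Typical use would be `generate = load_topic_generator()` followed by iterating over `annotate_corpus(corpus, generate)`. Because the corpus is consumed as a stream, memory use depends on the batch size rather than on the corpus size.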