---
license: mit
language:
- en
tags:
- schema
- word-embeddings
- embeddings
- unsupervised-learning
- tables
- web-table
- schema-data
---

# Pre-trained Web Table Embeddings

The models here represent schema terms and instance data terms in a semantic vector space, which makes them especially useful for representing schema and class information and for ML tasks on tabular text data.

The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).

## Quick Start

You can install the `table_embeddings` package to encode text from tables by running the following commands:

```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```

After that, you can encode text with the following Python snippet:

```python
from table_embeddings import TableEmbeddingModel

# Load the 150-dimensional W-combo model
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo150')
# Embed a single header (schema) term
embedding = model.get_header_vector('headline')
```
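
Once two header terms have been embedded, a common way to compare them is cosine similarity. The following is a minimal sketch, not part of the `table_embeddings` package: the `cosine_similarity` helper and the toy vectors are hypothetical, and it assumes that `get_header_vector` returns a sequence of floats (for example a NumPy array), which the text above does not state.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional stand-ins (made-up numbers). With a loaded model, the
# vectors would instead come from calls such as
# model.get_header_vector('headline') and model.get_header_vector('title').
headline_vec = [0.2, 0.7, 0.1]
title_vec = [0.25, 0.6, 0.05]
print(cosine_similarity(headline_vec, title_vec))
```

Values close to 1 indicate that the two vectors point in nearly the same direction. The helper only iterates over its inputs, so it should work with plain lists as well as array-like objects.
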

## Model Types

| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150)) |
| W-row | Model of row-wise relations in tables | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150)) |
| W-combo | Model of row-wise relations and relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150)) |
| W-plain | Model of row-wise relations in tables without pre-processing | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150)) |
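
The download links in the table follow a regular naming pattern, `ddrg/web_table_embeddings_<type><dim>`. As a small sketch of how a model identifier could be built from a model type and a dimension, the hypothetical `model_id` helper below encodes that pattern. The helper and its constants are illustrative assumptions inferred from the links, not part of the `table_embeddings` package.

```python
# Hypothetical helper: the naming pattern is inferred from the download
# links in the table, not taken from the table_embeddings package.
MODEL_TYPES = {
    "W-tax": "tax",
    "W-row": "row",
    "W-combo": "combo",
    "W-plain": "plain",
}
DIMENSIONS = (64, 150)


def model_id(model_type: str, dim: int) -> str:
    """Build the model identifier for a model type and embedding size."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dim!r}")
    return f"ddrg/web_table_embeddings_{MODEL_TYPES[model_type]}{dim}"


print(model_id("W-combo", 150))  # ddrg/web_table_embeddings_combo150
```

The resulting string has the same form as the identifier passed to `TableEmbeddingModel.load_model` in the Quick Start.
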

## More Information

For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).

More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):

```
@inproceedings{gunther2021pre,
  title={Pre-Trained Web Table Embeddings for Table Discovery},
  author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
  booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
  pages={24--31},
  year={2021}
}
```