Getting Started
Installation
- conda environment
conda env create --name NAME --file=environment.yaml
The project is organized around several scripts that simulate a typical machine learning workflow: data preparation, model training, evaluation, and inference. The google/t5-small model was trained on the above dataset for 10 epochs. Inference was then run on the evaluation data, and the performance metrics and evaluation results were stored in the result subdirectory of the project directory.
A Makefile is included so that each Python script can be run separately with the following commands.
make data
make train
make eval
make inference
run is a make target that runs the entire project in one step.
make run
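To make the stage order explicit, here is a minimal Python sketch of what an aggregate target such as make run presumably does: call the four targets one after another and stop at the first failure. The stage order is taken from the commands listed above; the function name run_pipeline and its runner hook are hypothetical, and the actual Makefile may be wired differently.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Assumed stage order, based on the make targets listed in this README.
STAGES = ("data", "train", "eval", "inference")


def run_pipeline(
    stages: Sequence[str] = STAGES,
    runner: Optional[Callable[[List[str]], None]] = None,
) -> List[List[str]]:
    """Run `make <stage>` for each stage in order and return the commands run.

    `runner` is a hypothetical hook so the function can be exercised without
    invoking make. By default each command runs through subprocess, and a
    non-zero exit raises CalledProcessError, which stops the pipeline.
    """
    if runner is None:
        def runner(cmd: List[str]) -> None:
            subprocess.run(cmd, check=True)

    executed = []
    for stage in stages:
        cmd = ["make", stage]
        runner(cmd)            # raises on failure, so later stages are skipped
        executed.append(cmd)
    return executed
```

Calling run_pipeline() from the project root would run all four stages; passing stages=("eval", "inference") would rerun only the last two.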
clean is a make target that removes the outputs of previous runs.
make clean
Performance metrics are stored in the performance.json file inside the results directory.
{
"rouge1": 0.79689240266461,
"rouge2": 0.7606140631154827,
"rougeL": 0.7733855633904199,
"rougeLsum": 0.7734703253159519
}
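The scores above are fractions between 0 and 1. As a small sketch of how the file could be read back for reporting, the snippet below loads the JSON and converts each score to a percentage. The default path results/performance.json and the helper name load_metrics are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_metrics(path="results/performance.json"):
    """Load ROUGE scores from a JSON file and return them as percentages.

    The default path is an assumption based on the description in this README.
    """
    with Path(path).open(encoding="utf-8") as fh:
        scores = json.load(fh)
    # Scale 0..1 fractions to percentages, rounded to two decimals.
    return {name: round(value * 100, 2) for name, value in scores.items()}
```

With the values shown above, rouge1 would be reported as 79.69 and rouge2 as 76.06.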
In addition, eval_results.csv contains the predictions for the evaluation file.
original | compressed | predictions |
---|---|---|
sentence1 | compress1 | prediction1 |
sentence2 | compress2 | prediction2 |
: | : | : |
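As a quick sanity check on this file, the sketch below reads it and compares each prediction with the reference compression. It assumes a comma-separated file with the header columns original, compressed, and predictions, and a default path of results/eval_results.csv. The unigram F1 here is a simplified stand-in for ROUGE-1 (lowercased whitespace tokens, no stemming), not the metric implementation used to produce performance.json, so its numbers will not match exactly.

```python
import csv
from collections import Counter


def unigram_f1(reference, prediction):
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased tokens."""
    ref = Counter(reference.lower().split())
    pred = Counter(prediction.lower().split())
    overlap = sum((ref & pred).values())  # per-token minimum of the two counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def score_file(path="results/eval_results.csv"):
    """Return row count, mean unigram F1, and exact-match rate for the file.

    Column names are assumed from the table in this README.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))
    if not rows:
        return {"rows": 0, "unigram_f1": 0.0, "exact_match": 0.0}
    f1 = [unigram_f1(r["compressed"], r["predictions"]) for r in rows]
    exact = [r["compressed"].strip() == r["predictions"].strip() for r in rows]
    return {
        "rows": len(rows),
        "unigram_f1": sum(f1) / len(rows),
        "exact_match": sum(exact) / len(rows),
    }
```

Rows with a low unigram F1 are a reasonable place to start when inspecting where the model's compressions diverge from the references.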
References:
- https://github.com/google-research-datasets/sentence-compression
- https://huggingface.co/docs/transformers/en/tasks/summarization
Note:
Download the trained checkpoint from the provided Drive link (checkpoint).