---
language: zh
license: apache-2.0
---

# TAAS

## Introduction
TAAS: A Text-based Delivery Address Analysis System in Logistics

## System description
TAAS is an integrated system for text-based address analysis in the logistics field. It supports several address perception tasks as well as other logistics-related tasks. The system is built on G2PTL, a Geography-Graph Pre-trained model for logistics, which improves delivery address encoding by combining the semantic learning capabilities of text pre-training with the geographical-relationship encoding abilities of graph modeling.

![overview.png](./imgs/overview.png)

## Supported Tasks

1. **Address perception tasks**
   * Address Completion
   * Address Standardization
   * House Info Extraction
   * Address Entity Tokenization
   * Address Embedding
2. **Logistics-related tasks**
   * Geo-locating From Text to Geospatial
   * Pick-up Estimated Time of Arrival (ETA)
   * Pick-up and Delivery Route Prediction

## How To Use

Once the packages listed under Requirements are installed, a fine-tuned model can be loaded and used for a specific task as follows:

```python
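from transformers import AutoModel

# trust_remote_code=True allows transformers to run the custom model code shipped with the repository
model = AutoModel.from_pretrained('Cainiao-AI/TAAS', trust_remote_code=True)
model.eval()  # switch to inference mode

# Inputs are passed as a list of address strings
address = ['北京市马驹桥镇兴贸二街幸福家园1幢5单元1009室 注:放在门口即可']

# Address completion
output = model.addr_complet(address)
print(output)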
```
```python
['北京市通州区马驹桥镇兴贸二街幸福家园1幢5单元1009室 注:放在门口即可']
```
```python
# Address standardization
output = model.addr_standardize(address)
print(output)
```
```python
['北京马驹桥镇兴贸二街幸福家园1幢5单元1009室']
```
```python
# House info extraction
output = model.house_info(address)
print(output)
```
```python
[{'楼栋': '1', '单元': '5', '门牌号': '1009'}]
```
```python
# Address entity tokenization
output = model.addr_entity(address)
print(output)
```
```python
[{'省': '北京', '市': '', '区': '马驹桥', '街道/镇': '镇兴贸二街', '道路': '', '道路号': '', 'poi': '幸福家园', '楼栋号': '1', '单元号': '5', '门牌号': '1009'}]
```
```python
# Geo-locating from text to geospatial
output = model.geolocate(address)
print(output)
```
```python
's2网格化结果:453cf541fcb147b437433cf3cff43f470'
```
```python
# Pick-up estimated time of arrival (ETA)
# eta_data is user-supplied input; its format is not described in this README
output = model.pickup_ETA(eta_data)
# The returned address embeddings can be fed to the user's own pick-up ETA model
```
```python
# Pick-up and delivery route prediction
# route_data is user-supplied input; its format is not described in this README
output = model.route_predict(route_data)
# The returned address embeddings can be fed to the user's own route prediction model
```
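
The example outputs above are plain Python values, so they can be post-processed without the model loaded. The following is a minimal sketch, assuming the output shapes shown in this README: one dict per input address for house info, and a `label:value` string for geo-locating. The helper names `format_house_info` and `split_geolocate_result` are hypothetical and are not part of TAAS.

```python
def format_house_info(info):
    """Join the non-empty fields of one house-info dict into a compact label."""
    return "-".join(value for value in info.values() if value)


def split_geolocate_result(result):
    """Split a 'label:value' string into (label, value) at the first colon."""
    label, _, value = result.partition(":")
    return label, value


# Sample values copied from the example outputs in this README.
house_output = [{'楼栋': '1', '单元': '5', '门牌号': '1009'}]
geo_output = 's2网格化结果:453cf541fcb147b437433cf3cff43f470'

labels = [format_house_info(info) for info in house_output]
grid_label, grid_value = split_geolocate_result(geo_output)
print(labels)
print(grid_value)
```

If the actual return types differ from the samples shown here (for example, a list of strings from `geolocate`), the helpers would need to be adapted accordingly.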

## Requirements
Python >= 3.8 and the following pinned packages:
```shell
tqdm==4.65.0
torch==1.13.1
transformers==4.27.4
datasets==2.11.0
fairseq==0.12.2
```
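
To confirm that an environment matches the pinned versions listed above, a small check along these lines may help. This is a sketch rather than part of TAAS: it uses the standard-library `importlib.metadata` module, and the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical.

```python
from importlib import metadata

# Pinned versions copied from the Requirements list above.
PINNED = {
    "tqdm": "4.65.0",
    "torch": "1.13.1",
    "transformers": "4.27.4",
    "datasets": "2.11.0",
    "fairseq": "0.12.2",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a build that reports a local version suffix (for example `1.13.1+cu117` for some `torch` wheels) would be listed as a mismatch even though the base version agrees.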