Liangrj5
committed on
Commit
•
876e08a
1
Parent(s):
75210dc
annotation
- .gitattributes +2 -0
- data/TVR_Ranking/raw_annotations.csv +3 -0
- data/TVR_Ranking/readme.md +68 -0
- data/TVR_Ranking/test.json +3 -0
- data/TVR_Ranking/train_top01.json +3 -0
- data/TVR_Ranking/train_top20.json +3 -0
- data/TVR_Ranking/train_top40.json +3 -0
- data/TVR_Ranking/val.json +3 -0
- data/TVR_Ranking/video_corpus.json +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
*.json filter=lfs diff=lfs merge=lfs -text
|
37 |
+
*.csv filter=lfs diff=lfs merge=lfs -text
|
data/TVR_Ranking/raw_annotations.csv
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:cafe5d5f813fa7b75f6e58b19eb33193b5e3718222dc9357d59e5830d5298c7d
|
3 |
+
size 57252256
|
data/TVR_Ranking/readme.md
ADDED
@@ -0,0 +1,68 @@
|
1 |
+
|
2 |
+
## TVR-Ranking Dataset Introduction
|
3 |
+
We curated the TVR-Ranking dataset to support the Ranked Video Moment Retrieval (RVMR) task. The videos and queries are sourced from [TVR](https://github.com/jayleicn/TVRetrieval).
|
4 |
+
The dataset is designed to evaluate models on their ability to retrieve relevant video moments. Annotators scored the relevance of each moment on a five-level scale (0 to 4).
|
5 |
+
|
6 |
+
The validation and test sets were manually annotated. Additionally, we generated pseudo training sets (top 1, top 20, and top 40) based on query-caption similarity. Raw annotations are released to encourage further exploration. Video durations were aligned by frame number, which may result in slight differences compared to the TVR dataset. The full dataset file can be downloaded from Hugging Face at [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking).
|
7 |
+
|
8 |
+
|
9 |
+
|
10 |
+
### Data Organization
|
11 |
+
Here we show how the data is organized.
|
12 |
+
```
|
13 |
+
TVR_Ranking/
|
14 |
+
-val.json
|
15 |
+
-test.json
|
16 |
+
-train_top01.json
|
17 |
+
-train_top20.json
|
18 |
+
-train_top40.json
|
19 |
+
-raw_annotations.csv
|
20 |
+
```
|
21 |
+
|
22 |
+
Each moment in the raw annotations was labeled by 2 or 4 workers, and the raw file retains the annotations on which workers did not reach consensus. In the validation and test sets, we removed the non-consensus annotations and averaged the remaining relevance scores.
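The aggregation described above could be sketched roughly as follows. This is only an illustration: the consensus rule used here (ratings for a pair may differ by at most `max_spread`) is an assumption, since this README does not state the actual criterion, and the function name `aggregate_relevance` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_relevance(rows, max_spread=1):
    """Average per-worker relevance for each pair, dropping pairs without consensus.

    `rows` is an iterable of dicts with `pair_id`, `worker`, and `relevance`.
    The consensus test (max - min <= max_spread) is an ASSUMED rule,
    not the one used to build the released validation/test sets.
    """
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["pair_id"]].append(row["relevance"])

    averaged = {}
    for pair_id, scores in ratings.items():
        # Keep the pair only if the workers roughly agree.
        if max(scores) - min(scores) <= max_spread:
            averaged[pair_id] = mean(scores)
    return averaged


# Hypothetical raw annotations: workers agree on pair 0 but not on pair 1.
raw_rows = [
    {"pair_id": 0, "worker": "w1", "relevance": 4},
    {"pair_id": 0, "worker": "w2", "relevance": 3},
    {"pair_id": 1, "worker": "w1", "relevance": 0},
    {"pair_id": 1, "worker": "w3", "relevance": 4},
]
print(aggregate_relevance(raw_rows))
```

Only pair 0 survives the assumed filter; a different consensus rule would change which pairs are kept.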
|
23 |
+
|
24 |
+
### Data Formats
|
25 |
+
|
26 |
+
The TVR-Ranking dataset contains the following information:
|
27 |
+
```
|
28 |
+
pair_id: A unique identifier for the query-moment pair.
|
29 |
+
query_id: A unique identifier for the query.
|
30 |
+
query: The natural-language sentence describing what the user is looking for.
|
31 |
+
video_name: The name of the video file.
|
32 |
+
timestamp: The start and end times (in seconds) to identify a moment.
|
33 |
+
duration: The duration of the video (in seconds).
|
34 |
+
caption: A textual description of the content of the moment.
|
35 |
+
similarity: A score (-1 to 1) indicating the similarity between the query and the caption of the moment.
|
36 |
+
worker: The unique identifier of each worker.
|
37 |
+
relevance: A rating (0 to 4) indicating the relevance of the moment to the query, with 4 being the most relevant.
|
38 |
+
```
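The value ranges listed above lend themselves to a quick sanity check on each record. The sketch below is a minimal example, assuming every field is present in the record; the helper name `check_record` is hypothetical. Because video durations were aligned by frame number, a strict `end <= duration` test may be too tight for some records, so treat any reported problem as a hint rather than an error.

```python
def check_record(record):
    """Return a list of human-readable problems found in one query-moment record."""
    problems = []

    start, end = record["timestamp"]
    if not 0 <= start < end <= record["duration"]:
        problems.append("timestamp outside [0, duration]")

    if not -1.0 <= record["similarity"] <= 1.0:
        problems.append("similarity outside [-1, 1]")

    if not 0 <= record["relevance"] <= 4:
        problems.append("relevance outside [0, 4]")

    return problems


# The validation example from this README.
example = {
    "pair_id": 0,
    "query_id": 54251,
    "query": "A man and a woman are talking about how he lied to a patient.",
    "video_name": "house_s07e03_seg02_clip_24",
    "timestamp": [63.47, 77.42],
    "duration": 90.02,
    "caption": "A man and a woman are talking about how he lied to a patient.",
    "similarity": 1.0,
    "relevance": 4,
}
print(check_record(example))
```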
|
39 |
+
|
40 |
+
|
41 |
+
An example from the validation set is shown below.
|
42 |
+
```json
|
43 |
+
{
|
44 |
+
"pair_id": 0,
|
45 |
+
"query_id": 54251,
|
46 |
+
"query": "A man and a woman are talking about how he lied to a patient.",
|
47 |
+
"video_name": "house_s07e03_seg02_clip_24",
|
48 |
+
"timestamp": [63.47, 77.42],
|
49 |
+
"duration": 90.02,
|
50 |
+
"caption": "A man and a woman are talking about how he lied to a patient.",
|
51 |
+
"similarity": 1.0,
|
52 |
+
"relevance": 4
|
53 |
+
}
|
54 |
+
```
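Records shaped like the example above can be grouped by query and ordered by relevance, which is the view a ranked-retrieval evaluation needs. The sketch below assumes that each split file holds a top-level JSON list of such records (not confirmed here), and the function name `rank_moments_by_query` and the second sample record are hypothetical. Note that the `.json` and `.csv` files are tracked with Git LFS, so a checkout without LFS contains pointer files rather than the data.

```python
import json
from collections import defaultdict


def rank_moments_by_query(records):
    """Group records by query_id and sort each group by relevance, highest first."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["query_id"]].append(record)
    for moments in grouped.values():
        moments.sort(key=lambda r: r["relevance"], reverse=True)
    return dict(grouped)


# Inline stand-in for a split file; the second record is invented for illustration.
sample_json = """
[
  {"pair_id": 1, "query_id": 54251, "video_name": "hypothetical_clip",
   "timestamp": [1.0, 5.0], "relevance": 2},
  {"pair_id": 0, "query_id": 54251, "video_name": "house_s07e03_seg02_clip_24",
   "timestamp": [63.47, 77.42], "relevance": 4}
]
"""
records = json.loads(sample_json)
ranked = rank_moments_by_query(records)

for query_id, moments in ranked.items():
    print(query_id, [(m["video_name"], m["relevance"]) for m in moments])
```

With a real split, the inline string would be replaced by reading the file, for example with `json.load` on `data/TVR_Ranking/val.json`, under the same assumption about its top-level structure.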
|
55 |
+
|
56 |
+
|
57 |
+
## License
|
58 |
+
This project and dataset are licensed under a Creative Commons license.
|
59 |
+
|
60 |
+
## Citing the Dataset
|
61 |
+
If you use the TVR-Ranking dataset in your research, please cite it as follows:
|
62 |
+
|
63 |
+
```
|
64 |
+
citation
|
65 |
+
```
|
66 |
+
|
67 |
+
## Contact Information
|
68 |
+
For any questions or further information regarding the TVR-Ranking dataset, please contact Renjie Liang at liangrj5@gmail.com.
|
data/TVR_Ranking/test.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:85eb8a4a94cbaad4691e4fd1447ef0ccd5681f19be5ccf9eda6c57bb0b625d08
|
3 |
+
size 35038823
|
data/TVR_Ranking/train_top01.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:6a8e4acb9a97fe32728c6dd3bb214d924a98f97df9b0cb26c3891e0a742d439b
|
3 |
+
size 22703070
|
data/TVR_Ranking/train_top20.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:fc54d8766fdd84ce5bd6cfa7f41cf1aed8186a2506f6701f04f55e1140ffe397
|
3 |
+
size 298028341
|
data/TVR_Ranking/train_top40.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4c323ba9b5e346b5f689ab0b7a5d1e3589eec5ec121a425ab94ac1c9d67a5a41
|
3 |
+
size 589337973
|
data/TVR_Ranking/val.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:665986cbdb00afa601b5c2184854c20285ee6f6b2d27c1a073bff11e948153ad
|
3 |
+
size 3281384
|
data/TVR_Ranking/video_corpus.json
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4c906e5fd950ed73f5840d4dc565a6907b3c521850d2b6f29eaa48cf5d11b8ca
|
3 |
+
size 793192
|