Commit 876e08a by Liangrj5 (1 parent: 75210dc)

annotation

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.json filter=lfs diff=lfs merge=lfs -text
+ *.csv filter=lfs diff=lfs merge=lfs -text
data/TVR_Ranking/raw_annotations.csv ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cafe5d5f813fa7b75f6e58b19eb33193b5e3718222dc9357d59e5830d5298c7d
+ size 57252256
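The three added lines above are a Git LFS pointer, not the CSV itself: the repository stores only the spec version, a content hash, and the byte size, while the real file lives in LFS storage. As a rough sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of this commit), such a pointer could be read like this:

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the raw_annotations.csv diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:cafe5d5f813fa7b75f6e58b19eb33193b5e3718222dc9357d59e5830d5298c7d
size 57252256
"""

info = parse_lfs_pointer(pointer_text)
print(info["hash_algorithm"], info["size_bytes"])
```

The same shape applies to the other pointer files in this commit; only the digest and size differ.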
data/TVR_Ranking/readme.md ADDED
@@ -0,0 +1,68 @@
+
+ ## TVR-Ranking Dataset Introduction
+ We curated the TVR-Ranking dataset to support the Ranked Video Moment Retrieval (RVMR) task. The videos and queries are sourced from [TVR](https://github.com/jayleicn/TVRetrieval).
+ The dataset evaluates models on their ability to retrieve relevant video moments. Annotators scored the relevance of each moment on a five-level scale.
+
+ The validation and test sets were manually annotated. Additionally, we generated pseudo training sets (top 1, top 20, and top 40) based on query-caption similarity. Raw annotations are released to encourage further exploration. Video durations were aligned by frame number, which may result in slight differences from the TVR dataset. The full dataset can be downloaded from Hugging Face at [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking).
+
+
+
+ ### Data Organization
+ The data is organized as follows.
+ ```
+ TVR_Ranking/
+ -val.json
+ -test.json
+ -train_top01.json
+ -train_top20.json
+ -train_top40.json
+ -raw_annotations.csv
+ ```
+
+ Each moment in the raw annotations was labeled by 2 or 4 workers, and the raw file retains the annotations on which workers did not reach consensus. In the validation and test sets, the non-consensus annotations are removed and the remaining relevance scores are averaged.
+
+ ### Data Formats
+
+ The TVR-Ranking dataset contains the following fields:
+ ```
+ pair_id: A unique identifier for the query-moment pair.
+ query_id: A unique identifier for the query.
+ query: The query sentence describing what the user wants to find.
+ video_name: The name of the video file.
+ timestamp: The start and end times (in seconds) of the moment.
+ duration: The duration of the video (in seconds).
+ caption: A textual description of the content of the moment.
+ similarity: A score (-1 to 1) indicating the similarity between the query and the caption of the moment.
+ worker: The unique identifier of the annotating worker.
+ relevance: A rating (0 to 4) indicating the relevance of the moment to the query, with 4 being the most relevant.
+ ```
+
+
+ An example from the validation set is shown below.
+ ```json
+ {
+     "pair_id": 0,
+     "query_id": 54251,
+     "query": "A man and a woman are talking about how he lied to a patient.",
+     "video_name": "house_s07e03_seg02_clip_24",
+     "timestamp": [63.47, 77.42],
+     "duration": 90.02,
+     "caption": "A man and a woman are talking about how he lied to a patient.",
+     "similarity": 1.0,
+     "relevance": 4
+ }
+ ```
+
+
+ ## License
+ This project and dataset are licensed under a Creative Commons license.
+
+ ## Citing the Dataset
+ If you use the TVR-Ranking dataset in your research, please cite it as follows:
+
+ ```
+ citation
+ ```
+
+ ## Contact Information
+ For any questions or further information regarding the TVR-Ranking dataset, please contact Renjie Liang at liangrj5@gmail.com.
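As a minimal sketch of how records in the format documented in readme.md might be consumed, the Python below groups records by `query_id` and orders each group by `relevance`, highest first. It assumes each split file (for example `val.json`) holds a single JSON array of such records, which the README does not state. `load_records` and `rank_moments_by_query` are hypothetical helper names, and the second record in the example is made up purely to show the ordering.

```python
import json
from collections import defaultdict


def load_records(path):
    """Read records from a split file, ASSUMING it is one JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def rank_moments_by_query(records):
    """Group records by query_id and sort each group by relevance, highest first."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["query_id"]].append(rec)
    return {
        qid: sorted(recs, key=lambda r: r["relevance"], reverse=True)
        for qid, recs in grouped.items()
    }


example_records = [
    # Hypothetical record, invented only to illustrate the sort order.
    {
        "pair_id": 1,
        "query_id": 54251,
        "video_name": "hypothetical_clip",
        "timestamp": [0.0, 5.0],
        "relevance": 1,
    },
    # Validation example taken from the README (fields abbreviated).
    {
        "pair_id": 0,
        "query_id": 54251,
        "video_name": "house_s07e03_seg02_clip_24",
        "timestamp": [63.47, 77.42],
        "relevance": 4,
    },
]

ranked = rank_moments_by_query(example_records)
for rec in ranked[54251]:
    print(rec["video_name"], rec["timestamp"], rec["relevance"])
```

With the real data, `load_records("data/TVR_Ranking/val.json")` would replace `example_records`, provided the file layout matches the assumption above and the LFS content has been fetched rather than the pointer file.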
data/TVR_Ranking/test.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:85eb8a4a94cbaad4691e4fd1447ef0ccd5681f19be5ccf9eda6c57bb0b625d08
+ size 35038823
data/TVR_Ranking/train_top01.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6a8e4acb9a97fe32728c6dd3bb214d924a98f97df9b0cb26c3891e0a742d439b
+ size 22703070
data/TVR_Ranking/train_top20.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fc54d8766fdd84ce5bd6cfa7f41cf1aed8186a2506f6701f04f55e1140ffe397
+ size 298028341
data/TVR_Ranking/train_top40.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c323ba9b5e346b5f689ab0b7a5d1e3589eec5ec121a425ab94ac1c9d67a5a41
+ size 589337973
data/TVR_Ranking/val.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:665986cbdb00afa601b5c2184854c20285ee6f6b2d27c1a073bff11e948153ad
+ size 3281384
data/TVR_Ranking/video_corpus.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c906e5fd950ed73f5840d4dc565a6907b3c521850d2b6f29eaa48cf5d11b8ca
+ size 793192