[
{
"path": "table_paper/2407.00100v1.json",
"table_id": "1",
"section": "5.1",
"all_context": [
"Table 1 displays the comparison results between IDAICL and four ICL baselines (Vanilla ICL, MetaICL, Channel ICL, and EPR) across GPT-2 models (with 0.8B and 1.5B parameters) and the GPT-Neo model.",
"These results lead to three main findings.",
"Firstly, IDAICL consistently exhibits high effectiveness across various model sizes and datasets, highlighting its strong generalization capacity, even under scenarios involving imbalanced training data.",
"Compared to Vanilla ICL, IDAICL outperforms by an average of 17.7% and 18.4% across diverse datasets and values of m for GPT-2 with 0.8B and 1.5B parameters, respectively.",
"Secondly, in comparison to other ICL baselines like Channel ICL, MetaICL, and EPR, the integration of IDAICL consistently delivers notable performance improvements, emphasizing the efficacy of enhancing demonstrations for refined predictions.",
"The inclusion of IDAICL led to an average performance boost of 7.3% for MetaICL and 8.2% for Channel ICL.",
"Lastly, IDAICL notably enhances worst-case accuracy and diminishes performance variance across different seeds, showcasing its ability to improve prediction stability.",
"Additional results on LLaMA and smaller GPT-2 models are available in Tables 7 and 8 of the Appendix.",
""
],
"target_context_ids": [
0,
2,
3,
4,
5,
6
],
"selected_paragraphs": [
"[paragraph id = 0] Table 1 displays the comparison results between IDAICL and four ICL baselines (Vanilla ICL, MetaICL, Channel ICL, and EPR) across GPT-2 models (with 0.8B and 1.5B parameters) and the GPT-Neo model.",
"[paragraph id = 2] Firstly, IDAICL consistently exhibits high effectiveness across various model sizes and datasets, highlighting its strong generalization capacity, even under scenarios involving imbalanced training data.",
"[paragraph id = 3] Compared to Vanilla ICL, IDAICL outperforms by an average of 17.7% and 18.4% across diverse datasets and values of m for GPT-2 with 0.8B and 1.5B parameters, respectively.",
"[paragraph id = 4] Secondly, in comparison to other ICL baselines like Channel ICL, MetaICL, and EPR, the integration of IDAICL consistently delivers notable performance improvements, emphasizing the efficacy of enhancing demonstrations for refined predictions.",
"[paragraph id = 5] The inclusion of IDAICL led to an average performance boost of 7.3% for MetaICL and 8.2% for Channel ICL.",
"[paragraph id = 6] Lastly, IDAICL notably enhances worst-case accuracy and diminishes performance variance across different seeds, showcasing its ability to improve prediction stability."
],
"table_html": "<figure class=\"ltx_table\" id=\"S3.T1\">\n<div class=\"ltx_inline-block ltx_align_center ltx_transformed_outer\" id=\"S3.T1.300\" style=\"width:433.6pt;height:437.5pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"transform:translate(-59.7pt,60.3pt) scale(0.784039515230472,0.784039515230472) ;\">\n<table class=\"ltx_tabular ltx_align_middle\" id=\"S3.T1.300.300\">\n<tr class=\"ltx_tr\" id=\"S3.T1.300.300.301\" style=\"background-color:#D9D9D9;\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.300.300.301.1\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.1.1\" style=\"background-color:#D9D9D9;\">PLM</span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.300.300.301.2\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.2.1\" style=\"background-color:#D9D9D9;\">Method</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_tt\" id=\"S3.T1.300.300.301.3\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.3.1\" style=\"background-color:#D9D9D9;\">m</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.4\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.4.1\" style=\"background-color:#D9D9D9;\">SST-2</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.5\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.5.1\" style=\"background-color:#D9D9D9;\">SST-5</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.6\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.6.1\" style=\"background-color:#D9D9D9;\">MR</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.7\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.7.1\" style=\"background-color:#D9D9D9;\">CR</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.8\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.8.1\" 
style=\"background-color:#D9D9D9;\">Amazon</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.9\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.9.1\" style=\"background-color:#D9D9D9;\">Subj</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.10\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.10.1\" style=\"background-color:#D9D9D9;\">TREC</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.11\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.11.1\" style=\"background-color:#D9D9D9;\">DBPedia</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.12\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.12.1\" style=\"background-color:#D9D9D9;\">AGNews</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.300.300.301.13\"><span class=\"ltx_text\" id=\"S3.T1.300.300.301.13.1\" style=\"background-color:#D9D9D9;\">CB</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.10.10.10\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.10.10.10.11\" rowspan=\"12\"><span class=\"ltx_text\" id=\"S3.T1.10.10.10.11.1\">\n<span class=\"ltx_inline-block ltx_transformed_outer\" id=\"S3.T1.10.10.10.11.1.1\" style=\"width:6.8pt;height:53.4pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"width:53.4pt;transform:translate(-23.28pt,-23.28pt) rotate(-90deg) ;\">\n<span class=\"ltx_p\" id=\"S3.T1.10.10.10.11.1.1.1\">GPT-2 0.8B</span>\n</span></span></span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.10.10.10.12\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_tt\" id=\"S3.T1.10.10.10.13\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.10.10.10.13.1\">4</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.1.1.1.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" 
id=\"S3.T1.2.2.2.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.3.3.3.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.4.4.4.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.5.5.5.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.6.6.6.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.7.7.7.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.8.8.8.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.9.9.9.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.10.10.10.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.20.20.20\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.20.20.20.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.11.11.11.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.12.12.12.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.13.13.13.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.14.14.14.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.15.15.15.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.16.16.16.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.17.17.17.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.18.18.18.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.19.19.19.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.20.20.20.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.30.30.30\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.30.30.30.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.30.30.30.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.30.30.30.12.1\">8</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.21.21.21.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.22.22.22.2\"></td>\n<td 
class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.23.23.23.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.24.24.24.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.25.25.25.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.26.26.26.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.27.27.27.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.28.28.28.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.29.29.29.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.30.30.30.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.40.40.40\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.40.40.40.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.31.31.31.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.32.32.32.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.33.33.33.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.34.34.34.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.35.35.35.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.36.36.36.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.37.37.37.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.38.38.38.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.39.39.39.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.40.40.40.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.50.50.50\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.50.50.50.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.50.50.50.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.50.50.50.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.41.41.41.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.42.42.42.2\"></td>\n<td class=\"ltx_td 
ltx_align_center ltx_border_t\" id=\"S3.T1.43.43.43.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.44.44.44.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.45.45.45.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.46.46.46.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.47.47.47.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.48.48.48.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.49.49.49.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.50.50.50.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.60.60.60\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.60.60.60.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.51.51.51.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.52.52.52.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.53.53.53.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.54.54.54.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.55.55.55.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.56.56.56.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.57.57.57.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.58.58.58.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.59.59.59.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.60.60.60.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.70.70.70\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.70.70.70.11\">MetaICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.70.70.70.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.70.70.70.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.61.61.61.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.62.62.62.2\"></td>\n<td class=\"ltx_td ltx_align_center 
ltx_border_t\" id=\"S3.T1.63.63.63.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.64.64.64.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.65.65.65.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.66.66.66.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.67.67.67.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.68.68.68.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.69.69.69.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.70.70.70.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.80.80.80\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.80.80.80.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.71.71.71.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.72.72.72.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.73.73.73.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.74.74.74.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.75.75.75.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.76.76.76.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.77.77.77.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.78.78.78.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.79.79.79.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.80.80.80.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.90.90.90\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.90.90.90.11\">Channel ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.90.90.90.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.90.90.90.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.81.81.81.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.82.82.82.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" 
id=\"S3.T1.83.83.83.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.84.84.84.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.85.85.85.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.86.86.86.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.87.87.87.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.88.88.88.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.89.89.89.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.90.90.90.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.100.100.100\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.100.100.100.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.91.91.91.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.92.92.92.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.93.93.93.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.94.94.94.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.95.95.95.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.96.96.96.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.97.97.97.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.98.98.98.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.99.99.99.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.100.100.100.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.110.110.110\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.110.110.110.11\">EPR</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.110.110.110.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.110.110.110.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.101.101.101.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.102.102.102.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" 
id=\"S3.T1.103.103.103.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.104.104.104.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.105.105.105.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.106.106.106.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.107.107.107.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.108.108.108.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.109.109.109.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.110.110.110.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.120.120.120\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.120.120.120.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.111.111.111.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.112.112.112.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.113.113.113.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.114.114.114.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.115.115.115.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.116.116.116.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.117.117.117.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.118.118.118.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.119.119.119.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.120.120.120.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.130.130.130\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.130.130.130.11\" rowspan=\"12\"><span class=\"ltx_text\" id=\"S3.T1.130.130.130.11.1\">\n<span class=\"ltx_inline-block ltx_transformed_outer\" id=\"S3.T1.130.130.130.11.1.1\" style=\"width:6.8pt;height:53.4pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"width:53.4pt;transform:translate(-23.28pt,-23.28pt) rotate(-90deg) ;\">\n<span 
class=\"ltx_p\" id=\"S3.T1.130.130.130.11.1.1.1\">GPT-2 1.5B</span>\n</span></span></span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.130.130.130.12\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_tt\" id=\"S3.T1.130.130.130.13\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.130.130.130.13.1\">4</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.121.121.121.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.122.122.122.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.123.123.123.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.124.124.124.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.125.125.125.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.126.126.126.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.127.127.127.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.128.128.128.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.129.129.129.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.130.130.130.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.140.140.140\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.140.140.140.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.131.131.131.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.132.132.132.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.133.133.133.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.134.134.134.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.135.135.135.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.136.136.136.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.137.137.137.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.138.138.138.8\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S3.T1.139.139.139.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.140.140.140.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.150.150.150\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.150.150.150.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.150.150.150.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.150.150.150.12.1\">8</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.141.141.141.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.142.142.142.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.143.143.143.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.144.144.144.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.145.145.145.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.146.146.146.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.147.147.147.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.148.148.148.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.149.149.149.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.150.150.150.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.160.160.160\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.160.160.160.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.151.151.151.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.152.152.152.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.153.153.153.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.154.154.154.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.155.155.155.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.156.156.156.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.157.157.157.7\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S3.T1.158.158.158.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.159.159.159.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.160.160.160.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.170.170.170\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.170.170.170.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.170.170.170.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.170.170.170.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.161.161.161.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.162.162.162.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.163.163.163.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.164.164.164.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.165.165.165.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.166.166.166.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.167.167.167.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.168.168.168.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.169.169.169.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.170.170.170.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.180.180.180\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.180.180.180.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.171.171.171.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.172.172.172.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.173.173.173.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.174.174.174.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.175.175.175.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.176.176.176.6\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S3.T1.177.177.177.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.178.178.178.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.179.179.179.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.180.180.180.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.190.190.190\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.190.190.190.11\">MetaICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.190.190.190.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.190.190.190.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.181.181.181.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.182.182.182.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.183.183.183.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.184.184.184.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.185.185.185.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.186.186.186.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.187.187.187.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.188.188.188.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.189.189.189.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.190.190.190.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.200.200.200\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.200.200.200.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.191.191.191.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.192.192.192.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.193.193.193.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.194.194.194.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.195.195.195.5\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S3.T1.196.196.196.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.197.197.197.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.198.198.198.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.199.199.199.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.200.200.200.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.210.210.210\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.210.210.210.11\">Channel ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.210.210.210.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.210.210.210.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.201.201.201.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.202.202.202.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.203.203.203.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.204.204.204.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.205.205.205.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.206.206.206.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.207.207.207.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.208.208.208.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.209.209.209.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.210.210.210.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.220.220.220\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.220.220.220.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.211.211.211.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.212.212.212.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.213.213.213.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.214.214.214.4\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S3.T1.215.215.215.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.216.216.216.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.217.217.217.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.218.218.218.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.219.219.219.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.220.220.220.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.230.230.230\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.230.230.230.11\">EPR</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.230.230.230.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.230.230.230.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.221.221.221.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.222.222.222.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.223.223.223.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.224.224.224.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.225.225.225.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.226.226.226.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.227.227.227.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.228.228.228.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.229.229.229.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.230.230.230.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.240.240.240\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.240.240.240.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.231.231.231.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.232.232.232.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.233.233.233.3\"></td>\n<td class=\"ltx_td ltx_align_center\" 
id=\"S3.T1.234.234.234.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.235.235.235.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.236.236.236.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.237.237.237.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.238.238.238.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.239.239.239.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.240.240.240.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.250.250.250\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_tt\" id=\"S3.T1.250.250.250.11\" rowspan=\"6\"><span class=\"ltx_text\" id=\"S3.T1.250.250.250.11.1\">\n<span class=\"ltx_inline-block ltx_transformed_outer\" id=\"S3.T1.250.250.250.11.1.1\" style=\"width:6.8pt;height:42.2pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"width:42.2pt;transform:translate(-17.66pt,-17.66pt) rotate(-90deg) ;\">\n<span class=\"ltx_p\" id=\"S3.T1.250.250.250.11.1.1.1\">GPT-Neo</span>\n</span></span></span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S3.T1.250.250.250.12\">MetaICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_tt\" id=\"S3.T1.250.250.250.13\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.250.250.250.13.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.241.241.241.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.242.242.242.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.243.243.243.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.244.244.244.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.245.245.245.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.246.246.246.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.247.247.247.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" 
id=\"S3.T1.248.248.248.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.249.249.249.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S3.T1.250.250.250.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.260.260.260\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.260.260.260.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.251.251.251.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.252.252.252.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.253.253.253.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.254.254.254.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.255.255.255.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.256.256.256.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.257.257.257.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.258.258.258.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.259.259.259.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.260.260.260.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.270.270.270\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.270.270.270.11\">Channel ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_r ltx_border_t\" id=\"S3.T1.270.270.270.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.270.270.270.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.261.261.261.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.262.262.262.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.263.263.263.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.264.264.264.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.265.265.265.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.266.266.266.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" 
id=\"S3.T1.267.267.267.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.268.268.268.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.269.269.269.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.270.270.270.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.280.280.280\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S3.T1.280.280.280.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.271.271.271.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.272.272.272.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.273.273.273.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.274.274.274.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.275.275.275.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.276.276.276.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.277.277.277.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.278.278.278.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.279.279.279.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S3.T1.280.280.280.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.290.290.290\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"S3.T1.290.290.290.11\">EPR</td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t\" id=\"S3.T1.290.290.290.12\" rowspan=\"2\"><span class=\"ltx_text\" id=\"S3.T1.290.290.290.12.1\">12</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.281.281.281.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.282.282.282.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.283.283.283.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.284.284.284.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.285.285.285.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" 
id=\"S3.T1.286.286.286.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.287.287.287.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.288.288.288.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.289.289.289.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"S3.T1.290.290.290.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S3.T1.300.300.300\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r\" id=\"S3.T1.300.300.300.11\">+IDAICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.291.291.291.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.292.292.292.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.293.293.293.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.294.294.294.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.295.295.295.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.296.296.296.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.297.297.297.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.298.298.298.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.299.299.299.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S3.T1.300.300.300.10\"></td>\n</tr>\n</table>\n</span></div>\n<figcaption class=\"ltx_caption ltx_centering\"><span class=\"ltx_tag ltx_tag_table\">Table 1: </span>Comparison results of three PLMs.\nTwo numbers indicate the mean accuracy (%) and standard deviation over different seeds. The best and second-best results per PLM per dataset are highlighted in bold and underlined, respectively. \"+IDAICL\" means that the current approach is used in conjunction with IDAICL. 
The results for different numbers of demonstration examples (i.e., values) using the GPT-Neo model are illustrated in Figure <a class=\"ltx_ref\" href=\"https://arxiv.org/html/2407.00100v1#S4.F3\" title=\"Figure 3 ‣ 4.1 Models and Datasets ‣ 4 Experimental Setup ‣ Enhancing In-Context Learning via Implicit Demonstration Augmentation\"><span class=\"ltx_text ltx_ref_tag\">3</span></a>.\n</figcaption>\n</figure>", | |
"perturb_sentence_id": [ | |
3, | |
4 | |
], | |
"output": { | |
"perturbed_statement": "[paragraph id = 3] Compared to Vanilla ICL, IDAICL outperforms by an average of 17.7% and 18.4% across diverse datasets and values for GPT-2 with 0.8B and 1.5B parameters, respectively. Secondly, in comparison to other ICL baselines like MetaICL, MetaICL, and EPR, the integration of IDAICL consistently delivers notable performance improvements, emphasizing the efficacy of enhancing demonstrations for refined predictions.", | |
"perturbed_explanation": "Original Explanation: The original statement highlights IDAICL's superior performance over Vanilla ICL by providing specific percentage improvements for both 0.8B and 1.5B parameter models. It further reiterates IDAICL's effectiveness in comparison with other ICL baselines, including Channel ICL, MetaICL, and EPR. 1. The statement incorrectly repeats MetaICL instead of mentioning the correct baseline, Channel ICL, after MetaICL. This error falsely indicates that MetaICL is being compared to itself, misrepresenting the diversity of the ICL baselines used for comparison." | |
} | |
}, | |
{ | |
"path": "table_paper/2407.00100v1.json", | |
"table_id": "2", | |
"section": "5.2", | |
"all_context": [ | |
"We compared IDAICL with three advanced prediction calibration methods (ConCa, PROCA, and D-ConCa) across three PLMs: GPT-2, GPT-Neo, and LLaMA.", | |
"Table 2 presents the comparison results for the LLaMA models, where IDAICL consistently achieves state-of-the-art performance, except for TREC using the LLaMA model with 33B parameters.", | |
"These findings suggest that IDAICL which leverages statistical information derived from the input data distribution for prediction calibration, generally outperforms methods relying on estimated biases for correction.", | |
"Further comparison results can be found in Table 9 of the Appendix.", | |
"" | |
], | |
"target_context_ids": [ | |
1 | |
], | |
"selected_paragraphs": [ | |
"[paragraph id = 1] Table 2 presents the comparison results for the LLaMA models, where IDAICL consistently achieves state-of-the-art performance, except for TREC using the LLaMA model with 33B parameters." | |
], | |
"table_html": "<figure class=\"ltx_table\" id=\"S4.T2\">\n<div class=\"ltx_inline-block ltx_align_center ltx_transformed_outer\" id=\"S4.T2.100\" style=\"width:433.6pt;height:169.5pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"transform:translate(-36.5pt,14.3pt) scale(0.855879733045295,0.855879733045295) ;\">\n<table class=\"ltx_tabular ltx_align_middle\" id=\"S4.T2.100.100\">\n<tr class=\"ltx_tr\" id=\"S4.T2.100.100.101\" style=\"background-color:#D9D9D9;\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S4.T2.100.100.101.1\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.1.1\" style=\"background-color:#D9D9D9;\">PLM</span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S4.T2.100.100.101.2\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.2.1\" style=\"background-color:#D9D9D9;\">Method</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.3\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.3.1\" style=\"background-color:#D9D9D9;\">SST-2</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.4\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.4.1\" style=\"background-color:#D9D9D9;\">SST-5</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.5\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.5.1\" style=\"background-color:#D9D9D9;\">MR</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.6\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.6.1\" style=\"background-color:#D9D9D9;\">CR</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.7\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.7.1\" style=\"background-color:#D9D9D9;\">Subj</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.8\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.8.1\" 
style=\"background-color:#D9D9D9;\">TREC</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.9\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.9.1\" style=\"background-color:#D9D9D9;\">DBPedia</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.10\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.10.1\" style=\"background-color:#D9D9D9;\">AGNews</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.11\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.11.1\" style=\"background-color:#D9D9D9;\">CB</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.100.100.101.12\"><span class=\"ltx_text\" id=\"S4.T2.100.100.101.12.1\" style=\"background-color:#D9D9D9;\">Avg.</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.9.9.9\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S4.T2.9.9.9.10\" rowspan=\"5\"><span class=\"ltx_text\" id=\"S4.T2.9.9.9.10.1\">\n<span class=\"ltx_inline-block ltx_transformed_outer\" id=\"S4.T2.9.9.9.10.1.1\" style=\"width:6.8pt;height:54.6pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"width:54.6pt;transform:translate(-23.88pt,-23.88pt) rotate(-90deg) ;\">\n<span class=\"ltx_p\" id=\"S4.T2.9.9.9.10.1.1.1\">LLaMA 13B</span>\n</span></span></span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S4.T2.9.9.9.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.1.1.1.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.2.2.2.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.3.3.3.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.4.4.4.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.5.5.5.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.6.6.6.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" 
id=\"S4.T2.7.7.7.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.8.8.8.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.9.9.9.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.9.9.9.12\">72.8</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.18.18.18\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.18.18.18.10\">ConCa</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.10.10.10.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.11.11.11.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.12.12.12.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.13.13.13.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.14.14.14.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.15.15.15.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.16.16.16.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.17.17.17.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.18.18.18.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.18.18.18.11\">77.0</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.31.31.31\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.22.22.22.4\">\n<span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.22.22.22.4.1\">P</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.22.22.22.4.2\">RO</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.22.22.22.4.3\">C</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.22.22.22.4.4\">A</span>\n</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.23.23.23.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.24.24.24.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.25.25.25.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.26.26.26.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.27.27.27.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.28.28.28.10\"></td>\n<td class=\"ltx_td ltx_align_center\" 
id=\"S4.T2.29.29.29.11\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.30.30.30.12\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.31.31.31.13\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.31.31.31.14\"><span class=\"ltx_text ltx_framed ltx_framed_underline\" id=\"S4.T2.31.31.31.14.1\">77.9</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.40.40.40\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.40.40.40.10\">D-ConCa</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.32.32.32.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.33.33.33.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.34.34.34.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.35.35.35.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.36.36.36.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.37.37.37.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.38.38.38.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.39.39.39.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.40.40.40.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.40.40.40.11\">77.8</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.50.50.50\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.50.50.50.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.41.41.41.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.42.42.42.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.43.43.43.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.44.44.44.4\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.45.45.45.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.46.46.46.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.47.47.47.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.48.48.48.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.49.49.49.9\"></td>\n<td class=\"ltx_td ltx_align_center\" 
id=\"S4.T2.50.50.50.10\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.59.59.59\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_tt\" id=\"S4.T2.59.59.59.10\" rowspan=\"5\"><span class=\"ltx_text\" id=\"S4.T2.59.59.59.10.1\">\n<span class=\"ltx_inline-block ltx_transformed_outer\" id=\"S4.T2.59.59.59.10.1.1\" style=\"width:6.8pt;height:54.6pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"width:54.6pt;transform:translate(-23.88pt,-23.88pt) rotate(-90deg) ;\">\n<span class=\"ltx_p\" id=\"S4.T2.59.59.59.10.1.1.1\">LLaMA 33B</span>\n</span></span></span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S4.T2.59.59.59.11\">Vanilla ICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.51.51.51.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.52.52.52.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.53.53.53.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.54.54.54.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.55.55.55.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.56.56.56.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.57.57.57.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.58.58.58.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.59.59.59.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S4.T2.59.59.59.12\">76.2</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.68.68.68\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.68.68.68.10\">ConCa</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.60.60.60.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.61.61.61.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.62.62.62.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.63.63.63.4\"></td>\n<td class=\"ltx_td 
ltx_align_center\" id=\"S4.T2.64.64.64.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.65.65.65.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.66.66.66.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.67.67.67.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.68.68.68.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.68.68.68.11\">78.4</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.81.81.81\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.72.72.72.4\">\n<span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.72.72.72.4.1\">P</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.72.72.72.4.2\">RO</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.72.72.72.4.3\">C</span><span class=\"ltx_text ltx_markedasmath\" id=\"S4.T2.72.72.72.4.4\">A</span>\n</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.73.73.73.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.74.74.74.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.75.75.75.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.76.76.76.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.77.77.77.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.78.78.78.10\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.79.79.79.11\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.80.80.80.12\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.81.81.81.13\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.81.81.81.14\">78.2</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.90.90.90\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S4.T2.90.90.90.10\">D-ConCa</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.82.82.82.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.83.83.83.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.84.84.84.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.85.85.85.4\"></td>\n<td class=\"ltx_td ltx_align_center\" 
id=\"S4.T2.86.86.86.5\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.87.87.87.6\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.88.88.88.7\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.89.89.89.8\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.90.90.90.9\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S4.T2.90.90.90.11\"><span class=\"ltx_text ltx_framed ltx_framed_underline\" id=\"S4.T2.90.90.90.11.1\">78.6</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S4.T2.100.100.100\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r\" id=\"S4.T2.100.100.100.11\">IDAICL</td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.91.91.91.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.92.92.92.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.93.93.93.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.94.94.94.4\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.95.95.95.5\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.96.96.96.6\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.97.97.97.7\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.98.98.98.8\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.99.99.99.9\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S4.T2.100.100.100.10\"></td>\n</tr>\n</table>\n</span></div>\n<figcaption class=\"ltx_caption ltx_centering\"><span class=\"ltx_tag ltx_tag_table\">Table 2: </span>Comparison results of Macro-F1 for the LLaMA model with 13B and 33B parameters, setting to 4.\n</figcaption>\n</figure>", | |
"perturb_sentence_id": [ | |
1, | |
2 | |
], | |
"output": { | |
"perturbed_statement": "[paragraph id = 1] Table 2 presents the comparison results for the LLaMA models, where IDAICL consistently achieves state-of-the-art performance, except for the IMDB using the LLaMA model with 33B parameters. These findings suggest that IDAICL which leverages statistical information derived from the input data distribution for prediction calibration, generally outperforms methods relying on estimated biases for correction.", | |
"perturbed_explanation": "These findings suggest that IDAICL which leverages statistical information derived from the input data distribution for prediction calibration, generally outperforms methods relying on estimated biases for correction. However, the statement incorrectly says that IDAICL did not achieve state-of-the-art performance for the IMDB dataset, when it actually refers to the TREC dataset. This change alters the specific dataset where IDAICL did not reach state-of-the-art performance, making the statement factually incorrect." | |
} | |
}, | |
{ | |
"path": "table_paper/2407.00100v1.json", | |
"table_id": "3", | |
"section": "5.5", | |
"all_context": [ | |
"To further investigate the effect of statistical properties within demonstrations on model performance, we exclusively employed queries along with statistical information for inference, excluding the inclusion of demonstrations for each test sample.", | |
"These statistics were estimated using deep features of all training samples.", | |
"As shown in Table 3 , IDAICL relying solely on statistical properties distinctly outperforms Vanilla ICL across scenarios with zero, one, and even four demonstrations.", | |
"This emphasizes the crucial role of prior statistics obtained from training data in PLMs predictions.", | |
"This phenomenon is understandable as statistical properties inherently encompass richer global information compared to individual demonstrations.", | |
"" | |
], | |
"target_context_ids": [ | |
0, | |
2, | |
3, | |
4 | |
], | |
"selected_paragraphs": [ | |
"[paragraph id = 0] To further investigate the effect of statistical properties within demonstrations on model performance, we exclusively employed queries along with statistical information for inference, excluding the inclusion of demonstrations for each test sample.", | |
"[paragraph id = 2] As shown in Table 3 , IDAICL relying solely on statistical properties distinctly outperforms Vanilla ICL across scenarios with zero, one, and even four demonstrations.", | |
"[paragraph id = 3] This emphasizes the crucial role of prior statistics obtained from training data in PLMs predictions.", | |
"[paragraph id = 4] This phenomenon is understandable as statistical properties inherently encompass richer global information compared to individual demonstrations." | |
], | |
"table_html": "<figure class=\"ltx_table\" id=\"S5.T3\">\n<table class=\"ltx_tabular ltx_centering ltx_align_middle\" id=\"S5.T3.16\">\n<tr class=\"ltx_tr\" id=\"S5.T3.16.17\" style=\"background-color:#D9D9D9;\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S5.T3.16.17.1\"><span class=\"ltx_text\" id=\"S5.T3.16.17.1.1\" style=\"background-color:#D9D9D9;\">Dataset</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.16.17.2\"><span class=\"ltx_text\" id=\"S5.T3.16.17.2.1\" style=\"background-color:#D9D9D9;\">0-shot</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.16.17.3\"><span class=\"ltx_text\" id=\"S5.T3.16.17.3.1\" style=\"background-color:#D9D9D9;\">1-shot</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.16.17.4\"><span class=\"ltx_text\" id=\"S5.T3.16.17.4.1\" style=\"background-color:#D9D9D9;\">4-shot</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.16.17.5\"><span class=\"ltx_text\" id=\"S5.T3.16.17.5.1\" style=\"background-color:#D9D9D9;\">IDAICL</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S5.T3.4.4\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"S5.T3.4.4.5\">SST-2</td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.1.1.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.2.2.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.3.3.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"S5.T3.4.4.4\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S5.T3.8.8\">\n<td class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S5.T3.8.8.5\">SST-5</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.5.5.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.6.6.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.7.7.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.8.8.4\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S5.T3.12.12\">\n<td 
class=\"ltx_td ltx_align_left ltx_border_r\" id=\"S5.T3.12.12.5\">MR</td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.9.9.1\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.10.10.2\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.11.11.3\"></td>\n<td class=\"ltx_td ltx_align_center\" id=\"S5.T3.12.12.4\"></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"S5.T3.16.16\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r\" id=\"S5.T3.16.16.5\">Subj</td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S5.T3.13.13.1\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S5.T3.14.14.2\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S5.T3.15.15.3\"></td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb\" id=\"S5.T3.16.16.4\"></td>\n</tr>\n</table>\n<figcaption class=\"ltx_caption ltx_centering\"><span class=\"ltx_tag ltx_tag_table\">Table 3: </span>Accuracy comparison between Vanilla ICL and IDAICL based solely on statistical properties, using the GPT-2 model with 0.8B parameters.\n</figcaption>\n</figure>", | |
"perturb_sentence_id": [ | |
2, | |
3 | |
], | |
"output": { | |
"perturbed_statement": "[paragraph id = 2] As shown in Table 3 , IDAICL relying solely on statistical properties generally performs worse than Vanilla ICL across scenarios with zero, one, and even four demonstrations. This de-emphasizes the crucial role of prior statistics obtained from training data in PLMs predictions.", | |
"perturbed_explanation": "The original explanation states that relying on statistical properties allows IDAICL to distinctly outperform Vanilla ICL in various scenarios, highlighting the importance of prior statistics in PLM predictions. 1. The statement claims that IDAICL relying solely on statistical properties performs worse than Vanilla ICL, which contradicts the original context that suggests IDAICL actually outperforms Vanilla ICL in those scenarios. 2. The conclusion that this de-emphasizes the role of prior statistics is incorrect because the original context indicates the importance of prior statistics in enhancing performance, not diminishing it." | |
} | |
}, | |
{ | |
"path": "table_paper/2407.00100v1.json", | |
"table_id": "10", | |
"section": "5.3", | |
"all_context": [ | |
"Previous studies Zhao et al.", | |
"(2021 ); Sorensen et al.", | |
"(2022 ); Min et al.", | |
"(2022a ); Zhang et al.", | |
"(2022b ) have highlighted the considerable variability in ICL s performance.", | |
"In this section, we verified that IDAICL can effectively enhance performance stability across diverse scenarios.", | |
"We have presented the results across different numbers of demonstrations in Table 1 .", | |
"For a clearer depiction, the outcomes regarding GPT-Neo are illustrated in Figure 3 .", | |
"As the number of demonstration examples (represented by ) increases, both Vanilla ICL and IDAICL exhibit improved performance, emphasizing the importance of comprehensive statistical properties of the input data for IDAICL s effectiveness.", | |
"Notably, IDAICL significantly enhances performance stability across various numbers of demonstrations and consistently outperforms Vanilla ICL.", | |
"The performance improvement is particularly pronounced when takes on smaller values, indicating the efficacy of IDAICL in enriching the available knowledge for PLMs.", | |
"To confirm that augmenting demonstrations can enhance the robustness of the ICL strategy across various demonstrations, we investigated three distinct demonstration selection settings.", | |
"Setting I: Training samples most similar to the test sample are chosen.", | |
"Setting II: Samples are randomly selected from the training data.", | |
"Setting III: Training samples exhibiting the greatest dissimilarity from the test sample are selected.", | |
"As shown in Figures 4 (a) and (b), IDAICL significantly outperforms Vanilla ICL and demonstrates greater robustness across the three selection settings.", | |
"Additionally, our discoveries suggest that selecting demonstrations that are more similar to the test samples leads to better performance than exclusively selecting dissimilar ones, which aligns with the findings obtained by Wang et al.", | |
"Wang et al.", | |
"(2022 ).", | |
"To assess the performance of IDAICL across various templates, we employed fifteen templates on the SST-2 dataset following those outlined by Zhao et al.", | |
"Zhao et al.", | |
"(2021 ).", | |
"The templates are elaborated in Table 10 of the Appendix.", | |
"Figures 4 (c) and (d) display the performance of Vanilla ICL and IDAICL across six templates.", | |
"Some templates achieve higher average performance than others.", | |
"Nevertheless, IDAICL consistently enhances both average and worst-case accuracy, simultaneously reducing performance variance across different templates.", | |
"The complete results are available in Figure 7 of the Appendix.", | |
"Figures 5 (a) and (b) depict comparison results among Vanilla ICL, MetaICL, Channel ICL, and IDAICL across different degrees of imbalances.", | |
"It is evident that the performance of Vanilla ICL is sensitive to class imbalance, while that of IDAICL and Channel ICL exhibit robustness to the imbalance.", | |
"Moreover, notable performance improvements are observed with higher levels of imbalance.", | |
"Additionally, Figures 5 (c) and (d) illustrate the confusion matrices for the CR and Subj datasets, with the proportion of one category (i.e., \"Negative\" and \"Subjective\") in the demonstrations set to 0.1 and 0.2.", | |
"IDAICL significantly improves the accuracy of the underrepresented classes when compared to Vanilla ICL, thereby contributing to enhanced fairness among classes.", | |
"In the subsequent section, we demonstrate that the strong performance of IDAICL in handling imbalanced label distributions stems from both the statistical properties and the class proportion term.", | |
"" | |
], | |
"target_context_ids": [ | |
20, | |
21, | |
22, | |
23, | |
24 | |
], | |
"selected_paragraphs": [ | |
"[paragraph id = 20] Zhao et al.", | |
"[paragraph id = 21] (2021).", | |
"[paragraph id = 22] The templates are elaborated in Table 10 of the Appendix.", | |
"[paragraph id = 23] Figures 4 (c) and (d) display the performance of Vanilla ICL and IDAICL across six templates.", | |
"[paragraph id = 24] Some templates achieve higher average performance than others." | |
], | |
"table_html": "<figure class=\"ltx_table\" id=\"A8.T10\">\n<div class=\"ltx_inline-block ltx_align_center ltx_transformed_outer\" id=\"A8.T10.1\" style=\"width:433.6pt;height:1039.7pt;vertical-align:-0.0pt;\"><span class=\"ltx_transformed_inner\" style=\"transform:translate(-34.7pt,83.2pt) scale(0.86199860309836,0.86199860309836) ;\">\n<table class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1\">\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.1\" style=\"background-color:#D9D9D9;\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"A8.T10.1.1.1.1\"><span class=\"ltx_text\" id=\"A8.T10.1.1.1.1.1\" style=\"background-color:#D9D9D9;\">Format ID</span></td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"A8.T10.1.1.1.2\"><span class=\"ltx_text\" id=\"A8.T10.1.1.1.2.1\" style=\"background-color:#D9D9D9;\">Prompt</span></td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"A8.T10.1.1.1.3\"><span class=\"ltx_text\" id=\"A8.T10.1.1.1.3.1\" style=\"background-color:#D9D9D9;\">Label names</span></td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.2\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"A8.T10.1.1.2.1\">1</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_tt\" id=\"A8.T10.1.1.2.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.2.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.2.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.2.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.2.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.2.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.2.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.2.1\">\n<span 
class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.2.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Answer: Positive</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.2.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.2.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.2.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.2.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.2.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Answer:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.2.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_tt\" id=\"A8.T10.1.1.2.3\">Positive / Negative</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.3\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.3.1\">2</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.3.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.3.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.3.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.3.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.3.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.3.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.3.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r 
ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.3.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Answer: good</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.3.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.3.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.3.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.3.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.3.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Answer:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.3.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.3.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.4\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.4.1\">3</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.4.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.4.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.4.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.4.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.4.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.4.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.4.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.4.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">My review for last night’s film: This movie is amazing! 
The critics agreed that this movie was good</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.4.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.4.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.4.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.4.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">My review for last night’s film: Horrific movie, don’t see it. The critics agreed that this movie was</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.4.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.4.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.5\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.5.1\">4</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.5.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.5.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.5.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.5.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.5.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.5.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Here is what our critics think for this month’s films.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.5.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.5.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">One of our critics wrote \"This movie is amazing!\". 
Her sentiment towards the film was positive.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.5.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.5.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.5.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">One of our critics wrote \"Horrific movie, don’t see it\". Her sentiment towards the film was</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.5.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.5.3\">positive / negative</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.6\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.6.1\">5</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.6.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.6.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.6.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.6.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.6.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.6.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Critical reception [ edit ]</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.6.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.6.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">In a contemporary review, Roger Ebert wrote \"This movie is amazing!\". 
Entertainment Weekly agreed, and\nthe overall critical reception of the film was good.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.6.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.6.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.6.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">In a contemporary review, Roger Ebert wrote \"Horrific movie, don’t see it\". Entertainment Weekly agreed, and\nthe overall critical reception of the film was</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.6.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.6.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.7\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.7.1\">6</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.7.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.7.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.7.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.7.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.7.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.7.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.7.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.7.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Positive Review? 
Yes</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.7.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.7.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.7.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.7.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.7.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Positive Review?</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.7.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.7.3\">Yes / No</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.8\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.8.1\">7</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.8.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.8.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.8.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.8.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.8.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.2.1.1\">\n<span class=\"ltx_p\" 
id=\"A8.T10.1.1.8.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Question: Is the sentiment of the above review Positive or Negative?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.8.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Answer: Positive</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.8.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.5\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.5.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.5.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.8.2.2.1.5.1.1.1\" style=\"width:327.2pt;\">Question: Is the sentiment of the above review Positive or Negative?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.8.2.2.1.6\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.6.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.8.2.2.1.6.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.8.2.2.1.6.1.1.1\" style=\"width:327.2pt;\">Answer:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.8.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.8.3\">Positive / Negative</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.9\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.9.1\">8</td>\n<td 
class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.9.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.9.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.9.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.9.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Question: Did the author think that the movie was good or bad?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Answer: good</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.5\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.5.1\">\n<span class=\"ltx_inline-block ltx_align_top\" 
id=\"A8.T10.1.1.9.2.2.1.5.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.5.1.1.1\" style=\"width:327.2pt;\">Question: Did the author think that the movie was good or bad?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.9.2.2.1.6\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.6.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.9.2.2.1.6.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.9.2.2.1.6.1.1.1\" style=\"width:327.2pt;\">Answer:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.9.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.9.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.10\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.10.1\">9</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.10.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.10.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.10.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.10.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.10.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Question: Did the author of the following tweet think that the movie was good or bad?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.10.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Tweet: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" 
id=\"A8.T10.1.1.10.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Answer: good</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.10.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Question: Did the author of the following tweet think that the movie was good or bad?</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.10.2.2.1.5\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.5.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.5.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.5.1.1.1\" style=\"width:327.2pt;\">Tweet: Horrific movie, don’t see it</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.10.2.2.1.6\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.6.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.10.2.2.1.6.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.10.2.2.1.6.1.1.1\" style=\"width:327.2pt;\">Answer:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.10.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.10.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.11\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.11.1\">10</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.11.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.11.2.1\"></span><span 
class=\"ltx_text\" id=\"A8.T10.1.1.11.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.11.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.11.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.11.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.11.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.11.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">This movie is amazing! My overall feeling was that the movie was good</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.11.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.11.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.11.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.11.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Horrific movie, don’t see it. My overall feeling was that the movie was</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.11.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.11.3\">good / bad</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.12\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.12.1\">11</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.12.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.12.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.12.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.12.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.12.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.12.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.12.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.12.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">This movie is amazing! 
I liked the movie.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.12.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.12.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.12.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.12.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Horrific movie, don’t see it. I</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.12.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.12.3\">liked / hated</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.13\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.13.1\">12</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.13.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.13.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.13.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.13.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.13.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.13.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.13.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.13.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">This movie is amazing! My friend asked me if I would give the movie 0 or 5 stars, I said 5</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.13.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.13.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.13.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.13.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Horrific movie, don’t see it. 
My friend asked me if I would give the movie 0 or 5 stars, I said</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.13.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.13.3\">0 / 5</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.14\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.14.1\">13</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.14.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.14.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.14.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.14.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.14.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.14.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Input: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.14.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.14.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Sentiment: Positive</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.14.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.14.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Input: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.14.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.4.1\">\n<span class=\"ltx_inline-block 
ltx_align_top\" id=\"A8.T10.1.1.14.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.14.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Sentiment:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.14.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.14.3\">Positive / Negative</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.15\">\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.15.1\">14</td>\n<td class=\"ltx_td ltx_align_left ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.15.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.15.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.15.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.15.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.15.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.15.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.15.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.15.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Positive: True</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.15.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.15.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.15.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r 
ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.15.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.15.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Positive:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.15.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_t\" id=\"A8.T10.1.1.15.3\">True / False</td>\n</tr>\n<tr class=\"ltx_tr\" id=\"A8.T10.1.1.16\">\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.16.1\">15</td>\n<td class=\"ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t\" id=\"A8.T10.1.1.16.2\">\n<span class=\"ltx_text\" id=\"A8.T10.1.1.16.2.1\"></span><span class=\"ltx_text\" id=\"A8.T10.1.1.16.2.2\">\n<span class=\"ltx_tabular ltx_align_middle\" id=\"A8.T10.1.1.16.2.2.1\">\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.16.2.2.1.1\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.1.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.1.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.16.2.2.1.1.1.1.1\" style=\"width:327.2pt;\">Review: This movie is amazing!</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.16.2.2.1.2\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.2.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.2.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.16.2.2.1.2.1.1.1\" style=\"width:327.2pt;\">Stars: 5</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.16.2.2.1.3\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.3.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.3.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.16.2.2.1.3.1.1.1\" style=\"width:327.2pt;\">Review: Horrific movie, don’t see 
it.</span>\n</span></span></span>\n<span class=\"ltx_tr\" id=\"A8.T10.1.1.16.2.2.1.4\">\n<span class=\"ltx_td ltx_nopad_r ltx_align_justify ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.4.1\">\n<span class=\"ltx_inline-block ltx_align_top\" id=\"A8.T10.1.1.16.2.2.1.4.1.1\">\n<span class=\"ltx_p\" id=\"A8.T10.1.1.16.2.2.1.4.1.1.1\" style=\"width:327.2pt;\">Stars:</span>\n</span></span></span>\n</span></span><span class=\"ltx_text\" id=\"A8.T10.1.1.16.2.3\"></span>\n</td>\n<td class=\"ltx_td ltx_align_center ltx_border_bb ltx_border_t\" id=\"A8.T10.1.1.16.3\">5 / 0</td>\n</tr>\n</table>\n</span></div>\n<figcaption class=\"ltx_caption ltx_centering\"><span class=\"ltx_tag ltx_tag_table\">Table 10: </span>The templates employed for examining the influence of formats on the SST-2 dataset, following those outlined by Zhao et al. <cite class=\"ltx_cite ltx_citemacro_cite\">Zhao et al. (<a class=\"ltx_ref\" href=\"https://arxiv.org/html/2407.00100v1#bib.bib62\" title=\"\">2021</a>)</cite>. An example from the training data is used for illustration.</figcaption>\n</figure>", | |
"perturb_sentence_id": [ | |
21, | |
22 | |
], | |
"output": { | |
"perturbed_statement": "[paragraph id = 21] (2021). The templates are presented in Table 5 of the Appendix.", | |
"perturbed_explanation": "1. The original statement indicates that the templates are elaborated in Table 10 of the Appendix. 2. The statement incorrectly mentions Table 5 instead of Table 10, which changes the reference location of the templates in the appendix, making it factually incorrect." | |
} | |
} | |
] |