---
license: cc-by-4.0
language:
- ar
---

# Verified Claim Retrieval

This repo demonstrates how to perform retrieval over a collection of verified claims for a given query. The provided model is a fine-tuned version of AraBERT on the AraFACTs dataset. Tuning details can be found in this [repo](https://gitlab.com/watheq/detecting-previously-checked-claims-over-twitter).

First, fill in the required paths for the BERT model and the index. After that, only two steps are needed:

1. Create an object of the *ClaimRetrieval* class and pass the suitable parameters.
2. Invoke *retrieve_relevant_vclaims* and pass the tweets as parameters.

## Example

Here is a full example. First, prepare the input:

```
tweets = [
    {
        'id_str': '1433976054562045952',
        'full_text': 'مرتضى منصور : قررت ايقاف الولد امام عاشور وبيعه ولو هيجيبلي كأس العالم.. "مينفعش لاعيية تبقى بتصلي ولاعب صابغ شعره زي البنات.. ايه القرف ده .. وشاطر بس يطلب زيادة عقده و مش قادر يجري و يروح نايم لي على بطنه ويتسبب ان يخش فينا اجوان ، ما تسترجل يا ولد انت موقوف ومتحول للتحقيق " https://t.co/df2QvC0Zu9',
    },
]  # input tweet
```

Then, initialize the object. This only needs to be done once.

```
lang = "ar"
index_path = "path/to/pyterrier_index"
bert_name = "aubmindlab/bert-base-arabertv02"
trained_model_weights = "tuned_model_weights.bin"  # AraBERT weights

# Every argument is passed by keyword, because Python does not allow a
# positional argument after a keyword argument. The keyword names
# bert_name and trained_model_weights are assumed to match the variable names.
claim_retrieval = ClaimRetrieval(
    index_path=index_path,
    lang=lang,
    bert_name=bert_name,
    trained_model_weights=trained_model_weights,
    random_seed=42,
    depth=20,
    batch_size=8,
    num_classes=2,
    dropout=0.3,
    is_output_probability=True,
    num_layers=2,
    max_len=256,
)
```

Pass the input tweets to retrieve the relevant vclaims:

```
queries_and_relevant_vclaims = claim_retrieval.retrieve_relevant_vclaims(tweets)
```

## Citation

If you use any part of this repository, please consider citing our work:

```plaintext
@inproceedings{mansour2022did,
  title={Did I See It Before? Detecting Previously-Checked Claims over Twitter},
  author={Mansour, Watheq and Elsayed, Tamer and Al-Ali, Abdulaziz},
  booktitle={European Conference on Information Retrieval},
  pages={367--381},
  year={2022},
  organization={Springer}
}
```
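
## Building the input list

The example above passes tweets as a list of dictionaries, each with an `id_str` and a `full_text` key. When the tweets start out as plain `(id, text)` pairs, a small helper can produce that structure. The following is a minimal sketch: the function name `build_tweets` is hypothetical and not part of this repository, and skipping empty texts is an assumption made here for convenience. Only the two dictionary keys come from the example.

```python
def build_tweets(pairs):
    """Convert (tweet_id, text) pairs into a list of tweet dictionaries.

    Hypothetical helper: only the 'id_str' and 'full_text' keys are taken
    from the example in this README.
    """
    tweets = []
    for tweet_id, text in pairs:
        text = str(text).strip()
        if not text:
            # Assumption: an empty tweet carries no claim to look up.
            continue
        tweets.append({"id_str": str(tweet_id), "full_text": text})
    return tweets
```

The returned list can then be passed to `retrieve_relevant_vclaims` in the same way as the hand-written `tweets` list in the example.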