Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval · Medical Multimodal NLP (🖼+📝) · Research Fellow @BU_Research · software developer of http://arekit.io · PhD in NLP


Posts

Post
📢 For those interested in extracting information about ✍️ authors from texts, I'm happy to share a personal 📹 recording of the talk "Reading Between the Lines: Adapting ChatGPT-related Systems 🤖 for Implicit Information Retrieval".

YouTube: https://youtu.be/nXClX7EDYbE

🔑 In this talk, we refer to IIR as information that is expressed indirectly by an ✍️ author, 👨 character, patient, or any other entity.

📊 I cover 1️⃣ pre-processing and 2️⃣ reasoning techniques aimed at enhancing generative AI capabilities in IIR. To showcase the effectiveness of the proposed techniques, we experiment with IIR tasks such as Sentiment Analysis and Emotion Extraction / Cause Prediction.

In the pictures below, I share quick takeaways on pipeline construction and experiment results 🧪

Related paper cards:
📜 emotion-extraction: https://nicolay-r.github.io/#semeval2024-nicolay
📜 sentiment-analysis: https://nicolay-r.github.io/#ljom2024

Models:
nicolay-r/flan-t5-tsa-thor-base
nicolay-r/flan-t5-emotion-cause-thor-base
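As a quick-start illustration, here is a minimal sketch of querying one of the checkpoints above as a plain text-to-text model with the transformers library. The prompt wording below is my own assumption for illustration; the actual THoR pipeline applies multi-hop prompting, so check the model card for the exact templates.

```python
# Hedged sketch: querying nicolay-r/flan-t5-tsa-thor-base as a plain
# text-to-text model via the transformers library. The prompt below is
# an illustrative assumption, NOT the exact THoR prompt template.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "nicolay-r/flan-t5-tsa-thor-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def generate(prompt: str) -> str:
    # Single text-to-text generation step.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

answer = generate(
    "Given the sentence 'The food was great, but the waiter ignored us', "
    "what is the sentiment towards 'the waiter'?"
)
print(answer)
```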


📓 PS: I've picked up a hobby of advertising HPMoR ✨ 😁
Post
📢 Have you ever wondered how exactly Transformers are capable of handling long input contexts?
I got a chance to tackle this through the long-document summarization problem, and I'm delighted to share the related survey and a diagram for quick skimming below:

Preprint 📝 https://nicolay-r.github.io/website/data/preprint-AINL_2023_longt5_summarization.pdf
Springer 📝 https://link.springer.com/article/10.1007/s10958-024-07435-z

🎯 The aim of the survey was to develop a long-document summarizer for mass-media news in Vietnamese 🇻🇳

Sharing a quick overview of the performance of various LM-based solutions across several datasets, covering domain-oriented advances for the Vietnamese language (see attached screenshots).

As for the solution, we consider:
☑️ 1. Adapting the existing google/pegasus-cnn_dailymail to summarize a large dataset, arranging the training data.
☑️ 2. Tuning google/long-t5-tglobal-large to perform generative summarization.

Implementation details:
🌟 https://github.com/nicolay-r/ViLongT5
(It is simpler to go with Hugging Face transformers than with flaxformer, which has since become a legacy engine.)
