Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

reacted to merve's post with ❤️ about 1 hour ago
posted an update 6 days ago
posted an update 12 days ago

Organizations

None yet

nicolay-r's activity

reacted to merve's post with ❤️ about 1 hour ago
your Hugging Face profile now has your recent activities 🤗
posted an update 6 days ago
📢 For those interested in extracting information about ✍️ authors from texts, I'm happy to share a personal 📹 talk on "Reading Between the Lines: Adapting ChatGPT-Related Systems 🤖 for Implicit Information Retrieval".

YouTube: https://youtu.be/nXClX7EDYbE

🔑 In this talk, we refer to IIR as information that is expressed only indirectly by an ✍️ author / 👨 character / patient / any other entity.

📊 I cover 1️⃣ pre-processing and 2️⃣ reasoning techniques aimed at enhancing generative AI capabilities in IIR. To showcase the effectiveness of the proposed techniques, we experiment with IIR tasks such as Sentiment Analysis and Emotion Extraction / Causes Prediction.

The pictures below share quick takeaways on pipeline construction and experiment results 🧪

Related paper cards:
📜 emotion-extraction: https://nicolay-r.github.io/#semeval2024-nicolay
📜 sentiment-analysis: https://nicolay-r.github.io/#ljom2024

Models:
nicolay-r/flan-t5-tsa-thor-base
nicolay-r/flan-t5-emotion-cause-thor-base
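For a quick impression of how such a checkpoint can be queried, below is a minimal sketch using the transformers library. The prompt wording and the example sentence are my own illustrative assumptions; the exact three-hop templates used during fine-tuning are defined in the Reasoning-for-Sentiment-Analysis-Framework repository.

```python
# Minimal sketch: querying a fine-tuned Flan-T5 THOR checkpoint with transformers.
# NOTE: the prompt below is illustrative only; the exact three-hop templates
# used for fine-tuning are described in the framework repository.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "nicolay-r/flan-t5-tsa-thor-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Despite the long delays, the staff handled the situation calmly."
target = "the staff"

prompt = f'Given the sentence "{text}", what is the sentiment polarity towards {target}?'

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```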


📓 PS: I've got a hobby of advertising HPMoR ✨
posted an update 12 days ago
📢 Have you ever wondered how exactly Transformers are capable of handling long input contexts?
I got a chance to tackle this through the long-document text summarization problem, and I'm delighted to share the related survey and diagram for quick skimming below:

Preprint 📝 https://nicolay-r.github.io/website/data/preprint-AINL_2023_longt5_summarization.pdf
Springer 📝 https://link.springer.com/article/10.1007/s10958-024-07435-z

🎯 The aim of the survey was the development of a long-document summarizer for mass-media news in the Vietnamese language 🇻🇳

Sharing a quick overview of the performance of various LM-based solutions across several datasets, covering domain-oriented advances in the Vietnamese language (see the attached screenshots).

As a solution we consider:
☑️ 1. Adapting the existing google/pegasus-cnn_dailymail for summarizing a large dataset in order to arrange training data
☑️ 2. Tuning google/long-t5-tglobal-large for performing generative summarization (see the loading sketch at the end of this post).

Implementation details:
🌟 https://github.com/nicolay-r/ViLongT5
(It is simpler to go with Hugging Face rather than flaxformer, which has by now become a legacy engine.)
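As a rough orientation, here is a minimal sketch of loading a LongT5 checkpoint for generative summarization with the transformers library; the input length and generation settings are illustrative defaults of mine, not the configuration used in the paper.

```python
# Minimal sketch: generative summarization with LongT5 via transformers.
# max_length / beam settings below are illustrative, not the paper's configuration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/long-t5-tglobal-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_document = "..."  # a long news article (thousands of tokens)

inputs = tokenizer(long_document, return_tensors="pt", truncation=True, max_length=4096)
summary_ids = model.generate(**inputs, num_beams=4, max_new_tokens=256)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```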
reacted to m-ric's post with 🔥 19 days ago
> Oasis: First Real-Time Video Game Without a Game Engine! 🎮

DecartAI & Etched just released Oasis - a fully AI-generated video game running at 20 FPS (frames per second). The model takes keyboard inputs and generates everything - physics, rules, graphics - on the fly, without any game engine.

โšก๏ธ What makes this special? Current text-to-video models (Mochi-1, Sora, Kling) generate about 1 frame every 10-20 seconds (that's the kind of device I had to play LoL back in the day, thus my low rankings). Oasis is 200 times faster, making it the first playable AI-generated game.

โš™๏ธ Under the hood, it uses a vision transformer to encode space and a diffusion model to generate frames. The secret sauce is "dynamic noising" - a technique that keeps the video stable between frames.

Key insights:
⚡️ Generates 20 FPS, vs 0.2 FPS for other DiT-based video models
‣ The specialized Sohu hardware developed by Etched can handle 10x more players than an H100

🎮 Features real game mechanics
‣ Movement, jumping, item management
‣ Physics and lighting
‣ Procedurally generated worlds

⚠️ Current limitations
‣ Blurry graphics at a distance
‣ Objects sometimes change appearance
‣ Memory issues in long sessions

Try it yourself, the playable demo is impressive! 👉 https://oasis.decart.ai/welcome
Code 👉 https://github.com/etched-ai/open-oasis
Read it in full 👉 https://oasis-model.github.io/
posted an update 19 days ago
📢 If you aim to process complex dependencies in spreadsheet data with the LLM Chain-of-Thought technique, then this update might be valuable for you 💎

The updated 📦 bulk-chain 0.24.1, aimed at iterative processing of CSV/JSONL data with no strings attached to third-party LLM frameworks, is out 🎉

📦: https://pypi.org/project/bulk-chain/0.24.1/
🌟: https://github.com/nicolay-r/bulk-chain
📘: https://github.com/nicolay-r/bulk-chain/issues/26

The key feature of bulk-chain is SQLite caching, which saves your time ⏰️ and money 💵 by guaranteeing that no data is lost, which is important when using remote LLM providers such as OpenAI, ReplicateIO, OpenRouter, etc.

🔧 This release has the following updates:
✅ Improved stability across various header conditions and the related SQLite support
✅ Manual setup of the ID column / ID assignment
✅ Dynamic CSV-related setup, delegated to the related Python 📦 csv package.

Quick start on Google Colab:
📙: https://colab.research.google.com/github/nicolay-r/bulk-chain/blob/master/bulk_chain_tutorial.ipynb

Below is an example of the three simple steps in pictures (a concept sketch of the caching loop follows at the end of this post):
1. ⬇️ Package installation
2. ✍️ Declaring the schema
3. 🚀 Launching inference for your data with Replicate and 🤖 meta-llama/Llama-3.1-405B
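To make the caching idea concrete, here is a minimal concept sketch of iterating CSV rows through an LLM with an SQLite cache so that already-processed rows are never re-billed. This is not the bulk-chain API: `ask_llm`, the `data.csv` file and its `id`/`text` columns are hypothetical placeholders.

```python
# Concept sketch only -- NOT the bulk-chain API.
# Assumes a data.csv with "id" and "text" columns; ask_llm is a placeholder
# for a call to a remote provider (OpenAI, Replicate, OpenRouter, ...).
import csv
import sqlite3

def ask_llm(prompt: str) -> str:
    return "positive"  # placeholder; plug in your provider call here

conn = sqlite3.connect("cache.sqlite")
conn.execute("CREATE TABLE IF NOT EXISTS cache (row_id TEXT PRIMARY KEY, answer TEXT)")

with open("data.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        cached = conn.execute(
            "SELECT answer FROM cache WHERE row_id = ?", (row["id"],)).fetchone()
        if cached:
            continue  # already answered earlier: no repeated API cost
        answer = ask_llm(f"What is the sentiment of: {row['text']}")
        conn.execute("INSERT INTO cache VALUES (?, ?)", (row["id"], answer))
        conn.commit()  # persist immediately so an interruption loses nothing
```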
posted an update about 1 month ago
📢 Excited to share that our study 📄 "Large Language Models in Targeted Sentiment Analysis for Russian" has recently appeared in the 📘 Springer Lobachevskii Journal of Mathematics 🥳✨ ...

📘 https://link.springer.com/article/10.1134/S1995080224603758

In this study we provide a diverse look and experiments over various 🤖 LLMs 🤖 scaled from 7B, in two different modes: ❄️ zero-shot and 🔥 fine-tuned (Flan-T5 only) using the Three-Hop reasoning technique.
We showcase the importance of performing:
💚 text translation into English
💚 application of Chain-of-Thought for Implicit Sentiment Analysis
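As a rough illustration of that two-stage idea (translate to English, then prompt for the sentiment towards a target), here is a minimal sketch with the transformers pipelines; the models, the prompt wording, and the example sentence are my own illustrative choices, not the exact setup from the paper.

```python
# Minimal sketch: translate Russian text to English, then query a seq2seq LLM
# for the sentiment towards a target entity. Prompt wording is illustrative only.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-ru-en")
reasoner = pipeline("text2text-generation", model="google/flan-t5-base")

ru_text = "Несмотря на критику, министр сохранил спокойствие."  # "Despite the criticism, the minister stayed calm."
target = "the minister"  # target entity, referred to after translation

en_text = translator(ru_text)[0]["translation_text"]
prompt = f'Given the sentence "{en_text}", what is the sentiment towards "{target}"?'
print(reasoner(prompt, max_new_tokens=16)[0]["generated_text"])
```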

More:
📄 arXiv: https://arxiv.org/abs/2404.12342
🧑‍💻 Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
🤗 Models: Large Language Models in Targeted Sentiment Analysis (2404.12342)
🎥 Video @NLPSummit: https://www.youtube.com/watch?v=qawLJsRHzB4

THOR: https://github.com/scofield7419/THOR-ISA
posted an update about 1 month ago
📢 We are giving an extra two weeks before switching to the final stage of RuOpinionNE-2024.
⏰ The final stage starts on the 1st of November 2024.
We already have the first baseline submission by 👨‍💻 RefalMachine, which showcases F1 = 0.17 based on the Qwen2 model series.

For those who wish to attend:
📊 Codalab: https://codalab.lisn.upsaclay.fr/competitions/20244
🗒 Task: https://codalab.lisn.upsaclay.fr/competitions/20244#learn_the_details-overview
🔔 Updates: https://t.me/RuOpinionNE2024

🙋 Questions: https://nicolay-r.github.io/
🧪 Past experiments: https://github.com/nicolay-r/RuSentNE-LLM-Benchmark
reacted to fdaudens's post with 🧠🔥 about 1 month ago
The Nobel Prize background for Hopfield and Hinton's work on neural networks is pure gold. It's a masterclass in explaining AI basics.

Key takeaways from the conclusion:
- ML applications are expanding rapidly. We're still figuring out which will stick.
- Ethical discussions are crucial as the tech develops.
- Physics 🤝 AI: A two-way street of innovation.

Some mind-blowing AI applications in physics:
- Discovering the Higgs particle
- Cleaning up gravitational wave data
- Hunting exoplanets
- Predicting molecular structures
- Designing better solar cells

We're just scratching the surface. The interplay between AI and physics is reshaping both fields.

Bonus: The illustrations accompanying the background document are really neat. (Credit: Johan Jarnestad/The Royal Swedish Academy of Sciences)

#AI #MachineLearning #Physics #Ethics #Innovation
reacted to Jaward's post with 👀 about 1 month ago
reacted to fantaxy's post with 😎 about 1 month ago
NSFW Erotic Novel AI Generation
-NSFW Text (Data) Generator for Detecting 'NSFW' Text: Multilingual Experience

The multilingual NSFW text (data) auto-generator is a tool designed to automatically generate and analyze adult content in various languages. This service uses AI-based text generation to produce various types of NSFW content, which can then be used as training data to build effective filtering models. It supports multiple languages, including English, and allows users to input the desired language through the system prompt in the on-screen options to generate content in the specified language. Users can create datasets from the generated data, train machine learning models, and improve the accuracy of text analysis systems. Furthermore, content generation can be customized according to user specifications, allowing for the creation of tailored data. This maximizes the performance of NSFW text detection models.


Web: https://fantaxy-erotica.hf.space
API: https://replicate.com/aitechtree/nsfw-novel-generation

Usage Warnings and Notices: This tool is intended for research and development purposes only, and the generated NSFW content must adhere to appropriate legal and ethical guidelines. Proper monitoring is required to prevent the misuse of inappropriate content, and legal responsibility lies with the user. Users must comply with local laws and regulations when using the data, and the service provider is not liable for any issues arising from the misuse of the data.
reacted to reach-vb's post with 👍 about 1 month ago
The on-device AI framework ecosystem is blooming these days:

1. llama.cpp - All things Whisper, LLMs & VLMs - run across Metal, CUDA and other backends (AMD/ NPU etc)
https://github.com/ggerganov/llama.cpp

2. MLC - Deploy LLMs across platforms especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm

3. MLX - Arguably the fastest general purpose framework (Mac only) - Supports all major Image Generation (Flux, SDXL, etc), Transcription (Whisper), LLMs
https://github.com/ml-explore/mlx-examples

4. Candle - Cross-platform general purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle

Honorable mentions:

1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web
https://github.com/xenova/transformers.js

2. Mistral rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs

3. Ratchet - Cross platform, rust based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet

4. Zml - Cross platform, Zig based ML framework
https://github.com/zml/zml

Looking forward to how the ecosystem will look 1 year from now - quite bullish on the top 4 atm - but the open-source ecosystem changes quite a bit! 🤗

Also, which frameworks did I miss?
posted an update about 1 month ago
📢 Two weeks ago I got a chance to present the most recent reasoning 🧠 capabilities of Large Language Models in Sentiment Analysis at NLPSummit-2024.

For those who missed it and still wish to find out about the advances of GenAI in that field, the recording is now available:
https://www.youtube.com/watch?v=qawLJsRHzB4

You will learn:
☑️ how well LLM reasoning can be used for sentiment analysis in the zero-shot learning setup,
☑️ how to improve reasoning by applying step-by-step chains (Chain-of-Thought),
☑️ how to prepare the most advanced model for sentiment analysis using Chain-of-Thought.

Links:
📜 Paper: Large Language Models in Targeted Sentiment Analysis (2404.12342)
⭐ Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
reacted to qnguyen3's post with 🔥 about 1 month ago
reacted to mmhamdy's post with 👀 about 1 month ago
🔗 Evaluating Long Context #1: Long Range Arena (LRA)

Accurately evaluating how well language models handle long contexts is crucial, but it's also quite challenging to do well. In this series of posts, we're going to examine the various benchmarks that were proposed to assess long context understanding, starting with Long Range Arena (LRA).

Introduced in 2020, Long Range Arena (LRA) is one of the earliest benchmarks designed to tackle the challenge of long context evaluation.

📌 Key Features of LRA

1️⃣ Diverse Tasks: The LRA benchmark consists of a suite of tasks designed to evaluate model performance on long sequences ranging from 1,000 to 16,000 tokens. These tasks encompass different data types and modalities: Text, Natural and Synthetic Images, and Mathematical Expressions.

2️⃣ Synthetic and Real-world Tasks: LRA is comprised of both synthetic probing tasks and real-world tasks.

3️⃣ Open-Source and Extensible: Implemented in Python using Jax and Flax, the LRA benchmark code is publicly available, making it easy to extend.

📌 Tasks

1️⃣ Long ListOps

2️⃣ Byte-level Text Classification and Document Retrieval

3️⃣ Image Classification

4️⃣ Pathfinder and Pathfinder-X (Long-range spatial dependency)

👨‍💻 Long Range Arena (LRA) GitHub Repository: https://github.com/google-research/long-range-arena

📄 Long Range Arena (LRA) paper: Long Range Arena: A Benchmark for Efficient Transformers (2011.04006)
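To give a feel for what the Long ListOps task looks like, here is a tiny, simplified evaluator for the nested prefix expressions the benchmark uses (the operator set is reduced for brevity); it only illustrates the task format and is not part of the LRA codebase.

```python
# Simplified illustration of the ListOps format: the model must compute the
# value of a nested prefix expression. Not LRA code, just a format demo.
def eval_listops(tokens):
    token = tokens.pop(0)
    if token == "[":
        op = tokens.pop(0)                # e.g. MAX, MIN, SM (sum mod 10)
        args = []
        while tokens[0] != "]":
            args.append(eval_listops(tokens))
        tokens.pop(0)                     # consume "]"
        if op == "MAX":
            return max(args)
        if op == "MIN":
            return min(args)
        if op == "SM":
            return sum(args) % 10
        raise ValueError(f"unsupported operator: {op}")
    return int(token)

expr = "[MAX 4 [MIN 2 7] 0 [SM 8 5 3]]"
tokens = expr.replace("[", "[ ").replace("]", " ]").split()
print(eval_listops(tokens))  # -> 6, since SM(8,5,3)=6 and MAX(4,2,0,6)=6
```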
reacted to merve's post with 🔥 about 1 month ago
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV 🤗

Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation and more; all models have open weights and demos

All models have their demos and even torchscript checkpoints!
A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art consistent 3D generation model from images

Model: facebook/vfusion3d
Demo: facebook/VFusion3D

- CoTracker is the state-of-the-art point (pixel) tracking model

Demo: facebook/cotracker
Model: facebook/cotracker
posted an update about 2 months ago
📢 This year I made a decent amount of experiments on LLM reasoning capabilities in author opinion extraction.
However, they did not go further with:
↗️ annotation of other sources of opinion causes: entities, out-of-context objects (None).
📝 evaluation of factual statements that support the extracted sentiment.

To address these limitations, we have now launched the 🚀 RuOpinionNE-2024 competition on the Codalab platform:
📊 https://codalab.lisn.upsaclay.fr/competitions/20244

The competition is aimed at the extraction of opinion tuples (see the attached images and the illustrative example below) from texts written in Russian.
It builds on the findings of the past RuSentNE-2023 Codalab competition:
🔎 Past year competition: https://www.dialog-21.ru/media/5896/golubevaplusetal118.pdf
🔎 LLM reasoning 🧠: https://arxiv.org/abs/2404.12342
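For illustration only, an opinion tuple for a sentence like "The journalist criticized the new policy" might be represented as below; the field names here are my own shorthand, while the official tuple structure and submission format are defined on the competition page.

```python
# Illustrative opinion tuple (field names are my own shorthand,
# not the official RuOpinionNE-2024 submission format).
opinion = {
    "source": "the journalist",   # who holds the opinion
    "target": "the new policy",   # what the opinion is about
    "expression": "criticized",   # the opinion-bearing phrase
    "polarity": "negative",       # sentiment of the opinion
}
print(opinion)
```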

For those interested in adopting Generative AI, the complete information about the competition is below:
📊 RuOpinionNE-2024: https://codalab.lisn.upsaclay.fr/competitions/20244
🗒 Task description: https://codalab.lisn.upsaclay.fr/competitions/20244#learn_the_details-overview
🔔 To follow updates: https://t.me/RuOpinionNE2024
⏰ Stage deadlines (might be extended)
📦 Submission details (at the bottom of the competition page)

🙋 For questions you can contact @nicolay-r: https://nicolay-r.github.io/
🧪 Most recent findings on LLM application: https://github.com/nicolay-r/RuSentNE-LLM-Benchmark
reacted to Tonic's post with 👀 about 2 months ago
reacted to nroggendorff's post with 👀 about 2 months ago
pretty much all of the values in the llama training post are placeholders, so if you don't get a desirable result, tweak and tweak and tweak. It took months to get smallama to do anything.
reacted to clem's post with 👀 about 2 months ago