@deoxykev, I am still working on it, but for now, other zero-shot approaches built directly for that task, like GLiNER, demonstrate better precision.
Ihor Stepanov
AI & ML interests
Text classification, computational biology, relation extraction, path reasoning
Recent Activity
liked a model 2 days ago: knowledgator/gliclass-small-v1.0-init
updated a model 4 days ago: Ihor/gliner-biomed-large-rel-1stg-v1.0
published a model 4 days ago: Ihor/gliner-biomed-large-rel-1stg-v1.0
Ihor's activity
replied to their post 11 days ago
posted an update 16 days ago
🚀 Reproducing DeepSeek R1 for Text-to-Graph Extraction
I’ve been working on replicating DeepSeek R1, focusing on zero-shot text-to-graph extraction—a challenging task where LMs extract entities and relations from text based on predefined types.
🧠 Key Insight:
Language models struggle when constrained by entity/relation types. Supervised training alone isn’t enough, but reinforcement learning (RL), specifically Group Relative Policy Optimization (GRPO), shows promise.
💡 Why GRPO?
It trains the model to generate structured graphs while optimizing multiple reward functions (format, JSON validity, and extraction accuracy); a rough sketch of such rewards follows this list.
It allows the model to learn from both positive and hard negative examples dynamically.
RL can be fine-tuned to emphasize relation extraction improvements.
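For illustration only, here is a minimal sketch of reward functions of the kind listed above (format, JSON validity, extraction accuracy). The {"relations": [[head, relation, tail], ...]} output schema and the scoring details are assumptions, not the exact rewards used in training:

import json

def format_reward(completion: str) -> float:
    # Encourage output that looks like a single JSON object.
    stripped = completion.strip()
    return 1.0 if stripped.startswith("{") and stripped.endswith("}") else 0.0

def json_validity_reward(completion: str) -> float:
    # Full reward only if the completion parses as valid JSON.
    try:
        json.loads(completion)
        return 1.0
    except json.JSONDecodeError:
        return 0.0

def extraction_reward(completion: str, gold: dict) -> float:
    # F1 between predicted and gold (head, relation, tail) triples.
    # The {"relations": [[head, relation, tail], ...]} schema is assumed for illustration.
    try:
        pred = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    pred_triples = {tuple(r) for r in pred.get("relations", [])}
    gold_triples = {tuple(r) for r in gold.get("relations", [])}
    tp = len(pred_triples & gold_triples)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_triples)
    recall = tp / len(gold_triples)
    return 2 * precision * recall / (precision + recall)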
📊 Early Results:
Even with limited training, F1 scores consistently improved, and we saw clear benefits from RL-based optimization. More training = better performance!
🔬 Next Steps:
We’re scaling up experiments with larger models and high-quality data. Stay tuned for updates! Meanwhile, check out one of our experimental models here:
Ihor/Text2Graph-R1-Qwen2.5-0.5b
📔 Learn more details from the blog post: https://medium.com/p/d8b648d9f419
Feel free to share your thoughts and ask questions!
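If you want to poke at the experimental model with plain transformers, a minimal generation sketch could look like the following. The instruction wording and the entity/relation types are placeholders, and the chat template is assumed to carry over from the Qwen2.5 base; the exact prompt format used in training is described in the blog post:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ihor/Text2Graph-R1-Qwen2.5-0.5b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder instruction; see the blog post for the prompt format used in training.
prompt = (
    "Extract entities and relations as JSON.\n"
    "Entity types: [person, company, date]. Relation types: [founded, founded_on].\n"
    "Text: Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975."
)
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))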
posted an update 2 months ago
🚀 Welcome the New and Improved GLiNER-Multitask! 🚀
Since the release of our beta version, GLiNER-Multitask has received many positive responses. It's been embraced in many consulting, research, and production environments. Thank you, everyone, for your feedback; it helped us rethink the strengths and weaknesses of the first model, and we are excited to present the next iteration of this multi-task information extraction model.
💡 What’s New?
Here are the key improvements in this latest version:
🔹 Expanded Task Support: Now includes text classification and other new capabilities.
🔹 Enhanced Relation Extraction: Significantly improved accuracy and robustness.
🔹 Improved Prompt Understanding: Optimized for open-information extraction tasks.
🔹 Better Named Entity Recognition (NER): More accurate and reliable results.
🔧 How We Made It Better:
These advancements were made possible by:
🔹 Leveraging a better and more diverse dataset.
🔹 Using a larger backbone model for increased capacity.
🔹 Implementing advanced model merging techniques.
🔹 Employing self-learning strategies for continuous improvement.
🔹 Better training strategies and hyperparameter tuning.
📄 Read the Paper: https://arxiv.org/abs/2406.12925
⚙️ Try the Model: knowledgator/gliner-multitask-v1.0
💻 Test the Demo: knowledgator/GLiNER_HandyLab
📌 Explore the Repo: https://github.com/urchade/GLiNER
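For a quick start, here is a minimal prompt-driven sketch with the gliner package, in the same style as the examples further down this page; the prompt wording is illustrative, see the model card for the recommended prompts per task:

# pip install gliner -U
from gliner import GLiNER

model = GLiNER.from_pretrained("knowledgator/gliner-multitask-v1.0")

# The task is steered by the natural-language prompt; "match" marks spans answering it.
prompt = "Find all companies founded by Bill Gates:\n"
text = "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975."
labels = ["match"]

matches = model.predict_entities(prompt + text, labels)
for match in matches:
    print(match["text"], "=>", match["score"])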
posted an update 5 months ago
🚀 Let’s transform LLMs into encoders 🚀
Auto-regressive LMs have ruled, but encoder-based architectures like GLiNER are proving to be just as powerful for information extraction while offering better efficiency and interpretability. 🔍✨
Past encoder backbones were limited by small pre-training datasets and old techniques, but with innovations like LLM2Vec, we've transformed decoders into high-performing encoders! 🔄💡
What’s New?
🔹 Converted Llama & Qwen decoders to advanced encoders
🔹 Improved the GLiNER architecture to work with rotary positional encoding (RoPE)
🔹 New GLiNER (zero-shot NER) & GLiClass (zero-shot classification) models
🔥 Check it out:
New models: knowledgator/llm2encoder-66d1c76e3c8270397efc5b5e
GLiNER package: https://github.com/urchade/GLiNER
GLiClass package: https://github.com/Knowledgator/GLiClass
💻 Read our blog for more insights, and stay tuned for what’s next!
https://medium.com/@knowledgrator/llm2encoders-e7d90b9f5966
reacted to tomaarsen's post with 🚀 6 months ago
I just published Sentence Transformers v3.0.1: the first patch release since v3 from last week. It introduces gradient checkpointing, pushing model checkpoints to Hugging Face while training, model card improvements and fixes. Details:
1️⃣ Gradient checkpointing allows for much less memory usage at a cost of ~20% training speed. Seems to allow for higher batch sizes, which is quite important for loss functions with in-batch negatives.
2️⃣ You can specify args.push_to_hub=True and args.hub_model_id to upload your model checkpoints to Hugging Face while training. It also uploads your emissions (if codecarbon is installed) and your Tensorboard logs (if tensorboard is installed).
3️⃣ Model card improvements: improved automatic widget examples, better tags, and the default of "sentence_transformers_model_id" now gets replaced when possible.
4️⃣ Several evaluator fixes, see release notes for details.
5️⃣ Fixed a bug with MatryoshkaLoss throwing an error if the supplied Matryoshka dimensions are ascending instead of descending.
6️⃣ Full Safetensors support; even the uncommon modules can now save and load "model.safetensors" files: no more pickle risks.
Check out the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.0.1
And let me know what kind of features you'd like to see next! I have some plans already (ONNX, Sparse models, ColBERT, PEFT), but I don't yet know how I should prioritize everything.
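As a quick illustration of points 1️⃣ and 2️⃣, a minimal training-arguments sketch might look as follows; the output directory and hub model id are placeholders:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/my-embedding-model",        # placeholder path
    per_device_train_batch_size=128,                # larger batches become feasible with checkpointing
    gradient_checkpointing=True,                    # trades ~20% speed for much lower memory
    push_to_hub=True,                               # upload checkpoints to Hugging Face while training
    hub_model_id="my-username/my-embedding-model",  # placeholder repo id
)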
replied to their post 6 months ago
Thanks for sharing this development! Can you also write a blog or paper to understand it better? Thanks
https://blog.knowledgator.com/meet-the-new-zero-shot-ner-architecture-30ffc2cb1ee0
replied to their post 6 months ago
Yeah, we are working on it
posted an update 6 months ago
🚀 Meet the new GLiNER architecture 🚀
GLiNER revolutionized zero-shot NER by demonstrating that lightweight encoders can achieve excellent results. We're excited to continue R&D with this spirit 🔥. Our new bi-encoder and poly-encoder architectures were developed to address the main limitations of the original GLiNER architecture and bring the following new possibilities:
🔹 An unlimited number of entities can be recognized at once.
🔹Faster inference when entity embeddings are preprocessed.
🔹Better generalization to unseen entities.
While the bi-encoder architecture can lack inter-label understanding, we developed a poly-encoder architecture with post-fusion. It achieves the same or even better results on many benchmarking datasets compared to the original GLiNER, while still offering the listed advantages of bi-encoders.
Now, it’s possible to run GLiNER with hundreds of entities much faster and more reliably.
📌 Try the new models here:
knowledgator/gliner-bi-encoders-66c492ce224a51c54232657b
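A minimal sketch of the headline use case (many entity types in a single pass) is shown below; the model id is assumed to be one of the bi-encoder checkpoints from the collection linked above:

from gliner import GLiNER

# Model id assumed from the collection; swap in any bi- or poly-encoder checkpoint.
model = GLiNER.from_pretrained("knowledgator/gliner-bi-large-v1.0")

text = "Dr. Paul Hammond, a neurologist at Johns Hopkins University in Baltimore, presented his findings at the NIH."
# Labels are embedded independently of the text, so long label lists stay cheap
# and their embeddings can be precomputed and reused.
labels = ["person", "organization", "city", "country", "disease", "drug",
          "gene", "protein", "event", "date", "job title", "award"]

for entity in model.predict_entities(text, labels):
    print(entity["text"], "=>", entity["label"])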
posted an update 7 months ago
🚀 Meet Our New Line of Efficient and Accurate Zero-Shot Classifiers! 🚀
The new architecture brings better inter-label understanding and can solve complex classification tasks in a single forward pass.
Key Applications:
✅ Multi-class classification (up to 100 classes in a single run)
✅ Topic classification
✅ Sentiment analysis
✅ Event classification
✅ Prompt-based constrained classification
✅ Natural Language Inference
✅ Multi- and single-label classification
knowledgator/gliclass-6661838823756265f2ac3848
knowledgator/GLiClass_SandBox
knowledgator/gliclass-base-v1.0-lw
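A minimal usage sketch with the GLiClass package follows; the pipeline call mirrors the repository README, so treat the exact arguments as assumptions:

# pip install gliclass
from gliclass import GLiClassModel, ZeroShotClassificationPipeline
from transformers import AutoTokenizer

model = GLiClassModel.from_pretrained("knowledgator/gliclass-base-v1.0-lw")
tokenizer = AutoTokenizer.from_pretrained("knowledgator/gliclass-base-v1.0-lw")

pipeline = ZeroShotClassificationPipeline(
    model, tokenizer, classification_type="multi-label", device="cpu"
)

text = "The new headphones have excellent noise cancelling, but the battery drains quickly."
labels = ["positive", "negative", "product review", "news", "sports"]

# One forward pass scores all labels; results[0] holds the label/score pairs for the first text.
results = pipeline(text, labels, threshold=0.3)[0]
for result in results:
    print(result["label"], "=>", result["score"])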
posted an update 8 months ago
We’re thrilled to share our latest technical paper on the multi-task GLiNER model. Our research dives into the following exciting and forward-thinking topics:
🔍 Zero-shot NER & Information Extraction: We demonstrate that with diverse and ample data, paired with the right architecture, encoders can achieve impressive results across various extraction tasks;
🛠️ Synthetic Data Generation: Leveraging open labelling by LLMs like Llama, we generated high-quality training data. Our student model even outperformed the teacher model, highlighting the potential of this approach.
🤖 Self-Learning: Our model showed consistent improvements in performance without labelled data, achieving up to a 12% increase in F1 score for initially challenging topics. This ability to learn and improve autonomously is a very promising direction for future research!
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks (2406.12925)
knowledgator/gliner-multitask-large-v0.5
knowledgator/GLiNER_HandyLab
#!pip install gliner -U
from gliner import GLiNER
model = GLiNER.from_pretrained("knowledgator/gliner-multitask-large-v0.5")
text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800.
"""
labels = ["founder", "computer", "software", "position", "date"]
entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
posted an update 9 months ago
We are super happy to contribute to the GLiNER ecosystem by optimizing training code and releasing a multi-task, prompt-tunable model.
The model can be used for the following tasks:
* Named entity recognition (NER);
* Open information extraction;
* Question answering;
* Relation extraction;
* Summarization;
Model: knowledgator/gliner-multitask-large-v0.5
Demo: knowledgator/GLiNER_HandyLab
Repo: 👨💻 https://github.com/urchade/GLiNER
**How to use**
First of all, install the gliner package:

pip install gliner

Then try the following code:
from gliner import GLiNER

model = GLiNER.from_pretrained("knowledgator/gliner-multitask-large-v0.5")

prompt = """Find all positive aspects about the product:\n"""
text = """
I recently purchased the Sony WH-1000XM4 Wireless Noise-Canceling Headphones from Amazon and I must say, I'm thoroughly impressed. The package arrived in New York within 2 days, thanks to Amazon Prime's expedited shipping.
The headphones themselves are remarkable. The noise-canceling feature works like a charm in the bustling city environment, and the 30-hour battery life means I don't have to charge them every day. Connecting them to my Samsung Galaxy S21 was a breeze, and the sound quality is second to none.
I also appreciated the customer service from Amazon when I had a question about the warranty. They responded within an hour and provided all the information I needed.
However, the headphones did not come with a hard case, which was listed in the product description. I contacted Amazon, and they offered a 10% discount on my next purchase as an apology.
Overall, I'd give these headphones a 4.5/5 rating and highly recommend them to anyone looking for top-notch quality in both product and service.
"""

# Prompt and text are concatenated; the single "match" label returns spans answering the prompt.
input_ = prompt + text
labels = ["match"]

matches = model.predict_entities(input_, labels)
for match in matches:
    print(match["text"], "=>", match["score"])
posted an update 9 months ago
We are pleased to announce the new line of universal token classification models 🔥
knowledgator/universal-token-classification-65a3a5d3f266d20b2e05c34d
It can perform various information extraction tasks by analysing input prompts and recognising the parts of a text that satisfy them. Compared with the first version, the second one is more general and can recognise not only entities but also whole sentences and even paragraphs.
The model can be used for the following tasks:
* Named entity recognition (NER);
* Open information extraction;
* Question answering;
* Relation extraction;
* Coreference resolution;
* Text cleaning;
* Summarization;
How to use:
from utca.core import (
    AddData,
    RenameAttribute,
    Flush,
)
from utca.implementation.predictors import (
    TokenSearcherPredictor, TokenSearcherPredictorConfig
)
from utca.implementation.tasks import (
    TokenSearcherNER,
    TokenSearcherNERPostprocessor,
)

# Build a predictor around the universal token classification model.
predictor = TokenSearcherPredictor(
    TokenSearcherPredictorConfig(
        device="cuda:0",
        model="knowledgator/UTC-DeBERTa-base-v2",
    )
)

# NER task with a confidence threshold applied in post-processing.
ner_task = TokenSearcherNER(
    predictor=predictor,
    postprocess=[TokenSearcherNERPostprocessor(
        threshold=0.5,
    )]
)

pipeline = (
    AddData({"labels": ["scientist", "university", "city"]})
    | ner_task
    | Flush(keys=["labels"])
    | RenameAttribute("output", "entities")
)

res = pipeline.run({
    "text": """Dr. Paul Hammond, a renowned neurologist at Johns Hopkins University, has recently published a paper in the prestigious journal "Nature Neuroscience". """
})