All HF Hub posts

umarbutler 
posted an update 3 days ago
Isaacus, the AI research company building legal superintelligence, is hiring!

We're looking for passionate engineers who love to build and tinker and want to have an impact on the world. Specifically, we're hiring:
• ML engineers (Australia).
• Data engineers (Australia).
• Full-stack engineers (Australia).
• DevRel engineers (Australia, San Francisco, and London).
• DevOps engineers (Australia, San Francisco, and London).

If you'd like to be a founding employee at one of the few VC-backed LLM research labs in the world, receive generous equity compensation, and work alongside other highly motivated, highly skilled engineers, get in touch: https://isaacus.com/careers
DedeProGames 
posted an update 1 day ago
Introducing GRM2, a powerful 3 billion parameter model designed for long-term reasoning and high performance in complex tasks.

Even with only 3 billion parameters, it outperforms qwen3-32b in several benchmarks and complex reasoning tasks.

It can also generate long, complex programs of over 1,000 lines and use tools about as well as larger models, which makes it well suited to agentic tasks.

GRM2 is licensed under Apache 2.0, making it a convenient base for fine-tuning on other tasks.
You can see more here: OrionLLM/GRM2-3b
salma-remyx 
posted an update 2 days ago
We built an OpenClaw 🦞 skill that sends daily ranked research recommendations to Slack using the Remyx AI CLI.

You define Research Interests (topics, HF models, GitHub repos, blogs, etc.), Remyx ranks new arXiv papers and repos to find the most relevant resources, and an OpenClaw cron job posts a formatted digest to your team's #research channel every weekday morning.

The tutorial covers the full setup end-to-end: installing the CLI, creating interests, connecting OpenClaw to Slack, installing the Remyx skill, and scheduling the cron. About 15 minutes start to finish.
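
As a rough illustration of the "ranked digest" step only, here is a minimal sketch in Python. It assumes each recommendation is just a title, a URL, and a relevance score. `Recommendation`, `build_digest`, and `is_weekday` are hypothetical names, not part of the Remyx CLI or OpenClaw. The only outside convention it relies on is Slack's `<url|label>` link syntax inside a `text` payload, which incoming webhooks accept.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    title: str
    url: str
    score: float  # hypothetical relevance score; higher means more relevant


def build_digest(recs, top_k=3):
    """Return a Slack-style payload listing the top_k highest-scoring items."""
    ranked = sorted(recs, key=lambda r: r.score, reverse=True)[:top_k]
    lines = [
        f"{i}. <{r.url}|{r.title}> (score {r.score:.2f})"
        for i, r in enumerate(ranked, start=1)
    ]
    return {"text": "Daily research digest\n" + "\n".join(lines)}


def is_weekday(weekday_index):
    """Monday is 0 and Sunday is 6, matching datetime.date.weekday()."""
    return 0 <= weekday_index <= 4


# Placeholder data for illustration only.
sample = [
    Recommendation("Paper A", "https://example.com/a", 0.42),
    Recommendation("Repo B", "https://example.com/b", 0.91),
]
payload = build_digest(sample, top_k=2)
```

Posting `payload` as JSON to a Slack incoming-webhook URL on weekday mornings would complete the loop; in the tutorial's setup that scheduling is handled by the OpenClaw cron job rather than by code like this.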

Tutorial: https://docs.remyx.ai/tutorials/daily-research-digest-slack

Would love to hear how folks are tracking research for their projects. If you give this a try, let us know what you think!
Shrijanagain 
posted an update 1 day ago
We are thrilled to announce the launch of SKT-OMNI-CORPUS-146T-V1, a massive-scale, high-quality dataset designed to power the next generation of Foundation Models (LLMs) from scratch.
Developed at SKT AI LABS, this corpus is not just a collection of data; it’s a mission to decentralize high-grade AI training for regional languages and global knowledge.

💎 Key Highlights:

• Massive Scale: Targeting a multi-terabyte architecture for 146T-level tokenization.

• Pure Quality: Curated from 500+ Elite Sources.

• Structured for MoE: Perfectly sharded into 3.5GB standardized units (SKT-𝕻 series) for seamless distributed training.

🤝 Open for Collaboration!

We are looking for AI researchers, CUDA engineers, and data scientists to join us in this journey of building Project Surya and the ST-X Series models. Whether it's optimization, custom tokenization, or architecture design, let’s build the future together.

Explore the Dataset on Hugging Face:

🔗 Shrijanagain/SKT-OMNI-CORPUS-146T-V1

DSR -- 🔗 Shrijanagain/SKT-DSRx10000

#AI #MachineLearning #OpenSource #IndicAI #SKTAILABS #LLM #BigData #HuggingFace #InnovationIndia
unmodeled-tyler 
posted an update 2 days ago
Here's a demo of Vessel Browser in action!

Minimax M2.7 was challenged to navigate to a large e-commerce site, curate a selection of 5 different products, and add them all to the cart, with reasoning for each choice. (Or try it yourself: open source, MIT license, and BYOK!)
npm i @quanta-intellect/vessel-browser

Vessel is a browser I've been working on, designed specifically for agents, with human-in-the-loop visibility. It comes with a local MCP server, so any harness that supports custom MCP can control the browser. Additionally, you can BYOK to 8+ different providers (including custom OpenAI-compatible endpoints and local models).

One of my favorite features of the browser is persistent, bi-directional highlighting - meaning that both you AND the agent can highlight anything on the screen and the agent receives the context.

Vessel Browser is unique in that it surfaces available tools contextually to the agent, meaning the agent doesn't have to decide between 80+ tools at any given time, but rather is focused on a subset of tools most applicable to the current state.
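
To make the idea of contextual tool surfacing concrete, here is a minimal sketch in Python. It is not Vessel's implementation: `Tool`, `surface_tools`, and the page-feature names are all hypothetical, and the rule used (offer a tool only when every feature it requires is present on the current page) is just one simple way to narrow a large tool list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tool:
    name: str
    # Page features that must all be present for the tool to be offered.
    requires: frozenset = field(default_factory=frozenset)


def surface_tools(tools, page_features):
    """Return only the tools whose requirements are met by the current page."""
    features = set(page_features)
    return [t for t in tools if t.requires <= features]


# Hypothetical tool registry for illustration.
TOOLS = [
    Tool("navigate"),  # no requirements, so it is always available
    Tool("fill_form", frozenset({"form"})),
    Tool("add_to_cart", frozenset({"product", "cart_button"})),
]

on_product_page = [t.name for t in surface_tools(TOOLS, {"product", "cart_button"})]
on_blank_page = [t.name for t in surface_tools(TOOLS, set())]
```

On the hypothetical product page the agent would see `navigate` and `add_to_cart` but not `fill_form`, so its choice is made from a smaller, more relevant set.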

Give it a try!

https://github.com/unmodeled-tyler/vessel-browser
prithivMLmods 
posted an update 4 days ago
Map-Anything v1 (Universal Feed-Forward Metric 3D Reconstruction) demo is now available on Hugging Face Spaces. Built with Gradio and integrated with Rerun, it supports multi-image and video-based 3D reconstruction, depth and normal map estimation, and interactive measurements.

🤗 Demo: prithivMLmods/Map-Anything-v1
🤗 Model: facebook/map-anything-v1
🤗 Hf-Papers: MapAnything: Universal Feed-Forward Metric 3D Reconstruction (2509.13414)
kanaria007 
posted an update about 10 hours ago
✅ Article highlight: *Law as Goal Surfaces* (art-60-048, v0.1)

TL;DR:
Most “AI + law” discussions go wrong in one of two ways: either an LLM is asked to explain the law and everyone hopes it is right, or a rules engine gets bolted onto the side of the system.

This article sketches a different approach: treat *law itself as structure* inside SI-Core. Legal constraints sit alongside safety, fairness, and budget in the same GoalSurface / ETH machinery, while procedure — who may do what, when, with which approvals, exceptions, and appeals — becomes first-class runtime structure.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• moves law from “best-effort compliance” to structural constraints
• makes legal procedure explicit instead of hiding it in side channels
• supports both *ex-ante* prevention of illegal actions and *ex-post* auditability
• treats appeals, exceptions, and discretion as governed objects, not ad hoc overrides

What’s inside:
• *LegalSurface* as a GoalSurface specialization for regulation and policy
• hard rules in *ETH constraints* + soft legal/policy objectives for optimization
• roles, principals, jurisdictions, approvals, and source provenance
• procedural structure for conditions, exceptions, and appeals
• a mental model: *law = goal surfaces + hard ETH constraints + roles/principals*
• SI-Core as a kind of *procedural VM* for executing those bundles on real events

Key idea:
Law should not be an afterthought around intelligent systems. It should be part of the runtime structure that determines what is admissible, what needs review, and how decisions remain explainable.
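
A toy sketch of the "hard constraints + soft objectives" mental model, written in Python. This is not SI-Core, LegalSurface, or ETH code; `choose_action` and the example fields are hypothetical. It only shows the ordering the post describes: hard rules decide what is admissible first, and soft objectives rank whatever remains.

```python
def choose_action(candidates, hard_constraints, soft_objectives):
    """Pick the admissible candidate with the highest weighted soft score.

    hard_constraints: predicates that must all return True for a candidate.
    soft_objectives: (weight, scoring_function) pairs used only for ranking.
    Returns None when every candidate violates at least one hard constraint.
    """
    admissible = [c for c in candidates if all(rule(c) for rule in hard_constraints)]
    if not admissible:
        return None

    def score(candidate):
        return sum(weight * objective(candidate) for weight, objective in soft_objectives)

    return max(admissible, key=score)


# Hypothetical example: an approval is legally required; cost is a soft goal.
candidates = [
    {"name": "auto_approve", "has_approval": False, "cost": 1.0},
    {"name": "route_to_reviewer", "has_approval": True, "cost": 3.0},
    {"name": "escalate", "has_approval": True, "cost": 5.0},
]
hard = [lambda c: c["has_approval"]]
soft = [(1.0, lambda c: -c["cost"])]

best = choose_action(candidates, hard, soft)
```

Here the cheapest option is excluded outright because it lacks the required approval, and the cheaper of the two admissible options wins. No weighting of the soft objective can bring the excluded option back, which is the point of keeping hard rules separate from optimization.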
Jiaqi-hkust 
posted an update about 18 hours ago
🛰️ Introducing Awesome-Remote-Sensing-Agents: The Largest Curated Collection of Intelligent Remote Sensing Agents

We are excited to share our new repository, Awesome-Remote-Sensing-Agents: a comprehensive, community-driven collection of 100+ papers at the intersection of remote sensing and intelligent agents (LLMs, VLMs, multi‑agent systems, etc.).

🔗 GitHub Repository: https://github.com/PolyX-Research/Awesome-Remote-Sensing-Agents

Our repository organizes this rapidly growing field into a structured, easy‑to‑navigate resource for researchers, practitioners, and enthusiasts.

📚 What’s Inside?
We’ve carefully curated papers across 6 key application domains:
🌿 Ecological Monitoring – forest fires, biodiversity, climate science
🚨 Emergency Response – flood mapping, wildfire tracking, disaster geolocalization
⛏️ Geological Exploration – mineral mapping, lithological recognition, geologic reasoning
🌊 Marine Supervision – ocean science, autonomous surface vehicles
🌾 Precision Agriculture – crop disease detection, land use simulation
🏙️ Urban Governance – change detection, urban planning, embodied navigation

🤝 Join the Community!
We warmly welcome contributions to keep this list up‑to‑date:
📝 Add missing papers via Pull Request
🏷️ Propose new or refined categories
🔗 Report broken links or outdated entries
💬 Discuss via GitHub Issues or contact the authors
datalyes 
posted an update 1 day ago
Hello there! Quick update:

After multiple requests (very grateful for the interest), all 15 PatenTeb tasks are now accessible (automatic request approval).
abdurrahmanbutler 
posted an update 1 day ago
Isaacus just shipped a major update to semchunk: AI-powered chunking based on a document’s knowledge graph representation⚡

This isn’t a tweak on existing semantic chunking. It’s an entirely new paradigm, built on hierarchical document segmentation rather than heuristics or standard embedding-based semantic approaches.

We benchmarked our AI chunking mode across a full RAG pipeline against popular alternatives like LangChain, Chonkie, and our own non-AI semantic chunker. The results were clear: semchunk’s AI mode delivered a 15% relative improvement in RAG correctness over Chonkie. It also produced more aesthetically coherent and readable chunks, as judged by a human evaluator, while being faster than all other chunking methods when run on a consumer PC.

These gains are powered by Isaacus' Kanon 2 Enricher model, which performs hierarchical document segmentation and directly powers our AI chunking mode.
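
For readers unfamiliar with chunking along a document hierarchy, here is a minimal, model-free sketch in Python. It is not semchunk's API or the Kanon 2 Enricher: `Node`, `full_text`, and `chunk` are hypothetical, and the section tree is assumed to be given already, whereas in the approach described above a model produces that segmentation. The sketch only shows the general idea of emitting the largest subtrees that fit a size budget.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str = ""  # text owned directly by this node, such as a heading
    children: list = field(default_factory=list)


def full_text(node):
    """Concatenate a node's own text with all of its descendants' text."""
    parts = [node.text] + [full_text(child) for child in node.children]
    return " ".join(p for p in parts if p)


def chunk(node, max_words):
    """Emit the largest subtrees whose text fits within max_words.

    A leaf that is still too long is emitted as-is; a real chunker
    would split it further.
    """
    text = full_text(node)
    if len(text.split()) <= max_words or not node.children:
        return [text] if text else []
    chunks = [node.text] if node.text else []
    for child in node.children:
        chunks.extend(chunk(child, max_words))
    return chunks


# Hypothetical document tree for illustration.
doc = Node("Title", [
    Node("Intro", [Node("a b c"), Node("d e f")]),
    Node("Body", [Node("g h i j k l m n")]),
])
chunks = chunk(doc, max_words=8)
```

With an 8-word budget, the whole "Intro" section stays together as one chunk, while "Body" is too long and is split at its child boundary, so chunk edges follow the document's structure rather than a fixed window.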

As far as we know, semchunk is one of the first chunking libraries to offer true AI-powered, hierarchical-segmentation-based chunking, and the results show how much better RAG can get when chunking improves.

https://huggingface.co/blog/isaacus/introducing-ai-chunking-to-semchunk