NVIDIA released Chat with RTX. Why is it significant? It's NVIDIA's first step towards the vision of "LLM as Operating System": a locally running, heavily optimized AI assistant that deeply integrates with your file system, features retrieval as a first-class citizen, preserves privacy, and leverages your desktop gaming GPU to the fullest.
A web-hosted version would have a hard time matching this functionality. NVIDIA is going local even before OpenAI.
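For the curious: the core loop here is plain retrieval-augmented generation, just running entirely on your own machine. Below is a minimal, dependency-free sketch of the idea. The embed() and generate() stubs are my hypothetical stand-ins, not the actual models Chat with RTX ships with.

```python
from collections import Counter
from pathlib import Path
import math

def embed(text: str) -> Counter:
    # Hypothetical stand-in: a bag-of-words "embedding".
    # A real local assistant would use a GPU-accelerated embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(folder: str) -> list[tuple[str, Counter]]:
    # Index every text file under a local folder; retrieval over your own
    # file system is what makes the assistant "deeply integrated".
    index = []
    for path in Path(folder).rglob("*.txt"):
        text = path.read_text(errors="ignore")
        index.append((text, embed(text)))
    return index

def answer(query: str, index, generate, k: int = 3) -> str:
    q = embed(query)
    # Retrieve the k most relevant local documents, entirely on-device.
    top = sorted(index, key=lambda doc: cosine(q, doc[1]), reverse=True)[:k]
    context = "\n---\n".join(text for text, _ in top)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)  # generate() = the locally running LLM (assumed)
```

The point of the sketch: every step (indexing, retrieval, generation) touches only local files and local compute, which is exactly why a web-hosted version struggles to match it.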
My TED talk is finally live!! I proposed the recipe for the "Foundation Agent": a single model that learns how to act in different worlds. LLMs scale across lots and lots of texts. The Foundation Agent scales across lots and lots of realities. If it can master 10,000 diverse simulated realities, it may well generalize to our physical world, which is simply the 10,001st reality.
Why do we want a single Foundation Agent instead of many smaller models? I'd like to quote an idea from my friend Prof. Yuke Zhu's CoRL keynote talk. If we trace each AI field's evolution, we'd find this pattern: specialist models → a single generalist model → specialized generalists, fine-tuned or distilled from the generalist.
And the "specialized generalist" is often way more powerful than the original specialist. Just like distilled versions of LLaMA are way better than the custom-built NLP systems of 5 years ago.
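To make "distilled" concrete, here is a minimal sketch of the classic knowledge-distillation loss (Hinton et al.), where a small student learns to match a big teacher's softened outputs. This is the generic technique, not the exact recipe behind any particular LLaMA derivative.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    # Soft targets: KL divergence to the teacher's temperature-softened
    # distribution (scaled by T^2 to keep gradient magnitudes comparable).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

The student inherits the generalist's broad knowledge while being small and specialized, which is why it beats a specialist trained from scratch.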
TED talks do not have teleprompters!! All I have is a "confidence monitor" at my feet, showing only the current slide and a timer. That means I had to memorize the whole speech. It sounds intimidating at first, but it turns out to be the best way to connect with the audience and deliver the ideas straight to the heart.