Adobe's code-generating agent reaches the top of GAIA leaderboard - and their paper cites my work!
Reminder: in short, agentic systems are a vehicle in which you put your LLM to give it access to the outside world.
The team of researchers at Adobe started from the idea that current agentic systems lack the ability to define their own tools. So they decided to make an agent that writes its actions as code, which lets it write Python functions that can be reused later as tools!
Here's what the LLM generations can look like with the proper prompt:
Thought: I need to access the excel file using a different method.
Action:
def access_excel_file(file_path):
    ...  # rest of the code (the agent does write it, but I don't have room in this post)
    return rows
Then your system executes this and appends the observation to the agent's memory.
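To make the execute-then-observe step concrete, here is a minimal sketch. It is not the paper's implementation: `run_action`, the `tools` dict and the `memory` list are hypothetical names chosen for this post, and plain `exec` stands in for whatever sandboxed interpreter a real system would use.

```python
import contextlib
import io

def run_action(code: str, tools: dict, memory: list) -> str:
    """Run agent-written code, keep the functions it defines, log the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # `tools` is the shared namespace: any function the code defines
            # stays there and can be called by later actions.
            exec(code, tools)
        observation = buffer.getvalue()
    except Exception as err:
        # Errors become observations too, so the agent can react to them.
        observation = f"Error: {type(err).__name__}: {err}"
    memory.append({"action": code, "observation": observation})
    return observation

tools, memory = {}, []
# First action defines a new function and uses it.
run_action("def double(x):\n    return 2 * x\nprint(double(4))", tools, memory)
# A later action reuses that function as a tool.
run_action("print(double(10) + 1)", tools, memory)
```

Since `exec` runs arbitrary code with no isolation, this sketch is only suitable for experimenting with trusted inputs.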
Why is this code formulation better than the classical formulation of tool use as JSON? The paper explains:
"Most existing work uses text or JSON as the representation of actions, which significantly lacks the two criteria mentioned earlier: generality and composability. In contrast, DynaSaur can utilize available actions or create new ones if necessary, using code as a unified representation. In principle, acting with code enables agents to solve any Turing-complete problem."
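A toy comparison can illustrate the composability point in that quote. Everything here is invented for illustration (the `list_files` and `count_rows` tools, their data, and the JSON schema); it only shows that a JSON action maps to one tool call, while a code action can chain, filter and aggregate calls in a single step.

```python
import json

# Hypothetical tools with hard-coded data, invented for this illustration.
def list_files(folder):
    return {"reports": ["q1.csv", "q2.csv", "notes.txt"]}.get(folder, [])

def count_rows(name):
    return {"q1.csv": 120, "q2.csv": 80}.get(name, 0)

TOOLS = {"list_files": list_files, "count_rows": count_rows}

# JSON formulation: one tool call per action, so each step needs its own round trip.
json_action = json.loads('{"tool": "list_files", "args": {"folder": "reports"}}')
json_result = TOOLS[json_action["tool"]](**json_action["args"])

# Code formulation: a single action composes both tools with a filter and a sum.
code_action = """
total = sum(count_rows(f) for f in list_files("reports") if f.endswith(".csv"))
"""
namespace = dict(TOOLS)
exec(code_action, namespace)
code_result = namespace["total"]
```

With JSON, getting the same total would take one `list_files` call plus one `count_rows` call per file, with the LLM doing the filtering and the addition itself between calls.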
The idea of using code is not new: in fact, we do it in transformers.agents (hence the citation I got). Their implementation adds further refinements, like using RAG to retrieve relevant functions before generating an action, which increases performance further.
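The retrieval step could look roughly like the sketch below. This is an assumption-heavy toy, not the paper's retriever: it scores each stored function's description against the task with a bag-of-words cosine similarity instead of learned embeddings, and `retrieve_tools` and the `library` contents are hypothetical.

```python
import math
import re
from collections import Counter

def _bag(text: str) -> Counter:
    """Lowercased word counts, a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve_tools(task: str, library: dict, k: int = 2) -> list:
    """Return the names of the k functions whose descriptions best match the task."""
    query = _bag(task)
    ranked = sorted(library, key=lambda name: _cosine(query, _bag(library[name])), reverse=True)
    return ranked[:k]

# Hypothetical library: function name -> description of previously written functions.
library = {
    "access_excel_file": "Read the rows of an excel file given its path",
    "fetch_web_page": "Download the html of a web page from a url",
    "count_words": "Count the words in a text string",
}
top = retrieve_tools("read rows from the excel file", library, k=1)
```

Only the retrieved functions would then be shown to the LLM in the prompt, which keeps the context short as the library of generated tools grows.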
And they observe that code agents perform much better, reaching the top of the GAIA leaderboard!
Go take a look, it's really clear and informative!
1M public posts from Bluesky's firehose API
Includes text, metadata, and language predictions
Perfect to experiment with using ML for Bluesky
Excited to see people build more open tools for a more open social media platform!