Custom programming languages make agents good (blog.firetiger.com)

hwernetti 2 days ago

Great read! We're running into a similar problem at my company: we've given agents the ability to query our databases but not enough guidance to write correct and efficient queries. I haven't tried solving this problem yet, but I'm curious whether you explored any code-to-SQL approaches, something similar to SQLAlchemy but with your own guardrails and customizations?

spenczar5 1 day ago

That's a pretty interesting idea! I guess 160+ is sort of doing some of that for us - it compiles to SQL WHERE clauses, right - but generally, we found good results giving it a SQL dialect directly.

I think some of the reason is that there's so much coverage of writing SQL in its training set.
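
For readers wondering what "compiles to SQL WHERE clauses" with guardrails could look like, here is a minimal sketch. It is not the approach from the post: the nested-tuple expression format, the `compile_where` function, the column allow-list, and the `?` placeholder style are all assumptions made up for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical allow-lists. The guardrail is that anything outside these
# sets is rejected before any SQL text is produced.
ALLOWED_COLUMNS = {"status", "latency_ms", "service"}
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def compile_where(expr: tuple) -> tuple[str, list[Any]]:
    """Compile a nested tuple expression into (sql_fragment, params).

    Leaves look like (column, op, value); branches look like
    ("and" | "or", sub_expr, sub_expr, ...). Values are never inlined,
    they are returned as bind parameters.
    """
    head = expr[0]
    if head in ("and", "or"):
        if len(expr) < 2:
            raise ValueError(f"'{head}' needs at least one sub-expression")
        parts: list[str] = []
        params: list[Any] = []
        for sub in expr[1:]:
            sql, sub_params = compile_where(sub)
            parts.append(sql)
            params.extend(sub_params)
        joiner = f" {head.upper()} "
        return "(" + joiner.join(parts) + ")", params

    if len(expr) != 3:
        raise ValueError(f"expected (column, op, value), got {expr!r}")
    column, op, value = expr
    if column not in ALLOWED_COLUMNS:
        raise ValueError(
            f"unknown column '{column}'; valid columns: {sorted(ALLOWED_COLUMNS)}"
        )
    if op not in ALLOWED_OPS:
        raise ValueError(
            f"unknown operator '{op}'; valid operators: {sorted(ALLOWED_OPS)}"
        )
    return f"{column} {op} ?", [value]


sql, params = compile_where(
    ("and", ("status", "=", "error"), ("latency_ms", ">", 500))
)
```

Because only allow-listed column names and operators ever reach the SQL string, and values travel separately as parameters, the agent cannot produce arbitrary SQL through this path. The trade-off raised in this thread still applies: a model may be more fluent in plain SQL than in a made-up expression format.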

hwernetti 1 day ago

Good point, it makes a lot of sense to use a tool that has plenty of sample usage data available.

mmmehulll 2 days ago

This is really interesting. I feel that if existing LLMs can handle this without fine-tuning, it could be huge.

spenczar5 2 days ago

Yes! This works really well from Sonnet 4.5 onwards, in our experience. Sonnet 4.0 was a little rocky - we had to give it tons of documentation - but by now it works without much effort.

One thing that works very well is just giving it one or two example valid programs/statements in the custom language. It usually picks up what you're getting at very quickly.

When it slips up, you get good signal you can capture for improving the language. If you're doing things in a standard agent-y loop, a good error message also helps it course-correct.
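
A rough sketch of that loop, under stated assumptions: the three-token toy language, the `validate` and `run_with_feedback` functions, and the `fake_model` stub are all hypothetical stand-ins, not anything from the post. A real setup would call an actual model and a real parser.

```python
from __future__ import annotations

from typing import Callable, Optional

# Hypothetical vocabulary for a toy "<field> <op> <value>" language.
KNOWN_FIELDS = {"status", "latency_ms", "service"}
KNOWN_OPS = {"=", "<", ">"}


def validate(statement: str) -> Optional[str]:
    """Return None if the statement is valid, else an error message
    written so that the model can act on it (it lists the valid options)."""
    parts = statement.split()
    if len(parts) != 3:
        return f"expected '<field> <op> <value>', got {len(parts)} tokens"
    field, op, _value = parts
    if field not in KNOWN_FIELDS:
        return f"unknown field '{field}'; valid fields: {sorted(KNOWN_FIELDS)}"
    if op not in KNOWN_OPS:
        return f"unknown operator '{op}'; valid operators: {sorted(KNOWN_OPS)}"
    return None


def run_with_feedback(
    model: Callable[[str], str], task: str, max_attempts: int = 3
) -> str:
    """Ask the model for a statement, feeding each error back into the prompt."""
    prompt = task
    for _ in range(max_attempts):
        statement = model(prompt)
        error = validate(statement)
        if error is None:
            return statement
        # The failed attempt and its error are the signal worth logging too.
        prompt = f"{task}\nYour last attempt was: {statement}\nError: {error}"
    raise RuntimeError(f"no valid statement after {max_attempts} attempts")


def fake_model(prompt: str) -> str:
    """Stub standing in for an LLM: wrong field first, corrected after an error."""
    return "latency_ms > 500" if "Error:" in prompt else "latency > 500"


result = run_with_feedback(fake_model, "find slow requests")
```

The error text does most of the work here: naming the valid fields gives the model enough to recover in one step, and the same failed attempts can be collected to decide where the language itself needs to change.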

mmmehulll 2 days ago

That’s really interesting. The “one or two examples + good error messages” part feels especially important. It suggests the limiting factor may be less finetuning and more whether the model is given a tight representation and a feedback loop it can recover from.

spenczar5 2 days ago

Author here! I am pretty jazzed about these ideas and happy to dig into more detail than a blog post allows.

StefanJVA 1 day ago

Interesting read!

shablulman 2 days ago

[flagged]