Claude's [new] tool usage is pretty good. Unlike with GPT-4 where I had to really minimize the context and descriptions for each tool, Claude Opus does better when provided more details and context for each tool.
I'm now using it with 9 different tools for https://olly.bot and it hits the nail on the head about 8/10 times. Anthropic says it can handle 250+ tools with 90% accuracy [1], but anecdotally from my production usage in the last 24 hours that seems a little too optimistic.
Of course, it also comes with a few idiosyncracies like sometimes spitting out <thinking> or <answer> blocks, and has more constraints on the messages field, so don't expect a drop-in replacement for OpenAI.
https://youtu.be/6wkFb2_cUik?si=6gdpubgOiSUTlCAq
(Not sure why there isn't a blog post or news on this yet, just a yt video).
I use a GUI for the API called typingmind, which has already implemented these and now you can search the web using google with Claude3.
Here's a screenshot with the sites it searched: (and it was much faster than Chatgpt):
https://i.imgur.com/2HxcT6k.png
Here is Chatgpt4's response:
https://i.imgur.com/IMqPfGK.png
So, not good - reading a page from Nov 2023 (though I wasn't specific, Claude figured it out).
Here is Google 1.0 Ultra's reply:
https://i.imgur.com/bUxRf9S.png
(FYI in the main screenshot I used a query that came to mind that was recent; not trying to be political)