A note on o3 and AGI
Basically it wipes the floor with the humans, pretty much across the board.
Try asking it, following Nabeel, why Bolaño’s prose is so electrifying.
Or try my query about why early David Burliuk works cost more in the marketplace than late Burliuk works do.
Or how Trump’s trade policy will affect Knoxville, Tennessee. (Or try this link if the first one is not working for you.)
Even human experts have a tough time doing that well on those questions. Mostly they can’t, and I have even chatted with the guy at the center of the Burliuk market.
I don’t mind if you don’t want to call it AGI. And no, it doesn’t get everything right, and there are some ways to trick it, typically with quite simple (for humans) questions. But let’s not fool ourselves about what is going on here. On a vast array of topics and methods, it wipes the floor with the humans. It is time to just fess up and admit that.