I pushed AI assistants to their limits, so you don’t have to. Here’s what really works.

12.05.2025

Staying on top of AI developments is a full-time job.

I would know, because it’s my full-time job. I subscribe to Anthropic’s Pro mode for access to their latest model, Claude 3.7, in “extended thinking” mode. I have a complementary subscription to OpenAI’s Enterprise mode so that I can test out their latest models, o3 and o4-mini-high (more later on OpenAI’s absurd naming scheme!), and make lots of images with OpenAI’s new image generation model, 4o, which is so good that I have cancelled my subscription to my previous image generation tool, Midjourney.

I subscribe to Elon Musk’s Grok 3, which has one of my favorite features of any AI, and I’ve tried using the Chinese AI agent platform Manus for shopping and scheduling. And while that exhausts my paid subscription budget, it doesn’t include all the AIs I work with in some form. In just the month I spent writing this piece, Google massively upgraded its best AI offering, Gemini 2.5, and Meta released Llama 4, the biggest open source AI model yet.

So what do you do if keeping up with AI developments is not your full-time job, but you still want to know which AI to use, and when, in ways that genuinely improve your life, without wasting time on the models that can’t?

That’s what we’re here for. This article is a detailed, Consumer Reports-style dive into which AI is the best for a wide range of cases and how to actually use them, all based on my experience with real-world tasks.

But first, the disclosures: Vox Media is one of several publishers that have signed partnership agreements with OpenAI, but our reporting remains editorially independent. Future Perfect is funded in part by the BEMC Foundation, whose major funder was also an early investor in Anthropic; they don’t have any editorial input into our content either. My wife works at Google, though not in any area related to their AI offerings; for this reason, I usually don’t cover Google, but in a piece like this, it’d be irresponsible to exclude it.

The good thing is that this piece doesn’t require you to trust me about my editorial independence; I show my work. I ran dozens of comparisons, many of which I invented myself, on every major AI out there. I encourage you to compare their answers and decide for yourself if I picked the right one to recommend.

On AI art ethics

AI art is made by training a computer on the contents of the internet, with little regard for copyright or the intent of the creators. For that reason, most artists can’t stand it. Given that, is it defensible to use AI art at all?

I think in a just world OpenAI would certainly compensate some artists — and in a just world, Congress would be moving to lay out the limits on artistic borrowing. At the same time, I am increasingly convinced that existing copyright law is a poor fit for this problem. Artists influence one another, comment on one another, and draw inspiration from one another, and people with access to AI tools will keep wanting to do that.

My personal philosophy is shaped by the fan cultures of my childhood: It’s okay to build on someone else’s work for your own enjoyment, but if you like it, you should pay them for it, and it’s absolutely not okay to sell it. That means no generative AI art in someone else’s style for commercial purposes, but it’s fine to play around with your family photos.

Best for images

OpenAI’s new 4o image creation mode is the best AI out there for generating images, by a large margin. It’s best in the free category, and it’s best in the paid category.

Before it was released, I was subscribed to Midjourney, an AI image generator platform. Midjourney is probably what you think of when you think of AI art: It produces mystical, haunting, visually beautiful stuff, and has some great tools for improving and editing your final results, like touching up someone’s hair while leaving everything else in place.

The big thing that 4o can do, which no model before could reliably pull off, is take a picture that didn’t come out well and turn it into a beautiful work of art, all while still preserving the character of the original.

For example, here’s a still from a video of my wife and me singing “Happy Birthday” to our baby on her first birthday:

It’s a beautiful moment, but not exactly a flattering picture. So I asked ChatGPT to render it in the style of Norman Rockwell, a mid-century illustrator whose work I love, and got this:

The AI moved the cake (which had been barely visible behind the paper towel roll in the original still) to be the focal point of the image, while keeping the way my wife and I are holding the baby together, as well as the cluttered table and the photograph-covered fridge in the background. The result is warm, flattering, and adorable.

It’s this capability that made 4o go viral recently in a way that no image generator before it had. Here’s Midjourney’s attempt, for example:

You’ll notice that it’s a seemingly, uh, completely different family, with no real inspiration from the original at all! You can eventually get a better result than this out of Midjourney, but only by spending weeks becoming a pro at prompting with the platform’s highly specific language and toolset.

By contrast, ChatGPT was able to give me a far superior output on the first try in response to a simple request without specialized language.

The difference between 4o and other image models is most notable with this kind of request, but it’s better for almost everything else I use images for, too. The product you get out of the box is pretty good, and it’s not hard to produce something much better. That, ideally, is what we should be getting out of our AI tools — something amazing that can be created with simple language by a nonexpert.

The one place 4o still falls........

© Vox