Is OpenAI's 4o Snake Oil?

On March 25, 2025, OpenAI announced their flagship 4o image generation model with a shocking press release.

Their release immediately demonstrates the model's seemingly magical capabilities: the first image on the page is a highly realistic photo of a woman writing more than 100 words, in a specific format, on a whiteboard. Of course, it was generated by their new image model.

[Image: OpenAI 4o whiteboard demo, image 1]

Soon, the real magic appears. With the simple, plain-English incantation…

selfie view of the photographer, as she turns around to high five him
[Image: OpenAI 4o whiteboard demo, image 2]

...a second, stunningly accurate image follows the first. All the text on the whiteboard is there (accurate text rendering is something past image models have deeply struggled with). The selfie view and the high five are the least impressive part, almost unnoticeable to my desensitized eyes, though such a feat would have been considered impossible just a few years ago.

So... is this legit?

When I saw this post, I, along with numerous others, immediately recognized the immense opportunities that were about to be unlocked. Some claimed that behemoths the likes of Adobe Photoshop would be categorically eliminated.

Yeah, I'd postulate that multiple $100m+ businesses will launch the moment the API is released. Even in its current, unimproved state, 4o nearly renders Photoshop obsolete. This is a multi-billion-dollar industry that will be created overnight. I have incorporated, and I have code in place to hit the ground running the second the API gets turned on.

So I set out to build a business, assuming OpenAI had acted in good faith and not cherry-picked the demonstrations for their press release. At least I hoped any cherry-picking wasn't egregious.

...Kinda?

// BEGIN FLUFF (skip this if you want to get to the point; read it if you'd like a leisurely break from your hurried day)

I originally started programming because, like many others, I wanted to make video games. A bachelor's degree in Computer Science, a few internships in Silicon Valley, and some year-long stints at various companies (which shall not be named here) later, I ended up right back where I started: making video games.

I set out on my game-making journey intent on fully embracing change, because the times they are a-changing, as they always seem to do. First, I'd write the game in Rust, a language I wish had existed when I was in college (though I admit learning C is absolutely a pedagogical rite of passage in any up-to-snuff CS program). Second, I'd use LLMs to guide me on my way.

I quickly found my bearings with the Bevy game engine. I implemented Flappy Bird for practice. Sailing was smooth. It was time to make the game I originally intended to make.
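For a sense of what "finding my bearings" looked like, here is a minimal sketch of the Bevy pattern involved: a component, a startup system, and a couple of update systems. It assumes a 0.13-era Bevy API (method names shift between releases), and the asset path and physics constants are placeholders of my own, not anything from my actual game.

```rust
// Minimal Flappy-Bird-shaped Bevy app (assumes a 0.13-era API).
use bevy::prelude::*;

// Marker component for the player-controlled bird.
#[derive(Component)]
struct Bird {
    velocity: f32, // vertical velocity, world units per second
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, (flap, fall))
        .run();
}

// Spawn a camera and the bird sprite.
fn setup(mut commands: Commands, assets: Res<AssetServer>) {
    commands.spawn(Camera2dBundle::default());
    commands.spawn((
        SpriteBundle {
            texture: assets.load("bird.png"), // placeholder asset path
            ..default()
        },
        Bird { velocity: 0.0 },
    ));
}

// The space bar gives the bird an upward impulse.
fn flap(keys: Res<ButtonInput<KeyCode>>, mut birds: Query<&mut Bird>) {
    if keys.just_pressed(KeyCode::Space) {
        for mut bird in &mut birds {
            bird.velocity = 300.0;
        }
    }
}

// Gravity pulls the bird down every frame.
fn fall(time: Res<Time>, mut birds: Query<(&mut Bird, &mut Transform)>) {
    for (mut bird, mut transform) in &mut birds {
        bird.velocity -= 600.0 * time.delta_seconds();
        transform.translation.y += bird.velocity * time.delta_seconds();
    }
}
```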

Early in the development of the game I found the need to procure assets. I had already stumbled upon Meshy, a web app that lets you create 3D meshes from prompts. I think this is a great product and I gladly paid for it. However, I was making a 2D game, and 2D games need 2D assets. I assumed the 2D equivalent of Meshy existed and I could pay for it. How wrong I was. Alas, I could simply ask ChatGPT to generate 2D sprites for me, no? No. Well, yes and no. Yes, in the sense that ChatGPT does generate some pretty damn good-looking sprites. No, in the sense that hours of painful manual cropping and Photoshop work were about to beset me. So I took matters into my own hands. I began to build GameTorch.
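To give a flavor of the chore being automated away: slicing a uniform spritesheet grid into individual frames. This is a minimal sketch using the `image` crate; the file name and the 32x32 frame size are assumptions for illustration, not GameTorch internals.

```rust
// Slice a spritesheet laid out on a uniform grid into per-frame PNGs.
// Requires the `image` crate as a dependency.
use image::GenericImageView;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let sheet = image::open("sheet.png")?; // placeholder input path
    let (frame_w, frame_h) = (32u32, 32u32); // assumed uniform frame size
    let (sheet_w, sheet_h) = sheet.dimensions();

    // Walk the grid and save each cell as its own frame.
    for row in 0..(sheet_h / frame_h) {
        for col in 0..(sheet_w / frame_w) {
            let frame = sheet.crop_imm(col * frame_w, row * frame_h, frame_w, frame_h);
            frame.save(format!("frame_{row}_{col}.png"))?;
        }
    }
    Ok(())
}
```

The hard part, of course, is that model output rarely lands on a clean, uniform grid, which is exactly where the manual Photoshop hours used to go.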

// END FLUFF

At GameTorch, our core business is reducing the friction between your brain thinking of a 2D video game asset and you using that dreamt-up asset in your game.

Current image models are pretty damn good at coming up with the image you've thought of. According to OpenAI's 4o release article, we should be able to take it a step further: we should be able to produce full spritesheets for the various animations our game necessitates. Sadly, we aren't completely there yet. Before we get to the counterexamples, let's look at some success stories:

As an old person (ahem, a person under 30, yet still old enough to remember when something like this would have been incomprehensible), it is extremely satisfying that we are able to produce an animated spritesheet like this winking jellyfish for less than a dollar.

But today's models still fall short of their portrayed capabilities, in my humble opinion. Based on the 4o press release, I had expected to generate spritesheets of my characters running to the left, or jumping up and landing, in one or two tries. Alas, the time is not yet upon us. You can get close, and it's just as much a business-logic problem as it is the model's problem: there's still a lot we can do to coax such spritesheets out of today's models. Here's just one of the many fraught attempts at getting 4o to generate a running animation.
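For the curious, the shape of such an attempt looks roughly like the following. This is a minimal sketch using reqwest's blocking client (with the `blocking` and `json` features, plus serde_json) against OpenAI's documented /v1/images/generations endpoint; the model name, prompt, and response handling here are illustrative assumptions, not GameTorch's actual pipeline.

```rust
// One attempt at prompting an image model into producing a spritesheet.
use std::env;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = env::var("OPENAI_API_KEY")?;

    let body = serde_json::json!({
        "model": "gpt-image-1", // assumed model identifier
        "prompt": "pixel-art spritesheet: the same knight character \
                   running to the left, 6 frames in a single horizontal \
                   row, uniform frame size, transparent background",
        "n": 1,
        "size": "1024x1024"
    });

    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post("https://api.openai.com/v1/images/generations")
        .bearer_auth(api_key)
        .json(&body)
        .send()?
        .error_for_status()?
        .json()?;

    // The image API returns base64-encoded image data in the response body.
    let b64 = resp["data"][0]["b64_json"]
        .as_str()
        .ok_or("unexpected response shape")?;
    println!("received {} base64 chars of spritesheet", b64.len());
    Ok(())
}
```

The interesting work is everything around this one call: prompt scaffolding, checking that the returned frames actually share a uniform grid, and retrying when they don't. That is the business-logic half of the problem mentioned above.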

Extreme Optimism

It took me just barely more than two months (first commit on March 29, 2025) to build and launch GameTorch. In those short two months, I saw model quality improve by orders of magnitude while cost-for-quality dropped by orders of magnitude. I was also able to build and launch an entire, production-grade web app that I actually use to solve legitimate problems, as a one-man show, all thanks to LLMs.

Things will only go up from here, at least in terms of capabilities, quality, quality-to-cost ratio, and productivity conferred. Prove me wrong, if you can.

– Tom, Creator of GameTorch

P.S. This is an artisanal, hand-written article. I did not use AI to write it. I did, however, use AI to imbue my words with nice-looking HTML and CSS.

The line it is drawn
The curse it is cast
The slow one now
Will later be fast
As the present now
Will later be past
The order is rapidly fadin'
And the first one now
Will later be last
For the times they are a-changin'


– Bob Dylan