how NOT to integrate LLMs into things

October 20, 2025

"Why don't you just tell me the name of the movie you've selected?"

For a while now, I've been imagining & trying in my own small way to build towards this future:

Kramer is the Movie-Schedule Phone Operator | Seinfeld


https://www.youtube.com/shorts/AXuK7dndrHU

(If you'd rather not watch: the clip is from the Seinfeld episode where Kramer starts getting misdialed calls meant for Moviefone and "pretends" to be the bot. He tells George to enter his movie name through the keypad, George does so, and then Kramer of course doesn't know what the beeps mean, so he finally says (still in a robotic voice), "Why don't you just tell me the movie you've selected?")

A year ago I saw a world of clunky interfaces, and LLMs, and I thought: oh how cool, this is the low-hanging fruit of LLMs. Just replace all the clunky interfaces with natural language, so you can tell the computer what you want in your own words rather than through a terrible phone tree or something.

I'm not saying we won't get this world; I think a lot of people are still building it. But it's trickier than I thought it was (or I'm a shit engineer (or both)).

A less silly example: reading a book on my MacBook, the screen kept dimming

I never do this, but today I was reading an ebook on my laptop and the screen kept dimming. It's 2025, almost 3 years since ChatGPT, so why can't I tell my MacBook in natural language: "Hey I'm reading a book, leave the screen on and then when I'm done reading you can go back to normal settings"?

For over a year I've been aiming to do things like that, and I don't have a solution but I have a couple bullet-point-type takeaways:

First point:

If you really know your MacBook, and your MacBook has been really well designed, you shouldn't have to use an LLM intermediary to accomplish a task. I wanted to tell my computer to leave the screen on because I didn't immediately know how to find the setting to do that. Already we're in a quasi-failure case: I can't do the thing I want to do immediately. That's not a judgement or a reason this can't work, but it's worth acknowledging. Things that are already easy, an LLM will only make awkward (you'd never say: "Open a browser and go to Bluesky please!" because it's faster to just do it).

Second point:

Say you build this and you do it just right. Your LLM integration is still super fragile to updates and tweaks. Your documentation for the LLM better be PERFECT. Or the LLM will try to do something it can't do. And you'll never anticipate what people ask for so how are you gonna prompt the LLM to respond correctly to everything?
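Here's a rough sketch of one way to take some of the edge off that, not a fix. The idea: keep a single registry of actions, generate the LLM's documentation from that registry so the two can't drift apart, and refuse loudly when the model asks for something that isn't in it. Everything here (`ACTIONS`, `keep_screen_on`, `describe_actions`, `dispatch`, the shape of the call dict) is made up for illustration; it isn't any particular vendor's tool-calling API, and the handler only returns a string instead of touching real settings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    name: str
    description: str
    handler: Callable[[dict], str]


# Hypothetical registry: the single source of truth for what the LLM may do.
ACTIONS = {
    "keep_screen_on": Action(
        name="keep_screen_on",
        description="Keep the display awake. Arguments: minutes (int).",
        # Placeholder handler: a real one would change a system setting.
        handler=lambda args: f"screen stays on for {args['minutes']} minutes",
    ),
}


def describe_actions() -> str:
    """Build the action list for the prompt from the registry itself."""
    return "\n".join(
        f"- {a.name}: {a.description}" for a in ACTIONS.values()
    )


def dispatch(call: dict) -> str:
    """Run a model-proposed call, or return an explicit error string."""
    name = call.get("name")
    action = ACTIONS.get(name)
    if action is None:
        known = ", ".join(sorted(ACTIONS))
        return f"error: no action named {name!r}; available actions: {known}"
    try:
        return action.handler(call.get("arguments", {}))
    except KeyError as missing:
        return f"error: action {name!r} is missing argument {missing}"
```

This doesn't solve the "you'll never anticipate what people ask for" part. It only means that when the model reaches for something that doesn't exist, the failure is a readable error you can hand back to the model or the user, rather than a silent nothing.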

You have the same problem as in this joke:

Third point:

In a lot of LLM integrations, the actions the LLM can take are a different set from the actions the user can take.

This actually is a lot better than when I tried the same prompt a while ago. But still. I think LLM integrations should follow a variant of WYSIWYG where, if the user can take an action on the page they're on (like make a directory), the LLM should also be able to take that same action and the same thing should happen.
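A minimal sketch of what I mean, under the assumption that the app has one action layer and both the button and the LLM go through it. `Workspace`, `on_new_folder_button`, and `on_llm_tool_call` are hypothetical names, and the "filesystem" is only an in-memory set so the example stays self-contained.

```python
class Workspace:
    """Toy in-memory stand-in for whatever the page is showing."""

    def __init__(self) -> None:
        self.directories: set[str] = set()

    def make_directory(self, name: str) -> str:
        # The one and only implementation of "make a directory".
        if not name or "/" in name:
            raise ValueError(f"invalid directory name: {name!r}")
        if name in self.directories:
            raise ValueError(f"directory already exists: {name!r}")
        self.directories.add(name)
        return name


def on_new_folder_button(workspace: Workspace, name: str) -> str:
    """What the UI does when the user clicks 'New folder'."""
    return workspace.make_directory(name)


def on_llm_tool_call(workspace: Workspace, call: dict) -> str:
    """What happens when the model asks for the same thing."""
    if call["name"] != "make_directory":
        raise ValueError(f"unsupported action: {call['name']!r}")
    # Same method, same validation, same side effect as the button.
    return workspace.make_directory(call["arguments"]["name"])
```

The point of routing both paths through `make_directory` is that the LLM can't end up with a slightly different (or missing) version of something the user can plainly do with a click.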

Like, the LLM under the hood called a 'whats-the-date' API. I wish that had been made more explicit in this case. The more steps that get hidden, the more surface area for things to go wrong and for the user to just be confused and frustrated.
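One way to make those steps explicit, sketched under my own assumptions: wrap the tools so every call gets recorded, then show that record to the user next to the answer. `VisibleTools`, `render_trace`, and the `whats_the_date` function are hypothetical stand-ins; I don't know what the real integration calls under the hood.

```python
from datetime import date
from typing import Any, Callable


def whats_the_date() -> str:
    """Hypothetical stand-in for the date lookup the LLM made."""
    return date.today().isoformat()


class VisibleTools:
    """Wraps tool functions and records each call so it can be shown."""

    def __init__(self, tools: dict[str, Callable[..., Any]]) -> None:
        self.tools = tools
        self.trace: list[tuple[str, dict, Any]] = []

    def call(self, name: str, **kwargs: Any) -> Any:
        result = self.tools[name](**kwargs)
        self.trace.append((name, kwargs, result))
        return result

    def render_trace(self) -> str:
        """Human-readable list of every step taken, in order."""
        lines = []
        for i, (name, kwargs, result) in enumerate(self.trace, 1):
            args = ", ".join(f"{k}={v!r}" for k, v in kwargs.items())
            lines.append(f"step {i}: called {name}({args}) -> {result!r}")
        return "\n".join(lines)


tools = VisibleTools({"whats_the_date": whats_the_date})
tools.call("whats_the_date")
print(tools.render_trace())
```

Whether to show this inline, behind a "show steps" toggle, or only on failure is a design call. The sketch only makes sure the information exists to be shown.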

Conclusion:

I don't have one.