
Million-token context windows changed what's possible, but most teams are still building for 4K limits.

GPT-5.4 can handle a million tokens. But most application architectures were designed for 4K-32K contexts, and the jump to 1M doesn't just expand capacity; it breaks fundamental assumptions about how you build.
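
Here's a minimal sketch of that 4K-era assumption frozen in code. The constants, the truncation helper, and the token counter are illustrative, not taken from any particular framework:

```python
# Hypothetical context-budget logic written for a 4K-era model.
# The numbers and names are illustrative, not from a real framework.

MAX_CONTEXT_TOKENS = 4096       # the ceiling baked in everywhere circa 2023
RESERVED_FOR_OUTPUT = 512       # headroom for the model's reply


def truncate_history(messages: list[str], count_tokens) -> list[str]:
    """Drop the oldest messages until the prompt fits the budget.

    With a 4K window this is survival; with a 1M window the same
    code silently throws away context the model could have used.
    """
    budget = MAX_CONTEXT_TOKENS - RESERVED_FOR_OUTPUT
    kept: list[str] = []
    total = 0
    # Walk newest-first so recent turns win under the budget.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))


# Example: a crude 4-chars-per-token estimate stands in for a tokenizer.
recent = truncate_history(["turn one", "turn two"], lambda s: len(s) // 4)
```

Logic like this is harmless at 1M tokens only if nothing depends on it, but whole pipelines (chunking, retrieval, summarization cascades) were built to feed it.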

Claude Code treats prompt cache misses like server outages. The engineering behind that decision saves millions in API costs.
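
A sketch of what "a cache miss is an outage" could look like in practice. This is hypothetical, not Claude Code's actual implementation: the threshold, window size, and `CacheMissIncident` class are invented, and only the `cache_read_input_tokens` field mirrors Anthropic's prompt-caching usage metadata.

```python
# Hypothetical monitoring in the spirit of "a cache miss is an outage."
# Thresholds and the incident class are illustrative.

import logging

logger = logging.getLogger("prompt_cache")

# Cached input tokens are billed at a fraction of the uncached rate,
# so a sustained miss is a cost incident, not a soft degradation.
MISS_RATE_ALERT_THRESHOLD = 0.10   # alert if >10% of requests miss
SAMPLE_WINDOW = 50                 # requests before the alarm can fire


class CacheMissIncident(RuntimeError):
    """Raised when the prompt cache miss rate crosses the alert line."""


def record_usage(usage: dict, window: list[bool]) -> None:
    """Track hits and misses from per-request usage metadata."""
    hit = usage.get("cache_read_input_tokens", 0) > 0
    window.append(hit)
    if len(window) > 4 * SAMPLE_WINDOW:   # keep the window bounded
        del window[0]
    if len(window) < SAMPLE_WINDOW:       # wait for a meaningful sample
        return
    miss_rate = 1 - (sum(window) / len(window))
    if miss_rate > MISS_RATE_ALERT_THRESHOLD:
        # Escalate exactly as an availability alert would be.
        logger.error("prompt cache miss rate %.0f%%", miss_rate * 100)
        raise CacheMissIncident(f"miss rate {miss_rate:.0%}")
```

The design choice worth copying is the severity class: every miss re-bills the full prompt at the uncached input rate, so a sustained miss rate gets escalated the way an availability regression would be.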

The frameworks and abstractions built twelve months ago are already getting in the way. The models got good enough that the middleware became the bottleneck.