Frontier Models Are Prototypes

Frontier models are expensive prototypes that reveal what AI can do. Vertical SaaS companies can build durable advantage by converting those capabilities into domain-specific, economically sustainable workflows, even when any particular model has a short shelf life.
The true cost of state-of-the-art artificial intelligence is becoming apparent. Frontier model tokens, and the products built on top of them, have been heavily subsidized by venture investment. This is a typical strategy for driving early adoption of new technology and for executing land grabs ahead of competitors. We remember when rideshares were awkwardly cheap and Uber and Lyft used unprofitable incentives to keep drivers on the road. An AI price reckoning is unfolding, and I’m certainly not the only one talking about it.
At Paperless Parts, like everywhere else, we’ve been using AI tools for a few years now. Some tools have gone away, some have been introduced, but until very recently, we had not needed any serious budget discussions. The chat apps, like ChatGPT and Gemini, have been effectively unlimited for a human user, with a few constraints around the most advanced models. This anchored AI as a low-cost, go-to tool for even trivial questions. Until last month, Cursor felt similar for many developers. You were billed per “request” for agentic coding help. The specific foundation model you selected and the complexity of the task mattered very little. Besides being cheap, this had the nice advantage of being very predictable: you knew the “cost” of the task before pressing enter. But now, with their May 2026 pricing model update, you pay for every token consumed at your selected model’s market price. This takes users from a monthly request allotment to a token budget measured in dollars. The bottom line is that it’s more expensive now. Just as important, it can be difficult to predict how hard a model will work on a given task and what the final charge will be.
Tools like Claude Code and Cowork are gaining a lot of traction with engineers and other knowledge workers. These are primarily priced per token. Under pressure not to fall behind, companies are making these tools available to employees, assuming they will eventually contribute to business outcomes. We’ve rolled out plenty of software tools to our team over the years, but for the first time, we had to grapple with two new budgeting questions: how should we set per-user usage limits and how should we handle employee requests for more tokens? Every company will respond differently, but unless your goal is “tokenmaxxing,” you will inevitably handcuff your team to some extent.
Coincidentally, tokenmaxxing is a term I learned from “How to get your company AI pilled,” a post by Geoff Charles, CPO at Ramp, bragging about how their internal use of AI grew more than 6300% year-over-year. The post made the rounds in my network, leaving many of us nervous that we weren’t spending enough. After all, Ramp is B2B SaaS. If their team is using that much AI, then shouldn’t ours be? Then I realized Ramp had just released AI spend management software, and they’re very good at marketing. In any case, I’ve yet to talk to anyone running a real business who’s truly tokenmaxxing at a non-trivial scale.
Not only is the world adapting to real token costs, we’re also seeing the beginning of public pushback against uncontained AI growth. Resistance to data centers in your backyard is the “most bipartisan issue since beer.” Public opposition will only get stronger if the anticipated AI-driven job losses materialize. Industry leaders are seriously discussing data centers in space, personal security, and drone defense.
In my work applying AI technology to manufacturing, I’m constantly reminded that exponentials don’t go on forever; they reach equilibrium. Even Moore’s Law, technology’s most prolific exponential of my lifetime, eventually ran out of steam, at least by its original definition. Singularities are abstract ideas, not physical realities. The AI exponential assumes unlimited training data, power, data center space, semiconductor availability, neural network scalability, funding, and societal acceptance. In time, constraints emerge and progress slows. Even if we accept the premise of a looming event horizon driven by self-improving AI, software vendors and strategics must still build businesses for a human economy governed by earthbound physics. Still, AI is clearly going to keep getting better for multiple generations of technology, and even if technical progress stopped today, it would take years for the current state-of-the-art to achieve its due impact in the real world.
So what does this mean for my two worlds of vertical SaaS and custom part manufacturing? Frontier models add the most value as a “zero-shot” solution to a problem. That is, they can dive into a previously unseen problem without explicit training or instruction. The faster, cheaper versions of these models have the same benefits, operating as distilled versions of the larger models, optimized for different performance characteristics. Almost by definition, however, SaaS applications assist with repetitive tasks. In a sense, software engineering is “human learning” or “natural intelligence” applied to a problem and then encoded to handle the same situation again and again at a marginal cost well below that of a human worker.
In the manufacturing industry, we work with a particular type of challenging document: the engineering drawing, also called a blueprint or simply a print. These have proliferated in the billions over the span of decades. While the example prints you see on social media or in demos are straightforward, real industrial prints are anything but. Human experts, often pulled away from other more valuable tasks, have been the only reliable way to read these. Wouldn’t it be great if AI could read them, perhaps automatically generating an accurate 3D model, or at minimum, faithfully enumerating the requirements and characteristics described? Each new AI model makes progress, and today, an agent can “read” many typical drawings.
This task is essentially domain-specific, next-generation OCR. It’s not a question of whether current and future frontier models can solve this problem, or any other given challenge in software; it’s a question of economics. Our analysis tells us that the average information content of the text on a drawing page is about 50 kilobytes in JSON format. Even with some format optimization, if we look at the cost of output tokens alone to produce this content, we can expect the zero-shot frontier-model reading solution to cost roughly 25 cents per page. But in reality, it’s worse than that. Today’s models need agentic supervision to ensure they process everything on the page and to correct common classes of errors. Internal benchmarking shows that this agentic solution can raise costs to a dollar or more per page.
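The per-page arithmetic above can be sketched as a back-of-envelope calculation. Only the 50-kilobyte page size and the resulting cost ranges come from the text; the bytes-per-token ratio, the output-token price, and the agentic overhead multiplier below are illustrative assumptions chosen to land in the same range, not measured values or any vendor's published pricing.

```python
# Back-of-envelope output-token cost for reading one drawing page.
# Everything marked "assumed" is a hypothetical input, not a figure
# from the article or from any vendor's price list.

PAGE_JSON_BYTES = 50_000                  # ~50 KB of JSON per page (from the text)
BYTES_PER_TOKEN = 4.0                     # assumed average for JSON-like text
PRICE_PER_MILLION_OUTPUT_TOKENS = 20.00   # assumed USD price per 1M output tokens
AGENTIC_OVERHEAD = 4.0                    # assumed multiplier for supervision and retries


def output_tokens(page_bytes: int, bytes_per_token: float) -> float:
    """Approximate number of output tokens needed to emit the page content."""
    return page_bytes / bytes_per_token


def output_cost_usd(page_bytes: int, bytes_per_token: float,
                    price_per_million: float) -> float:
    """Cost in USD of the output tokens alone for one page."""
    tokens = output_tokens(page_bytes, bytes_per_token)
    return tokens * price_per_million / 1_000_000


zero_shot_cost = output_cost_usd(
    PAGE_JSON_BYTES, BYTES_PER_TOKEN, PRICE_PER_MILLION_OUTPUT_TOKENS
)
agentic_cost = zero_shot_cost * AGENTIC_OVERHEAD

print(f"zero-shot: ${zero_shot_cost:.2f} per page")
print(f"agentic:   ${agentic_cost:.2f} per page")
```

Under these assumptions the page needs about 12,500 output tokens, which works out to roughly 25 cents zero-shot and about a dollar once the assumed overhead is applied. Changing any of the assumed inputs shifts the result proportionally, which is why the estimate should be read as an order of magnitude rather than a quote.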
This opens the door for smaller models. SaaS companies can bring their domain knowledge, and to some extent their datasets, and train a model from scratch or fine-tune one of the almost three million open-source models on Hugging Face. Just a few years ago, this was out of reach for a typical software engineering team, but of course the AI tools have helped with that, too. Model training is no longer the exclusive realm of AI research labs.
Fine-tuning is especially useful for discriminative AI applications (as opposed to generative AI), where a given input has a labelable, correct output. But consider how generative models are trained to produce novel text, images, and video. Due to the enormous volume of data they’re trained on, as well as the fact that there’s no verifiably right or wrong output for a given input, generative models rely on self-supervised training, such as predicting the next or missing tokens from large corpora. In other words, generative models learn to mimic their training data. Clearly, this has proven powerful, but it’s also what makes them controversial. They risk training-data disclosure, raising concerns about privacy and copyright. SaaS operators must tread extremely carefully when it comes to training generative models using customer data. Discriminative models, on the other hand, use more traditional training techniques, and today they aren’t limited to simple classification and regression problems. Discriminative applications can use transformers and other AI-era technology under the hood, but there is considerably less risk in using proprietary training data. The resulting models are orders of magnitude smaller than frontier models, which also changes the underlying unit economics. However, these small, efficient models can’t hold their lead forever. Newer general-purpose models get better and cheaper at niche tasks with time. SaaS companies must continue to invest in improving models as part of their operating system.
We’ve talked publicly about our “Wingman” print-reading model, which is about fifty times cheaper per page of inference than a frontier model. I don’t expect software buyers to care about their vendors’ margins, but it is in their best interest for software companies to have sustainable business models. When not backed by a healthy business, SaaS products fail to keep up with the larger technology ecosystem and eventually disappear, taking your data and workflows with them. I’ve talked to multiple founders who have bolted agents onto their apps and have lost almost all margin. This can work with an extremely large, horizontal user base or market, but vendors serving a vertical simply can’t stay in business by distributing tokens at or near cost. Eventually, in order to survive, those companies will need to significantly increase prices, at which point users will re-evaluate their subscription. State-of-the-art models and agents will clearly be integrated into the SaaS experience, but they will not be part of the business model that made software companies so valuable. One way we might see this play out is through Bring-Your-Own-Key (or “BYOK,” as explained in this GitHub Copilot documentation), in which applications integrate AI, but the user provides their own model via API key. In some cases, this helps both sides: customers, particularly in the enterprise market where IT departments want visibility, control, and governance over AI subprocessors, and software vendors, who don’t need to absorb expensive, low-margin tokens into their prices. However, if the BYOK features are core to the application’s value, this takes a software company out of the “token path,” limiting growth and enterprise value.
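To make the margin argument concrete, here is a rough sketch of gross margin per page under two inference costs. The one-dollar agentic frontier cost and the fifty-times ratio come from the text; the price charged per page is a purely hypothetical number, and applying the fifty-times ratio to the one-dollar baseline is also an assumption, since the text does not say which frontier figure the comparison uses.

```python
# Gross margin per page for a vendor reselling inference.
# PRICE_PER_PAGE is hypothetical; it is not a real product price.

FRONTIER_COST_PER_PAGE = 1.00                           # "a dollar or more" agentic read
SMALL_MODEL_COST_PER_PAGE = FRONTIER_COST_PER_PAGE / 50  # assumed baseline for "fifty times cheaper"
PRICE_PER_PAGE = 1.25                                   # hypothetical price charged to the customer


def gross_margin(price: float, inference_cost: float) -> float:
    """Fraction of revenue left after paying for inference."""
    return (price - inference_cost) / price


frontier_margin = gross_margin(PRICE_PER_PAGE, FRONTIER_COST_PER_PAGE)
small_model_margin = gross_margin(PRICE_PER_PAGE, SMALL_MODEL_COST_PER_PAGE)

print(f"frontier model margin: {frontier_margin:.1%}")
print(f"small model margin:    {small_model_margin:.1%}")
```

With these illustrative inputs, passing frontier tokens through leaves about 20% of revenue, while the smaller model leaves about 98%. The exact figures depend entirely on the assumed price, but the gap between the two is the point.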
Tidemark’s Vertical SaaS Knowledge Project has been preaching about the impact of AI on vertical SaaS for years. SaaS companies that own one or more control points and establish data gravity are well positioned to weather AI disruptors. We can extend that thinking beyond just defending against vibe-coded competition. Companies that have valuable data and are solving a problem with a high automation ceiling that is well suited to discriminative AI can develop a structural advantage over both startups and hyperscalers. Startups whose revenue is dominated by token costs will struggle to achieve sustainable margin in verticals, and hyperscalers likely won’t create the important workflows and purpose-built models that power them. For scaled vertical SaaS providers, the path directly in front of us is clear: use frontier models to discover what is possible, then use domain data and workflow integration to make it economical and sticky. Any particular model may have a short shelf life, but in manufacturing, and I suspect in most industries, we have hard problems, rich data, and long technical runways.