
The Airbender Manifesto

Safely deploy your GenAI application layer in a performant and scalable manner.

April 21, 2025

TL;DR: “innovation in the application layer”

Airbender is based on the following 12 principles:

  1. There are no longer any frontier models, but rather 10+ tier 1 foundation models (OpenAI, Anthropic, Gemini, Cohere, Mistral, LLaMA, Grok, DeepSeek, Tülu 3 (AI2), Ernie (Baidu)) and many more tier 2 foundation models (Qwen, Yi, Phi (Microsoft), Nemotron (Nvidia), Nous (Databricks), GLM-4, InternLM2, OpenChat, Gemma, Mixtral, etc.).
  2. The “AI-wrapper application” IS THE MOAT, not the risk. It’s the models that don’t have a moat. Companies that were paralyzed by the vague promises of the model vendors and warned not to risk building AI-wrapper applications are now behind. Companies betting on AI tacked onto SaaS are now behind. Companies that outsourced all IT to SaaS applications and forgot how to build are now behind.
  3. Tool use is largely application progress, not model progress. **Tool use is amazing**, but it is largely success in the application layer behind the MCP protocol, even as models are trained on tool use and get better at it.
  4. Reinforcement learning and small open models perform better in real operations than large models, and will for a long time (years) due to the tacit knowledge gap (written language transmits approximately 10% of our knowledge). This means smaller context with specialized models will likely have better predictive outcomes (which are different from benchmark outcomes).
  5. An agent is simply a unit of GenAI compute. In today’s GenAI world: LLM + instructions/prompts + memory (short- and long-term) + tool use (including RAG). “Autonomous” vs. “non-autonomous” should be an adjective applied to an agent, not an inherent property of the word.
  6. Scaling reasoning is increasing hallucinations, not decreasing them. This makes sense, since chains of thought without a world model would compound hallucinations. See point 4.
  7. Measured AI progress has come in 1) fixed answers (math) and 2) general accuracy (accepting many equivalent answers, e.g. an essay or blog post on a topic). Unfortunately, real-world AI operations depend on a third: predictive accuracy in qualitative domains, the most important measure for agentic operations (since it reduces the cost and time of decision making). This explains the lack of real-world applications: we have to build around the inherent limitations of LLMs, not wait for the speculative promises of large platform vendors.
  8. As a result, **autonomous agent strategies will mostly fail** due to the wrong focus (autonomy) and the wrong effort (low code, no code). The silver lining? Agentic smart experiences are much cheaper to build than traditional software, once you understand the need and the approach.
  9. Autonomous low-code agents are mostly wishful thinking from vendors that can sell platforms but don’t have services organizations with scaled innovation, experience, and workflow practices. The key insight is that they have no choice but to take that approach, since they are attempting to win markets via capital expenditure rather than innovation. Early markets demand building, and we have never seen great innovation from large companies. Why would AI be different?
  10. Assuming a 30% inherent noise factor in the human systems (related to the tacit knowledge gap) we are integrating AI into, we can expect a base 10-30% error rate inherent in our agentic systems. Put another way, if physics can’t solve the three-body problem, how do we solve the organizational n-body problem of siloed workflows and incentives?
  11. Agentic operations built around the application layer offer a 2-3x near-term productivity lift for key workflows that companies invest in reengineering and iterating on. 10x with tuning and transformation.
  12. The cost and capacity of deploying AI at scale will drive the ecosystem. The overfunded model startups will be the mainframes of yesteryear as agile and locally optimized AI innovation is deployed years ahead of speculative data-center build-outs.
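Point 5's definition of an agent as a unit of GenAI compute (LLM + instructions + memory + tool use) can be sketched in a few lines. This is a minimal illustration, not any vendor's API: the `Agent` class, the `echo_llm` stub, and the naive keyword-based tool dispatch are all hypothetical placeholders standing in for a real model call and a real tool-routing protocol such as MCP.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A unit of GenAI compute: LLM + instructions + memory + tools."""
    llm: Callable[[str], str]        # the model call (stubbed below)
    instructions: str                # system prompt
    memory: list[str] = field(default_factory=list)  # short-term memory
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, user_input: str) -> str:
        # Assemble context from instructions, prior memory, and the new input.
        context = "\n".join([self.instructions, *self.memory])
        # Naive tool dispatch: invoke a tool if its name appears in the input.
        # A real agent would let the model decide, e.g. via MCP.
        for name, tool in self.tools.items():
            if name in user_input:
                user_input = f"{user_input}\n[tool:{name}] {tool(user_input)}"
        reply = self.llm(f"{context}\n{user_input}")
        # Persist the turn so later calls see it (short-term memory).
        self.memory.append(user_input)
        self.memory.append(reply)
        return reply

# Stub LLM so the sketch runs without any vendor API key.
echo_llm = lambda prompt: f"echo: {prompt.splitlines()[-1]}"

agent = Agent(
    llm=echo_llm,
    instructions="You are helpful.",
    tools={"search": lambda q: "3 results"},
)
print(agent.run("search for airbender"))
```

Whether such an agent is "autonomous" is then just a question of who calls `run` in a loop and who approves the tool calls, which is the point: autonomy is a property of the surrounding application layer, not of the agent itself.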

So what is the 2025 plan? Build and deploy the agentic application layer, and scale on that foundation.

Unfortunately, this requires fixing all of your front-end workflow challenges of the last 10 years. Your most important AI vendor? My pick is Vercel and the AI-enabled frontend cloud. 

A digital experience platform (DXP) built on leading-edge frontend cloud services and managed with airbender.io is the key to rapidly building your AI application layer and achieving AI iteration velocity in production: the only metric you should be measuring in 2025.