Why do enterprises need an AI services partner instead of building in-house?

Building in-house requires staffing forward deployed engineers, evals specialists, and a change-management function before you have a working agent. An AI-native partner already has that team and ships a working pilot in weeks instead of quarters, then hands ownership back. The decision is about time-to-first-agent and risk, not capability.

What does a forward deployed engineer actually do?

A forward deployed engineer (FDE) embeds with the customer, learns the workflow in detail, and writes the integration code that turns a model into a working system. They handle data plumbing, tool design, evals, and the human handoff. OpenAI, Anthropic, and most leading AI companies have stood up dedicated FDE teams because pure SaaS doesn't deploy itself.

How is an AI-native services partner different from a Big Four consultancy?

Big Four firms staff generalist consultants and bill by the hour. AI-native partners staff senior engineers who have shipped agents in production, use flat retainers, and embed inside the workflow. The Big Four can write strategy decks; AI-native partners ship working systems and stay through production.

What are the most common reasons enterprise AI pilots fail?

Five things: data fragmented across systems of record, workflows that exist only in tribal knowledge, compliance and security reviews that block rollout, middle managers who resist agents that absorb their team's scope, and no internal owner for evals and escalation after the consultants leave. Model quality is rarely the problem.

What should I look for when evaluating an AI services partner?

Ask three questions. How long until I have a working pilot in my real environment? Who owns evals, observability, and escalation after launch? What does the handoff look like, and what do I own at the end? If a partner can't answer those clearly, they're selling decks, not systems.

Why You Need an AI Services Partner: AGI Requires Assembly, and Enterprises Cannot Self-Install

Drew Bredvick has a line worth borrowing: AGI requires some assembly. The models are already superhuman on raw intelligence by most reasonable benchmarks. They write better than the median knowledge worker, code faster than most engineers, and reason across domains that took entire PhD programs to master. None of that translates into business outcomes by itself. Bredvick's point is that intelligence has never been the limiting factor on output, and the deployment of AI through the economy will look more like the 40-year electrification of American industry than a four-year fast takeoff. The plug needs a socket. The socket needs wiring. The wiring needs an electrician.

The enterprise version of this story is the same. You can buy the smartest model on the market, and your sales team will still send the wrong email to the wrong prospect because the CRM is two systems behind reality. Your support agent will still escalate refund requests it should approve because compliance never signed off on the policy lookup. Your finance team will still close the books by hand because the invoice agent works in dev but no one owns the production runbook. The model is fine. The assembly is missing.

This is the part of the AI conversation that gets least airtime and matters most. The labs themselves know it, and the hiring patterns prove it.

The signal in the labor market

Joe Schmidt's a16z piece "Trading Margin for Moat" puts numbers on something most operators have been seeing for a year. OpenAI has more than 20 open roles on its forward deployed engineering team. Anthropic's careers page is hiring aggressively into the same function under different titles. Decagon hires Agent PMs. Sierra ships with embedded engineers. Harvey's go-to-market motion looks more like a McKinsey engagement than a SaaS trial.

The pattern is unmistakable. Foundation model labs and the category-defining AI application companies are trading early gross margin for implementation moat. They are doing the same thing Salesforce, ServiceNow, and Workday did during the cloud transition, when category leaders built services arms not because they wanted to be consultancies but because the customer would not deploy the product without help.

If the companies closest to the model are staffing up to do the assembly, the inference for everyone else is uncomfortable. The smartest possible vendor in your stack is telling you, through its hiring plan, that the product alone is not the deployment.

The five failure modes of unassisted enterprise AI

MIT and BCG ran a study on generative AI in the enterprise and found the same thing operators have been seeing in practice. Most pilots never reach production. The reasons cluster around five predictable failure modes, none of which are about model quality.

1. Data fragmentation. The agent needs to see the customer record, the order history, the support ticket, the contract, and the shipping status. Those live in five systems with three different identity schemes and one Salesforce instance no one trusts. The model can reason. It cannot reason about data it cannot reach.

2. Tribal workflows. The actual escalation policy lives in the head of a senior support manager who has been there nine years. The actual invoice approval logic depends on which controller is on vacation. The actual deal-desk rules are a Slack channel. Documenting the workflow is the work, and most companies have never done it.

3. Compliance and security review. The agent works in dev. Now it needs SOC 2 boundaries, PII handling, audit logging, and a clear story for what happens when it makes a mistake. The security team has questions. The legal team has more. Six months pass.

4. Middle-manager resistance. An agent that handles tier-one support is also an agent that removes the headcount budget of the tier-one support manager. Anyone who has rolled out automation in a real org knows that the technical work is the easy part. Klarna walked back parts of its AI-only support strategy not because the model failed but because the human consequences had not been designed for.

5. No owner after launch. The pilot ships. The consultants leave. Six weeks later the agent's accuracy drifts because the underlying CRM schema changed, and no one inside the company owns evals, prompts, observability, or the escalation path. The system degrades quietly, and the next executive review concludes that AI did not work for the company.

A good services partner exists to defuse all five of those, not just to wire up the model.
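Failure mode 5 is the one most directly guardable in code. As a minimal sketch, assuming the fields the agent depends on are written down as an expected schema, a scheduled check can flag a changed CRM schema before accuracy drifts. The field names and the `detect_schema_drift` helper are hypothetical, chosen for illustration, and are not tied to any particular CRM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    missing: list[str]  # fields the agent expects that no longer exist
    retyped: list[str]  # fields that still exist but changed type

    @property
    def ok(self) -> bool:
        return not self.missing and not self.retyped


def detect_schema_drift(expected: dict[str, str], live: dict[str, str]) -> DriftReport:
    """Compare the schema the agent was built against with the live schema."""
    missing = sorted(name for name in expected if name not in live)
    retyped = sorted(
        name
        for name, type_name in expected.items()
        if name in live and live[name] != type_name
    )
    return DriftReport(missing=missing, retyped=retyped)


# Hypothetical field names, for illustration only.
EXPECTED = {"account_id": "string", "plan_tier": "string", "renewal_date": "date"}
LIVE = {"account_id": "string", "plan": "string", "renewal_date": "string"}

report = detect_schema_drift(EXPECTED, LIVE)
if not report.ok:
    # In practice this would page the named internal owner.
    print(f"schema drift: missing={report.missing} retyped={report.retyped}")
```

The check itself is trivial. The point is that someone inside the company has to own the expected schema and the alert it raises.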

The three buyer options

If you accept that assembly is the real work, you have three ways to get it done.

Option A: Build a forward deployed team in-house. This is the right answer if AI is core to your product and you can recruit a team that has shipped agents in production. It is the wrong answer for most companies because the talent market is brutal, the ramp is twelve months, and the work for the first year is mostly plumbing that does not move the headline number. Anthropic's own research on building effective agents makes a quiet point in passing: the patterns that work in production are not the patterns that work in demos, and you only learn the difference by shipping.

Option B: Hire a Big Four or systems integrator. Familiar contracts, familiar pricing, familiar people in suits. The problem is that the operating model was built for a different kind of work. Hourly billing, generalist consultants, multi-quarter timelines, and a deck-heavy delivery format do not match the iteration loop AI work demands. IBM and Accenture have been rebuilding their services arms around AI delivery for a reason, but the legacy operating system runs underneath. Sometimes that works. Often it produces a strategy phase that ends right around when an AI-native team would have shipped pilot two.

Option C: Work with an AI-native services partner. Small senior team, flat retainer, pilot in weeks, ownership transferred at the end. The work is iterative, the cadence is short, and the operating model assumes that the model is the easy part and the integration is the job.

Dimension	In-house FDE team	Big Four / SI	AI-native partner
Time to first agent	6 to 12 months	3 to 6 months	2 to 4 weeks
Cost model	Salaried headcount	Hourly billing	Flat retainer
AI-native depth	Depends on hires	Generalist consultants	Senior engineers who have shipped
Change management	Built in long-term	Heavy slideware	Embedded, lightweight
Ownership at handoff	You own everything	Vendor retains tooling	You own code, evals, runbook

The right answer depends on scale, talent access, and risk tolerance. The wrong answer is to pretend the assembly does not exist.

What an AI-native partner actually does, week by week

The reason this model works is that the cadence matches how AI systems actually get built. Hamel Husain's evals FAQ makes the point that 60 to 80 percent of development time should go to error analysis, not to framework selection. Chip Huyen's pitfalls post makes the parallel point that teams burn months on the wrong abstractions. Both are arguing, in different vocabularies, that the work is iteration against real traffic, not architecture in the abstract.

A reasonable four-week cadence looks like this.

Week 1 - Embed. Two senior engineers sit with the team that owns the workflow. They watch the actual work, read the actual tickets, and write down what the model needs to see. They identify which systems are reachable, which require new integrations, and which compliance constraints will bind the design.

Week 2 - Build. The first version of the agent goes into a sandbox connected to real data. Evals are written against the actual escalation policy, not a synthetic test set. The team identifies the first failure modes and writes guardrails.

Week 3 - Harden. Security review runs in parallel with iteration. Observability and audit logging go in. A human-in-the-loop approval path is designed for the failure cases that matter. Documentation is written for the people who will own this after launch.

Week 4 - Live. The agent goes into limited production with a defined scope and a clear rollback. Evals run continuously. The internal owner is named and trained. The partner stays through the first incident.

This is what assembly looks like in practice. Not slides. Not architecture diagrams. A working system in your real environment, with the people who will run it next to the people who built it.
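To make "evals written against the actual escalation policy" concrete, here is a minimal sketch of an eval run that scores an agent against policy-labeled tickets and gates release on the pass rate. Everything in it is an assumption for illustration: `toy_agent` is a keyword stand-in for the real agent, the four cases are invented, and the 0.95 threshold is a placeholder to be set with the workflow owner.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    ticket: str    # ticket text, ideally drawn from real traffic
    expected: str  # action the written escalation policy requires


def run_evals(
    agent: Callable[[str], str], cases: list[EvalCase]
) -> tuple[float, list[EvalCase]]:
    """Return the pass rate and the failing cases for error analysis."""
    if not cases:
        raise ValueError("need at least one eval case")
    failures = [case for case in cases if agent(case.ticket) != case.expected]
    return 1 - len(failures) / len(cases), failures


def toy_agent(ticket: str) -> str:
    """Stand-in for the real agent: a keyword rule, for illustration only."""
    return "escalate" if "chargeback" in ticket.lower() else "approve"


# Invented cases; real ones come from the policy and the ticket history.
CASES = [
    EvalCase("Refund for a duplicate charge of $20", "approve"),
    EvalCase("Customer filed a chargeback with their bank", "escalate"),
    EvalCase("Refund request on an enterprise contract", "escalate"),
    EvalCase("Refund for a late delivery", "approve"),
]

RELEASE_THRESHOLD = 0.95  # assumed value, not a recommendation

pass_rate, failures = run_evals(toy_agent, CASES)
ship = pass_rate >= RELEASE_THRESHOLD
print(f"pass rate {pass_rate:.0%}, ship={ship}")
for case in failures:
    print(f"FAILED: {case.ticket!r} should be {case.expected!r}")
```

The list of failures matters more than the score. Reading those cases one by one is the error analysis, and it is where the next guardrail or policy clarification comes from.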

What this looks like in practice

A 30-attorney plaintiff firm in Northern California needed intake to stop dropping calls after hours. The model side was solved before the engagement started: voice quality, latency, and reasoning were already good enough. The assembly was the work: mapping the conflict-check process, designing the escalation tree for live attorneys, integrating with the case-management system, and writing the audit trail so the firm could defend any intake decision in discovery. Booked-consult rate moved from 41 percent to 67 percent. The model was not the difference. The wiring was.

A $40M IT services company had three full-time employees on accounts-payable matching. The work was rules-based, the documents were structured, and the model could read them. The reason it took three people was that no one had ever written down what the rules actually were. Six weeks of embedded work produced a documented policy, an agent that applied it, and an exception queue for the cases that genuinely needed judgment. AP cycle time moved from 11 days to 3.
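The shape of that system (a written policy, an agent that applies it, and an exception queue) can be sketched in a few lines. This is a hypothetical illustration, not the rules from the engagement: the 2 percent tolerance, the purchase-order match, and the `route` helper are all assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    po_number: str
    amount: Decimal


# Hypothetical policy value; the real one comes from the documented policy.
TOLERANCE = Decimal("0.02")  # allow up to 2% variance from the PO amount


def route(invoice: Invoice, po_amounts: dict[str, Decimal]) -> tuple[str, str]:
    """Return (queue, reason), where queue is 'auto_approve' or 'exception'."""
    po_amount = po_amounts.get(invoice.po_number)
    if po_amount is None or po_amount <= 0:
        return "exception", "no usable purchase order"
    variance = abs(invoice.amount - po_amount) / po_amount
    if variance > TOLERANCE:
        return "exception", "amount outside tolerance"
    return "auto_approve", "matches purchase order within tolerance"


# Invented data, for illustration only.
PO_AMOUNTS = {"PO-1001": Decimal("5000.00"), "PO-1002": Decimal("1200.00")}
INVOICES = [
    Invoice("INV-1", "PO-1001", Decimal("5050.00")),
    Invoice("INV-2", "PO-1002", Decimal("1500.00")),
    Invoice("INV-3", "PO-9999", Decimal("300.00")),
]

for invoice in INVOICES:
    queue, reason = route(invoice, PO_AMOUNTS)
    print(f"{invoice.invoice_id}: {queue} ({reason})")
```

Anything the written rules do not cover falls through to the exception queue with a reason attached, so human judgment is spent only on the cases that need it.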

A 50-person operations team produced a monthly board pack that took nine hours of analyst time. The model could do the analysis. The reason it took nine hours was that the data lived in four systems with inconsistent schemas, and the narrative depended on context only the CFO carried. The assembly was a pipeline that pulled the data, an agent that drafted the narrative, and a review loop that let the CFO correct it in place. Twelve minutes, not nine hours.

In every case the model existed before the engagement and the work was almost entirely integration, evals, and change management. That is the pattern.

How OpenNash Can Help

OpenNash is built for the assembly problem. The team is senior engineers with eight years of data and AI work in Silicon Valley, not generalist consultants. The engagement model is a 14-day pilot at no charge, a flat monthly retainer after that, and a four-week embed-build-harden-live cadence. AI education and re-skilling sessions are bundled in so the organization can run the system after launch, not just receive it.

The fit is best when there is a real workflow with measurable outcomes, an executive sponsor who will defend the project through the change-management work, and a willingness to own the system at handoff. The fit is worse when the goal is a strategy deck or an architecture review without a system at the end.

If the work in front of you looks like assembly, book a call and we'll map the specific pattern to your environment. If you are still in the question phase, the /how-it-works page walks through the cadence in more detail.

The runway, not the demo

Bredvick's framing is the right one to close on. AI is a continuation of the connected-computing mega-trend, and the build-out will be measured in decades. The partner you choose now does not just decide whether your first agent works. It decides how much of the next ten years of capability you actually capture.

The labs are hiring forward deployed engineers because they know this. The category-defining AI companies are trading margin for moat because they know this. The large consultancies and systems integrators are rebuilding their services arms because they know this. The signal is loud, and it points in one direction. Smart models are not a deployment plan. Assembly is the job.