What is a permission-aware AI agent?

It is an AI agent that enforces your existing access controls at the moment it retrieves data or calls a tool. Instead of querying everything with one privileged account, it acts within the permissions of the specific user making the request, so it can only return what that person is already allowed to see.

Why isn't role-based access control enough on its own?

RBAC defines who should see what, but an agent only respects those roles if the user's identity is propagated through every retrieval and tool call. Many deployments map every user to a single service account, which means RBAC exists in the source systems but is bypassed entirely once the agent runs.

How do you enforce user permissions in retrieval and RAG?

You attach access-control metadata to documents at index time, then filter every vector and keyword search by the current user's identity and group memberships. Pre-filtering on permissions is far safer than letting the model retrieve broadly and trimming results afterward.

Should an AI agent use a service account or the user's identity?

Use the user's identity wherever the agent reads sensitive data. Service accounts are appropriate for narrow, non-sensitive system tasks, but using one broad service account for user-facing queries collapses your whole permission model into a single over-privileged identity.

What is the difference between user-scoped and agent-scoped permissions?

User-scoped permissions control which data a given person can reach through the agent. Agent-scoped permissions control what the agent itself is allowed to do at all, such as which tools and systems it can touch. Both layers are needed; one without the other still leaves a hole.

Permission-Aware AI Agents: Controlling Who Sees What Data

A contractor at a mid-market firm asks the internal AI assistant a simple question: "What did we pay our top five vendors last quarter?" The assistant answers instantly, with exact figures, contract terms, and the names of the people who signed off on each payment. The contractor was never cleared to see any of it. The model did nothing wrong. It answered the question it was asked, using the access it was handed. The real failure is upstream: someone gave the agent a database connection that could read everything, and nobody taught it to ask the one question that actually matters for security - who is asking?

That gap is what permission-aware AI agents close. And it is the difference between an assistant you can put in front of your whole company and a quiet, well-spoken data exfiltration tool.

The Agent Inherits Build-Time Access, Not the Asker's

When you stand up an AI agent, you give it credentials: an API key, a database connection, a service account, an OAuth token. Those credentials decide what the agent can touch. In nearly every quick proof of concept, the team grabs the broadest credential on hand because it makes the demo work on the first try. The agent gets read access to the entire warehouse, the full document store, the whole CRM. Then someone drops a chat box in front of it and ships it.

Here is the part most teams skip past. The agent now operates at the permission level of that one service account for every user, no matter who is typing. Your organization spent years building a permission model - finance sees finance, HR sees HR, contractors see almost nothing - and the agent flattens all of it into a single identity with maximum reach. You did not remove your access controls. You just built a polite assistant that ignores them.

This is not a fringe worry. In Darktrace's 2026 State of AI Cybersecurity survey, 92% of security professionals said they were concerned about the impact of AI agents. The reason is mundane and structural, not exotic. Agents are being deployed faster than the controls around them, a pattern Gravitee documents in detail in its State of AI Agent Security 2026 report under the apt heading "when adoption outpaces control."

A useful mental model: before an agent reads anything or calls any tool, two questions have to be answered. Who is asking? And is this agent allowed to do this at all? A broad database connection answers neither. The rest of this post is about wiring up both answers.

User-Scoped Retrieval: The Agent Sees What the User Sees

The cleanest principle for data access is also the easiest to state: an agent should be able to surface exactly what the asking user could surface if they logged into the source system themselves. Nothing more. Glean built its entire enterprise search posture around this idea, which it calls permissions-aware AI - the agent respects the same access-control lists that govern the underlying SharePoint folder, Salesforce record, or Snowflake table.

In practice this is harder than it sounds, and the reason is retrieval. Vector databases do not understand your org chart. An embedding does not know that a document belongs to the M&A team and should be invisible to a summer intern. So you have two real options, and only one of them is safe:

Pre-filter at query time (do this). Attach access-control metadata to every chunk when you index it - owner, group, classification level. At query time, pass the current user's identity and group memberships into the search and filter before the model ever sees a candidate. The agent retrieves only from the slice the user is entitled to.
Post-filter after retrieval (avoid this). Let the model retrieve broadly, then strip out results the user should not see. This is fragile. A summary, an embedding cache, or a single leaked snippet in the model's context window can expose data you meant to hide. Once restricted content enters the prompt, you have already lost.
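
To make the pre-filter option concrete, here is a minimal sketch in Python. It assumes a toy in-memory index and a plain dot-product score; the Chunk, User, and search names are hypothetical and not tied to any particular vector database. What matters is the order of operations: the permission filter runs before ranking, so restricted chunks never become candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    embedding: tuple[float, ...]
    allowed_groups: frozenset[str]  # ACL metadata attached at index time


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset[str]


def dot(a, b):
    """Plain dot product, standing in for a real similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search(index, query_embedding, user, top_k=3):
    # Pre-filter: drop every chunk the user is not entitled to BEFORE ranking,
    # so restricted text never becomes a candidate for the prompt.
    visible = [c for c in index if c.allowed_groups & user.groups]
    ranked = sorted(visible, key=lambda c: dot(c.embedding, query_embedding), reverse=True)
    return ranked[:top_k]


# Hypothetical index contents, for illustration only.
INDEX = [
    Chunk("Q3 vendor payments summary", (0.9, 0.1), frozenset({"finance"})),
    Chunk("Office wifi setup guide", (0.2, 0.8), frozenset({"all-staff", "contractors"})),
    Chunk("M&A target shortlist", (0.8, 0.3), frozenset({"m-and-a"})),
]

contractor = User("c-101", frozenset({"contractors"}))
analyst = User("f-007", frozenset({"finance", "all-staff"}))

query = (1.0, 0.0)  # pretend embedding of "what did we pay our vendors?"
contractor_hits = [c.text for c in search(INDEX, query, contractor)]
analyst_hits = [c.text for c in search(INDEX, query, analyst)]
```

With this data the contractor's search can only ever return the wifi guide, however closely the query matches the finance document. Most production vector stores offer some form of metadata filter that plays the role of the list comprehension here.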

The same logic extends to structured data. If the agent queries a database on the user's behalf, that query should run with the user's effective permissions, not an admin role. Row-level security in your warehouse becomes the enforcement layer, and the agent simply cannot select rows the user could not select directly.
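
A rough sketch of the same idea for structured data. SQLite has no native row-level security, so the function below only emulates a policy at the application layer; in a real warehouse the policy would live in the database and apply to the user's own session. The table, column names, and the payments_visible_to helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (vendor TEXT, amount INTEGER, department TEXT);
    INSERT INTO payments VALUES
        ('Acme Freight', 120000, 'finance'),
        ('Northwind Catering', 4000, 'facilities'),
        ('Globex Legal', 85000, 'finance');
""")


def payments_visible_to(conn, user_departments):
    # Stand-in for a row-level security policy: the predicate is derived from
    # the caller's entitlements and bound as parameters, never from model output.
    if not user_departments:
        return []
    placeholders = ", ".join("?" for _ in user_departments)
    sql = (
        "SELECT vendor, amount FROM payments "
        f"WHERE department IN ({placeholders}) ORDER BY vendor"
    )
    return conn.execute(sql, sorted(user_departments)).fetchall()


finance_rows = payments_visible_to(conn, {"finance"})
facilities_rows = payments_visible_to(conn, {"facilities"})
contractor_rows = payments_visible_to(conn, set())
```

The contractor with no department entitlements gets an empty result rather than an error message that hints at what exists.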

There is a security payoff here that connects to the most-cited risk in the field. Simon Willison's lethal trifecta - private data, exposure to untrusted content, and a way to exfiltrate - describes how prompt injection turns helpful agents into leak vectors. User-scoped retrieval shrinks the first leg directly. If an injected instruction tricks the agent, it can still only reach data the current user was already allowed to reach. You have not eliminated prompt injection, but you have capped the blast radius to one person's existing access instead of the entire company's.

Least Privilege Applies to Agents and Tools, Not Just People

User-scoping answers "who is asking." It does not answer "what is this agent allowed to do at all." That second question is about agent-scoped permissions, and it is where the lazy path causes the most damage.

The lazy path is a single tool that wraps a raw, read-write database connection and lets the model write its own SQL. It works in a demo and terrifies anyone who has run a production system. The disciplined path is a set of narrow, purpose-built tools, each with the minimum scope it needs:

A lookup_invoice tool that takes an invoice ID and returns specific fields, not a run_query tool that accepts arbitrary SQL.
A read-only credential for tools that only read, and a separate, tightly scoped credential for the rare tool that writes.
Per-tool allowlists so the agent physically cannot reach systems outside its job, even if the model decides it wants to.
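
The list above can be sketched in a few lines of Python. Everything here is illustrative: the INVOICES store, the ToolRegistry class, and the field list are assumptions, not the API of any agent framework. The sketch shows two things, a lookup_invoice tool that returns only named fields and an allowlist check that runs before any tool executes.

```python
# Hypothetical invoice store standing in for a real system of record.
INVOICES = {
    "INV-1001": {
        "vendor": "Acme Freight",
        "amount": 120000,
        "status": "paid",
        "bank_account": "DE00 0000 0000",  # present in the source, never returned
    },
}


def lookup_invoice(invoice_id):
    """Return a fixed set of fields for one invoice; no arbitrary SQL."""
    record = INVOICES.get(invoice_id)
    if record is None:
        raise KeyError(f"unknown invoice: {invoice_id}")
    return {field: record[field] for field in ("vendor", "amount", "status")}


def run_query(sql):
    """The god-mode tool this agent should never be able to reach."""
    raise RuntimeError("run_query must not be callable by this agent")


class ToolRegistry:
    def __init__(self, allowed_tools):
        self._allowed = frozenset(allowed_tools)
        self._tools = {}

    def register(self, name, func):
        self._tools[name] = func

    def call(self, name, **kwargs):
        # The allowlist is checked in code, outside the model's control.
        if name not in self._allowed or name not in self._tools:
            raise PermissionError(f"tool not permitted for this agent: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry(allowed_tools={"lookup_invoice"})
registry.register("lookup_invoice", lookup_invoice)
registry.register("run_query", run_query)  # registered, but not allowlisted

invoice = registry.call("lookup_invoice", invoice_id="INV-1001")
```

Calling registry.call("run_query", sql="...") raises PermissionError before the function body runs, regardless of what the model asked for.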

This is the agent-equivalent of least privilege, and it maps directly to the failure mode OWASP labels Excessive Agency in its Top 10 for LLM Applications - agents granted more functionality, permissions, or autonomy than the task requires. KuppingerCole goes further and argues that data access control is becoming the new security perimeter for agentic AI, precisely because the old network-edge model assumes a human is on the other end of every request. With agents, the edge is now the tool boundary, and that boundary is only as good as the scope you gave it.

For private-equity-backed and mid-market operators, this has a portfolio dimension. A PE firm rolling a shared AI capability across portfolio companies cannot use one credential set spanning every business; the 2026 buyer guidance on AI agents for private equity is blunt that data segregation across holdings is a baseline requirement, not a feature upgrade. One leaky agent at one portfolio company should never become a path into another.

The takeaway: scope the agent's tools as if you assume the model will eventually try to do something it should not. Because under prompt injection, it will.

RBAC and SSO: Where the Identity Actually Comes From

All of this depends on the agent knowing who the user is, with proof. That is an identity problem, and you almost certainly already own the answer: single sign-on plus role-based access control.

SSO establishes the verified identity. The user authenticates against your identity provider (Okta, Microsoft Entra, Google Workspace), and the agent receives a token that says who they are and which groups they belong to. RBAC maps that identity to roles, and roles to entitlements. None of this is new - what is new is the requirement to carry that identity through the agent stack rather than dropping it at the front door.

The pattern that makes this work is on-behalf-of token exchange. Microsoft documents this directly in its guidance on the on-behalf-of (OBO) flow: a middle-tier service receives the user's access token and exchanges it for a new token scoped to the next downstream API, and that new token still carries the original user's identity. Applied to agents, the chain looks like this:

1. The user authenticates through SSO and hits the agent with their token.
2. The orchestration layer propagates that identity into every retrieval call and tool invocation.
3. Each data source enforces its own access controls against the user's effective permissions.
4. Nothing in the chain silently swaps the user for a privileged service account.
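
Here is a minimal sketch of that chain in Python, with the token handling deliberately left out. It assumes the SSO token has already been validated upstream and reduced to a set of claims; UserContext, DocumentSource, and Orchestrator are hypothetical names. The key design choice is that the data source refuses to run without a user context instead of falling back to a service identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    """Verified claims from the SSO token (validation assumed to happen upstream)."""
    user_id: str
    groups: frozenset[str]


class DocumentSource:
    """Hypothetical data source that enforces its own ACLs on every call."""

    def __init__(self, docs):
        self._docs = docs  # list of (text, allowed_groups) pairs

    def fetch(self, ctx):
        if not isinstance(ctx, UserContext):
            # No silent fallback to a privileged service account.
            raise PermissionError("missing user context")
        return [text for text, allowed in self._docs if allowed & ctx.groups]


class Orchestrator:
    def __init__(self, source):
        self._source = source

    def answer(self, ctx, question):
        # The same ctx received at the front door is handed to each downstream call.
        documents = self._source.fetch(ctx)
        return {"asked_by": ctx.user_id, "question": question, "documents": documents}


source = DocumentSource([
    ("Salary bands 2026", frozenset({"hr"})),
    ("Holiday calendar", frozenset({"hr", "all-staff"})),
])
agent = Orchestrator(source)

hr_user = UserContext("h-042", frozenset({"hr"}))
staff_user = UserContext("s-300", frozenset({"all-staff"}))

hr_answer = agent.answer(hr_user, "What are the salary bands?")
staff_answer = agent.answer(staff_user, "What are the salary bands?")
```

Both users ask the same question, and the difference in what comes back is decided entirely by the context that travelled with the request.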

When teams get breached on access, step 2, propagating identity through the orchestration layer, is almost always where it broke. The identity was verified at login and then thrown away. The agent did its work as an omnipotent robot, and the careful RBAC model became decorative. Promethium's enterprise data governance playbook makes the same point from the governance side: durable identity propagation, not periodic policy review, is what keeps agent access honest at runtime.

Sensitive Field Masking and Audit Logs

Two controls separate a serious deployment from a hopeful one. Both are about what happens at the data layer, after identity and scope are settled.

Field-level masking. Even an authorized user often should not see raw values for everything in a record. A support agent may legitimately need a customer record to do their job while having no business reason to see a full Social Security number, a bank account, or a salary. Column-level masking and dynamic data masking in your warehouse handle this so the agent never receives the raw value in the first place. You cannot leak what was never in the context window. Treat masking as a property of the data, not a behavior you ask the model to remember - models forget, and "please do not reveal the SSN" is not a control.
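
As an illustration only, the sketch below applies masking rules in Python before a record is handed to the agent. In a real deployment the warehouse's own column-level or dynamic masking would do this; the rule table, role names, and mask_record function are assumptions made for the example.

```python
# Hypothetical masking rules: field name -> function producing the masked value.
MASKING_RULES = {
    "ssn": lambda value: "***-**-" + value[-4:],
    "salary": lambda value: None,
}

# Hypothetical roles allowed to see each sensitive field unmasked.
UNMASKED_ROLES = {
    "ssn": {"payroll"},
    "salary": {"payroll", "hr"},
}


def mask_record(record, roles):
    """Return a copy of record with sensitive fields masked for these roles."""
    masked = {}
    for field, value in record.items():
        rule = MASKING_RULES.get(field)
        if rule is None or roles & UNMASKED_ROLES.get(field, set()):
            masked[field] = value  # not sensitive, or caller is entitled
        else:
            masked[field] = rule(value)  # raw value never leaves this function
    return masked


record = {"name": "Dana Lee", "ssn": "123-45-6789", "salary": 98000}

support_view = mask_record(record, {"support"})
hr_view = mask_record(record, {"hr"})
payroll_view = mask_record(record, {"payroll"})
```

The agent only ever receives the masked view for the asking user, so there is no raw value in the prompt for the model to remember or forget.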

Per-query audit logs. Every retrieval and every tool call the agent makes should be logged with the user identity, the query, the systems touched, and what came back. This is the difference between a security incident you can investigate and one you can only apologize for. It is also a hard requirement in regulated and investor-backed contexts, where the question after any concern is not "is the agent smart" but "can you show me exactly what it did." Palo Alto Networks frames audit and visibility as a defining feature of the 2026 AI agent security market for this reason.
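
One possible shape for such a log entry, sketched in Python. The field names and the audited_call wrapper are assumptions, not a standard schema, and the sketch records a result count rather than the full payload; whether to store complete results is a policy decision for your own retention rules. It also assumes the tool returns something with a length.

```python
import json
from datetime import datetime, timezone


def audited_call(log, user_id, tool_name, systems, func, **kwargs):
    """Run one tool call and append a structured audit record for it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "tool": tool_name,
        "arguments": kwargs,
        "systems": sorted(systems),
        "outcome": "error",
        "result_count": 0,
    }
    try:
        result = func(**kwargs)
        record["outcome"] = "ok"
        record["result_count"] = len(result)
        return result
    finally:
        # Written even when the call raises, so failed attempts are visible too.
        log.append(json.dumps(record, sort_keys=True))


def search_invoices(vendor):
    """Hypothetical tool used only to exercise the wrapper."""
    return [{"id": "INV-1001", "vendor": vendor}]


audit_log = []
rows = audited_call(
    audit_log, "c-101", "search_invoices", {"erp"}, search_invoices,
    vendor="Acme Freight",
)
entry = json.loads(audit_log[0])
```

Each entry is one JSON line tied to a user identity, which makes it straightforward to ship to whatever log store you already operate.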

The instrumentation here doubles as operational tooling. Agent observability platforms like LangSmith already capture the full trace of tool calls and retrievals for debugging; pointing that same trace data at your security and compliance needs is mostly a matter of retention, access control on the logs themselves, and tying each trace to a verified user identity.

For the deeper mechanics of how these traces hold up under regulatory scrutiny, and the specific ways agents leak data when the controls above are missing, see our companion pieces on audit trails for AI agents in regulated industries and how AI agents enable data exfiltration.

How OpenNash Can Help

Permission-aware access is not a feature you bolt on after launch; it is an architecture decision that has to be made before the first tool is wired. That is where the work lands in an OpenNash build.

Audit. We map which systems the agent needs, classify the data by sensitivity, and find where your existing RBAC and SSO model already does the heavy lifting so the agent can inherit it rather than reinvent it.
Design. We define the identity propagation chain, the per-tool scopes, the masking rules, and the audit-logging schema up front, with human approval gates on any action that writes or releases sensitive data.
Build and deploy. We implement user-scoped retrieval, narrow tools instead of god-mode connections, and per-query logging, then hand the whole system over with documentation and CI/CD so your team owns it outright.

If you are evaluating platforms, be honest about fit. A packaged enterprise search tool with strong native permissions handling can be the right call for broad document Q&A. A vendor that cannot explain how user identity flows into retrieval, or whether it uses one service account for everyone, should not be trusted with sensitive data until they can. Custom is worth it when your permission model is unusual, when you operate across segregated entities like a PE portfolio, or when ownership and auditability are dealbreakers.

If broad database access is currently how your agent works, that is the thing to fix first. Book a call to map user-scoped retrieval and least-privilege tooling to your actual workflow before the contractor asks the wrong question and gets the right answer.