Apr 30, 2026

Teach Your Technology to See: Extending Microsoft Purview Sensitivity Labels with Attribute-Based Context

At the recent Gartner Digital Workplace Summit, I attended Rachel O’Farrell’s session on unstructured data. The phrase she used to frame the challenge has stayed with me ever since: we need to teach our technology to see. It’s the cleanest framing I’ve heard for what’s changing.

A sensitivity label is a cornerstone of how organisations classify sensitive information today. It tells a system what a document has been classified as, which is a real and meaningful signal. What it doesn’t tell the system is the rest of the story: whether the document contains CUI, whether it’s subject to ITAR, whether it holds patient-adjacent data, whether it’s eligible for a training corpus. That additional context used to live in the author’s head, or in a set of manual processes around the document. In the agentic era, that is no longer workable.

I just got back from the Microsoft 365 Community Conference, and one thing stood out across the conversations I had there: adoption has come a long way. Labels are in place more than ever before, which is a real step forward. It also shifts the next question from “how do we classify documents?” to “how do we make sure the systems around those documents have the context they need to protect the data appropriately?” DLP engines, sharing platforms, and Copilot indexes now make most of the real-time decisions, yet the capability most organisations have today was designed for a world where humans did that work. That is not the world we are in anymore.

The Context Gap

Microsoft Purview Sensitivity Labels are a strong foundation, and we build on them deliberately. A label is a single classification applied to a document, and it carries a single identifier. That works well when a document is one thing. It becomes more limiting the moment a document is several things at once, which is how most regulated work actually looks.

A defence technical specification can carry a Confidential label while also containing CUI, export-controlled technical data, and proprietary IP. A pre-clinical trial report can carry a Confidential label while containing HIPAA-relevant patient data and commercially sensitive research. In each case the label records one of those facts and none of the others. That gap creates risk in three related directions.

➤ For the people creating the content, the label alone doesn’t tell them what additional context the document needs. Authors are left to interpret, which means the same type of document can end up classified differently depending on who created it.

➤ For the security and governance stack, the label doesn’t carry the structured metadata that DLP, sharing platforms, and data discovery tools need to enforce policy with precision. Enforcement ends up relying on pattern matching after the fact, which is slower and less accurate than acting on context the author already knew.

➤ For AI systems, the label doesn’t carry the signal that determines whether a document should be indexed, surfaced, or excluded. Copilot connectors and RAG pipelines ingest what they can read. If the document itself doesn’t communicate its context, the AI system has to infer it, and inference isn’t a substitute for governance.

That’s the gap we built eSHARE Data Attributes to close. It’s an add-in for Word, PowerPoint, Excel, and Outlook that extends Microsoft Purview Sensitivity Labels with attribute-based context, so the classification a document carries is paired with the additional information the systems around it need to act on.

The Value of Contextual Metadata

Visual markings are important, and any serious governance approach includes them. They’re what a compliance auditor looks for when they review a document, and they’re what a human reader sees when the document is in front of them. But markings are designed for people, and much of what happens to a document today doesn’t involve a person when the decision actually gets made. DLP engines, sharing platforms, and AI pipelines are all acting on documents at speed, without direct supervision, and they need something beyond a visible marking to act on.

That something is structured metadata, embedded in the document itself, travelling with the file wherever it goes. A DLP policy that fires on pattern matching is, by design, reactive: the document has already been written and, in many cases, shared before the pattern engine gets involved. A DLP policy that fires on metadata the author set at the moment of creation is a proactive control: predictable, easy to explain, and quick.
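A minimal sketch of that difference, assuming a document is represented as its text plus a dictionary of author-declared attributes. The attribute name `contains_phi`, the `Document` class, and the single regular expression are hypothetical stand-ins for illustration; they are not the eSHARE or Purview schema, and a real DLP engine is far richer than this.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    # Author-declared context; the key names here are hypothetical.
    attributes: dict[str, str] = field(default_factory=dict)


# Stand-in for a pattern-matching rule (US SSN-like strings).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def dlp_decision(doc: Document) -> str:
    """Return 'block' or 'allow', preferring declared metadata over scanning."""
    # Proactive path: the author already told us what the document contains.
    if doc.attributes.get("contains_phi") == "true":
        return "block"
    # Reactive fallback: scan the content for a known pattern.
    if SSN_PATTERN.search(doc.text):
        return "block"
    return "allow"
```

The first branch needs no content scan at all, which is why a metadata-driven decision is easy to explain: the outcome follows directly from something the author declared.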

The same logic extends to AI. Copilot deciding whether to index a document can read that same metadata. A RAG pipeline deciding whether a document is eligible for an internal assistant can read that metadata. A training workflow deciding whether a document should be part of a model’s corpus can read that metadata. AI governance needs proof, not promises. The proof has to be present in the document, it has to be structured, and it has to be there before the AI system gets near the file.
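As an illustration of the same check on the AI side, here is a small sketch of an ingestion filter for a RAG pipeline. The attribute name `ai_indexing`, its values, and the dictionary shape of a document are assumptions made for this example, not a documented Copilot or eSHARE interface. The filter fails closed: a document with no declared attribute is excluded, which is one possible design choice rather than the only one.

```python
def eligible_for_index(attributes: dict) -> bool:
    """True only when the document explicitly declares it may be indexed."""
    # Hypothetical attribute name and value; missing metadata means "no".
    return attributes.get("ai_indexing") == "allowed"


def select_for_index(docs: list[dict]) -> list[str]:
    """Return the names of documents a pipeline would be allowed to ingest."""
    return [
        doc["name"]
        for doc in docs
        if eligible_for_index(doc.get("attributes", {}))
    ]


docs = [
    {"name": "handbook.docx", "attributes": {"ai_indexing": "allowed"}},
    {"name": "trial-report.docx", "attributes": {"ai_indexing": "excluded"}},
    {"name": "untagged.docx"},  # no metadata at all
]
print(select_for_index(docs))
```

Because the decision reads only the declared attribute, it can run before the pipeline opens the file's content.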

What This Looks Like in Practice

Here are two scenarios to bring this to life.

In the first, a defence engineer drafts a technical specification labelled Confidential. The eSHARE Data Attributes pane inside Word surfaces the attributes mapped to that label and asks the engineer to identify the CUI category and flag ITAR-controlled content. Compliant markings are composed from those answers and applied to the header and footer. Structured metadata is written into the document. When the file is shared, the policy engine reads the ITAR attribute and restricts access to U.S. persons. The engineer didn’t have to route the request or consult a checklist, and the governance outcome is the same as if a compliance reviewer had been sitting next to them.
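The sharing step in this scenario can be sketched as a simple check. The attribute name `itar_controlled`, the `Recipient` type, and the `us_person` flag are hypothetical; in practice the recipient's status would come from an identity or HR system, and the real policy engine is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipient:
    email: str
    us_person: bool  # assumed to be supplied by an identity system


def may_share(attributes: dict, recipient: Recipient) -> bool:
    """Allow sharing unless the document is ITAR-flagged and the recipient is not a U.S. person."""
    if attributes.get("itar_controlled") == "true":
        return recipient.us_person
    return True


spec = {"itar_controlled": "true"}
print(may_share(spec, Recipient("a@example.com", us_person=True)))
print(may_share(spec, Recipient("b@example.com", us_person=False)))
```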

In the second, a researcher at a life sciences organisation labels a pre-clinical trial report Confidential. The same pane asks whether the report contains patient-adjacent data subject to HIPAA. The researcher confirms. Markings apply. Metadata is embedded. When the file is shared, permissions tighten automatically. When the organisation’s AI security tooling encounters the file, it excludes it from training. That last piece is what I’d most want a board to understand: without this kind of structured context, the same document could be quietly ingested into a training corpus with no policy check in place.

Neither scenario requires the author to be a compliance expert, which is important. The label determines what attributes appear. The attributes determine what the tools that act on the file later can do. The author answers a few questions at the moment of creation, and the rest takes care of itself.
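To make the chain from label to attributes to markings concrete, here is a rough sketch. The mapping `LABEL_ATTRIBUTES`, the attribute names, the example value `EXPT`, and the marking layout are all invented for illustration; they are not the product's configuration or an official marking format.

```python
# Hypothetical mapping: which attributes each label asks the author for.
LABEL_ATTRIBUTES = {
    "Confidential": ["cui_category", "itar_controlled"],
    "General": [],
}


def missing_attributes(label: str, answers: dict) -> list[str]:
    """List the attributes the label requires that the author has not answered."""
    required = LABEL_ATTRIBUTES.get(label, [])
    return [name for name in required if name not in answers]


def compose_marking(label: str, answers: dict) -> str:
    """Build a header/footer marking string from the label and the answers."""
    missing = missing_attributes(label, answers)
    if missing:
        raise ValueError(f"unanswered attributes: {', '.join(missing)}")
    parts = [label.upper()]
    if answers.get("cui_category"):
        parts.append(f"CUI:{answers['cui_category']}")
    if answers.get("itar_controlled") == "true":
        parts.append("ITAR")
    return " // ".join(parts)


answers = {"cui_category": "EXPT", "itar_controlled": "true"}
print(compose_marking("Confidential", answers))
# prints: CONFIDENTIAL // CUI:EXPT // ITAR
```

The author only supplies `answers`; which questions are asked and how the marking is assembled both follow from the label.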

Data Attributes Strengthens the Trusted Collaboration Fabric

For more than a decade, we’ve been building on top of Microsoft 365 to meet the governance requirements regulated organisations face today. Trusted Shares bring external collaboration inside the tenant boundary. Omnichannel integrations unify email and mobile communications into the same governance fabric. Data Attributes brings enforceable context into every document at the moment it’s created. Each of these is a piece of the same picture: a Trusted Collaboration Fabric that embeds governance into the flow of work.

Classification told systems what a document was, and that was enough for a long era of compliance work. The next era asks for more. It asks for context the systems around the document can read, and it asks for that context to travel with the document. That’s what Data Attributes is for, and it’s why we’re excited to bring it to preview.

Want to see it live? Contact us and we would be happy to show you a demo. For more information you can also download the solution brief or watch this video.

Mark Cassetta
