In today’s cloud-driven world, organizations store vast amounts of data across multiple platforms — Microsoft 365, Azure, SharePoint, Teams, file servers, and SaaS apps.
The challenge? Knowing what data is sensitive and where it lives.
That’s where Sensitive Information Types (SITs) come in — they are the core detection mechanism behind Microsoft Purview’s Information Protection and Data Loss Prevention (DLP) capabilities.
🔍 What Are Sensitive Information Types?
A Sensitive Information Type (SIT) is a pattern-based rule used by Microsoft Purview to automatically detect and classify sensitive content in your environment.
Each SIT is built using:
- Regular expressions (regex) for pattern detection
- Keyword dictionaries for contextual matching
- Checksum algorithms for validation (e.g., credit card numbers)
- Confidence levels to score accuracy (Low, Medium, High)
When Microsoft Purview scans data, it uses SITs to detect things like credit card numbers, ID documents, health records, or financial data, helping organizations protect and govern this information intelligently.
💡 SITs are the detection engine behind features like DLP, Auto-Labeling, the Information Protection Scanner, and Insider Risk Management.
🧠 How SITs Work
Each Sensitive Information Type combines a primary element, one or more supporting elements, and a confidence level:
- Primary element – The actual pattern (e.g., 16-digit number).
- Supporting element – Keywords or context that increase accuracy.
- Confidence level – Indicates how sure the system is about a match.
For example, the built-in Credit Card Number SIT:
- Uses pattern matching to find 14- to 16-digit sequences, formatted or unformatted
- Validates the number using the Luhn checksum
- Looks for keywords like Visa, Mastercard, or Amex nearby
If all of these conditions are met, Purview records a high-confidence match for the Credit Card Number SIT.
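To make the three checks concrete, here is a minimal Python sketch of the same idea. It is not Purview's actual definition: the regex, the keyword list, the 300-character proximity window, and the two-level confidence result are simplifying assumptions chosen for illustration.

```python
from __future__ import annotations

import re

# Illustrative only: the built-in SIT definition is more elaborate than this.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,15}\d\b")  # 14 to 16 digits
KEYWORDS = ("visa", "mastercard", "amex")               # assumed keyword list
PROXIMITY = 300  # characters searched around a match (assumed value)


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_cards(text: str) -> list[tuple[str, str]]:
    """Return (digits, confidence) for each Luhn-valid candidate in text."""
    results = []
    lowered = text.lower()
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", m.group())  # strip spaces and dashes
        if not luhn_valid(digits):
            continue  # checksum failed: treat as a false positive
        window = lowered[max(0, m.start() - PROXIMITY): m.end() + PROXIMITY]
        confidence = "high" if any(k in window for k in KEYWORDS) else "low"
        results.append((digits, confidence))
    return results
```

Calling `detect_cards("Visa card: 4111 1111 1111 1111")` on this well-known test number yields a high-confidence match, while the same number without a nearby keyword only reaches low confidence in this sketch.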
📦 Built-in Sensitive Information Types
Microsoft provides over 400 built-in SITs across 50+ regions, covering major regulatory and compliance frameworks.
🌍 Common Categories
| Category | Examples |
|---|---|
| 💳 Financial | Credit Card, SWIFT Code, IBAN, Bank Account Number |
| 🧾 Personal Identifiers | Passport, Driver’s License, National ID, SSN |
| 🏥 Health | ICD-10 Code, NHS Number, Health Insurance ID |
| 💼 Corporate Data | Employee ID, Payroll Number, Tax File Number |
| 🌐 Regional Regulations | EU ID, Aadhaar (India), SIN (Canada), INSEE (France) |
You can view all available SITs in the Microsoft Purview Compliance Portal → Data Classification → Sensitive Info Types.
🧰 Custom Sensitive Information Types
If your organization has unique data formats, you can create custom SITs to identify them.
For example:
- Law Firm: Case File Numbers (e.g., CFN-1234)
- Bank: Loan Reference IDs
- Manufacturer: Product Serial Numbers
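As a rough illustration of what the primary pattern for the law firm example could look like, the Python sketch below matches the `CFN-1234` format. The exact format (the prefix `CFN-` followed by exactly four digits) is an assumption based on the example above; a real case numbering scheme may differ.

```python
from __future__ import annotations

import re

# Hypothetical format: "CFN-" followed by exactly four digits.
# The \b anchors keep the pattern from matching inside longer tokens.
CASE_FILE_PATTERN = re.compile(r"\bCFN-\d{4}\b")


def find_case_file_numbers(text: str) -> list[str]:
    """Return every case file number found in text, in order of appearance."""
    return CASE_FILE_PATTERN.findall(text)
```

The word boundaries matter here: without them, a longer identifier such as `CFN-98765` would produce a partial match and inflate false positives.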
⚙️ Steps to Create a Custom SIT
- Go to Microsoft Purview → Data Classification → Sensitive Info Types
- Click Create → define using regex or keyword patterns
- Set confidence levels (Low / Medium / High)
- Test detection using sample files
- Publish for use in DLP, auto-labeling, or retention policies
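Before uploading sample files to the portal, it can help to sanity-check a pattern locally. The Python sketch below is one way to do that, assuming a hypothetical `CFN-` pattern, made-up supporting keywords, and a simplified rule where a keyword anywhere in the text raises confidence from low to high. It does not reproduce Purview's matching engine.

```python
from __future__ import annotations

import re
from typing import Optional

PATTERN = re.compile(r"\bCFN-\d{4}\b")          # hypothetical primary element
SUPPORTING = ("case file", "matter", "docket")  # hypothetical keywords


def classify(text: str) -> Optional[str]:
    """Return 'high', 'low', or None (no match) for a sample text."""
    if not PATTERN.search(text):
        return None
    lowered = text.lower()
    return "high" if any(k in lowered for k in SUPPORTING) else "low"


def failed_samples(samples: list[tuple[str, Optional[str]]]) -> list[str]:
    """Return the sample texts whose result differs from the expectation."""
    return [text for text, expected in samples if classify(text) != expected]


# Each sample pairs a text with the confidence we expect the rule to report.
SAMPLES = [
    ("Case file CFN-1234 was reopened.", "high"),
    ("Reference CFN-5678", "low"),
    ("Order number CFN-123456", None),  # too many digits: should not match
]

print(failed_samples(SAMPLES))  # an empty list means every sample behaved as expected
```

Including negative samples, like the order number above, is what catches an overly broad pattern before it reaches a DLP policy.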
Advanced users can also define SITs in an XML rule package and upload it through Security & Compliance PowerShell, using the New-DlpSensitiveInformationTypeRulePackage cmdlet.
🧠 Trainable Classifiers (AI-Based SITs)
Beyond regex and keywords, Purview offers Trainable Classifiers powered by machine learning.
These classifiers learn from real examples of your documents — identifying content by context and meaning, not just patterns.
Built-in Classifiers Include:
- Resume
- Contract
- Source Code
- Financial Document
- Health Record
You can also create custom classifiers for business-specific documents by uploading a labeled training set in the Purview portal.
🧠 Trainable classifiers help discover unstructured or context-rich data that static SITs can’t easily detect.
⚙️ Where SITs Are Used in Microsoft Purview
| Feature | Purpose of SITs |
|---|---|
| 🧭 Data Loss Prevention (DLP) | Detects sensitive data in motion and applies rules to block or warn. |
| 🏷️ Auto-Labeling | Automatically applies sensitivity labels based on detected SITs. |
| 📊 Information Protection Scanner | Scans file shares and on-premises repositories for sensitive data. |
| 🔎 Data Classification Reports | Provides visibility into where sensitive information exists. |
| 🧠 Insider Risk Management | Correlates user activities with sensitive data access and sharing. |
SITs form the foundation of data discovery, labeling, and protection in Microsoft Purview.
🧾 Roles and Permissions
To view, manage, or create Sensitive Information Types, you need specific Purview roles:
| Role / Group | Access Level |
|---|---|
| 🛡️ Compliance Administrator | Full control to create and manage SITs |
| 🔍 Security Administrator | Monitor SIT detections and alerts |
| 📄 Information Protection Contributor | Create custom SITs and manage classifiers |
| 👁️ Content Explorer List Viewer / Content Explorer Content Viewer | View SIT matches in files (item list / item contents) |
| 🧰 Global Administrator | Full tenant access (for initial setup only) |
📖 Microsoft Doc: Purview permissions
💼 Licensing Requirements
The ability to use and manage Sensitive Information Types depends on your Microsoft 365 license.
| Feature | Required License |
|---|---|
| Use built-in SITs in DLP or labeling | Microsoft 365 E3 (partial), Microsoft 365 E5 (full) |
| Create custom SITs | Microsoft 365 E5 / A5 / G5 |
| Use Trainable Classifiers | Microsoft 365 E5 / E5 Compliance |
| Auto-labeling with SITs | Microsoft 365 E5 Information Protection & Governance add-on |
📚 Microsoft Doc: Purview Service Description
✅ Best Practices
- Start with built-in SITs — they’re optimized for accuracy.
- Test before enforcing — use simulation or audit mode first.
- Combine SITs with sensitivity labels for layered protection.
- Use trainable classifiers for complex or free-text documents.
- Continuously review detections in Content Explorer to fine-tune policies.
📚 Check Microsoft Documentation
For more detailed technical references, visit the official Microsoft Learn articles:
- Learn about sensitive information types
- Entity definitions for built-in SITs
- Manage custom sensitive information types
- Common usage scenarios for SITs
- Microsoft Purview Information Protection overview
🧭 These pages provide up-to-date lists, XML structures, and API references for developers and administrators managing SITs in enterprise environments.
🏁 Final Thoughts
Sensitive Information Types (SITs) are the intelligence behind Microsoft Purview’s data protection ecosystem.
They enable automatic detection, labeling, and governance of sensitive information — ensuring your organization stays secure and compliant.
💡 If Sensitivity Labels are the wrappers, SITs are the detectors that tell Purview when to protect your data.