Your Employees Are Emailing Sensitive Data Right Now

Email: the original data leak channel

Organizations spend heavily on endpoint detection, network monitoring, and cloud access security. Meanwhile, the most common data exfiltration vector in most enterprises is the one they have had since day one: email.

According to research from the Ponemon Institute, email is involved in more than 65% of data loss incidents. The reasons are straightforward. Email is universally available, it crosses network boundaries by design, it carries attachments of any type, and it leaves the organization's control the moment it is sent.

The challenge is that most email data loss is not the result of a sophisticated cyberattack. It is the result of everyday business communication that happens to include information that should not be leaving the organization.

Three patterns of email data loss

Pattern 1: Accidental disclosure

This is the most common pattern and the least malicious. An employee sends a spreadsheet to the wrong recipient. A reply-all goes to a distribution list that includes external contacts. An email with sensitive client data is forwarded to a personal email address "to work on it over the weekend."

Accidental disclosure is driven by convenience and time pressure. The employee has no malicious intent. They are trying to get their work done. But the result is the same: sensitive data leaves the organization's control.

Pattern 2: Negligent handling

Negligent handling sits between accidental and intentional. The employee knows they are sending sensitive data externally but does not consider the risk. Common examples include emailing unencrypted spreadsheets containing customer personal information, sending tax documents or financial statements to personal email accounts for convenience, and including Social Security numbers, credit card numbers, or medical record identifiers in plain text email bodies.

Negligent handling is a policy problem. The organization may have data handling policies, but employees either do not know about them, do not understand them, or find them too cumbersome to follow in practice.

Pattern 3: Malicious exfiltration

Malicious exfiltration is deliberate theft. A departing employee emails customer lists, proprietary documents, or trade secrets to a personal account before their last day. A compromised insider systematically forwards sensitive communications to an external address controlled by a competitor or threat actor.

Malicious exfiltration is the least common of the three patterns but causes the most damage per incident. It is also the hardest to detect because the person doing it actively avoids triggering security controls.

How DLP detection works

Data Loss Prevention (DLP) systems inspect outbound email content for patterns that match sensitive data types. There are three primary detection methods:

Exact match detection

Exact match (also called exact data matching or EDM) compares outbound content against a database of known sensitive values. For example, if your organization maintains a database of customer Social Security numbers, the DLP system checks every outbound email for exact matches against that list. This method has a very low false positive rate but requires maintaining an up-to-date reference database.
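A minimal sketch of the idea, assuming the reference values are nine-digit identifiers and that the system stores keyed hashes rather than the raw values. The names SECRET_KEY, fingerprint, build_reference_set, and find_exact_matches are illustrative and do not come from any particular DLP product:

```python
import hashlib
import hmac
import re

# Hypothetical key; a real deployment would load this from a secrets store.
SECRET_KEY = b"replace-with-a-real-secret"

# Candidate tokens: nine digits with optional hyphen or space separators.
CANDIDATE = re.compile(r"(?<!\d)\d{3}[- ]?\d{2}[- ]?\d{4}(?!\d)")


def fingerprint(value):
    """Strip non-digits and return a keyed SHA-256 digest of the result."""
    digits = re.sub(r"\D", "", value)
    return hmac.new(SECRET_KEY, digits.encode(), hashlib.sha256).hexdigest()


def build_reference_set(known_values):
    """Hash every known sensitive value once, ahead of time."""
    return {fingerprint(v) for v in known_values}


def find_exact_matches(body, reference_set):
    """Return candidate tokens in the body whose hash is in the reference set."""
    return [
        m.group()
        for m in CANDIDATE.finditer(body)
        if fingerprint(m.group()) in reference_set
    ]


reference = build_reference_set(["123-45-6789"])
hits = find_exact_matches("Please update 123 45 6789, not 987-65-4321.", reference)
```

Because both sides are normalized to digits before hashing, the known value is found even when the email formats it differently. The second number has the right shape but is not in the reference set, so it is ignored, which is where the low false positive rate comes from.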

Keyword detection

Keyword detection flags emails containing specific terms associated with sensitive data: "confidential," "internal only," "do not distribute," or industry-specific terms like "HIPAA," "PCI," or "material nonpublic information." Keyword detection is simple to implement but generates significant false positives without careful tuning.
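As a rough illustration, a keyword detector can be a single case-insensitive regular expression built from a term list. The list below reuses the terms mentioned above. The function name find_keywords and the whole-word matching rule are assumptions made for this sketch:

```python
import re

KEYWORDS = [
    "confidential",
    "internal only",
    "do not distribute",
    "HIPAA",
    "PCI",
    "material nonpublic information",
]

# Whole-word, case-insensitive alternation over the escaped terms.
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def find_keywords(text):
    """Return distinct keywords found, lowercased, in order of first appearance."""
    seen = []
    for match in KEYWORD_RE.finditer(text):
        keyword = match.group().lower()
        if keyword not in seen:
            seen.append(keyword)
    return seen


hits = find_keywords("CONFIDENTIAL: internal only, per our HIPAA review.")
```

Whole-word matching is one small tuning step: it keeps "confidentiality agreement" from firing the "confidential" rule. It does nothing about the larger source of false positives, which is legitimate mail that happens to use these words.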

Regex pattern detection

Regular expression (regex) patterns detect structured data formats regardless of the specific values. This is the most common DLP method for detecting personally identifiable information in email. Common patterns include:

Social Security Numbers: Three digits, a separator, two digits, a separator, four digits (e.g., 123-45-6789). The pattern also catches variants without separators or with spaces instead of hyphens.
Credit card numbers: 13 to 19 digit sequences with specific prefix ranges for each card network. Visa starts with 4, Mastercard with 51 through 55 or 2221 through 2720, American Express with 34 or 37. Luhn checksum validation reduces false positives from random number sequences.
Tax identification numbers: EINs follow a two-digit prefix, separator, seven-digit format. ITINs start with 9 and have specific digit ranges in positions four and five.
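The patterns above can be sketched in a few lines of Python. This is an illustrative approximation, not a production rule set: the names SSN_RE, EIN_RE, CARD_RE, luhn_valid, card_network, and find_card_numbers are made up for the example, per-network length checks are skipped, and ITIN matching is left out to keep the sketch short.

```python
import re

# SSN: 3-2-4 digits, with hyphens, spaces, or no separators.
SSN_RE = re.compile(r"(?<!\d)\d{3}[- ]?\d{2}[- ]?\d{4}(?!\d)")

# EIN: two digits, a hyphen, seven digits.
EIN_RE = re.compile(r"(?<!\d)\d{2}-\d{7}(?!\d)")

# Card candidate: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def card_network(digits):
    """Classify by prefix only; returns None for prefixes not listed here."""
    if digits.startswith("4"):
        return "visa"
    if digits[:2] in ("34", "37"):
        return "amex"
    if 51 <= int(digits[:2]) <= 55 or 2221 <= int(digits[:4]) <= 2720:
        return "mastercard"
    return None


def find_card_numbers(text):
    """Return (network, digits) for candidates with a known prefix and valid Luhn."""
    hits = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        network = card_network(digits)
        if network and luhn_valid(digits):
            hits.append((network, digits))
    return hits


cards = find_card_numbers("Card 4111 1111 1111 1111, order ref 4111111111111112")
ssns = SSN_RE.findall("SSN 123-45-6789 or 123456789")
```

In the usage lines, both 16-digit strings match the card pattern and the Visa prefix, but only the first passes the Luhn check. That is the filtering step that keeps order numbers and other long digit strings from being flagged.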

Building a basic DLP policy

A practical DLP policy starts with three tiers of sensitivity and corresponding actions:

Block and notify. Emails containing high-confidence matches for Social Security numbers, credit card numbers (with Luhn validation), or known confidential document identifiers are blocked from sending. The sender receives an immediate notification explaining why the email was blocked and how to send the information securely.
Warn and log. Emails containing medium-confidence matches (keywords like "confidential" combined with external recipients, or partial pattern matches) trigger a warning to the sender but are allowed through. The event is logged for review.
Log only. Emails sent to personal email domains (consumer webmail providers) from corporate accounts are logged for audit purposes. No action is taken unless the volume or pattern suggests data exfiltration.

Policies that are too aggressive generate alert fatigue and drive employees to find workarounds (personal devices, messaging apps, USB drives), which are harder to monitor than email.
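One way to picture the three tiers is as a small decision function that runs after detection. The sketch below assumes the detectors have already produced a few boolean findings. The Findings fields, the Action names, the corporate domain, and the personal-domain list are all placeholders chosen for illustration:

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK_AND_NOTIFY = "block_and_notify"
    WARN_AND_LOG = "warn_and_log"
    LOG_ONLY = "log_only"
    ALLOW = "allow"


# Hypothetical values; substitute your own domain and webmail list.
CORPORATE_DOMAIN = "example.com"
PERSONAL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}


@dataclass
class Findings:
    high_confidence_match: bool   # e.g. SSN or Luhn-valid card number
    keyword_hit: bool             # e.g. "confidential"
    partial_pattern_match: bool   # e.g. a pattern match that failed validation
    recipients: list


def domain_of(address):
    return address.rsplit("@", 1)[-1].lower()


def decide(findings):
    """Map detector findings to the strictest applicable tier."""
    domains = {domain_of(r) for r in findings.recipients}
    has_external = any(d != CORPORATE_DOMAIN for d in domains)

    if findings.high_confidence_match:
        return Action.BLOCK_AND_NOTIFY
    if (findings.keyword_hit and has_external) or findings.partial_pattern_match:
        return Action.WARN_AND_LOG
    if domains & PERSONAL_DOMAINS:
        return Action.LOG_ONLY
    return Action.ALLOW


action = decide(Findings(False, True, False, ["buyer@partner.org"]))
```

Checking the tiers from strictest to most lenient means an email that qualifies for several tiers always gets the strongest action. Keeping the thresholds in one function also makes it easier to loosen a rule when it starts producing alert fatigue.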

The blind spot: forwarding rules

DLP systems inspect individual outbound emails at the time they are sent. This works well for the three patterns described above. But there is a fourth pattern that DLP often misses: automated forwarding rules.

When an attacker (or a malicious insider) creates a forwarding rule on an email account, every future email that matches the rule's criteria is automatically copied to an external address. The forwarding happens at the mail server level, not at the client level. Many DLP systems do not inspect server-side forwarded messages because the forwarding is treated as a transport configuration, not as a user-initiated send action.

This makes forwarding rules one of the most persistent and dangerous exfiltration channels available. A single forwarding rule, created in under 60 seconds, can silently exfiltrate every email matching specific keywords (invoice, payment, wire, contract) for months. The rule survives password resets. It survives MFA enrollment. It survives the employee leaving the organization if the account remains active.

DLP catches the email that an employee manually sends to the wrong address. It does not catch the forwarding rule that automatically sends a copy of every financial email to an external mailbox controlled by an attacker.
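To make the gap concrete, here is a hedged sketch of what an audit for this pattern might look like once the rules have been exported from the mail platform. How the rules are retrieved is platform-specific and not shown. The MailRule shape, the flag_external_forwarding function, and the keyword set are assumptions for illustration, not a description of any vendor's implementation:

```python
from dataclasses import dataclass, field

# Keywords taken from the examples above.
SENSITIVE_KEYWORDS = {"invoice", "payment", "wire", "contract"}


@dataclass
class MailRule:
    """Hypothetical, simplified view of one exported mailbox rule."""
    mailbox: str
    forward_to: list
    keywords: list = field(default_factory=list)


def flag_external_forwarding(rules, internal_domains):
    """Return one finding per rule that forwards to any non-internal address."""
    findings = []
    for rule in rules:
        external = [
            addr for addr in rule.forward_to
            if addr.rsplit("@", 1)[-1].lower() not in internal_domains
        ]
        if not external:
            continue
        matched = sorted(SENSITIVE_KEYWORDS & {k.lower() for k in rule.keywords})
        findings.append({
            "mailbox": rule.mailbox,
            "external_targets": external,
            "sensitive_keywords": matched,
        })
    return findings


rules = [
    MailRule("cfo@example.com", ["drop@attacker.test"], ["Invoice", "wire"]),
    MailRule("it@example.com", ["helpdesk@example.com"]),
]
findings = flag_external_forwarding(rules, {"example.com"})
```

Only the first rule is flagged, because its target sits outside the internal domain. The check looks at mailbox configuration rather than message content, which is why it has to run separately from content-based DLP inspection.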

Closing the gap with InboxWatch

InboxWatch detects the exfiltration channel that DLP misses. It scans Gmail and Microsoft 365 accounts for forwarding rules that copy emails to external addresses, inbox rules that redirect messages matching sensitive keywords, delegates with unauthorized read access to mailboxes, and OAuth applications with mail read permissions that were not approved by IT.

DLP protects against data leaving through the front door. InboxWatch protects against the back door that someone quietly installed in your mail server configuration.
