Understanding Data Loss Prevention

I’ll assume that you’ve watched the Indiana Jones movie Raiders of the Lost Ark; if not, be warned, there are spoilers ahead. At the end of the movie, Indy is told that the Ark is in a place that is “very safe.” To which Indy replies, “From whom?” Later, the scene cuts to a man rolling a wooden crate through a huge warehouse, already filled with other large wooden crates, to an open spot on the floor where he leaves the crate to sit for eternity… or at least until Kingdom of the Crystal Skull came out. Sure, the location had a menacing fence topped with barbed wire surrounding the perimeter, and guards posted to permit or block incoming and outgoing traffic based on the schedule (policy) for the day. But those didn’t seem to be enough to keep the data… errr… artifacts safe in the next movie, when a determined force of baddies set their sights on removing one.

Imagine the warehouse, the surrounding property, and the security measures in place as your IT ecosystem. The large wooden crates could be systems and files within the ecosystem, and the artifacts could be considered the data housed within those systems and files. Some of the crates might be consolidated into a specific area of the warehouse that has carefully guarded access. Similarly, critical systems in an organization’s environment are likely grouped onto network segments with access policies governing who can reach the content. File permissions can restrict even further which actions specific people (or groups of people) can perform on the data. These network- and system-level rules do not, however, provide complete coverage or control over all data to prevent its loss or leakage.

Consider for a moment the lifecycle of the content before it is placed onto the protected system: it might go through many iterations of refinement on a user’s desktop and be exchanged via email with various internal departments. At any time, that unprotected content might be leaked if a laptop or phone is stolen or compromised, or if the wrong destination email address is typed.

According to Gartner’s Magic Quadrant (MQ) for Enterprise Data Loss Prevention (DLP), the DLP market is defined as those technologies that, as a core function, provide remediation for data loss based on both content inspection and contextual analysis of data. In other words, a DLP product is a system or piece of software that can interpret the content of a file (or email) and apply a set of rules based on how the content is being used (prevent everyone except authorized users from opening a file, prevent a file from being sent outside the organization, etc.). The interpretation and rule sets would be applied to data:

  • At rest on premises, or in cloud applications and cloud storage
  • In motion over the network
  • In use on a managed endpoint device
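
To make “content inspection plus contextual analysis” a little more concrete, here is a minimal Python sketch. It is not how any particular DLP product works. The card-number pattern, the `Context` class, the `decide` function, the action names, and the `example.com` internal domain are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_card_number(text: str) -> bool:
    """Content inspection: look for digit runs that pass the Luhn check."""
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            return True
    return False


@dataclass
class Context:
    """Contextual information about how the content is being used (illustrative)."""
    recipient_domain: str


def decide(text: str, ctx: Context, internal_domain: str = "example.com") -> str:
    """Combine what the content is with where it is going."""
    sensitive = contains_card_number(text)
    leaving = ctx.recipient_domain != internal_domain
    if sensitive and leaving:
        return "block"
    if sensitive:
        return "allow_and_log"
    return "allow"
```

The same message gets a different answer depending on context: a card number sent to a colleague is logged, while the same text addressed to an outside domain is blocked.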

In its simplest form, a DLP solution gives you the ability to assign a classification to data, which can then be matched against rules that assign some action to the data while it is at rest, in use, or in motion. For instance, a policy may require that data at rest be encrypted and that it can only be decrypted by someone with the correct entitlement, regardless of where the data resides: on a network share or on your desktop. The classification of the data resides within the file’s own metadata; therefore, enforcement of the policy will follow the file! Agents on the network use policies to determine whether the data can be shared from one user to another, internally or externally, possibly even by time of day. Further enhancements to DLP include anomaly-based triggers that learn what is normal and flag what stands out as different, so that it can be acted on.
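
The classify-then-match idea described above can be sketched in a few lines of Python. Treat this as a toy model under stated assumptions: the labels (`confidential`, `public`), the action names, the `POLICY` table, and the `action_for` helper are all hypothetical and not taken from any vendor’s product.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    AT_REST = "at_rest"
    IN_USE = "in_use"
    IN_MOTION = "in_motion"


@dataclass(frozen=True)
class Rule:
    classification: str  # label carried in the file's metadata
    state: State         # where the data is when the rule is evaluated
    action: str          # what the enforcing agent should do


# Hypothetical policy: one rule per (classification, state) pair.
POLICY = [
    Rule("confidential", State.AT_REST, "encrypt"),
    Rule("confidential", State.IN_USE, "require_entitlement"),
    Rule("confidential", State.IN_MOTION, "block_external"),
    Rule("public", State.IN_MOTION, "allow"),
]


def action_for(metadata: dict, state: State, default: str = "audit") -> str:
    """Look up the action for a file based on its classification label."""
    label = metadata.get("classification", "unclassified")
    for rule in POLICY:
        if rule.classification == label and rule.state == state:
            return rule.action
    return default  # no matching rule: fall back to a default action
```

Because the lookup depends only on the label stored with the file, the same answer comes back whether the file sits on a network share or a desktop, which is the sense in which the policy follows the file.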

In the warehouse example, these controls would have enabled the facility to truly ensure the artifact was “very safe”:

Encrypt the data at rest. (Seal the artifact in an unbreakable box inside the crate which only an authorized person can open.)

Protect the data in use. (Ensure the artifact cannot be read, modified, moved, deleted, etc. by an unauthorized person.)

Protect the data in motion by sanitizing/redacting the sensitive data, and enforcing strict send/receive rules. (Magically remove the artifact from the crate without anyone noticing, or prevent the artifact from being sent in the first place.)
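
For the in-motion case, here is a rough Python sketch of redacting and send rules working together. The pattern (a US Social Security number shape), the domain lists, and the `prepare_outbound` function are illustrative assumptions only; a real deployment would use far richer detection and policy.

```python
import re
from typing import Optional

# Hypothetical detector: text shaped like a US Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical send rules.
INTERNAL_DOMAINS = frozenset({"example.com"})
BLOCKED_DOMAINS = frozenset({"baddies.example"})


def redact(text: str) -> str:
    """Replace every SSN-shaped match with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED]", text)


def prepare_outbound(body: str, recipient: str) -> Optional[str]:
    """Return the body to deliver, or None if the message must not be sent."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in BLOCKED_DOMAINS:
        return None          # prevent the artifact from being sent at all
    if domain in INTERNAL_DOMAINS:
        return body          # internal mail is delivered unchanged
    return redact(body)      # external mail leaves with sensitive data removed
```

In warehouse terms, the crate still ships to an outside address, but the artifact is no longer inside it.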

Finally, I’d like to address Indy’s question regarding safety. We should also ask, “Safe from whom?” when considering data. Many threats to our networks today are internal, whether intentional or not. How many times have you typed the first two or three letters of someone’s email address and then hit tab or clicked the dropdown to complete it? Mistakes happen; however, a DLP solution can help prevent data from being lost or leaked outside of your network.

In conclusion, I hope the above scenario helped you to understand what a DLP solution is and how it can help your organization. The ability to classify data and implement rule sets based on classification adds another layer of defense and further protects a company’s intellectual property.