What is Data Leakage? Data Protection? Data Identification?

We all know what data is: our company and/or client information in almost any form, whether digital or not.  We all SHOULD know that keeping our data/information ‘ours’ is the most important thing of all.  Exposing any of it could be harmful to us, our clients and anyone else for that matter.  So it’s very important to know where data lives, what ‘leakage’ is and how it can happen.  That’s what I’ll address in this post.

So, what general types of data are there?

  1. Network Data – Data that moves across, in and out of, and through your network.  This is termed Data in Motion (DiM).
  2. Storage Data – Data that sits on storage servers, in data centers, archive locations, etc.  This is termed Data at Rest (DaR).
  3. Endpoint Data – Data that sits on user workstations, both inside and outside of the corporate network.  This is commonly termed Data in Use (DiU).

Well, now that we know how data can be classified, what kind of leakage protection is available for these data types and where does it ‘sit’, or get installed/managed?

Let’s start with one more acronym, Data Leakage Protection (DLP), and associate it with the three data types: ‘Network’ DLP (NDLP), ‘Storage’ DLP (SDLP) and ‘Endpoint’ DLP (EDLP).  Make sense?  So what are they?

  1. NDLP – This is typically software or hardware that analyzes network traffic on its way in or out of your network, or to other networks within your organization, for violations of network security policies and provisions.  An example would be a user emailing a confidential pricing sheet to another company.
  2. SDLP – This is typically a software solution that discovers whether confidential or privileged data is stored in inappropriate or insufficiently secured locations on your network storage.  An example would be a financial spreadsheet that is stored in a publicly shared folder on the network.
  3. EDLP – This is typically software that runs on user workstations or servers within your organization.  It works similarly to NDLP, but can also be used to control the types of data that can be transferred between groups or different user types.  This type of system is the most complex, as it really controls the flow of data types and content.
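
To make the SDLP example a bit more concrete, here is a minimal Python sketch of a storage scan.  It is only an illustration under some loud assumptions: the folder names in PUBLIC_FOLDERS and the words in SENSITIVE_KEYWORDS are hypothetical, files are read as plain text, and a real product would parse spreadsheets and other document formats and check actual share permissions.

```python
import os
from typing import List

# Hypothetical values chosen for illustration only.
PUBLIC_FOLDERS = {"public", "shared"}
SENSITIVE_KEYWORDS = ("confidential", "financial")


def find_exposed_files(root: str) -> List[str]:
    """Return files under a 'public' folder that contain a sensitive keyword."""
    exposed = []
    for folder, _subdirs, filenames in os.walk(root):
        # Only inspect folders whose path (relative to root) includes a public folder name.
        parts = {part.lower() for part in os.path.relpath(folder, root).split(os.sep)}
        if not parts & PUBLIC_FOLDERS:
            continue
        for filename in filenames:
            path = os.path.join(folder, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read().lower()
            except OSError:
                continue  # Unreadable file: skip it rather than abort the scan.
            if any(keyword in text for keyword in SENSITIVE_KEYWORDS):
                exposed.append(path)
    return sorted(exposed)
```

Calling find_exposed_files on the top of a file share would return the list of files that deserve a closer look, which is roughly the ‘discovery’ step described above.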

So, how do these systems know what is allowed, what shouldn’t be allowed, and what is forbidden when it comes to governing types of data and information?

The technology relies on fundamental Data Identification techniques and algorithms, and there are a number of ways to identify sensitive information.  Data Identification is the process in which a company uses DLP technology to determine what to look for in data, whether in motion, at rest or in use.  The software or hardware solution performs advanced content analysis, looking for keywords, phrase matches and document types.  Based on a set of rules, the data can then be discarded, allowed and forwarded, redirected for approval, or quarantined.
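
The rule-based idea above can be sketched in a few lines of Python.  This is a simplified illustration, not how any particular DLP product works: the three rules, their patterns and the first-match-wins ordering are all my own assumptions, and real systems use far more sophisticated content analysis.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow and forward"
    REVIEW = "redirect for approval"
    QUARANTINE = "quarantine"
    DISCARD = "discard"


@dataclass
class Rule:
    name: str
    pattern: re.Pattern
    action: Action


# Hypothetical rules, ordered from most to least severe.
RULES = [
    Rule("card-like number",
         re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
         Action.DISCARD),
    Rule("confidential keyword",
         re.compile(r"\bconfidential\b", re.IGNORECASE),
         Action.QUARANTINE),
    Rule("pricing phrase",
         re.compile(r"\bpricing sheet\b", re.IGNORECASE),
         Action.REVIEW),
]


def classify(text: str) -> Action:
    """Return the action of the first matching rule, or ALLOW if none match."""
    for rule in RULES:
        if rule.pattern.search(text):
            return rule.action
    return Action.ALLOW
```

With these made-up rules, an email that mentions a ‘pricing sheet’ would be redirected for approval, while an ordinary message with no matches would simply be allowed and forwarded.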

These systems can even control instant messaging and email before they get to archiving systems.  This brings other benefits and considerations: a blocked communication was technically never sent by the system, so it is not subject to the system’s retention policies and would possibly not be recognized in a legal discovery process, if one were to occur.

These systems can be implemented at various levels, some relatively inexpensive and others much more costly, depending on how much data you have, the type of data, and the ways in which you want it protected from leakage.

Look for my webinar next month, the start of a multi-part series on information protection, to learn more.  I’ll also touch upon how to protect your information from general access or transport.

If you would like help and advice on how to go about implementing these technologies, please contact me.  We’re here to help!

Rick
