Mainframe Security: Part 3 - Where is all your sensitive data?
One vulnerability I see a lot is copies of sensitive data outside of the production environment. This sensitive data, if disclosed, can harm the organization just as much as the production versions. Examples are Social Security Numbers, medical diagnoses or treatments, credit information, and, of course, credit card numbers, which should never be stored unencrypted in the first place. One example that comes to mind is an insurance company that discovered a series of database query results, stored under an individual user’s high-level index, that correlated medical treatments with diagnoses but also contained the patients’ identification. When this was investigated, it turned out that the employee had been asked by an executive to do this analysis but never checked with the security people on where and how to temporarily store this information, and never cleaned it up afterwards.
Before the realization that copies of sensitive data are dangerous, it was the norm for application developers to just make copies of production data when they were developing or modifying application programs. Why would they risk destroying or ruining production data and putting their organization and their own futures at risk? The problem is that when you look at disclosure of sensitive information, a year-old copy of a production dataset containing sensitive data is just about as dangerous as the current production version. One mainframe software vendor told me that, in their experience, it was not unusual to find five or six copies of each production dataset in a non-production environment!
So, what are developers to do? There are products available, such as Compuware’s Test Data Privacy product, that take production datasets, even sets of production datasets where they are interrelated, and sanitize them for use as test or development data. This solution provides the application developers with the test data they need without exposing the organization’s sensitive data to a breach and all the negative ramifications of that.
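To make the idea of sanitizing test data concrete, here is a minimal sketch in Python. It is not how any commercial product works; it only illustrates one common approach, keyed pseudonymization, applied to Social Security Numbers. The function names, the record layout, and the key are all assumptions made for illustration. Because the same key always maps the same input to the same substitute, interrelated datasets sanitized with that key can still be joined on the substituted value.

```python
import hashlib
import hmac
import re

# Matches SSNs written as 123-45-6789 (an assumed layout; real data may differ).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def pseudonymize_ssn(ssn: str, key: bytes) -> str:
    """Map an SSN to a stable, SSN-shaped substitute using a keyed hash."""
    digest = hmac.new(key, ssn.encode(), hashlib.sha256).hexdigest()
    # Reduce the hash to nine digits and put it back in SSN layout.
    nine = f"{int(digest, 16) % 10**9:09d}"
    return f"{nine[:3]}-{nine[3:5]}-{nine[5:]}"

def sanitize_record(record: str, key: bytes) -> str:
    """Replace every SSN-shaped value in a text record with its substitute."""
    return SSN.sub(lambda m: pseudonymize_ssn(m.group(), key), record)
```

In this sketch the key must be protected like any other secret, since anyone holding it can test guesses against the substituted values. A substitute can also happen to coincide with some real person's number, so this illustrates the technique and is not a complete de-identification scheme.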
Additionally, some types of data, even if legitimately stored in production versions, must be encrypted or tokenized. The most obvious example of this is credit card numbers, but, with the recent explosion of data breaches, the list of such data types will expand. Once located, these instances will be more difficult to remediate, since the production application programs will have to be modified, but locating all instances of this sensitive data is a starting point for a remediation roadmap.
But, where does the security team start in locating the datasets and database tables containing sensitive information that they don’t know about? There are several data discovery products available in the open systems market, but only one in the mainframe market – DataSniff from Xbridge Systems.
Another issue is that application developers must be prohibited by the security system from accessing production data and forced to go through an intermediary for obtaining de-sensitized copies. This, of course, adds time and overhead when diagnosing issues. One application developer told me that, a long time ago, he would look up personal details on famous people in the company’s databases, just out of curiosity.
Occasionally, there will be a justifiable reason for making a copy, or the controls will be circumvented, so the data discovery product must be run regularly to identify copies of sensitive data under a non-production high-level index so that they can be remediated. At that time the application developer or quality assurance technician, who probably made the copy with the best of intentions, must be taught the risks that the copy places on the organization.
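As a rough illustration of that regular check, the sketch below takes a list of dataset names that a discovery scan has already flagged as containing sensitive data and reports the ones that sit outside production. The production high-level index "PROD" and the dataset names in the usage note are hypothetical; a real site would supply its own list.

```python
# Hypothetical set of high-level indexes that hold production data.
PRODUCTION_HLQS = {"PROD"}

def high_level_qualifier(dsname: str) -> str:
    """Return the first qualifier of a dataset name, e.g. 'JSMITH' for 'JSMITH.CLAIMS.EXTRACT'."""
    return dsname.split(".", 1)[0].upper()

def non_production_copies(flagged_dsnames, production_hlqs=PRODUCTION_HLQS):
    """Return the flagged dataset names whose high-level index is not a production one."""
    return [d for d in flagged_dsnames
            if high_level_qualifier(d) not in production_hlqs]
```

Given the flagged names PROD.CLAIMS.MASTER, JSMITH.CLAIMS.EXTRACT, and TEST.CLAIMS.COPY, this would report the last two as candidates for remediation.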
Another issue is that certain industry guidelines or laws require that specific kinds of information be encrypted or tokenized. An example of this is the PCI Security Standards, which require all stored credit card numbers to be encrypted or otherwise rendered unreadable. So, the entire storage system must be periodically scanned to locate datasets and database tables containing this information. It does not matter whether this sensitive data is stored inside or outside of the production data – it cannot exist in unprotected form.
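To give a feel for what scanning for credit card numbers involves, here is a minimal Python sketch. It is not how any particular discovery product works. It looks for runs of 13 to 19 digits, optionally separated by spaces or hyphens, and keeps only those that pass the Luhn checksum used by payment card numbers. The function names are my own. It also assumes the data has already been converted to ordinary text, which on a mainframe means handling EBCDIC and packed-decimal fields first.

```python
import re

# Candidate card numbers: 13 to 19 digits, with optional single spaces or hyphens between them.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")

def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """Return the Luhn-valid candidate card numbers found in text, separators removed."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The Luhn check cuts down on false positives from account numbers and other long digit strings, but it does not eliminate them, so anything this kind of scan reports still needs a human look before remediation.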
The moral of this segment is that all sensitive data must be discovered and the instances that are outside of the production data must be remediated. Each instance of production data must also be classified by the category of data it contains so that the access permissions can be analyzed to see whether they are appropriate for that category – but that’s Part 4.
Read Part 4 of this blog post series.