Big Data Governance Software: Sensitive Data Jul 2, 2014
There are several reasons why Big Data Governance is one of the foremost concerns for enterprises across vertical industries:
- Compliance and regulatory issues: Industry-specific compliance standards are becoming more strictly enforced, requiring organizations to know what data they have, where it is, and who can access it to avoid costly fines and potential litigation.
- Security: Gartner notes that “Market failure to offer data-centric audit and protection tools that cross…data silos is forcing CISOs to find pragmatic strategies to implement a data security governance policy.”
- Data Ownership: Governance policies help to clarify data ownership, and—when properly enacted—can determine a clear path of data lineage in case of any potential inquiries from regulatory agencies or others.
- The Ubiquity of Big Data: Big Data technologies are no longer considered a niche item, and are playing an increasingly vital role in how organizations conduct their business and operations processes.
Still, many more organizations would embark on Big Data initiatives if they had a clear way to view audit trails, identify and move sensitive data, and see both alongside governance policies in a unified dashboard that greatly reduces the complexity traditionally associated with these tasks.
Perhaps even more would do so if they knew that they could automate that process, deploy such Big Data Governance software on-premise or in the Cloud, and implement policies in the software without writing code, reducing the load on IT departments.
Thanks to data-centric security solutions provider Dataguise, now they can. On June 23 the company unveiled its Dataguise for Data Governance Suite, which it touts as the industry’s first Big Data Governance solution focused on sensitive data. The suite builds on the capabilities of the company’s primary product, DgSecure, which extends governance and security measures across the enterprise. According to Dataguise Vice President of Marketing and Business Development Patty Nghiem:
“Our thing is Data Governance with a focus on sensitive data. Think of needles in a haystack. There’s so many haystacks, and what people need to do is find the needles. We’re not trying to…be all things for all people for Data Governance. We’re saying if you’ve got security and compliance issues, that’s all around sensitive data. That’s what we do.”
Automating Data Governance
The pivotal advantage of Dataguise’s Big Data Governance software (which also works on traditional data) is that it expedites and automates the implementation of governance rules for sensitive data that is potentially discoverable. The traditional manual process involves creating governance policies and then having IT search through and modify information systems to find and appropriately tag sensitive data. When that information is Big Data (with its rapid velocity and myriad forms) in time-sensitive industries such as finance and health care, a manual process quickly falls behind.
Dataguise’s software, however, automates this process and embeds governance policies into the systems in which the data resides and is modified. It identifies sensitive data and applies security measures to protect it, while providing a host of other functions for sensitive and other data, including:
- Facilitating Easy Policy Implementation: Users can implement pre-defined policies built around compliance standards such as HIPAA, PCI and PII in the healthcare, finance, and retail industries, respectively, or define their own in a format that does not require writing code or scripts.
- Monitoring and Auditing Data for Lineage: One feature new to the Dataguise for Data Governance Suite is an auditing capability that shows where data came from, who accessed it, and what changes they made to it, presented through expressive dashboards and custom reports.
- Enhancing Accessibility Features: The entitlements feature of Dataguise for Data Governance (another new addition) augments the traditional access functionality provided at the DgSecure platform level by specifying access according to data element type and user.
Nghiem remarked that:
“Just because something is a nine-digit number with two dashes in it doesn’t mean that’s a social security number. Part of what we’ve really built into this is the intelligence of knowing the format of a MasterCard number versus a Visa versus American Express so when you look at a number you can say with a high level of confidence…here’s where sensitive data lives and we need to do something about it.”
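Nghiem's point, that shape alone does not make a number sensitive, can be illustrated with a minimal Python sketch. This is not Dataguise's detection logic, which the article does not describe. It assumes simplified, widely documented issuer prefixes and lengths, the standard Luhn checksum, and the Social Security Administration's rule that area 000, 666 and 900-999, group 00 and serial 0000 are never issued. The names classify, luhn_valid and plausible_ssn are hypothetical.

```python
import re

# Assumed, simplified issuer rules (real ranges are broader and change over time).
CARD_PATTERNS = {
    "visa": re.compile(r"4\d{12}(\d{3})?"),   # starts with 4, 13 or 16 digits
    "mastercard": re.compile(r"5[1-5]\d{14}"),  # starts with 51-55, 16 digits
    "amex": re.compile(r"3[47]\d{13}"),        # starts with 34 or 37, 15 digits
}

SSN_SHAPE = re.compile(r"(\d{3})-(\d{2})-(\d{4})")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def plausible_ssn(value: str) -> bool:
    """Shape check plus the never-issued ranges; still only a plausibility test."""
    m = SSN_SHAPE.fullmatch(value)
    if not m:
        return False
    area, group, serial = m.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def classify(value: str) -> str:
    """Label a string as a possible SSN, a card network, or unknown."""
    value = value.strip()
    if SSN_SHAPE.fullmatch(value):
        return "possible_ssn" if plausible_ssn(value) else "unknown"
    digits = re.sub(r"[ -]", "", value)
    if digits.isdigit() and luhn_valid(digits):
        for network, pattern in CARD_PATTERNS.items():
            if pattern.fullmatch(digits):
                return network
    return "unknown"
```

With the well-known test number 4111 1111 1111 1111, classify returns "visa", while 000-12-3456 has the right shape for a social security number but is rejected because area 000 is never issued. A production scanner would also weigh surrounding context, such as column names, before flagging a value.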
With a number of clients in the financial and health care industries, and growing numbers in retail and in the public sector, Dataguise has greatly influenced the Big Data Governance and security measures of a variety of companies. Several clients use its products in credit card fraud detection, masking cardholders’ credit card numbers before handing the data to a third party for auditing purposes. The auditing information then comes back keyed to different, non-sensitive numbers.
“They don’t want to create another fraud,” Dataguise co-founder and CEO Manmeet Singh said. “They don’t want other people to walk away with 5,000 credit card numbers and create another problem.”
A large healthcare provider based in the Midwest has used Dataguise’s platform to mask the sensitive information of many of its patients while exposing data that is relevant to partner pharmacies. Such data includes the age ranges of patients, their zip codes, and the general conditions for which they will require medication. Thus, pharmacies have a significantly better idea of what medications they need to stock in certain areas based on what amounts to anonymous customer data. This process, which the health care provider delivers to various pharmacies through the Cloud, helps both the patients, who now have better access to their medication, and the pharmacies, which can maintain their supplies more efficiently and better estimate costs and inventory needs.
In fact, this process has been so beneficial to all parties involved that some health care providers are able to effectively sell such data, generating additional sources of revenue. Singh commented that:
“They are selling some part of the data to other pharmacies, and saying I can give you that information; I know what kind of patient heart attack support is necessary, what kinds of problems patients have at major hospitals or in major areas, and most of the information of your network by zip codes. That is a very powerful use case of what Big Data can do for you.”
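The kind of data sharing described in the healthcare example can be sketched in a few lines of Python. This is an illustration only, not the provider's or Dataguise's implementation. The record fields (name, age, zip, condition), the ten-year age bands and the function names are all assumptions made for the example.

```python
from collections import Counter


def age_range(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '40-49' (band width is assumed)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize(record: dict) -> dict:
    """Keep only the fields the article says are shared; drop everything else."""
    return {
        "age_range": age_range(record["age"]),
        "zip": record["zip"],
        "condition": record["condition"],
    }


def demand_by_area(records: list) -> Counter:
    """Count generalized records per (zip, condition) pair for stocking estimates."""
    shared = [generalize(r) for r in records]
    return Counter((r["zip"], r["condition"]) for r in shared)


# Hypothetical patient records, invented for this sketch.
patients = [
    {"name": "Patient A", "age": 47, "zip": "60601", "condition": "cardiac"},
    {"name": "Patient B", "age": 63, "zip": "60601", "condition": "cardiac"},
    {"name": "Patient C", "age": 35, "zip": "60614", "condition": "diabetes"},
]

if __name__ == "__main__":
    for area, count in demand_by_area(patients).items():
        print(area, count)
```

Names and exact ages never leave the provider in this sketch. Whether age band plus zip code is coarse enough to prevent re-identification depends on the population in each area, which the sketch does not check.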
Common security measures that Dataguise’s software utilizes include:
- Masking: Dataguise’s intelligent masking capabilities can subtly change certain information, such as a credit card number or a zip code, into a similar but non-valid form of that information. Masking protects sensitive data and is a one-way process: what is masked cannot be unmasked (the source data is instead merely identified by the masked information), and there is no overhead.
- Encryption: Encryption is a two-way process in which data is encrypted and can later be decrypted to appear again in its original form. While data is encrypted, it is in a form that cannot be used for the data’s original purpose. For example, an encrypted address cannot be used to identify someone’s geographic location.
- Redaction: Redaction measures edit the content found in documents or data repositories so that sensitive data is effectively removed while other data is unaltered, enabling the latter data to be viewed while protecting the former from unauthorized users.
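The difference between masking and redaction can be made concrete with a short Python sketch. It is a hedged illustration, not Dataguise's algorithm. It assumes a keyed hash (HMAC-SHA256 from the standard library) as the one-way step, a hypothetical demo key, and a simple pattern for card-like digit runs. Encryption is left out because the Python standard library has no symmetric cipher.

```python
import hashlib
import hmac
import re

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key for the sketch

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_LIKE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_digits(value: str, key: bytes = SECRET_KEY) -> str:
    """One-way, format-preserving mask: same length and separators, new digits."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    chars = list(value)
    positions = [i for i, ch in enumerate(chars) if ch in "0123456789"]
    for n, pos in enumerate(positions):
        # Derive each replacement digit from the keyed hash (slightly biased).
        chars[pos] = str(digest[n % len(digest)] % 10)
    masked_digits = "".join(chars[p] for p in positions)
    if positions and luhn_valid(masked_digits):
        # Nudge the last digit so the result is never a valid card number.
        last = positions[-1]
        chars[last] = str((int(chars[last]) + 1) % 10)
    return "".join(chars)


def redact(text: str) -> str:
    """Remove card-like numbers from free text, leaving the rest unaltered."""
    return CARD_LIKE.sub("[REDACTED]", text)
```

The same input always produces the same masked value under a given key, so masked numbers can still be matched across records, yet the original cannot be recovered from the output. Redaction, by contrast, discards the value entirely: redact("Card 4111 1111 1111 1111 on file.") returns "Card [REDACTED] on file.".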
Dataguise’s Big Data Governance capabilities are specifically designed to handle sensitive data in both structured and unstructured form, a virtual first for Data Governance software. Both its DgSecure platform and the Dataguise for Data Governance suite (which share several features) support traditional repositories including SQL Server, Oracle, IBM, Cloudera and more, as well as Hadoop and a variety of transactional databases and file sharing systems. They are available in on-premise or Cloud deployments, and provide a unified means of reducing the complexity associated with Big Data Governance by automating the policy writing and implementation process, heightening security for sensitive data, and clarifying ownership and data lineage while ensuring compliance.