Click to learn more about author George Kobakhidze.
Those in charge of their company’s cybersecurity efforts may stay up at night wondering, “Who is clicking on the most links in emails? Who opens the most phishing emails? Who uses simple passwords?”
Despite best efforts to educate employees and build sound data practices into the organization, IT teams have no way of knowing who hasn’t followed directions until after the fact. These are examples of how hackers exploit the entry point employees use most: unstructured data. Although unstructured data (or “dark data,” as it is often called) is frequently seen as an entry point for cybercrime, there is a way to harness its power to mitigate these risks and turn these datasets into one of an organization’s most powerful resources. We asked George Kobakhidze, lead solutions engineer at ZL Technologies, to explain further.
DATAVERSITY (DV): First, define for us: what is unstructured data?
George Kobakhidze (GK): Unstructured data is the digital equivalent of human activity. It can take the form of emails, human-created documents, and social media messages. Often, this unstructured data can also be considered “dark data,” the unmanaged enterprise content lurking in the shadows. I say this because organizations often simply do not have a way to manage all this unstructured data, leaving it sitting on servers, under-utilized and underappreciated. In that state, unstructured data can contain both valuable business insights and potential security risks.
DV: What are the potential security risks of unstructured data?
GK: Unstructured data is simply difficult to monitor. As a result, system administrators often do not notice when information has been replicated, leaked, tampered with, lost, or stolen. But I believe the biggest security risk of all comes not from external threats but from inside the walls of a business. Employees are usually not creating risk intentionally; unstructured data is a byproduct of fast-paced, chaotic human work processes, and it is not malicious in nature. Accidental disclosure of content, accidental copying and transfer of sensitive information, and improper access control on documents and files are far more common security risks to unstructured data than corporate IP espionage or interference from a foreign hacker. These risks show why Data Management professionals and business leaders must find ways to understand and, ultimately, manage their unstructured data.
DV: So then is all unstructured data bad?
GK: On the contrary, unstructured data can be some of the most valuable data your company holds. The current advances in the sophistication of unstructured data analytics and text analytics mean that dark data increasingly has potential as a business asset. Once it has been managed, de-duplicated, and cleansed, it represents a corpus of human knowledge within the organization.
DV: How can IT leaders and organizations derive business insights and innovations from unstructured data?
GK: As alluded to earlier, unstructured data essentially reflects the human mind of a business. The real challenge lies in how organizations can curate that knowledge into useful resources. Once the information has been managed and analyzed, it can offer substantial insight into employee work patterns, communication networks, subject matter expertise, and even influencers and business processes. It can also help eliminate duplicative human effort, which in turn can increase productivity and output.
DV: What do you foresee long-term for unstructured data? Will more organizations realize its value and be prepared to mitigate security risks?
GK: With the increase in sophistication of both analytics capabilities and governance capabilities, we will likely see a renaissance for this data, which is currently viewed as a liability, along with a shift in perception regarding its potential. Long gone are the days when organizations systematically discarded content that wasn’t proven to be a formal business “record.” Rather, the big data era tends towards hoarding, for better or worse.