Click to learn more about author Gary Lyng.
Data is created at such a high volume that many IT executives simply don’t know what to do with it. In fact, many executives likely have no idea how or where their company’s data is stored, whether it’s been replicated, or what their internal Data Management practices are. That lets dark data (data whose value is unknown or whose impact may be negative) and ROT data (redundant, obsolete, or trivial) accumulate and affect the bottom line.
Think of your company’s data center like the human body. Similar to how cholesterol build-up can lead to slower blood flow and clogged arteries, ROT data clogs a data system throughout its transfer, processing, and storage phases. This leads to internal damage and slower compute times and creates a lag in your company’s artificial intelligence, machine learning, and deep learning solutions.
Keeping your data healthy and performing at optimal levels starts with knowing what neglect costs. Here are five of the most significant costs that these forms of data can impose on your business. Some of these are subtle, while others may have dramatic impacts on your company.
1. Data Storage
Data has to be stored somewhere, of course, and now that data storage is supposedly more affordable than ever, executives simply add more storage when they run up against their limit. But this is like buying your dog more toys instead of getting rid of the old ones you already have. If your company uses a cloud storage solution, you may be paying very little per gigabyte, but ROT data adds up quickly, especially if you face cloud egress fees every time you need to access those files. These fees quickly turn into cost creep, or a “cost avalanche,” as your data requests and needs grow.
ROT data files cost companies thousands of dollars in monthly storage fees and six to eight times that in management. Many companies hoard these unnecessary files on on-site storage, occupying valuable space. The files pile up as data keeps growing and companies neglect data hygiene. Additionally, if you have a local server for backups, you may end up over-allocating storage resources to useless data. Automated search and classification tools will help you identify ROT data and defensibly delete it or archive it to a less expensive storage option, saving your company a substantial amount of money.
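As a rough illustration of what an automated classification pass might look like, the Python sketch below walks a folder and labels files as redundant (identical content to another file), obsolete (not modified for a long time), or trivial (very small). The function names and the thresholds are hypothetical choices for this example, not a description of any particular product, and a real tool would also need to handle permissions, very large files, and retention policies.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify_rot(root, obsolete_days=365 * 3, trivial_bytes=16, now=None):
    """Label files under `root` as redundant, obsolete, and/or trivial.

    The thresholds are hypothetical defaults, not recommendations.
    Returns {path: sorted list of labels} for flagged files only.
    """
    now = time.time() if now is None else now
    cutoff = now - obsolete_days * 86400
    by_digest = defaultdict(list)
    labels = defaultdict(set)

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            by_digest[file_digest(path)].append(path)
            if info.st_mtime < cutoff:
                labels[path].add("obsolete")
            if info.st_size <= trivial_bytes:
                labels[path].add("trivial")

    # Every copy after the first (in sorted path order) is redundant.
    for paths in by_digest.values():
        for duplicate in sorted(paths)[1:]:
            labels[duplicate].add("redundant")

    return {path: sorted(found) for path, found in labels.items()}
```

Calling classify_rot on a shared folder returns only the flagged files, which could then be reviewed before anything is deleted or moved to cheaper storage. Comparing content hashes rather than file names is a deliberate choice here, since copies are often renamed.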
2. Data Breaches
Cyber and ransomware attacks are on the rise, and the cost of a data breach is staggering. Breaches commonly occur when you don’t have a clear understanding of what data you have and where it is located. Data may be stored outside of secure file systems or on endpoints, which makes it more susceptible to a breach. This is especially common when employees make their own copies of files to take home on a laptop or move them to cloud drives, smartphones, or thumb drives, only to lose the equipment by accident or to theft.
Data breaches can lead to costly court settlements. Equifax infamously settled for over a billion dollars after its servers were attacked in 2017. What was a key piece of information behind much of the damage? A file of unencrypted employee usernames and passwords, kept where it was easy to reach, that gave the attackers access to far more data once they were inside.
3. Compliance Fines
Data breaches themselves are costly, but they can be made even more expensive if they involve personally identifiable information (PII) subject to data privacy compliance. Strict data privacy laws such as the General Data Protection Regulation (GDPR) in the European Union and the California Consumer Privacy Act (CCPA) in the U.S. hold companies responsible for managing consumer data. If legally protected data is breached, regulatory bodies can penalize your business with fines that run into the millions of dollars. Under GDPR, the ceiling is 20 million euros or 4 percent of global annual revenue, whichever is higher.
These laws also allow individuals to request their information from your company. Companies must comply with Data Subject Access Requests (DSARs), which may include providing customers with their data, deleting it, and proving it was deleted. As the cloud naturally replicates data with backups, how sure can you be that a customer’s request was fulfilled? If you are not aware of all of the data you possess, you could face much larger compliance fines later on if that data is compromised.
4. Opportunity Costs
If a company has ROT and dark data, whose contents and importance are unknown, it isn’t using its data to its fullest potential. This poses serious risks that need to be mitigated, and it carries a serious opportunity cost: the lost chance to analyze and learn from your data.
According to IDC, organizations that leverage all of their data to the fullest extent can expect to see productivity gains, an estimated $430 billion across the market. Leaving that data in the dark means missing chances to find new trends and discover new business paths that could lead to higher revenue and competitive advantage.
5. Wasted Hours
Decluttering terabytes of data takes a significant amount of time. Hiring more data managers and analysts may not be the most efficient approach. IDC estimates that a company with a thousand data workers wastes time equivalent to $5.7 million a year trying to find relevant data – ROT makes this task even more difficult. Data analysts should be analyzing data, not scouring through old, redundant, or trivial files. And the problem isn’t limited to data workers. If regular employees make a mistake due to ROT or outdated information, it could further reduce productivity in the workplace or worse.
An Intelligent Solution
Data analysis relies on sophisticated search tools. To uncover the value in data, enterprises need a powerful combination of tools to locate data wherever it is. Most companies don’t realize that their current data search approaches can’t access distributed information and can’t extract dark and ROT information from unstructured documents, which severely limits their data capabilities. Additionally, many tools only read metadata and do not conduct a deep search of file contents, and they often require copies of the data to be ingested into proprietary systems before they can even start to comprehend it. This is simply cost-prohibitive for some.
An intelligent Data Management platform must allow you to find, move, or delete both ROT and dark data across your business. Such tools can scan and label data across your entire enterprise, whether it is stored locally, on servers, at the edge, or in the cloud. They must also do this efficiently: in place, without adding massive data infrastructure or bypassing existing security controls.
Once you understand your data, you can analyze it and determine which sets of information can be discarded and which should be kept. Conflicts between files can be quickly identified and resolved. What’s more, your employees’ data can be indexed, breaking down data silos and preventing further duplication of ROT data. More importantly, the data becomes readily searchable, so you can exploit its value. You must understand where your data lives in order to protect your bottom line and stay productive throughout the lifecycle of your business.
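To make the idea of indexing content concrete, here is a minimal Python sketch of an inverted index, which maps each word to the locations whose text contains it so that a search looks inside documents rather than only at file names or metadata. The function names and the sample locations are invented for illustration, and the example assumes the text has already been extracted from each file; commercial platforms are far more elaborate.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return TOKEN.findall(text.lower())


def build_index(documents):
    """Map each word to the set of locations whose text contains it.

    `documents` is {location: extracted text}; the locations used in
    this example are hypothetical.
    """
    index = defaultdict(set)
    for location, text in documents.items():
        for token in tokenize(text):
            index[token].add(location)
    return index


def search(index, query):
    """Return the sorted locations that contain every word in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


documents = {
    "laptop/notes.txt": "Customer invoice for Acme",
    "cloud/report.docx": "Acme quarterly invoice summary",
    "server/readme": "Nothing relevant here",
}
index = build_index(documents)
hits = search(index, "acme invoice")
```

In this sketch, hits lists both the laptop and the cloud copy, the kind of overlap that would otherwise stay hidden in separate silos.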