Click to learn more about author Peter Jackson.
We have to make ethical decisions every day when working with data. The problem is that much of the time we might not realize it. Data professionals and organizations are often so focused on what can be done with data, and at what scale, that critical questions about data ethics are forgotten: Should we be doing this, and how should we be doing it?
These are quite basic questions, but they slip out of the thought process far too often. What data should we collect and keep? What should we collect and delete? What should we leave alone completely? How should we collect the data? These aren’t compliance issues, or questions to be answered just to stay on the right side of GDPR, CCPA, or any other data protection or privacy legislation. These are questions of ethics. Just because we can doesn’t mean we should.
Why Do We Need to Start Taking Action on Data Ethics Now?
Too often, the interpretation of data ethics is limited to Data Science teams understanding bias in training sets and other teams knowing just enough to keep the regulators happy. This shouldn’t be enough to satisfy those of us working in the industry. So why should organizations take data ethics seriously?
There can be no clearer representation of the consequences of unethical behavior than Cambridge Analytica. Once the fact that it had collected and used the data of millions of Facebook users without their consent was made public, the company collapsed.
This isn’t the only arena in which ethical considerations will need to be taken into account. Data ethics will have to play a part in the ongoing discussions about Lethal Autonomous Weapons (LAW), for example.
These are clearly extreme examples, but when it comes to handling personal data, the conduct of organizations will continue to be under the spotlight, and one public misstep could be extremely costly. The more conscious people become of the value of their personal data and how it should be handled, the more reputational damage organizations will suffer when they drop the ball on data ethics.
This is also being taken seriously at a governmental level. Late in 2018, the U.K. government announced the foundation of its Centre for Data Ethics and Innovation, which, as per its own website, is “tasked by the Government to connect policymakers, industry, civil society, and the public to develop the right governance regime for data-driven technologies.”
In the U.S., as part of the Federal Data Strategy, the General Services Administration announced late in 2020 that it had released a framework to help agencies make ethical decisions. The topic has also made it onto the agenda at the European Commission.
So, there’s no doubt that data ethics is on the radar, but how should we be turning this awareness into action?
How to Get Started with Data Ethics
Data ethics is a very broad topic that can take us in multiple directions. To start making tangible progress, we need to focus on some key issues:
1. The collection of data: Organizations will naturally collect as much data as possible and keep hold of as much data as possible, just in case it might prove useful in the future. We need to move away from this behavior being the norm and ingrain new approaches. We can do this by constantly asking simple questions:
- How should we collect data?
- What should we collect and keep?
- What should we collect and delete?
- What should we leave well alone?
This is not an exhaustive list by any means, but the basic premise that we need to question how data is being collected is crucial to making more ethically sound choices.
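One way to keep these questions in front of a team is to write the answers down as an explicit, reviewable policy instead of leaving them implicit in pipeline code. The following is a minimal sketch in Python, not a prescribed method: the field names, purposes, and the 90-day retention period are all hypothetical, and the rule that unlisted fields are never collected is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Handling(Enum):
    KEEP = "collect and keep"
    DELETE_AFTER = "collect and delete"
    DO_NOT_COLLECT = "leave well alone"


@dataclass(frozen=True)
class FieldPolicy:
    handling: Handling
    purpose: str = ""                      # why the field is collected at all
    retention_days: Optional[int] = None   # only meaningful for DELETE_AFTER

    def __post_init__(self) -> None:
        # A "collect and delete" field must say when it is deleted.
        if self.handling is Handling.DELETE_AFTER and self.retention_days is None:
            raise ValueError("DELETE_AFTER requires retention_days")


# Hypothetical policy: every field and value here is illustrative only.
POLICY = {
    "email": FieldPolicy(Handling.KEEP, purpose="account login"),
    "delivery_address": FieldPolicy(
        Handling.DELETE_AFTER, purpose="order fulfilment", retention_days=90
    ),
    "browsing_history": FieldPolicy(Handling.DO_NOT_COLLECT),
}


def may_collect(field: str) -> bool:
    """Return True only for fields the policy explicitly allows.

    Fields missing from the policy are refused (assumed default-deny).
    """
    policy = POLICY.get(field)
    return policy is not None and policy.handling is not Handling.DO_NOT_COLLECT


def is_due_for_deletion(field: str, collected_on: date, today: date) -> bool:
    """Return True if a DELETE_AFTER field has outlived its retention period."""
    policy = POLICY.get(field)
    if policy is None or policy.handling is not Handling.DELETE_AFTER:
        return False
    return today - collected_on > timedelta(days=policy.retention_days)
```

A table like this does not answer the ethical questions for you, but it forces each field to have a stated purpose and an owner who can be asked why it is there.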
2. Reassess how we use algorithms: Organizations cannot absolve themselves of responsibility when something goes wrong and lay the blame at the door of an algorithm. There has to be human accountability with regard to whether algorithms should be used at all, whether human decisions would be preferable in some instances, and how we can find a healthy balance in collaborative intelligence or human-machine collaboration.
When algorithms are used, Data Governance is absolutely critical. There has to be a clear understanding of the bias in training data sets, which loops back to the collection of data. Compliance teams also have to be completely on top of how data should be deployed and used.
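As a small illustration of what understanding a training set can involve, the sketch below measures how well each group is represented in a data set and flags groups that fall under a chosen share. It is only a starting point under stated assumptions: the `region` attribute and the 20% threshold are hypothetical, and representation is just one narrow facet of bias, so a clean result here does not mean the data is unbiased.

```python
from collections import Counter


def group_shares(records, attribute):
    """Return each group's share of the records for the given attribute."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(records, attribute, min_share=0.2):
    """List groups whose share is below min_share (an arbitrary, assumed threshold)."""
    shares = group_shares(records, attribute)
    return sorted(group for group, share in shares.items() if share < min_share)


# Hypothetical training records, for illustration only.
training_records = (
    [{"region": "north"}] * 8
    + [{"region": "south"}] * 1
    + [{"region": "east"}] * 1
)

flagged = underrepresented(training_records, "region")
```

Here `flagged` contains the `east` and `south` groups, each of which makes up a tenth of the records. Whether that imbalance matters is a human judgment about the use case, which is the point of keeping people accountable for the decision.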
3. Take a customer-centric approach to data ethics: Personally Identifiable Information (PII) is not “data”; it is a person. As consumers and citizens, we’re becoming more value-driven in how we engage with organizations. This naturally extends to how we expect our personal data to be handled. In this context, organizations can’t afford to be reactive and merely do enough to satisfy regulators. Getting on the front foot with data ethics will play a huge role in boosting an organization’s, or a government’s, reputation with current and future customers and citizens. It can be a competitive advantage in two senses: you can get a step ahead of the regulators and avoid potential fines related to poor practice, and you can truly stand out as an organization that puts data ethics at the forefront of everything it does rather than treating it as a compliance box-ticking exercise.
What’s Next?
This is just the start of what you need to consider to bake data ethics into your organization’s daily work. But we need to start somewhere. Over the coming weeks and months, we’re going to be investigating this topic in a lot more detail, digging into the core areas that demand our attention and guiding you through what really matters when it comes to data ethics. Stay tuned.