Algorithms that were once the domain of mathematics specialists, computer programmers and rocket scientists are now an integral part of our everyday lives, whether we think about them or not. They make recommendations for us, from the route our GPS suggests we drive to work, to the playlist our music streaming service of choice serves us, to the people our social media account thinks we might like to be friends with.
However, controversies such as the scandal around Cambridge Analytica and Facebook in 2018 serve to make us think twice about the ethics related to the way data is captured, analyzed and utilized.
Data, and the algorithms that can be used to mine it, are not the enemy. Anyone who speaks to their chosen smart home device daily can attest that data collected and mined responsibly can enhance our lives. In the research world, we as social scientists have a responsibility to take the lead in ensuring that algorithms are developed responsibly, that privacy is protected, and that research is always conducted under ethical conditions. The key to achieving this is transparency.
What is an ethically developed algorithm?
In 2016, researcher Andrew Tutt called for an ‘FDA for Algorithms’, noting:
‘The rise of increasingly complex algorithms calls for critical thought about how to best prevent, deter and compensate for the harms that they cause…Algorithmic regulation will require federal uniformity, expert judgement, political independence and pre-market review to prevent – without stifling innovation – the introduction of unacceptably dangerous algorithms into the market.’
In the absence of such an authority, we can look to the Principles for Development, created by Fairness, Accountability and Transparency in Machine Learning, which help developers and product managers design and implement algorithmic systems in publicly accountable ways:
Make available externally visible avenues of redress for adverse individual or societal effects of an algorithmic decision system, and designate an internal role for the person who is responsible for the timely remedy of such issues.
Ensure that algorithmic decisions as well as any data driving those decisions can be explained to end-users and other stakeholders in non-technical terms.
Identify, log, and articulate sources of error and uncertainty throughout the algorithm and its data sources so that expected and worst case implications can be understood and inform mitigation procedures.
Ensure that algorithmic decisions do not create discriminatory or unjust impacts when comparing across different demographics (e.g. race, sex, etc).
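As a rough illustration of the last principle, one simple check is to compare how often a system makes a favorable decision for each demographic group. The Python sketch below is a minimal example under stated assumptions: the function names, the sample records, and the 0.8 threshold (borrowed from the ‘four-fifths’ rule of thumb used in US employment guidelines) are illustrative choices and not part of the principles themselves.

```python
from collections import defaultdict

def selection_rates(records):
    """Return the share of favorable decisions for each group.

    `records` is an iterable of (group, decision) pairs, where decision
    is True for a favorable outcome. The format is hypothetical.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, decision in records:
        totals[group] += 1
        if decision:
            favorable[group] += 1
    return {g: favorable[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Ratio of the lowest group rate to the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable decision
    return min(rates.values()) / highest

# Made-up decisions for two made-up groups, for illustration only.
sample = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
rates = selection_rates(sample)
ratio = disparate_impact_ratio(rates)
flagged = ratio < 0.8  # assumed threshold; a prompt for review, not a verdict
```

In this invented sample, group A receives a favorable decision 60% of the time and group B 30% of the time, so the ratio is 0.5 and the system would be flagged for a closer look. A single ratio like this cannot establish that a system is fair or unfair; it only indicates where a human reviewer should start asking questions.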
The limitations of algorithms
While it’s true that algorithms aren’t subject to the same failings as humans, such as fatigue, self-interest, or avoidable errors, there are limits to using them to make decisions, which further highlights the importance of responsible development.
A lack of transparency, often called the ‘black box’ effect, is frequently cited as a limitation of algorithms. It’s particularly important for researchers to understand the workings of any algorithm they intend to use, so they know whether it’s fit for purpose.
Machine learning algorithms are trained to make recommendations based on data that isn’t always representative, so systematic biases can go unnoticed and proliferate over time. Control over the data used in a research setting is paramount in overcoming this. We’re also prone to treating algorithms as infallible, for the same reason we’re often surprised when our computer crashes or another piece of equipment in our life breaks down. However, it’s important to understand that they’re designed to work well on average, which only further highlights the need for transparency and control in a research setting.
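To make the ‘works well on average’ point concrete, the Python sketch below scores a set of predictions overall and then separately for each group. It is a toy example under stated assumptions: the group labels, the rows, and the helper names are all invented for illustration and do not describe any particular product or dataset.

```python
def accuracy(pairs):
    """Fraction of (predicted, actual) pairs that match."""
    pairs = list(pairs)
    return sum(p == a for p, a in pairs) / len(pairs)

def accuracy_by_group(rows):
    """Group (group, predicted, actual) rows and score each group separately."""
    grouped = {}
    for group, predicted, actual in rows:
        grouped.setdefault(group, []).append((predicted, actual))
    return {g: accuracy(pairs) for g, pairs in grouped.items()}

# Made-up data: 90 rows from a well-represented group, 10 from a small one.
rows = (
    [("majority", 1, 1)] * 88 + [("majority", 1, 0)] * 2
    + [("minority", 1, 1)] * 4 + [("minority", 1, 0)] * 6
)
overall = accuracy((p, a) for _, p, a in rows)
per_group = accuracy_by_group(rows)
```

With these invented numbers the overall accuracy is 92%, yet the under-represented group is scored correctly only 40% of the time. The headline figure hides the problem, which is why a researcher needs enough visibility into the data and the results to break them down this way.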
Algorithms gone wild
There’s no shortage of horror stories about algorithms deployed ostensibly for good, only to yield negative results. For example, in 2016 engineers at Microsoft created a Twitter bot named “Tay”, driven by algorithms that allowed it to respond to ‘Millennials’ based on what was tweeted at it. However, within hours, Tay was posting racist, sexist, and Holocaust-denying tweets, showing that this technology is fallible: it simply learned and then reflected back the sexist and racist biases of its target audience.
In early 2017, Amazon’s Alexa smart home device made headlines when a 6-year-old girl in Dallas was able to order a $160 dollhouse and a tin of cookies for herself, after having a conversation with the device about her love of those very items. As the story gained nationwide media attention, it was reported that Alexa devices placed the same order for other families after hearing the television news reports.
These examples illustrate the need to pay careful attention to algorithms developed for machine learning, and the role the Principles for Development can play in limiting unintended consequences.
For researchers utilizing this technology in their work, artificial intelligence raises the concern of the ‘black box’. However, as long as developers incorporate principles of transparency and control by the researcher, software can be an ally and be seen as a research assistant. At QSR, we are continuing to develop these tools with our research base to make them more accurate and transparent, and ultimately to improve the working lives of researchers.