Along with the recent, notorious data security breaches, AI is probably the most controversial yet most cherished subject in tech today: it is seen either as a blessing, by the likes of Mark Zuckerberg, or as a danger to be questioned, by Elon Musk and Stephen Hawking. Interestingly, those raising the concerns are not really afraid of robots taking over and massacring humanity in the best tradition of ‘Terminator 2: Judgment Day’. What the AI skeptics are really worried about lies a bit deeper: the responsible deployment of AI in an era when machine learning algorithms and software are becoming available to everyone. Securing unbiased AI that is not affected by human prejudice about race or gender has long been discussed by researchers. But is there any solid ground for this fuss in the first place?
AI is trained on the big data that human developers feed it. Simply put, if artificial intelligence is trained on biased data, it is doomed to make biased decisions. For example, an AI search algorithm deployed in an online recruitment service was recently shown to rank male candidates higher than female ones in its search results! Another instance is ‘Tay’, one of those chatbots that are created by the dozen for all sorts of purposes: originally built for innocent chit-chat, it turned into a racist, if not fascist, monster spewing offensive comments in less than 24 hours! Remember, AI learns from whatever we humans throw at it. And it learns very fast. With this kind of software powered by machine learning algorithms becoming increasingly accessible, there’s a growing risk of biased AI seeping into mainstream software and thus promoting discrimination. Google, for instance, has already rolled out an open-source machine learning software library that is free for anyone to use. As it turns out, so-called algorithmic bias, where AI manifests the exact same biases as humans, may be becoming more prominent in the everyday decisions that humans make.
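To make the "biased data in, biased decisions out" point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the hiring records, the candidate names, and the naive scoring rule are assumptions, not the algorithm of any real recruitment service. It only shows how a model that learns from skewed historical decisions ends up ranking two equally experienced candidates differently.

```python
from collections import defaultdict

# Hypothetical historical records: (gender, years_of_experience, was_hired).
# The data is deliberately skewed: equally experienced women were hired less often.
HISTORY = [
    ("m", 5, True), ("m", 5, True), ("m", 3, True), ("m", 3, False),
    ("f", 5, True), ("f", 5, False), ("f", 3, False), ("f", 3, False),
]

def train(records):
    """Learn the historical hire rate per gender (a deliberately naive 'model')."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for gender, _, was_hired in records:
        total[gender] += 1
        hired[gender] += int(was_hired)
    return {g: hired[g] / total[g] for g in total}

def rank(candidates, model):
    """Sort candidates by experience scaled by the learned hire rate, best first."""
    return sorted(
        candidates,
        key=lambda c: c["experience"] * model[c["gender"]],
        reverse=True,
    )

model = train(HISTORY)
candidates = [
    {"name": "Alice", "gender": "f", "experience": 5},
    {"name": "Bob", "gender": "m", "experience": 5},
]
ranking = [c["name"] for c in rank(candidates, model)]
print(model)    # learned hire rates per gender
print(ranking)  # Bob is ranked above Alice despite identical experience
```

Nobody told this toy model to prefer men; it simply reproduced the pattern hidden in the decisions it was trained on.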
Is there any way to fight this algorithmic bias? Well, it’s a logical step to suggest that since AI is fed with data, altering the data itself might help. A huge amount of data is needed to train a single AI algorithm. But where do developers turn for this filtered, unbiased data? Is it free, and who really owns it? The Facebook data privacy scandal and the company’s subsequent refusal to apply the EU’s data protection rules (GDPR) worldwide only confirm that a tech giant holds on to the right to dispose of users’ data at its own discretion. Isn’t that also because these companies are not only monetizing the data, but desperately need it to sustain the development of their AI? Data today really is the new oil: tech giants are not giving it away, despite growing concern in the tech community and calls to give users their personal data back. But is it even possible to gather unbiased data to train unbiased AI in the first place?
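One way to "alter the data" without collecting a new data set is to reweight the records you already have, so that group membership and outcome look statistically independent in the weighted data. The sketch below is a simplified illustration of that idea in Python. The records and the helper names (`reweigh`, `weighted_positive_rate`) are hypothetical, and this is one possible mitigation among several, not a guaranteed cure for bias.

```python
from collections import Counter

def reweigh(records):
    """Return one weight per (group, label) record.

    Each weight is P(group) * P(label) / P(group, label), so over-represented
    combinations are scaled down and under-represented ones are scaled up.
    """
    n = len(records)
    group_counts = Counter(group for group, _ in records)
    label_counts = Counter(label for _, label in records)
    pair_counts = Counter(records)
    return [
        (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for g, y in records
    ]

def weighted_positive_rate(records, weights, group):
    """Share of positive outcomes for one group, using the given weights."""
    positive = sum(w for (g, y), w in zip(records, weights) if g == group and y)
    total = sum(w for (g, y), w in zip(records, weights) if g == group)
    return positive / total

# Hypothetical skewed data: (group, positive_outcome).
records = (
    [("m", True)] * 3 + [("m", False)]
    + [("f", True)] + [("f", False)] * 3
)

weights = reweigh(records)
rate_m = weighted_positive_rate(records, weights, "m")
rate_f = weighted_positive_rate(records, weights, "f")
print(rate_m, rate_f)  # both groups end up with the same weighted positive rate
```

Before reweighting, the positive rate is 75% for one group and 25% for the other; with the weights applied, both are equal. A training procedure that honors these weights no longer sees the outcome as tied to the group, although bias hidden in other features can still leak through.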
Introducing AI to platforms and services whose very essence is providing authentic and unprejudiced opinion means facing even more challenges. The whole point of deploying AI in opinion-driven markets is its ability to identify and exclude biased opinions. We have a sort of catch-22 here: technology created to overcome bias runs the risk of becoming biased itself. Take the online review and scoring industry: review platforms like yelp.com have long been a place to share valued opinions and a driving force for businesses to gather customer feedback. But as we all know, those platforms started to lose credibility over time, since they shared a bunch of common issues. A new generation of scoring platforms is now emerging, geared up with AI to fight bias and lies. But what if the AI itself becomes biased? Where do we get enough data to train AI to adequately analyze an opinion’s authenticity? Should developers use the existing bank of Yelp-generated reviews? When AI works with vast amounts of existing data from the endless sea of online information, countering bias becomes much more difficult. It looks like those aspiring to build an AI filtering mechanism first need to accumulate their own ‘quality’ data set. Product owners and designers should be well aware of these risks when deploying AI in any type of system. It is the duty of machine learning engineers to come up with safer developer tools that offer a better way to design algorithms that don’t discriminate on gender, race, and other attributes. All of these concerns apply especially to industries that depend on AI as their last resort for ensuring authenticity.
Some startups pioneering the online review market are taking a more conscious approach to AI. Developers at Revain claim that they are not relying on Google and Yelp data to train their IBM Watson AI. Instead, they are seeking to gather as much data as possible from the reviews that users write on the newly launched v0.6 of the Revain dashboard. For now, the AI has been integrated into the platform’s architecture, yet it does not interfere with core platform functions. A larger number of reviews is needed to ‘feed’ the AI, so that its output can be analyzed and compared with an unbiased human opinion. After that, the AI’s verdict is edited by a human, and the corrected data is fed back into the AI algorithm. Ultimately, authenticity is achieved through a process in which review fragments are processed both by machines and manually, and are then saved to the blockchain.
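The feedback cycle described above (the machine gives a verdict, a human corrects it, and the corrected data goes back into training) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Revain's or IBM Watson's actual implementation: `ToyReviewModel`, `review_loop`, the word-counting "model", the sample reviews, and the stand-in human moderator are all hypothetical, and the blockchain storage step is left out.

```python
from collections import Counter

class ToyReviewModel:
    """Hypothetical stand-in for an AI filter: counts words seen in fake vs. genuine reviews."""

    def __init__(self):
        self.fake_words = Counter()
        self.real_words = Counter()

    def predict(self, text):
        """Return True if the review looks fake to the model."""
        words = text.lower().split()
        fake_score = sum(self.fake_words[w] for w in words)
        real_score = sum(self.real_words[w] for w in words)
        return fake_score > real_score

    def learn(self, text, is_fake):
        """Add a human-checked example to the model's training data."""
        target = self.fake_words if is_fake else self.real_words
        target.update(text.lower().split())

def review_loop(model, reviews, human_label):
    """Run one pass: the model predicts, a human checks, the checked label is fed back."""
    corrections = 0
    for text in reviews:
        machine_verdict = model.predict(text)
        human_verdict = human_label(text)
        if machine_verdict != human_verdict:
            corrections += 1               # the human overrides the machine
        model.learn(text, human_verdict)   # the human-checked label goes back into training
    return corrections

def human_label(text):
    """Stand-in for a human moderator (assumption): treats 'best ever' hype as fake."""
    return "best ever" in text.lower()

reviews = [
    "Best ever buy now",
    "Solid product, slow support",
    "best ever amazing deal",
]

model = ToyReviewModel()
first_pass = review_loop(model, reviews, human_label)
second_pass = review_loop(model, reviews, human_label)
print(first_pass, second_pass)  # the human has to correct the model less often over time
```

In this toy run the human overrides the model once in the first pass and not at all in the second. The catch is visible too: the model ends up only as unbiased as the human whose corrections it absorbs.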
Adopting a high level of responsibility when creating AI, and setting unbiased AI as the ultimate goal, is absolutely essential not only for researchers but also for those who actually bring these algorithms to the mass market: business leaders and media influencers. Algorithmic bias can lead not only to the spread of human biases, but to their amplification! Sadly, most people are not yet aware of software bias and tend to blindly trust AI’s judgment.