The Accountability of AI — Case Study: Microsoft’s Tay Experiment
In this case study, I outline Microsoft’s artificial intelligence (AI) chatbot Tay and describe the controversy it caused on Twitter. I also analyse the reasons why Tay went wrong. Following this, I discuss some issues and challenges raised by the failure of Tay. To conclude, I draw on actor-network theory and propose that it is important to theorise Tay as a moral agent as well as to encode values and ethics.
The ephemeral exposure of Tay
After several decades’ development, artificial intelligence has been booming recently, bringing a variety of applications. Although people’s opinions of AI vary, admittedly, some applications of weak AI do benefit us in everyday life. Siri, for instance, with a powerful database yet limited intelligence, is able to have simple conversations with us, providing us with some useful information. Aware of the huge potential, technology giants such as Microsoft and Google are racing to create smarter AI bots. Nevertheless, the future of AI bots might not be so optimistic.
Less than 24 hours after its launch, Microsoft’s chatbot Tay tweeted, ‘bush did 9/11 and Hitler would have done a better job than the monkey we have got now. donald trump is the only hope we’ve got.’ This was just one of Tay’s offensive and inflammatory tweets, which caused widespread concern. Tay was soon taken offline, ending its ephemeral exposure on Twitter.
Tay is an AI chatbot developed by Microsoft. On March 23, 2016, Tay was released on Twitter under the name TayTweets with the description ‘Microsoft’s A.I. fam from the internet that’s got zero chill!’ According to Microsoft, Tay is a ‘teen girl’ chatbot created for the purpose of engagement and entertainment. The target users are 18- to 24-year-olds in the U.S. To interact with Tay, users can tweet or directly message her by finding @tayandyou on Twitter. Unfortunately, the experiment turned into a disaster within a few hours, as Tay quickly ran wild and became racist, sexist, and genocidal.
The development of Tay
In fact, before the creation of Tay, Microsoft developed and released an AI chatbot, XiaoIce, on WeChat, China’s most widespread instant messaging application. Also programmed as a teen girl, XiaoIce is very popular among young people in China. Users have had more than 40 million conversations with XiaoIce. More importantly, no major incidents have happened. Instead, most users find the experience playful and delightful because ‘she can tell jokes, recite poetry, share ghost stories, relay song lyrics, pronounce winning lottery numbers and much more’ and ‘like a friend, she can carry on extended conversations that can reach hundreds of exchanges in length’. The success of XiaoIce led to the development of Tay, an experiment in a different cultural environment.
Intended to be the next step in the evolution, Tay was developed by Microsoft’s Technology and Research group and Bing team, with the aim of learning from human interaction on Twitter and investigating conversational understanding. In order to engage and entertain people, Tay’s database consisted of public data as well as input from improvisational comedians. The public data was modelled, filtered, and anonymised by the developers. In addition, the nickname, gender, favourite food, postcode and relationship status of the users who interacted with Tay were collected for the sake of personalisation. Powered by technologies such as natural language processing and machine learning, Tay was supposed to understand speech patterns and context through increased interaction. According to Peter Lee, the Vice President of Microsoft Research, they ‘stress-tested Tay under a variety of conditions, specifically to make interacting with Tay a positive experience’.
Why Tay went wrong
Although Microsoft considered the abuse issue and conducted multiple tests, it seems that they underestimated the complex conditions of Twitter. We can analyse the reasons for Tay’s breakdown from both technological and social perspectives.
It appears that Tay had a built-in mechanism that made her repeat what Twitter users said to her. One user, for example, taught Tay to repeat Donald Trump’s ‘Mexico Border Wall’ comments. However, a more serious problem is that Tay was not able to truly understand the meaning of words, let alone the context of the conversations. The machine learning algorithm enabled Tay to recognise patterns, but the algorithm could not give Tay an epistemology. In other words, Tay only knew what nouns, verbs, adverbs, and adjectives are, but she did not know who Hitler was or what ‘Holocaust’ means. As a consequence, Tay sometimes could not provide appropriate answers to the questions Twitter users asked. What is worse, she promoted Nazism, attacked feminists and Jewish people, and denied historical facts such as the Holocaust.
Some people blame Microsoft for not including filters on certain topics and keywords. Others think the farce was caused by Twitter itself, a social media platform rife with harassment. Indeed, trolls and abuse have been longstanding issues for Twitter. Throughout its existence, Twitter has repeatedly attempted to fix its troll problem. However, it remains toxic, especially for women and people of colour, who are easily targeted and attacked by trolls. Under the protection of anonymity, neo-Nazis, racists, sexists, and trolls tend to act unscrupulously and spread their hateful comments. Given such a harsh environment, it is little wonder that Tay turned into a problematic teen girl.
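The kind of keyword filter that critics had in mind could look roughly like the following Python sketch. It is only an illustration and not Microsoft’s implementation: the blocklist entries, the fallback message, and the function names `is_safe_reply` and `filter_reply` are all hypothetical.

```python
import re

# Hypothetical blocklist; a real deployment would need a far larger,
# curated list and some understanding of context.
BLOCKED_TERMS = {"hitler", "holocaust", "genocide"}

# Hypothetical neutral reply used when a candidate reply is blocked.
FALLBACK = "I'd rather not talk about that."


def is_safe_reply(text: str) -> bool:
    """Return True if no blocked term appears as a whole word in text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return words.isdisjoint(BLOCKED_TERMS)


def filter_reply(candidate: str) -> str:
    """Return the candidate reply if it passes the filter, else the fallback."""
    return candidate if is_safe_reply(candidate) else FALLBACK
```

Such a filter only matches exact words, so a misspelling such as ‘h1tler’ would slip through. This is one reason why keyword filtering alone could not have solved the problem.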
Tay and moral agency
On Microsoft’s blog, Lee apologised for Tay’s ‘offensive and hurtful tweets’ and claimed to take full responsibility for the critical oversight. The collapse of Tay reveals a series of issues. As Lee notes, ‘AI systems feed off of both positive and negative interactions with people’. However, we often intentionally ignore the negative aspects, since we expect AI bots to learn what we want them to learn, which is very difficult to achieve in highly unpredictable environments such as Twitter. To face the challenges in AI bot design, we need to consider the work in the greater context of sociotechnical systems. Moreover, we should consider accountability: who is responsible for the AI bot’s behaviour? How can we hold it accountable? Should we encode ethics into the algorithm?
How can we hold Tay accountable?
Although Tay is now offline, the Microsoft team is still working on improvements and expects to bring her back to the public. Living on the Internet, chatbots like Tay form a crucial part of our online communication and interaction. The Tay incident shows that today we cannot simply view AI bots as tools, since they interact with us, affecting and even shaping our actions and behaviour. Instead, according to actor-network theory (Latour, 2005), we should consider Tay a social-compatible actor. In their book Moral Machines, Wallach and Allen raise the concept of artificial moral agents and propose an approach to designing them: ‘take a specified ethical theory and analyses its computational requirements to guide the design of algorithms and subsystems capable of implementing that theory’ (Wallach and Allen, 2008).
In the case of Tay, theorising her as a moral agent is important both for designing better interaction and for holding her accountable. To keep Tay from harassing people, we should apply an ethical framework that gives us a degree of control over her actions and their consequences. Specifically, Microsoft must teach Tay what is right and wrong, and how to respond to harassment online. In addition to the technological effort, Microsoft should put more effort into encoding values and ethics. Like all teenagers, Tay needs a good teacher and rules to follow. A good teacher could be a restricted corpus from which she can learn. And the rules should be moral restrictions embedded in her algorithms, which make sure she cannot be taught to be racist, sexist, or genocidal, no matter how toxic the environment.
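To make the ‘teacher and rules’ idea concrete, here is a minimal Python sketch of a toy chatbot that answers only from a vetted corpus and admits user-taught material only after it passes a hard rule check and a reviewer’s approval. It is a thought experiment under my own assumptions, not a description of how Tay was or should be built; the class `SupervisedBot`, the placeholder rule in `breaks_rules`, and the reviewer callback are all hypothetical.

```python
def breaks_rules(text):
    """Placeholder rule check (hypothetical): flags text containing 'hate'."""
    return "hate" in text.lower()


class SupervisedBot:
    """Toy chatbot that learns only from reviewer-approved examples."""

    def __init__(self, rule_check):
        self.rule_check = rule_check  # callable: text -> True if a rule is broken
        self.corpus = {}              # vetted prompt -> reply pairs
        self.pending = []             # user-taught pairs awaiting review

    def teach(self, prompt, reply):
        """Queue a user-supplied pair; pairs that break a rule are rejected."""
        if self.rule_check(prompt) or self.rule_check(reply):
            return False
        self.pending.append((prompt.lower(), reply))
        return True

    def approve_pending(self, reviewer):
        """Move queued pairs into the corpus only if the reviewer accepts them."""
        for prompt, reply in self.pending:
            if reviewer(prompt, reply):
                self.corpus[prompt] = reply
        self.pending.clear()

    def respond(self, prompt):
        """Answer from the vetted corpus only; otherwise deflect."""
        return self.corpus.get(prompt.lower(), "I'm still learning about that.")
```

The design choice here is that nothing a user says reaches the bot’s replies directly: the rule check plays the part of the embedded moral restrictions, and the reviewer plays the part of the teacher.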
Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford, UK: Oxford University Press.
Wallach, W., & Allen, C. (2008). Moral machines: Teaching robots right from wrong. Oxford, UK: Oxford University Press.
When businesses invest in new technologies, they usually do so with the intention of creating value for customers and stakeholders and making smart long-term investments. This is not always easy when implementing cutting-edge technologies like artificial intelligence (AI) and machine learning. Business intelligence case studies that show how these technologies have been leveraged with results are still scarce, and many companies wonder where to apply machine learning first (a question at the core of one of TechEmergence’s most recent expert consensuses).
Artificial intelligence and machine learning have certainly increased in capability over the past few years. Predictive analytics can help glean meaningful business insights using both sensor-based and structured data, as well as unstructured data, like unlabeled text and video, for mining customer sentiment. In the last few years, a shift toward “cognitive cloud” analytics has also increased data access, allowing for advances in real-time learning and reduced company costs. This recent shift has made an array of advanced analytics and AI-powered business intelligence services more accessible to mid-sized and small companies.
In this article, we provide five case studies that illustrate how AI and machine learning technologies are being used across industries to help drive more intelligent business decisions. While not meant to be exhaustive, the examples offer a taste of how real companies are reaping real benefits from technologies like advanced analytics and intelligent image recognition.
1 – Global Tech LED: Google Analytics Instant Activation of Remarketing
Company description: Headquartered in Bonita Springs, Florida, Global Tech LED is an LED lighting designer and supplier to U.S. and international markets, specializing in LED retrofit kits and fixtures for commercial spaces.
How Google Analytics is being used:
- Google Analytics’ Smart Lists were used to automatically identify Global Tech LED prospects who were “most likely to engage”, and to then remarket to those users with more targeted product pages.
- Google’s Conversion Optimizer was used to automatically adjust potential customer bids for increased conversions.
- Remarketing campaigns triggered by Smart Lists drove 5 times more clicks than all other display campaigns.
- The click-through rate of Global Tech LED’s remarketing campaigns was more than two times the remarketing average of other campaigns.
- Traffic to the company’s website grew by more than 100%, and the company was able to re-engage users in markets in which it was trying to make a dent, including South Asia, Latin America, and Western Europe.
- Use of the Conversion Optimizer allowed Global Tech LED to better allocate marketing costs based on bid potential.
2 – Under Armour: IBM Watson Cognitive Computing
Company description: Under Armour, Inc. is an American manufacturer of sports footwear and apparel, with global headquarters in Baltimore, Maryland.
How IBM Watson is being used:
- Under Armour’s UA Record™ app was built using the IBM Watson Cognitive Computing platform. The “Cognitive Coaching System” was designed to serve as a personal health assistant by providing users with real-time, data-based coaching based on sensor and manually input data for sleep, fitness, activity and nutrition. The app also draws on other data sources, such as geospatial data, to determine how weather and environment may affect training. Users are also able to view shared health insights based on other registered people in the UA Record database who share similar age, fitness, health, and other attributes.
- The UA Record app has a rating of 4.5 stars by users; based on sensor functionality, users are encouraged (via the company’s website and the mobile app) to purchase UA HealthBox devices (like the UA Band and Headphones) that synchronize with the app.
- According to Under Armour’s 2016 year-end results, revenue for Connected Fitness accessories grew 51 percent to $80 million.
3 – Plexure (VMob): IoT and Azure Stream Analytics
Company description: Formerly known as VMob, Plexure is a New Zealand-based media company that uses real-time data analytics to help companies tailor marketing messages to individual customers and optimize the transaction process.
How Azure Stream Analytics is being used:
- Plexure used Azure Stream Analytics to help McDonald’s increase customer engagement in the Netherlands, Sweden and Japan, regions that make up 60 percent of the food service retailer’s locations worldwide.
- Azure Stream Analytics was used to analyze the company’s stored big data (40 million+ endpoints) in the cloud, homing in on customer behavior patterns and responses to offers to ensure that targeted ads were reaching the right groups and individuals.
- Plexure combined Azure Stream Analytics technology with McDonald’s mobile app, analyzing contextual information and social engagement to further customize the user experience. App users receive individualized content based on weather, location, and time of day, as well as purchasing and ad response habits. For example, a customer located near a McDonald’s location on a hot afternoon might receive a pushed ad for a free ice cream sundae.
- McDonald’s in the Netherlands yielded a 700% increase in customer redemptions of targeted offers.
- Customers using the app returned to stores twice as often and on average spent 47% more than non-app users.
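The contextual targeting described above (a nearby customer on a hot afternoon receiving an ice cream offer) can be illustrated with a small rule-based Python sketch. This is not how Plexure or Azure Stream Analytics works internally; the `Context` fields, the thresholds, and the offer names other than the sundae are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    temperature_c: float  # current local temperature
    hour: int             # local hour of day, 0-23
    distance_km: float    # distance to the nearest restaurant


def pick_offer(ctx: Context) -> Optional[str]:
    """Choose an offer from simple hand-written rules (hypothetical thresholds)."""
    if ctx.distance_km > 1.0:
        return None  # too far away for a location-based push
    if ctx.temperature_c >= 28.0 and 12 <= ctx.hour < 18:
        return "free ice cream sundae"
    if ctx.hour < 11:
        return "breakfast coffee deal"
    return "standard meal offer"
```

For example, `pick_offer(Context(temperature_c=31.0, hour=15, distance_km=0.4))` selects the sundae offer, while the same conditions five kilometres from a restaurant produce no offer at all. A production system would presumably learn such rules from purchase and ad response data rather than hard-code them.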
4 – Coca-Cola Amatil: Trax Retail Execution
Company description: Coca-Cola Amatil is the largest bottler and distributor of non-alcoholic, bottled beverages in the Asia Pacific, and one of the largest bottlers of Coca-Cola products in the region.
How Trax Image Recognition for Retail is being used:
- Prior to using Trax’s imaging technology, Coca-Cola Amatil was relying on limited and manual measurements of products in store, as well as delayed data sourced from phone conversations.
- Coca-Cola Amatil sales reps used Trax Retail Execution image-based technology to take pictures of store shelves with their mobile devices; these images were sent to the Trax Cloud and analyzed, returning actionable reports within minutes to sales reps and providing more detailed online assessments to management.
- Real-time images of stock allowed sales reps to quickly identify performance gaps and apply corrective actions in store. Reports on shelf share and competitive insights also allowed reps to strategize on opportunities in store and over the phone with store managers.
- Coca-Cola Amatil gained 1.3% market share in the Asia Pacific region within five months.
5 – Peter Glenn: AgilOne Advanced Analytics
Company description: Peter Glenn has provided outdoor apparel and gear to individual and wholesale customers for over 50 years, with brick-and-mortar locations along the east coast, Alaska, and South Beach.
How AgilOne Analytics is being used:
- AgilOne Analytics’ Dashboard provides a consolidated view across online and offline channels, which allowed Peter Glenn to view trends between buyer groups and make better segmentation decisions.
- Advanced segmentation abilities included data on customer household, their value segment, and proximity to any brick-and-mortar locations.
- Peter Glenn used this information to launch integrated promotional, triggered, and lifecycle campaigns across channels, with the goal of increasing sales during non-peak months and increasing in-store traffic.
- Once AgilOne’s data quality engine had combed through Peter Glenn’s customer database, the company learned that more than 80% of its customer base had lapsed; they were able to use that information to re-target and re-engage stagnant customers.
- Peter Glenn saw a 30% increase in Average Order Value (AOV) as a result of its automated marketing campaigns.
- Access to data points, such as customer proximity to a store, allowed Peter Glenn to target customers for store events using advanced segmentation and more aligned channel marketing strategies.
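The segmentation ideas in this case study (flagging lapsed customers and grouping them by proximity to a store) can be sketched in a few lines of Python. This is an illustrative sketch only, not AgilOne’s method: the 365-day lapse window, the 25 km radius, and the record fields `id`, `last_purchase`, and `distance_km` are all assumptions.

```python
from datetime import date


def is_lapsed(last_purchase: date, today: date, lapse_days: int = 365) -> bool:
    """A customer counts as lapsed if the last purchase is older than lapse_days."""
    return (today - last_purchase).days > lapse_days


def lapsed_share(customers, today: date) -> float:
    """Fraction of customers who are lapsed (0.0 for an empty list)."""
    if not customers:
        return 0.0
    lapsed = sum(1 for c in customers if is_lapsed(c["last_purchase"], today))
    return lapsed / len(customers)


def segment_lapsed(customers, today: date, radius_km: float = 25.0):
    """Split lapsed customers into those near a store and those farther away."""
    near, far = [], []
    for c in customers:
        if not is_lapsed(c["last_purchase"], today):
            continue  # active customers are left out of the re-engagement lists
        (near if c["distance_km"] <= radius_km else far).append(c["id"])
    return near, far
```

Lapsed customers near a store could then be invited to in-store events, while those farther away might receive online-only campaigns.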