AI Gone Blind: What Can PowerPoint See?

When you read the recent news about Artificial Intelligence (AI), you might feel like every job, including yours, is about to be automated away, and that it’s just a matter of time. You are bombarded with realistic-looking newspaper articles claimed to be “completely and automatically written by a computer in a coherent manner”, and with deepfake videos that can fool you into believing an alternative reality exists. But what are the actual accomplishments and failures of AI projects in our daily lives?

In this case study, our team examined three major aspects of AI in a daily business setting:

  • How AI is integrated successfully into a very popular piece of business software
  • How the same AI fails spectacularly, leading to AI blindness
  • The impact of AI (Blindness) on Explainability, Trust, Ethics and Business

How AI is integrated into Microsoft PowerPoint successfully

Hype and singularity scenarios aside, the business of AI generally reveals itself in a much more limited sense. And precisely because it is limited, it helps with our daily tasks and office work. One such task is related to accessibility and involves describing the contents of a picture as “alternative text”, so that blind people can hear the description through their screen readers and understand the picture. This automated picture description feature is already part of Microsoft PowerPoint: you simply insert a picture into your presentation, and Microsoft Azure Cognitive Services processes the pixels of that image and returns a brief textual description, using computer vision, deep learning, and natural language generation algorithms trained on large numbers of images. You can see it in action below, successfully describing images:

This is a striking example of an AI feature integrated into one of the most popular business applications, ready to be used by hundreds of millions of people. But it turns out that this system, designed to help blind people, can itself be blinded or confused by images that are very easy for human beings to understand.
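For readers curious about the plumbing: PowerPoint delegates the description work to Azure Cognitive Services. The sketch below shows, roughly, how the same kind of description can be requested directly via the public Computer Vision v3.1 “Describe Image” REST endpoint. This is not Microsoft’s internal code; the endpoint host and subscription key are placeholders you would replace with your own Azure resource’s values.

```python
# Minimal sketch: request an image description ("caption") from the Azure
# Cognitive Services Computer Vision v3.1 Describe endpoint.
# The endpoint and key used here are illustrative placeholders.
import json
import urllib.request


def build_describe_request(endpoint, image_url, max_candidates=2):
    """Assemble the request URL (with query string) and the JSON body."""
    url = (endpoint.rstrip("/")
           + "/vision/v3.1/describe"
           + f"?maxCandidates={max_candidates}&language=en")
    body = json.dumps({"url": image_url}).encode("utf-8")
    return url, body


def describe_image(endpoint, key, image_url):
    """POST the image URL to the service; return caption candidates, best first."""
    url, body = build_describe_request(endpoint, image_url)
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each candidate caption comes with a confidence score between 0 and 1.
    return [(c["text"], c["confidence"])
            for c in payload["description"]["captions"]]
```

A call such as `describe_image("https://<your-resource>.cognitiveservices.azure.com", "<your-key>", "https://example.com/photo.jpg")` would return something like a list of `(caption, confidence)` pairs, which is essentially the raw material behind PowerPoint’s suggested alt text.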

When Deep Learning based AI Goes Blind

We conducted a very simple, straightforward experiment using images found on the Internet, as well as some images from a few scientific articles that focused on how deep learning based AI systems can be fragile and brittle. The experiments used PowerPoint version 2008 (Build 13127.20508), running on the Microsoft Windows 10 operating system, on 29 and 30 September 2020.

Let’s first see how the system goes blind. In each of the experiments below, you’ll see next to the image the first and second attempts of the AI system integrated into PowerPoint (which are identical in some examples), followed by a human interpretation of the same scene and of the AI’s performance:

In the example above, the AI system is completely “blind” to the fact that there’s a woman right in front of us. On the other hand, it has no problem identifying the man on the couch.

Let’s take a look at our second example of “AI blindness”:

It is hard to believe that the system has gone completely “blind”, not identifying the woman on the left half of the photograph. Unlike our first example, this woman is in a very typical, ordinary pose, standing upright, holding a book, looking straight at the camera, her face and upper body fully visible. To a human being, the person in the photo is a very ordinary example of a person wearing glasses, standing up and holding a book. You might even be tempted to say that it doesn’t get more boring than that. Yet, it is a very challenging example for the AI system.

Below is our third example, a woman standing in front of a few police officers:

It is very easy for humans to see that there’s a woman standing in front of “a group of people in uniform”, but the AI system doesn’t mention the woman at all.

But it’s not like the AI system has no idea about such scenes: in a different but semantically similar scene, the description isn’t very surprising, though this time AI can’t “see” people in uniform or police officers. This is another type of “AI blindness”.

Next, we take some images from the article “Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects”, to see if Microsoft Cognitive Services working in tandem with PowerPoint can handle some challenges:

In both of these cases, the system is confused, even though it correctly identified some important aspects of the scenes, such as the snow and forest in the first one and the truck in the second one. If a blind person read those AI-generated descriptions, she would have no idea at all that she is looking at dramatic and serious traffic accidents in very different road and weather conditions.

In another example, the AI system is baffled again, probably because of the camera angle. And it assumes that when such extraordinary actions happen, the actors are mostly men, not women. “On what kind of data set was this AI system trained?” you might ask, justifiably.

On the other hand, if only everybody behaved properly and responsibly in traffic, it would be so easy for the AI systems, too:

After observing how this AI vision system fails and succeeds, we turn our attention to another scientific article and data set that contains real-world scenes: “ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models”:

Some readers might object to the first scene, saying that you don’t “normally” bring an office chair into the bathroom and lay it down on the floor. However, you could say this only because you could easily identify the bright red object in the center of the photograph. It is yet another example of AI blindness, one that makes that ordinary bright red chair the “elephant in the room”.

Continuing with the ObjectNet data set, below is another striking example of AI blindness, with a clearly visible chair, occupying a big part of the photograph, being the “elephant in the room” again:

It’s almost like the AI system has been trained to have a strange relationship with blue objects, as exemplified by the following:

AI Blindness: Impact on Explainability, Interpretability, Trust, Ethics and Business

In the previous sections, we have seen cases where a deep learning based AI system integrated into Microsoft PowerPoint could describe pictures accurately, and cases where it went completely “blind” and failed spectacularly, leaving us with AI blindness.

What can we learn from this experience? Below are our key takeaways:

  • Advanced, deep learning based AI systems are already part of our business software, and the direction is clear: we will see even deeper integration into business software, leading to incremental progress instead of big-bang projects built from scratch. Key decision makers will always try to pinpoint the business value derived from such incremental progress and try to manage the risk. Scoping an AI project correctly will have a big impact on its success and further evolution.
  • The same AI systems can have big difficulties with examples that would be a no-brainer for a 5-year-old child. The real problem is that it is also difficult for us humans to predict when and how an AI system will fail, unless we have a good grasp of the data sets, their characteristics, and how representative they are of the tasks at hand.
  • Which brings us to the third and final point: managing data sets for training AI systems will become much more important, and it will affect how stakeholders from business and technology teams troubleshoot and debug such advanced, and to an extent black-box, systems.

These points, in turn, will require extra care and critical thinking in the design and implementation of such AI feature integrations, as well as proper monitoring of these end-to-end AI systems, taking into account the potential impact on User eXperience (UX). We also expect business stakeholders to demand explainability and interpretability from these AI systems in order to establish trust, with implications for their ethical responsibility and for the reputation of all the business parties and actors involved.

If you have any comments or questions, feel free to contact TM Data ICT Solutions, and we’ll be happy to interact with you.

Appendix

A few more examples of AI having difficulties:

About the author: Emre Sevinç is the co-founder & CTO of TM Data ICT Solutions. You can read more about him in his personal blog.
