The dangers of machine learning

Alexander Murray-Watters

12 April 2019

Models are often uninterpretable

As the models produced by, e.g., neural networks are often uninterpretable, it is difficult to work out where a model went wrong.

Google Flu Trends is a prime example of this. For several years, Google was able to accurately estimate where the flu had spread (and how many people had it) faster than the CDC (Centers for Disease Control and Prevention). Google tracked the spread of flu by looking for spikes in the number of searches for the flu (or flu-related symptoms), and there was some discussion of using Google Flu Trends as part of the US government’s response to flu outbreaks. The downside is that when other events caused people to search about the flu on Google, such as the swine flu outbreak in 2009, the model broke down, as it couldn’t distinguish searches prompted by the news story from searches by people who actually had the flu. Changes in news coverage have also led to problems with Google’s method. Since the individual components of a neural network are difficult to interpret, it can be hard to tell whether the problem lies with the variables included in (or excluded from) the model, with the model parameters, with the functional relationship between variables, or with some other cause. https://www.nature.com/news/when-google-got-flu-wrong-1.12413

TODO: Insert image.

Biased data

Garbage in => Garbage out.

TODO: Insert bit on racist Google image search

https://www.theguardian.com/technology/2018/jan/12/google-racism-ban-gorilla-black-people https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai

TODO: Insert image.

Example: How to make a racist AI

(Based on: https://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/)

We grab the GloVe pre-trained word vector first: https://nlp.stanford.edu/projects/glove/

A word vector is a representation of a word within what is called a “word vector space”. These vector spaces attempt to model the semantic and syntactic context of words, and are usually constructed by feeding a large body of text into a neural network. The network is typically trained so that, given a context as input (e.g., the other words in the sentence surrounding the word to be predicted), it predicts which word should “fit”. The other direction is also possible: predict the “context” given a particular word.
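Closeness in such a space is what makes word vectors useful: words with similar meanings end up pointing in similar directions, which is usually measured with cosine similarity. A minimal sketch of the idea with made-up 3-dimensional vectors (real GloVe vectors have 300 dimensions, and these numbers are purely illustrative):

```r
## Cosine similarity: the cosine of the angle between two vectors,
## near 1 for similar directions and negative for opposed ones.
cosine.sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

## Toy "embeddings" -- invented for illustration only.
vec.good  <- c(0.9, 0.1, 0.3)
vec.great <- c(0.8, 0.2, 0.4)
vec.awful <- c(-0.7, 0.9, -0.2)

cosine.sim(vec.good, vec.great)  # high: similar meanings
cosine.sim(vec.good, vec.awful)  # negative: dissimilar meanings
```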

This particular GloVe word vector dataset has a 1.9 million word vocabulary and is 1.75 gigabytes compressed (5 gigabytes uncompressed).

## The GloVe file has no header row, so header=FALSE; otherwise the
## first word vector is silently dropped.
glove.df <- read.csv("~/Downloads/glove.42B.300d.txt",
                 header=FALSE, sep=" ", quote="",
                 stringsAsFactors=FALSE,
                 col.names=c("word", paste0("e", 1:300)))
## Did the file read in correctly?
head(glove.df$word, n=100)

dim(glove.df)

head(glove.df$e1)

# We use the scan function to read in the word lists, as it has an
# easy way to distinguish comments from data (that way we don't have
# to bother with grep or regular expressions).
negative.words <- scan("../data/negative-words.txt", comment.char=";",
                       what="", blank.lines.skip=TRUE)

positive.words <- scan("../data/positive-words.txt", comment.char=";",
                       what="", blank.lines.skip=TRUE)

length(positive.words)
length(negative.words)

head(positive.words)
head(negative.words)
## Locating positive and negative words in our dataset.
pos.vectors <- which(glove.df$word %in% positive.words)
neg.vectors <- which(glove.df$word %in% negative.words)

## Limiting dataset to only those words in our lists of pos/neg words.
glove.df.reduc <- glove.df[c(pos.vectors, neg.vectors),]

## Need to reassign positions now (since we have a smaller dataset).
pos.vectors <- which(glove.df.reduc$word %in% positive.words)
neg.vectors <- which(glove.df.reduc$word %in% negative.words)


## Binary indicator for positive (+1) or negative (-1) words.
sentiment.vec <- ifelse(seq_along(glove.df.reduc$word) %in% pos.vectors, 1,
                 ifelse(seq_along(glove.df.reduc$word) %in% neg.vectors, -1, 0))

## backing up word vector
word.backup <- glove.df.reduc$word

## Converting to matrix for glmnet.
glove.df.reduc <- as.matrix(glove.df.reduc[,-1] )

rownames(glove.df.reduc) <- word.backup

## Garbage collect to recover RAM.
gc()


## Training/testing split: 80% for training, 20% for testing.
set.seed(1234)  # so the split (and the results below) are reproducible
train.df <- sample(1:length(word.backup), size=length(word.backup)*.8)

test.df <- which(!(1:length(word.backup)) %in% train.df)

## If this isn't 0 then we've messed up our splitting procedure.
sum(test.df %in% train.df)


## Fitting a model -- lasso (L1-penalized) logistic regression.
library(glmnet)
fit.cv <- cv.glmnet(glove.df.reduc[train.df,], sentiment.vec[train.df], family="binomial")

pred.fit <- predict(fit.cv$glmnet.fit, glove.df.reduc[test.df,], s=fit.cv$lambda.min)

pred.fit[sample(1:nrow(pred.fit), size=10),]
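Beyond eyeballing a few predictions, it is worth checking how well the model classifies held-out words. This sketch reuses pred.fit, sentiment.vec, and test.df from above (so it is not self-contained, and the exact number depends on the random train/test split):

```r
## Classify each held-out word as positive (+1) if its predicted
## score is above zero, and negative (-1) otherwise.
pred.class <- ifelse(c(pred.fit) > 0, 1, -1)

## Proportion of held-out words classified correctly.
mean(pred.class == sentiment.vec[test.df])
```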
library(ggplot2)
library(dplyr)


## Making sure things make sense: the predicted scores should
## separate positive words from negative words.
ggplot(tibble(pred.fit,
              pos.neg = as.character(sentiment.vec[test.df]))) +
  aes(x = pos.neg, y = pred.fit) +
  geom_boxplot() +
  labs(title = "Boxplot of predicted sentiment by word type",
       x = "Positive or negative", y = "Predicted sentiment")
## Calculating the mean sentiment of a piece of text.
calc.sentiment <- function(input.text="This is great!", reg.model = fit.cv,
                           known.word.list = word.backup, glove.df){

  ## Simple pattern that matches whitespace or punctuation. Note:
  ## this mangles apostrophes and hyphens!
  simple.regex <- "[[:blank:][:punct:]]"

  ## Split the text into a vector of lowercase words (and empty strings).
  word.vec <- tolower(unlist(strsplit(input.text, simple.regex)))

  ## Delete the empty strings and duplicates.
  word.vec <- unique(word.vec[nchar(word.vec) > 0])

  n.unknown.words <- sum(!(word.vec %in% known.word.list))

  ## If no words match, return 0.
  if(n.unknown.words == length(word.vec)){return(0)}

  ## drop=FALSE keeps the result a matrix even when only one word
  ## matches (glmnet's predict needs a matrix, not a vector).
  new.data <- glove.df[known.word.list %in% word.vec, , drop=FALSE]

  pred.val <- predict(reg.model$glmnet.fit,
                      new.data,
                      s=reg.model$lambda.min)

  return(mean(c(pred.val)))
}

## Back up the full list of words.
word.backup <- glove.df$word

## Convert to a matrix and drop the word column (otherwise we end up
## with a matrix of strings, not numbers).
glove.df.final <- as.matrix(glove.df[, -1])

gc()

Now for the Racism

calc.sentiment("Let's go out for Italian food.", glove.df=glove.df.final)

calc.sentiment("Let's go out for Chinese food.", glove.df=glove.df.final)

calc.sentiment("Let's go out for Mexican food.", glove.df=glove.df.final)

So “Mexican” is rated lower than the other two.

What about names?

calc.sentiment("My name is Emily", glove.df=glove.df.final)

calc.sentiment("My name is Heather", glove.df=glove.df.final)

calc.sentiment("My name is Yvette", glove.df=glove.df.final)

calc.sentiment("My name is Shaniqua", glove.df=glove.df.final)

Exercise: calc.sentiment

Try giving calc.sentiment some other sentences to check for other kinds of prejudice or bias contained in the GloVe word vectors.

Exercise: calc.sentiment

Try to find some words or sentences that break the regular expression that extracts words from sentences.
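As a starting point, apostrophes and hyphens are one known weakness of a whitespace-or-punctuation splitting pattern like the one used in calc.sentiment. A self-contained demonstration:

```r
## A whitespace-or-punctuation splitting pattern, as used (in spirit)
## by calc.sentiment.
simple.regex <- "[[:blank:][:punct:]]"

split.words <- function(input.text){
  word.vec <- tolower(unlist(strsplit(input.text, simple.regex)))
  word.vec[nchar(word.vec) > 0]
}

split.words("Don't stop!")       # "don" "t" "stop" -- the apostrophe splits the word
split.words("state-of-the-art")  # four separate words, not one term
```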

A stereotypically black name (Shaniqua) is rated far more negatively than stereotypically white names (such as Emily or Heather).

Even when a method appears to be unbiased on the surface, if there is bias in how data are collected, the model will also be biased.

Other examples

There is no magic to machine learning

“Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.” Frank Herbert, Dune, 1965.

Further reading

Nontechnical

Specific and/or technical

General introductions

Deep learning

Adversarial neural networks