Machine Learning - Data Integrity

April 16, 2019

Our last discussion highlighted the broad issue of adversarial interactions with machine learning algorithms. The adversarial interactions we examined there aimed to induce errors through false negatives. However, there is another type of threat to machine learning, one far more insidious than simple content obfuscation or fancy word transformations. Believe it or not, our training data can be poisoned. How does poisoning work, and what is its central point? Let's find out!

Mirror, Mirror


The term poisoning is misleading. Poisoning does not aim to disable the algorithm (i.e., death). Rather, the aim of poisoning is to misdirect the algorithm. Think more like smoke and mirrors, less like cyanide. Thus, something I want to clarify about poisoning is what exactly is affected by the poison. Popular perception seems to be that the algorithm itself is the affected construct. This simply isn't true. In fact, it isn't possible, at least not through poisoning: the algorithm is fixed in source code and is not modified at runtime. If you're after the running algorithm itself, you need memory injection or other exploits.

So, poisoning corrupts the emergent model the algorithm produces. The poison does this by targeting specific elements within the training data and skewing the outcome. Thinking back to decision trees, we can imagine that the features and labels (the branches and nodes of the tree) are the affected model components. Instead of predicting that students will like a programming course if they have previously taken a programming class, we will (inaccurately) predict the opposite.
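To make that concrete, here is a minimal label-flipping sketch against a decision tree. The single "took a programming class" feature, the synthetic student dataset, and the use of scikit-learn are my own illustrative assumptions, not a documented real-world attack: flipping the labels on enough poisoned rows is all it takes for the learned tree to predict the opposite of the true relationship.

```python
# A minimal label-flipping sketch (assumes scikit-learn is installed).
# The single feature and the student dataset are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Feature: 1 if the student previously took a programming class, else 0.
X = rng.integers(0, 2, size=(200, 1))
# Clean labels: students who took a programming class like the course.
y_clean = X[:, 0].copy()

# Poison: flip the label on most rows where the feature is 1,
# so the learned branch for that feature points the wrong way.
y_poisoned = y_clean.copy()
target_rows = np.where(X[:, 0] == 1)[0]
flip = rng.choice(target_rows, size=int(0.8 * len(target_rows)), replace=False)
y_poisoned[flip] = 1 - y_poisoned[flip]

clean_tree = DecisionTreeClassifier().fit(X, y_clean)
poisoned_tree = DecisionTreeClassifier().fit(X, y_poisoned)

student = np.array([[1]])  # a student who has taken a programming class
print("clean model predicts:   ", clean_tree.predict(student)[0])    # 1 (likes)
print("poisoned model predicts:", poisoned_tree.predict(student)[0])  # 0 (dislikes)
```

Notice that nothing about the algorithm changed; only the training data did, and the emergent model followed the data.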

A crucial point is that a machine learning algorithm will not necessarily generate obvious errors when poisoned. In fact, without dedicated defense mechanisms in place, we may never realize we've been poisoned even while our models or predictions are wrong.

Defending

Fortunately, there are validated methods to protect a machine learning implementation from data poisoning threats. Because poisoning works on our training data, the defenses likewise focus on the data. For example, we can sanitize training data before allowing it to reach our algorithm. The concept is similar to sanitizing user input in a web application to avoid SQL injection. Further, we can attempt to generalize training data to the point where the specific elements affected by poison cannot influence the prediction models.
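Here is one minimal sketch of what sanitization can look like, again assuming scikit-learn and numeric features: drop any training point whose label disagrees with most of its nearest neighbors before fitting. This is just one simple, illustrative filter, not the definitive defense, and the function name and thresholds are my own.

```python
# A minimal sanitization sketch (assumes scikit-learn is installed):
# drop training points whose label disagrees with most of their
# nearest neighbors before fitting. One illustrative filter among many.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def sanitize(X, y, k=5, agreement=0.5):
    """Keep only points whose label matches at least `agreement`
    of the labels of their k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)           # idx[:, 0] is the point itself
    neighbor_labels = y[idx[:, 1:]]     # labels of the k true neighbors
    match = (neighbor_labels == y[:, None]).mean(axis=1)
    keep = match >= agreement
    return X[keep], y[keep]

# Usage: fit the model on the filtered set instead of the raw capture.
# X_clean, y_clean = sanitize(X_raw, y_raw)
# model.fit(X_clean, y_clean)
```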


These are static defenses, however. That is, we apply sanitization after we capture the training data, once the data are fixed. The same is true for generalization. Moreover, each time new training data are collected, we have to rerun our sanitization or generalization processes. While effective, the nature of these defenses strikes me as awkward.

I mean, fundamentally the threat of poisoning is a data integrity problem. We expect the training data to be A, and as far as we know it was A when we captured it. However, unbeknownst to us, the training data is poisoned to become B, which is altogether different. In classic cybersecurity, we'd produce a checksum or hash and reference it to see whether the object has changed. This isn't possible with machine learning because our output is derived from the source dataset but is not the dataset. Sure, we could still hash the training set as a data blob, but I don't think that helps much. Not all is lost, though.
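For what it's worth, the blob-level check itself is trivial to write; the sketch below uses only the standard library. As noted above, a digest can tell us the captured dataset changed at rest, but it cannot tell us the data were clean to begin with, and it says nothing about the model derived from the data.

```python
# A sketch of the classic integrity check applied to a training set
# (standard library plus numpy). Detects changes to the captured blob,
# not whether the data were already poisoned before capture.
import hashlib
import numpy as np

def dataset_digest(X, y):
    """SHA-256 over a canonical byte representation of the dataset."""
    h = hashlib.sha256()
    h.update(np.ascontiguousarray(X).tobytes())
    h.update(np.ascontiguousarray(y).tobytes())
    return h.hexdigest()

# At capture time, record the digest alongside the data:
# baseline = dataset_digest(X_train, y_train)
# Later, before retraining, verify nothing changed at rest:
# assert dataset_digest(X_train, y_train) == baseline
```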

In the Future

To be fair, there are research groups working on less awkward defenses. Some ideas revolve around using additional machine learning algorithms to defend against threats such as poisoning. Other research suggests that speed may be a viable control. Certainly, there are future ideas yet to be discovered as well. Join me next time when I investigate the latest, most cutting-edge ideas in cybersecurity for machine learning!