Machine Learning - Adversarial Algorithms
We briefly discussed biases and extreme edges as ways in which machine learning can produce skewed information. In both cases, an effective mantra is garbage in, garbage out. In other words, bad data will almost always lead to bad information. Thus, we left off last time with the question of prevention. More specifically, we wondered how we might design effective algorithms if we shouldn’t trust the training data we study. I’m going to add now that we shouldn’t necessarily trust the test data either. As such, we’re left to assume the worst and construct our algorithm accordingly. Check it out.
I understand how what I’m about to tell you may seem pessimistic. I assure you my position is simply practical. Put simply, here there be adversaries. Always, under every circumstance, assume the worst in your data, your operating environment, and even your implementation. That said, we can design our algorithms to be robust and resilient. Yes, there is hope that we can learn in hostile conditions.
For our purposes, I want to focus on adversarial interaction through data. I say this to differentiate from the use of additional algorithms to create the adversarial element, which can be done either maliciously or constructively; adversarial algorithms are an entirely different topic. Further, we can assume the goal of an adversary is to (a) cause our machine learning to produce too many errors to be useful; or (b) influence our algorithm into making bad decisions or predictions.
A strong example for both adversarial data scenarios is spam email. Spammers have evolved sophisticated adversarial toolkits over time. For instance, we know that spam email will transform content such that a filtering mechanism cannot read it. Base64 encoding of an email body is a classic example. Spam has also evolved complex character and word transformations, including hidden or obscuring pixels in HTML-rendered content. Additionally, such email often leaves out content (e.g., missing words or phrases) or deliberately corrupts features (e.g., misspellings). Yet, despite the advances our spammy adversaries have made, spam email has experienced a dramatic reduction over the years.
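To make the Base64 trick concrete, here’s a minimal Python sketch. The keyword list and `naive_filter` function are hypothetical, purely for illustration; the point is that a filter scanning for plaintext phrases never sees them once the body is encoded.

```python
import base64

# A toy keyword filter: flag mail whose body contains known spam phrases.
# (SPAM_TERMS is an invented list, not from any real filter.)
SPAM_TERMS = {"winner", "free money"}

def naive_filter(body: str) -> bool:
    lowered = body.lower()
    return any(term in lowered for term in SPAM_TERMS)

plain = "You are a WINNER! Claim your free money now."
# The same content Base64-encoded, as a spammer might transmit it:
encoded = base64.b64encode(plain.encode()).decode()

print(naive_filter(plain))    # True: plaintext spam is caught
print(naive_filter(encoded))  # False: encoded spam slips past the filter
```

A real filter would decode the body first, of course, which is exactly the arms race the rest of this discussion is about.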
The reduction is largely attributable to machine learning algorithms assuming the worst. Well, more precisely, less spam is the result of machine learning classification acquiring the ability to discern between legit and malicious email. Fundamentally, such a feat of determination is achieved in one of two ways.
First, we can design a subroutine that, put simply, simulates how our classification features and labels behave under adversarial conditions. Then, the algorithm ensures that the training model does not lead in that direction. A good question at this point is how the algorithm learns which model leads to a decision of legitimate or malicious. Well, the mystery is easily dispelled: performance cost. We’ve seen cost functions before, remember?
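As a rough sketch of that idea, the toy example below (all data and names are invented; a real system would be far more involved) trains a logistic-regression spam classifier where each training step pays the cost on both the clean data and a copy perturbed by a simulated adversary that deletes features, mimicking the omitted-words attack described earlier. Models that fail under the simulated attack incur a higher cost and are steered away from.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversary(X):
    """Simulate an adversary that deletes ~30% of features (omitted words)."""
    return X * (rng.random(X.shape) >= 0.3)

def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

# Toy data: 200 emails, 20 binary word-presence features; the first 5
# features are the "spammy words" that determine the true label.
X = rng.integers(0, 2, size=(200, 20)).astype(float)
y = (X[:, :5].sum(axis=1) >= 3).astype(float)

# Gradient descent on a combined cost: loss on clean data PLUS loss on a
# freshly perturbed adversarial copy each epoch, so models that break
# under attack pay a higher performance cost.
w = np.zeros(21)
for _ in range(500):
    for Xm in (add_bias(X), add_bias(adversary(X))):
        p = sigmoid(Xm @ w)
        w -= 0.1 * Xm.T @ (p - y) / len(y)

clean_acc = np.mean((sigmoid(add_bias(X) @ w) > 0.5) == y)
```

The design choice worth noticing: the adversary is part of the training loop, not an afterthought, which is what “assuming the worst” looks like in code.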
Second, algorithms can make feature selections robustly and use more resilient, harder-to-poison elements. Oddly enough, robustness and resilience in this context can mean using inexact matching criteria, as opposed to what I think intuitively makes sense, which is to make matching criteria more structured, more exact. I don’t mean that we keep track of all spellings for particular words, as that is a mere signature list. Rather, I’m suggesting the algorithm have a fuzzy sense of word structure, grammar rules, and so forth. Think of this as a grammar or style detection tool like Grammarly, but for spam email.
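One way to get that fuzzy sense of word structure (a toy sketch under my own assumptions, not how any particular filter works) is to compare character bigrams instead of exact spellings, so the misspellings and character substitutions spammers rely on still match:

```python
def char_ngrams(word: str, n: int = 2) -> set:
    """Break a word into character n-grams; '#' pads so edges count too."""
    word = f"#{word.lower()}#"
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def fuzzy_match(word: str, target: str, threshold: float = 0.5) -> bool:
    """Jaccard similarity over bigrams: inexact matching criteria that
    survive single-character substitutions, unlike an exact signature."""
    a, b = char_ngrams(word), char_ngrams(target)
    return len(a & b) / len(a | b) >= threshold

print(fuzzy_match("l0ttery", "lottery"))  # True: digit substitution caught
print(fuzzy_match("hello", "lottery"))    # False: unrelated word rejected
```

Notice there is no list of known misspellings anywhere; the structure of the word itself does the work, which is what makes this harder to poison than a signature list.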
Naturally, whether the algorithm is in Grammarly or a modern spam email filter, solutions to adversarial conditions are heavily data dependent. Thus, our concept of strong machine learning algorithms takes us back to where we started, to data. Join me next time when I examine how to assure the integrity of our training and test data so that we don’t end up poisoned.