Wednesday, March 6, 2019

Machine Learning and the Expectation of Mistakes

Traditionally in software development, we've considered any time an application produces an incorrect result to be a bug. A bug is considered an error in a system: not that the application simply made a mistake, but that there is a fundamental flaw in the system itself. Why is this important? Because mistakes can be learned from and self-corrected, while bugs are fundamental flaws that can only be corrected by outside intervention that changes the operation of the system itself.

Last year Elaine Herzberg was struck and killed by an Uber self-driving car. Was this a bug in Uber's software, or was it a mistake that the software made? This is an important distinction, and we need to recognize it as a real question. Traditionally, most applications do exactly what the code tells them to do, no less, no more. But machine learning changes that. We have to ask whether the software in that car took the wrong action based on the data that had been used to build the model up to that point, and whether, if the results of that action were fed back in, the updated model would make decisions in the same situation that lead to a positive outcome (no crash). If the answer is no and the base software needs to change, then we have a bug. But if the answer is yes, we have a self-driving car that made a mistake in a new situation that past experience didn't prepare it for, one that may not happen again given the same situation a second time.



Complex automation tasks like autonomous driving, where the application will encounter unexpected situations, require machine learning and the concept of mistakes to be successful. Treating all unexpected outcomes as bugs is not sustainable: there is no way developers can code every situation a self-driving car may encounter into an application, or keep updating it as new situations are discovered. But if we are willing to treat these outcomes as mistakes that can be learned from, with the model updated as new situations arise, then we have something within the realm of reality.

Software that doesn't just have bugs but can also make mistakes leads to some interesting mental shifts that we need to make as an industry and in society at large. Teams building any application that uses machine learning internally should look at every unexpected outcome and ask themselves the fundamental question, "Is there a mechanism to feed results back into the system so that the model can be updated and lead to a better (or at least different) result next time?" If the answer is yes, then no action may be required other than monitoring, or recreating the situation to see whether better results are produced. This is a big shift in our thinking.
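As a rough illustration of that question, here is a minimal sketch of a "bug or mistake?" check: feed the bad outcome back into the model, recreate the situation, and see whether the result improves. Everything in it is an assumption made for illustration. `OnlinePerceptron` is a toy learner standing in for a real model, and `classify_failure` and its `max_rounds` parameter are hypothetical names, not part of any real library or of any self-driving system.

```python
from dataclasses import dataclass, field


@dataclass
class OnlinePerceptron:
    """Toy binary classifier that can be updated one example at a time."""
    n_features: int
    lr: float = 1.0
    weights: list = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self):
        # Start from all-zero weights unless some were supplied.
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, x):
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if score > 0 else 0

    def learn(self, x, label):
        # Standard perceptron update: only changes weights when wrong.
        error = label - self.predict(x)
        for i, xi in enumerate(x):
            self.weights[i] += self.lr * error * xi
        self.bias += self.lr * error


def classify_failure(model, situation, expected, max_rounds=10):
    """Hypothetical triage helper (an assumption, not an established method).

    Feeds the correct outcome back into the model and replays the situation.
    Returns "mistake" if the replay now gives the expected result, or "bug"
    if the model cannot learn from it within max_rounds updates.
    """
    for _ in range(max_rounds):
        model.learn(situation, expected)
        if model.predict(situation) == expected:
            return "mistake"
    return "bug"
```

Calling `classify_failure(OnlinePerceptron(n_features=2), [1.0, 2.0], 1)` returns `"mistake"`, because one round of feedback is enough for this toy model to correct itself. A model whose `learn` method has no effect would come back as `"bug"`: the flaw is in the system's ability to learn, and only outside intervention can fix it.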

It is also a big shift for society. Families of people killed by self-driving cars may not want to hear that the software made a mistake. They likely want 100% accuracy in decision making. That is a standard we don't apply to people, and self-driving cars won't be able to meet it either. There are also some obvious legal and liability questions that will come into play.

I can also see that organizations that don't do well with their people making mistakes are likely going to have a hard time accepting software that can make mistakes too. They will categorize those mistakes as bugs, just as they do when their people make mistakes. This will be another area where organizations that are more accepting of mistakes will likely have a competitive advantage, as their culture will be more accepting of the fundamental learning process that machine learning requires. Requiring the model to be perfect out of the gate is just not realistic in a complex world with nearly infinite possibilities and situations.



So what should organizations keep in mind when starting with machine learning?

- Understand there is a difference between a bug and a mistake.
- Software that has no way to feed the accuracy of its model's results back in can only have bugs. Mistakes require the capacity to learn and improve.
- Mistakes require different remediation than bugs. Checking whether a model learned from an incorrect outcome may become a QA question. If it turns out that the model can't learn from the situation, then that becomes a bug: a flaw in the ability of the system to learn.
- Machine learning models deal in probabilities. The model may return that it is 95% sure a picture has my face in it. Is 95% good enough? Probably. The threshold of acceptability will need to be defined somewhere. But that should not be the end of the discussion. The best applications will have a way to feed back into the model whether the results were correct, so that next time the same picture is guessed more accurately.
- Mistakes need to be tolerated, not just by the development team but also by the management team, the users of the applications, and anyone in society at large who interacts with them. Talking about mistakes may feel like personifying software systems, but the mental shift is recognizing that a system that can learn from bad outcomes has the capacity to make mistakes, and that this requires no self-awareness.
- Many models, particularly ones created with deep learning, are black boxes. They cannot be fixed the same way application code can.
- Without allowing for the concept of mistakes, our current software development capabilities will not be able to create systems that work effectively in complex environments full of uncounted and unknown situations.
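
The point in the list above about probabilities, thresholds, and feeding results back can be sketched in a few lines. This is only an illustration under stated assumptions: the 0.95 threshold comes from the example above, while `ACCEPT_THRESHOLD`, `decide`, `FeedbackLog`, and the `"photo-1"` style identifiers are hypothetical names invented here. The sketch only collects feedback; actually retraining a model on it is left out.

```python
from dataclasses import dataclass, field

# Assumed acceptance threshold, taken from the 95% example in the text.
ACCEPT_THRESHOLD = 0.95


def decide(confidence, threshold=ACCEPT_THRESHOLD):
    """Accept a prediction at or above the threshold, else flag for review."""
    return "accept" if confidence >= threshold else "review"


@dataclass
class FeedbackLog:
    """Hypothetical store of whether each prediction turned out to be right."""
    records: list = field(default_factory=list)

    def record(self, item_id, predicted, confidence, correct):
        self.records.append({
            "item_id": item_id,
            "predicted": predicted,
            "confidence": confidence,
            "correct": correct,
        })

    def accuracy(self):
        # Fraction of recorded predictions that were correct, or None if empty.
        if not self.records:
            return None
        return sum(1 for r in self.records if r["correct"]) / len(self.records)

    def retraining_examples(self):
        # The mistakes are the cases the model most needs to learn from.
        return [r for r in self.records if not r["correct"]]


log = FeedbackLog()
log.record("photo-1", "my face", 0.97, correct=True)
log.record("photo-2", "my face", 0.96, correct=False)
```

Both predictions here clear the threshold, yet one is wrong. That is the case the threshold alone cannot catch, and the reason the feedback record matters: `log.retraining_examples()` returns the `photo-2` record so it can be fed back into the model.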

The rise of machine learning brings the capacity to automate things that we have never been able to automate before. One of the likely casualties of that upheaval is the assumption that all problems are flaws or bugs. We will have to be more accepting of mistakes, just as we are with our fellow humans.