Introduction
When we write code, bugs are inevitable. Eventually our program should be free of bugs, but there are two kinds of bugs: those we are responsible for, because there are errors in our logic, and those we can do nothing about, because they are caused by some external source.
The first kind of bug should be eliminated. Our testing should help us find those bugs, and we are responsible for making sure our code is bug-free.
The second kind of bug is not our fault. It may be caused by a framework we use or by some other factor, like input from a faulty HDD or errors in an internet connection. Although these bugs are not our fault, providing a smooth experience to our users is our responsibility. After all, they are using our program and any crash will reflect badly on us. Any attempt at an explanation will seem like an excuse to our users.
Below, I will describe four ways to deal with this kind of bug, starting with the way that affects the user experience the least and finishing with the worst way: an exception being thrown and our program terminating.
Whenever we write code, we should try to understand which category a bug belongs to and deal with it accordingly, always starting from the first way and, if that solution is not possible, moving on to the next one, so that the end user has the best possible experience.
1st Solution - Ignore the bug
That might seem wrong. After all, we should always do something about a bug, right?
There are cases, though, where a bug can be safely ignored without affecting how our program works or the user experience. Consider the following:
We have made a program that accepts input from a sensor that measures the speed of a car. Every 1/100th of a second we get a value that is the speed of the car, and every second we want to display that value.
The value that we want to display is the average of the one hundred values we got as input from the sensor. Obviously we should have code in place that checks the values we get from that input. If one of the inputs is an invalid value, like a negative number when we know a negative number is impossible, or 1235 km/h, which is roughly the speed of sound, then we can safely ignore that value and calculate the average from the other 99 inputs.
Our code should check for impossible values, but it should also check how often the sensor sends these wrong values. If one value out of a hundred is wrong, it doesn’t really matter. On the other hand, if we get 30-40 wrong values every second, then something is wrong with the sensor and we should take appropriate action.
Bugs from input that can safely be ignored provide the smoothest experience to our users, as the program continues to work as expected.
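Here is a minimal sketch of the sensor example in Python. The names `average_speed` and `SensorFaultError` are made up for this post, and the two thresholds (400 km/h as the fastest plausible car and 30 bad readings per second as the point where we stop trusting the sensor) are assumptions you would tune for your own hardware.

```python
from statistics import mean

MAX_PLAUSIBLE_KMH = 400.0    # assumed upper bound for a car, not a real spec
MAX_INVALID_PER_SECOND = 30  # assumed point where we stop trusting the sensor


class SensorFaultError(Exception):
    """Raised when too many readings in one batch are invalid."""


def average_speed(readings):
    """Average the plausible readings and quietly skip the impossible ones."""
    valid = [r for r in readings if 0.0 <= r <= MAX_PLAUSIBLE_KMH]
    invalid_count = len(readings) - len(valid)
    # A few bad readings are ignored; many of them mean the sensor is faulty.
    if invalid_count >= MAX_INVALID_PER_SECOND or not valid:
        raise SensorFaultError(
            f"{invalid_count} invalid readings out of {len(readings)}"
        )
    return mean(valid)
```

With 99 readings of 50.0 and one reading of -3.0, `average_speed` returns 50.0 and the user never notices the glitch. With 40 bad readings out of 100 it raises `SensorFaultError`, so the caller can move on to one of the other solutions.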
2nd Solution - Provide a default value
If the 1st solution isn’t possible because ignoring the bug would change the behavior of our program, then providing a default value can be the next least annoying way for the end user to deal with it.
There are situations where a default value makes sense. Consider the following:
We have a program that reads a JSON file from the user’s drive. The file has a value for the number of elements it contains, followed by the elements themselves. When our program tries to read that file, this number is negative. Obviously that is wrong, and any reading we try to do after that may also give wrong values for the elements, even if we have a way to find out how many elements there are.
This wrong value may be an indication that the user’s drive is failing, but what matters more is that the end user has the best possible experience. The best way to deal with this kind of situation is to provide a default value (in our case, zero would make sense) and let the user know that the file could not be loaded because it contained corrupted data.
If that data is not essential for our program to run, like a level in a puzzle game, then our program should inform the user about the problem but keep running with the data it has. It could, for example, try loading the next level.
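A small sketch of this idea in Python follows. The file layout (a `count` key and an `elements` key) and the function name `load_elements` are my assumptions for the example, since the real format depends on your program.

```python
import json


def load_elements(path):
    """Return (elements, warning). Corrupted data falls back to an empty list."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        count = data["count"]        # assumed key name
        elements = data["elements"]  # assumed key name
        if (
            not isinstance(count, int)
            or count < 0
            or not isinstance(elements, list)
            or count != len(elements)
        ):
            raise ValueError(f"inconsistent element count: {count!r}")
    except (OSError, ValueError, KeyError, TypeError) as exc:
        # Default value: zero elements, plus a message we can show the user.
        return [], f"Could not load {path}: {exc}"
    return elements, None
```

The caller always gets a list back, so the rest of the program does not need a special case for corrupted files. It only needs to show the warning when there is one.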
Not all the data we have will be as simple as an integer value. In those cases, implementing the null object / default object pattern can be invaluable to our code architecture. You can read about the null object pattern in my blog post here: The Null object pattern in C#
Some data, though, is essential to our application. In those cases we can try the third way of dealing with bugs caused by external sources:
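To give a rough idea of the null object pattern without leaving this post, here is a tiny Python sketch built on the puzzle game example. The `Level` and `NullLevel` classes and the `next_playable` helper are hypothetical names, not code from a real game.

```python
class Level:
    """A level that was loaded successfully."""

    def __init__(self, name, pieces):
        self.name = name
        self.pieces = pieces

    def is_playable(self):
        return True

    def describe(self):
        return f"{self.name} ({len(self.pieces)} pieces)"


class NullLevel(Level):
    """Stands in for a level that could not be loaded."""

    def __init__(self):
        super().__init__("Unavailable level", [])

    def is_playable(self):
        return False


def next_playable(levels):
    """Return the first playable level, or a NullLevel if there is none."""
    for level in levels:
        if level.is_playable():
            return level
    return NullLevel()
```

Code that receives a `NullLevel` can still call `describe()` or read `pieces` without checking for `None` first, which is the whole point of the pattern.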
3rd Solution - Return the program to a previous state
If our program cannot continue without some data, then instead of halting execution we can consider returning our program to a previous state and trying again. Here’s an example:
Our program has some age verification, so the user has to provide a date of birth. We display a box that accepts some input from the user, but that input has an invalid value, for example 07-08-2323. (If you are reading this post after 2323, ignore that date, you get the meaning, and also “Yoohoo! My blog posts survived for 300 years!”)
In those cases we can ignore the input, show the end user a message about the mistake that explains as clearly as we can why the error happened, and then return our program to the state that displays the box that expects a valid date.
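The age verification example could look something like the Python sketch below. I am assuming the date is typed as DD-MM-YYYY, and `parse_birth_date` and `ask_birth_date` are names invented for this post. The `read` and `show` parameters stand in for whatever input box and message box your program uses.

```python
from datetime import date, datetime


def parse_birth_date(text, today=None):
    """Return a date, or raise ValueError with a message meant for the user."""
    today = today or date.today()
    try:
        born = datetime.strptime(text.strip(), "%d-%m-%Y").date()
    except ValueError:
        raise ValueError(
            f"'{text}' is not a valid date. Please use DD-MM-YYYY."
        ) from None
    if born > today:
        raise ValueError(f"{text} is in the future. Please enter your date of birth.")
    return born


def ask_birth_date(read=input, show=print, today=None):
    """Go back to the input prompt until a valid date is entered."""
    while True:
        try:
            return parse_birth_date(read("Date of birth (DD-MM-YYYY): "), today)
        except ValueError as problem:
            show(problem)  # explain the mistake, then show the prompt again
```

The loop is the "previous state": every invalid answer brings the user back to the same prompt, with an explanation of what went wrong.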
This solution is not only for user input, but for any kind of input. If we want to fetch data from the internet, we can have code that tries to do that a number of times. If that still fails and the previous two solutions are not applicable to our situation, then we go to the fourth solution, which is the worst for the experience of our users.
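Trying a number of times can be as simple as the following Python sketch. The `retry` helper is a hypothetical name, and the defaults (three attempts, one second between them, retrying only on `OSError`) are assumptions to adjust for your own case.

```python
import time


def retry(operation, attempts=3, delay_seconds=1.0, retry_on=(OSError,)):
    """Call operation() up to `attempts` times and re-raise the last failure."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller pick another solution
            time.sleep(delay_seconds)
```

For a download you could pass something like `lambda: urllib.request.urlopen(url, timeout=5).read()`, where `url` is whatever address your program needs. When the last attempt fails, the exception reaches the caller, which can then fall back to a default value or, if nothing else works, stop the program.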
4th Solution - Stop the program
This is the worst possible solution. Even if our program stops but has a log of what went wrong, the end user doesn’t really care. All the logs in the world may be useful for the developers, but our users will be frustrated.
There are situations where we might not have a choice, but we should carefully consider whether stopping the program is really the only way or whether any of the previous three solutions can be implemented. For example, failing to load assets that are essential to gameplay for a level in a linear game is a problem that cannot be corrected with the previous solutions.
A program that stops or crashes provides the worst possible experience for the end user, especially if it stops when the user has some unsaved work. If possible we should at least try and save any data and the state of our program before we stop it, so our end users won’t feel like they have lost valuable time.
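As a rough Python sketch of saving before stopping, assume the program keeps its state in a dictionary that can be written as JSON. The function name `run_with_recovery` and the recovery file are made up for this example.

```python
import json
import sys


def run_with_recovery(main, state, recovery_path):
    """Run main(state); on an unrecoverable error, save the state, then stop."""
    try:
        main(state)
    except Exception as exc:
        try:
            with open(recovery_path, "w", encoding="utf-8") as f:
                json.dump(state, f)
        except (OSError, TypeError, ValueError):
            pass  # saving is best effort; we are stopping either way
        print(f"Sorry, the program has to close: {exc}", file=sys.stderr)
        raise SystemExit(1)
```

On the next start, the program can look for the recovery file and offer to restore the unsaved work, so stopping costs the user as little as possible.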
Conclusion
When dealing with bugs that do not come from our own code but are caused by some external source, our first obligation is the user experience. Crash logs may be invaluable to developers, but providing the best possible experience to our users always comes first. In every case we should remember that our programs are for the benefit of our users and not for us.
I hope you found this post useful. Thank you for reading and as always, if you have any questions or comments you can use the comments section or contact me directly via the contact form or by email. Also if you don’t want to miss any of the new blog posts, you can always subscribe to my newsletter or the RSS feed.