Thursday, June 18, 2009

The App that cried wolf

We’ve all read the story about the little boy who cried wolf. Time and again he cried wolf, and one day, when there was a real wolf, no one listened.

Applications that send emails for error after error are a lot like the boy in the story. At first we investigate each error and try to come up with corrective measures to eliminate the issue, but eventually we just create a junk mail filter and simply ignore them.

So what is the root cause of the wolf app? The primary cause lies in how we define and handle exceptions. Exceptions are just that: exceptional cases that we hadn’t specifically planned for. Yet we know during design and implementation that the potential for an exception exists. For example, we may read a file and that file may be corrupt. But the usual way of handling it is to throw it up the chain until eventually the whole app screeches to a halt and the spam machine kicks into overdrive.

Since we already know the potential for an exception exists in a particular piece of code, we should handle it where it happens and degrade gracefully. Additionally, not every error or exception should be the subject of an email. Errors should be classified and dealt with accordingly. For many errors this means logging to a file or a database for later review. Often the logs will show a pattern of problematic methods that continually raise the same errors. These can then be prioritized for rework during a maintenance cycle. Critical errors, on the other hand, should be flagged with an email or an entry in the application event log. These critical errors should be dealt with immediately because they have a direct correlation to the uptime of the application.
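As a rough illustration of the idea, here is a minimal Python sketch. It assumes a settings file stored as JSON, and everything named in it (`Severity`, `classify`, `report`, `load_settings`, the `app.errors` logger, and the `notify` callback standing in for an email sender) is hypothetical. The rule for what counts as critical is also just an assumption; a real app would define its own.

```python
import json
import logging
from enum import Enum

logger = logging.getLogger("app.errors")  # hypothetical logger name


class Severity(Enum):
    ROUTINE = "routine"    # log it and review later
    CRITICAL = "critical"  # alert someone right away


def classify(exc):
    """Assumed policy: only errors that threaten uptime are critical."""
    if isinstance(exc, (MemoryError, ConnectionError)):
        return Severity.CRITICAL
    return Severity.ROUTINE


def report(exc, notify):
    """Log every error; call notify (e.g. an email sender) only for critical ones."""
    severity = classify(exc)
    if severity is Severity.CRITICAL:
        logger.critical("critical error: %r", exc)
        notify(f"CRITICAL: {exc!r}")
    else:
        logger.warning("routine error: %r", exc)
    return severity


def load_settings(path, defaults, notify):
    """Read a JSON settings file, degrading to defaults if it is missing or corrupt."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError) as exc:
        # OSError covers a missing or unreadable file; ValueError covers
        # invalid JSON and undecodable bytes. Handle it here, where it
        # happens, instead of throwing it up the chain.
        report(exc, notify)
        return dict(defaults)
```

With this arrangement a corrupt settings file produces one log entry and the app keeps running on its defaults, while something like a lost database connection passed to `report` still triggers the notification. The email is reserved for the real wolf.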

Taking some time to plan and design how errors and exceptions are dealt with could mean the difference between delivering an app that cries wolf and one that matures over the course of time.
