Hunt down bad error messages

My printer is unable to clean 51. It won’t work, and all it says is “Unable to Clean 51.”

Here’s my suggestion for finding such useless error messages in a code review: Write a script to extract all string literals from your source code, then read over the output. The beauty of this approach is that the reviewer sees the text without the context of the surrounding code, just as the user does. Ideally someone who is not a programmer would review the output strings. I would hope someone who ran across “Unable to Clean 51” would flag that as something that doesn’t make sense.

It’s not hard to write a script to pull out string literals. If you’d like, you could use the script that accompanies my Code Project article PowerShell Script for Reviewing Text Shown to Users. That script tries to filter out strings that are not user output, such as file paths. It’s a pretty crude script, attempting to handle source code written in several languages. It will have a few false positives, and a few false negatives, but it works quite well for a short script. (Code Project’s “browse source” tab doesn’t work for PowerShell source. You’ll have to download the code to read it.)
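To give a sense of how little code this takes, here is a minimal sketch in Python. It is not the PowerShell script from the article. The function names and the filtering heuristic (keep strings of at least two words that contain no slashes) are my own assumptions for illustration, and it only looks at double-quoted literals.

```python
import re
import sys

# Double-quoted literal on one line, allowing backslash escapes inside.
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')


def looks_like_user_text(s):
    """Crude guess at whether a literal is shown to users (assumed heuristic)."""
    if len(s.split()) < 2:        # single tokens are usually keys, modes, ids
        return False
    if "/" in s or "\\" in s:     # probably a file path or an escape-heavy string
        return False
    return any(c.isalpha() for c in s)


def extract_user_strings(source):
    """Return the literals in source that pass the heuristic, in order."""
    found = (m.group(1) for m in STRING_LITERAL.finditer(source))
    return [s for s in found if looks_like_user_text(s)]


if __name__ == "__main__":
    # Usage: python extract_strings.py file1.c file2.cs ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for text in extract_user_strings(f.read()):
                print(text)
```

Like any regular expression approach, this can be fooled by quotation marks inside comments or character literals, so expect some false positives and false negatives.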

Whenever I recommend this script, I run into a few objections that I’ll address below.

Q: Wouldn’t it be better if programmers put all user output text in string tables rather than putting user output text in the main body of their code?

A: Yes, but that takes more effort, and it’s not how most programmers work. And even if you have a policy to put all output strings in a resource table, you’d need something like this script to enforce the policy.

Q: Wouldn’t it be better to have a real parser extract the strings rather than doing regular expression guesswork?

A: Sure.
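For a single language this can be quite short. As a rough sketch, assuming the code under review is itself Python, the standard library tokenizer will hand you the string tokens directly. The function name string_literals is hypothetical, and this covers only one language, unlike the multi-language script mentioned above.

```python
import ast
import io
import tokenize


def string_literals(source):
    """Yield (line_number, value) for each plain string literal in Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.STRING:
            continue
        try:
            value = ast.literal_eval(tok.string)
        except (ValueError, SyntaxError):
            continue  # skip tokens that are not plain literals, e.g. f-strings
        if isinstance(value, str):  # ignore bytes literals
            yield tok.start[0], value
```

Because the tokenizer knows what a comment is, quotation marks inside comments are ignored, which is exactly where the regular expression guesswork tends to go wrong. Docstrings still show up in the output, so some filtering would remain.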

Q: Wouldn’t it be better to use a spell checker that’s built into your IDE?

A: No. The purpose of the review is not just to check spelling. You also want to catch grammar errors and unhelpful messages.

Q: Wouldn’t it be better to do a complete code review?

A: Yes and no. Complete code reviews are great for overall code quality. But they take a lot of effort and don’t happen often. A review of just the extracted strings takes far, far less time. Also, viewing the output strings out of context is better for catching unhelpful or ungrammatical messages.

Q: Isn’t this simplistic? Doesn’t every company do something like this?

A: If they do, why do I continually find spelling errors, grammatical errors, and unhelpful messages in the software I use?

22 thoughts on “Hunt down bad error messages”

  1. On many Unix systems, you can run ‘strings yourProgram’ to get a list of all strings in an executable file.

  2. This is why some of us oldsters have fond memories of “Messages and Codes” manuals. Every message a program issued had a unique number, which you could look up in the book (or later, in online help, frequently generated from the same source as the book). Each message’s write-up contained the message template text, descriptions of any variable elements, a description of the reason for its issuance, and suggestions for action if you saw one.

    The discipline required to produce such manuals also tended to cause the reviews John and Daniel describe, and later made it much easier to translate the messages into languages other than English, without changing code to do so.

  3. The most important thing in a review of the messages is to ask someone who barely knows what a mouse is for to review them. You should hunt for the least computer-literate people to review these messages. Too often, these messages are reviewed by people who are already familiar with the software and already have a sense of what it could be about. The novice user just doesn’t know anything about the software itself, and the messages should take this into account. In fact, the whole software should be tested by novice users, and you should change your testers often since, with time, they will no longer be novice users.

  4. Daniel: You could make a business out of this. Hire a bunch of non-technical folks and have them review the messages. The reviewers may become more computer savvy over time, but they won’t become familiar with any one product if you have many clients and many reviewers.

  5. If you rewrite messages, put them in two parts. A short error note, and an explanation.

    Unable to clean printer head
    After being switched on, the printer needs to clean its printing head. This has failed. Reasons could be … Please try…

  6. I doubt there would be a string literal like “Unable to Clean 51.” in the code. Depending on the programming language and libraries it could be something innocent that would pass the code review, like “Unable to Clean %s.” or even “Unable to Clean %0.” So I believe that testing can catch more problems than code review.

  7. Interestingly, I had a similar idea and wrote a script that does something like this: http://nickknowlson.com/projects/word_frequency/

    There’s the normal word frequency one AS WELL as a ‘look inside strings only’ word frequency one. In my case I was more interested in finding duplication, so my script counts the number of repeated words or phrases inside strings.

    I’ve found some interesting things with this script (in one case in a small codebase, over 100 uses of SELECT * FROM).

  8. Error: missing ‘n’ in the word ‘show’ of link text in 2nd paragraph.

    Thanks for the article!

  9. I know a woman whose job is not far from what Daniel and John propose above. As her main duty, she elicits requirements and subject-matter expertise from people who want her company to develop computer-based training for them. However, her company also finds it extremely useful to have her try any and all software they develop, of whatever type.

    Invariably, five minutes into the test she inadvertently does something that (a) causes the software to fail gracelessly, and (b) causes the developer to say “But why would anyone _do_ that???”.

  10. For instances like this, I’ve written my own “FindAndReplace” app that takes care of this for me. If a user ever gets an error message and we have NO idea what it means, I just fire up my app, search for that exact string, and it takes me right to the code.

    App was also written so I can do mass changes to code via Regular Expressions, but it serves more purposes than that (also really helpful for finding specific pieces of data that are processed via nightly batch jobs).

  11. Even worse, in C++ the error message may be constructed by writing to a stream, so the string literal is split into many literals, possibly in multiple lines.

  12. I absolutely agree that all communication with the user should be written so that the user can interpret the message and be able to perform corrective actions. Your methods will certainly help improve those.
    But we programmers can only produce excellent communication for those errors that we can reasonably foresee and trap. There are hundreds of other conceivable events that can happen to interrupt our software.
    Using your example, the error condition comes from the device and the programmer worked for the device manufacturer. It is reasonable to think that the programmer knew all of the possible error codes and should have been able to present a clear message.
    Now step away so that you are programming software for some user who happens to buy that printer. The printer fails and you receive “Error 282 – printer error”. You have no clue what caused the condition and (generally) no way to determine it. All you can do is trap the error and, if you can’t stop it from interrupting the user’s workflow, report it to the human in charge.
    So I think that, as much as we would really want to, we cannot eradicate these uncommunicative messages.

  13. Murray: One thing I didn’t mention is that it appears that error numbers for my printer are not unique. “51” can mean a lot of different things. Even an arbitrarily long number would be better, if I could search on that number.

    Mark: If I saw that, I’d flag it for further investigation. If it said “%d” rather than “%s”, i.e. it’s passing in a number rather than text, I’d be pretty certain it’s a bad message. With “%s” I’d want to know whether the text being passed in is reasonable. (In any case, it’s a bad sign that “clean” is capitalized.)

  14. It sounds like “51” is your printer’s equivalent to my compiler’s error 5 “Unknown Error”.
    I call it the “Mr Jones” error after Bob Dylan’s character that “knows there’s something happening but doesn’t know what it is”. :)
