I recently read an article about some interesting research on how our brains react when our expectations about what we're looking at aren't met. Researchers found that test subjects had typical reactions when viewing robots that act like robots and when viewing people who act like people. So far, so good. However, when viewing robots that act like people (so-called "androids"), our brains have to work much harder to make sense of the situation.

It occurs to me that the same principle is at play when we interact with automated telephone systems that are based on voice recognition rather than numeric menu navigation. I recently overheard a coworker attempting to use one of these systems and, at one point, he was laughing because the system clearly wasn't understanding his request. That laughter was his brain's response to the incongruous situation of being forced to communicate verbally with something that plainly didn't understand him.

Designing telephone systems, or IVRs, isn't easy. I know from personal experience. However, using the latest and greatest technology isn't always the best choice. I believe that "old-fashioned" systems based on numeric input are still better in most circumstances because:

  • They match users' mental model of the experience: users aren't forced to talk to a machine as if it were a human.
  • They're unambiguous: you know what input is expected from you, rather than having to guess what the system can or can't understand.

None of this is to say that voice recognition doesn't have a useful role to play in modern telephone systems, just that it should be used sparingly rather than forming the basis for the whole system. At least not until these systems can really understand what their users are saying!