Always have a backup plan

I've been thinking about how things can go wrong lately.

At Cooper, we have a design principle that suggests that designers should "hide the ejector seat levers," meaning: make sure users can't inadvertently cause their software to fail. By the same token, we also encourage designers to "make errors impossible" by designing software that anticipates the actions of its users.

Nevertheless, things will go wrong. By anticipating failures, and designing backup plans like those described below, you can minimize the impact of unexpected problems on the user.

Provide clearly marked detours

Providing clear, timely information about how to proceed when things go wrong will win you respect from users instead of derision. If one of your systems is down, tell your users up front so they don't try to operate a broken system. This information is like a big DETOUR sign instead of one crying NO OUTLET. But be sure that you put these signs up before users encounter the error: don't make people drive down the road only to find out that it's closed.
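As a rough illustration of putting the sign up before the road closure, the sketch below checks which subsystems a task depends on and returns notices to display at the task's entry point. The task names, system names, and message text are all hypothetical, and a real product would get its outage list from whatever monitoring it already has.

```python
# Hypothetical map from each task to the subsystems it depends on.
DEPENDENCIES = {
    "pay_bill": {"payments", "accounts"},
    "view_statement": {"accounts"},
}

# Hypothetical detour text for subsystems that have a known alternative.
DETOURS = {
    "payments": "Online payments are unavailable right now. You can pay by phone instead.",
}


def notices_for(task, systems_down):
    """Return messages to show where the task begins, before the user starts it."""
    affected = DEPENDENCIES.get(task, set()) & set(systems_down)
    notices = []
    for system in sorted(affected):
        fallback = f"The {system} system is unavailable right now. Please try again later."
        notices.append(DETOURS.get(system, fallback))
    return notices
```

Calling `notices_for("pay_bill", {"payments"})` returns the phone-payment detour, while a task that does not depend on the failed system gets an empty list and no warning.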

Know critical points of failure

Being a good software designer means you must consider the hardware and subsystems that support your application. When designing software, brainstorm with the engineering team to figure out where failure is likely to occur. Typically, these are things like:

  • Database failures: What happens if the system can't retrieve key data like customer names and addresses, or if it can't authenticate passwords? What if the system can't write data to the database for a significant period of time?
  • Batched information transfer problems: If the system batches information and sends it several times a day instead of in real time, the resulting discrepancies can disorient users. Being explicit in the interface about how current the information is will at least keep users from getting concerned if they don't see their data updated immediately.
  • Latency: When developing client/server applications, don't discount the time it takes for the user's machine to render and operate the software. What happens if graphics don't load? Are there certain elements that should be loaded separately?
  • System crashes: What happens if a centralized server goes down? Backup and recovery systems are getting more robust all the time, but be sure you know what happens when the system switches to a different server.

Prioritize your list of likely points of failure to distinguish those that are critical and that should be addressed first. Make sure users can complete key transactions before trying to bulletproof the entire system.
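To make the database question above concrete, here is one possible answer to "what if the system can't write data for a while": hold the user's records locally and retry later, so a key transaction isn't lost. This is only a sketch under assumptions. `BufferedWriter` and the `write_to_database` callable are hypothetical, the callable is assumed to raise `ConnectionError` on failure, and a real system would also need to persist the pending queue.

```python
from collections import deque


class BufferedWriter:
    """Holds records locally while the database is unreachable."""

    def __init__(self, write_to_database):
        # Hypothetical callable; assumed to raise ConnectionError when the database is down.
        self._write = write_to_database
        self._pending = deque()

    def save(self, record):
        """Queue the record, then try to write everything that is waiting."""
        self._pending.append(record)
        return self.flush()

    def flush(self):
        """Write pending records in order; return True when none are left waiting."""
        while self._pending:
            try:
                self._write(self._pending[0])
            except ConnectionError:
                return False  # keep the record and try again later
            self._pending.popleft()
        return True

    @property
    def pending_count(self):
        return len(self._pending)
```

The return value of `save` and the `pending_count` property give the interface what it needs to tell users honestly that their entry has been received but not yet recorded.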

Don't block the exits

When you develop alternate ways for users to achieve their goals, make sure they are usable and accessible. Providing keyboard commands in case the user's mouse stops working is a good idea, but only if people can find them without a mouse.

Take it offline

Developing some real-world analogues for virtual transactions means users aren't left hanging if computer systems fail. If you can't take payments online, for example, offer a phone number or mailing address as a backup.

Just undo it

One of the great things about many software applications is the Undo command. Since people often make simple mistakes, giving them the ability to reverse their actions helps avoid serious consequences.

In some cases, certain Undo operations may be difficult to implement, such as reversing data that has been written to a database in real time. A less preferable alternative is to ask users to confirm key entries, giving them a chance to verify their actions before they take effect.
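For actions that can be reversed, one common approach is to record, alongside each action, a function that undoes it. The sketch below is a minimal example of that idea for a simple list of entries. The class name `UndoableList` and its methods are hypothetical, and it does not cover Redo or actions with side effects outside the list.

```python
class UndoableList:
    """A list of entries whose additions and removals can be reversed."""

    def __init__(self):
        self.entries = []
        self._undo_stack = []  # each item is a function that reverses one action

    def add(self, entry):
        self.entries.append(entry)
        # Undo runs in last-in, first-out order, so the entry added here
        # is the last element again by the time this function is called.
        self._undo_stack.append(lambda: self.entries.pop())

    def remove(self, entry):
        index = self.entries.index(entry)  # raises ValueError if the entry is absent
        self.entries.pop(index)
        # Remember where the entry was so undo can put it back in place.
        self._undo_stack.append(lambda: self.entries.insert(index, entry))

    def undo(self):
        """Reverse the most recent action; return False if there is nothing to undo."""
        if not self._undo_stack:
            return False
        self._undo_stack.pop()()
        return True
```

Because `undo` returns False when the stack is empty, the interface can disable the Undo command instead of letting users invoke something that does nothing.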

Typos ahppen

In a data entry interface, every typo degrades the quality of the information available. Some key ways to reduce the number of data errors include:

  • Reconcile redundant entries: When a user inputs data, check for similar entries and verify that they aren't duplicates. Check for misspellings that are common in your data domain (like Untied States, or Great Britian) and correct them automatically to avoid generating misleading entries.
  • Provide bounded entry fields: Users are more likely to mistype a value than to choose the wrong one from a list, so offer a list of valid values wherever you can. Provide a way for people to enter custom values in the event that the ones provided aren't sufficient.
  • Offer searching and browsing: Some errors will undoubtedly find their way into a large body of data. If your database thinks "The Beetles" performed Abbey Road, users might never find that particular classic. Don't rely solely on exact keyword matches to get users to their information. Present them with the closest matches and a way to browse through the results.

Revealing a typo is a lot less damaging than losing data, or customers, altogether.
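As a small sketch of presenting closest matches instead of relying on exact keywords, the example below uses `get_close_matches` from Python's standard `difflib` module. The function name `closest_titles`, the sample catalog, and the 0.6 similarity cutoff are assumptions for illustration; a production search would likely use a dedicated search engine.

```python
import difflib


def closest_titles(query, titles, limit=3, cutoff=0.6):
    """Return up to `limit` stored titles that most resemble `query`, best match first."""
    # Compare case-insensitively, but return the titles as they are stored.
    by_folded = {title.casefold(): title for title in titles}
    matches = difflib.get_close_matches(
        query.casefold(), list(by_folded), n=limit, cutoff=cutoff
    )
    return [by_folded[match] for match in matches]


# Hypothetical catalog containing the misspelled entry from the example above.
catalog = ["The Beetles", "The Byrds", "Pink Floyd"]
results = closest_titles("The Beatles", catalog)
```

Here a search for the correctly spelled band name still ranks the misspelled catalog entry first, so the user can find the record and someone can eventually fix the data.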

Put the assurance back in QA

In the rush to get a product to market, don't skimp on Quality Assurance. Makers of products like automobiles or hair dryers must submit their work to a rigorous set of tests before they are deemed safe. Most malfunctioning software may not physically harm anyone, but it can still cause damage.

Test your product early and often, allowing for collaboration between QA and design. A good tester will find ways to break a product that you never dreamed of. Exposing your product to a wider range of test cases will unearth failures that can be mitigated through design.

Reduce false failures

Sometimes just the perception of failure is enough to cause actual problems. Look for places where users might introduce errors and put in some protection. Examples of this might be:

  • If the application behaves slowly, users may repeat their actions to make sure their input isn't lost. This can cause simple errors (like sending an email twice), or in some cases can lead to more serious problems like inundating the system with duplicate transactions and causing it to malfunction.
  • Unclear confirmations leave users wondering if they completed their action. If your commerce system works flawlessly but users flood your customer service agents with calls to make sure their order was placed, your system is still failing.
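One way to protect against the repeated submissions described above is to attach a unique token to each form when it is displayed and to treat a second submission with the same token as a repeat, not a new order. The sketch below assumes such a token exists. `OrderDesk`, its starting confirmation number, and the in-memory storage are hypothetical stand-ins for a real order system.

```python
class OrderDesk:
    """Accepts orders and ignores repeated submissions of the same form."""

    def __init__(self):
        self._confirmations = {}  # submission token -> confirmation number
        self._next_number = 1000  # arbitrary starting point for this sketch

    def place_order(self, token, items):
        """Return (confirmation_number, is_new) for the given submission token."""
        if token in self._confirmations:
            # Same form submitted again: return the original confirmation
            # instead of creating a duplicate transaction.
            return self._confirmations[token], False
        number = self._next_number
        self._next_number += 1
        self._confirmations[token] = number
        return number, True
```

Returning the same confirmation number on a repeat also helps with the second problem: however many times users click, they see one clear answer that their order was placed.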

Many software designers see failure as a "hardware problem"—a fact of computer life. Something as simple as a momentary power outage can bring down a system for hours, leaving users stranded. And what can a lowly software designer do about power outages? While eliminating failure may be beyond the control of a software designer, it's far from a moot issue. In fact, designing around failure is one of the best ways designers can make software successful. Take time to consider the ways in which failure might occur and mitigate them as part of the design. Don't leave your users with Ctrl-Alt-Delete as the only escape route.