If you’ve ever heard me deliver a presentation on bots, one thing you’ve heard over and over is that a bot is just an app. There are numerous reasons I repeat this mantra, not the least of which is that it’s true - a bot is simply an app with a different interface.
It’s learning how to design for this new interface that causes the greatest struggle for developers, and it’s the leading cause of failed bot applications. Users are familiar with certain paradigms in, say, a web application. They understand drop-down menus and links, know they can close the entire browser by clicking the X, and can even use the much-reviled hamburger menu. When designing a bot, we need to ensure we meet our users’ expectations of how the application (again, it’s just an app) will behave.
Natural language processing (NLP) is near science fiction. I mean, think about it - you type in a sentence, and this bit of artificial intelligence can break it down, determining what you were trying to say, and extracting the key components of the utterance. That’s mind-blowing.
When developers see this for the first time, they’re often seduced into thinking they should use NLP as the bot’s logic, allowing the NLP to determine everything about what the user said, and have it (indirectly) call a simple function that consists of a couple of lines of code, and voilà - a bot is born. It’s only after bashing into wall after wall that developers discover NLP just isn’t built for this.
NLP shines at determining a high-level topic, and teasing out a few keywords or phrases, from an utterance. But language is hard, and NLP isn’t able to grok the full nuance of human conversation. Put simply, the technology isn’t there yet, and passing the Turing test isn’t the goal you should be trying to accomplish when building your bot.
Once your user realizes they’re talking to a bot, and not to a human, they’ll down-level, typing in the keywords and commands, trying to cut right to what they’re hoping to be able to do through the bot. By trying to make your bot more life-like, and replying to full human conversation, you’re only making your job, and the user’s experience, more challenging.
Use NLP as the entry point for a task your bot can perform, and to retrieve the first handful of values (or parameters). Then prompt the user with various questions to drill down to the specific operation they’re looking for, and collect the necessary information.
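To make that split concrete, here’s a rough sketch. The keyword matcher below is a toy stand-in for a real NLP service (which would return an intent plus any entities it could extract), and all of the names - detect_intent, book_table, and so on - are made up for illustration:

```python
# Toy stand-in for an NLP service: returns a high-level intent plus any
# entities it could pull from the utterance. Real services do this far
# better; the structure of the result is what matters here.
def detect_intent(utterance: str):
    text = utterance.lower()
    if "table" in text or "reservation" in text:
        entities = {}
        for cuisine in ("italian", "thai", "mexican"):
            if cuisine in text:
                entities["cuisine"] = cuisine
        return "book_table", entities
    return "unknown", {}

def next_prompt(intent: str, entities: dict):
    # NLP got us to the right task; prompts collect what's still missing.
    if intent != "book_table":
        return "Sorry, I didn't catch that. Try 'book a table'."
    if "cuisine" not in entities:
        return "What kind of food are you in the mood for?"
    if "party_size" not in entities:
        return "How many people in your party?"
    return "Booking your table now!"

intent, entities = detect_intent("I'd like to book a table for Italian food")
print(intent)                         # book_table
print(next_prompt(intent, entities))  # How many people in your party?
```

NLP identifies the task and grabs whatever parameters happen to be in the first message; everything after that is plain old prompting.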
Chat. It’s right there in the name. If we’re building a chat bot, we should handle everything via chat, and expect the user to type in these long sentences where they provide all the information we need, right? I mean, that makes sense, doesn’t it?
Well, not really.
For starters, many people don’t like to type, or don’t type all that quickly; I’ve spent many hours supporting developers as they work on labs, and I’m always amazed at the contortions people will twist themselves into just to avoid typing a few words, using and abusing copy/paste.
Many (most?) people are hunt-and-peck typists, using at most three fingers. And even for fast typists, this still leaves out one of the most commonly used keyboards - the phone - where two thumbs are the most anyone can use. Nobody’s going to tap out a ton of characters in that situation.
For a host of reasons, some of which I’ll be returning to later in this post, we should allow the user to perform many operations without having to type a lot of characters, or simply provide buttons whenever possible for the user to click or tap.
One of my favorite jokes about marriage is that it’s simply two people asking each other what they want for dinner. This is often how that conversation goes:
Karin (my wife): What do you want for dinner?
Me: Somewhere with good cocktails.
K: How about Siam Palace?
M: Well, they don’t have TVs.
K: How about Joe’s Grill?
M: Hmmm… Well, I’d really like Italian.
Pretty standard conversation. What you’ll notice about it is I didn’t tell Karin right away what it was I wanted. Or, more to the point, I really didn’t know what I wanted when the conversation started. I needed a little back and forth to help narrow down the choices to figure things out.
Users are very much the same way. No user is going to provide all the information necessary to perform an operation in the first message. Sometimes they don’t know the available options, and sometimes they just don’t know what they want up front. Your bot needs to be able to work with partial information, and to collect the missing pieces over the course of multiple questions.
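One common way to handle this is slot filling: track which pieces of information you still need, and ask for them one question at a time. A minimal sketch - the slot names and questions are invented for the example:

```python
# The bot keeps a small state dict per conversation: filled slots, plus
# which slot it last asked about. One question per turn until it has
# everything it needs.
QUESTIONS = {
    "cuisine":  "What kind of food would you like?",
    "features": "Anything the restaurant must have (TVs, cocktails)?",
    "time":     "What time would you like to eat?",
}

def handle_turn(state: dict, user_reply: str = None):
    """Store the reply in the slot we last asked about, then return the
    next question - or None once every slot is filled."""
    if user_reply is not None and state.get("awaiting"):
        state[state["awaiting"]] = user_reply
    for slot, question in QUESTIONS.items():
        if slot not in state:
            state["awaiting"] = slot
            return question
    state["awaiting"] = None
    return None  # all slots filled; go perform the search

state = {}
print(handle_turn(state))                    # asks about cuisine
print(handle_turn(state, "Italian"))         # asks about features
print(handle_turn(state, "good cocktails"))  # asks about time
print(handle_turn(state, "7pm"))             # None - ready to search
```

Note that nothing here cares whether the user supplied two of the three answers in their very first message - the loop simply skips slots that are already filled, which is exactly the partial-information behavior we’re after.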
Along the lines of a back and forth, it’s best to show a user what’s available. This might be a set of commands they could choose from, or it might be a list of restaurant features from the example above. Showing a list of options accomplishes the first goal of allowing the user to see what they can do or choose, but it also allows the user to avoid typing. Less typing, the better.
Imagine the following bot:
The user is now left to figure out what to do all on their own. As we discussed earlier, users are used to paradigms available in other types of applications that intuitively guide them towards what they’re hoping to accomplish. In the scenario above, we’ve provided none of that for our users. We’ve effectively returned all the way back to a command prompt, where the user is expected to know exactly what commands to type. This is terrible design.
You can provide guidance to the user in a lot of ways.
For starters, the welcome message should give samples of what the user can type in. This is especially powerful just before performing an operation on the user’s behalf, giving them a shortcut of sorts right back to that command.
Buttons are once again a good thing. :-) A list of buttons makes it very clear what a user is able to do in a particular scenario or state.
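Under the hood, a button prompt is just a constrained choice: you offer a fixed list, the channel renders it as tappable buttons, and you match the reply (tapped or typed) against that list. A rough sketch, with all names invented for illustration:

```python
# Most bot SDKs render a structure like this as buttons on channels that
# support them, falling back to a numbered list elsewhere.
def choice_prompt(question: str, choices: list):
    return {"text": question, "buttons": choices}

def parse_choice(reply: str, choices: list):
    """Match a typed or tapped reply to one of the offered choices."""
    normalized = reply.strip().lower()
    for choice in choices:
        if normalized == choice.lower():
            return choice
    return None  # no match - re-prompt rather than guess

options = ["Siam Palace", "Joe's Grill", "Luigi's"]
prompt = choice_prompt("Which restaurant?", options)
print(parse_choice("luigi's", options))     # Luigi's
print(parse_choice("McDonald's", options))  # None
```

The payoff is on the parsing side: because the reply almost always comes from the list you offered, validation is trivial, and the user never had to type at all.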
Respond to the word help. If there’s one thing every user in the world is going to expect to be able to type, it’s help. There might be other words, such as cancel or quit, they might expect as well. But help is pretty universal, and your bot should make it universal.
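One straightforward way to do that is to check for global commands on every incoming message, before any of the normal conversation logic runs. A minimal sketch - the command table and handler names are made up:

```python
# Words like "help" should work no matter where the user is in the
# conversation, so they're intercepted before the normal flow.
GLOBAL_COMMANDS = {
    "help":   "Here's what I can do: find restaurants, book a table...",
    "cancel": "Okay, I've cancelled that. What would you like to do?",
    "quit":   "Goodbye!",
}

def route_message(text: str, normal_handler):
    command = text.strip().lower()
    if command in GLOBAL_COMMANDS:
        return GLOBAL_COMMANDS[command]  # intercept, regardless of state
    return normal_handler(text)          # otherwise, business as usual

print(route_message("HELP", lambda t: "...normal flow..."))
```

Because the check runs first, help works even when the user is halfway through a series of prompts - which is exactly when they’re most likely to need it.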
This is a new interface for us all - for both our users and for us as developers. It’s going to take a little bit of learning from both sides to build good experiences. The key to remember as a developer is to work with the user, to always show the user what they’re able to do, and never expect them to type in more than a handful of words.