Deep Dive into Microsoft Project Conversation Learner

Most of the articles I write on here are about my own personal work and exploration, but given this post is about company IP I have to say:

Disclaimer: I work at Microsoft on the Project Conversation Learner team

With that out of the way, I’m super excited to finally talk about this project publicly. Given that Google I/O and Microsoft //Build happened during the same week, you might have missed the announcement, but Project Conversation Learner was released at //Build 2018 as a private preview.

You can view the official docs here: https://labs.cognitive.microsoft.com/en-us/project-conversation-learner

Notice it is a Cognitive Service Lab as opposed to an official Cognitive Service. All of the labs have the prefix Project and say “experimental”. To quote the official marketing text:

Labs provides developers with an early look at emerging Cognitive Services technologies. Early adopters who do not need market-ready technology can discover, try and provide feedback on new Cognitive Services technologies before they are generally available. Labs are not Azure services.

In other words, because this is a lab, it’s really important for us to get customers using the project. That is how we determine whether it enables customers in the way we expect, or learn that there are scenarios we’re missing that we can work on. I think the better customers understand the product, the more valuable feedback they can provide, and that is the reason I’m writing this article.

Purpose:

  1. Provide better understanding in order to receive better feedback

Bot development is relatively new and also rapidly growing in popularity. This means there are many developers entering this space at very different points on the experience spectrum. Some are not yet familiar with the problems they will face and wouldn’t know they needed something like Conversation Learner. There are also experienced bot devs who understand the problems but haven’t found a solution, or for whom it may not be clear how Conversation Learner solves them. In this article I hope to cover both the problems that exist due to limitations of the current tech/tooling out there and how Conversation Learner solves them. Specifically, we’ll go over what Conversation Learner is, why you should use it, and how it works.

What is it?

Diagram of different components related to building bots

You register your bot with the Azure Bot Service to make it available to customers. Users communicate through various applications, which we’re calling input channels, such as Teams, Slack, Skype, etc. Their messages pass through the Azure Bot Service, are normalized into a standard JSON format, and are forwarded to your bot’s endpoint, /api/messages. In order to return useful information to the users, your bot may query back-end services, such as using LUIS to recognize intent and then perhaps making a secondary query for information. The bot then constructs a response activity and returns it to the user.
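
To make the normalization step concrete, here is a minimal TypeScript sketch. The payload shapes (SlackLikeMessage, TeamsLikeMessage) and the NormalizedActivity fields are assumptions made up for illustration; they are not the real channel formats or the actual schema used by the Azure Bot Service.

```typescript
// Hypothetical raw payloads. Real channel formats differ and are not shown in this article.
type SlackLikeMessage = { channel: "slack"; user: string; text: string };
type TeamsLikeMessage = { channel: "teams"; from: { id: string }; body: { content: string } };
type ChannelMessage = SlackLikeMessage | TeamsLikeMessage;

// Hypothetical normalized shape that every channel message is converted into.
interface NormalizedActivity {
  type: "message";
  channelId: string;
  fromId: string;
  text: string;
}

// Convert a channel-specific payload into the single shape the bot understands.
function normalize(message: ChannelMessage): NormalizedActivity {
  if (message.channel === "slack") {
    return { type: "message", channelId: "slack", fromId: message.user, text: message.text };
  }
  return {
    type: "message",
    channelId: "teams",
    fromId: message.from.id,
    text: message.body.content,
  };
}
```

The point of the sketch is that the bot only ever deals with one shape, no matter which application the user typed into.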

In the local development flow we simulate the input from users by using the BotEmulator v3, which is an Electron application with a WebChat control. Notice this block overlaps the Input Channel and Bot Service because it is usually communicating with a bot on localhost and bypassing the Bot Service.

If we ignore Azure Bot Service for the moment, there are 3 main components you care about as a developer:

  1. UI to input/test your bot behavior
  2. SDK to facilitate building your Bot server
  3. Back-end services to give your bot richer behavior and more value to customers.

Let’s look a little deeper at the implementation of your bot. Your bot may use the botbuilder SDK and rely on other middleware such as the LuisRecognizer to abstract the work of calling the Language Understanding Cognitive Service.

Development with Project Conversation Learner is very similar in that it also has a UI, accompanying middleware ConversationLearner, and a cloud service to facilitate responding. The difference is that all 3 of these components were developed for each other to provide the best bot development experience. (Technically with the latest version of botbuilder it’s no longer middleware, but these are familiar terms for people building web services so I chose to re-use them since the

I will go into much more detail about what these extra pieces do in the “How” section, but just keep this mental image in your head and know that if you wanted to use Project Conversation Learner, it is intended to feel very familiar to your existing workflow. You would be installing a node package just like you would if you were using botbuilder. You would test your bot through a WebChat UI. You would be entering an API key to call into the back-end service just as if you were using another Cognitive Service such as LUIS.

Recap of the various components:

  1. Input Channels
    In other words, UI to communicate with your bot (sending messages, speech, etc).
    In this case we see examples of some well known applications such as Teams, Slack, Skype, or agents like Cortana etc.
  2. Azure Bot Service
    This performs two main functions.
    1. Register your bot to make it available to the channels
    2. Normalize the different types of activities / messages from the different channels into a standardized format it can send to your bot
    (This is specific to Azure, but you will likely have some middle layer here to do the translation as you want to make your bot available to a bunch of different applications without writing this logic yourself.)
  3. Bot Service
    Your bot is simply a web service that responds on the /api/messages endpoint. In this case, /api/messages is known because it’s the standard endpoint Azure Bot Service will forward messages to. Regardless of the endpoint, just know that you can leverage all the skills and background you have in building, testing, and deploying web services and apply them to building bots. Even if you register your bot with Azure Bot Service to get the benefits of the normalization and discovery, you can still host the actual service anywhere you like (AWS, Microsoft Azure, Google Cloud, etc.). You just provide the URL it needs to call.
  4. External Services
    It’s up to the implementation of your bot to decide what it calls before sending the HTTP response. Technically these services are optional, but without them it’s difficult to build a bot that is useful. Examples of these external services might be a language understanding service to help determine the user intent and another service to query your company inventory or perhaps search documents.
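
The recap above can be sketched as a few plain functions. This is a simplified illustration under stated assumptions, not code from the botbuilder SDK: recognizeIntent and queryInventory are hypothetical stand-ins for a language understanding service and an inventory service, and real calls to such services would be asynchronous HTTP requests.

```typescript
interface IncomingActivity {
  text: string;
}

interface OutgoingActivity {
  text: string;
}

// Stand-in for a language understanding service such as LUIS.
// The keyword match is purely illustrative and is not how LUIS works.
function recognizeIntent(text: string): string {
  return /stock|inventory/i.test(text) ? "CheckInventory" : "None";
}

// Stand-in for a secondary back-end query (hypothetical inventory lookup).
function queryInventory(): number {
  return 12;
}

// Build the response activity from whatever the external services returned.
function handleMessage(activity: IncomingActivity): OutgoingActivity {
  const intent = recognizeIntent(activity.text);
  if (intent === "CheckInventory") {
    return { text: `We have ${queryInventory()} items in stock.` };
  }
  return { text: "Sorry, I did not understand that." };
}

// Only the /api/messages path is handled; anything else is ignored.
function route(path: string, activity: IncomingActivity): OutgoingActivity | undefined {
  return path === "/api/messages" ? handleMessage(activity) : undefined;
}
```

Because the bot is just a function from an incoming activity to an outgoing one, it can be hosted and tested like any other web service.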

Why should I use it?

Assuming a tech stack of BotEmulator v3, botbuilder SDK, and LUIS, here are the main problems:

Problems

  • Code complexity as the bot scales to cover more inputs / exceptions
    Think about all the different nuances of interpreting language. As you continue to test the bot you will find cases where the bot should respond in a manner that doesn’t easily fit into the existing rules and intents you have already set up. Typical approaches use hierarchies of rules, which are difficult to manage.
  • Requires developers/code to make changes to bot design
    Services like LUIS are limited to determining the intent of the message, and the bot still has to translate that intent into an activity/message.
  • Barrier to improve bot and deploy changes
    The idea here is to be able to update bot behavior without having to redeploy code, which is always a potential risk. With services like LUIS you can update how it classifies intents and retrain that model to affect the behavior of the bot without changing code, but any time you want to add new intents, this requires code changes.
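
Here is a small TypeScript sketch of what these problems tend to look like in bot code. The intent names and responses are invented for illustration; the shape is what matters: every intent is mapped to a response by hand, and each edge case adds another nested rule.

```typescript
// The set of intents the code knows about. Adding a new intent means editing
// this type and the function below, then redeploying the bot.
type KnownIntent = "Greeting" | "OrderStatus" | "None";

// Hand-written mapping from a recognized intent (plus entities) to a response.
function respond(intent: KnownIntent, entities: Record<string, string>): string {
  if (intent === "Greeting") {
    return "Hello!";
  }
  if (intent === "OrderStatus") {
    // Nested rules like this accumulate as exceptions are found during testing.
    const orderId = entities["orderId"];
    if (orderId === undefined) {
      return "Which order do you mean?";
    }
    return `Looking up order ${orderId}.`;
  }
  return "Sorry, I did not understand that.";
}
```

Retraining the language model can change which intent a message maps to, but the responses themselves live in code like this.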

Solutions

  • Code what’s easy to code / Learn what’s easy to learn
    This helps reduce bot code to only the business logic. For example, restricting the set of bot actions available to users based on their role retrieved after authentication.
  • Can update how bot may respond to user without redeploying code
    The Project Conversation Learner service can hold bot responses instead of just intents, so you can make these types of changes to bot behavior without redeploying code.
  • Enables non-developers to “develop” bot
    Because a large part of bot behavior no longer requires code changes it opens up entirely new classes of contributors. This is very exciting! For example, customer support agents who deal with customers the bot could not handle can now look at where the bot struggled and make the fix so it doesn’t happen again. All while protecting the integrity of existing bot behavior.
    Or for product owners this removes the uncertainty and information loss of translating the expected user experience to code specification. With CL what you do with the bot during training is how the bot will behave.
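
The “code what’s easy to code, learn what’s easy to learn” split could look roughly like the sketch below. Everything in it is an assumption for illustration: the action names, the roles, and the scores object standing in for a learned model’s output. It is not the Conversation Learner API.

```typescript
interface BotAction {
  id: string;
  allowedRoles: string[];
}

// Hypothetical actions the bot can take, each restricted to certain roles.
const botActions: BotAction[] = [
  { id: "ShowBalance", allowedRoles: ["customer", "admin"] },
  { id: "RefundOrder", allowedRoles: ["admin"] },
];

// Coded business rule: which actions is this role allowed to trigger at all?
function allowedActions(role: string): BotAction[] {
  return botActions.filter((action) => action.allowedRoles.indexOf(role) !== -1);
}

// Learned part (stand-in): scores would come from a trained model. The code
// only picks the highest-scoring action among those the business rule allows.
function selectAction(role: string, scores: Record<string, number>): string | undefined {
  let best: string | undefined;
  let bestScore = -Infinity;
  for (const action of allowedActions(role)) {
    const raw = scores[action.id];
    const score = raw === undefined ? 0 : raw;
    if (score > bestScore) {
      best = action.id;
      bestScore = score;
    }
  }
  return best;
}
```

With this split, the role check stays in code where it is easy to express, while the choice among permitted actions is left to whatever was learned from training.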

In summary, this is a new way to build bots and train models on task-based conversations, and it solves some of the major problems with existing bot development that you might already be running into.

How does it work?

Static Bot

Static Bot

Bot with Language Understanding Service (LUIS)

Bot with LUIS

Bot with Project Conversation Learner

Bot with CL

Notice the question “How does CL predict actions?” in the diagram. It can predict actions because we have previously trained the model to understand this conversation. You may ask: what does training look like?

Bot with CL (Training)

Notice this diagram introduced two new optional sections:

  1. Entity Detection
    After the trainer inputs the expected user text, the system pauses after retrieving entities from LUIS and gives the trainer the opportunity to correct the entity prediction before we continue. If a change is needed, it will automatically update the LUIS app for you in the background as you continue working with your bot, so that the change is ready next time. This is one of those new developer experiences I mentioned above that reduces the amount of jumping between apps.
  2. Action Selection
    After the entity detection it’s time to determine how the bot should respond or, in CL terminology, what action the bot should take. Similarly to the entity detection loop, we initially predict what the action should be, but pause the system and give the trainer the opportunity to correct which action should be taken. If a change is needed, this will update the model to increase the likelihood of making this prediction in the future.
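
The two correction loops above can be sketched as a toy training step. This is only an illustration of the workflow: ToyTrainer, its methods, and the lookup-by-entity-names “model” are all assumptions made up for this example, and the real service learns a model rather than looking up stored examples.

```typescript
type Entities = Record<string, string>;

interface TrainingStep {
  userText: string;
  entities: Entities;
  action: string;
}

// Reduce a set of entities to a comparable key made of its sorted entity names.
function entityKey(entities: Entities): string {
  return Object.keys(entities).sort().join(",");
}

class ToyTrainer {
  private examples: TrainingStep[] = [];

  // Toy prediction: reuse the action of the latest example with the same entity names.
  predictAction(entities: Entities): string | undefined {
    const key = entityKey(entities);
    let predicted: string | undefined;
    for (const example of this.examples) {
      if (entityKey(example.entities) === key) {
        predicted = example.action; // later examples win
      }
    }
    return predicted;
  }

  // One training round: pause after entity detection, then after action selection.
  trainStep(
    userText: string,
    detected: Entities,
    correctedEntities?: Entities,
    correctedAction?: string
  ): TrainingStep {
    // 1. Entity detection: the trainer may replace what was detected.
    const entities = correctedEntities !== undefined ? correctedEntities : detected;
    // 2. Action selection: predict first, then let the trainer override.
    const predicted = this.predictAction(entities);
    const action = correctedAction !== undefined ? correctedAction : predicted;
    if (action === undefined) {
      throw new Error("No prediction available; the trainer must choose an action.");
    }
    const step: TrainingStep = { userText, entities, action };
    this.examples.push(step); // stored corrections influence later predictions
    return step;
  }
}
```

For example, after a trainer corrects the first turn to the entities { city: "Seattle" } and the action "ShowWeather", a later turn with a city entity is predicted as "ShowWeather" without any correction.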

Video:

Video walk through of article. Includes overview of Conversation Learner UI and building a bot with it.

Possible drawbacks of CL (As of August 2018)

  • Affects the development workflow / engineering system
    Because part of your bot’s behavior is based on the learned model stored in our service you can’t simply create a new branch in code to work on a new feature; however, we do allow taking snapshots of the models and tagging them similar to branches which can help alleviate the issues. We also allow exporting your model which allows you to save it in source control.
  • Making breaking changes to bot behavior
    As you input more dialogs into the system and later want to make a change which conflicts with existing behavior, the model can’t resolve the conflict on its own and you must manually correct the conflicting dialogs, which can be cumbersome.

Remember that this is a “Lab” product and a new approach to building bots. We definitely have a lot to learn, but hope you can help us.

