Deep Dive into Microsoft Project Conversation Learner

Most of the articles I write on here are about my own personal work and exploration, but given this post is about company IP I have to say:

Disclaimer: I work at Microsoft on the Project Conversation Learner team

With that out of the way, I’m super excited to finally talk about this project publicly. Given that Google I/O and Microsoft //Build happened during the same week, you might have missed the announcement, but Project Conversation Learner was released at //Build 2018 as a private preview.

You can view the official docs here: https://labs.cognitive.microsoft.com/en-us/project-conversation-learner

Notice it is a Cognitive Service Lab as opposed to an official Cognitive Service. All of the labs have the prefix Project and say “experimental”. To quote the official marketing text:

Labs provides developers with an early look at emerging Cognitive Services technologies. Early adopters who do not need market-ready technology can discover, try and provide feedback on new Cognitive Services technologies before they are generally available. Labs are not Azure services.

In other words, as a lab it’s really important for us to get customers using the project so we can determine whether it’s enabling them in the way we expect, or learn about scenarios we’re missing that we can work on. The better customers understand the product, the more valuable the feedback they can provide, and hence the reason I’m writing this article.

Purpose:

Bot development is relatively new and rapidly growing in popularity. This means many developers are entering this space at very different points on the experience spectrum. Some are not yet familiar with the problems they will face and wouldn’t know they needed something like Conversation Learner. There are also experienced bot devs who understand the problems but haven’t found a solution, or for whom it may not be clear how Conversation Learner solves them. In this article I hope to cover both the problems that exist due to limitations of the current tech/tooling out there and how Conversation Learner solves them. Specifically, we’ll go over what Conversation Learner is, why you should use it, and how it works.

What is it?

In order to understand why you should use Conversation Learner, I think you first have to understand a bit about the existing technology used to build bots. This provides the context for understanding what role the different components play in the system, and comparing them with Conversation Learner will help you understand how it solves certain issues.

[Diagram of the different components related to building bots]

You register your bot with the Azure Bot Service to make it available to customers. As users communicate through various applications, which we’ll call input channels, such as Teams, Slack, Skype, etc., their messages pass through the Azure Bot Service, are normalized into a standard JSON format, and are forwarded to your bot’s endpoint at /api/messages. In order for your bot to return useful information to users, it may make queries to back-end services, such as using LUIS to recognize intent and then perhaps a secondary query for information. The bot then constructs a response activity and returns it to the user.
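
As a rough illustration (not the exact wire format), the normalized activity your bot receives on /api/messages looks something like this. The values are made up and the real Bot Framework Activity carries more metadata:

```typescript
// A simplified sketch of a normalized incoming activity. Field names follow
// the Bot Framework Activity schema; the values here are invented.
const incomingActivity = {
  type: "message",                            // this one is a user message
  id: "0001",
  channelId: "slack",                         // which input channel it came from
  serviceUrl: "https://slack.botframework.com",
  from: { id: "user-123", name: "Alice" },    // the user
  recipient: { id: "my-bot", name: "MyBot" }, // your bot
  conversation: { id: "conv-456" },
  text: "What's the weather in Seattle?"      // the raw utterance
};
```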

In the local development flow we simulate the input from users by using the Bot Emulator (v3), which is an Electron application with a WebChat control. Notice this block overlaps the Input Channel and Bot Service in the diagram because it is usually communicating with a bot on localhost, bypassing the Bot Service.

If we ignore the Azure Bot Service for the moment, there are 3 main components you care about as a developer: the UI you test the bot through (the Bot Emulator), the SDK/middleware in your bot code (botbuilder plus recognizers like the LuisRecognizer), and the back-end cloud service that interprets user input (LUIS).

Let’s look a little deeper at the implementation of your bot. Your bot may use the botbuilder SDK and rely on other middleware such as the LuisRecognizer to abstract the work of calling the Language Understanding (LUIS) Cognitive Service.

Development with Project Conversation Learner is very similar in that it also has a UI, accompanying middleware (the ConversationLearner SDK), and a cloud service to facilitate responding. The difference is that all 3 of these components were developed for each other to provide the best bot development experience. (Technically, with the latest version of botbuilder it’s no longer middleware, but these are familiar terms for people building web services, so I chose to re-use them since the concepts are so similar.)

I will go into much more detail about what these extra pieces do in the “How” section, but keep this mental image in your head and know that if you wanted to use Project Conversation Learner, it is intended to feel very familiar to your existing workflow. You would install a node package just like you would if you were using botbuilder. You would test your bot through a WebChat UI. You would enter an API key to call into the back-end service just as if you were using another Cognitive Service such as LUIS.

Recap of the various components:

  • UI: the WebChat-based interface you test (and, with CL, train) the bot through
  • SDK: the node package your bot code installs (botbuilder / the ConversationLearner SDK)
  • Service: the back-end cloud service the SDK calls with your API key (LUIS / the Conversation Learner service)

Why should I use it?

Previously I mentioned that the UI, SDK, and service offered by Project Conversation Learner are all built for each other. They are also more powerful, as they are specifically designed for training models on conversations rather than on single inputs of text like existing technologies such as LUIS. This enables new developer experiences.

Assuming a tech stack of the Bot Emulator (v3), the botbuilder SDK, and LUIS, here are the main problems:

Problems

  • Developer feedback loop
    To implement a behavior change in your bot you must update multiple components. For example, updating LUIS to output a new intent and then updating the code to make use of this new intent. This results in a lot of jumping between testing input, adjusting the back-end, and then adjusting code.
  • Code complexity as the bot scales to cover more inputs / exceptions
    Think about all the different nuances of interpreting language. As you continue to test the bot you will find cases where the bot should respond in a manner that doesn’t easily fit into the existing rules and intents you have already set up. Typical approaches use hierarchies of rules, which are difficult to manage.
  • Requires developers/code to make changes to bot design
    Services like LUIS are limited to determining the intent of a message, and the bot code still has to translate that into an activity/message.
  • Barrier to improving the bot and deploying changes
    The idea here is to be able to update bot behavior without having to redeploy code, which is always a potential risk. With services like LUIS you can update how it classifies intents and retrain the model to affect the bot’s behavior without changing code, but any time you want to add new intents, code changes are still required.

Solutions

  • Reactive/linear thought process
    Some approaches to bot development are very proactive: you plan ahead for user inputs, slots/intents, and entities, then configure the different dependencies, write the bot code, and only then test your bot to see if it behaves as expected.
    With Project Conversation Learner you can make these decisions inline, because it allows you to build your bot as you interact with it.
  • Code what’s easy to code / learn what’s easy to learn
    This helps reduce bot code to only the business logic. For example, restricting the set of bot actions available to a user based on their role retrieved after authentication.
  • Can update how the bot responds to users without redeploying code
    The Project Conversation Learner service can hold bot responses instead of just intents, so you can make many types of changes to bot behavior without redeploying code.
  • Enables non-developers to “develop” the bot
    Because a large part of bot behavior no longer requires code changes, it opens up entirely new classes of contributors. This is very exciting! For example, customer support agents who deal with customers the bot could not handle can now look at where the bot struggled and make the fix so it doesn’t happen again, all while protecting the integrity of existing bot behavior.
    For product owners, this removes the uncertainty and information loss of translating the expected user experience into a code specification. With CL, what you do with the bot during training is how the bot will behave.

In summary, this is a new way to build bots and train models on task-based conversations, and it solves some of the major problems with existing bot development that you might already be running into.

How does it work?

Similarly to the “What is it?” section, I think it’s helpful to understand the relationship between the different components. Here we’ll look at the process and code for three different versions of the same bot: first a static bot, second a bot using a language service such as LUIS, and third a bot using Project Conversation Learner. This gives concrete examples of the journey a new developer would take as they grow the bot’s capabilities, as well as what the transition process looks like.

Static Bot

This is close to the standard hello world bot. It doesn’t call any external services. It simply uses regex to process input and responds with hard-coded messages.

[Code screenshot: Static Bot]
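
The original post embedded the code as a screenshot. As a stand-in, here is a minimal sketch of what such a static bot could look like with the botbuilder (v4) Node SDK and restify; the endpoint is the same /api/messages described earlier, and the regexes and canned replies are purely illustrative:

```typescript
import * as restify from 'restify';
import { BotFrameworkAdapter } from 'botbuilder';

// The adapter authenticates with the Bot Service (or the local emulator)
// and parses the incoming activity JSON into a TurnContext.
const adapter = new BotFrameworkAdapter({
  appId: process.env.MICROSOFT_APP_ID,
  appPassword: process.env.MICROSOFT_APP_PASSWORD
});

const server = restify.createServer();
server.listen(process.env.PORT || 3978);

server.post('/api/messages', (req, res) => {
  adapter.processActivity(req, res, async context => {
    if (context.activity.type === 'message') {
      const text = context.activity.text || '';
      // Hard-coded behavior: a couple of regexes mapped to canned replies.
      if (/^(hi|hello|hey)\b/i.test(text)) {
        await context.sendActivity('Hello! Ask me about the weather.');
      } else if (/weather/i.test(text)) {
        await context.sendActivity('It is always sunny in this hard-coded bot.');
      } else {
        await context.sendActivity(`Sorry, I don't understand "${text}".`);
      }
    }
  });
});
```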

Bot with Language Understanding Service (LUIS)

This is the next step up: the bot calls LUIS to understand the user’s intent and then responds with an appropriate message.

[Code screenshot: Bot with LUIS]
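
Again standing in for the screenshot, here is a hedged sketch of the message handler for the LUIS version, using the LuisRecognizer from botbuilder-ai. The intent names (Greeting, GetWeather), the City entity, and the configuration values are made up for illustration; this handler would plug into the same /api/messages endpoint shown above:

```typescript
import { TurnContext } from 'botbuilder';
import { LuisRecognizer } from 'botbuilder-ai';

// LUIS application details come from the LUIS portal; placeholders here.
const luis = new LuisRecognizer({
  applicationId: process.env.LUIS_APP_ID!,
  endpointKey: process.env.LUIS_ENDPOINT_KEY!,
  endpoint: 'https://westus.api.cognitive.microsoft.com'
});

async function onMessage(context: TurnContext): Promise<void> {
  // Send the raw utterance to LUIS; it returns scored intents and entities.
  const results = await luis.recognize(context);
  const intent = LuisRecognizer.topIntent(results);

  // The bot code still has to translate the intent into an actual response.
  switch (intent) {
    case 'Greeting':
      await context.sendActivity('Hello! Ask me about the weather.');
      break;
    case 'GetWeather': {
      const cities: string[] | undefined = results.entities['City'];
      const city = cities && cities.length ? cities[0] : 'your city';
      await context.sendActivity(`Looking up the weather for ${city}...`);
      break;
    }
    default:
      await context.sendActivity("Sorry, I didn't understand that.");
  }
}
```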

Bot with Project Conversation Learner

Notice the sequence is fairly similar, although instead of returning intents the service returns bot actions.

[Code screenshot: Bot with CL]
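
Finally, a sketch of the Conversation Learner version. This one is reconstructed from memory of the preview-era @conversationlearner/sdk samples, so treat the package name, option names, and methods (Init, recognize, SendResult) as approximate rather than an exact API reference; the keys and model ID are placeholders:

```typescript
import * as restify from 'restify';
import { BotFrameworkAdapter } from 'botbuilder';
import { ConversationLearner } from '@conversationlearner/sdk';

// Point the SDK at the Conversation Learner service with your LUIS authoring
// key, then load the model you trained in the CL UI. Option/method names
// follow the preview-era samples and may differ in later versions.
ConversationLearner.Init({ LUIS_AUTHORING_KEY: process.env.LUIS_AUTHORING_KEY! });
const cl = new ConversationLearner(process.env.CONVERSATION_LEARNER_MODEL_ID!);

const adapter = new BotFrameworkAdapter({
  appId: process.env.MICROSOFT_APP_ID,
  appPassword: process.env.MICROSOFT_APP_PASSWORD
});

const server = restify.createServer();
server.listen(process.env.PORT || 3978);

server.post('/api/messages', (req, res) => {
  adapter.processActivity(req, res, async context => {
    // Instead of returning an intent for the bot code to dispatch on, the
    // Conversation Learner service predicts the next bot action directly.
    const result = await cl.recognize(context);
    if (result) {
      return cl.SendResult(result);
    }
  });
});
```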

You might wonder how CL predicts these actions. It can because we have previously trained the model to understand this conversation. So what does training look like?

[Diagram: Bot with CL (Training)]

Notice this diagram introduced two new optional sections:

Video:

Video walkthrough of the article, including an overview of the Conversation Learner UI and building a bot with it.

Possible drawbacks of CL (As of August 2018)

As a lab we’re still actively improving the product, so I’m hesitant to list these as they will likely be obsolete soon, but here are some things to consider and that we’d appreciate feedback on:

  • Affects the development workflow / engineering system
    Because part of your bot’s behavior is based on the learned model stored in our service, you can’t simply create a new branch in code to work on a new feature; however, we do allow taking snapshots of the models and tagging them, similar to branches, which can help alleviate the issue. We also allow exporting your model, which lets you save it in source control.
  • Making breaking changes to bot behavior
    As you input more dialogs into the system, you may later want to make a change that conflicts with existing behavior. The model can’t resolve these conflicts on its own, and you must correct them manually, which can be cumbersome.

Remember that this is a “Lab” product and a new approach to building bots. We definitely have a lot to learn, but we hope you can help us.

Summary

I hope this has helped you understand Project Conversation Learner and how it allows a hybrid of code and ML to improve the development of dialogue managers. We also looked at how it solves some of the pain points of the current development workflow and opens up new possibilities. If you’re interested in exploring this tech, please go request an invite.

