How to build a Scalable Background Processor using Azure Container Apps

Matt Mazzola
4 min read · Feb 18, 2023


There is a common scenario in application development where you want to perform an expensive computation on a set of data, but the processing is too costly to perform synchronously within the normal sequence of receiving a user action, processing data, and sending a response.

The cost of this process, such as its computational complexity and duration, makes it a better fit as an asynchronous process that runs in the background. This way, the user or application does not wait for the process to complete and return a response. It only initiates or triggers the process, which may finish at some later time. Background processes also expose other parameters you can control, such as how many instances can run concurrently.

There are many different methods and technologies to solve this problem. In this article, we will look at using Azure Container Apps with KEDA (Kubernetes Event-Driven Autoscaling).

My learning started from an interest in Dapr, which led me to Container Apps. From there I came across other resources about background processing and decided that building background processors with Container Apps was an important solution to understand and be able to reproduce as a software engineer.

Kubernetes, Dapr, and Container Apps are closely related. I don't have a background in Kubernetes, but from my observations, Container Apps is an abstraction built on top of Kubernetes with Dapr integrated. This makes it a very scalable and powerful service offering for many use cases.

Architecture Diagram

Let’s take a look at the architecture diagram of what we would be building.

We will have one client website and three background processors, which are “triggered” by receiving a queue message, then process SQL data and store the results.

Take a moment to review the legend and understand the connections in this diagram.

Architecture of the Background Processor built on Azure Container Apps

The basic sequence is as follows:

  1. The user interacts with the website to generate and store data in SQL.
  2. The user performs an operation on the website that requires background computation. This puts a message in the queue for that particular processor.
  3. The KEDA scaler observes the message and scales up the Background Processor container.
  4. The Background Processor reads the queue message, fetches data from SQL, processes it (in this case, a simple sum of values for demonstration), and stores the result in a separate table.
  5. The user may observe the results of the processing on the website.
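The sequence above can be sketched end to end in a few lines. The following is a minimal, self-contained simulation, with an in-memory queue and lists standing in for the Azure Storage Queue and the SQL tables; all names are illustrative, not the repository's actual code:

```python
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FakeQueue:
    """In-memory stand-in for an Azure Storage Queue."""
    messages: list = field(default_factory=list)

    def send_message(self, body: str) -> None:
        self.messages.append(body)

    def receive_message(self) -> Optional[str]:
        return self.messages.pop(0) if self.messages else None

# In-memory stand-ins for the two SQL tables
items = []    # Item table: values stored by the website
results = []  # Result table: computed sums

def add_item(value: int) -> None:
    """Step 1: the website generates and stores data in SQL."""
    items.append(value)

def trigger_processing(queue: FakeQueue, random_value: int) -> None:
    """Step 2: the website puts a message in the processor's queue."""
    queue.send_message(json.dumps({"randomValue": random_value}))

def run_processor(queue: FakeQueue) -> None:
    """Steps 3-4: the processor (scaled up by KEDA) reads the message,
    fetches the items, sums them, and stores the result."""
    body = queue.receive_message()
    if body is None:
        return
    message = json.loads(body)
    results.append({"sum": sum(items), "randomValue": message["randomValue"]})

# Walk through the sequence
queue = FakeQueue()
add_item(3)
add_item(4)
trigger_processing(queue, random_value=42)
run_processor(queue)
print(results)  # [{'sum': 7, 'randomValue': 42}]
```

In the real system, step 3 happens outside the application code entirely: KEDA observes the queue and starts the container, which then runs something like `run_processor`.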

Container Details

Here, I list some of the other technologies used in the different containers that may interest people looking through the code.

Client: Built using RemixJS + Prisma
Processor 1: Node + Storage Queue + Prisma
Processor 2: Poetry / Python + Storage Queue + Prisma
Processor 3: Node + Service Bus Queue + Prisma

It was likely implied through the use of the Prisma ORM, but the data is stored in Azure SQL, and all of the resources are deployed using Azure Bicep.

This article is more about demonstrating how these technologies are combined to create a valuable solution rather than offering some algorithmic understanding or novel application.

View of the Remix Application

The application was intended to be the simplest interface to demonstrate full capabilities of the processors.

  • A button to add an Item record to the table.
  • A button to add a Queue message with random value.
  • Two tables to display Items and Results
  • Result contains both the computed sum value AND the random value from the message to demonstrate the data is received.
View of Batch Processor Website

There are many resources out there on application development, so I won't spend much time on the website or Remix.

Setting up KEDA scaling

As mentioned above, all the processors are deployed as Docker containers using Azure Bicep. When authoring the Bicep template for the processor that uses the Storage Queue, you will refer to the Bicep schema and the Azure Storage Queue KEDA scaler schema.

{
  minReplicas: 0
  maxReplicas: 5
  rules: [
    {
      name: 'storage-queue-message'
      custom: {
        // https://keda.sh/docs/2.9/scalers/azure-storage-queue/
        type: 'azure-queue'
        metadata: {
          queueName: queueName
          queueLength: string(queueLength)
          activationQueueLength: string(activationQueueLength)
          connectionFromEnv: 'STORAGE_CONNECTION_STRING'
          accountName: storageAccountName
          cloud: 'AzurePublicCloud'
        }
        auth: [
          {
            secretRef: storageConnectionStringSecretName
            triggerParameter: 'connection'
          }
        ]
      }
    }
  ]
}

And for the Service Bus KEDA scaler:

{
  minReplicas: 0
  maxReplicas: 5
  rules: [
    {
      name: 'service-bus-queue-message'
      custom: {
        // https://keda.sh/docs/2.9/scalers/azure-service-bus/
        type: 'azure-servicebus'
        metadata: {
          queueName: queueName
          messageCount: string(messageCount)
          activationMessageCount: string(activationQueueLength)
          connectionFromEnv: 'SERVICE_BUS_NAMESPACE_CONNECTION_STRING'
          namespace: serviceBusNamespaceName
          cloud: 'AzurePublicCloud'
        }
        auth: [
          {
            secretRef: serviceBusConnectionStringSecretName
            triggerParameter: 'connection'
          }
        ]
      }
    }
  ]
}
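Roughly speaking, KEDA hands the queue length to a Kubernetes HPA that targets one replica per `queueLength` (or `messageCount`) messages, capped by `maxReplicas`; the activation threshold only governs scaling between zero and one replica. A simplified sketch of that math (the real HPA also applies stabilization windows and tolerances, which this ignores):

```python
import math

def desired_replicas(message_count: int,
                     target_per_replica: int,    # queueLength / messageCount in the scaler
                     activation_threshold: int,  # activationQueueLength / activationMessageCount
                     min_replicas: int = 0,
                     max_replicas: int = 5) -> int:
    # Below the activation threshold, the app can scale all the way to zero.
    if message_count <= activation_threshold:
        return min_replicas
    # Otherwise the HPA targets roughly one replica per `target_per_replica`
    # messages, clamped to the configured bounds.
    return max(1, min(max_replicas, math.ceil(message_count / target_per_replica)))

print(desired_replicas(0, 5, 0))    # 0 — idle, scaled to zero
print(desired_replicas(12, 5, 0))   # 3 — ceil(12 / 5)
print(desired_replicas(100, 5, 0))  # 5 — capped at maxReplicas
```

Scale-to-zero (`minReplicas: 0`) is what makes this pattern cheap: when the queues are empty, no processor containers run at all.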

Link to Repository

The full code has many other details you might be interested in learning about, such as:

  • How secrets are fetched and set on the containers
  • How Bicep deployments are scripted
  • How Azure SDKs for both Storage and Service Bus queues are used to query the count of messages, receive, and dequeue
  • How Prisma schema can be defined in the website and consumed across various processors
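To give a flavor of the queue-consumption pattern the processors follow (check the message count, receive, process, then delete), here is a hedged sketch against a tiny in-memory queue. The real code uses the Azure SDKs mentioned above; the class and method names here are illustrative only:

```python
import uuid
from typing import Optional, Tuple

class MiniQueue:
    """Tiny in-memory queue mimicking the receive/delete semantics shared by
    Azure Storage Queues and Service Bus queues (peek-lock mode)."""

    def __init__(self) -> None:
        self._pending = {}  # message id -> body

    def send(self, body: str) -> None:
        self._pending[str(uuid.uuid4())] = body

    def approximate_count(self) -> int:
        # KEDA polls a count like this to decide when to scale.
        return len(self._pending)

    def receive(self) -> Optional[Tuple[str, str]]:
        # Receiving hands out a message but does NOT remove it; it stays
        # queued until explicitly deleted, giving at-least-once delivery.
        for msg_id, body in self._pending.items():
            return msg_id, body
        return None

    def delete(self, msg_id: str) -> None:
        self._pending.pop(msg_id, None)

q = MiniQueue()
q.send("compute-sum")
received = q.receive()
assert received is not None
msg_id, body = received
# ... process the message here; if the processor crashed before deleting,
# the message would be delivered again on the next run ...
q.delete(msg_id)
print(q.approximate_count())  # 0
```

Deleting only after successful processing is the important design choice: a processor that crashes mid-computation leaves the message in place, so KEDA scales a replacement up and the work is retried.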

https://github.com/mattmazzola/batch-processor

Video Demonstration

Video demonstrating processors responding to queue messages

Perhaps this is not the most detailed article I've written, but I hope it inspires you to look into Azure Container Apps and how you can build a robust, scalable solution with relatively simple configuration.

Let me know what you think in the comments!
