Designing High-Throughput, Resilient Background Processing in ASP.NET Core with Azure Service Bus and the EF Core Outbox Pattern

December 2, 2025 · Asad Ali

Moving from a big, stateful, ACID monolith to microservices is where a lot of teams first get properly burned by background processing. In the monolith, you had an ambient transaction, a local queue table, and everything “just worked” most of the time. Once you bring in Azure Service Bus, multiple services, and async processing, you suddenly discover:

  • Messages are delivered at-least-once, not exactly-once.
  • Your handlers aren’t actually idempotent.
  • Ordering is occasionally violated during scale-out.
  • Poison messages keep retrying and hammering your dependencies.
  • You can’t use distributed transactions across SQL and Service Bus in a cloud-native way.

In this post I’ll walk through how I design high-throughput, resilient background pipelines in ASP.NET Core using:

  • Azure Service Bus (queues / topics, sessions, DLQs, duplicate detection)
  • EF Core + SQL as a transactional outbox
  • BackgroundService-based workers for draining the outbox and processing messages

The focus is on real production concerns: exactly-once effects, idempotency, duplicates, ordering, poison messages, and operational visibility.

The Problem Space: Background Processing in Modern ASP.NET Core Microservices

When teams split a monolith, they usually keep the original assumptions:

  • “We save to the database and then we publish an event.”
  • “Handlers are only invoked once.”
  • “If something fails, we just roll back.”

Those assumptions break down fast in a microservices + message broker world.

Where things go wrong in practice

Here are patterns I’ve seen repeatedly:

  1. Non-atomic state changes + message sends
    Service A writes to SQL and then publishes a message to Azure Service Bus. Either can fail independently:
    • DB commit succeeds, send to Service Bus fails → downstream never learns about the change.
    • DB commit fails, send to Service Bus succeeds → downstream acts on something that doesn’t exist.
  2. At-least-once delivery surprises
    Azure Service Bus makes no promise to deliver a message exactly-once. Network blips, client restarts, lock timeouts, and retries all lead to duplicate deliveries. If your handlers assume single execution, you get double-charges, duplicated emails, or corrupted aggregates.
  3. Parallelism vs ordering
    You scale your consumers out to multiple instances, max out PrefetchCount and MaxConcurrentCalls, and suddenly order-dependent flows (e.g., OrderCreated → OrderUpdated → OrderCancelled) process out of order.
  4. Ghost retries and poison messages
    A malformed or logic-breaking message keeps being retried until it hits the max delivery count and drops into the DLQ. If you don’t have DLQ handling and alerts, you never notice that a key business flow is effectively broken for a subset of messages.

All of this is normal behavior for a modern message broker. The failure is in design assumptions, not in Service Bus.

The good news: with an EF Core outbox and a properly designed worker model, you can get to very strong exactly-once effects and high throughput without falling back to distributed transactions.

Core Concepts: Exactly-Once Semantics, Idempotency, and the Outbox Pattern

Exactly-once effects vs exactly-once delivery

Azure Service Bus is an at-least-once delivery system: a given message may be delivered more than once, but it is not silently dropped (setting aside DLQ and TTL semantics). Trying to retrofit exactly-once delivery onto this model is painful, expensive, and usually not worth it.

What we really want is:

  • Exactly-once effects – the business outcome of processing is as if the message was handled once, even if it was delivered multiple times.
  • Atomicity between local state changes and message publication – when a service changes its state and emits events/commands, those must be consistent.

Idempotency

Idempotency is the property that applying the same operation more than once yields the same final state as applying it once. In message handlers, this usually means:

  • Detecting and ignoring duplicates using a message processing log / business key.
  • Designing updates to be “set to X” rather than “increment by 1”, where possible.
  • Using natural idempotency (e.g., INSERT ... ON CONFLICT DO NOTHING-style semantics, or upserts)

Idempotency is your safety net for the retries and duplicate deliveries that any at-least-once broker inevitably produces in order to stay reliable.

The EF Core transactional outbox

The outbox pattern, as documented in the Microsoft Architecture Center and used in eShopOnContainers, gives us atomicity between domain state and outgoing messages by:

  1. Writing domain state changes and outgoing messages to the same database in a single transaction.
  2. Storing outgoing messages in an OutboxMessages table (the “outbox”).
  3. Having a separate worker process (or background task) read from the outbox and publish to Azure Service Bus.
  4. Updating the outbox entry’s status once the publish succeeds (or marking it failed permanently after retries).

This eliminates the need for a distributed transaction between SQL and Service Bus. If the worker fails mid-publish, it can safely retry later because the messages are persisted and we can design the send side to be idempotent using message IDs + Service Bus duplicate detection.

Architecture Overview: ASP.NET Core + Azure Service Bus + EF Core Outbox

Let me outline a concrete architecture that has worked well in multiple production systems:

          HTTP / internal calls
                   |
                   v
+-------------------------------------+
|      ASP.NET Core API / Worker      |
|                                     |
|  Domain logic + SaveChangesAsync()  |
|  (one txn: domain rows + outbox)    |
|                                     |
|  +-------------------------------+  |
|  |     EF Core + SQL Server      |  |
|  |      - Domain tables          |  |
|  |      - OutboxMessages         |  |
|  +-------------------------------+  |
+------------------+------------------+
                   |
                   |  Outbox background worker
                   |  (claims and drains the outbox)
                   v
      +---------------------------+
      |     Azure Service Bus     |
      |     (Queues / Topics)     |
      +-------------+-------------+
                    |
                    v
      +---------------------------+
      |      Other Services       |
      |  (Consumers of messages)  |
      +---------------------------+

Key components:

  • ASP.NET Core API / Worker: where write operations occur (commands / mutations). These write domain state and append outbox records in the same transaction.
  • EF Core + SQL: your bounded context’s data store and the outbox table share the same database.
  • Outbox background worker: BackgroundService that polls the outbox, sends messages to Service Bus, and marks them as dispatched.
  • Azure Service Bus: provides durable, at-least-once message delivery, duplication detection, sessions for ordering, DLQs for poison messages, scheduled messages for delays, etc.
  • Downstream services: subscribe to topics/queues, handle messages idempotently, update their own local state and possibly append their own outbox entries for further propagation.

Notice that we’re pushing consistency one hop at a time: each service ensures its own internal consistency and then emits integration events. Global consistency is eventual, but it’s reliable.

Designing the EF Core Outbox: Schema, States, and Transaction Boundaries

The outbox design is where a lot of subtle bugs hide. Over the years I’ve converged on a set of principles that keep it sane and scalable.

Outbox table schema

A minimal, production-grade schema looks like this (SQL Server style):

 

CREATE TABLE OutboxMessages
(
    Id              UNIQUEIDENTIFIER    NOT NULL PRIMARY KEY,
    AggregateId     NVARCHAR(100)      NULL, -- For debugging/correlation
    MessageType     NVARCHAR(200)      NOT NULL, -- e.g. Namespace.IntegrationEvent
    Payload         NVARCHAR(MAX)      NOT NULL, -- Serialized JSON
    Metadata        NVARCHAR(MAX)      NULL, -- CorrelationId, CausationId, etc.

    CreatedAtUtc    DATETIME2(3)       NOT NULL,
    ProcessedAtUtc  DATETIME2(3)       NULL,

    Status          TINYINT            NOT NULL, -- 0=Pending,1=Processing,2=Succeeded,3=Failed
    RetryCount      INT                NOT NULL DEFAULT(0),
    LastError       NVARCHAR(2000)     NULL,

    LockId          UNIQUEIDENTIFIER   NULL,
    LockExpiresUtc  DATETIME2(3)       NULL
);

CREATE INDEX IX_OutboxMessages_Status_CreatedAt
    ON OutboxMessages(Status, CreatedAtUtc) INCLUDE (RetryCount);

Why this shape?

  • Status – lets the worker quickly filter by pending messages.
  • LockId / LockExpiresUtc – protects against multiple workers processing the same row concurrently (more on this later).
  • RetryCount / LastError – essential for diagnosing and eventually deciding that something is permanently broken.
  • Payload – I keep it as JSON to avoid coupling outbox schema to specific event types. Deserialization happens in the worker.

Persisting domain changes + outbox atomically

Application code should never call the Service Bus client directly inside a business transaction. Instead, you:

  1. Execute your domain logic.
  2. Generate one or more integration events (plain C# records).
  3. Convert those into OutboxMessage entities and add them to the DbContext.
  4. Call SaveChangesAsync() once inside an explicit transaction.

Example with EF Core (simplified):

public async Task PlaceOrderAsync(PlaceOrderCommand command, CancellationToken ct)
{
    await using var tx = await _db.Database.BeginTransactionAsync(ct);

    var order = new Order(/* ... */);
    _db.Orders.Add(order);

    var evt = new OrderPlacedIntegrationEvent(order.Id, order.CustomerId, order.Total);

    var outboxMessage = OutboxMessage.CreateFrom(evt, aggregateId: order.Id.ToString());
    _db.OutboxMessages.Add(outboxMessage);

    await _db.SaveChangesAsync(ct);
    await tx.CommitAsync(ct);
}

public sealed class OutboxMessage
{
    public Guid Id { get; private set; }
    public string? AggregateId { get; private set; }
    public string MessageType { get; private set; } = default!;
    public string Payload { get; private set; } = default!;
    public string? Metadata { get; private set; }
    public DateTime CreatedAtUtc { get; private set; }

    // Dispatch-tracking columns (see the schema above), mutated by the outbox dispatcher.
    public byte Status { get; set; }            // 0=Pending, 1=Processing, 2=Succeeded, 3=Failed
    public DateTime? ProcessedAtUtc { get; set; }
    public int RetryCount { get; set; }
    public string? LastError { get; set; }
    public Guid? LockId { get; set; }
    public DateTime? LockExpiresUtc { get; set; }

    private OutboxMessage() { } // EF

    public static OutboxMessage CreateFrom(object @event, string? aggregateId = null)
    {
        return new OutboxMessage
        {
            Id = Guid.NewGuid(),
            AggregateId = aggregateId,
            MessageType = @event.GetType().FullName!,
            Payload = JsonSerializer.Serialize(@event, @event.GetType()),
            Metadata = JsonSerializer.Serialize(new
            {
                AggregateId = aggregateId,
                CorrelationId = Activity.Current?.TraceId.ToString(),
                CausationId = Activity.Current?.SpanId.ToString()
            }),
            CreatedAtUtc = DateTime.UtcNow,
            Status = 0 // Pending
        };
    }
}

The key is: if the transaction commits, both the domain row and the outbox entry exist; if it rolls back, neither do. No more “ghost” events or lost messages.

Locking strategy for multiple workers

With high throughput, you’ll have multiple outbox workers across instances. You must make sure they don’t grab the same row. I avoid overly clever distributed locks and just use “claim & mark” semantics:

-- Claim a batch: Pending rows, plus Processing rows whose lock has expired
WITH claimable AS
(
    SELECT TOP (@batchSize) *
    FROM OutboxMessages WITH (UPDLOCK, READPAST, ROWLOCK)
    WHERE Status = 0 -- Pending
       OR (Status = 1 AND LockExpiresUtc <= SYSUTCDATETIME()) -- expired lock
    ORDER BY CreatedAtUtc
)
UPDATE claimable
SET Status = 1, -- Processing
    LockId = @lockId,
    LockExpiresUtc = DATEADD(SECOND, @lockTimeoutSeconds, SYSUTCDATETIME());

The worker then selects rows with its LockId, processes them, and updates status to Succeeded or Failed. If a worker crashes, locks eventually expire and another worker can pick them up.
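
If you drive this from EF Core, one low-risk way to wire it up is to run the claim statement as raw SQL and then load the claimed rows by LockId. A minimal sketch, assuming the AppDbContext and OutboxMessages mapping from this post (the batch size and lock timeout are illustrative):

// Claim a batch with raw SQL (UPDLOCK/READPAST lets parallel workers skip
// each other's rows), then load the claimed rows through EF Core.
var lockId = Guid.NewGuid();

await db.Database.ExecuteSqlInterpolatedAsync($@"
    WITH claimable AS
    (
        SELECT TOP (100) *
        FROM OutboxMessages WITH (UPDLOCK, READPAST, ROWLOCK)
        WHERE Status = 0
           OR (Status = 1 AND LockExpiresUtc <= SYSUTCDATETIME())
        ORDER BY CreatedAtUtc
    )
    UPDATE claimable
    SET Status = 1,
        LockId = {lockId},
        LockExpiresUtc = DATEADD(SECOND, 30, SYSUTCDATETIME());", ct);

// Only rows stamped with this worker's LockId belong to it.
var claimed = await db.OutboxMessages
    .Where(x => x.LockId == lockId && x.Status == 1)
    .OrderBy(x => x.CreatedAtUtc)
    .ToListAsync(ct);

EF Core passes the interpolated lockId as a SQL parameter, so the statement stays parameterized despite being raw SQL.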

Implementing Background Workers in ASP.NET Core for Outbox Draining

ASP.NET Core BackgroundService (or a separate Worker Service) is the natural home for outbox draining. I typically separate responsibilities into two workers:

  • Outbox dispatcher – reads from the outbox, sends to Service Bus.
  • Service Bus consumer – reads from Service Bus, runs business handlers.

You can co-host them in one process or separate them into distinct services – depends on deployment and scaling needs.

Outbox dispatcher skeleton

public sealed class OutboxDispatcher : BackgroundService
{
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<OutboxDispatcher> _logger;
    private readonly TimeSpan _pollInterval;

    public OutboxDispatcher(
        IServiceProvider serviceProvider,
        ILogger<OutboxDispatcher> logger,
        IOptions<OutboxOptions> options)
    {
        _serviceProvider = serviceProvider;
        _logger = logger;
        _pollInterval = options.Value.PollInterval;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                var processedAny = await DispatchBatchAsync(stoppingToken);

                if (!processedAny)
                {
                    await Task.Delay(_pollInterval, stoppingToken);
                }
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Outbox dispatcher loop failed");
                await Task.Delay(_pollInterval, stoppingToken);
            }
        }
    }

    private async Task<bool> DispatchBatchAsync(CancellationToken ct)
    {
        using var scope = _serviceProvider.CreateScope();
        var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
        var sender = scope.ServiceProvider.GetRequiredService<ServiceBusSender>();

        var lockId = Guid.NewGuid();
        var now = DateTime.UtcNow;
        var lockExpiry = now.AddSeconds(30);

        await using var tx = await db.Database.BeginTransactionAsync(ct);

        // Claim a batch: Pending rows, plus Processing rows whose lock has
        // expired (e.g., a dispatcher instance crashed mid-publish).
        // With several dispatcher instances, prefer the raw-SQL claim with
        // UPDLOCK/READPAST shown earlier, so two workers can't select the
        // same rows before either commits.
        var batch = await db.OutboxMessages
            .Where(x => x.Status == 0 ||
                        (x.Status == 1 && x.LockExpiresUtc <= now))
            .OrderBy(x => x.CreatedAtUtc)
            .Take(100)
            .ToListAsync(ct);

        if (!batch.Any())
        {
            return false;
        }

        foreach (var msg in batch)
        {
            msg.Status = 1; // Processing
            msg.LockId = lockId;
            msg.LockExpiresUtc = lockExpiry;
        }

        await db.SaveChangesAsync(ct);
        await tx.CommitAsync(ct);

        // Now send messages outside the DB transaction
        foreach (var msg in batch)
        {
            try
            {
                var serviceBusMessage = BuildServiceBusMessage(msg);
                await sender.SendMessageAsync(serviceBusMessage, ct);

                msg.Status = 2; // Succeeded
                msg.ProcessedAtUtc = DateTime.UtcNow;
                msg.LastError = null;
            }
            catch (Exception ex)
            {
                msg.Status = 3; // Failed (or keep as Processing and use RetryCount)
                msg.RetryCount++;
                msg.LastError = ex.Message.Length > 1900 ? ex.Message[..1900] : ex.Message;

                _logger.LogError(ex, "Failed to send outbox message {OutboxId}", msg.Id);
            }
        }

        await db.SaveChangesAsync(ct);
        return true;
    }

    private static ServiceBusMessage BuildServiceBusMessage(OutboxMessage msg)
    {
        var body = new BinaryData(msg.Payload);
        var sbMessage = new ServiceBusMessage(body)
        {
            MessageId = msg.Id.ToString(), // For duplicate detection
            ContentType = "application/json",
        };

        // Optional: propagate correlation metadata
        if (!string.IsNullOrWhiteSpace(msg.Metadata))
        {
            var meta = JsonSerializer.Deserialize<Dictionary<string, string?>>(msg.Metadata)!;
            foreach (var kv in meta)
            {
                if (kv.Value != null)
                {
                    sbMessage.ApplicationProperties[kv.Key] = kv.Value;
                }
            }
        }

        return sbMessage;
    }
}

Observations:

  • We split “claim & mark” and “publish” into two distinct steps to keep the DB transaction small.
  • We use the outbox message Id as the Service Bus MessageId to enable broker-side duplicate detection.
  • We handle per-message exceptions individually to avoid a single bad message blocking the entire batch.

You can refine the retry model by having separate states for transient vs permanent failure, but this skeleton is enough to start.
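
Hosting-wise, both workers are just hosted services registered at startup. Here's a rough sketch of the composition root; the queue name, configuration keys, and the OrderEventsConsumer worker are placeholders for illustration, not part of the code above:

using Azure.Messaging.ServiceBus;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

// EF Core context shared by domain code and the outbox dispatcher.
builder.Services.AddDbContext<AppDbContext>(o =>
    o.UseSqlServer(builder.Configuration.GetConnectionString("AppDb")));

// One long-lived ServiceBusClient per process, plus a sender for the outbox queue.
builder.Services.AddSingleton(_ =>
    new ServiceBusClient(builder.Configuration["ServiceBus:ConnectionString"]));
builder.Services.AddSingleton(sp =>
    sp.GetRequiredService<ServiceBusClient>().CreateSender("orders")); // placeholder queue name

// Outbox settings (poll interval, batch size, ...).
builder.Services.Configure<OutboxOptions>(builder.Configuration.GetSection("Outbox"));

// The two workers: outbox dispatcher and Service Bus consumer.
builder.Services.AddHostedService<OutboxDispatcher>();
builder.Services.AddHostedService<OrderEventsConsumer>(); // hypothetical consumer worker

await builder.Build().RunAsync();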

Configuring Azure Service Bus for High Throughput and Ordering Guarantees

Azure Service Bus has a lot of knobs. Misconfiguring them is a classic way to self-sabotage throughput.

Throughput considerations

  • Premium tier: For serious production workloads, pay for Premium. You get predictable latency, isolation, and the ability to scale throughput with messaging units.
  • Partitioned queues/topics: Partitioning spreads messages across multiple brokers for higher throughput and availability. For most high-volume workloads, turn it on unless strict ordering across the entire entity is required.
  • Prefetch & concurrency:
    • Increase PrefetchCount on receivers to reduce network round-trips.
    • Use MaxConcurrentCalls appropriately in the processor to match your CPU/IO balance.
  • Client factory: Reuse ServiceBusClient instances; they’re designed to be long-lived. Don’t create a new client per message.
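
To make the prefetch, concurrency, and client-reuse points concrete, here's roughly how I'd wire a processor. The queue name and numbers are illustrative starting points rather than recommendations, and HandleAsync stands in for the idempotent handler shown later:

// Reuse one ServiceBusClient per process; it owns the underlying AMQP connection.
var client = new ServiceBusClient(connectionString);

var processor = client.CreateProcessor("orders", new ServiceBusProcessorOptions
{
    // Prefetch reduces round-trips; keep it a small multiple of concurrency
    // so prefetched messages don't sit long enough for their locks to expire.
    PrefetchCount = 40,

    // Size concurrency against your slowest dependency (DB pool, external APIs).
    MaxConcurrentCalls = 16,

    // Complete on success, abandon on exception (the default behavior).
    AutoCompleteMessages = true
});

processor.ProcessMessageAsync += async args =>
{
    // Hand off to an idempotent handler; throwing here abandons the message,
    // and it is redelivered up to MaxDeliveryCount.
    await HandleAsync(args.Message, args.CancellationToken);
};

processor.ProcessErrorAsync += args =>
{
    // Log transport-level errors (lock lost, connection issues, throttling, ...).
    Console.Error.WriteLine($"Service Bus error: {args.Exception}");
    return Task.CompletedTask;
};

await processor.StartProcessingAsync();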

Ordering with sessions vs partitions

When you need ordering, you have to trade off some throughput flexibility.

  • Per-key ordering: use sessions.
    • Assign a SessionId (e.g., OrderId) to all messages in a logical sequence.
    • Service Bus guarantees in-order processing within a session.
    • You can still scale by processing many sessions in parallel, but each session is single-threaded.
  • Global ordering: basically incompatible with high throughput.
    • If you require strict total ordering, you effectively become single-threaded.
    • Most real systems relax this requirement to per-aggregate or per-business-key ordering.
  • Partitioned entities and ordering:
    • Partitioning breaks global ordering, but with sessions you still get order within a session.
    • So the usual pattern is: partitioned queue/topic + sessions per aggregate.
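
In code, per-key ordering comes down to stamping a SessionId on the send side and using a session processor on the receive side. A sketch along those lines (queue name, concurrency numbers, and HandleAsync are placeholders):

// Send side: all messages for one aggregate share a SessionId, so Service Bus
// delivers them in order within that session.
var message = new ServiceBusMessage(new BinaryData(payloadJson))
{
    MessageId = outboxId.ToString(),
    SessionId = aggregateId,          // e.g., the OrderId
    ContentType = "application/json"
};
await sender.SendMessageAsync(message);

// Receive side: a session processor handles many sessions in parallel,
// but processes each session's messages one at a time, in order.
var sessionProcessor = client.CreateSessionProcessor("orders", new ServiceBusSessionProcessorOptions
{
    MaxConcurrentSessions = 8,
    MaxConcurrentCallsPerSession = 1 // keep per-session ordering strict
});

sessionProcessor.ProcessMessageAsync += async args =>
{
    await HandleAsync(args.Message, args.CancellationToken);
};
sessionProcessor.ProcessErrorAsync += _ => Task.CompletedTask; // log in real code

await sessionProcessor.StartProcessingAsync();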

Duplicate detection

Azure Service Bus can automatically drop duplicate messages by MessageId within a configured time window.

  • Enable duplicate detection on queues/topics used by the outbox dispatcher.
  • Use the outbox Id as the MessageId.
  • Choose a window long enough to cover your worst-case retry horizons (e.g., hours, not seconds).

This is not a replacement for idempotency in handlers, but it reduces duplicate pressure significantly when send-side retries happen.
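
Duplicate detection is an entity-level setting that has to be enabled when the queue or topic is created (or in your IaC templates). A sketch with the administration client; the queue name, window, and delivery count are illustrative:

using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient(connectionString);

if (!(await admin.QueueExistsAsync("orders")).Value)
{
    await admin.CreateQueueAsync(new CreateQueueOptions("orders")
    {
        // Drop repeated MessageIds (our outbox Ids) seen within this window.
        RequiresDuplicateDetection = true,
        DuplicateDetectionHistoryTimeWindow = TimeSpan.FromHours(4),

        // Poison handling: after this many failed deliveries the message is dead-lettered.
        MaxDeliveryCount = 8

        // RequiresSession = true  // add this if per-aggregate ordering via sessions is needed
    });
}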

Handling Failures: Retries, Poison Messages, and Dead-Letter Queues

Background pipelines live and die by how they handle failure. You will have transient failures (timeouts, throttling) and persistent or data failures (invalid payload, broken invariants).

Retries in the outbox dispatcher

For the dispatcher, failures are usually transient (Service Bus is unavailable, network issues). I commonly:

  • Mark the message as Failed and increment RetryCount on failure.
  • Have a scheduled job or another pass that picks up failed messages with RetryCount < MaxRetries and reverts them to Pending after some delay (retry with backoff).
  • When RetryCount exceeds MaxRetries, mark as Dead (a distinct status) and alert. These are “application-level DLQ” entries.

This pattern is similar to Service Bus’s own MaxDeliveryCount and DLQ, but applied to outbox dispatching.
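
A sketch of what that sweep can look like; MaxRetries, the backoff curve, and the extra "Dead" status value (4) are assumptions layered on top of the schema above:

// Periodic pass: requeue retryable failures after a backoff, park the rest.
const int MaxRetries = 10; // assumed policy

var now = DateTime.UtcNow;
var failed = await db.OutboxMessages
    .Where(x => x.Status == 3) // Failed
    .OrderBy(x => x.CreatedAtUtc)
    .Take(500)
    .ToListAsync(ct);

foreach (var msg in failed)
{
    if (msg.RetryCount >= MaxRetries)
    {
        msg.Status = 4; // "Dead": application-level DLQ, alert and review manually
        continue;
    }

    // Use the expired lock timestamp as a rough "last attempt" marker.
    var lastAttempt = msg.LockExpiresUtc ?? msg.CreatedAtUtc;
    var backoff = TimeSpan.FromSeconds(5 * Math.Pow(2, msg.RetryCount));

    if (now - lastAttempt >= backoff)
    {
        msg.Status = 0; // back to Pending; the dispatcher will pick it up again
        msg.LockId = null;
        msg.LockExpiresUtc = null;
    }
}

await db.SaveChangesAsync(ct);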

Message handler retries and DLQ

On the Service Bus consumer side, I lean on the broker’s built-in features:

  • Use the processor client (ServiceBusProcessor / ServiceBusSessionProcessor).
  • Let it automatically abandon messages on exceptions (so they’re redelivered up to MaxDeliveryCount).
  • Configure MaxDeliveryCount appropriately (e.g., 5–10). Too low and you mark transient issues as poison; too high and you burn resources.
  • After MaxDeliveryCount is exceeded, messages go into the DLQ.

In one system, we initially set MaxDeliveryCount to 1 by mistake. A 30-second DB blip caused a chunk of messages to be DLQ’d instantly. We didn’t notice until business noticed missing events. That incident forced us to treat DLQ monitoring as a first-class operational responsibility, not a “later” thing.

Poison messages

Poison messages usually fall into categories:

  • Schema drift – consumer expects a field that doesn’t exist or has changed type.
  • Business invariant violations – message content is logically invalid for current rules.
  • Code bugs – handler null-refs, throws, or loops.

Your strategy:

  • Always log the MessageId, SessionId, and payload for failed messages.
  • Process the DLQ periodically (manual or automated) to inspect and, if safe, re-submit messages to the main queue (or a dedicated “repair” queue).
  • Consider adding a “sub-DLQ” concept at the application level to differentiate known-bad from unknown-bad data.

With Service Bus, you can move DLQ messages back to the main queue by peeking/receiving from the DLQ and sending a copy to the main queue, but do not do this blindly. You’ll just create an infinite poison loop.
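
When a set of DLQ'd messages has been inspected and judged safe to replay, the mechanics look roughly like this (the queue name is a placeholder; add your own filtering before the resend):

// Receive from the dead-letter sub-queue and resubmit copies to the main queue.
var dlqReceiver = client.CreateReceiver("orders", new ServiceBusReceiverOptions
{
    SubQueue = SubQueue.DeadLetter
});
var sender = client.CreateSender("orders");

var dead = await dlqReceiver.ReceiveMessagesAsync(maxMessages: 50, maxWaitTime: TimeSpan.FromSeconds(5));

foreach (var msg in dead)
{
    // Inspect before replaying: blindly resubmitting a true poison message
    // just restarts the poison loop with a fresh delivery count.
    var copy = new ServiceBusMessage(msg); // copies body, MessageId, and application properties

    await sender.SendMessageAsync(copy);
    await dlqReceiver.CompleteMessageAsync(msg); // removes it from the DLQ
}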

Ensuring Idempotency and Dealing with Duplicates Across Services

Even with outbox + duplicate detection, duplicates happen. Consumers must be written as if every message could be a repeat.

Per-consumer deduplication

I use a simple ProcessedMessages (or Inbox) table per consuming service:

CREATE TABLE ProcessedMessages
(
    MessageId       NVARCHAR(200)   NOT NULL PRIMARY KEY,
    ProcessedAtUtc  DATETIME2(3)    NOT NULL,
    Source          NVARCHAR(100)   NULL
);

Pattern for handlers:

  1. Begin a local transaction.
  2. Check if MessageId already exists in ProcessedMessages.
    • If yes → short-circuit, commit, and complete the message without re-executing the business logic.
    • If no → execute the handler logic, persist changes, insert into ProcessedMessages, commit.

This gives each consumer exactly-once effect semantics even with at-least-once delivery.

In EF Core:

public async Task HandleAsync(ServiceBusReceivedMessage sbMessage, CancellationToken ct)
{
    var messageId = sbMessage.MessageId;

    await using var tx = await _db.Database.BeginTransactionAsync(ct);

    // Note: the primary key on MessageId is the real guard. If two concurrent
    // deliveries race past this check, the second insert fails on commit, the
    // message is redelivered, and the retry short-circuits here as a duplicate.
    var alreadyProcessed = await _db.ProcessedMessages
        .AnyAsync(x => x.MessageId == messageId, ct);

    if (alreadyProcessed)
    {
        await tx.CommitAsync(ct);
        return; // Idempotent no-op
    }

    var body = sbMessage.Body.ToString();
    var evt = JsonSerializer.Deserialize<OrderPlacedIntegrationEvent>(body)!;

    // Business logic
    await _projection.ApplyOrderPlacedAsync(evt, ct);

    _db.ProcessedMessages.Add(new ProcessedMessage
    {
        MessageId = messageId,
        ProcessedAtUtc = DateTime.UtcNow,
        Source = "OrderService.Queue"
    });

    await _db.SaveChangesAsync(ct);
    await tx.CommitAsync(ct);
}

Idempotent handler design

Even with a ProcessedMessages table, still design handlers to be naturally idempotent where possible:

  • Use upserts (e.g., “set status to Shipped” rather than “add 1 to shippedCount”).
  • Include version numbers or sequence numbers in events and ignore out-of-order older versions when necessary.
  • When calling external APIs, use idempotency keys on the API side where supported.

This saved us in one system where a bug accidentally disabled the ProcessedMessages check for a subset of handlers. The handlers themselves were idempotent for most operations, so the impact was far smaller than it could have been.
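
To make "set to X" concrete: a projection handler that upserts absolute state and ignores stale versions can be replayed any number of times without damage. A sketch; the OrderSummary table and the event's Version property are assumptions, not part of the earlier code:

// Naturally idempotent projection update: absolute state + version guard.
public async Task ApplyOrderStatusChangedAsync(OrderStatusChangedIntegrationEvent evt, CancellationToken ct)
{
    var row = await _db.OrderSummaries.FindAsync(new object[] { evt.OrderId }, ct);

    if (row is null)
    {
        _db.OrderSummaries.Add(new OrderSummary
        {
            OrderId = evt.OrderId,
            Status = evt.NewStatus,
            Version = evt.Version
        });
    }
    else if (evt.Version > row.Version)
    {
        // "Set to X": replaying this event leaves the row in the same state.
        row.Status = evt.NewStatus;
        row.Version = evt.Version;
    }
    // else: duplicate or out-of-order older event -> no-op.

    await _db.SaveChangesAsync(ct);
}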

Observability and Operations: Metrics, Logging, and Alerting for Background Pipelines

Background processing is invisible to users until it fails. By the time support tickets appear, you’re usually investigating retroactively. You want to flip that dynamic: detect and fix issues before business notices.

Metrics to track

At a minimum, I push the following metrics to Application Insights / Azure Monitor:

  • Outbox dispatcher
    • Outbox pending count (DB query + custom metric).
    • Outbox dispatch rate (messages/sec).
    • Outbox dispatch failures (count, exception types).
    • Outbox message age percentiles (P50/P95/P99) – how long messages sit before being published.
  • Service Bus entities (native metrics)
    • Active message count.
    • DLQ message count.
    • Incoming/outgoing request rates.
    • Server errors vs user errors.
  • Consumers
    • Handler success/failure counts.
    • Processing latency (receive → handler complete).
    • Lock renewal counts (if you extend locks for long-running handlers).
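
For the custom outbox metrics, System.Diagnostics.Metrics is a convenient vehicle because both OpenTelemetry and the Azure Monitor exporters can listen to a Meter. A small sketch assuming .NET 8's IMeterFactory; the meter and instrument names are my own convention:

using System.Diagnostics.Metrics;

public sealed class OutboxMetrics
{
    private readonly Counter<long> _dispatched;
    private readonly Counter<long> _dispatchFailures;
    private readonly Histogram<double> _messageAgeSeconds;

    public OutboxMetrics(IMeterFactory meterFactory)
    {
        var meter = meterFactory.Create("MyCompany.Outbox"); // hypothetical meter name

        _dispatched = meter.CreateCounter<long>("outbox.dispatched");
        _dispatchFailures = meter.CreateCounter<long>("outbox.dispatch_failures");
        _messageAgeSeconds = meter.CreateHistogram<double>("outbox.message_age_seconds");
    }

    public void RecordDispatched(DateTime createdAtUtc)
    {
        _dispatched.Add(1);
        _messageAgeSeconds.Record((DateTime.UtcNow - createdAtUtc).TotalSeconds);
    }

    public void RecordFailure() => _dispatchFailures.Add(1);
}

The pending-count metric is easiest to produce as a periodic DB count pushed by a scheduled job or an observable gauge, since it lives in SQL rather than in process memory.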

Correlation IDs and traces

Observability without correlation is noise. I always:

  • Include CorrelationId and CausationId in message metadata.
  • Flow them as Activity / trace context into downstream handlers.
  • Log them in all error logs and key info logs.

With Application Insights’ distributed tracing, you can then see:

  • HTTP call into Service A → domain logic → outbox save.
  • Outbox dispatcher → Service Bus send.
  • Service B receiver → handler → DB changes.

This is invaluable when debugging out-of-sync states across services.

Alerting

I treat the following as red flags worthy of alerts (email/Teams/PagerDuty depending on criticality):

  • DLQ message count > 0 for any critical queue for more than a few minutes.
  • Outbox pending count > N (baseline-based threshold) for more than X minutes.
  • Outbox message age P95 > SLO (e.g., > 5 minutes).
  • Consumer failure rate > Y% over a rolling window.

In one project, a bad deployment broke JSON deserialization for one event type. DLQ count soared, our alert fired, and we rolled back in minutes. Without that, it would’ve been hours before business reported missing downstream data.

Common Pitfalls, Tuning Tips, and Production Hardening Checklist

Frequent mistakes

  • Sending directly to Service Bus inside business transactions
    • Leads to inconsistent state if the DB and broker succeed/fail independently.
    • Use the outbox instead; keep the transaction local.
  • Ignoring idempotency
    • Assuming “Service Bus will only deliver once” – it won’t.
    • Always build per-consumer deduplication via ProcessedMessages (or equivalent) and idempotent handlers.
  • Under-provisioned Service Bus tier
    • Using Basic/Standard for high-volume workloads and then fighting throttling and limits.
    • For serious work, start on Premium and resize messaging units as you learn.
  • Unbounded parallelism
    • Cranking MaxConcurrentCalls up without checking DB connection pool limits, downstream API rate limits, or CPU.
    • Always size concurrency against your slowest/most fragile dependency.
  • No backpressure / queue-based load leveling
    • Making the front-end depend on synchronous background work, defeating the whole purpose of queues.
    • Embrace eventual consistency; use the queue to absorb spikes.
  • Ignoring DLQs
    • Letting the DLQ silently grow.
    • DLQ count should be on a big red dashboard in your NOC.

Tuning tips

  • Start with conservative MaxConcurrentCalls and scale up while watching DB and CPU.
  • Experiment with PrefetchCount – too low and you have overhead, too high and you risk uneven work distribution and longer lock times.
  • Use batch operations where possible (e.g., SendMessagesAsync with message batches) from the outbox dispatcher (see the sketch after this list).
  • Keep payload sizes reasonable; consider compression or data minimization for very large messages.
  • For very hot paths, consider splitting into multiple queues / topics by domain (e.g., “orders”, “payments”, “notifications”) to isolate load.
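
For the batch-send tip above, ServiceBusMessageBatch enforces the broker's size limit for you. A sketch of flushing one claimed outbox batch through it (claimedOutboxMessages, sender, and ct come from the dispatcher context):

// Send a claimed outbox batch as one or more Service Bus batches.
ServiceBusMessageBatch batch = await sender.CreateMessageBatchAsync(ct);
try
{
    foreach (var msg in claimedOutboxMessages)
    {
        var sbMessage = new ServiceBusMessage(new BinaryData(msg.Payload))
        {
            MessageId = msg.Id.ToString(),
            ContentType = "application/json"
        };

        if (!batch.TryAddMessage(sbMessage))
        {
            // Current batch is full: flush it and start a new one.
            await sender.SendMessagesAsync(batch, ct);
            batch.Dispose();
            batch = await sender.CreateMessageBatchAsync(ct);

            if (!batch.TryAddMessage(sbMessage))
                throw new InvalidOperationException($"Outbox message {msg.Id} exceeds the maximum batch size.");
        }
    }

    if (batch.Count > 0)
    {
        await sender.SendMessagesAsync(batch, ct);
    }
}
finally
{
    batch.Dispose();
}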

Production hardening checklist

Before I call a background pipeline “production ready”, I roughly walk through this checklist:

  • Outbox
    • [ ] Domain writes and outbox entries are persisted in a single EF Core transaction.
    • [ ] Outbox table has proper indexes on Status, CreatedAtUtc.
    • [ ] Outbox dispatcher is horizontally scalable and uses safe claiming/locking.
    • [ ] Outbox message IDs are used as Service Bus MessageId with duplicate detection enabled.
  • Consumers
    • [ ] Each consumer has a ProcessedMessages/Inbox table.
    • [ ] Handlers are idempotent and defensive to out-of-order / repeated messages.
    • [ ] Business logic + processed log writes occur in a single transaction.
  • Service Bus configuration
    • [ ] Appropriate tier (Premium for high-throughput critical workloads).
    • [ ] Partitioning enabled where throughput/availability matter more than global ordering.
    • [ ] Sessions used where per-aggregate ordering is required.
    • [ ] Duplicate detection enabled with a sane window.
    • [ ] MaxDeliveryCount and TTL tuned per queue.
  • Resiliency
    • [ ] Retries with exponential backoff + jitter for external calls inside handlers.
    • [ ] Circuit breakers around flaky downstream dependencies.
    • [ ] Reasonable MaxConcurrentCalls to avoid hammering DBs/other APIs.
  • Observability
    • [ ] Metrics for outbox backlog, dispatch rate, and age.
    • [ ] Metrics for DLQ sizes per entity.
    • [ ] Traces that link HTTP requests to outbox entries, Service Bus messages, and consumer processing.
    • [ ] Alerts for DLQ growth, outbox backlog, and elevated failure rates.

The Bottom Line

You don’t need distributed transactions or exotic infrastructure to get reliable, high-throughput background processing in ASP.NET Core. With an EF Core outbox, Azure Service Bus, and disciplined idempotent handler design, you can achieve:

  • Atomicity between your database state and outgoing messages.
  • Exactly-once effects on the consuming side, even with at-least-once delivery.
  • Controlled ordering for flows that need it, via sessions.
  • Robust handling for poison messages using DLQs and outbox failure states.
  • Operational visibility into the health and performance of your background pipelines.

The key is to accept the realities of cloud messaging—retries, duplicates, partial failures—and design with them, not against them. The outbox pattern isn’t magic; it’s just careful use of the one place you still have ACID: your local database.

If your team is in the middle of tearing apart a monolith and struggling with “lost” events, double-processing, or weird inconsistencies between services, start by implementing a clean outbox + idempotent consumer model. In my experience, that’s where reliability jumps from “works most of the time” to “boringly predictable under load”.
