Designing High-Throughput, Idempotent Message Processing Pipelines with Azure Service Bus and .NET

December 9, 2025 · Asad Ali

Most teams discover idempotency and high-throughput messaging the hard way: after they’ve corrupted data in production or melted a downstream system under load.
In this post, I’ll walk through how I design high-throughput, idempotent message processing pipelines using Azure Service Bus and .NET, based on years of running these patterns in production.

Why idempotent, high-throughput messaging changes how you design .NET systems

Let’s level-set the constraints first, because they drive almost every design decision:

  • Azure Service Bus is at-least-once delivery. Duplicates will happen. Network blips, lock expiry, client retries – all can cause redelivery.
  • High throughput means concurrency. You’ll run multiple consumers (competing consumers pattern), each with multiple concurrent handlers.
  • State mutation is where you get hurt. Idempotency is usually not about your message handler code; it’s about how many times you mutate your database or external systems.
  • Back-pressure is non-negotiable. Your queue will absorb spikes; your processing pipeline must not blindly hammer downstream dependencies.

Once you accept these, you stop asking “How do I avoid duplicates?” and start asking “How do I survive duplicates while keeping throughput high?”

Architecture insight: In every real system I’ve seen, the bottleneck is rarely Service Bus itself. It’s your database, your HTTP integrations, and your lack of idempotency around those calls.

Choosing the right Azure Service Bus primitives for high throughput

Before writing any .NET code, you need to shape the messaging topology around your workload. Azure Service Bus gives you just enough flexibility to shoot yourself in the foot if you don’t think it through.

Queues vs topics: point-to-point vs fan-out

  • Queue – single logical consumer group. Perfect for work queues, background processing, batch jobs.
  • Topic + subscriptions – pub/sub and fan-out. Multiple independent subscribers can process the same event stream differently.

For high-throughput pipelines:

  • Use queues for command-like “do this work” scenarios where exactly one processing lane is required.
  • Use topics when multiple services need to react to the same business event independently.

Standard vs Premium, partitioned vs non-partitioned

You’re trading cost, throughput, and latency:

  • Standard, non-partitioned – Pros: simpler, cheaper. Cons: lower throughput, no isolation. When I choose it: dev/test and low-volume workloads.
  • Standard, partitioned – Pros: higher throughput, better availability. Cons: no sessions across partitions. When I choose it: high throughput but cost-sensitive, no strict FIFO per key.
  • Premium – Pros: dedicated resources, predictable latency, availability zone support. Cons: more expensive. When I choose it: mission-critical, very high throughput, noisy neighbors unacceptable.

Tip: For anything business-critical that will obviously grow, start with Premium. The cost of migrating later (especially with sessions and duplicate detection) is far higher than the price delta.

Sessions: ordered, related message streams

Sessions give you FIFO per sessionId. Service Bus will ensure:

  • Messages with the same SessionId are processed in order.
  • Only one ServiceBusSessionReceiver has a given session at a time.

They’re a good fit when:

  • You’re processing a per-user or per-aggregate stream (e.g., all operations for OrderId).
  • Ordering inside that stream actually matters.

The catch: sessions hurt raw parallelism if you only have a few hot keys. You need enough distinct SessionId values to spread load across multiple consumers.
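
On the sending side, opting into sessions is just a matter of setting SessionId on the outgoing message. A minimal sketch, assuming a session-enabled queue named "orders-sessions" and an orderEvent payload (both illustrative):

await using var client = new ServiceBusClient(connectionString);
ServiceBusSender sender = client.CreateSender("orders-sessions");

var message = new ServiceBusMessage(BinaryData.FromObjectAsJson(orderEvent))
{
    MessageId = orderEvent.MessageId, // business message id, used for dedup later
    SessionId = orderEvent.OrderId    // FIFO within an order, parallelism across orders
};

await sender.SendMessageAsync(message);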

Partitioning and partition keys

Partitioned queues/topics in Standard tier distribute messages across partitions; each partition is an independent broker and store. You can optionally provide a PartitionKey to keep related messages together.

  • Use a stable, high-cardinality partition key (e.g., CustomerId) to spread load.
  • Don’t use the same fixed key for everything unless you like single-partition bottlenecks.
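
On the publish side this is a single property on the message; a minimal sketch, assuming an existing ServiceBusSender and a command payload with a CustomerId property:

var message = new ServiceBusMessage(BinaryData.FromObjectAsJson(command))
{
    MessageId = command.MessageId,
    PartitionKey = command.CustomerId // related messages land on the same partition
};

await sender.SendMessageAsync(message);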

Duplicate detection on the broker

Service Bus can do some deduplication for you via duplicate detection:

  • Configured per entity with a duplicate detection window (20 seconds to 7 days).
  • Uses MessageId history to discard duplicates sent within that window.

It’s useful, but not enough by itself:

  • Deduplication happens at send time: the broker drops re-sent messages whose MessageId it has already seen within the window.
  • It doesn’t protect you from consumer-side duplicates (e.g., message lock expiry and redelivery).
Warning: Don’t rely solely on Service Bus duplicate detection for business guarantees. Treat it as a helpful optimization, not your only line of defense.
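
If you create entities from code, duplicate detection is enabled at creation time; a minimal sketch using ServiceBusAdministrationClient (the queue name and window are illustrative, and RequiresDuplicateDetection can only be set when the entity is created):

using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient(connectionString);

await admin.CreateQueueAsync(new CreateQueueOptions("order-commands")
{
    RequiresDuplicateDetection = true,                          // cannot be flipped later
    DuplicateDetectionHistoryTimeWindow = TimeSpan.FromHours(1) // MessageId history kept for 1 hour
});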

Designing idempotent handlers in .NET: stop thinking in “try/catch” and start thinking in “state transitions”

Most idempotency failures I’ve debugged weren’t about exotic distributed systems theory. They were simple things like “we created the invoice twice” or “we charged the customer twice”.

The rule I drill into teams is:

If the same message arrives 10 times, your final system state must be the same as if it arrived once.

— hard-earned production lesson

Anatomy of a robust message handler

In .NET with Azure.Messaging.ServiceBus, your handler usually ends up shaped like this:

public class OrderCreatedMessage
{
    public string MessageId { get; set; } = default!; // business message id, not broker id
    public string OrderId { get; set; } = default!;
    public decimal Amount { get; set; }
    public DateTimeOffset OccurredAt { get; set; }
}

public class OrderCreatedHandler
{
    private readonly IIdempotencyStore _idempotencyStore;
    private readonly IOrdersService _ordersService;
    private readonly ILogger<OrderCreatedHandler> _logger;

    public OrderCreatedHandler(
        IIdempotencyStore idempotencyStore,
        IOrdersService ordersService,
        ILogger<OrderCreatedHandler> logger)
    {
        _idempotencyStore = idempotencyStore;
        _ordersService = ordersService;
        _logger = logger;
    }

    public async Task HandleAsync(
        OrderCreatedMessage message,
        ServiceBusReceivedMessage raw,
        CancellationToken cancellationToken)
    {
        // 1. Check idempotency first – fast path if already processed
        if (await _idempotencyStore.IsProcessedAsync(message.MessageId, cancellationToken))
        {
            _logger.LogInformation("Skipping already processed message {MessageId}", message.MessageId);
            return;
        }

        // 2. Attempt to mark as "in-progress" (idempotency guard)
        var acquired = await _idempotencyStore.TryAcquireAsync(message.MessageId, cancellationToken);
        if (!acquired)
        {
            _logger.LogInformation("Another worker is processing message {MessageId}", message.MessageId);
            return; // let the other worker finish; duplicate will be safe
        }

        try
        {
            // 3. Execute side effects (DB, HTTP, etc.)
            await _ordersService.CreateOrderAsync(message, cancellationToken);

            // 4. Mark as completed
            await _idempotencyStore.MarkCompletedAsync(message.MessageId, cancellationToken);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to handle message {MessageId}", message.MessageId);

            // optional: mark as failed with reason; let retry policy decide
            await _idempotencyStore.MarkFailedAsync(message.MessageId, ex, cancellationToken);
            throw; // surface to Service Bus processor so it can retry / DLQ
        }
    }
}

This pattern relies on a durable IIdempotencyStore that can coordinate between workers.

What makes a handler truly idempotent?

You must audit every side effect and make sure it’s either:

  • Idempotent by contract (e.g., PUT /resource/123 with same payload multiple times).
  • Protected by idempotency keys (e.g., payment provider APIs that accept an idempotency key).
  • Wrapped in your own deduplication logic using a durable store.

Things that are not naturally idempotent:

  • INSERT-only SQL operations without unique constraints.
  • External calls that create new resources without idempotency keys.
  • Email/SMS sending (users notice duplicates very quickly).
Anti-pattern: “We’re safe because we wrapped the handler in a try/catch and log errors.” You haven’t addressed idempotency at all – you’ve just improved observability of failure.

Deduplication strategies: getting to “exactly once effect”

You will not get exactly-once delivery from Service Bus. You can, however, get exactly-once effect if you control how side effects are applied.

Choosing your idempotency key

The most important design choice is what uniquely identifies the business operation. Options:

  • Business operation ID – e.g., PaymentId, OrderId. Good when 1:1 with the effect.
  • Client-generated message ID – a GUID generated by the publisher and carried through as MessageId in the payload.
  • Composite key – e.g., (OrderId, EventType, Version) when one aggregate receives multiple event types.

Use that as your primary idempotency key; don’t rely on the broker-assigned ServiceBusReceivedMessage.MessageId unless you’re very deliberate about it.

Implementing a durable idempotency store

You can implement IIdempotencyStore in a few ways; each has trade-offs.

1. SQL-based idempotency table (my default for most systems)

Example table:

CREATE TABLE MessageProcessingStatus
(
    MessageId       NVARCHAR(100) NOT NULL,
    Status          TINYINT       NOT NULL, -- 0=InProgress,1=Completed,2=Failed
    ProcessedAtUtc  DATETIME2     NULL,
    FailedReason    NVARCHAR(500) NULL,
    CONSTRAINT PK_MessageProcessingStatus PRIMARY KEY (MessageId)
);

And a simple implementation using EF Core-style pseudo code:

public class SqlIdempotencyStore : IIdempotencyStore
{
    private readonly MyDbContext _db;

    public SqlIdempotencyStore(MyDbContext db) => _db = db;

    public async Task<bool> IsProcessedAsync(string messageId, CancellationToken ct)
    {
        return await _db.MessageStatuses
            .AnyAsync(x => x.MessageId == messageId && x.Status == Status.Completed, ct);
    }

    public async Task<bool> TryAcquireAsync(string messageId, CancellationToken ct)
    {
        try
        {
            _db.MessageStatuses.Add(new MessageStatus
            {
                MessageId = messageId,
                Status = Status.InProgress,
                ProcessedAtUtc = null
            });

            await _db.SaveChangesAsync(ct);
            return true;
        }
        // IsUniqueConstraintViolation is your own helper, e.g. inspecting the inner
        // SqlException for error numbers 2627 / 2601 (PK or unique index violation)
        catch (DbUpdateException ex) when (IsUniqueConstraintViolation(ex))
        {
            // Someone else inserted it first; let them process
            return false;
        }
    }

    public async Task MarkCompletedAsync(string messageId, CancellationToken ct)
    {
        var entity = await _db.MessageStatuses.FindAsync(new object[] { messageId }, ct);
        if (entity is null) return; // or throw

        entity.Status = Status.Completed;
        entity.ProcessedAtUtc = DateTime.UtcNow;
        await _db.SaveChangesAsync(ct);
    }

    public async Task MarkFailedAsync(string messageId, Exception ex, CancellationToken ct)
    {
        var entity = await _db.MessageStatuses.FindAsync(new object[] { messageId }, ct);
        if (entity is null) return;

        entity.Status = Status.Failed;
        entity.FailedReason = ex.Message.Length > 500 ? ex.Message[..500] : ex.Message;
        entity.ProcessedAtUtc = DateTime.UtcNow;
        await _db.SaveChangesAsync(ct);
    }
}

This pattern leans on the primary key constraint to ensure only one worker “wins” the right to process a given message.

2. Redis / distributed cache

For ultra-low latency and simple “processed key” checks, Redis works well:

  • Use SETNX (set if not exists) to acquire – see the sketch after this list.
  • Set a TTL based on your retention requirements.

But beware:

  • Caches can be evicted; your guarantees are weaker than with a database.
  • You need to tune TTLs carefully to balance memory vs duplicate risk.
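
With StackExchange.Redis, that acquire step is a single conditional SET with an expiry; a minimal sketch, assuming a "msg:" key prefix and a one-day retention window:

using StackExchange.Redis;

public class RedisIdempotencyStore
{
    private readonly IDatabase _db;

    public RedisIdempotencyStore(IConnectionMultiplexer redis) => _db = redis.GetDatabase();

    public Task<bool> TryAcquireAsync(string messageId)
        // SET key value NX EX: returns false if another worker already set the key
        => _db.StringSetAsync($"msg:{messageId}", "in-progress", TimeSpan.FromDays(1), When.NotExists);
}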

3. Piggybacking on business tables

Sometimes the cleanest solution is not a separate idempotency table. For example:

  • Orders has a unique constraint on ExternalOrderId.
  • If you try to insert the same order twice, you get a constraint violation.

This can give you idempotent effects without a separate idempotency store – just treat unique constraint violations as “already done”.
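
In code, that is a plain insert where a duplicate key simply means "already done"; a minimal EF Core-style sketch reusing the IsUniqueConstraintViolation helper from the idempotency store above (the Order entity and ExternalOrderId column are illustrative):

try
{
    _db.Orders.Add(new Order { ExternalOrderId = msg.OrderId, Amount = msg.Amount });
    await _db.SaveChangesAsync(ct);
}
catch (DbUpdateException ex) when (IsUniqueConstraintViolation(ex))
{
    // The order already exists – this delivery is a duplicate, so there's nothing to do.
}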

Best practice: Model your domain tables with meaningful unique constraints. They are your strongest weapon in achieving exactly-once effect semantics.

Failures, retries, and poison messages: how to not burn the house down

Every high-throughput pipeline eventually meets its poison messages – payloads that always fail because of validation issues, data assumptions, or downstream invariants.

Understanding Service Bus delivery and DLQ behavior

With peek-lock mode (which you should use), the flow is:

  1. Message is locked for a duration (default 30s, up to 5 minutes).
  2. You process it, then explicitly CompleteAsync, AbandonAsync, DeferAsync, or DeadLetterAsync.
  3. If your handler throws or the lock expires, the message becomes available again and delivery count increments.
  4. Once MaxDeliveryCount is exceeded (default 10), Service Bus moves it to the dead-letter queue (DLQ).

Retry policies: transient vs permanent failures

I split errors into roughly two buckets:

  • Transient – timeouts, 5xx from downstream, throttling. Worth retrying with exponential backoff + jitter.
  • Permanent – validation errors, domain rule violations, bad data. Retrying blindly just delays the inevitable and increases noise.

I usually implement a layered strategy:

  1. Inside the handler: use Polly (or similar) to retry transient downstream calls a few times with short backoff.
  2. If still failing, throw; let Service Bus redeliver up to MaxDeliveryCount.
  3. After MaxDeliveryCount, message moves to DLQ – treat DLQ as an inbox for manual / semi-automated inspection.

private static readonly IAsyncPolicy _retryPolicy = Policy
    .Handle<HttpRequestException>()
    .Or<TimeoutException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt))); // 2, 4, 8s – add jitter in production

public async Task HandleAsync(OrderCreatedMessage msg, CancellationToken ct)
{
    await _retryPolicy.ExecuteAsync(
        token => _ordersService.CreateOrderAsync(msg, token),
        ct);
}

Explicit dead-lettering with context

Don’t always rely on MaxDeliveryCount. Sometimes you know immediately the message is hopeless (e.g., invalid schema). In that case:

await args.DeadLetterMessageAsync(
    args.Message,
    deadLetterReason: "ValidationFailed",
    deadLetterErrorDescription: "CustomerId missing");

That metadata makes DLQ triage way more efficient later.

Delayed retries via scheduled messages

For downstream systems that don’t like being hammered, you can use scheduled messages to implement delayed retries:

// schedule retry in 5 minutes
var scheduledMessage = new ServiceBusMessage(body)
{
    MessageId = originalMessageId,
    ApplicationProperties =
    {
        ["RetryAttempt"] = attempt + 1
    }
};

await sender.ScheduleMessageAsync(scheduledMessage, DateTimeOffset.UtcNow.AddMinutes(5));
Warning: Don’t reinvent a full-blown workflow engine on top of scheduled messages. Use them for simple backoff; if you need complex orchestrations, look at Durable Functions or a proper workflow engine.

Circuit breakers around fragile dependencies

If a downstream system is consistently failing, hammering it from 50 consumers in parallel won’t help. Wrap your outbound calls in a circuit breaker:

private static readonly IAsyncPolicy _circuitBreaker = Policy
    .Handle<Exception>()
    .CircuitBreakerAsync(
        exceptionsAllowedBeforeBreaking: 10,
        durationOfBreak: TimeSpan.FromMinutes(1));

public Task HandleAsync(OrderCreatedMessage msg, CancellationToken ct)
    => _circuitBreaker.ExecuteAsync(token => _ordersService.CreateOrderAsync(msg, token), ct);

Once open, you can choose to:

  • Abandon messages so they’re retried later.
  • Dead-letter quickly with a clear “Downstream unavailable” reason and move work to a separate recovery lane.
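
In practice that decision lives in the processor callback: Polly throws BrokenCircuitException once the circuit is open, so you can catch it and pick the abandon path explicitly. A minimal sketch, assuming the processor-level handler shape shown later in this post:

try
{
    await handler.HandleAsync(message, args.Message, args.CancellationToken);
    await args.CompleteMessageAsync(args.Message);
}
catch (BrokenCircuitException)
{
    // Downstream is known to be unhealthy: release the lock so the message is
    // redelivered later instead of hammering the dependency again right away.
    await args.AbandonMessageAsync(args.Message);
}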

Preserving ordering and causality without killing throughput

Ordering is expensive. I only guarantee it where the business truly demands it.

When ordering actually matters

Some realistic examples:

  • Balance updates for the same account.
  • State transitions on the same aggregate (e.g., Order Placed → Paid → Shipped).
  • Event streams that feed projections (where out-of-order events break invariants).

Where ordering doesn’t matter (but teams mistakenly think it does):

  • Independent operations across different customers or tenants.
  • Notification events that can be processed in any order.
  • Bulk ingestion tasks where each record is logically independent.

Using sessions for per-key ordering

For per-aggregate ordering, I often combine:

  • SessionId = AggregateId for strict ordering.
  • Enough aggregate IDs to keep many concurrent sessions open across workers.

In .NET, you’d use ServiceBusSessionProcessor:

var options = new ServiceBusSessionProcessorOptions
{
    MaxConcurrentSessions = 16, // how many sessions processed in parallel
    MaxConcurrentCallsPerSession = 1, // FIFO per session
    AutoCompleteMessages = false,
    PrefetchCount = 0
};

var processor = client.CreateSessionProcessor(queueName, options);

processor.ProcessMessageAsync += async args =>
{
    var sessionId = args.Message.SessionId;
    // handle the message for this session in order, then settle it explicitly
    await args.CompleteMessageAsync(args.Message);
};

processor.ProcessErrorAsync += args => Task.CompletedTask; // log properly in real code

await processor.StartProcessingAsync();

Partitioning and logical ordering

When you don’t need strict FIFO but want to avoid reordering within some scope, you can:

  • Use partition keys (Standard partitioned entities) so messages for a key stay on the same partition.
  • Accept eventual ordering: some minor reordering is tolerated, but all operations are still idempotent.
Architecture insight: A lot of teams over-constrain themselves with global ordering. If you design your aggregates properly, you often only need ordering per aggregate ID, which is much cheaper.

Throughput tuning in .NET: prefetch, concurrency, batch receive

Once correctness is nailed, you can safely turn the knobs for throughput.

ServiceBusProcessor basics

With Azure.Messaging.ServiceBus, the usual pattern is:

var client = new ServiceBusClient(connectionString);

var options = new ServiceBusProcessorOptions
{
    MaxConcurrentCalls = 32,
    AutoCompleteMessages = false,
    PrefetchCount = 128
};

var processor = client.CreateProcessor(queueName, options);

processor.ProcessMessageAsync += ProcessMessageHandler;
processor.ProcessErrorAsync += ErrorHandler;

await processor.StartProcessingAsync();

  • MaxConcurrentCalls – how many messages your process handles concurrently.
  • PrefetchCount – how many messages to buffer client-side to reduce round trips.
  • AutoCompleteMessages = false – you control CompleteMessageAsync once business logic is done.

Prefetch: use it, but don’t overshoot

Prefetch can significantly increase throughput by keeping a local buffer of messages.

  • Start with PrefetchCount ~ 2–4 × MaxConcurrentCalls.
  • Adjust based on latency vs memory trade-offs.
  • Be cautious with very long processing times; prefetching too much can lead to many locked-but-unprocessed messages.

Concurrency tuning: CPU vs I/O bound

Ask yourself: is your handler mostly CPU-bound or I/O-bound?

  • CPU-bound (heavy computation) – don’t set MaxConcurrentCalls much higher than your core count.
  • I/O-bound (HTTP, DB, storage) – you can often safely go higher and let async I/O do its thing.
Debugging insight: One of my pipelines looked “slow” because we’d limited MaxConcurrentCalls to 4 based on early tests. The handler was 95% I/O. Bumping to 64 immediately doubled throughput without stressing CPU.

Batch receive and send

For some workloads, especially ETL-style jobs, it’s beneficial to handle messages in batches:

  • Receive multiple messages with ReceiveMessagesAsync in a loop (when not using ServiceBusProcessor).
  • Use send batching (SendMessagesAsync with a collection) on the publisher side.

But remember: batching complicates per-message idempotency and error handling, so I use it where the cost savings are worth the complexity.
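
A minimal sketch of both sides, assuming an "order-commands" queue and a ProcessAsync helper (both illustrative):

// Publisher: send a batch in one round trip
ServiceBusSender sender = client.CreateSender("order-commands");
await sender.SendMessagesAsync(messages); // IEnumerable<ServiceBusMessage> built earlier

// Consumer: explicit batch receive instead of ServiceBusProcessor
ServiceBusReceiver receiver = client.CreateReceiver("order-commands");

IReadOnlyList<ServiceBusReceivedMessage> batch =
    await receiver.ReceiveMessagesAsync(maxMessages: 50, maxWaitTime: TimeSpan.FromSeconds(5));

foreach (var msg in batch)
{
    await ProcessAsync(msg);                  // per-message idempotency still applies
    await receiver.CompleteMessageAsync(msg); // settle each message individually
}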

Resiliency and observability: timeouts, idempotency metrics, tracing

If you can’t see your pipeline misbehaving early, you’ll only notice it once you’ve lost money or data.

Timeout discipline

Every external call in your handler must have:

  • A reasonable timeout (HTTP client, DB commands, etc.).
  • A retry policy (where appropriate).
  • A cancellation path tied to the message handling CancellationToken.

Long-running handlers increase the chance of message lock expiry, which leads to duplicates even if your code “looks fine”. Consider lock renewal for genuine long tasks, but avoid making every handler long-running by accident.
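
With ServiceBusProcessor, the SDK can renew locks in the background up to a ceiling you define; a minimal sketch (the queue name and the 10-minute ceiling are illustrative, the latter being an assumption about your longest legitimate handler):

var options = new ServiceBusProcessorOptions
{
    AutoCompleteMessages = false,
    MaxConcurrentCalls = 16,
    // The SDK keeps renewing the message lock until processing finishes or this limit is hit
    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(10)
};

var processor = client.CreateProcessor("order-commands", options);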

What to measure for idempotent pipelines

Aside from the usual queue length and DLQ depth, I track:

  • Idempotency hit rate – how many messages are skipped as already processed.
  • Duplicate attempt count – how many times the same key is seen attempted within a window.
  • Handler latency – including breakdown by status (success, transient failure, permanent failure).
  • Downstream error rates – per dependency.

Emit metrics from your IIdempotencyStore so you know when duplicate behavior is spiking (often a symptom of timeouts, lock expiry, or buggy publishers).
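
System.Diagnostics.Metrics makes this cheap to wire up; a minimal sketch (meter and counter names are illustrative):

using System.Diagnostics.Metrics;

public static class PipelineMetrics
{
    private static readonly Meter Meter = new("OrderProcessor.Messaging");

    public static readonly Counter<long> IdempotencySkips =
        Meter.CreateCounter<long>("idempotency.skipped", description: "Messages skipped as already processed");

    public static readonly Counter<long> DuplicateAttempts =
        Meter.CreateCounter<long>("idempotency.duplicate_attempts", description: "Acquire attempts that lost the race");
}

// inside the idempotency store / handler:
PipelineMetrics.IdempotencySkips.Add(1);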

End-to-end tracing with correlation IDs

For distributed tracing, I usually:

  • Propagate traceparent / tracestate (W3C Trace Context) in message headers.
  • Include a correlation ID (CorrelationId property or in ApplicationProperties).
  • Log the same correlation ID across HTTP, Service Bus, and DB logs.

In .NET, when sending:

var activity = Activity.Current; // from System.Diagnostics

var message = new ServiceBusMessage(body)
{
    MessageId = businessMessageId,
    CorrelationId = activity?.TraceId.ToString() ?? Guid.NewGuid().ToString()
};

if (activity != null)
{
    message.ApplicationProperties["traceparent"] = activity.Id;
}

And on the consumer side, start a new activity using traceparent and CorrelationId so traces stitch together in Application Insights or your tracing system.
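
A minimal consumer-side sketch (the ActivitySource name is an assumption; the processor callback wiring is as shown elsewhere in this post):

using System.Diagnostics;

private static readonly ActivitySource Source = new("OrderProcessor.Messaging");

// inside ProcessMessageAsync(ProcessMessageEventArgs args):
args.Message.ApplicationProperties.TryGetValue("traceparent", out var traceparent);
ActivityContext.TryParse(traceparent as string, null, out var parentContext);

using var activity = Source.StartActivity(
    "order-commands process", ActivityKind.Consumer, parentContext);

activity?.SetTag("messaging.message_id", args.Message.MessageId);
activity?.SetTag("messaging.correlation_id", args.Message.CorrelationId);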

Sample production-ready pipeline architecture

Let’s pull this together into a concrete example you can map to a real system: say an “Order Processing” backend.

High-level flow

+------------+     +---------------------+     +------------------------------+
|  API / UI  | --> |  OrderCommandTopic  | --> |  Order Processor Service(s)  |
+------------+     +---------------------+     +------------------------------+
                                                     |                 |
                                                     v                 v
                                        +-------------------+     +----------+
                                        | OrderEventsTopic  |     |  SQL DB  |
                                        +-------------------+     +----------+
                                                  |
                                                  v
                                        +------------------+
                                        |  Other services  |
                                        +------------------+

Focus on the Order Processor Service which consumes commands and mutates state.

Service Bus configuration

  • Tier: Premium.
  • Entity: order-commands queue (non-session, non-partitioned) or topic + subscription depending on use case.
  • Duplicate detection: enabled, 1–7 day window.
  • MaxDeliveryCount: tuned based on downstream stability (e.g., 10–20).

.NET worker layout

src/
  OrderProcessor.Worker/
    Program.cs
    Messaging/
      ServiceBusProcessorHostedService.cs
      OrderCreatedHandler.cs
      IIdempotencyStore.cs
      SqlIdempotencyStore.cs
    Infrastructure/
      MyDbContext.cs
      HttpClients/
      Logging/

In Program.cs:

var builder = Host.CreateApplicationBuilder(args);

builder.Services.AddDbContext<MyDbContext>(...);

builder.Services.AddSingleton(new ServiceBusClient(
    builder.Configuration["ServiceBus:ConnectionString"]));

builder.Services.AddScoped<IIdempotencyStore, SqlIdempotencyStore>(); // scoped: it depends on the scoped DbContext
builder.Services.AddScoped<OrderCreatedHandler>();                    // resolved per message scope in the hosted service

builder.Services.AddHostedService<ServiceBusProcessorHostedService>();

var app = builder.Build();

await app.RunAsync();

The hosted service wires up the processor:

public class ServiceBusProcessorHostedService : IHostedService
{
    private readonly ServiceBusClient _client;
    private readonly IServiceProvider _services;
    private ServiceBusProcessor? _processor;
    private readonly ILogger<ServiceBusProcessorHostedService> _logger;

    public ServiceBusProcessorHostedService(
        ServiceBusClient client,
        IServiceProvider services,
        ILogger<ServiceBusProcessorHostedService> logger)
    {
        _client = client;
        _services = services;
        _logger = logger;
    }

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        var options = new ServiceBusProcessorOptions
        {
            MaxConcurrentCalls = 32,
            AutoCompleteMessages = false,
            PrefetchCount = 128
        };

        _processor = _client.CreateProcessor("order-commands", options);

        _processor.ProcessMessageAsync += ProcessMessageAsync;
        _processor.ProcessErrorAsync += ErrorHandlerAsync;

        await _processor.StartProcessingAsync(cancellationToken);
    }

    private async Task ProcessMessageAsync(ProcessMessageEventArgs args)
    {
        using var scope = _services.CreateScope();

        var handler = scope.ServiceProvider.GetRequiredService<OrderCreatedHandler>();

        var body = args.Message.Body.ToObjectFromJson<OrderCreatedMessage>();

        await handler.HandleAsync(body, args.Message, args.CancellationToken);

        await args.CompleteMessageAsync(args.Message);
    }

    private Task ErrorHandlerAsync(ProcessErrorEventArgs args)
    {
        _logger.LogError(args.Exception,
            "Service Bus error. Entity: {EntityPath}, ErrorSource: {ErrorSource}",
            args.EntityPath, args.ErrorSource);

        return Task.CompletedTask;
    }

    public async Task StopAsync(CancellationToken cancellationToken)
    {
        if (_processor != null)
        {
            await _processor.StopProcessingAsync(cancellationToken);
            await _processor.DisposeAsync();
        }
    }
}

Key properties of this design:

  • Scoped DI per message via CreateScope() – clean lifetime management.
  • Explicit completion after handler success.
  • Centralized error handling in ProcessErrorAsync.
  • Separate idempotency store used inside OrderCreatedHandler.
Outcome: You can scale out this worker horizontally with many replicas. As load grows, increase MaxConcurrentCalls and replica count. Idempotency and DLQ behavior keep your data safe.

Operational habits, testing strategies, and the traps I’ve seen teams fall into

Operational best practices

  • Alert on DLQ growth – not just absolute size, but rate of increase.
  • Watch consumer lag – queue length vs historical baseline; use this to guide scale-out.
  • Separate “replay” tools – build a small internal tool to browse DLQs and re-submit messages after fixes.
  • Version your contracts – be explicit when breaking changes occur; remember that old messages may still be waiting in queues.

Testing idempotency and failure behavior

My rule for any critical handler: if we can’t confidently prove it’s idempotent under weird conditions, it’s not ready.

Tests I insist on:

  • Duplicate delivery tests – send the exact same message 2–5 times; assert final DB state is correct and no double side effects occurred (see the sketch after this list).
  • Lock expiry simulation – make the handler sleep past lock duration; ensure idempotency logic still protects state on redelivery.
  • Partial failure tests – make downstream call fail after DB changes; confirm idempotency handles re-run safely.
  • DLQ routing tests – ensure poison messages end up in DLQ with correct reasons and that your replay tooling works.
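
A minimal xUnit-style sketch of the first test, assuming the handler and EF Core context from earlier are wired up in the test fixture (the Orders set and ExternalOrderId column are illustrative):

[Fact]
public async Task Handling_the_same_message_twice_creates_exactly_one_order()
{
    var message = new OrderCreatedMessage
    {
        MessageId = Guid.NewGuid().ToString(),
        OrderId = "order-42",
        Amount = 100m,
        OccurredAt = DateTimeOffset.UtcNow
    };

    await _handler.HandleAsync(message, raw: null!, CancellationToken.None);
    await _handler.HandleAsync(message, raw: null!, CancellationToken.None);

    var orders = await _db.Orders.Where(o => o.ExternalOrderId == "order-42").ToListAsync();
    Assert.Single(orders);
}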

Common pitfalls I’ve debugged in production

  • Using broker MessageId as the only dedup key and accidentally regenerating IDs at each send – defeated duplicate detection completely.
  • Long-running synchronous handlers (blocking I/O, no timeouts) – causing locks to expire, leading to “random” duplicates.
  • Idempotency table in a different database than main state – leading to cross-DB distributed transaction issues and inconsistent writes.
  • Relying solely on retries for what were actually data quality issues – DLQ got flooded, downstream team got paged for no reason.
  • Global FIFO assumptions – making everything sequential and then wondering why throughput was terrible.
Anti-pattern: “We’ll just use a bigger instance size if we need more throughput.” Scaling up without fixing idempotency and concurrency design just accelerates data corruption and downstream meltdowns.

The bottom line

High-throughput, idempotent message processing with Azure Service Bus and .NET isn’t about a single magic feature. It’s the combination of:

  • Choosing the right Service Bus constructs (queues vs topics, sessions, partitioning, Premium vs Standard).
  • Designing idempotent handlers with a clear idempotency key and a durable store.
  • Using DLQs, retries, and circuit breakers to handle transient vs permanent failures.
  • Tuning concurrency and prefetch for your CPU/I/O profile.
  • Investing in observability – correlation IDs, metrics for duplicates, DLQ monitoring.

Get those right, and you can safely crank up load, scale out workers, and sleep a lot better when your traffic spikes. Ignore them, and your “fast” pipeline will be the fastest way to corrupt data that you’ve ever deployed.

If you’re building or refactoring a pipeline today, start by making one handler provably idempotent end-to-end. Once that’s solid, scale it out and apply the same patterns everywhere else.