There’s a very sharp line between “we have a queue and some workers” and a real, production-grade, high-throughput, idempotent pipeline. Most .NET teams discover that line the hard way—after a retry storm duplicates payments, a misconfigured outbox melts SQL Server, or a tiny bug in idempotency handling fans out into cross-tenant data corruption. In this article, I’ll walk through how I actually design and run Azure Service Bus + EF Core–backed pipelines in real systems: the patterns, the scars, and the trade-offs you only see once you’ve watched one of these pipelines fall over under load.
Where Message Pipelines Actually Fail: Duplicates, Ordering, and Consistency
Let’s ground this in the real failure modes first. Until you’ve mapped these out explicitly, you’re not designing—you’re gambling.
Duplicates are not a corner case; they are the normal case
Azure Service Bus is explicitly at-least-once delivery (Microsoft calls this out clearly). That means:
- Network blip after processing but before CompleteMessageAsync → same message redelivered.
- Worker process crash after DB commit but before completion → same message redelivered.
- Application-level retry logic that retries after committing.
If you don’t treat duplicates as first-class citizens, you WILL double-charge, double-ship, or double-provision something. The Azure Architecture Center’s Idempotent Receiver pattern exists for a reason.
Ordering is almost always weaker than you think
Teams intellectually know that queues don’t guarantee global ordering, but then they write code that assumes it anyway:
- They expect OrderCreated to be processed before OrderCancelled.
- They expect v2 of a command not to arrive before v1.
- They assume multiple competing consumers will “just work” with optimistic concurrency.
Service Bus sessions can give FIFO within a session, but that’s a very specific tool with its own scaling pain, not a magic global ordering feature.
Consistency: the dual-write cliff
The dual-write problem is the classic trap: update SQL, then publish to Service Bus (or vice versa) without a single atomic boundary. Azure’s documentation on the Outbox pattern spells this out clearly: if you’re updating your data store and sending messages as two independent operations, you will eventually get inconsistencies.
EF Core adds its own quirks here:
- Execution strategies (retries) can re-run parts of your pipeline unexpectedly.
- Concurrency exceptions can surface at SaveChangesAsync long after you thought business rules were done.
- Transactions cannot span SQL Server and Service Bus in a sane, cloud-native way (EF Core docs call this out).
So you have to choose: either let the broker be the source of truth and treat DB updates as side effects, or use the DB as the transactional source of truth and push messages out via an outbox. The latter is what we’ll focus on.
Delivery Semantics, Idempotency, and What Service Bus Actually Guarantees
Before we touch .NET code, we need to internalize what Azure Service Bus does—and does not—do for us.
Delivery semantics in real systems
- At-most-once: “We’ll try, but you might not see it.” Rarely acceptable for business-critical flows.
- At-least-once: “You will eventually see it, maybe more than once.” This is Service Bus.
- Exactly-once: In distributed systems, this is really “at-least-once + idempotent processing + deduplication.” As the Azure Architecture Center notes, so-called exactly-once is semantic, not infrastructural.
Greg Young, Udi Dahan, and Martin Fowler have all been beating this drum for a decade: stop chasing magical transport guarantees and design idempotent business operations instead.
What Azure Service Bus gives you out of the box
- Peek-lock mode: receive + lock, process, then Complete/Abandon/DeadLetter. If the lock expires or you don’t complete, the message is available again (duplicate territory).
- Duplicate detection: per-entity (queue/topic), message-id-based dedup window (20s–7d) that discards messages with the same MessageId (docs here).
- Transactions (within Service Bus): receive + complete + send can be atomic within the namespace, but you can’t include SQL Server in that transaction boundary.
- Sessions: FIFO + single consumer per session (grouped by SessionId).
- Prefetch + batching: the SDK can prefetch messages and process them concurrently.
- DLQ + deferral: robust tools for poison messages and delayed processing.
Note the key constraints:
- Duplicate detection is windowed and per entity; it’s not a global, permanent dedup safety net.
- Transactions don’t include your relational database.
- At-least-once semantics are fundamental; retries will cause duplicates.
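If you do lean on duplicate detection as that safety layer, it has to be enabled when the queue or topic is created. A minimal sketch with the administration client (Azure.Messaging.ServiceBus.Administration); connectionString, the queue name, and the window are illustrative, not taken from a real system:

using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient(connectionString);

var options = new CreateQueueOptions("orders-queue")
{
    RequiresDuplicateDetection = true,                              // broker-side dedup on MessageId
    DuplicateDetectionHistoryTimeWindow = TimeSpan.FromMinutes(10), // allowed range: 20s–7d
    MaxDeliveryCount = 10,                                          // retries before DLQ
    LockDuration = TimeSpan.FromMinutes(1)
};

await admin.CreateQueueAsync(options);

Remember that this only suppresses redeliveries of the same MessageId within the window; it does nothing for application-level replays outside it.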
Architectural Shapes for Idempotent Consumers in .NET
Let’s talk shapes, not code. Patterns first; implementation details later.
Classic queue + competing consumers
This is the bread-and-butter high-throughput pattern (Competing Consumers):
- One queue.
- Many worker processes/instances (Kubernetes pods, Azure App Service, Functions, etc.).
- Each worker pulls messages and processes independently.
Pros:
- Scales horizontally by adding workers.
- Keeps producers decoupled from consumers (Queue-Based Load Leveling pattern).
Cons:
- No ordering across workers.
- Contention on shared resources (SQL, caches) under high concurrency.
- Higher probability of observing concurrency races on the same aggregate.
Topics + subscriptions for fan-out
Same principles, but with multiple subscribers getting their own copy of messages. Each subscription is its own queue-like entity. Idempotency must be handled per-subscriber; each consumer sees its own duplicates and failure modes.
Sessions when ordering is non-negotiable
Sessions give FIFO semantics per SessionId, processed by a single consumer at a time. This is powerful, but comes with sharp edges:
- If you hot-spot a small number of sessions (e.g., a large tenant or a popular user), you bottleneck those sessions on a single consumer.
- If sessions are long-lived and locks expire, you get nasty reprocessing behavior.
- Scaling out is limited by the number of active sessions, not by the raw count of messages.
I only reach for sessions when I can clearly articulate the grouping key and its cardinality, and when I’ve thought through hot sessions and backpressure.
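When I do use them, the consumer changes shape: a ServiceBusSessionProcessor owns session locks and guarantees a single active consumer per SessionId. A minimal sketch, assuming client is a ServiceBusClient, the queue is session-enabled, the sender sets SessionId to the grouping key, and HandleAsync is a hypothetical handler:

var processor = client.CreateSessionProcessor("orders-sessions", new ServiceBusSessionProcessorOptions
{
    AutoCompleteMessages = false,
    MaxConcurrentSessions = 16,        // parallelism is bounded by active sessions, not message count
    MaxConcurrentCallsPerSession = 1   // preserves FIFO within each session
});

processor.ProcessMessageAsync += async args =>
{
    // args.SessionId is the grouping key (aggregate, tenant, stream, ...)
    await HandleAsync(args.Message, args.CancellationToken); // hypothetical handler
    await args.CompleteMessageAsync(args.Message, args.CancellationToken);
};

processor.ProcessErrorAsync += args =>
{
    Console.Error.WriteLine(args.Exception);
    return Task.CompletedTask;
};

await processor.StartProcessingAsync();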
Where to enforce idempotency in the architecture
You have three common options:
1. Transport-level dedup (Service Bus duplicate detection) – limited window, per-entity, good as a safety layer but not sufficient alone.
2. Application-level idempotency using a store – “idempotency keys” stored in SQL/Redis; each handler checks and records processing state.
3. Business-level idempotency – design operations (e.g., upserts, set-based updates) so that replaying the same command becomes a no-op or equivalent result.
In serious systems, I end up using (2) + (3). The transport can help, but I never trust it as the sole guardrail.
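For option (3), the cheapest wins come from writing state transitions so that replays converge. One way the Order.ApplyPayment method used later in this article might look — the _payments collection and Payment record are illustrative, not the article’s actual model:

public class Order
{
    private readonly List<Payment> _payments = new();
    public OrderStatus Status { get; private set; }

    public void ApplyPayment(Guid paymentId, decimal amount)
    {
        // Replay guard: this exact payment was already applied, so a duplicate event is a no-op.
        if (_payments.Any(p => p.PaymentId == paymentId))
            return;

        _payments.Add(new Payment(paymentId, amount));
        Status = OrderStatus.Paid; // re-setting the same status is harmless
    }
}

public record Payment(Guid PaymentId, decimal Amount);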
EF Core + SQL Server: Getting Transactional Consistency Without Hurting Throughput
This is where many teams faceplant. EF Core and SQL Server are powerful, but their defaults are not tuned for message pipelines by any stretch.
The dual-write problem in .NET clothes
The broken pattern looks like this:
public async Task HandleAsync(MyMessage msg, ServiceBusReceivedMessage sbMsg)
{
// 1. Update DB
var order = await _db.Orders.FindAsync(msg.OrderId);
order.Status = OrderStatus.Paid;
await _db.SaveChangesAsync();
// 2. Publish integration event
var integrationEvent = new OrderPaidIntegrationEvent(order.Id, ...);
var sbMessage = new ServiceBusMessage(JsonSerializer.Serialize(integrationEvent));
await _sender.SendMessageAsync(sbMessage);
// 3. Complete Service Bus message
await _receiver.CompleteMessageAsync(sbMsg);
}
Failure modes:
- DB commit succeeds, SendMessageAsync fails → system state changed but no downstream notification.
- Both succeed, but process crashes before CompleteMessageAsync → duplicate processing on retry (and possibly another publish).
- Retries around SaveChangesAsync can rerun business logic if you’re not careful with EF Core execution strategies.
This is exactly the dual-write problem the Azure Architecture Center warns against. You need a single transaction boundary where business state and outgoing messages are committed together. That’s the transactional outbox.
Execution strategies and hidden retries
EF Core’s connection resiliency is great but dangerous if misunderstood. The SQL Server provider can be configured with a retrying execution strategy (EnableRetryOnFailure) that automatically retries transient failures. When you combine that with an explicit transaction, the whole unit of work has to run inside the strategy, and a transient failure re-executes that entire operation block.
If that block includes side effects (e.g., HTTP calls, Service Bus sends), you’ve just made them non-idempotent by default. Patterns from Google SRE around idempotent RPCs apply here; don’t mix external side effects inside EF’s retryable execution strategy scope.
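A minimal sketch of keeping the retryable scope pure: only DB work goes inside the strategy, and external sends stay out (with the outbox, the dispatcher does the send anyway). _db, cmd, and ct are assumed from the surrounding examples; BuildOutboxMessage is a hypothetical helper:

var strategy = _db.Database.CreateExecutionStrategy();

// Everything inside ExecuteAsync may run more than once on a transient failure,
// so it must contain only DB work, which the transaction keeps atomic.
await strategy.ExecuteAsync(async () =>
{
    await using var tx = await _db.Database.BeginTransactionAsync(ct);

    var order = await _db.Orders.FindAsync(new object[] { cmd.OrderId }, ct);
    order!.MarkPaid(cmd.PaymentId, cmd.PaidAmount);
    _db.OutboxMessages.Add(BuildOutboxMessage(order)); // hypothetical helper

    await _db.SaveChangesAsync(ct);
    await tx.CommitAsync(ct);
});

// Service Bus sends, HTTP calls, etc. happen outside the retryable scope.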
Transaction isolation level choices
Most EF Core apps implicitly run under SQL Server’s default READ COMMITTED. In high-throughput messaging systems, I often explicitly choose:
- READ COMMITTED SNAPSHOT at the DB level to reduce reader-writer blocking.
- SERIALIZABLE only around narrow critical sections (e.g., enforcing idempotency insert uniqueness) via explicit transactions.
You don’t want every handler transaction to be SERIALIZABLE. That will annihilate concurrency under load.
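Escalating isolation only where it matters looks like this — a minimal sketch, assuming _db is the DbContext from the later examples. (READ COMMITTED SNAPSHOT itself is a one-time database-level setting, not something you toggle per transaction.)

// IsolationLevel comes from System.Data.
// Narrow critical section: escalate only where a race on the idempotency insert would
// otherwise slip through; everything else stays at the default isolation level.
await using var tx = await _db.Database.BeginTransactionAsync(IsolationLevel.Serializable, ct);

// ...check/insert ProcessedMessages plus the minimal business writes...

await _db.SaveChangesAsync(ct);
await tx.CommitAsync(ct);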
Implementing the Transactional Outbox with EF Core for Service Bus
Let’s put a robust pattern on the table. This is the approach that has saved me in multiple high-traffic systems.
Outbox schema and domain model
A realistic outbox table (trimmed for brevity):
public class OutboxMessage
{
public Guid Id { get; set; } // Outbox record ID
public string MessageId { get; set; } // Business-level event ID (idempotency)
public string Type { get; set; } // .NET type or logical event name
public string Payload { get; set; } // JSON payload
public DateTime CreatedUtc { get; set; }
public DateTime? ProcessedUtc { get; set; }
public string? Error { get; set; } // Last error, if failed
public int Attempts { get; set; }
}
public class AppDbContext : DbContext
{
public DbSet<OutboxMessage> OutboxMessages => Set<OutboxMessage>();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<OutboxMessage>(b =>
{
b.HasKey(x => x.Id);
b.Property(x => x.MessageId)
.IsRequired()
.HasMaxLength(200);
b.HasIndex(x => x.MessageId).IsUnique(); // Idempotency on producer side
b.HasIndex(x => x.ProcessedUtc);
b.Property(x => x.Type).HasMaxLength(500);
});
}
}
Writing to the outbox inside the same transaction
All state changes + outbox records are committed atomically:
public async Task HandleBusinessCommandAsync(PayOrderCommand cmd, CancellationToken ct)
{
// This might itself be invoked by a message handler or by an HTTP API.
using var tx = await _db.Database.BeginTransactionAsync(ct);
var order = await _db.Orders.FindAsync(new object[] { cmd.OrderId }, ct);
if (order == null) throw new DomainException($"Order {cmd.OrderId} not found");
order.MarkPaid(cmd.PaymentId, cmd.PaidAmount);
var integrationEvent = new OrderPaidIntegrationEvent(
cmd.OrderId,
cmd.PaymentId,
cmd.PaidAmount,
occurredAtUtc: _clock.UtcNow);
var outbox = new OutboxMessage
{
Id = Guid.NewGuid(),
MessageId = integrationEvent.Id.ToString(),
Type = integrationEvent.GetType().FullName!,
Payload = JsonSerializer.Serialize(integrationEvent, _jsonOptions),
CreatedUtc = _clock.UtcNow,
Attempts = 0
};
_db.OutboxMessages.Add(outbox);
await _db.SaveChangesAsync(ct);
await tx.CommitAsync(ct);
}
If SaveChangesAsync or CommitAsync fails, neither the order update nor the outbox record is persisted. We have a clear atomic boundary.
Dispatching the outbox in a background worker
A separate process (or background service) reads unprocessed outbox rows and publishes them to Service Bus. This is where you want careful batching and locking.
public class OutboxDispatcher : BackgroundService
{
private readonly IServiceProvider _services;
private readonly ServiceBusSender _sender;
private readonly ILogger<OutboxDispatcher> _logger;
public OutboxDispatcher(IServiceProvider services, ServiceBusSender sender,
ILogger<OutboxDispatcher> logger)
{
_services = services;
_sender = sender;
_logger = logger;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
try
{
using var scope = _services.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
// Fetch a batch for this iteration
var batch = await db.OutboxMessages
.Where(x => x.ProcessedUtc == null && x.Attempts < 10)
.OrderBy(x => x.CreatedUtc)
.Take(100)
.ToListAsync(stoppingToken);
if (!batch.Any())
{
await Task.Delay(TimeSpan.FromSeconds(1), stoppingToken);
continue;
}
// Build a Service Bus batch; flush and start a new one whenever it fills up
ServiceBusMessageBatch sbBatch = await _sender.CreateMessageBatchAsync(stoppingToken);
try
{
    foreach (var msg in batch)
    {
        var sbMessage = new ServiceBusMessage(msg.Payload)
        {
            MessageId = msg.MessageId,
            Subject = msg.Type,
        };
        if (!sbBatch.TryAddMessage(sbMessage))
        {
            // Current batch is full: flush it, then start a new one for the remaining messages
            await _sender.SendMessagesAsync(sbBatch, stoppingToken);
            sbBatch.Dispose();
            sbBatch = await _sender.CreateMessageBatchAsync(stoppingToken);
            if (!sbBatch.TryAddMessage(sbMessage))
            {
                throw new InvalidOperationException("Message too large for batch");
            }
        }
        msg.Attempts++;
        msg.ProcessedUtc = DateTime.UtcNow; // or set only after a successful send if you handle partial failures
    }
    await _sender.SendMessagesAsync(sbBatch, stoppingToken);
}
finally
{
    sbBatch.Dispose();
}
await db.SaveChangesAsync(stoppingToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error dispatching outbox messages");
await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);
}
}
}
}
Trade-offs:
- If you mark ProcessedUtc before a successful SendMessagesAsync, a crash between DB commit and broker send will drop messages. Safer is: update state after a successful send, but then make sure you handle duplicates downstream using MessageId and consumer idempotency.
- You must keep outbox growth under control with retention/cleanup jobs (see the sketch below).
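A minimal cleanup sketch, assuming EF Core 7+ (for ExecuteDeleteAsync) and a retention window long enough for debugging and replay; the OutboxCleanupJob name matches the solution layout shown later:

public class OutboxCleanupJob : BackgroundService
{
    private readonly IServiceProvider _services;

    public OutboxCleanupJob(IServiceProvider services) => _services = services;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            using var scope = _services.CreateScope();
            var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();

            var cutoff = DateTime.UtcNow.AddDays(-7); // retention window: tune to your needs

            // Set-based delete of already-dispatched rows; nothing is loaded into memory.
            await db.OutboxMessages
                .Where(x => x.ProcessedUtc != null && x.ProcessedUtc < cutoff)
                .ExecuteDeleteAsync(stoppingToken);

            await Task.Delay(TimeSpan.FromHours(1), stoppingToken);
        }
    }
}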
Designing Idempotent Handlers: Keys, Stores, and Concurrency
Now to the core: consumer-side idempotency. This is where most of the “exactly-once” semantics actually live.
Choosing the right idempotency key
Bad choice: “use the Service Bus MessageId” without thinking.
Better approach: design a business-level idempotency identifier:
- For commands: use a CommandId that the caller generates and passes through the system.
- For domain/integration events: use a stable EventId.
- For operations tied to a domain object: combine <AggregateId>:<LogicalOperation>:<OperationKey>.
Then embed that as both:
- MessageId on the Service Bus message (so the broker’s duplicate detection can help).
- A key in your idempotency store (SQL/Redis table keyed on that id).
Idempotency store in SQL with EF Core
Concrete example: recording processed events to make the handler idempotent.
public class ProcessedMessage
{
public Guid Id { get; set; }
public string MessageId { get; set; } = default!;
public string HandlerName { get; set; } = default!;
public DateTime ProcessedUtc { get; set; }
public string? ResultHash { get; set; }
}
public class AppDbContext : DbContext
{
public DbSet<ProcessedMessage> ProcessedMessages => Set<ProcessedMessage>();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<ProcessedMessage>(b =>
{
b.HasKey(x => x.Id);
b.Property(x => x.MessageId).IsRequired().HasMaxLength(200);
b.Property(x => x.HandlerName).IsRequired().HasMaxLength(200);
// Idempotency uniqueness per handler
b.HasIndex(x => new { x.MessageId, x.HandlerName }).IsUnique();
});
}
}
Now the message handler can do something like:
public class OrderPaidHandler
{
private readonly AppDbContext _db;
private readonly ILogger<OrderPaidHandler> _logger;

public OrderPaidHandler(AppDbContext db, ILogger<OrderPaidHandler> logger)
{
    _db = db;
    _logger = logger;
}
public async Task HandleAsync(OrderPaidIntegrationEvent evt, CancellationToken ct)
{
var handlerName = nameof(OrderPaidHandler);
using var tx = await _db.Database.BeginTransactionAsync(ct);
var alreadyProcessed = await _db.ProcessedMessages
.AnyAsync(x => x.MessageId == evt.Id.ToString() && x.HandlerName == handlerName, ct);
if (alreadyProcessed)
{
_logger.LogDebug("Skipping duplicate event {EventId} in {Handler}", evt.Id, handlerName);
await tx.RollbackAsync(ct); // No changes
return;
}
var order = await _db.Orders.FindAsync(new object[] { evt.OrderId }, ct);
if (order == null)
{
// Decide: DLQ? Compensate? For this example, just log and return.
_logger.LogWarning("Order {OrderId} not found for event {EventId}", evt.OrderId, evt.Id);
}
else
{
order.ApplyPayment(evt.PaymentId, evt.Amount);
}
_db.ProcessedMessages.Add(new ProcessedMessage
{
Id = Guid.NewGuid(),
MessageId = evt.Id.ToString(),
HandlerName = handlerName,
ProcessedUtc = DateTime.UtcNow
});
await _db.SaveChangesAsync(ct);
await tx.CommitAsync(ct);
}
}
Notice:
- Idempotency decision and business updates are in a single DB transaction.
- If two workers race on the same event, one will hit a unique index violation on ProcessedMessages. Your handler must treat that as “already processed” and swallow it.
Dealing with concurrency races and deadlocks
Under heavy competing consumers, you’ll see:
- Deadlocks on ProcessedMessages or business tables.
- Unique constraint violations when two workers process the same message concurrently.
Patterns that have worked well for me:
- Keep the transactional scope small: only the minimal reads/writes needed for that handler.
- Let the DB enforce idempotency (unique index) and catch DbUpdateException with unique-key error hints, treating it as “already processed” (see the sketch below).
- Use retry with backoff for deadlocks (SQL Server error 1205), but beware of amplifying duplicate effects—your idempotency guard must be robust.
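A minimal sketch of that guard, assuming Microsoft.Data.SqlClient (2601 and 2627 are SQL Server’s duplicate-key error numbers) and the _db, tx, evt, and ct from the handler above:

try
{
    await _db.SaveChangesAsync(ct);
    await tx.CommitAsync(ct);
}
catch (DbUpdateException ex) when (ex.InnerException is SqlException sql &&
                                   (sql.Number == 2601 || sql.Number == 2627))
{
    // Another worker won the race on the ProcessedMessages unique index:
    // the message has already been handled, so treat this as success.
    _logger.LogDebug(ex, "Duplicate processing detected for {MessageId}", evt.Id);
    await tx.RollbackAsync(ct);
}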
High-Throughput Mechanics: Prefetch, Parallelism, and Backpressure
This is where Service Bus + .NET either sings or melts. I’ve seen both.
ServiceBusProcessor tuning
Using Azure.Messaging.ServiceBus’s ServiceBusProcessor is usually the best path. Example setup:
services.AddSingleton(_ => new ServiceBusClient(config.ServiceBus.ConnectionString));
services.AddSingleton(sp =>
{
var client = sp.GetRequiredService<ServiceBusClient>();
var options = new ServiceBusProcessorOptions
{
AutoCompleteMessages = false,
MaxConcurrentCalls = 32, // tune based on CPU, DB, and workload
PrefetchCount = 256, // tune based on message size and latency
MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(5)
};
return client.CreateProcessor(config.ServiceBus.QueueName, options);
});
Then wire up:
public class ServiceBusWorker : IHostedService
{
private readonly ServiceBusProcessor _processor;
private readonly ILogger<ServiceBusWorker> _logger;
private readonly IServiceProvider _services;
public ServiceBusWorker(ServiceBusProcessor processor,
IServiceProvider services,
ILogger<ServiceBusWorker> logger)
{
_processor = processor;
_services = services;
_logger = logger;
}
public Task StartAsync(CancellationToken cancellationToken)
{
_processor.ProcessMessageAsync += OnMessageAsync;
_processor.ProcessErrorAsync += OnErrorAsync;
return _processor.StartProcessingAsync(cancellationToken);
}
public async Task StopAsync(CancellationToken cancellationToken)
{
await _processor.StopProcessingAsync(cancellationToken);
_processor.ProcessMessageAsync -= OnMessageAsync;
_processor.ProcessErrorAsync -= OnErrorAsync;
}
private async Task OnMessageAsync(ProcessMessageEventArgs args)
{
using var scope = _services.CreateScope();
var handler = scope.ServiceProvider.GetRequiredService<IMessageDispatcher>();
try
{
await handler.DispatchAsync(args.Message, args.CancellationToken);
await args.CompleteMessageAsync(args.Message, args.CancellationToken);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process message {MessageId}", args.Message.MessageId);
// Let Service Bus retry based on MaxDeliveryCount; eventually DLQ
await args.AbandonMessageAsync(args.Message, cancellationToken: args.CancellationToken);
}
}
private Task OnErrorAsync(ProcessErrorEventArgs args)
{
_logger.LogError(args.Exception, "Service Bus error. Entity: {EntityPath}", args.EntityPath);
return Task.CompletedTask;
}
}
How I actually tune MaxConcurrentCalls and PrefetchCount
Rules of thumb (and they’re only that):
- Start with MaxConcurrentCalls ~= 2–4x CPU cores if your handlers are mostly I/O bound.
- Start with PrefetchCount around 4–8x MaxConcurrentCalls for small messages; lower if messages are large.
- Watch SQL CPU, DTUs/vCores, and lock wait times. If the DB is saturated, stop increasing concurrency; instead, scale out horizontally.
Backpressure is crucial: if your DB is the bottleneck, more messages in-flight will only make it worse (thundering herd, more lock contention, more timeouts). Sam Newman and the Google SRE book both emphasize respecting bottlenecks; your pipeline needs to surface those.
Rare but real failure modes
- Hot partitions in SQL: A single aggregate ID being hammered by many messages (e.g., a popular tenant or product) creating localized contention and deadlocks.
- Lock renewal failures: Long-running handlers that don’t fit within lock duration, causing messages to reappear mid-processing. Design handlers to be short and offload heavy work to other async flows.
- Worker restarts + prefetch: If you prefetch aggressively and your pod restarts, many messages will become available again; ensure idempotency is bulletproof.
Ordering, Duplicates, and Poison Messages in Production
This is where theory meets paging.
Handling logical ordering in an at-least-once world
Common anti-pattern: assuming OrderCancelled won’t arrive before OrderCreated. In real systems, out-of-order delivery happens due to retries, DLQ replays, and parallelism.
Patterns that work:
- Design events to be self-contained and not rely on previous events being processed.
- Use versioned state in the DB, and treat outdated events as no-ops (event’s Version < current aggregate version); see the sketch below.
- Where strict ordering is mandatory, use Service Bus sessions and key by aggregate or logical stream, but be explicit about throughput limits.
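A minimal version-guard sketch, assuming the event and the aggregate both carry a monotonically increasing Version (hypothetical properties not shown in the earlier models) and a hypothetical Apply method:

var order = await _db.Orders.FindAsync(new object[] { evt.OrderId }, ct);

// Out-of-order or replayed event: the aggregate has already moved past this version.
if (order is not null && evt.Version <= order.Version)
{
    _logger.LogDebug("Ignoring stale event {EventId} (v{EventVersion} <= v{CurrentVersion})",
        evt.Id, evt.Version, order.Version);
    return;
}

order?.Apply(evt); // hypothetical state transition
await _db.SaveChangesAsync(ct);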
Poison messages and DLQ handling
Anything that fails repeatedly hits the DLQ once MaxDeliveryCount is exceeded. You MUST treat DLQ as a first-class citizen:
- Alert on DLQ depth.
- Provide tooling to inspect DLQ messages (correlation IDs, payload, exception traces).
- Have a replay mechanism that respects idempotency and rate limits (see the sketch below).
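Inspection and replay can sit on top of the SDK’s dead-letter sub-queue support. A minimal sketch, assuming client is a ServiceBusClient and that “replay” means re-sending to the main queue with the original MessageId preserved so dedup and consumer idempotency still apply:

var dlqReceiver = client.CreateReceiver("orders-queue", new ServiceBusReceiverOptions
{
    SubQueue = SubQueue.DeadLetter
});
var sender = client.CreateSender("orders-queue");

var dead = await dlqReceiver.ReceiveMessagesAsync(maxMessages: 50, maxWaitTime: TimeSpan.FromSeconds(5));

foreach (var msg in dead)
{
    // DeadLetterReason / DeadLetterErrorDescription explain why the message landed here.
    Console.WriteLine($"{msg.MessageId}: {msg.DeadLetterReason}");

    // Replay: clone the payload, keep MessageId so idempotency guards still recognize it.
    var replay = new ServiceBusMessage(msg.Body)
    {
        MessageId = msg.MessageId,
        Subject = msg.Subject
    };
    await sender.SendMessageAsync(replay);
    await dlqReceiver.CompleteMessageAsync(msg);
}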
In one system, we actually had a specific bug: a schema change caused deserialization failures only for a certain event type. Producers were pushing thousands of these; consumers were failing, retrying, then DLQ-ing after max attempts. We had several hundred thousand messages in DLQ before we noticed. If we didn’t have strong idempotency on the handler, replaying that backlog would have destroyed the system.
Observability and Runbooks: How You Actually Operate These Pipelines
Most teams build the pipeline and forget the operational side. That’s how you end up with the infamous “the queue is growing and we don’t know why” incident at 2AM.
What I always instrument
- Per-queue/subscription metrics:
- Active message count, DLQ count.
- Incoming vs processed messages/sec.
- Average and P95 end-to-end latency (enqueue → processed).
- Handler-level metrics:
- Success/failure counts per handler type.
- Processing time distribution per handler.
- Idempotency “hit rate” (how many events are duplicates).
- Outbox metrics:
- Outbox table size and age distribution.
- Dispatch lag (now − CreatedUtc for the last processed message).
- Outbox dispatch failures and retry counts (see the metrics sketch below).
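A minimal sketch of emitting some of these with System.Diagnostics.Metrics (meter and instrument names are illustrative; OpenTelemetry or Application Insights can export them):

using System.Diagnostics.Metrics;

public static class PipelineMetrics
{
    private static readonly Meter AppMeter = new("MyApp.Messaging");

    public static readonly Counter<long> DispatchedMessages =
        AppMeter.CreateCounter<long>("outbox.dispatched");

    public static readonly Counter<long> DuplicateHits =
        AppMeter.CreateCounter<long>("handler.duplicates");

    public static readonly Histogram<double> DispatchLagMs =
        AppMeter.CreateHistogram<double>("outbox.dispatch_lag", unit: "ms");
}

// In the outbox dispatcher, after a successful send:
PipelineMetrics.DispatchedMessages.Add(1);
PipelineMetrics.DispatchLagMs.Record((DateTime.UtcNow - msg.CreatedUtc).TotalMilliseconds);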
Tracing and correlation IDs
Follow the Google SRE book’s emphasis on having a single trace ID through the system:
- Assign a CorrelationId / traceId at the edge (HTTP/API).
- Propagate it into Service Bus message application properties.
- Enrich logs with traceId in every handler (see the sketch below).
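A minimal propagation sketch, assuming a correlationId already resolved at the edge and “traceId” as the property-name convention (Activity is System.Diagnostics; full W3C traceparent propagation via OpenTelemetry works along the same lines):

// Producer side (e.g., in the outbox dispatcher): stamp the correlation onto the message.
var sbMessage = new ServiceBusMessage(msg.Payload)
{
    MessageId = msg.MessageId,
    CorrelationId = correlationId
};
sbMessage.ApplicationProperties["traceId"] = Activity.Current?.TraceId.ToString() ?? correlationId;

// Consumer side (in OnMessageAsync): pull it back out and enrich the logging scope.
var traceId = args.Message.ApplicationProperties.TryGetValue("traceId", out var value)
    ? value?.ToString()
    : args.Message.CorrelationId;

using (_logger.BeginScope(new Dictionary<string, object?> { ["traceId"] = traceId }))
{
    // ...dispatch to the handler as before...
}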
Runbook basics
Your on-call runbooks for this pipeline should answer:
- What do I do when the queue depth is growing faster than we process?
- How do I safely increase concurrency?
- How do I inspect and replay DLQ messages?
- What’s the process for handling a bug that caused bad messages to be published (e.g., versioning issue)?
In one of my last platforms, we had a pre-built “Replay DLQ with filter” tool that allowed operators to replay messages matching certain predicates (e.g., tenantId, eventType) to a staging queue, validate, then fan back into prod. Without this, teams would resort to ad-hoc scripts in production—never a good sign.
Concrete Reference Architecture: End-to-End Flow
Let’s tie this together with an ASCII architecture diagram and a realistic .NET solution layout.
End-to-end architecture diagram
HTTP / gRPC Request
│ (traceId)
▼
┌──────────────────────┐
│ API Gateway / │
│ Edge Service │
└─────────┬───────────┘
│ (traceId, tenantId)
▼
┌──────────────────────┐
│ Command Service │
│ (.NET + EF Core) │
└─────────┬───────────┘
│ 1) Begin Tx
│ 2) Update aggregates
│ 3) Insert Outbox rows
▼
SQL Server / Azure SQL
┌─────────────────────────┐
│ Orders, Payments, │
│ OutboxMessages, │
│ ProcessedMessages │
└─────────┬──────────────┘
│ (CDC/Outbox Poller)
▼
┌──────────────────────┐
│ Outbox Dispatcher │
│ (.NET Worker) │
└─────────┬───────────┘
│ Service Bus Messages
│ (MessageId=EventId,
│ traceId, tenantId)
▼
┌─────────────────────────────────────┐
│ Azure Service Bus Namespace │
│ ┌───────────────┐ ┌───────────┐ │
│ │ orders-topic │ │ commands │ │
│ └──────┬────────┘ └─────┬─────┘ │
│ │ subscriptions │ │
└─────────┼──────────────────┼───────┘
│ │
┌──────────▼─────────┐ ┌─────▼──────────┐
│ Billing Service │ │ Fulfillment │
│ (.NET Worker + │ │ Service │
│ EF Core + Idemp.) │ │ │
└────────┬───────────┘ └─────┬─────────┘
│ 4) Check Processed │
│ Messages / │
│ Idempotency │
▼ ▼
SQL Billing DB SQL Fulfillment DB
▲ ▲
│ │
└───────────┬─────────┘
│
Observability / Tracing
(AppInsights / OTEL)
Legend:
- traceId/spanId flow across services and bus
- MessageId used for Service Bus duplicate detection
- Outbox ensures DB + events are consistent
- Idempotency store ensures at-least-once becomes effectively exactly-once for handlers
Solution folder structure (realistic)
📂 src
├── 📂 Api
│ ├── 📂 Controllers
│ ├── 📂 Contracts
│ ├── 📂 Filters
│ ├── 📂 Middleware
│ │ ├── CorrelationMiddleware.cs
│ │ └── TenantResolutionMiddleware.cs
│ ├── 📂 Auth
│ ├── 📄 Program.cs
│ └── 📄 CompositionRoot.cs
├── 📂 Application
│ ├── 📂 Commands
│ │ ├── PayOrder
│ │ │ ├── PayOrderCommand.cs
│ │ │ └── PayOrderCommandHandler.cs
│ ├── 📂 Events
│ │ ├── OrderPaidIntegrationEvent.cs
│ ├── 📂 Abstractions
│ │ ├── IOutboxService.cs
│ │ └── IMessageIdFactory.cs
│ └── 📂 Behaviors
│ └── LoggingBehavior.cs
├── 📂 Domain
│ ├── 📂 Orders
│ │ ├── Order.cs
│ │ └── OrderPaidDomainEvent.cs
│ ├── 📂 ValueObjects
│ └── 📂 Exceptions
├── 📂 Infrastructure
│ ├── 📂 Persistence
│ │ ├── AppDbContext.cs
│ │ ├── Configurations
│ │ │ ├── OrderConfiguration.cs
│ │ │ └── OutboxMessageConfiguration.cs
│ │ ├── Migrations
│ │ └── ExecutionStrategies
│ ├── 📂 Messaging
│ │ ├── ServiceBus
│ │ │ ├── ServiceBusWorker.cs
│ │ │ ├── ServiceBusMessageDispatcher.cs
│ │ │ └── ServiceBusConfiguration.cs
│ │ ├── Idempotency
│ │ │ └── ProcessedMessageStore.cs
│ │ └── Outbox
│ │ ├── OutboxDispatcher.cs
│ │ └── OutboxCleanupJob.cs
│ ├── 📂 Observability
│ │ ├── Tracing.cs
│ │ └── LoggingEnricher.cs
│ └── 📂 Configuration
│ └── ServiceBusOptions.cs
└── 📂 Workers
├── 📂 BillingWorker
└── 📂 FulfillmentWorker
This layout separates concerns:
- Domain and Application are messaging-agnostic; they emit events and commands.
- Infrastructure owns Service Bus specifics, outbox dispatch, idempotency store implementation.
- Workers compose everything together for specific pipelines.
What I’d Insist On for Any New High-Throughput Pipeline
If you’re designing a new .NET + Azure Service Bus + EF Core pipeline today, my non-negotiables would be:
- Explicitly accept at-least-once delivery and design for idempotency everywhere—no magical exactly-once fantasies.
- Use a transactional outbox for any DB + event publishing boundary; do not attempt distributed transactions.
- Have a concrete idempotency store per handler (unique indexes, not in-memory hacks).
- Tune ServiceBusProcessor with real load tests, watching SQL and lock behavior, not just queue throughput.
- Invest in DLQ tooling, replay safety, and observability from day one.
The difference between an academic queue-based system and a production-grade, high-throughput pipeline is not the choice of broker—it’s the discipline around idempotency, consistency, and operations. Azure Service Bus and EF Core are more than capable; they just won’t save you from bad design. That part is on us.