Friday, May 15, 2026

Structured Logging with SLF4J and Java


Prerequisites

Before you begin, make sure you have:

- Java 21 or later installed (java --version to check)
- Maven 3.8 or later installed (mvn --version to check)
- A plain Maven Java project (no framework required)

What you’ll build: a standalone Java application that emits structured JSON logs with contextual MDC fields, propagated correctly across platform threads, virtual threads, and reactive pipelines.

Companion repo: github.com/abadongutierrez/structured-logging-slf4j-java-companion-repo

Time required: ~25 minutes


1. Why Structured Logging

Let me paint you a familiar picture.

It’s 2am. Something is broken in production. You open your logs and see this:

2024-03-10 00:32:01 INFO  OrderService - Order 8821 placed by user 42, amount $99.00
2024-03-10 00:32:01 ERROR PaymentService - Payment failed
2024-03-10 00:32:02 INFO  OrderService - Order 9103 placed by user 17, amount $12.00

Which order failed? Was it 8821 or 9103? You start grepping. The grep matches 40 lines. Some are from different services. You’re not sure they’re even from the same request. You are running on caffeine and the only thought in your head is the pillow waiting at home.

[Image: a tired developer debugging production logs at 2am]

I’ve been there, my friend. It’s not fun at all.

The tools are already there — Datadog, Splunk, Grafana Loki — and they are genuinely powerful. They can index billions of log lines, alert you in milliseconds, correlate events across dozens of services. But they can only work with what you give them. And what most teams give them is a stream of free-text sentences.

That’s the root problem. Free-text logs are written for humans to read, not machines to query. Feed a powerful log aggregator unstructured text and you’ve cut its legs off. It can tell you which lines contain the word “failed.” It cannot tell you which orders failed, or how many, or for which users, or whether the failure rate is climbing. That intelligence is there — locked inside a sentence where no machine can reach it. The moment you need to filter, correlate, or alert on log data, unstructured text fights you at every step.

Structured logging flips this. Every log line becomes a JSON object:

{
  "timestamp": "2024-03-10T14:32:01.442Z",
  "level": "INFO",
  "logger": "com.example.OrderService",
  "message": "Order placed",
  "orderId": "8821",
  "userId": "42",
  "amount": "99.00"
}

Every field is a key. Datadog, Splunk, Grafana Loki — they all ingest this without custom parsers. You can filter by orderId, alert when amount > 500, or find every log line from userId: 42 in the last hour. Instantly.

That’s the deal. Let’s build it.


2. The SLF4J + Logback Stack

If you’ve done any Java development, you’ve heard of Log4j. If not for logging, then for the infamous Log4Shell — the critical vulnerability that had every engineering team scrambling in late 2021. Log4j is Java logging’s most famous name, and there’s a reason for that: it’s been around since 2001 and it’s genuinely everywhere. What’s confusing is that Log4j 2 — the version behind Log4Shell — is a completely separate rewrite by a different team. SLF4J and Logback are different again: created by Ceki Gülcü, Log4j’s original author, after he left the project. He wanted a cleaner design, and the central idea was to split API from implementation.

Think of SLF4J as the steering wheel of your car. You interact with it — you call logger.info(...), logger.error(...) — but it doesn’t actually move anything on its own. Logback is the engine under the hood. It receives the log events from SLF4J and decides what to do with them: format them, write them to a file, ship them to a collector.

Why does this separation matter? Because you can swap the engine without replacing the steering wheel. Switch from Logback to Log4j2 by changing a dependency and a config file. Your application code stays exactly as it is.
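As a concrete sketch, the swap is a pom.xml change: remove logback-classic and add the Log4j 2 binding for SLF4J 2.x (the version below is illustrative; check for the current release):

<!-- Replaces logback-classic. Version is illustrative; pin the current release. -->
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j2-impl</artifactId>
    <version>2.24.3</version>
</dependency>

For this tutorial, though, we stay with Logback.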

Add these three dependencies to your pom.xml:

<dependencies>
    <!-- SLF4J API -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>2.0.18</version>
    </dependency>

    <!-- Logback backend -->
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <version>1.5.32</version>
    </dependency>

    <!-- JSON encoder for Logback -->
    <dependency>
        <groupId>net.logstash.logback</groupId>
        <artifactId>logstash-logback-encoder</artifactId>
        <version>9.0</version>
    </dependency>
</dependencies>

Note: Logback 1.5.17 or later is required for correct MDC behavior with Java 21 virtual threads. Do not use anything earlier — I’ll explain why in section 6.


3. Configuring JSON Output

Logback reads its configuration from src/main/resources/logback.xml. It doesn’t need to be told where to look. At startup, Logback scans the classpath in a fixed order: first for logback-test.xml, then for logback.xml. The test variant is useful when you want a different configuration during tests — say, plain text output so test logs are easier to read. In a Maven project, src/main/resources compiles to the classpath root, so that’s exactly where Logback will find either file.
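As a sketch, a minimal logback-test.xml for plain-text test output could look like this (the pattern string is one reasonable choice, not a requirement):

<!-- src/test/resources/logback-test.xml: found before logback.xml at startup -->
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <!-- Plain-text pattern: easier to scan in test output than JSON -->
            <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>
    <root level="DEBUG">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>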

Now let’s create the main logback.xml:

<configuration>
    <!--
      Appender: decides WHERE logs go.
      ConsoleAppender sends every log event to stdout.
      Other appenders write to files, sockets, or remote collectors.
    -->
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <!--
          Encoder: decides HOW each log event is serialized before it is written.
          LogstashEncoder turns every event into a JSON object.
          Without this, Logback writes plain text in its default pattern format.
          This is the important piece.
        -->
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>

    <!-- Root logger: apply this configuration to everything at INFO level and above. -->
    <root level="INFO">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>

That’s really it. LogstashEncoder takes every log event and turns it into a JSON object — timestamp, level, logger name, thread name, message, and any contextual fields you attach. All automatic.
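If you want every line stamped with static, service-wide fields, LogstashEncoder also accepts a customFields option. A small sketch (the field names and values are yours to choose):

<encoder class="net.logstash.logback.encoder.LogstashEncoder">
    <!-- Static fields merged into every JSON log line this appender writes -->
    <customFields>{"service":"order-service","env":"production"}</customFields>
</encoder>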

Verify

Add a logger to your main class and run the application:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) {
        log.info("Application started");

        log.info("Application ended");
    }
}

Expected output:

{"@timestamp":"...","@version":"1","message":"Application started","logger_name":"com.jabaddon.structuredlogging.Main","thread_name":"main","level":"INFO"}
{"@timestamp":"...","@version":"1","message":"Application ended","logger_name":"com.jabaddon.structuredlogging.Main","thread_name":"main","level":"INFO"}

If you see plain text instead of JSON, check two things: is logback.xml under src/main/resources (not src/main/java)? And is logstash-logback-encoder actually on the classpath (mvn dependency:tree will tell you)?

How to declare the Logger

You’ll see three patterns in the wild:

// 1. Static field — the standard choice. One logger per class, shared across
//    all instances, never garbage collected. Use this unless you have a reason not to.
private static final Logger log = LoggerFactory.getLogger(Main.class);

// 2. Instance field with getClass() — useful in abstract base classes where you want
//    the logger name to reflect the concrete subclass, not the abstract parent.
private final Logger log = LoggerFactory.getLogger(getClass());

// 3. Lombok @Slf4j — generates the static field above at compile time.
//    Common in Spring Boot projects. Requires the Lombok dependency.
@Slf4j
public class Main { ... }

The logger_name field in the JSON output maps directly to the class you pass to getLogger(). That’s how your log aggregator knows which class emitted each line — so pass the right class, every time.


4. Adding Context with MDC

JSON logs are already better than text logs. But right now every log line only carries the message and some metadata — timestamp, level, thread name. That’s still not enough to debug what you saw at the start of this post.

The missing piece is context. And here’s the thing: it’s not a matter of adding more data to every log line. It’s a matter of thinking — before you write a single line of code — about what questions you’ll be asking at 2am when something breaks. Which user triggered this? Which order was affected? Which request is this log line part of? Which tenant owns this operation?

The answers to those questions are your context fields. Not every field you could add — just the ones that will make the difference between a five-minute investigation and a two-hour one. Too little context and you’re back to grepping. Too much and every log line becomes noise, and the fields that matter get buried in the ones that don’t. A good rule of thumb: if you’d reach for it in a WHERE clause when debugging, it belongs in your logging context.

Once you know which fields matter, the next question is how to get them into every log statement. You could pass them as extra parameters down through every method that needs to log. But that gets ugly fast: your business methods accumulate context arguments they don’t own, every signature grows, and the logging concern leaks into every layer of your code.

MDC — Mapped Diagnostic Context — solves this elegantly. You attach key-value pairs to the current thread’s logging context once, and every log statement made on that thread automatically includes them. No parameter passing. No noise in your method signatures.

Step 1: MDC.putCloseable — safer than manual put/remove

The raw MDC API gives you MDC.put() and MDC.remove(). The typical pattern looks like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void placeOrder(String orderId, String userId, double amount) {
        MDC.put("orderId", orderId);
        MDC.put("userId", userId);
        try {
            log.info("Order placed: amount={}", amount);
            confirmOrder();
            log.info("Order confirmed");
        } finally {
            MDC.remove("orderId");
            MDC.remove("userId");
        }
    }

    private void confirmOrder() {
        log.info("Confirming order...");
    }
}

That finally block is not optional — and I mean that seriously. MDC is backed by ThreadLocal storage. If you skip the cleanup, those entries stay on the thread. In a thread pool, the next request handled by that same thread inherits your stale context. I’ve seen this produce genuinely baffling bugs: logs from request A tagged with the user ID from request B. Not fun to debug at 2am.

The other problem: every key you add to put() needs a matching remove(). Miss one — during a refactor, in a rush, at 11pm — and you have a silent bug with no stack trace.

SLF4J has a better way. MDC.putCloseable() returns an MDCCloseable (a nested class that implements Closeable), which means you can let try-with-resources handle the cleanup:

public void placeOrder(String orderId, String userId, double amount) {
    try (var o = MDC.putCloseable("orderId", orderId);
         var u = MDC.putCloseable("userId", userId)) {
        log.info("Order placed: amount={}", amount);
        confirmOrder();
        log.info("Order confirmed");
    }
}

private void confirmOrder() {
    log.info("Confirming order...");
}

No finally. No manual remove(). Java closes each resource in reverse order when the block exits — whether normally or by exception.

This is already a meaningful improvement. But it doesn’t compose — two separate closeables that happen to be declared together aren’t the same as one context that owns both keys. Add a third field and the try header grows with it.

Step 2: a small Mdc utility — one block for all keys

A thin utility class fixes this permanently. Add it once to your project:

import org.slf4j.MDC;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * A thin wrapper around SLF4J MDC that manages multiple keys as a single
 * try-with-resources block.
 *
 * <p>This is one way to solve the problem, not the only way. Depending on
 * your project, you might prefer a different name, a different null-handling
 * strategy, or integrating this behaviour directly into a request-scoping
 * filter or interceptor. Treat this as a starting point, not a prescription.
 */
public final class Mdc {
    private Mdc() {}

    /**
     * A non-throwing AutoCloseable returned by {@link #with} and {@link Builder#open}.
     * Declaring close() without a checked exception lets try-with-resources work
     * without a surrounding catch block.
     */
    @FunctionalInterface
    public interface Context extends AutoCloseable {
        @Override void close();  // no checked exception — try-with-resources needs no catch
    }

    /**
     * Puts all entries into MDC and returns a Context that removes them on close.
     * Callers typically pass Map.of(), which itself throws NullPointerException
     * if any value is null.
     * For nullable values use {@link #context()}.
     */
    public static Context with(Map<String, String> context) {
        context.forEach(MDC::put);
        return () -> context.keySet().forEach(MDC::remove);
    }

    /** Returns a Builder for assembling the context incrementally. Null values are skipped. */
    public static Builder context() { return new Builder(); }

    public static final class Builder {
        private final Map<String, String> entries = new LinkedHashMap<>();

        /** Adds a key-value pair. Silently skips the entry if value is null. */
        public Builder put(String key, String value) {
            if (value != null) entries.put(key, value);
            return this;
        }

        /** Writes all entries to MDC and returns a Context that removes them on close. */
        public Context open() {
            entries.forEach(MDC::put);
            return () -> entries.keySet().forEach(MDC::remove);
        }
    }
}

Now every callsite becomes one variable and one block:

public void placeOrder(String orderId, String userId, double amount) {
    try (var ctx = Mdc.with(Map.of("orderId", orderId, "userId", userId))) {
        log.info("Order placed: amount={}", amount);
        confirmOrder();
        log.info("Order confirmed");
    }
}

private void confirmOrder() {
    log.info("Confirming order...");
}

Add a fourth key and the try header doesn’t change.

When values might be null: Map.of() throws NullPointerException if any value is null. If a field is conditionally available — say userId isn’t known until after an auth check — use the builder instead:

try (var ctx = Mdc.context()
        .put("orderId", orderId)
        .put("userId", userId)    // skipped silently if null
        .open()) {
    log.info("Order placed: amount={}", amount);
}

The builder’s put() skips null values without any caller-side filtering.

Verify

{"message":"Order placed: amount=99.0","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Confirming order...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Order confirmed","orderId":"8821","userId":"42","level":"INFO"}

All three log statements carry orderId and userId as top-level JSON keys — including the one inside confirmOrder(), which never touches MDC directly. That’s the point: set context once, and every log call on that thread picks it up automatically.


5. Correlation IDs for Distributed Tracing

MDC gives you context within a single component. Correlation IDs give you context across an entire request — through every service, every async call, every log line that request touches.

The idea is simple: generate a unique ID at the entry point of each request. Attach it to MDC. Every log statement from that point forward carries it — as long as it runs on the same thread. When something breaks, you filter by correlation ID and see the complete trail of what happened.

Hand work off to a thread pool or a reactive pipeline and the context disappears. Sections 6 and 7 show how to handle that.

import java.util.Map;
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RequestHandler {
    private static final Logger log = LoggerFactory.getLogger(RequestHandler.class);

    public void handle(String incomingTraceId) {
        // prefer the caller's ID; generate one if absent
        String traceId = (incomingTraceId != null) ? incomingTraceId : UUID.randomUUID().toString();
        try (var ctx = Mdc.with(Map.of("traceId", traceId))) {
            log.info("Request received");
            new OrderService().placeOrder("8821", "42", 99.00);
            log.info("Request completed");
        }
    }
}

If the request came from another service that already has a trace ID (say, via an HTTP header), you reuse it. If this is the origin, you generate one. Either way, the same ID flows through everything downstream.
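What that looks like at the HTTP boundary depends on your stack. Here is a sketch using the JDK's built-in com.sun.net.httpserver, where the X-Trace-Id header name is an assumption (W3C Trace Context would use traceparent; agree on one convention across your services):

import com.sun.net.httpserver.HttpExchange;

// Hypothetical adapter: pull the caller's trace ID off the request, if present.
static void handleExchange(HttpExchange exchange) {
    // getFirst(...) returns null when the header is absent
    String incoming = exchange.getRequestHeaders().getFirst("X-Trace-Id");
    new RequestHandler().handle(incoming); // null means "generate one"
}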

Verify

All log statements within the request share the same traceId:

{"message":"Request received","traceId":"a3f1c2d4-...","level":"INFO"}
{"message":"Order placed: amount=99.0","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Confirming order...","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Order confirmed","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Request completed","traceId":"a3f1c2d4-...","level":"INFO"}

That grep that used to return 40 ambiguous lines? Now it returns exactly the lines you care about.


6. MDC in Multi-Threaded Environments

Here’s where things get interesting — and where I’ve seen the most production surprises. You added a traceId in section 5 and it worked perfectly. Then you introduced an async task, and suddenly the traceId is gone from those log lines. No error, no warning. Just missing context.

MDC is ThreadLocal. Each thread has its own independent MDC map. This works perfectly when a single thread handles a request from start to finish. But the moment you submit work to another thread, that thread starts with an empty MDC. Your context is gone.

Platform thread pools

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.slf4j.MDC;

ExecutorService pool = Executors.newFixedThreadPool(4);

Map<String, String> mdcContext = MDC.getCopyOfContextMap();

pool.submit(() -> {
    if (mdcContext != null) {
        MDC.setContextMap(mdcContext);
    }
    try {
        log.info("Async task running");
        // ... work
    } finally {
        MDC.clear();
    }
});

The pattern is: copy the MDC map on the calling thread before you submit, then restore it as the very first thing inside the task. The finally { MDC.clear() } ensures the pool thread is clean for whoever uses it next.
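If you find yourself repeating that dance at every submit call, a small wrapper in the spirit of the Mdc utility from section 4 (hypothetical; name it whatever fits your project) keeps callsites clean:

import java.util.Map;
import org.slf4j.MDC;

public final class MdcRunnable {
    private MdcRunnable() {}

    /** Captures the caller's MDC now; restores it around task.run() later. */
    public static Runnable wrap(Runnable task) {
        Map<String, String> captured = MDC.getCopyOfContextMap(); // may be null
        return () -> {
            if (captured != null) {
                MDC.setContextMap(captured);
            }
            try {
                task.run();
            } finally {
                MDC.clear(); // leave the pool thread clean for the next task
            }
        };
    }
}

With it, the submission above collapses to pool.submit(MdcRunnable.wrap(() -> log.info("Async task running"))).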

Virtual threads (Java 21)

Virtual threads behave exactly the same as platform threads with respect to MDC. Same empty-on-creation behavior, same copy-and-restore fix:

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.slf4j.MDC;

ExecutorService virtualPool = Executors.newVirtualThreadPerTaskExecutor();

Map<String, String> mdcContext = MDC.getCopyOfContextMap();

virtualPool.submit(() -> {
    if (mdcContext != null) {
        MDC.setContextMap(mdcContext);
    }
    try {
        log.info("Virtual thread task running");
    } finally {
        MDC.clear();
    }
});

Why Logback 1.5.17+ matters here: earlier versions had a race condition between MDC initialization and the logger context under high virtual thread concurrency — fixed in 1.5.17. The fix also requires SLF4J 2.0.17 or later to be fully effective; the dependency block in section 2 already satisfies this. Pin to 1.5.32 and you get all the fixes. Don’t learn this the hard way in production.

What frameworks handle for you

If you are using Spring Boot 3.x with Micrometer Tracing, most of this is handled automatically. Micrometer wraps your @Async executors and scheduled tasks with context-propagating decorators — the copy-and-restore pattern above happens behind the scenes, and trace/span IDs are populated in MDC without any manual wiring.

If you are on Spring Boot without Micrometer Tracing, Spring’s ThreadPoolTaskExecutor supports a TaskDecorator interface. One decorator implementation that copies and restores MDC replaces the manual boilerplate for every task submitted to that pool.
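A sketch of such a decorator, assuming Spring's org.springframework.core.task.TaskDecorator is on the classpath:

import java.util.Map;
import org.slf4j.MDC;
import org.springframework.core.task.TaskDecorator;

public class MdcTaskDecorator implements TaskDecorator {
    @Override
    public Runnable decorate(Runnable runnable) {
        Map<String, String> captured = MDC.getCopyOfContextMap(); // submitter's context
        return () -> {
            if (captured != null) {
                MDC.setContextMap(captured);
            }
            try {
                runnable.run();
            } finally {
                MDC.clear();
            }
        };
    }
}

Register it once with setTaskDecorator(new MdcTaskDecorator()) on the ThreadPoolTaskExecutor, and every task submitted to that pool gets the copy-and-restore treatment automatically.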

The patterns in this section are what those frameworks implement under the hood. Understanding them means you can debug propagation failures when the framework’s abstraction leaks — and it always leaks eventually.


7. MDC in Reactive Environments

If multi-threaded MDC surprised you, reactive will genuinely frustrate you at first. And that’s okay — I’ve been there too.

With thread pools, at least you knew when you were crossing a thread boundary. You called pool.submit(), you saw the handoff, you knew you had to copy the context. The fix was mechanical once you understood the problem.

Reactive is different. You write a pipeline — .map(), .flatMap(), .filter() — and it looks like a single flow. But under the hood, each operator may run on a different thread, decided by the scheduler, at a time you don’t control. You never called submit(). You never explicitly crossed a thread boundary. And yet your MDC context is gone halfway through the pipeline, and some log lines have the traceId while others don’t. You reread your code three times and can’t see why.

The root cause is that ThreadLocal — the mechanism MDC is built on — is fundamentally incompatible with reactive programming’s execution model. ThreadLocal assumes one thread owns the context for the duration of a task. Reactive assumes no such thing. The solution requires a different approach entirely: stop carrying context in ThreadLocal and start carrying it in the reactive context, which actually propagates through the pipeline.

Before writing any reactive code, add these two dependencies to your pom.xml:

<!-- Project Reactor -->
<dependency>
    <groupId>io.projectreactor</groupId>
    <artifactId>reactor-core</artifactId>
    <version>3.6.11</version>
</dependency>

<!-- RxJava 3 -->
<dependency>
    <groupId>io.reactivex.rxjava3</groupId>
    <artifactId>rxjava</artifactId>
    <version>3.1.10</version>
</dependency>

Each library needs its own wiring. Project Reactor carries the MDC map in its reactive context and restores it into MDC wherever a log statement runs; RxJava has no reactive context, so you capture and restore MDC around every scheduled task instead. Let’s take them in turn.

Project Reactor

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import reactor.core.publisher.Mono;
import reactor.util.context.Context;
import java.util.Map;

public Mono<String> processOrder(String orderId, String traceId) {
    return Mono.just(orderId)
        .doOnEach(signal -> {
            if (signal.isOnNext() || signal.isOnError()) {
                Map<String, String> mdcContext = signal.getContextView()
                    .<Map<String, String>>getOrDefault("MDC_CONTEXT", Map.of());
                MDC.setContextMap(mdcContext);
                if (signal.isOnNext()) {
                    log.info("Processing order");
                }
                MDC.clear();
            }
        })
        .map(id -> "processed-" + id)
        .contextWrite(Context.of("MDC_CONTEXT", Map.of("traceId", traceId, "orderId", orderId)));
}

contextWrite() loads the MDC map into the Reactor context. doOnEach() reads it back and restores it into MDC before each signal, then clears it immediately after.

Why contextWrite() goes at the end: Reactor context flows upstream — an operator can only see context written downstream of it. Put contextWrite() at the end of your chain and it’s visible to every operator above it. Put it in the middle and operators above that point are left in the dark.

RxJava

RxJava has no built-in reactive context, but it has scheduler hooks — and that’s all you need:

import io.reactivex.rxjava3.plugins.RxJavaPlugins;
import org.slf4j.MDC;
import java.util.Map;

// Call this once at application startup.
// setScheduleHandler replaces any previously registered handler —
// calling it a second time silently discards the first.
RxJavaPlugins.setScheduleHandler(runnable -> {
    Map<String, String> mdcContext = MDC.getCopyOfContextMap();
    return () -> {
        if (mdcContext != null) {
            MDC.setContextMap(mdcContext);
        }
        try {
            runnable.run();
        } finally {
            MDC.clear();
        }
    };
});

This wraps every unit of work that RxJava schedulers execute — capturing MDC at scheduling time and restoring it at execution time. Set it once at startup and forget about it.
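To see it in action, set some context on the calling thread and hop schedulers. A sketch reusing the Mdc helper from section 4 (log and traceId as in the earlier examples):

import io.reactivex.rxjava3.core.Observable;
import io.reactivex.rxjava3.schedulers.Schedulers;
import java.util.Map;

try (var ctx = Mdc.with(Map.of("traceId", traceId))) {
    Observable.just("8821")
        .subscribeOn(Schedulers.io())                 // work hops to an io() thread
        .doOnNext(id -> log.info("Processing order")) // still carries traceId
        .blockingSubscribe();                         // block so the demo completes
}

The hook captures MDC when the subscription is scheduled, on the calling thread while traceId is still set, so the log line emitted on the io() thread carries it.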

Verify (Reactor)

{"message":"Processing order","traceId":"a3f1c2d4-...","orderId":"8821","level":"INFO","thread_name":"reactor-http-nio-3"}

The context fields are there even though the log ran on a Reactor NIO thread. That’s exactly what we want.

What frameworks handle for you

Project Reactor — automatic with Spring Boot 3 + Micrometer Tracing

Reactor 3.5.3 introduced Hooks.enableAutomaticContextPropagation(). When called at startup, it bridges ThreadLocal values — including SLF4J MDC — into Reactor’s context automatically. No contextWrite(), no doOnEach(). Spring Boot 3.x enables this automatically when micrometer-tracing is on the classpath, via the io.micrometer:context-propagation library.

Without Spring Boot, one call at startup is all it takes:

import reactor.core.publisher.Hooks;

// Call once before any pipeline runs.
Hooks.enableAutomaticContextPropagation();

After that, MDC values set before subscribing propagate through the entire pipeline — across schedulers, across operators — without any manual wiring.
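One caveat: the hook bridges the ThreadLocals that the io.micrometer:context-propagation library knows about. If your own MDC keys aren't picked up, you can register an accessor for them at startup. A sketch for a single key, using the ContextRegistry API from that library (verify the exact signatures against the version you pull in):

import io.micrometer.context.ContextRegistry;
import org.slf4j.MDC;

// Teach context-propagation how to snapshot and restore one MDC key.
ContextRegistry.getInstance().registerThreadLocalAccessor(
        "traceId",
        () -> MDC.get("traceId"),           // capture on the source thread
        value -> MDC.put("traceId", value), // restore on the worker thread
        () -> MDC.remove("traceId"));       // clean up afterwards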

RxJava — no automatic mechanism

Spring does not auto-configure RxJava MDC propagation. The RxJavaPlugins.setScheduleHandler() pattern shown above is what you need regardless of framework. The good news: you set it once at startup and it covers every observable in the application.

As with section 6: understanding the manual patterns is still worth it. When the automatic propagation breaks — and it does, usually at the boundary between a managed executor and one you created yourself — knowing what’s happening underneath is how you diagnose it.


8. Troubleshooting

I’ve hit every one of these. Hopefully this saves you some time.

MDC fields are missing in async tasks You submitted work to a thread pool without copying the MDC map first. Capture MDC.getCopyOfContextMap() on the calling thread before submitting, and call MDC.setContextMap() as the first thing inside the task.

MDC fields appear in some log lines but not others in a reactive pipeline You’re logging outside a doOnEach() block, or contextWrite() is placed upstream of the operators that need it. Move contextWrite() to the end of the chain and log only inside doOnEach().

MDC fields from one request appear in the next request’s logs A thread was returned to the pool without clearing MDC. Add MDC.clear() at the very start of each request entry point as a defensive measure — on top of the finally cleanup on the way out. Belt and suspenders.

All logs are plain text, not JSON Check that logback.xml is under src/main/resources (not src/main/java). Run mvn dependency:tree to confirm logstash-logback-encoder is on the runtime classpath. And double-check the encoder class name — it’s LogstashEncoder, not LogstashFormatter.

Empty or missing MDC in virtual threads Upgrade to Logback 1.5.17 or later. Earlier versions have MDC adapter issues under virtual thread concurrency.


9. What You Built

You now have a standalone Java application that:

- Emits every log line as a JSON object any log aggregator can ingest
- Attaches request-scoped context (order ID, user ID, trace ID) as top-level JSON fields via MDC
- Propagates that context correctly across platform thread pools, Java 21 virtual threads, Project Reactor pipelines, and RxJava schedulers

The 2am debugging session I described at the start? With this in place, you grep by traceId and see the full story of that request in seconds.

The complete code is at github.com/abadongutierrez/structured-logging-slf4j-java-companion-repo.

From here you can:

- Add a log aggregator: ship these JSON logs to Datadog, the ELK stack, or Grafana Loki without any parser configuration
- Automate trace ID injection: replace the manual UUID.randomUUID() with OpenTelemetry or Micrometer Tracing, which generate W3C-compliant trace/span IDs and populate MDC automatically
- Add Spring Boot: Spring Boot 3.x auto-configures Logback and Micrometer Tracing — the patterns here carry over directly, with less boilerplate


10. The Problem That Comes Next

You solved the single-service problem. Every log line is structured, every request is traceable, context propagates correctly regardless of how your application executes. That’s the foundation.

The next problem is consistency across services — and it’s sneakier.

I’ve worked on systems where every service emitted JSON logs and debugging was still a nightmare. One service called it userId, another user_id, a third user. One used ERROR for every exception including the expected ones. Another didn’t log external API calls at all. The aggregator had structured data, but the data had no shared meaning. You couldn’t write a monitor that worked across services. You were back to grepping, just in a fancier tool.

Structured logging without consistency is still chaos — slower chaos, with better tooling.

Part of the answer is a logging taxonomy: a shared vocabulary of event names that your entire system agrees on. Instead of free-form messages like "Order placed" or "Payment failed", you define a set of named events — order.placed, payment.failed, user.authenticated — and emit them as a dedicated field alongside the message:

{"event": "order.placed", "message": "Order placed successfully", "orderId": "8821", "amount": "99.00"}
{"event": "payment.failed", "message": "Payment declined by provider", "orderId": "8821", "reason": "insufficient_funds"}

Now your monitoring layer can work with event names instead of message strings. Alert when event = "payment.failed" exceeds a threshold. Build a dashboard that counts order.placed events per minute. Write queries that are stable across refactors — because renaming a log message doesn’t break your alerts when the event name is a contract, not a sentence.
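With the stack from this post, emitting that event field takes one static import: logstash-logback-encoder's StructuredArguments attaches key-value pairs to a single log line (unlike MDC, which scopes them to a thread). A sketch:

import static net.logstash.logback.argument.StructuredArguments.kv;

// Each kv(...) becomes a top-level JSON field on this line only.
log.info("Order placed successfully",
        kv("event", "order.placed"),
        kv("orderId", "8821"),
        kv("amount", "99.00"));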
