Friday, May 15, 2026

Structured Logging with SLF4J and Java


Prerequisites

Before you begin, make sure you have:

- Java 21 or later installed (java --version to check)
- Maven 3.8 or later installed (mvn --version to check)
- A plain Maven Java project (no framework required)

What you’ll build: a standalone Java application that emits structured JSON logs with contextual MDC fields, propagated correctly across platform threads, virtual threads, and reactive pipelines.

Companion repo: github.com/abadongutierrez/structured-logging-slf4j-java-companion-repo

Time required: ~25 minutes


1. Why Structured Logging

Let me paint you a familiar picture.

It’s 2am. Something is broken in production. You open your logs and see this:

2024-03-10 00:32:01 INFO  OrderService - Order 8821 placed by user 42, amount $99.00
2024-03-10 00:32:01 ERROR PaymentService - Payment failed
2024-03-10 00:32:02 INFO  OrderService - Order 9103 placed by user 17, amount $12.00

Which order failed? Was it 8821 or 9103? You start grepping. The grep matches 40 lines. Some are from different services. You’re not sure they’re even from the same request. You are running on caffeine and the only thought in your head is the pillow waiting at home.

(Image: a tired developer debugging production logs at 2am)

I’ve been there, my friend. It’s not fun at all.

The tools are already there — Datadog, Splunk, Grafana Loki — and they are genuinely powerful. They can index billions of log lines, alert you in milliseconds, correlate events across dozens of services. But they can only work with what you give them. And what most teams give them is a stream of free-text sentences.

That’s the root problem. Free-text logs are written for humans to read, not machines to query. Feed a powerful log aggregator unstructured text and you’ve cut its legs off. It can tell you which lines contain the word “failed.” It cannot tell you which orders failed, or how many, or for which users, or whether the failure rate is climbing. That intelligence is there — locked inside a sentence where no machine can reach it. The moment you need to filter, correlate, or alert on log data, unstructured text fights you at every step.

Structured logging flips this. Every log line becomes a JSON object:

{
  "timestamp": "2024-03-10T14:32:01.442Z",
  "level": "INFO",
  "logger": "com.example.OrderService",
  "message": "Order placed",
  "orderId": "8821",
  "userId": "42",
  "amount": "99.00"
}

Every field is a key. Datadog, Splunk, Grafana Loki — they all ingest this without custom parsers. You can filter by orderId, alert when amount > 500, or find every log line from userId: 42 in the last hour. Instantly.

That’s the deal. Let’s build it.


2. The SLF4J + Logback Stack

If you’ve done any Java development, you’ve heard of Log4j. If not for logging, then for the infamous Log4Shell — the critical vulnerability that had every engineering team scrambling in late 2021. Log4j is Java logging’s most famous name, and there’s a reason for that: it’s been around since 2001 and it’s genuinely everywhere. What’s confusing is that Log4j 2 — the version behind Log4Shell — is a completely separate rewrite by a different team. SLF4J and Logback are different again: created by Ceki Gülcü, Log4j’s original author, after he left the project. He wanted a cleaner design, and the central idea was to split API from implementation.

Think of SLF4J as the steering wheel of your car. You interact with it — you call logger.info(...), logger.error(...) — but it doesn’t actually move anything on its own. Logback is the engine under the hood. It receives the log events from SLF4J and decides what to do with them: format them, write them to a file, ship them to a collector.

Why does this separation matter? Because you can swap the engine without replacing the steering wheel. Switch from Logback to Log4j2 by changing a dependency and a config file. Your application code stays exactly as it is.

Add these three dependencies to your pom.xml:

<dependencies>
    <!-- SLF4J API -->
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>2.0.18</version>
    </dependency>

    <!-- Logback backend -->
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <version>1.5.32</version>
    </dependency>

    <!-- JSON encoder for Logback -->
    <dependency>
        <groupId>net.logstash.logback</groupId>
        <artifactId>logstash-logback-encoder</artifactId>
        <version>9.0</version>
    </dependency>
</dependencies>

Note: Logback 1.5.17 or later is required for correct MDC behavior with Java 21 virtual threads. Do not use anything earlier — I’ll explain why in section 6.


3. Configuring JSON Output

Logback reads its configuration from src/main/resources/logback.xml. It doesn’t need to be told where to look. At startup, Logback scans the classpath in a fixed order: first for logback-test.xml, then for logback.xml. The test variant is useful when you want a different configuration during tests — say, plain text output so test logs are easier to read. In a Maven project, src/main/resources compiles to the classpath root, so that’s exactly where Logback will find either file.

Let’s create the file:

<configuration>
    <!--
      Appender: decides WHERE logs go.
      ConsoleAppender sends every log event to stdout.
      Other appenders write to files, sockets, or remote collectors.
    -->
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <!--
          Encoder: decides HOW each log event is serialized before it is written.
          LogstashEncoder turns every event into a JSON object.
          Without this, Logback writes plain text in its default pattern format.
          This is the important piece.
        -->
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>

    <!-- Root logger: apply this configuration to everything at INFO level and above. -->
    <root level="INFO">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>

That’s really it. LogstashEncoder takes every log event and turns it into a JSON object — timestamp, level, logger name, thread name, message, and any contextual fields you attach. All automatic.
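As an aside, the logback-test.xml variant mentioned earlier could look something like this. It is only a sketch: plain text for readable test output, and the pattern shown is just one reasonable choice.

```xml
<!-- src/test/resources/logback-test.xml — found before logback.xml during tests -->
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <!-- Plain-text pattern: easier to scan in test output than JSON -->
            <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>
    <root level="DEBUG">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>
```

Because src/test/resources lands on the test classpath ahead of src/main/resources, this file wins during tests and your production logback.xml stays untouched.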

Verify

Add a logger to your main class and run the application:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private static final Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) {
        log.info("Application started");

        log.info("Application ended");
    }
}

Expected output:

{"@timestamp":"...","@version":"1","message":"Application started","logger_name":"com.jabaddon.structuredlogging.Main","thread_name":"main","level":"INFO"}
{"@timestamp":"...","@version":"1","message":"Application ended","logger_name":"com.jabaddon.structuredlogging.Main","thread_name":"main","level":"INFO"}

If you see plain text instead of JSON, check two things: is logback.xml under src/main/resources (not src/main/java)? And is logstash-logback-encoder actually on the classpath (mvn dependency:tree will tell you)?

How to declare the Logger

You’ll see three patterns in the wild:

// 1. Static field — the standard choice. One logger per class, shared across
//    all instances, never garbage collected. Use this unless you have a reason not to.
private static final Logger log = LoggerFactory.getLogger(Main.class);

// 2. Instance field with getClass() — useful in abstract base classes where you want
//    the logger name to reflect the concrete subclass, not the abstract parent.
private final Logger log = LoggerFactory.getLogger(getClass());

// 3. Lombok @Slf4j — generates the static field above at compile time.
//    Common in Spring Boot projects. Requires the Lombok dependency.
@Slf4j
public class Main { ... }

The logger_name field in the JSON output maps directly to the class you pass to getLogger(). That’s how your log aggregator knows which class emitted each line — so pass the right class, every time.


4. Adding Context with MDC

JSON logs are already better than text logs. But right now every log line only carries the message and some metadata — timestamp, level, thread name. That’s still not enough to debug what you saw at the start of this post.

The missing piece is context. And here’s the thing: it’s not a matter of adding more data to every log line. It’s a matter of thinking — before you write a single line of code — about what questions you’ll be asking at 2am when something breaks. Which user triggered this? Which order was affected? Which request is this log line part of? Which tenant owns this operation?

The answers to those questions are your context fields. Not every field you could add — just the ones that will make the difference between a five-minute investigation and a two-hour one. Too little context and you’re back to grepping. Too much and every log line becomes noise, and the fields that matter get buried in the ones that don’t. A good rule of thumb: if you’d reach for it in a WHERE clause when debugging, it belongs in your logging context.

Once you know which fields matter, the next question is how to get them into every log statement. You could pass them as extra parameters down through every method that needs to log. But that gets ugly fast: your business methods accumulate context arguments they don’t own, every signature grows, and the logging concern leaks into every layer of your code.

MDC — Mapped Diagnostic Context — solves this elegantly. You attach key-value pairs to the current thread’s logging context once, and every log statement made on that thread automatically includes them. No parameter passing. No noise in your method signatures.

Step 1: MDC.putCloseable — safer than manual put/remove

The raw MDC API gives you MDC.put() and MDC.remove(). The typical pattern looks like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void placeOrder(String orderId, String userId, double amount) {
        MDC.put("orderId", orderId);
        MDC.put("userId", userId);
        try {
            log.info("Order placed: amount={}", amount);
            confirmOrder();
            log.info("Order confirmed");
        } finally {
            MDC.remove("orderId");
            MDC.remove("userId");
        }
    }

    private void confirmOrder() {
        log.info("Confirming order...");
    }
}

That finally block is not optional — and I mean that seriously. MDC is backed by ThreadLocal storage. If you skip the cleanup, those entries stay on the thread. In a thread pool, the next request handled by that same thread inherits your stale context. I’ve seen this produce genuinely baffling bugs: logs from request A tagged with the user ID from request B. Not fun to debug at 2am.

The other problem: every key you add to put() needs a matching remove(). Miss one — during a refactor, in a rush, at 11pm — and you have a silent bug with no stack trace.

SLF4J has a better way. MDC.putCloseable() returns an MDCCloseable (a nested class that implements Closeable), which means you can let try-with-resources handle the cleanup:

public void placeOrder(String orderId, String userId, double amount) {
    try (var o = MDC.putCloseable("orderId", orderId);
         var u = MDC.putCloseable("userId", userId)) {
        log.info("Order placed: amount={}", amount);
        confirmOrder();
        log.info("Order confirmed");
    }
}

private void confirmOrder() {
    log.info("Confirming order...");
}

No finally. No manual remove(). Java closes each resource in reverse order when the block exits — whether normally or by exception.

This is already a meaningful improvement. But it doesn’t compose — two separate closeables that happen to be declared together aren’t the same as one context that owns both keys. Add a third field and the try header grows with it.

Step 2: a small Mdc utility — one block for all keys

A thin utility class fixes this permanently. Add it once to your project:

import org.slf4j.MDC;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * A thin wrapper around SLF4J MDC that manages multiple keys as a single
 * try-with-resources block.
 *
 * <p>This is one way to solve the problem, not the only way. Depending on
 * your project, you might prefer a different name, a different null-handling
 * strategy, or integrating this behavior directly into a request-scoping
 * filter or interceptor. Treat this as a starting point, not a prescription.
 */
public final class Mdc {
    private Mdc() {}

    /**
     * A non-throwing AutoCloseable returned by {@link #with} and {@link Builder#open}.
     * Declaring close() without a checked exception lets try-with-resources work
     * without a surrounding catch block.
     */
    @FunctionalInterface
    public interface Context extends AutoCloseable {
        @Override void close();  // no checked exception — try-with-resources needs no catch
    }

    /**
     * Puts all entries into MDC and returns a Context that removes them on close.
     * Uses Map.of() semantics: throws NullPointerException if any value is null.
     * For nullable values use {@link #context()}.
     */
    public static Context with(Map<String, String> context) {
        context.forEach(MDC::put);
        return () -> context.keySet().forEach(MDC::remove);
    }

    /** Returns a Builder for assembling the context incrementally. Null values are skipped. */
    public static Builder context() { return new Builder(); }

    public static final class Builder {
        private final Map<String, String> entries = new LinkedHashMap<>();

        /** Adds a key-value pair. Silently skips the entry if value is null. */
        public Builder put(String key, String value) {
            if (value != null) entries.put(key, value);
            return this;
        }

        /** Writes all entries to MDC and returns a Context that removes them on close. */
        public Context open() {
            entries.forEach(MDC::put);
            return () -> entries.keySet().forEach(MDC::remove);
        }
    }
}

Now every callsite becomes one variable and one block:

public void placeOrder(String orderId, String userId, double amount) {
    try (var ctx = Mdc.with(Map.of("orderId", orderId, "userId", userId))) {
        log.info("Order placed: amount={}", amount);
        confirmOrder();
        log.info("Order confirmed");
    }
}

private void confirmOrder() {
    log.info("Confirming order...");
}

Add a fourth key and the try header doesn’t change.

When values might be null: Map.of() throws NullPointerException if any value is null. If a field is conditionally available — say userId isn’t known until after an auth check — use the builder instead:

try (var ctx = Mdc.context()
        .put("orderId", orderId)
        .put("userId", userId)    // skipped silently if null
        .open()) {
    log.info("Order placed: amount={}", amount);
}

The builder’s put() skips null values without any caller-side filtering.

Verify

{"message":"Order placed: amount=99.0","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Confirming order...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Order confirmed","orderId":"8821","userId":"42","level":"INFO"}

All three log statements carry orderId and userId as top-level JSON keys — including the one inside confirmOrder(), which never touches MDC directly. That’s the point: set context once, and every log call on that thread picks it up automatically.


5. Correlation IDs for Distributed Tracing

MDC gives you context within a single component. Correlation IDs give you context across an entire request — through every service, every async call, every log line that request touches.

The idea is simple: generate a unique ID at the entry point of each request. Attach it to MDC. Every log statement from that point forward carries it — as long as it runs on the same thread. When something breaks, you filter by correlation ID and see the complete trail of what happened.

Hand work off to a thread pool or a reactive pipeline and the context disappears. Sections 6 and 7 show how to handle that.

import java.util.Map;
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RequestHandler {
    private static final Logger log = LoggerFactory.getLogger(RequestHandler.class);

    public void handle(String incomingTraceId) {
        // prefer the caller's ID; generate one if absent
        String traceId = (incomingTraceId != null) ? incomingTraceId : UUID.randomUUID().toString();
        try (var ctx = Mdc.with(Map.of("traceId", traceId))) {
            log.info("Request received");
            new OrderService().placeOrder("8821", "42", 99.00);
            log.info("Request completed");
        }
    }
}

If the request came from another service that already has a trace ID (say, via an HTTP header), you reuse it. If this is the origin, you generate one. Either way, the same ID flows through everything downstream.
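The "reuse or generate" decision can be factored into a small helper. This is an illustrative sketch, not a standard: TraceIds is an invented name, and the header names are assumptions — use whatever convention your services actually agree on (or a W3C traceparent parser).

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;

public final class TraceIds {
    private TraceIds() {}

    // Hypothetical header names for illustration; conventions vary by org.
    private static final List<String> CANDIDATE_HEADERS =
            List.of("X-Correlation-Id", "X-Request-Id");

    /** Returns the first matching inbound header value, or a freshly generated ID. */
    public static String resolve(Map<String, String> headers) {
        for (String name : CANDIDATE_HEADERS) {
            String value = headers.get(name);
            if (value != null && !value.isBlank()) {
                return value;  // reuse the caller's ID
            }
        }
        return UUID.randomUUID().toString();  // this service is the origin
    }
}
```

Note that real HTTP header lookups are case-insensitive; if your server framework hands you a plain Map, normalize the keys first.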

Verify

All log statements within the request share the same traceId:

{"message":"Request received","traceId":"a3f1c2d4-...","level":"INFO"}
{"message":"Order placed: amount=99.0","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Confirming order...","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Order confirmed","traceId":"a3f1c2d4-...","orderId":"8821","userId":"42","level":"INFO"}
{"message":"Request completed","traceId":"a3f1c2d4-...","level":"INFO"}

That grep that used to return 40 ambiguous lines? Now it returns exactly the lines you care about.


6. MDC in Multi-Threaded Environments

Here’s where things get interesting — and where I’ve seen the most production surprises. You added a traceId in section 5 and it worked perfectly. Then you introduced an async task, and suddenly the traceId is gone from those log lines. No error, no warning. Just missing context.

MDC is ThreadLocal. Each thread has its own independent MDC map. This works perfectly when a single thread handles a request from start to finish. But the moment you submit work to another thread, that thread starts with an empty MDC. Your context is gone.

Platform thread pools

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.slf4j.MDC;

ExecutorService pool = Executors.newFixedThreadPool(4);

Map<String, String> mdcContext = MDC.getCopyOfContextMap();

pool.submit(() -> {
    if (mdcContext != null) {
        MDC.setContextMap(mdcContext);
    }
    try {
        log.info("Async task running");
        // ... work
    } finally {
        MDC.clear();
    }
});

The pattern is: copy the MDC map on the calling thread before you submit, then restore it as the very first thing inside the task. The finally { MDC.clear() } ensures the pool thread is clean for whoever uses it next.
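To avoid repeating that boilerplate at every submit site, the pattern can be factored into a small wrapper. This is a sketch under my own naming (MdcPropagation and wrap are invented), not a library API:

```java
import java.util.Map;
import org.slf4j.MDC;

public final class MdcPropagation {
    private MdcPropagation() {}

    /**
     * Captures the caller's MDC at wrap time and restores it around the
     * task's execution, clearing the worker thread afterwards.
     */
    public static Runnable wrap(Runnable task) {
        // Runs on the calling thread, so it sees the caller's MDC.
        Map<String, String> captured = MDC.getCopyOfContextMap();
        return () -> {
            // Runs on the worker thread.
            if (captured != null) {
                MDC.setContextMap(captured);
            }
            try {
                task.run();
            } finally {
                MDC.clear();  // leave the pool thread clean for the next task
            }
        };
    }
}
```

Then every submission becomes pool.submit(MdcPropagation.wrap(() -> log.info("Async task running"))), and the copy-and-restore logic lives in exactly one place.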

Virtual threads (Java 21)

Virtual threads behave exactly the same as platform threads with respect to MDC. Same empty-on-creation behavior, same copy-and-restore fix:

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.slf4j.MDC;

ExecutorService virtualPool = Executors.newVirtualThreadPerTaskExecutor();

Map<String, String> mdcContext = MDC.getCopyOfContextMap();

virtualPool.submit(() -> {
    if (mdcContext != null) {
        MDC.setContextMap(mdcContext);
    }
    try {
        log.info("Virtual thread task running");
    } finally {
        MDC.clear();
    }
});

Why Logback 1.5.17+ matters here: earlier versions had a race condition between MDC initialization and the logger context under high virtual thread concurrency — fixed in 1.5.17. The fix also requires SLF4J 2.0.17 or later to be fully effective; the dependency block in section 2 already satisfies this. Pin to 1.5.32 and you get all the fixes. Don’t learn this the hard way in production.

What frameworks handle for you

If you are using Spring Boot 3.x with Micrometer Tracing, most of this is handled automatically. Micrometer wraps your @Async executors and scheduled tasks with context-propagating decorators — the copy-and-restore pattern above happens behind the scenes, and trace/span IDs are populated in MDC without any manual wiring.

If you are on Spring Boot without Micrometer Tracing, Spring’s ThreadPoolTaskExecutor supports a TaskDecorator interface. One decorator implementation that copies and restores MDC replaces the manual boilerplate for every task submitted to that pool.

The patterns in this section are what those frameworks implement under the hood. Understanding them means you can debug propagation failures when the framework’s abstraction leaks — and it always leaks eventually.


7. MDC in Reactive Environments

If multi-threaded MDC surprised you, reactive will genuinely frustrate you at first. And that’s okay — I’ve been there too.

With thread pools, at least you knew when you were crossing a thread boundary. You called pool.submit(), you saw the handoff, you knew you had to copy the context. The fix was mechanical once you understood the problem.

Reactive is different. You write a pipeline — .map(), .flatMap(), .filter() — and it looks like a single flow. But under the hood, each operator may run on a different thread, decided by the scheduler, at a time you don’t control. You never called submit(). You never explicitly crossed a thread boundary. And yet your MDC context is gone halfway through the pipeline, and some log lines have the traceId while others don’t. You reread your code three times and can’t see why.

The root cause is that ThreadLocal — the mechanism MDC is built on — is fundamentally incompatible with reactive programming’s execution model. ThreadLocal assumes one thread owns the context for the duration of a task. Reactive assumes no such thing. The solution requires a different approach entirely: stop carrying context in ThreadLocal and start carrying it in the reactive context, which actually propagates through the pipeline.

Before writing any reactive code, add these two dependencies to your pom.xml:

<!-- Project Reactor -->
<dependency>
    <groupId>io.projectreactor</groupId>
    <artifactId>reactor-core</artifactId>
    <version>3.6.11</version>
</dependency>

<!-- RxJava 3 -->
<dependency>
    <groupId>io.reactivex.rxjava3</groupId>
    <artifactId>rxjava</artifactId>
    <version>3.1.10</version>
</dependency>

With the context riding in the pipeline itself rather than in thread-local storage, the remaining job is to restore MDC from it at each point where a log statement runs.

Project Reactor

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import reactor.core.publisher.Mono;
import reactor.util.context.Context;
import java.util.Map;

public Mono<String> processOrder(String orderId, String traceId) {
    return Mono.just(orderId)
        .doOnEach(signal -> {
            if (signal.isOnNext() || signal.isOnError()) {
                Map<String, String> mdcContext = signal.getContextView()
                    .<Map<String, String>>getOrDefault("MDC_CONTEXT", Map.of());
                MDC.setContextMap(mdcContext);
                if (signal.isOnNext()) {
                    log.info("Processing order");
                }
                MDC.clear();
            }
        })
        .map(id -> "processed-" + id)
        .contextWrite(Context.of("MDC_CONTEXT", Map.of("traceId", traceId, "orderId", orderId)));
}

contextWrite() loads the MDC map into the Reactor context. doOnEach() reads it back and restores it into MDC before each signal, then clears it immediately after.

Why contextWrite() goes at the end: Reactor context flows upstream — an operator can only see context written downstream of it. Put contextWrite() at the end of your chain and it’s visible to every operator above it. Put it in the middle and operators above that point are left in the dark.

RxJava

RxJava has no built-in reactive context, but it has scheduler hooks — and that’s all you need:

import io.reactivex.rxjava3.plugins.RxJavaPlugins;
import org.slf4j.MDC;
import java.util.Map;

// Call this once at application startup.
// setScheduleHandler replaces any previously registered handler —
// calling it a second time silently discards the first.
RxJavaPlugins.setScheduleHandler(runnable -> {
    Map<String, String> mdcContext = MDC.getCopyOfContextMap();
    return () -> {
        if (mdcContext != null) {
            MDC.setContextMap(mdcContext);
        }
        try {
            runnable.run();
        } finally {
            MDC.clear();
        }
    };
});

This wraps every unit of work that RxJava schedulers execute — capturing MDC at scheduling time and restoring it at execution time. Set it once at startup and forget about it.

Verify (Reactor)

{"message":"Processing order","traceId":"a3f1c2d4-...","orderId":"8821","level":"INFO","thread_name":"reactor-http-nio-3"}

The context fields are there even though the log ran on a Reactor NIO thread. That’s exactly what we want.

What frameworks handle for you

Project Reactor — automatic with Spring Boot 3 + Micrometer Tracing

Reactor 3.5.3 introduced Hooks.enableAutomaticContextPropagation(). When called at startup, it bridges ThreadLocal values — including SLF4J MDC — into Reactor’s context automatically. No contextWrite(), no doOnEach(). Spring Boot 3.x enables this automatically when micrometer-tracing is on the classpath, via the io.micrometer:context-propagation library.

Without Spring Boot, one call at startup is all it takes:

import reactor.core.publisher.Hooks;

// Call once before any pipeline runs.
Hooks.enableAutomaticContextPropagation();

After that, MDC values set before subscribing propagate through the entire pipeline — across schedulers, across operators — without any manual wiring.

RxJava — no automatic mechanism

Spring does not auto-configure RxJava MDC propagation. The RxJavaPlugins.setScheduleHandler() pattern shown above is what you need regardless of framework. The good news: you set it once at startup and it covers every observable in the application.

As with section 6: understanding the manual patterns is still worth it. When the automatic propagation breaks — and it does, usually at the boundary between a managed executor and one you created yourself — knowing what’s happening underneath is how you diagnose it.


8. Troubleshooting

I’ve hit every one of these. Hopefully this saves you some time.

MDC fields are missing in async tasks You submitted work to a thread pool without copying the MDC map first. Capture MDC.getCopyOfContextMap() on the calling thread before submitting, and call MDC.setContextMap() as the first thing inside the task.

MDC fields appear in some log lines but not others in a reactive pipeline You’re logging outside a doOnEach() block, or contextWrite() is placed upstream of the operators that need it. Move contextWrite() to the end of the chain and log only inside doOnEach().

MDC fields from one request appear in the next request’s logs A thread was returned to the pool without clearing MDC. Add MDC.clear() at the very start of each request entry point as a defensive measure — on top of the finally cleanup on the way out. Belt and suspenders.

All logs are plain text, not JSON Check that logback.xml is under src/main/resources (not src/main/java). Run mvn dependency:tree to confirm logstash-logback-encoder is on the runtime classpath. And double-check the encoder class name — it’s LogstashEncoder, not LogstashFormatter.

Empty or missing MDC in virtual threads Upgrade to Logback 1.5.17 or later. Earlier versions have MDC adapter issues under virtual thread concurrency.


9. What You Built

You now have a standalone Java application that:

- Emits every log line as a JSON object any log aggregator can ingest
- Attaches request-scoped context (order ID, user ID, trace ID) as top-level JSON fields via MDC
- Propagates that context correctly across platform thread pools, Java 21 virtual threads, Project Reactor pipelines, and RxJava schedulers

The 2am debugging session I described at the start? With this in place, you grep by traceId and see the full story of that request in seconds.

The complete code is at github.com/abadongutierrez/structured-logging-slf4j-java-companion-repo.

From here you can:

- Add a log aggregator: ship these JSON logs to Datadog, the ELK stack, or Grafana Loki without any parser configuration
- Automate trace ID injection: replace the manual UUID.randomUUID() with OpenTelemetry or Micrometer Tracing, which generate W3C-compliant trace/span IDs and populate MDC automatically
- Add Spring Boot: Spring Boot 3.x auto-configures Logback and Micrometer Tracing — the patterns here carry over directly, with less boilerplate


10. The Problem That Comes Next

You solved the single-service problem. Every log line is structured, every request is traceable, context propagates correctly regardless of how your application executes. That’s the foundation.

The next problem is consistency across services — and it’s sneakier.

I’ve worked on systems where every service emitted JSON logs and debugging was still a nightmare. One service called it userId, another user_id, a third user. One used ERROR for every exception including the expected ones. Another didn’t log external API calls at all. The aggregator had structured data, but the data had no shared meaning. You couldn’t write a monitor that worked across services. You were back to grepping, just in a fancier tool.

Structured logging without consistency is still chaos — slower chaos, with better tooling.

Part of the answer is a logging taxonomy: a shared vocabulary of event names that your entire system agrees on. Instead of free-form messages like "Order placed" or "Payment failed", you define a set of named events — order.placed, payment.failed, user.authenticated — and emit them as a dedicated field alongside the message:

{"event": "order.placed", "message": "Order placed successfully", "orderId": "8821", "amount": "99.00"}
{"event": "payment.failed", "message": "Payment declined by provider", "orderId": "8821", "reason": "insufficient_funds"}

Now your monitoring layer can work with event names instead of message strings. Alert when event = "payment.failed" exceeds a threshold. Build a dashboard that counts order.placed events per minute. Write queries that are stable across refactors — because renaming a log message doesn’t break your alerts when the event name is a contract, not a sentence.
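One lightweight way to start is a shared constants class, so event names are compile-checked instead of retyped in every service. The names and the LogEvents holder below are hypothetical, a sketch of the idea rather than a prescribed scheme:

```java
import java.util.Map;

/** A shared vocabulary of event names; the whole system agrees on these. */
public final class LogEvents {
    private LogEvents() {}

    public static final String ORDER_PLACED = "order.placed";
    public static final String PAYMENT_FAILED = "payment.failed";
    public static final String USER_AUTHENTICATED = "user.authenticated";

    /** Convenience: an MDC-ready map carrying the event field. */
    public static Map<String, String> event(String name) {
        return Map.of("event", name);
    }
}
```

Combined with the Mdc utility from section 4, a callsite becomes try (var ctx = Mdc.with(LogEvents.event(LogEvents.ORDER_PLACED))) { log.info("Order placed successfully"); } — rename the human-readable message all you like; the contract field stays stable.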

Tuesday, November 25, 2025

Abstraction in Java: Focusing on What Matters


Abstraction is one of the pillars of the Object Oriented Programming paradigm, and it is probably one of the most misunderstood concepts. I’ve seen colleagues confuse it with encapsulation many times, and, let’s be honest, I’ve done it myself. Many people think it’s just about creating interfaces and abstract classes. But abstraction is something more fundamental, and honestly, understanding it well helps you go from good code to great code.

What Actually is Abstraction?

Abstraction is the art of hiding complexity and exposing only what’s necessary to achieve the functionality. When you drive your car, you don’t need to be a mechanic or know how the internal combustion engine works. You just grab your keys, start the engine, and from there you push the gas pedal to advance, use the brake pedal to stop, and turn the steering wheel in the direction you want to go. That’s abstraction, my friend.

When using abstraction you show only the essential features of an object and hide the unnecessary details.

The goal is to reduce complexity by breaking a system down into smaller, manageable pieces, where each piece has a clear, simple, intention-revealing interface. When done correctly (and this is not simple, of course), abstraction lets you think about problems at a higher level without getting bogged down in implementation details.

Abstraction vs Encapsulation - Ok yeah, so what’s the difference then?

Great question!

Encapsulation is about controlling access to data to offer protection and avoid inconsistencies. It’s about bundling data with the methods that operate on that data, and restricting direct access to some of the internal components.

Abstraction is about hiding complexity and showing only relevant details. It’s a simplified view of something complex. Think of it as creating a simple universal remote control for a complicated entertainment system.

Both concepts hide (or protect) something: encapsulation hides information, while abstraction hides implementation details.

Let us see some code to illustrate the encapsulation concept (the following example is a shorter version of the code from another post on my blog that talks about encapsulation: https://abaddon-gtz.blogspot.com/2025/08/encapsulation-in-java-writing-secure.html)

import java.util.ArrayList;
import java.util.List;

public class BankAccount {
    // Private fields - hidden from outside access
    private String owner;
    private double balance;
    private List<String> transactions;
    private static final double MINIMUM_BALANCE = 0.0;

    // Constructor with validation (This can be better implemented as a factory method)
    public BankAccount(String owner, double initialBalance) {
        if (owner == null || owner.trim().isEmpty()) {
            throw new IllegalArgumentException("Owner name cannot be null or empty");
        }
        if (initialBalance < MINIMUM_BALANCE) {
            throw new IllegalArgumentException("Initial balance cannot be negative");
        }

        this.owner = owner;
        this.balance = initialBalance;
        this.transactions = new ArrayList<>();
        addTransaction("Account opened with balance: $" + initialBalance);
    }

    // Business logic for deposits
    public boolean deposit(double amount) {
        if (amount <= 0) {
            System.out.println("Deposit amount must be positive");
            return false;
        }

        balance += amount;
        addTransaction("Deposited: $" + amount + " | New balance: $" + balance);
        return true;
    }

    // more code
}

The focus here is protection. We’re preventing invalid states and controlling access to the internals to avoid inconsistencies.

Now look at abstraction:

// A checkout service that processes payments
public class CheckoutService {
    private final PaymentProcessor processor;

    // The concrete processor is injected; the service only knows the interface
    public CheckoutService(PaymentProcessor processor) {
        this.processor = processor;
    }

    public void checkout(double amount) {
        // simple interface, abstracting complexity
        boolean success = processor.processPayment(amount);
        if (success) {
            completeOrder();
        }
    }
}

// ABSTRACTION!
public interface PaymentProcessor {
    boolean processPayment(double amount);
}

// Implementation 1 - Credit Card (complex internal logic)
public class CreditCardProcessor implements PaymentProcessor {
    @Override
    public boolean processPayment(double amount) {
        // Hidden complexity:
        // - Validate card number
        // - Check CVV
        // - Contact payment service
        // - Handle security
        // - Deal with declined transactions or other errors
        // - Manage retry logic
        // All this complexity is hidden behind a simple interface
        return true;
    }
}

// Implementation 2 - PayPal (different complex logic)
public class PayPalProcessor implements PaymentProcessor {
    @Override
    public boolean processPayment(double amount) {
        // Hidden complexity:
        // - authentication
        // - PayPal API calls
        // - Handle different currencies
        // - etc, etc, etc
        // Again, complexity hidden
        return true;
    }
}

The focus here is simplification. We’re hiding different complex implementations behind a common, simple interface.

They Work Together

More often than not these concepts are used together, and that, I think, is why we tend to confuse them so often.

Again, using the example of driving a car, there is encapsulation and abstraction in action. You don’t have to open the hood of the car to start the engine manually, and to turn the car in the direction you want to go, you use a simple interface: the steering wheel.

Simple quick mental model to remember the concepts

Ask yourself:

  • “Am I protecting data and controlling access?” → That’s encapsulation
  • “Am I hiding complexity and providing a simpler interface?” → That’s abstraction

Both make your code better, but for different reasons. Encapsulation keeps your data safe and consistent. Abstraction keeps your code manageable and flexible.

Do I really need them?

Without proper abstraction and encapsulation, your code can become a tangled mess where:

  • You need to understand too much to do simple things
  • Changes in one place break things in unexpected places (we’ve all known this pain!)
  • New team members take forever to understand what’s going on in the code
  • Testing becomes a nightmare because everything is coupled to everything

So, yeah, you need them!

Imagine you are building a simple tool to migrate data from one database to another, and because of some business and safety requirements, there are some steps that you need to enforce, like:

  • you must create a backup before copying the data
  • after running the migration you need to validate that the data was actually copied to the target database.

Now imagine you write a class like the following:

/**
 * <Some very large and detailed documentation of how to use this tool>
 */
public class DatabaseMigrationTool {
    // lot of attributes

    public void setSourceDatabase(String host, String db, String user, String password) { ... }
    public void setTargetDatabase(String host, String db, String user, String password) { ... }

    public void createBackup() {
        // Implementation details omitted for brevity
        this.backupCreated = true;

        // NullPointerException if the source and target databases were never set :D
    }

    public void migrateData() {
        // you cannot transfer the data if the backup wasn't created
        if (!backupCreated) throw new IllegalStateException("Hey, did you forget to backup?");

        // Implementation details omitted for brevity
        this.dataTransferred = true;
    }

    public void validateMigratedData() {
        // you cannot validate the copied data if the data wasn't transferred first!
        if (!dataTransferred) throw new IllegalStateException("Come on! really?!");
    }
}

This is probably not that bad. But if a developer who doesn’t like reading documentation tries to use it, they will probably hit their head against the wall a couple of times before learning that the methods must be called in a specific order.

DatabaseMigrationTool migration = new DatabaseMigrationTool();
migration.migrateData(); // ❌ Oops! No backup!
migration.setSourceDatabase(...);
migration.validateMigratedData(); // ❌ Oops!
migration.setTargetDatabase(...);

With Abstraction (clean and intention-revealing)

Let us use the beautiful “Step Builder Pattern” that combines Fluent interfaces and the Builder pattern.

public interface FirstStep {
    SecondStep fromDatabase(String host, String db, String user, String password);
}

public interface SecondStep {
    ThirdStep toDatabase(String host, String db, String user, String password);
}

public interface ThirdStep {
    FourthStep withBackup();
}

public interface FourthStep {
    FifthStep migrateData();
}

public interface FifthStep {
    FinalStep validateMigration();
}

public interface FinalStep {
    void finish();
}

public class DatabaseMigrationTool {
    // you have the same attributes and methods defined before but they are private
    // no one can call them if not using the builder

    // The builder!
    public static FirstStep builder() {
        return new Builder();
    }

    private static class Builder
        implements FirstStep, SecondStep, ThirdStep, FourthStep, FifthStep, FinalStep {
        private DatabaseMigrationTool tool = new DatabaseMigrationTool();

        @Override
        public SecondStep fromDatabase(String host, String db, String user, String password) {
            // Add validation here
            this.tool.setSourceDatabase(host, db, user, password);
            return this;
        }

        @Override
        public ThirdStep toDatabase(String host, String db, String user, String password) {
            // Add validation here
            this.tool.setTargetDatabase(host, db, user, password);
            return this;
        }

        @Override
        public FourthStep withBackup() {
            this.tool.createBackup();
            return this;
        }

        @Override
        public FifthStep migrateData() {
            this.tool.migrateData();
            return this;
        }

        @Override
        public FinalStep validateMigration() {
            this.tool.validateMigratedData();
            return this;
        }

        @Override
        public void finish() {
            System.out.println("Done!");
        }
    }
}

Now if some developer wants to use it, the different interfaces that define the flow of the steps enforce the correct usage:

DatabaseMigrationTool.builder()
    .fromDatabase(...)
    .toDatabase(...)
    .withBackup()
    .migrateData()
    .validateMigration()
    .finish();
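To see the whole pattern in one self-contained, compilable piece, here is a deliberately tiny step builder (a hypothetical Greeting example, not part of the migration tool) where the interface chain makes out-of-order calls a compile error:

```java
public class Greeting {
    private String salutation;
    private String name;

    public interface NameStep { BuildStep withName(String name); }
    public interface BuildStep { String build(); }

    // Entry point: forces the salutation first, then the name, then build()
    public static NameStep salutation(String salutation) {
        Greeting g = new Greeting();
        g.salutation = salutation;
        return new Builder(g);
    }

    // One private builder implements every step, just like in the migration tool
    private record Builder(Greeting g) implements NameStep, BuildStep {
        public BuildStep withName(String name) { g.name = name; return this; }
        public String build() { return g.salutation + ", " + g.name + "!"; }
    }
}
```

`Greeting.salutation("Hello").withName("Ada").build()` returns `"Hello, Ada!"`, while `Greeting.salutation("Hello").build()` simply does not compile — the compiler enforces the order for you.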

You could probably say: “I can simply define an ‘execute’ method calling all the methods in the correct order”. Yes, of course, but what if later you need to add other steps that can be skipped according to the needs of each individual user of the tool?

public interface SkippableStep {
    NextStep executeStep();
    NextStep skipStep();
}

That’s the power of abstraction.
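Here is a minimal runnable sketch of that skippable-step idea (the PriceBuilder names are hypothetical, not part of the migration tool): the optional step exposes two methods, both advancing to the next step, so skipping is still an explicit, compile-checked decision:

```java
public class PriceBuilder {
    private double price;

    public interface DiscountStep {
        FinalStep withDiscount(double pct); // apply a discount, then continue
        FinalStep skipDiscount();           // consciously opt out, then continue
    }
    public interface FinalStep { double total(); }

    public static DiscountStep price(double base) {
        PriceBuilder b = new PriceBuilder();
        b.price = base;
        return new Steps(b);
    }

    private record Steps(PriceBuilder b) implements DiscountStep, FinalStep {
        public FinalStep withDiscount(double pct) { b.price *= (1 - pct); return this; }
        public FinalStep skipDiscount() { return this; }
        public double total() { return b.price; }
    }
}
```

Both `PriceBuilder.price(100).withDiscount(0.10).total()` and `PriceBuilder.price(100).skipDiscount().total()` compile, but jumping straight from `price(...)` to `total()` does not — the optional step must be explicitly taken or skipped.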

Achieving Abstraction with Java

1. Interfaces - The Foundation

Interfaces define contracts without implementation details.

public interface PaymentProcessor {
    PaymentResult processPayment(double amount, PaymentMethod method);
    boolean refund(String transactionId);
    TransactionStatus checkStatus(String transactionId);
}

// Different implementations
public class StripePaymentProcessor implements PaymentProcessor { /* ... */ }
public class PayPalPaymentProcessor implements PaymentProcessor { /* ... */ }

2. Abstract Classes - use them when you need some common stuff

Use abstract classes when you have shared behavior but still need abstraction:

public abstract class ReportGenerator {
    // Template method design pattern - defines the overall algorithm structure
    public final Report generate(String data) {
        Report report = createReport();
        report.setHeader(generateHeader());
        report.setBody(processData(data));
        report.setFooter(generateFooter());
        return report;
    }

    // shared behavior
    protected String generateHeader() {
        return "Report Generated: " + LocalDateTime.now();
    }

    // Abstract methods - let subclasses decide how to implement the functionality
    protected abstract Report createReport();
    protected abstract String processData(String data);
    protected abstract String generateFooter();
}
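Since Report isn’t defined in this post, here is a self-contained mini version of the same template-method idea (hypothetical TextPipeline names) that you can actually run:

```java
// Template method in miniature: run() fixes the order, subclasses fill the hole
abstract class TextPipeline {
    public final String run(String data) {
        return header() + "\n" + process(data);
    }
    protected String header() { return "== report =="; }  // shared behavior
    protected abstract String process(String data);       // subclass decides
}

class UpperCasePipeline extends TextPipeline {
    @Override
    protected String process(String data) { return data.toUpperCase(); }
}
```

Calling `new UpperCasePipeline().run("sales up")` yields `"== report ==\nSALES UP"` — the subclass only decided how to process the data; the abstract class decided when.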

3. Depend on Abstractions, Not Concrete Implementations

Your high-level code should depend on interfaces, not concrete classes. That’s the SOLID Dependency Inversion Principle!

// Bad
public class OrderService {
    private MySQLOrderRepository repository; // Tied to MySQL!

    public OrderService() {
        this.repository = new MySQLOrderRepository();
    }
}

// Good - depends on abstraction
public class OrderService {
    private final OrderRepository repository; // Any implementation works!

    public OrderService(OrderRepository repository) {
        this.repository = repository;
    }
}
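To make the benefit concrete, here is a hedged sketch (OrderRepository and the in-memory fake are illustrative names, not from an actual codebase) showing how constructor injection lets a test swap in a fake without touching OrderService:

```java
import java.util.*;

interface OrderRepository {
    void save(String orderId);
    List<String> findAll();
}

// A fake used in tests: no database, just a list
class InMemoryOrderRepository implements OrderRepository {
    private final List<String> orders = new ArrayList<>();
    public void save(String orderId) { orders.add(orderId); }
    public List<String> findAll() { return List.copyOf(orders); }
}

class OrderService {
    private final OrderRepository repository;
    OrderService(OrderRepository repository) { this.repository = repository; }
    void placeOrder(String orderId) { repository.save(orderId); }
}
```

A test can now do `InMemoryOrderRepository repo = new InMemoryOrderRepository(); new OrderService(repo).placeOrder("8821");` and assert on `repo.findAll()` — with no MySQL anywhere in sight.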

Some good practices I’ve learned

  • Start with the interface, not the implementation. When designing a new component, ask yourself: “What operations do I need?” not “How will I implement this?” Design the contract first.
  • Keep abstractions focused. An interface should represent one concept. If your interface is doing too much, split it up. SOLID Interface Segregation Principle is your friend here.
  • Don’t create abstractions right away. Wait until you have at least two implementations or a clear reason. Over-abstraction is as bad as under-abstraction. I’ve made this mistake more times than I’d like to admit.
  • Name abstractions by what they do, not how they do it. Use NotificationChannel instead of EmailSender. The abstraction shouldn’t leak implementation details in its name.
  • Test through abstractions. Write tests against interfaces, not concrete classes. This makes your tests more resilient to implementation changes and you can use mocks this way.

Conclusion

Abstraction is about managing complexity by hiding unnecessary details and exposing clean, intuitive interfaces. When you get it right, your code becomes easier to understand, easier to test, and way easier to change. Getting abstractions right does not happen on the first try; it requires practice, knowledge of design patterns, and experimentation.

Good abstractions reduce the need for extensive documentation (caution, I’m not saying you don’t need documentation) because the code itself clearly expresses intent. The abstractions become the language you use to think about and discuss your system.

Do you have other examples of abstraction to share?