In financial-grade mobile banking, logs are the system's black box: engineering teams use them to reconstruct request chains, and auditors and regulators use them to establish facts.

In the monolith era, writing to local disk and archiving was enough. With microservices and container clouds, instances multiply and log volume explodes; under high concurrency, excessive logging saturates I/O, drags down the CPU, and the main request path stalls with it. A workable logging system must therefore be high-performance, highly available, and low-cost all at once, while also meeting compliance requirements.

We group the approach into four areas: collection, configuration, conventions, and load testing.

1. Collection: the application writes only to stdout; the platform takes over end to end.

  • LogAgent: collects stdout on each node and ships it to the Collector, which fans out to Kafka/ES/object storage; nothing is lost across container restarts and query semantics stay consistent.
  • No large objects on the main path: transaction logs record only a digest (size, hash, key fields); full requests/responses and slow SQL go through an async side channel, so the main path never blocks (see the sketch after this list).
  • A degradable pipeline: multi-replica redundancy; under backpressure, drop DEBUG/INFO first and keep WARN/ERROR.
  • Cost under control: track GB/day and monitor backlog and latency; inject traceId/spanId into logs so requests stay traceable across services.
  • Protocol choice: prefer gRPC/HTTP with batching and compression; weigh the complexity of Netty/TCP only at extreme throughput.
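
To make "no large objects on the main path" concrete, here is a minimal sketch of digest-only transaction logging. The class and method names are illustrative assumptions, not an existing API; only size, a hash prefix, and key fields reach the hot path.

package com.example.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class PayloadDigest {
    private static final Logger log = LoggerFactory.getLogger(PayloadDigest.class);

    /** Logs one short digest line instead of the payload itself. */
    public static void logDigest(String orderId, String body) throws Exception {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        byte[] d = MessageDigest.getInstance("SHA-256").digest(bytes);
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 8; i++) hex.append(String.format("%02x", d[i])); // first 16 hex chars
        // One compact line on the main path; the full body, when needed, goes to the async side channel.
        log.info("txn_digest orderId={} size={} sha256_16={}", orderId, bytes.length, hex);
    }
}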

2. Configuration: move the cost of writing logs off the main thread.

  • Use AsyncAppender: business threads enqueue, a background thread writes in batches;

Note on queue sizing: queueSize (in entries) ≈ log lines per second × tolerated backlog seconds; the memory it pins is roughly queueSize × average bytes per line, and is bounded by available JVM heap. For example, 10k lines/s with 2s of tolerated backlog gives queueSize ≈ 20,000, which pins about 20MB at 1KB per line.

  • Configure conditional discarding so only low-level events are dropped.
  • Emit JSON with epochMillis timestamps; format them for display downstream.
  • Unified fields: traceId, spanId, app, pod, instance, thread, uid, ip, channel, uri, method, status, rt_ms; injected via MDC into the output template.
  • Roll archives by time + size + level; keep ERROR long and INFO/DEBUG short.
  • Rate-limit each container on the agent side so logs cannot saturate I/O.

Code examples

2.1 logback.xml (async, JSON, per-level retention, dedicated payload/ERROR channels)

Location: src/main/resources/logback.xml

Notes:

  • APP_FILE: the INFO/DEBUG main channel (short retention, JSON, size+time rolling).
  • ERROR_FILE: the dedicated ERROR channel (long retention).
  • PAYLOAD_FILE: the side channel for large payloads/slow SQL (short retention, isolated).
  • All three are wrapped in AsyncAppender, so business threads never block.
<configuration>
  <!-- Override via environment variables as needed -->
  <property name="APP" value="mobile-bank"/>
  <property name="LOG_HOME" value="/data/logs/${APP}"/>

  <!-- ========== Main channel (INFO/DEBUG): JSON + epochMillis + size/time rolling ========== -->
  <appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${LOG_HOME}/app.log</file>
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeMdc>true</includeMdc>
      <timeZone>UTC</timeZone>
      <timestampPattern>[UNIX_TIMESTAMP_AS_NUMBER]</timestampPattern> <!-- epoch millis -->
      <fieldNames>
        <timestamp>ts</timestamp>
        <level>lvl</level>
        <thread>th</thread>
        <logger>logger</logger>
        <message>msg</message>
      </fieldNames>
    </encoder>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
      <fileNamePattern>${LOG_HOME}/app.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
      <maxFileSize>128MB</maxFileSize>   <!-- per-file cap -->
      <maxHistory>14</maxHistory>        <!-- short retention: 14 days -->
      <totalSizeCap>50GB</totalSizeCap>  <!-- total size cap -->
    </rollingPolicy>
    <!-- Reject ERROR (it goes to the dedicated channel) -->
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
      <level>ERROR</level>
      <onMatch>DENY</onMatch>
      <onMismatch>NEUTRAL</onMismatch>
    </filter>
  </appender>

  <!-- ========== Dedicated ERROR channel (long retention, for audit/accountability) ========== -->
  <appender name="ERROR_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${LOG_HOME}/error.log</file>
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeMdc>true</includeMdc>
      <timeZone>UTC</timeZone>
      <timestampPattern>[UNIX_TIMESTAMP_AS_NUMBER]</timestampPattern>
    </encoder>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
      <fileNamePattern>${LOG_HOME}/error.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
      <maxFileSize>64MB</maxFileSize>
      <maxHistory>90</maxHistory>         <!-- long retention: 90 days (adjust as needed) -->
      <totalSizeCap>100GB</totalSizeCap>
    </rollingPolicy>
    <!-- Accept ERROR only -->
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
      <level>ERROR</level>
      <onMatch>ACCEPT</onMatch>
      <onMismatch>DENY</onMismatch>
    </filter>
  </appender>

  <!-- ========== Large-payload / slow-SQL side channel (short retention + isolation) ========== -->
  <appender name="PAYLOAD_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${LOG_HOME}/payload.log</file>
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
      <fileNamePattern>${LOG_HOME}/payload.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
      <maxFileSize>64MB</maxFileSize>
      <maxHistory>3</maxHistory>          <!-- very short retention -->
    </rollingPolicy>
  </appender>

  <!-- ========== Async wrappers: business threads enqueue, a background thread writes in batches ========== -->
  <!-- Queue sizing: queueSize (entries) ≈ lines/sec × tolerated backlog seconds; memory ≈ queueSize × bytes/line -->
  <appender name="ASYNC_APP" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="APP_FILE"/>
    <queueSize>20000</queueSize>               <!-- e.g. 10k lines/s × 2s tolerated backlog -->
    <discardingThreshold>30</discardingThreshold> <!-- when nearly full, drop DEBUG/INFO only -->
    <neverBlock>true</neverBlock>               <!-- never block business threads -->
    <includeCallerData>false</includeCallerData>
  </appender>

  <appender name="ASYNC_ERROR" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="ERROR_FILE"/>
    <queueSize>5000</queueSize>                 <!-- error volume is usually small -->
    <discardingThreshold>0</discardingThreshold><!-- never drop ERROR -->
    <neverBlock>true</neverBlock>
  </appender>

  <appender name="ASYNC_PAYLOAD" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="PAYLOAD_FILE"/>
    <queueSize>5000</queueSize>
    <discardingThreshold>10</discardingThreshold> <!-- at the high watermark, drop low-level payload first -->
    <neverBlock>true</neverBlock>
  </appender>

  <!-- Root logger: INFO by default, writes to the main + ERROR channels -->
  <root level="INFO">
    <appender-ref ref="ASYNC_APP"/>
    <appender-ref ref="ASYNC_ERROR"/>
  </root>

  <!-- Dedicated logger: full payloads, slow SQL, etc. -->
  <logger name="payload.logger" level="INFO" additivity="false">
    <appender-ref ref="ASYNC_PAYLOAD"/>
  </logger>
</configuration>

Maven dependency (for the JSON encoder):

<dependency>
  <groupId>net.logstash.logback</groupId>
  <artifactId>logstash-logback-encoder</artifactId>
  <version>7.4</version>
</dependency>
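
For orientation, one line emitted by the APP_FILE encoder above would look roughly like this (illustrative values; the exact field set depends on the encoder version, which also adds fields such as @version and level_value):

{"ts":1718000000123,"lvl":"INFO","th":"http-nio-8080-exec-1","logger":"com.example.web.DemoController","msg":"api_call in={...} out={...} rt_ms=12","traceId":"c1f3a0...","app":"mobile-bank","uri":"/demo/query","method":"GET"}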

2.2 MDC filter (injects the unified fields)

Location: src/main/java/com/example/logging/MdcFilter.java

Purpose: when a request arrives, put the unified fields into MDC so the log template emits them automatically; they must be cleared when the request finishes.

package com.example.logging;

import org.slf4j.MDC;

import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import java.io.IOException;
import java.net.InetAddress;

public class MdcFilter implements Filter {

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {

        HttpServletRequest request = (HttpServletRequest) req;
        try {
            // 1) traceId / spanId: from the gateway/APM (generate locally as a fallback)
            String traceId = request.getHeader("X-Trace-Id");
            if (traceId == null || traceId.isEmpty()) {
                traceId = java.util.UUID.randomUUID().toString();
            }
            MDC.put("traceId", traceId);

            String spanId = request.getHeader("X-Span-Id");
            if (spanId != null) MDC.put("spanId", spanId);

            // 2) Unified base dimensions
            MDC.put("app", System.getenv().getOrDefault("APP_NAME", "mobile-bank"));
            MDC.put("pod", System.getenv().getOrDefault("HOSTNAME", "N/A"));
            MDC.put("instance", InetAddress.getLocalHost().getHostName());
            MDC.put("thread", Thread.currentThread().getName());

            // 3) Request dimensions
            MDC.put("uri", request.getRequestURI());
            MDC.put("method", request.getMethod());
            MDC.put("ip", request.getRemoteAddr());
            MDC.put("channel", request.getHeader("X-Channel") == null ? "unknown" : request.getHeader("X-Channel"));

            // 4) Business dimensions (example: user ID)
            String uid = request.getHeader("X-User-Id");
            if (uid != null) MDC.put("uid", uid);

            chain.doFilter(req, res);
        } finally {
            // Prevent a "dirty MDC" when threads are reused
            MDC.clear();
        }
    }
}

Registering the filter (Spring Boot):

package com.example.logging;

import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FilterConfig {
    @Bean
    public FilterRegistrationBean<MdcFilter> mdcFilterRegistration() {
        FilterRegistrationBean<MdcFilter> reg = new FilterRegistrationBean<>();
        reg.setFilter(new MdcFilter());
        reg.addUrlPatterns("/*");
        reg.setOrder(1); // run as early as possible so all later logs carry MDC
        return reg;
    }
}
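
One caveat worth a sketch: MDC is thread-local, so the fields above are lost when work hops to a thread pool or async executor. A minimal wrapper (the class name is an illustrative assumption) copies the context across; SLF4J's getCopyOfContextMap/setContextMap do the heavy lifting.

package com.example.logging;

import org.slf4j.MDC;
import java.util.Map;

/** Carries the caller's MDC context onto a pooled/async thread. */
public final class MdcRunnable implements Runnable {
    private final Runnable delegate;
    private final Map<String, String> context = MDC.getCopyOfContextMap(); // captured at construction

    public MdcRunnable(Runnable delegate) { this.delegate = delegate; }

    @Override public void run() {
        Map<String, String> previous = MDC.getCopyOfContextMap();
        if (context != null) MDC.setContextMap(context);
        try {
            delegate.run();
        } finally {
            // Restore the worker thread's own context (or clear it).
            if (previous != null) MDC.setContextMap(previous); else MDC.clear();
        }
    }
}

Usage: executor.submit(new MdcRunnable(() -> log.info("async step"))) keeps traceId and friends on the async log lines.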

2.3 Agent-side rate limiting (Fluent Bit → Kafka example)

Per-container rate limiting is usually achieved by combining match rules, backpressure/throttle plugins, and backend quotas. Below is a simplified example.

# /etc/fluent-bit/fluent-bit.conf (or a K8s ConfigMap)
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            docker
    Tag               kube.*
    Mem_Buf_Limit     256MB     # keep one agent from exhausting node memory
    Skip_Long_Lines   On

[FILTER]
    Name              kubernetes
    Match             kube.*
    Merge_Log         On
    Keep_Log          Off

# (Optional) throttle/sampling filter; enable depending on plugin support
#[FILTER]
#    Name          throttle
#    Match         kube.*
#    Rate          8000     # records/sec
#    Window        1

[OUTPUT]
    Name          kafka
    Match         kube.*
    Brokers       kafka-0:9092,kafka-1:9092
    Topics        app-logs
    rdkafka.batch.num.messages  10000
    rdkafka.compression.codec   gzip

3. Conventions: developers write only business code; the framework guarantees compliance and bounds log volume.

  • AOP automatically captures inputs, outputs, latency, and exceptions;
  • A Sanitizer uniformly applies digests, masking, truncation, and a hard per-line cap (4-8KB).
  • Exception stacks and large payloads use a dedicated Logger with its own rate limit; high-sensitivity, high-frequency endpoints (transfer/payment) use digest mode, while low-sensitivity, medium-frequency endpoints (query/report) sample 1-5% of full payloads through the side channel.
  • Level policy: normal calls get a one-line INFO digest; timeouts/errors get WARN/ERROR plus a digest; full payloads must go through the rate-limited side channel.
  • Every policy sits behind a config-center switch with gradual rollout, so emergency changes need no release.
  • SQL/HTTP size limits: >16KB, digest only; >256KB, metadata only (URI, status, rt, size, hash).
  • Audit trails are kept apart from diagnostic logs: immutable, replayable, structured storage with its own billing and retention, satisfying tamper-resistance and traceability requirements.
  • Always use parameterized logging; string concatenation is forbidden (illustrated below).
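
A minimal illustration of the last rule, assuming an SLF4J logger (buildExpensiveDump is a hypothetical helper):

package com.example.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingStyle {
    private static final Logger log = LoggerFactory.getLogger(LoggingStyle.class);

    void example(String uid, double balance) {
        // Bad: concatenation builds the string even when the level is disabled.
        // log.info("user " + uid + " balance " + balance);

        // Good: placeholders defer formatting until after the level check.
        log.info("user={} balance={}", uid, balance);

        // Expensive arguments deserve an explicit guard on top:
        if (log.isDebugEnabled()) {
            log.debug("ctx={}", buildExpensiveDump());
        }
    }

    private String buildExpensiveDump() { return "..."; }
}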

Code examples

3.1 Policy parameters (YAML + hot reload)

# application.yml
log:
  feature:
    enabled: true                 # master switch (in an emergency, turn off the enhanced logic and keep only ERROR)
    samplePercent: 0.02           # query/report endpoints: sample full payloads at 1-5%
    hardCapBytes: 8192            # hard cap per log line (4-8KB)
    maxStringChars: 512           # max characters per field
    maxCollectionElements: 10     # sample count for collections/arrays
    httpBodySummaryThreshold: 16384    # >16KB: digest only
    httpBodyMetaOnlyThreshold: 262144  # >256KB: metadata only
    timeoutMsWarn: 1000           # timeout threshold (ms); log WARN when exceeded
    highRiskApis:                 # high-sensitivity, high-frequency: strict digests (transfer/payment)
      - "/transfer/**"
      - "/payment/**"
    lowRiskApis:                  # low-sensitivity, medium-frequency: sampled full payloads allowed (query/report)
      - "/query/**"
      - "/report/**"

Binding and hot reload (example):

package com.example.logging.cfg;

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.cloud.context.config.annotation.RefreshScope;
import org.springframework.stereotype.Component;
import java.util.List;

@Data
@RefreshScope
@Component
@ConfigurationProperties(prefix = "log.feature")
public class LogFeatureProps {
  private boolean enabled = true;
  private double samplePercent = 0.02;
  private int hardCapBytes = 8192;
  private int maxStringChars = 512;
  private int maxCollectionElements = 10;
  private int httpBodySummaryThreshold = 16 * 1024;
  private int httpBodyMetaOnlyThreshold = 256 * 1024;
  private int timeoutMsWarn = 1000;
  private List<String> highRiskApis;
  private List<String> lowRiskApis;
}

3.2 AOP: automatically capture inputs/outputs/latency/exceptions (paired with the Sanitizer)

package com.example.logging;

import com.example.logging.cfg.LogFeatureProps;
import com.example.logging.sanitize.LogSanitizer;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.util.Map;

@Slf4j
@Aspect
@Component
@RequiredArgsConstructor
public class ApiLogAspect {

    private static final Logger PAYLOAD_LOG = LoggerFactory.getLogger("payload.logger");
    private final LogFeatureProps props;

    @Around("within(@org.springframework.web.bind.annotation.RestController *)")
    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.currentTimeMillis();

        // 1) Input digest (sample collections, truncate strings, mask sensitive fields)
        Map<String, Object> inBrief = LogSanitizer.briefArgs(pjp.getArgs(),
                props.getMaxCollectionElements(), props.getMaxStringChars());

        try {
            Object ret = pjp.proceed();
            long rt = System.currentTimeMillis() - start;

            // 2) Output digest
            Map<String, Object> outBrief = LogSanitizer.briefResult(ret,
                    props.getMaxCollectionElements(), props.getMaxStringChars());

            // 3) Slow-call check: WARN above the threshold
            if (rt > props.getTimeoutMsWarn()) {
                log.warn("api_slow in={} out={} rt_ms={}", inBrief, outBrief, rt);
            } else {
                log.info("api_call in={} out={} rt_ms={}", inBrief, outBrief, rt);
            }

            // 4) Low-risk endpoints: sample the full payload into the side channel (isolated + short retention)
            if (isLowRiskApi() && Math.random() < props.getSamplePercent()) {
                PAYLOAD_LOG.info("full_payload body={}", LogSanitizer.safeToString(ret, props));
            }

            return ret;
        } catch (Throwable t) {
            long rt = System.currentTimeMillis() - start;
            // 5) Exception digest (the stack goes to the default ERROR channel, or to the payload channel if required)
            log.error("api_fail in={} rt_ms={} ex={}", inBrief, rt, t.toString(), t);
            throw t;
        }
    }

    private boolean isLowRiskApi() {
        // Simplified placeholder; a real implementation is sketched right after this class
        return true;
    }
}
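
A hedged sketch of what isLowRiskApi() could look like in practice. It assumes the MdcFilter above has already put the request URI into MDC under "uri", and uses Spring's AntPathMatcher for the /query/** style patterns; the method is a drop-in replacement inside ApiLogAspect.

// Additional imports for ApiLogAspect:
import org.slf4j.MDC;
import org.springframework.util.AntPathMatcher;

// Inside ApiLogAspect:
    private static final AntPathMatcher MATCHER = new AntPathMatcher();

    /** Matches the current request URI (from MDC) against the configured low-risk patterns. */
    private boolean isLowRiskApi() {
        String uri = MDC.get("uri");
        if (uri == null || props.getLowRiskApis() == null) return false;
        return props.getLowRiskApis().stream().anyMatch(p -> MATCHER.match(p, uri));
    }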

3.3 Sanitizer: digest/mask/truncate/hard cap (4-8KB per line)

package com.example.logging.sanitize;

import com.example.logging.cfg.LogFeatureProps;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;

/**
 * Safely converts any object into a short, loggable structure:
 * - String: truncated when too long, annotated with the original length
 * - Collection/Array: only size + the first N samples
 * - Map: only the first N entries (iteration order)
 * - byte[]: no body, only len + the first 16 hex chars of the hash
 * - Sensitive fields are masked uniformly
 * - If the final string exceeds the hard cap (4-8KB), it is truncated again
 */
public class LogSanitizer {

    public static Map<String, Object> briefArgs(Object[] args, int maxElements, int maxChars) {
        Map<String, Object> result = new LinkedHashMap<>();
        if (args == null) return result;
        for (int i = 0; i < Math.min(args.length, maxElements); i++) {
            result.put("arg" + i, sanitize(args[i], maxElements, maxChars));
        }
        if (args.length > maxElements) result.put("_rest", args.length - maxElements);
        return result;
    }

    public static Map<String, Object> briefResult(Object ret, int maxElements, int maxChars) {
        return Map.of("result", sanitize(ret, maxElements, maxChars));
    }

    /** Safe toString for the side channel; enforces the hard cap */
    public static String safeToString(Object obj, LogFeatureProps props) {
        String s = String.valueOf(sanitize(obj, props.getMaxCollectionElements(), props.getMaxStringChars()));
        if (s.getBytes(StandardCharsets.UTF_8).length > props.getHardCapBytes()) {
            return s.substring(0, Math.min(s.length(), props.getMaxStringChars())) + "...[HARD-CAP," + props.getHardCapBytes() + "B]";
        }
        return s;
    }

    @SuppressWarnings("unchecked")
    public static Object sanitize(Object obj, int maxElements, int maxChars) {
        if (obj == null) return null;

        if (obj instanceof Number || obj instanceof Boolean) return obj;

        if (obj instanceof CharSequence) {
            String s = maskSensitive(obj.toString());
            return s.length() > maxChars
                    ? s.substring(0, maxChars) + "...[TRUNCATED,len=" + s.length() + "]"
                    : s;
        }

        if (obj instanceof byte[]) {
            byte[] arr = (byte[]) obj;
            return Map.of(
                    "type", "bytes",
                    "len", arr.length,
                    "sha256_16", sha256Hex(arr).substring(0, 16)
            );
        }

        if (obj.getClass().isArray()) {
            int len = java.lang.reflect.Array.getLength(obj);
            List<Object> sample = new ArrayList<>();
            for (int i = 0; i < Math.min(len, maxElements); i++) {
                sample.add(sanitize(java.lang.reflect.Array.get(obj, i), maxElements, maxChars));
            }
            return Map.of("type", "array", "size", len, "sample", sample);
        }

        if (obj instanceof Collection<?> col) {
            List<Object> sample = new ArrayList<>();
            int cnt = 0;
            for (Object o : col) {
                if (cnt++ >= maxElements) break;
                sample.add(sanitize(o, maxElements, maxChars));
            }
            return Map.of("type", "collection", "size", col.size(), "sample", sample);
        }

        if (obj instanceof Map<?, ?> m) {
            Map<Object, Object> sample = new LinkedHashMap<>();
            int cnt = 0;
            for (Map.Entry<?, ?> e : m.entrySet()) {
                if (cnt++ >= maxElements) break;
                sample.put(e.getKey(), sanitize(e.getValue(), maxElements, maxChars));
            }
            // Plain LinkedHashMap instead of the double-brace anti-pattern (no anonymous class).
            Map<String, Object> out = new LinkedHashMap<>();
            out.put("type", "map");
            out.put("size", m.size());
            out.put("sample", sample);
            out.put("rest", Math.max(0, m.size() - maxElements));
            return out;
        }

        String s = maskSensitive(String.valueOf(obj));
        return s.length() > maxChars ? s.substring(0, maxChars) + "...[TRUNCATED,len=" + s.length() + "]" : s;
    }

    /** Simple masking examples (phone/card numbers); extend per your data-protection policy */
    public static String maskSensitive(String s) {
        if (s == null) return null;
        s = s.replaceAll("(\\b1\\d{2})\\d{4}(\\d{4}\\b)", "$1****$2");   // 手机号 11位:前3后4
        s = s.replaceAll("(\\b\\d{6})\\d+(\\d{4}\\b)", "$1******$2");   // 卡号:前6后4
        return s;
    }

    private static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] d = md.digest(data);
            return bytesToHex(d);
        } catch (Exception e) {
            return "sha256_error";
        }
    }
    private static String bytesToHex(byte[] bytes) {
        // Arrays.stream(...) has no byte[] overload, so a plain loop is used here.
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}
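
A quick usage sketch showing the digest and masking behavior (printed shapes are approximate):

package com.example.logging.sanitize;

import java.util.List;
import java.util.Map;

public class SanitizerDemo {
    public static void main(String[] args) {
        // Phone number masked to 138****5678 by maskSensitive.
        System.out.println(LogSanitizer.sanitize("13812345678", 10, 512));

        // A 500-element list collapses to roughly {type=collection, size=500, sample=[0..9]}.
        List<Integer> big = java.util.stream.IntStream.range(0, 500).boxed().toList();
        System.out.println(LogSanitizer.sanitize(big, 10, 512));

        // A map keeps only the first N entries as a sample, plus a rest count.
        System.out.println(LogSanitizer.sanitize(Map.of("k1", "v1", "k2", "v2"), 1, 512));
    }
}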

3.4 HTTP/SQL size limits (>16KB: digest; >256KB: metadata only)

A simplified filter: it measures request/response body sizes and records them in MDC; the thresholds then decide between digest-only and metadata-only logging.

package com.example.logging.http;

import com.example.logging.cfg.LogFeatureProps;
import org.slf4j.MDC;
import org.springframework.util.StreamUtils;

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class HttpSizeFilter implements Filter {
    private final LogFeatureProps props;
    public HttpSizeFilter(LogFeatureProps props) { this.props = props; }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {

        CachingRequestWrapper request = new CachingRequestWrapper((HttpServletRequest) req);
        CachingResponseWrapper response = new CachingResponseWrapper((HttpServletResponse) res);

        try {
            chain.doFilter(request, response);
        } finally {
            byte[] in = request.getBody();
            byte[] out = response.getBody();

            int inSize = in == null ? 0 : in.length;
            int outSize = out == null ? 0 : out.length;

            MDC.put("req_size", String.valueOf(inSize));
            MDC.put("resp_size", String.valueOf(outSize));
            MDC.put("req_hash16", hash16(in));
            MDC.put("resp_hash16", hash16(out));

            response.copyBodyToResponse(); // write the body back to the client
        }
    }

    private static class CachingRequestWrapper extends HttpServletRequestWrapper {
        private final byte[] body;
        CachingRequestWrapper(HttpServletRequest request) throws IOException {
            super(request);
            this.body = StreamUtils.copyToByteArray(request.getInputStream());
        }
        public byte[] getBody() { return body; }
        @Override public ServletInputStream getInputStream() {
            ByteArrayInputStream bais = new ByteArrayInputStream(body);
            return new ServletInputStream() {
                @Override public boolean isFinished() { return bais.available() == 0; }
                @Override public boolean isReady() { return true; }
                @Override public void setReadListener(ReadListener readListener) {}
                @Override public int read() { return bais.read(); }
            };
        }
    }

    private static class CachingResponseWrapper extends HttpServletResponseWrapper {
        private final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        private ServletOutputStream out;
        private PrintWriter writer;

        CachingResponseWrapper(HttpServletResponse response) throws IOException {
            super(response);
        }

        @Override public ServletOutputStream getOutputStream() {
            if (out == null) {
                out = new ServletOutputStream() {
                    @Override public boolean isReady() { return true; }
                    @Override public void setWriteListener(WriteListener writeListener) {}
                    @Override public void write(int b) { bos.write(b); }
                };
            }
            return out;
        }

        @Override public PrintWriter getWriter() {
            if (writer == null) writer = new PrintWriter(new OutputStreamWriter(bos, StandardCharsets.UTF_8));
            return writer;
        }

        public byte[] getBody() { return bos.toByteArray(); }

        public void copyBodyToResponse() throws IOException {
            if (writer != null) writer.flush();
            byte[] bytes = bos.toByteArray();
            ServletOutputStream os = super.getOutputStream();
            os.write(bytes);
            os.flush();
        }
    }

    private static String hash16(byte[] data) {
        if (data == null || data.length == 0) return "-";
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 8; i++) sb.append(String.format("%02x", digest[i]));
            return sb.toString();
        } catch (Exception e) { return "hash_err"; }
    }
}

Decision logic (see the sketch below):

  • resp_size > 256KB → record only URI/status/rt/size/hash (metadata).
  • resp_size > 16KB → digest only (handled by LogSanitizer).
  • Otherwise a normal digest; full payloads must go through the sampled, rate-limited side channel.
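
A hedged sketch of that branching. Note that the simplified filter above records the sizes only after the chain has returned, so the easiest place to apply the thresholds is the end of HttpSizeFilter.doFilter rather than the aspect itself; LOG, logBySize, and the log_mode flag are illustrative names.

// Additional imports for HttpSizeFilter:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Inside HttpSizeFilter:
    private static final Logger LOG = LoggerFactory.getLogger(HttpSizeFilter.class);

    /** Called at the end of doFilter, once respSize and respHash16 are known. */
    private void logBySize(HttpServletRequest request, int respSize, String respHash16) {
        if (respSize > props.getHttpBodyMetaOnlyThreshold()) {
            // >256KB: metadata only - URI/size/hash, never the body.
            LOG.info("resp_meta uri={} size={} hash16={}", request.getRequestURI(), respSize, respHash16);
        } else if (respSize > props.getHttpBodySummaryThreshold()) {
            MDC.put("log_mode", "summary"); // 16KB..256KB: digest only, via LogSanitizer
        } else {
            MDC.put("log_mode", "full");    // still subject to sampling + the side channel
        }
    }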

3.5 Audit trail (structured, immutable, replayable)

package com.example.audit;

import lombok.AllArgsConstructor;
import lombok.Data;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

/** Audit event model (example only; extend with regulator-required fields) */
@Data @AllArgsConstructor
class AuditEvent {
    private long ts;          // event time (epochMillis)
    private String userId;    // user identifier (ideally stored as a hash)
    private String action;    // action, e.g. TRANSFER_SUBMIT
    private String caseId;    // transaction/case id
    private String result;    // outcome, e.g. OK/REJECT
    private String detail;    // digested detail (sensitive fields already masked)
}

/** Audit writer: Kafka → object storage/compliance store, with independent retention and billing */
@Service
public class AuditWriter {
    private final KafkaTemplate<String, Object> kafka;
    public AuditWriter(KafkaTemplate<String, Object> kafka) { this.kafka = kafka; }

    public void write(AuditEvent evt) {
        // Immutable: append only, never modify; later landed in object storage with WORM/versioning
        kafka.send("audit-events", evt.getCaseId(), evt);
    }
}
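
A usage sketch for one transfer submission (all field values are made up; auditWriter is an injected AuditWriter):

AuditEvent evt = new AuditEvent(
        System.currentTimeMillis(),   // ts
        "uid-hash-9f8a",              // hashed user id
        "TRANSFER_SUBMIT",            // action
        "case-20240601-0001",         // caseId
        "OK",                         // result
        "amount=***, to=****1234"     // sanitized detail
);
auditWriter.write(evt);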

3.6 Example controller (digest + sampled side channel + failure case)

package com.example.web;

import lombok.extern.slf4j.Slf4j;
import org.slf4j.LoggerFactory;
import org.slf4j.Logger;
import org.springframework.web.bind.annotation.*;

import java.util.*;

@Slf4j
@RestController
@RequestMapping("/demo")
public class DemoController {

    private static final Logger PAYLOAD_LOG = LoggerFactory.getLogger("payload.logger");

    /** Normal endpoint: one-line INFO digest; occasionally samples the full payload into the side channel */
    @GetMapping("/query")
    public Map<String, Object> query(@RequestParam(defaultValue = "u001") String uid) {
        Map<String, Object> resp = new HashMap<>();
        resp.put("uid", uid);
        resp.put("balance", 123.45);
        resp.put("ts", System.currentTimeMillis());

        // Simulate a fairly large result set (digested normally; only a small sample goes to the side channel)
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 500; i++) records.add(i);
        resp.put("records", records);

        // 1% sampling into the side channel (payload.logger) to avoid flooding disks and the collection pipeline
        if (Math.random() < 0.01) {
            PAYLOAD_LOG.info("full_payload body={}", resp);
        }
        return resp; // AOP + Sanitizer record the digest
    }

    /** Failure example: ERROR channel, long retention */
    @GetMapping("/pay")
    public String pay(@RequestParam String orderId) {
        // High-sensitivity, high-frequency (transfer/payment): full payloads are forbidden; AOP applies strict digests
        if (orderId.startsWith("bad")) {
            throw new RuntimeException("payment failed"); // ERROR channel
        }
        return "OK";
    }
}

4. Load testing: let the data draw the red lines.

  • What to measure: compare Sanitizer on/off, per-line caps (4KB vs 8KB), and log lines per request (0.5/1/2/5), while watching TPS, TP99, CPU, disk IOPS, agent backlog, and end-to-end latency (Agent → Collector → ES/Kafka).
  • Baseline example: 4KB per line by default, ≤2 lines per request, end-to-end P99 ≤3s; write the conclusions into alert thresholds and the team handbook.
  • Pitfalls: do not judge by TPS alone; run each configuration for ≥15 minutes and discard the warm-up; test and production specs must match, or the scaling factor must be stated.

Conclusion

Manage logging as a cost center inside the system: digest instead of dumping full bodies, divert to side channels instead of blocking, degrade instead of collapsing.

When collection, configuration, conventions, and load testing all stand firm, logging stops being a burden on a high-concurrency system and becomes the backbone of stable operations and compliant audits.