In financial-grade mobile banking, logs are the system's black box: engineering teams use them to replay call chains, and audit and regulators use them to establish facts.
In the monolith era, writing to disk and archiving was enough; in the era of microservices and container clouds, instances multiply and log volume explodes. Under high concurrency, excessive logging saturates I/O, drags down CPU, and slows the main transaction path with it. A viable logging system must therefore achieve high performance, high availability, and low cost at once, while staying compliant.
We group the approach into four areas: collection, configuration, standards, and load testing.
1. Collection: the application writes only to stdout; the platform takes over everything else.
- LogAgent: collects standard output on each node and ships it to a Collector, which fans out to Kafka/ES/object storage; nothing is lost across container restarts, and all queries share one source of truth.
- Keep large objects off the hot path: transaction logs record a digest only (size, hash, key fields); full requests/responses and slow SQL take an async bypass so the main path never blocks (see the sketch after this list).
- The pipeline can degrade: multi-replica redundancy; under backpressure, drop DEBUG/INFO first and preserve WARN/ERROR.
- Keep cost measurable: track GB/day and monitor backlog and latency; inject traceId/spanId into every log line so requests stay traceable across services.
- Protocol choice: default to gRPC/HTTP with batched compression; only weigh the extra complexity of raw Netty/TCP at extreme throughput.
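A minimal sketch of the digest-only rule above, assuming SLF4J and a bypass logger named payload.logger (the name matches the channel configured in section 2; the class and field names are illustrative):
package com.example.logging;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
/** Digest-only logging: the hot path records size/hash/key fields; the full body goes to the async bypass. */
public final class TxnLogDigest {
    private static final Logger LOG = LoggerFactory.getLogger(TxnLogDigest.class);
    private static final Logger PAYLOAD_LOG = LoggerFactory.getLogger("payload.logger");
    public static void logTransaction(String txnId, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        // Hot path: one short line with size + hash + key fields only.
        LOG.info("txn_digest txnId={} size={} sha256_16={}", txnId, bytes.length, hash16(bytes));
        // Bypass: the full body on an isolated, rate-limited channel.
        PAYLOAD_LOG.info("txn_full txnId={} body={}", txnId, body);
    }
    private static String hash16(byte[] data) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 8; i++) sb.append(String.format("%02x", d[i]));
            return sb.toString();
        } catch (Exception e) {
            return "hash_err";
        }
    }
}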
2. Configuration: move the cost of writing logs off the main thread.
- Use AsyncAppender: the main thread enqueues, a background thread writes in batches;
Note on queue sizing: queueSize ≈ QPS × tolerable backlog seconds (an entry count); memory footprint ≈ queueSize × average entry size, which must fit in available JVM memory. For example, 10k/s × 2s = 20,000 entries, roughly 20 MB at 1 KB per entry.
- Configure conditional discarding that drops low-severity entries only.
- Emit JSON with epochMillis timestamps; format for humans on the display side.
- Standardize fields: traceId, spanId, app, pod, instance, thread, uid, ip, channel, uri, method, status, rt_ms; inject them via MDC so templates emit them automatically.
- Archive by time + size + level combined; retain ERROR long-term, INFO/DEBUG short-term;
- Rate-limit each container on the Agent side so logs cannot saturate node I/O.
Code examples
2.1 logback.xml (async, JSON, tiered archiving, dedicated payload/ERROR channels)
Location: src/main/resources/logback.xml
Notes:
- APP_FILE: the INFO/DEBUG main channel (short retention, JSON, size+time rolling).
- ERROR_FILE: a dedicated ERROR channel (long retention).
- PAYLOAD_FILE: the bypass channel for large payloads/slow SQL (short retention, isolated).
- All three are wrapped in AsyncAppender so business threads never block.
<configuration>
<!-- overridable via environment variables -->
<property name="APP" value="mobile-bank"/>
<property name="LOG_HOME" value="/data/logs/${APP}"/>
<!-- ========== Main channel (INFO/DEBUG): JSON + epochMillis + size/time rolling ========== -->
<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${LOG_HOME}/app.log</file>
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<includeMdc>true</includeMdc>
<timeZone>UTC</timeZone>
<timestampPattern>[UNIX_TIMESTAMP_AS_NUMBER]</timestampPattern> <!-- epoch millis as a number -->
<fieldNames>
<timestamp>ts</timestamp>
<level>lvl</level>
<thread>th</thread>
<logger>logger</logger>
<message>msg</message>
</fieldNames>
</encoder>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${LOG_HOME}/app.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
<maxFileSize>128MB</maxFileSize> <!-- per-file cap -->
<maxHistory>14</maxHistory> <!-- short retention: 14 days -->
<totalSizeCap>50GB</totalSizeCap> <!-- total size cap -->
</rollingPolicy>
<!-- deny ERROR (ERROR has its own channel) -->
<filter class="ch.qos.logback.classic.filter.LevelFilter">
<level>ERROR</level>
<onMatch>DENY</onMatch>
<onMismatch>NEUTRAL</onMismatch>
</filter>
</appender>
<!-- ========== Dedicated ERROR channel (long retention, for audit/accountability) ========== -->
<appender name="ERROR_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${LOG_HOME}/error.log</file>
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<includeMdc>true</includeMdc>
<timeZone>UTC</timeZone>
<timestampPattern>[UNIX_TIMESTAMP_AS_NUMBER]</timestampPattern>
</encoder>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${LOG_HOME}/error.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
<maxFileSize>64MB</maxFileSize>
<maxHistory>90</maxHistory> <!-- long retention: 90 days (adjust as needed) -->
<totalSizeCap>100GB</totalSizeCap>
</rollingPolicy>
<!-- accept ERROR only -->
<filter class="ch.qos.logback.classic.filter.LevelFilter">
<level>ERROR</level>
<onMatch>ACCEPT</onMatch>
<onMismatch>DENY</onMismatch>
</filter>
</appender>
<!-- ========== Bypass channel for large payloads/slow SQL (short retention, isolated) ========== -->
<appender name="PAYLOAD_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${LOG_HOME}/payload.log</file>
<encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${LOG_HOME}/payload.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
<maxFileSize>64MB</maxFileSize>
<maxHistory>3</maxHistory> <!-- very short retention -->
</rollingPolicy>
</appender>
<!-- ========== Async wrappers: the main thread enqueues, a background thread batch-writes ========== -->
<!-- queue sizing: queueSize ≈ QPS × tolerable backlog seconds; memory ≈ queueSize × avg entry size -->
<appender name="ASYNC_APP" class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="APP_FILE"/>
<queueSize>20000</queueSize> <!-- e.g. 10k/s × 2s = 20000 entries (~20MB at 1KB each) -->
<discardingThreshold>30</discardingThreshold> <!-- when the queue nears full, drop only events ≤ INFO -->
<neverBlock>true</neverBlock> <!-- never block the caller -->
<includeCallerData>false</includeCallerData>
</appender>
<appender name="ASYNC_ERROR" class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="ERROR_FILE"/>
<queueSize>5000</queueSize> <!-- error volume is usually small -->
<discardingThreshold>0</discardingThreshold><!-- never drop ERROR -->
<neverBlock>true</neverBlock>
</appender>
<appender name="ASYNC_PAYLOAD" class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="PAYLOAD_FILE"/>
<queueSize>5000</queueSize>
<discardingThreshold>10</discardingThreshold> <!-- at high watermark, drop low-level payload entries first -->
<neverBlock>true</neverBlock>
</appender>
<!-- root logger: INFO by default, writes to the main + ERROR channels -->
<root level="INFO">
<appender-ref ref="ASYNC_APP"/>
<appender-ref ref="ASYNC_ERROR"/>
</root>
<!-- dedicated logger: full payloads, slow SQL, etc. -->
<logger name="payload.logger" level="INFO" additivity="false">
<appender-ref ref="ASYNC_PAYLOAD"/>
</logger>
</configuration>
Maven dependency (for the JSON encoder):
<dependency>
<groupId>net.logstash.logback</groupId>
<artifactId>logstash-logback-encoder</artifactId>
<version>7.4</version>
</dependency>
2.2 MDC filter (injects the standardized fields)
Location: src/main/java/com/example/logging/MdcFilter.java
Purpose: when each request arrives, put the standardized fields into MDC so the log template carries them automatically; MDC must be cleared when the request ends. (This example uses javax.servlet, i.e. Spring Boot 2.x; on Spring Boot 3, switch the imports to jakarta.servlet.)
package com.example.logging;
import org.slf4j.MDC;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import java.io.IOException;
import java.net.InetAddress;
public class MdcFilter implements Filter {
@Override
public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
throws IOException, ServletException {
HttpServletRequest request = (HttpServletRequest) req;
try {
// 1) traceId / spanId: from the gateway/APM (generate locally as a fallback)
String traceId = request.getHeader("X-Trace-Id");
if (traceId == null || traceId.isEmpty()) {
traceId = java.util.UUID.randomUUID().toString();
}
MDC.put("traceId", traceId);
String spanId = request.getHeader("X-Span-Id");
if (spanId != null) MDC.put("spanId", spanId);
// 2) shared base dimensions
MDC.put("app", System.getenv().getOrDefault("APP_NAME", "mobile-bank"));
MDC.put("pod", System.getenv().getOrDefault("HOSTNAME", "N/A"));
MDC.put("instance", InetAddress.getLocalHost().getHostName());
MDC.put("thread", Thread.currentThread().getName());
// 3) request dimensions
MDC.put("uri", request.getRequestURI());
MDC.put("method", request.getMethod());
MDC.put("ip", request.getRemoteAddr());
MDC.put("channel", request.getHeader("X-Channel") == null ? "unknown" : request.getHeader("X-Channel"));
// 4) business dimensions (example: user ID)
String uid = request.getHeader("X-User-Id");
if (uid != null) MDC.put("uid", uid);
chain.doFilter(req, res);
} finally {
// prevent thread reuse from leaking a "dirty" MDC
MDC.clear();
}
}
}
Register the filter (Spring Boot):
package com.example.logging;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class FilterConfig {
@Bean
public FilterRegistrationBean<MdcFilter> mdcFilterRegistration() {
FilterRegistrationBean<MdcFilter> reg = new FilterRegistrationBean<>();
reg.setFilter(new MdcFilter());
reg.addUrlPatterns("/*");
reg.setOrder(1); // run as early as possible so all later logging carries MDC
return reg;
}
}
2.3 Agent-side rate limiting (Fluent Bit → Kafka example)
Per-container rate limiting is usually achieved by combining match rules, backpressure/throttling plugins, and backend quotas. The following is a simplified example.
# /etc/fluent-bit/fluent-bit.conf (or a K8s ConfigMap)
[SERVICE]
Flush 1
Daemon Off
Log_Level info
[INPUT]
Name tail
Path /var/log/containers/*.log
Parser docker
Tag kube.*
Mem_Buf_Limit 256MB # keep a single agent from exhausting node memory
Skip_Long_Lines On
[FILTER]
Name kubernetes
Match kube.*
Merge_Log On
Keep_Log Off
# (optional) throttle/sampling filter; enable it if your build ships the plugin
#[FILTER]
# Name throttle
# Match kube.*
# Rate 8000 # records/sec
# Window 1
[OUTPUT]
Name kafka
Match kube.*
Brokers kafka-0:9092,kafka-1:9092
Topics app-logs
rdkafka.batch.num.messages 10000
rdkafka.compression.codec gzip
3. Standards: developers write only business logic; the framework guarantees compliance and size limits.
- AOP automatically captures inputs, outputs, latency, and exceptions;
- A Sanitizer uniformly applies digesting, masking, truncation, and a hard cap (4-8 KB per entry).
- Exception stacks and large payloads get a dedicated Logger with its own rate limit; high-sensitivity, high-frequency endpoints such as transfers and payments use digest mode, while low-sensitivity, medium-frequency endpoints such as queries and statistics write full payloads through the bypass at a 1-5% sample rate.
- Level policy: normal requests log one INFO digest line; timeouts/errors log WARN/ERROR plus a digest; full payloads must go through the rate-limited bypass.
- Every policy sits behind a config-center switch with canary rollout, so emergency adjustments need no release.
- SQL/HTTP size limits: >16KB logs a digest only; >256KB logs metadata only (URI, status, rt, size, hash).
- Audit trails stay separate from diagnostic logs: immutable, replayable, structured storage with its own billing and retention, satisfying tamper-resistance and traceability requirements.
- Always use parameterized logging; string concatenation is forbidden.
Code examples
3.1 Policy parameters (YAML + hot reload)
# application.yml
log:
feature:
    enabled: true                     # master switch (in an emergency, disable the enhanced logic and keep only ERROR)
    samplePercent: 0.02               # query/statistics endpoints: sample full payloads at 1-5%
    hardCapBytes: 8192                # hard cap per entry (4-8KB)
    maxStringChars: 512               # max characters per field
    maxCollectionElements: 10         # sample count for collections/arrays
    httpBodySummaryThreshold: 16384   # >16KB: digest only
    httpBodyMetaOnlyThreshold: 262144 # >256KB: metadata only
    timeoutMsWarn: 1000               # timeout threshold (ms); log WARN above it
    highRiskApis:                     # high-sensitivity, high-frequency: strict digests (transfer/payment)
- "/transfer/**"
- "/payment/**"
    lowRiskApis:                      # low-sensitivity, medium-frequency: sampled full payloads allowed (query/report)
- "/query/**"
- "/report/**"
Binding with hot reload (example):
package com.example.logging.cfg;
import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.cloud.context.config.annotation.RefreshScope;
import org.springframework.stereotype.Component;
import java.util.List;
@Data
@RefreshScope
@Component
@ConfigurationProperties(prefix = "log.feature")
public class LogFeatureProps {
private boolean enabled = true;
private double samplePercent = 0.02;
private int hardCapBytes = 8192;
private int maxStringChars = 512;
private int maxCollectionElements = 10;
private int httpBodySummaryThreshold = 16 * 1024;
private int httpBodyMetaOnlyThreshold = 256 * 1024;
private int timeoutMsWarn = 1000;
private List<String> highRiskApis;
private List<String> lowRiskApis;
}
3.2 AOP: automatically capture inputs/outputs/latency/exceptions (paired with the Sanitizer)
package com.example.logging;
import com.example.logging.cfg.LogFeatureProps;
import com.example.logging.sanitize.LogSanitizer;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.util.Map;
@Slf4j
@Aspect
@Component
@RequiredArgsConstructor
public class ApiLogAspect {
private static final Logger PAYLOAD_LOG = LoggerFactory.getLogger("payload.logger");
private final LogFeatureProps props;
@Around("within(@org.springframework.web.bind.annotation.RestController *)")
public Object around(ProceedingJoinPoint pjp) throws Throwable {
long start = System.currentTimeMillis();
// 1) input digest (sample collections, truncate strings, mask sensitive fields)
Map<String, Object> inBrief = LogSanitizer.briefArgs(pjp.getArgs(),
props.getMaxCollectionElements(), props.getMaxStringChars());
try {
Object ret = pjp.proceed();
long rt = System.currentTimeMillis() - start;
// 2) output digest
Map<String, Object> outBrief = LogSanitizer.briefResult(ret,
props.getMaxCollectionElements(), props.getMaxStringChars());
// 3) timeout check: log WARN above the threshold
if (rt > props.getTimeoutMsWarn()) {
log.warn("api_slow in={} out={} rt_ms={}", inBrief, outBrief, rt);
} else {
log.info("api_call in={} out={} rt_ms={}", inBrief, outBrief, rt);
}
// 4) low-sensitivity, medium-frequency endpoints: sample the full payload into the bypass channel (isolated + short retention)
if (isLowRiskApi() && Math.random() < props.getSamplePercent()) {
PAYLOAD_LOG.info("full_payload body={}", LogSanitizer.safeToString(ret, props));
}
return ret;
} catch (Throwable t) {
long rt = System.currentTimeMillis() - start;
// 5) exception digest (the stack goes to the default ERROR channel, or to the payload channel if needed)
log.error("api_fail in={} rt_ms={} ex={}", inBrief, rt, t.toString(), t);
throw t;
}
}
private boolean isLowRiskApi() {
// Simplified stub; a real project should match the request URI against the
// configured patterns (see the ApiRiskMatcher sketch after this class).
return true;
}
}
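A hedged sketch of the path check the stub above refers to: it reads the current request from Spring's RequestContextHolder and matches its URI against lowRiskApis with AntPathMatcher (both standard Spring APIs); the class name and wiring are illustrative.
package com.example.logging;
import com.example.logging.cfg.LogFeatureProps;
import org.springframework.util.AntPathMatcher;
import org.springframework.web.context.request.RequestContextHolder;
import org.springframework.web.context.request.ServletRequestAttributes;
/** Sketch: classify the current request against the configured API patterns. */
public class ApiRiskMatcher {
    private final AntPathMatcher matcher = new AntPathMatcher();
    private final LogFeatureProps props;
    public ApiRiskMatcher(LogFeatureProps props) { this.props = props; }
    public boolean isLowRiskApi() {
        ServletRequestAttributes attrs =
                (ServletRequestAttributes) RequestContextHolder.getRequestAttributes();
        if (attrs == null) return false; // not inside a web request
        String uri = attrs.getRequest().getRequestURI();
        return props.getLowRiskApis() != null && props.getLowRiskApis().stream()
                .anyMatch(p -> matcher.match(p, uri));
    }
}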
3.3 Sanitizer: digest/mask/truncate/hard cap (4-8KB per entry)
package com.example.logging.sanitize;
import com.example.logging.cfg.LogFeatureProps;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;
/**
 * Safely converts any object into a short, loggable structure:
 * - String: truncated when too long, annotated with the original length
 * - Collection/Array: emits only size + the first N samples
 * - Map: emits only the first N entries (iteration order)
 * - byte[]: never logs content, only len + the first 16 hex chars of its hash
 * - applies uniform masking of sensitive fields
 * - if the final string still exceeds the hard cap (4-8KB), truncates again
 */
public class LogSanitizer {
public static Map<String, Object> briefArgs(Object[] args, int maxElements, int maxChars) {
Map<String, Object> result = new LinkedHashMap<>();
if (args == null) return result;
for (int i = 0; i < Math.min(args.length, maxElements); i++) {
result.put("arg" + i, sanitize(args[i], maxElements, maxChars));
}
if (args.length > maxElements) result.put("_rest", args.length - maxElements);
return result;
}
public static Map<String, Object> briefResult(Object ret, int maxElements, int maxChars) {
    // Use LinkedHashMap rather than Map.of: the sanitized value may be null (e.g. void handlers),
    // and Map.of throws NullPointerException on null values.
    Map<String, Object> result = new LinkedHashMap<>();
    result.put("result", sanitize(ret, maxElements, maxChars));
    return result;
}
/** "Safe toString" for the bypass channel; applies the hard cap */
public static String safeToString(Object obj, LogFeatureProps props) {
String s = String.valueOf(sanitize(obj, props.getMaxCollectionElements(), props.getMaxStringChars()));
if (s.getBytes(StandardCharsets.UTF_8).length > props.getHardCapBytes()) {
return s.substring(0, Math.min(s.length(), props.getMaxStringChars())) + "...[HARD-CAP," + props.getHardCapBytes() + "B]";
}
return s;
}
@SuppressWarnings("unchecked")
public static Object sanitize(Object obj, int maxElements, int maxChars) {
if (obj == null) return null;
if (obj instanceof Number || obj instanceof Boolean) return obj;
if (obj instanceof CharSequence) {
String s = maskSensitive(obj.toString());
return s.length() > maxChars
? s.substring(0, maxChars) + "...[TRUNCATED,len=" + s.length() + "]"
: s;
}
if (obj instanceof byte[]) {
byte[] arr = (byte[]) obj;
return Map.of(
"type", "bytes",
"len", arr.length,
"sha256_16", sha256Hex(arr).substring(0, 16)
);
}
if (obj.getClass().isArray()) {
int len = java.lang.reflect.Array.getLength(obj);
List<Object> sample = new ArrayList<>();
for (int i = 0; i < Math.min(len, maxElements); i++) {
sample.add(sanitize(java.lang.reflect.Array.get(obj, i), maxElements, maxChars));
}
return Map.of("type", "array", "size", len, "sample", sample);
}
if (obj instanceof Collection<?> col) {
List<Object> sample = new ArrayList<>();
int cnt = 0;
for (Object o : col) {
if (cnt++ >= maxElements) break;
sample.add(sanitize(o, maxElements, maxChars));
}
return Map.of("type", "collection", "size", col.size(), "sample", sample);
}
if (obj instanceof Map<?, ?> m) {
Map<Object, Object> sample = new LinkedHashMap<>();
int cnt = 0;
for (Map.Entry<?, ?> e : m.entrySet()) {
if (cnt++ >= maxElements) break;
sample.put(e.getKey(), sanitize(e.getValue(), maxElements, maxChars));
}
Map<String, Object> out = new LinkedHashMap<>();
out.put("type", "map");
out.put("size", m.size());
out.put("sample", sample);
out.put("rest", Math.max(0, m.size() - maxElements));
return out;
}
String s = maskSensitive(String.valueOf(obj));
return s.length() > maxChars ? s.substring(0, maxChars) + "...[TRUNCATED,len=" + s.length() + "]" : s;
}
/** Simple masking examples (phone/card numbers); extend per your data policies */
public static String maskSensitive(String s) {
if (s == null) return null;
s = s.replaceAll("(\\b1\\d{2})\\d{4}(\\d{4}\\b)", "$1****$2"); // 手机号 11位:前3后4
s = s.replaceAll("(\\b\\d{6})\\d+(\\d{4}\\b)", "$1******$2"); // 卡号:前6后4
return s;
}
private static String sha256Hex(byte[] data) {
try {
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] d = md.digest(data);
return bytesToHex(d);
} catch (Exception e) {
return "sha256_error";
}
}
private static String bytesToHex(byte[] bytes) {
    // Arrays.stream(...) has no byte[] overload, so iterate manually.
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) sb.append(String.format("%02x", b));
    return sb.toString();
}
}
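A quick usage sketch (hypothetical values; Map.of ordering is unspecified, so the printed order may vary):
package com.example.logging.sanitize;
import java.util.Map;
/** Hypothetical quick check of LogSanitizer output. */
public class LogSanitizerDemo {
    public static void main(String[] args) {
        Map<String, Object> brief = LogSanitizer.briefArgs(
                new Object[]{ "13812345678", new int[1000] }, 10, 512);
        // Prints roughly:
        // {arg0=138****5678, arg1={type=array, size=1000, sample=[0, 0, ...]}}
        System.out.println(brief);
    }
}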
3.4 HTTP/SQL size limits (>16KB digest, >256KB metadata only)
A simplified filter: it measures request/response body sizes and records them in MDC; the AOP layer then decides between digest-only and metadata-only.
package com.example.logging.http;
import com.example.logging.cfg.LogFeatureProps;
import org.slf4j.MDC;
import org.springframework.util.StreamUtils;
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
public class HttpSizeFilter implements Filter {
private final LogFeatureProps props;
public HttpSizeFilter(LogFeatureProps props) { this.props = props; }
@Override
public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
throws IOException, ServletException {
CachingRequestWrapper request = new CachingRequestWrapper((HttpServletRequest) req);
CachingResponseWrapper response = new CachingResponseWrapper((HttpServletResponse) res);
byte[] in = request.getBody();
// Request size/hash go into MDC before the handler runs,
// so the AOP layer can read them when deciding digest vs metadata.
MDC.put("req_size", String.valueOf(in == null ? 0 : in.length));
MDC.put("req_hash16", hash16(in));
try {
    chain.doFilter(request, response);
} finally {
    byte[] out = response.getBody();
    MDC.put("resp_size", String.valueOf(out == null ? 0 : out.length));
    MDC.put("resp_hash16", hash16(out));
    response.copyBodyToResponse(); // write the buffered body back to the client
}
}
private static class CachingRequestWrapper extends HttpServletRequestWrapper {
private final byte[] body;
CachingRequestWrapper(HttpServletRequest request) throws IOException {
super(request);
this.body = StreamUtils.copyToByteArray(request.getInputStream());
}
public byte[] getBody() { return body; }
@Override public ServletInputStream getInputStream() {
ByteArrayInputStream bais = new ByteArrayInputStream(body);
return new ServletInputStream() {
@Override public boolean isFinished() { return bais.available() == 0; }
@Override public boolean isReady() { return true; }
@Override public void setReadListener(ReadListener readListener) {}
@Override public int read() { return bais.read(); }
};
}
}
private static class CachingResponseWrapper extends HttpServletResponseWrapper {
private final ByteArrayOutputStream bos = new ByteArrayOutputStream();
private ServletOutputStream out;
private PrintWriter writer;
CachingResponseWrapper(HttpServletResponse response) throws IOException {
super(response);
}
@Override public ServletOutputStream getOutputStream() {
if (out == null) {
out = new ServletOutputStream() {
@Override public boolean isReady() { return true; }
@Override public void setWriteListener(WriteListener writeListener) {}
@Override public void write(int b) { bos.write(b); }
};
}
return out;
}
@Override public PrintWriter getWriter() {
if (writer == null) writer = new PrintWriter(new OutputStreamWriter(bos, StandardCharsets.UTF_8));
return writer;
}
public byte[] getBody() { return bos.toByteArray(); }
public void copyBodyToResponse() throws IOException {
if (writer != null) writer.flush();
byte[] bytes = bos.toByteArray();
ServletOutputStream os = super.getOutputStream();
os.write(bytes);
os.flush();
}
}
private static String hash16(byte[] data) {
if (data == null || data.length == 0) return "-";
try {
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] digest = md.digest(data);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 8; i++) sb.append(String.format("%02x", digest[i]));
return sb.toString();
} catch (Exception e) { return "hash_err"; }
}
}
Decision logic (used inside the AOP; a sketch follows below):
- resp_size > 256KB → record only URI/status/rt/size/hash (metadata).
- > 16KB → digest only (handled by LogSanitizer).
- Otherwise a normal digest; full payloads must always go through the sampled, rate-limited bypass.
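A minimal sketch of that decision, reading the sizes HttpSizeFilter stored in MDC; the class name, return values, and log fields are illustrative.
package com.example.logging;
import com.example.logging.cfg.LogFeatureProps;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
/** Sketch: choose the logging mode from the response size recorded in MDC. */
public class BodySizePolicy {
    private static final Logger log = LoggerFactory.getLogger(BodySizePolicy.class);
    /** Returns the chosen mode so the AOP layer can act on it. */
    public static String decide(LogFeatureProps props) {
        long respSize = parse(MDC.get("resp_size"));
        if (respSize > props.getHttpBodyMetaOnlyThreshold()) {
            // metadata only: uri/status/rt/size/hash are already in MDC
            log.info("http_meta_only resp_size={} resp_hash16={}", respSize, MDC.get("resp_hash16"));
            return "META_ONLY";
        }
        if (respSize > props.getHttpBodySummaryThreshold()) {
            return "DIGEST_ONLY"; // hand the body to LogSanitizer for a digest
        }
        return "NORMAL"; // normal digest; full payload only via the sampled bypass
    }
    private static long parse(String s) {
        try { return s == null ? 0 : Long.parseLong(s); }
        catch (NumberFormatException e) { return 0; }
    }
}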
3.5 Audit trail (structured, immutable, replayable)
package com.example.audit;
import lombok.AllArgsConstructor;
import lombok.Data;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
/** Audit event model (example only; add fields as regulators require) */
@Data @AllArgsConstructor
class AuditEvent {
private long ts; // event time (epochMillis)
private String userId; // user identifier (store a hash where possible)
private String action; // action, e.g. TRANSFER_SUBMIT
private String caseId; // transaction/case number
private String result; // outcome, e.g. OK/REJECT
private String detail; // digested detail (sensitive fields already masked)
}
/** Audit writes: Kafka → object storage/compliance store, with independent retention and billing */
@Service
public class AuditWriter {
private final KafkaTemplate<String, Object> kafka;
public AuditWriter(KafkaTemplate<String, Object> kafka) { this.kafka = kafka; }
public void write(AuditEvent evt) {
// immutable: append-only, never updated; downstream lands in object storage with WORM/versioning
kafka.send("audit-events", evt.getCaseId(), evt);
}
}
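A hedged usage sketch (class, method, and values are all illustrative): emit an audit event once a transfer is accepted.
package com.example.audit;
import org.springframework.stereotype.Service;
/** Illustrative call site for AuditWriter. */
@Service
public class TransferAuditHook {
    private final AuditWriter auditWriter;
    public TransferAuditHook(AuditWriter auditWriter) { this.auditWriter = auditWriter; }
    public void onTransferAccepted(String hashedUid, String caseId) {
        auditWriter.write(new AuditEvent(
                System.currentTimeMillis(),
                hashedUid, // store the hash, not the raw user id
                "TRANSFER_SUBMIT",
                caseId,
                "OK",
                "to=****5678 amount=100.00")); // digested, masked detail
    }
}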
3.6 Example Controller (digest + sampled bypass + failure path)
package com.example.web;
import lombok.extern.slf4j.Slf4j;
import org.slf4j.LoggerFactory;
import org.slf4j.Logger;
import org.springframework.web.bind.annotation.*;
import java.util.*;
@Slf4j
@RestController
@RequestMapping("/demo")
public class DemoController {
private static final Logger PAYLOAD_LOG = LoggerFactory.getLogger("payload.logger");
/** Normal endpoint: one INFO digest line; occasionally the full payload is sampled into the bypass channel */
@GetMapping("/query")
public Map<String, Object> query(@RequestParam(defaultValue = "u001") String uid) {
Map<String, Object> resp = new HashMap<>();
resp.put("uid", uid);
resp.put("balance", 123.45);
resp.put("ts", System.currentTimeMillis());
// simulate a fairly large result set (normal digest; only a small sample takes the bypass)
List<Integer> records = new ArrayList<>();
for (int i = 0; i < 500; i++) records.add(i);
resp.put("records", records);
// 1% sample to the bypass (payload.logger), so disks and the collection pipeline are not flooded
if (Math.random() < 0.01) {
PAYLOAD_LOG.info("full_payload body={}", resp);
}
return resp; // AOP + Sanitizer record the digest
}
/** Failure example: ERROR channel, long retention */
@GetMapping("/pay")
public String pay(@RequestParam String orderId) {
// high-sensitivity, high-frequency (transfer/payment): full payloads forbidden; AOP enforces strict digests
if (orderId.startsWith("bad")) {
throw new RuntimeException("payment failed"); // ERROR 通道
}
return "OK";
}
}
4. Load testing: let the data draw the red lines.
- What to compare: Sanitizer on/off, per-entry caps (4KB vs 8KB), and log lines per request (0.5/1/2/5); watch TPS, TP99, CPU, disk IOPS, Agent backlog, and end-to-end latency (Agent → Collector → ES/Kafka) at the same time.
- Example baseline: 4KB per entry by default, ≤2 lines per request, end-to-end P99 ≤3s; write the conclusions into alert thresholds and the team handbook.
- Pitfalls to avoid: don't watch TPS alone; run each configuration for at least 15 minutes and discard warm-up; align test and production specs, or document the scaling factor.
Conclusion
Manage logging as a cost center inside the system: digest instead of dumping, bypass instead of blocking, degrade instead of collapsing.
Once collection, configuration, standards, and load testing all stand firm, logging stops being a burden on a high-concurrency system and becomes the backbone of stable operations and compliance audits.