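How can metrics for remote calls, such as the latency and exception details of Feign/RestTemplate calls, be collected for alerting and dashboards? Here we use Prometheus + Grafana. This article focuses on how the metrics are collected by Prometheus.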
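1. Collecting metrics with Prometheus
1.1 A brief introduction to Prometheus
Prometheus is an open-source monitoring system made up of the following core components:
- Scraper: periodically pulls metrics data over HTTP at the configured interval;
- Time-series database: stores all the metrics data;
- A simple user interface: visualizes, queries, and monitors all the metrics.
Spring Boot uses Micrometer, an application metrics library, to export actuator metrics to external monitoring systems.
To integrate with Prometheus, add the following dependency: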
<!-- Micrometer Prometheus registry -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
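Once this dependency is on the classpath, Spring Boot auto-configures a PrometheusMeterRegistry and a CollectorRegistry that collect the metrics and render them in a format the Prometheus server can scrape.
All application metrics are published through an actuator endpoint named /prometheus (served at /actuator/prometheus by default in Spring Boot 2.x, where it must also be listed in management.endpoints.web.exposure.include to be reachable over HTTP). The Prometheus server scrapes this endpoint periodically to fetch the metrics data.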
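To make the scraped payload more concrete, here is a minimal JDK-only sketch that renders one sample in the Prometheus text exposition format. The class name `ExpositionSketch`, the label values, and the sample value are hypothetical; in a real application this text is produced by PrometheusMeterRegistry, not by code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only: Micrometer generates this text for you.
class ExpositionSketch {

    // Renders one sample line in the form: name{label="value",...} value
    static String sampleLine(String name, Map<String, String> labels, double value) {
        StringJoiner joined = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : labels.entrySet()) {
            joined.add(e.getKey() + "=\"" + escape(e.getValue()) + "\"");
        }
        return name + (labels.isEmpty() ? "" : joined.toString()) + " " + value;
    }

    // Label values escape backslash, double quote, and newline.
    static String escape(String v) {
        return v.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("uri", "/orders");          // made-up label values
        labels.put("status_code", "200");
        System.out.println("# TYPE metric_openfeign_response_total counter");
        System.out.println(sampleLine("metric_openfeign_response_total", labels, 3.0));
    }
}
```

Each scrape returns many such lines, one per combination of metric name and label values, which is why the number of distinct label values matters later in this article.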
1.2 Code implementation
Add the dependencies:
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
Define a base class that holds the PrometheusMeterRegistry instance and configures it:
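import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.apache.commons.lang3.StringUtils; // assumed: Apache Commons Lang 3
import org.springframework.boot.actuate.autoconfigure.metrics.OnlyOnceLoggingDenyMeterFilter;

public class AbstractMetricsInstance {
    protected PrometheusMeterRegistry registry;

    public AbstractMetricsInstance(PrometheusMeterRegistry registry) {
        this.registry = registry;
    }

    // Caps the number of distinct values of tagName on meters whose name starts with meterName.
    // Once the cap is reached, meters with new values are denied and a warning is logged once.
    protected void maximumAllowableTags(String meterName, String tagName, int maxCount) {
        if (registry == null || StringUtils.isEmpty(meterName) || StringUtils.isEmpty(tagName) || maxCount < 0) {
            return;
        }
        MeterFilter denyFilter = new OnlyOnceLoggingDenyMeterFilter(() ->
                String.format("Metrics reached the maximum number of '%s' tags for '%s'.", tagName, meterName));
        registry.config().meterFilter(MeterFilter.maximumAllowableTags(meterName, tagName, maxCount, denyFilter));
    }

    // Hook for mapping a raw URI to a pattern; the default returns the URI unchanged.
    protected String getMatchPatternByUri(String uri) {
        return uri;
    }
}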
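The tag cap configured by maximumAllowableTags can be pictured with the following JDK-only sketch. It is a simplified model under stated assumptions, not Micrometer's implementation: the class `TagLimiter` and its `accept` method are hypothetical names, and the real filter additionally matches meters by name prefix and delegates the "cap reached" decision to the deny filter shown above.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of a tag-value cap; not Micrometer's implementation.
class TagLimiter {
    private final int maxCount;
    private final Set<String> seen = new HashSet<>();

    TagLimiter(int maxCount) {
        this.maxCount = maxCount;
    }

    // Returns true if a meter carrying this tag value may be registered.
    synchronized boolean accept(String tagValue) {
        if (seen.contains(tagValue)) {
            return true;   // values that were already observed always pass
        }
        if (seen.size() >= maxCount) {
            return false;  // cap reached: reject values not seen before
        }
        seen.add(tagValue);
        return true;
    }

    public static void main(String[] args) {
        TagLimiter limiter = new TagLimiter(2);
        System.out.println(limiter.accept("/a")); // first value
        System.out.println(limiter.accept("/b")); // second value, cap now reached
        System.out.println(limiter.accept("/c")); // new value beyond the cap
        System.out.println(limiter.accept("/a")); // known value
    }
}
```

The point of such a cap is to keep an unbounded tag such as the URI from creating an unbounded number of time series in Prometheus.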
The subclass implementation:
// Strings is assumed to be Guava's and StringUtils Apache Commons Lang 3's.
import com.google.common.base.Strings;
import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.Tags;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.distribution.DistributionStatisticConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.apache.commons.lang3.StringUtils;

import java.time.Duration;
import java.util.concurrent.TimeUnit;

public class AbstractHttpMetricsInstance extends AbstractMetricsInstance {
    private static final String TAG_RA = "remote";
    private static final String TAG_SN = "svc_name";
    private static final String TAG_UP = "uri";
    private static final String TAG_SC = "status_code";
    private static final String TAG_EX = "exception";
    private static final String NONE_VAL = "-";

    private final String RSP_SS;
    private final String REQ_DS;
    private final String RSP_EX;
    private final int maxAllowUriTags;

    public AbstractHttpMetricsInstance(String RSP_SS, String REQ_DS, String RSP_EX,
                                       PrometheusMeterRegistry registry,
                                       int maxAllowUriTags) {
        super(registry);
        this.RSP_SS = RSP_SS;
        this.REQ_DS = REQ_DS;
        this.RSP_EX = RSP_EX;
        this.maxAllowUriTags = maxAllowUriTags;
        initMaxAllowUriTags();
        initPercentilesHistogram();
    }

    // Publishes histogram buckets for the request-duration timer.
    private void initPercentilesHistogram() {
        if (registry == null) {
            return;
        }
        registry.config().meterFilter(new MeterFilter() {
            @Override
            public DistributionStatisticConfig configure(final Meter.Id id, final DistributionStatisticConfig config) {
                if (Meter.Type.TIMER.equals(id.getType()) && id.getName().startsWith(REQ_DS)) {
                    return DistributionStatisticConfig.builder().percentilesHistogram(true)
                            .sla(Duration.ofMillis(25).toNanos(), Duration.ofMillis(50).toNanos(),
                                    Duration.ofMillis(75).toNanos(), Duration.ofMillis(100).toNanos(),
                                    Duration.ofMillis(200).toNanos(), Duration.ofMillis(500).toNanos(),
                                    Duration.ofMillis(750).toNanos(), Duration.ofSeconds(1).toNanos(),
                                    Duration.ofSeconds(2).toNanos())
                            // min == max (5 s): this appears intended to limit the published buckets
                            // to the explicit boundaries above plus a 5 s bucket.
                            .minimumExpectedValue(Duration.ofSeconds(5).toNanos())
                            .maximumExpectedValue(Duration.ofSeconds(5).toNanos())
                            .build().merge(config);
                }
                return config;
            }
        });
    }

    // Applies the uri tag cap to all three meters.
    private void initMaxAllowUriTags() {
        maximumAllowableTags(RSP_SS, TAG_UP, maxAllowUriTags);
        maximumAllowableTags(REQ_DS, TAG_UP, maxAllowUriTags);
        maximumAllowableTags(RSP_EX, TAG_UP, maxAllowUriTags);
    }

    /**
     * Records the metrics metric_openfeign_request_duration_seconds_bucket,
     * metric_openfeign_request_duration_seconds_count,
     * metric_openfeign_request_duration_seconds_max and
     * metric_openfeign_request_duration_seconds_sum.
     *
     * @param remoteAddr  IP address of the remote service
     * @param serviceName Eureka name of the API
     * @param path        request URI
     * @param durations   request duration in milliseconds
     */
    public void requestDurationSeconds(final String remoteAddr, final String serviceName, final String path,
                                       final Long durations) {
        if (this.registry == null
                || Strings.isNullOrEmpty(remoteAddr)
                || Strings.isNullOrEmpty(path)) {
            return;
        }
        Tags tags = Tags.of(TAG_RA, remoteAddr)
                .and(TAG_SN, StringUtils.isNotBlank(serviceName) ? serviceName : NONE_VAL)
                .and(TAG_UP, getMatchPatternByUri(path));
        Timer.builder(REQ_DS)
                .tags(tags)
                .register(registry)
                .record(durations, TimeUnit.MILLISECONDS);
    }

    /**
     * Records the metric metric_openfeign_response_total.
     *
     * @param remoteAddr  IP address of the remote service
     * @param serviceName Eureka name of the API
     * @param path        request URI
     * @param statusCode  response status code
     */
    public void responseStatusCodeCount(final String remoteAddr, final String serviceName, final String path,
                                        final int statusCode) {
        if (this.registry == null
                || Strings.isNullOrEmpty(remoteAddr)
                || Strings.isNullOrEmpty(path)) {
            return;
        }
        Tags tags = Tags.of(TAG_RA, remoteAddr)
                .and(TAG_SN, StringUtils.isNotBlank(serviceName) ? serviceName : NONE_VAL)
                .and(TAG_UP, getMatchPatternByUri(path))
                .and(TAG_SC, String.valueOf(statusCode));
        registry.counter(RSP_SS, tags).increment();
    }

    /**
     * Records the metric metric_openfeign_client_exception_total.
     *
     * @param remoteAddr  IP address of the remote service
     * @param serviceName Eureka name of the API
     * @param path        request URI
     * @param exception   the exception that was thrown
     */
    public void requestExceptionCount(final String remoteAddr, final String serviceName, final String path,
                                      final Throwable exception) {
        if (this.registry == null
                || Strings.isNullOrEmpty(path)
                || exception == null) {
            return;
        }
        // remoteAddr is not validated above, so fall back to NONE_VAL to avoid a null tag value.
        Tags tags = Tags.of(TAG_RA, Strings.isNullOrEmpty(remoteAddr) ? NONE_VAL : remoteAddr)
                .and(TAG_SN, StringUtils.isNotBlank(serviceName) ? serviceName : NONE_VAL)
                .and(TAG_UP, getMatchPatternByUri(path))
                .and(exceptionTag(exception));
        registry.counter(RSP_EX, tags).increment();
    }

    // Uses the simple class name, or the full name for anonymous classes.
    private Tags exceptionTag(final Throwable exception) {
        String simpleName = exception.getClass().getSimpleName();
        return Tags.of(TAG_EX, simpleName.isEmpty() ? exception.getClass().getName() : simpleName);
    }
}
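The `_bucket` series produced by the timer are cumulative: each bucket counts every observation less than or equal to its upper bound. The JDK-only sketch below illustrates this with the same millisecond boundaries as the sla(...) call above. The class `BucketSketch` is hypothetical and the bookkeeping is deliberately naive; Micrometer maintains the real histogram.

```java
import java.util.Arrays;

// Hypothetical sketch of cumulative ("le") histogram buckets.
class BucketSketch {
    // Bucket upper bounds in milliseconds, mirroring the sla(...) values.
    static final long[] BOUNDS_MS = {25, 50, 75, 100, 200, 500, 750, 1000, 2000};

    // counts[i] = number of observations <= BOUNDS_MS[i]; the last slot is the +Inf bucket.
    final long[] counts = new long[BOUNDS_MS.length + 1];

    void record(long durationMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (durationMs <= BOUNDS_MS[i]) {
                counts[i]++;
            }
        }
        counts[BOUNDS_MS.length]++; // +Inf counts every observation
    }

    public static void main(String[] args) {
        BucketSketch h = new BucketSketch();
        h.record(30);    // falls into the 50 ms bucket and every larger one
        h.record(120);   // falls into the 200 ms bucket and every larger one
        h.record(3000);  // exceeds all bounds, so only +Inf is incremented
        System.out.println(Arrays.toString(h.counts));
        // prints [0, 1, 1, 1, 2, 2, 2, 2, 2, 3]
    }
}
```

Because the buckets are cumulative, a query such as histogram_quantile in Prometheus can estimate percentiles from them without access to the raw durations.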
The Metrics class for OpenFeign:
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.springframework.stereotype.Component;

@Component
public class OpenFeignMetricsInstance extends AbstractHttpMetricsInstance {
    private static final String RSP_SS = "metric.openfeign.response";
    private static final String REQ_DS = "metric.openfeign.request.duration";
    private static final String RSP_EX = "metric.openfeign.client.exception";

    public OpenFeignMetricsInstance(PrometheusMeterRegistry registry) {
        // Allow at most 1000 distinct uri tag values per meter.
        super(RSP_SS, REQ_DS, RSP_EX, registry, 1000);
    }
}
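The dotted meter names above appear in Prometheus as the underscore names listed in the Javadoc of the parent class (for example metric_openfeign_response_total). The sketch below approximates that mapping for these three names only. It is a simplified assumption, not Micrometer's naming convention code, and `NameSketch` and its methods are hypothetical.

```java
// Simplified, hypothetical approximation of how dotted meter names appear in Prometheus.
class NameSketch {

    // Counters: dots become underscores and a _total suffix is appended.
    static String counterName(String meterName) {
        String base = meterName.replace('.', '_');
        return base.endsWith("_total") ? base : base + "_total";
    }

    // Timers: dots become underscores and a _seconds suffix is appended;
    // the _bucket, _count, _sum and _max series share this base name.
    static String timerBaseName(String meterName) {
        String base = meterName.replace('.', '_');
        return base.endsWith("_seconds") ? base : base + "_seconds";
    }

    public static void main(String[] args) {
        System.out.println(counterName("metric.openfeign.response"));
        System.out.println(counterName("metric.openfeign.client.exception"));
        System.out.println(timerBaseName("metric.openfeign.request.duration") + "_bucket");
    }
}
```

Knowing the final names is mainly useful when writing Grafana queries and alert rules against these metrics.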
2. How to monitor RestTemplate calls
Whether the call succeeds or throws an exception, the interceptor calls OpenFeignMetricsInstance to record the data points.
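import com.google.common.base.Stopwatch; // assumed: Guava's Stopwatch
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.net.URI;
import java.util.concurrent.TimeUnit;

@Service
public class RestTemplateMetricsInterceptor implements ClientHttpRequestInterceptor {
    @Autowired
    private OpenFeignMetricsInstance openFeignMetricsInstance;

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body, ClientHttpRequestExecution execution) throws IOException {
        ClientHttpResponse response;
        URI uri = request.getURI();
        String path = uri.getPath();
        String serviceName = uri.getHost();
        // Default status reported when no response is received.
        int status = 599;
        try {
            final Stopwatch stopwatch = Stopwatch.createStarted();
            response = execution.execute(request, body);
            stopwatch.stop();
            // Overwrite the default status with the actual response code.
            status = response.getRawStatusCode();
            openFeignMetricsInstance.requestDurationSeconds(serviceName, serviceName, path, stopwatch.elapsed(TimeUnit.MILLISECONDS));
        } catch (Exception e) {
            openFeignMetricsInstance.requestExceptionCount(serviceName, serviceName, path, e);
            throw e;
        } finally {
            // Always count the status code, whether the call succeeded or failed.
            openFeignMetricsInstance.responseStatusCodeCount(serviceName, serviceName, path, status);
        }
        return response;
    }
}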
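The control flow of the interceptor (time the call, count the exception, always count the status) can be isolated in a JDK-only sketch. Everything here is hypothetical scaffolding for illustration: `TimedCallSketch`, its `events` list, and the `IntSupplier` standing in for the HTTP exchange are not part of Spring or of the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical stand-in for the interceptor's control flow; no Spring types involved.
class TimedCallSketch {
    static final int DEFAULT_STATUS = 599;  // reported when no response was received

    final List<String> events = new ArrayList<>();
    long lastElapsedMs = -1;

    // Runs the call (which returns an HTTP status code) and records events around it.
    int execute(IntSupplier call) {
        int status = DEFAULT_STATUS;
        long start = System.nanoTime();
        try {
            status = call.getAsInt();
            lastElapsedMs = (System.nanoTime() - start) / 1_000_000;
            events.add("duration");          // only completed exchanges are timed
            return status;
        } catch (RuntimeException e) {
            events.add("exception:" + e.getClass().getSimpleName());
            throw e;                         // the caller still sees the failure
        } finally {
            events.add("status:" + status);  // always recorded, 599 on failure
        }
    }

    public static void main(String[] args) {
        TimedCallSketch ok = new TimedCallSketch();
        ok.execute(() -> 200);
        System.out.println(ok.events);

        TimedCallSketch failing = new TimedCallSketch();
        try {
            failing.execute(() -> { throw new IllegalStateException("connection refused"); });
        } catch (IllegalStateException expected) {
            System.out.println(failing.events);
        }
    }
}
```

One consequence of this layout is that failed calls never contribute to the duration timer, so latency percentiles describe completed exchanges only.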
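For how to register a ClientHttpRequestInterceptor on a RestTemplate, see the article "Spring: code for setting an Interceptor on RestTemplate".
3. How to monitor Feign calls
See the article "Feign source code analysis: replacing (decorating) the underlying client to monitor Feign interfaces".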