Flink Hands-On Practice Notes: Spring Boot Log Collection (1)

1 - Introduction

This series started after I learned Flink: to strengthen my Flink skills, I began designing my own business scenarios and building them out.

These posts are more like notes than tutorials, so they won't be especially detailed; please bear with me.

This is hands-on material. Basic environment and coding knowledge (Docker, Flink syntax and the like) won't be explained unless a step really needs it, so some technical background is assumed.

2 - Prerequisites

  • flink - 1.12.0
  • elasticsearch - 7.12
  • kafka - 2.12-2.5.0
  • kibana - 7.12
  • filebeat - 7.12

I won't share download links here; please grab these yourself.

3 - Code

Flink code

Maven pom dependencies. Don't ask why there are so many; if you ask, I'll say I don't know. Just copy them.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.iolo</groupId>
<artifactId>flink_study</artifactId>
<version>1.0.0</version>
<!-- Repository locations: aliyun, apache and cloudera, in that order -->
<repositories>
<repository>
<id>aliyun</id>
<url>http://maven.aliyun.com/nexus/content/groups/public/</url>
</repository>
<repository>
<id>apache</id>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
</repository>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>

<properties>
<encoding>UTF-8</encoding>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<java.version>1.8</java.version>
<scala.version>2.12</scala.version>
<flink.version>1.12.0</flink.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-scala-bridge_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-java-bridge_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Blink planner, the default since 1.11+ -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner-blink_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-common</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Flink connectors -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-sql-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-jdbc_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-csv</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-json</artifactId>
<version>${flink.version}</version>
</dependency>

<dependency>
<groupId>org.apache.bahir</groupId>
<artifactId>flink-connector-redis_2.11</artifactId>
<version>1.0</version>
<exclusions>
<exclusion>
<artifactId>flink-streaming-java_2.11</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-runtime_2.11</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-core</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-java</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
</exclusions>
</dependency>

<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-hive_2.12</artifactId>
<version>${flink.version}</version>
</dependency>

<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-shaded-hadoop-2-uber</artifactId>
<version>2.7.5-10.0</version>
</dependency>

<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.38</version>
</dependency>

<!-- Logging -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.7</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
<scope>runtime</scope>
</dependency>

<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.44</version>
</dependency>

<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.2</version>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.1</version>
</dependency>

<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch7_2.11</artifactId>
<version>${flink.version}</version>
</dependency>

</dependencies>

<build>
<sourceDirectory>src/main/java</sourceDirectory>
<plugins>
<!-- Compiler plugin -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
<!--<encoding>${project.build.sourceEncoding}</encoding>-->
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
<!-- Shade plugin (bundles all dependencies into the jar) -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<!--
zip -d learn_spark.jar META-INF/*.RSA META-INF/*.DSA META-INF/*.SF -->
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<!-- Set the jar's main class (optional) -->
<mainClass></mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>

</project>

Below is the Flink job; the detailed explanations are in the code comments.

package com.iolo.flink.cases;

import com.alibaba.fastjson.JSONObject;
import com.iolo.common.util.DingDingUtil;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.shaded.hadoop2.com.google.gson.Gson;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/**
 * @author iolo
 * @date 2021/3/17
 * Real-time alerting on monitored logs.
 * <p>
 * Environment:
 * 1 - Flink 1.12
 * 2 - Kafka
 * 3 - Filebeat
 * 4 - A Spring Boot service (with an endpoint that can produce logs of every level)
 * 5 - ES + Kibana
 * <p>
 * Filebeat tails the Spring Boot service logs and ships them to Kafka (topic sever_log_to_flink_consumer).
 * Flink consumes the topic and parses the records; when an ERROR log shows up it sends an email or a
 * DingTalk message (either by calling the Spring Boot service or directly from Flink).
 * All logs are then written to ES for analysis in Kibana.
 **/
public class case_1_kafka_es_log {
    public static void main(String[] args) throws Exception {
        // TODO env
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);

        Properties props = new Properties();
        // broker addresses
        props.setProperty("bootstrap.servers", "127.0.0.1:9092");
        // consumer group id
        props.setProperty("group.id", "test-consumer-group");
        // latest: resume from the committed offset if there is one, otherwise start from the latest message
        // earliest: resume from the committed offset if there is one, otherwise start from the earliest message
        props.setProperty("auto.offset.reset", "latest");
        // a background thread re-checks Kafka's partitions every 5s, enabling dynamic partition discovery
        props.setProperty("flink.partition-discovery.interval-millis", "5000");
        // auto-commit offsets (to the default topic; once checkpointing is added they are also stored with the checkpoint)
        props.setProperty("enable.auto.commit", "true");
        // auto-commit interval
        props.setProperty("auto.commit.interval.ms", "2000");

        // TODO source
        FlinkKafkaConsumer<String> source = new FlinkKafkaConsumer<>("sever_log_to_flink_consumer", new SimpleStringSchema(), props);

        DataStreamSource<String> ds = env.addSource(source);

        // TODO transformation
        SingleOutputStreamOperator<Tuple3<String, String, String>> result = ds.flatMap(new FlatMapFunction<String, Tuple3<String, String, String>>() {
            @Override
            public void flatMap(String s, Collector<Tuple3<String, String, String>> collector) throws Exception {
                // each Filebeat record is a JSON event; its "message" field holds one raw log line
                JSONObject json = (JSONObject) JSONObject.parse(s);
                String timestamp = json.getString("@timestamp");
                String message = json.getString("message");
                // the log level is expected to be the fourth space-separated token, e.g. "[ERROR]"
                String[] split = message.split(" ");
                String level = split[3];
                if ("[ERROR]".equalsIgnoreCase(level)) {
                    System.out.println("error!");
                    DingDingUtil.dingdingPost("error");
                }

                collector.collect(Tuple3.of(timestamp, level, message));
            }
        });

        // TODO sink
        result.print();

        /**
         * https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/elasticsearch.html
         */
        List<HttpHost> httpHosts = new ArrayList<>();
        httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));
        ElasticsearchSink.Builder<Tuple3<String, String, String>> esSinkBuilder = new ElasticsearchSink.Builder<>(
                httpHosts,
                new ElasticsearchSinkFunction<Tuple3<String, String, String>>() {
                    @Override
                    public void process(Tuple3<String, String, String> stringStringStringTuple3, RuntimeContext runtimeContext, RequestIndexer requestIndexer) {
                        Map<String, String> json = new HashMap<>();
                        json.put("@timestamp", stringStringStringTuple3.f0);
                        json.put("level", stringStringStringTuple3.f1);
                        json.put("message", stringStringStringTuple3.f2);
                        IndexRequest item = Requests.indexRequest()
                                .index("my-log")
                                .source(json);
                        requestIndexer.add(item);
                    }
                });
        // flush after every single element (handy for testing, too chatty for production)
        esSinkBuilder.setBulkFlushMaxActions(1);
        result.addSink(esSinkBuilder.build());
        // TODO execute
        env.execute("case_1_kafka_es_log");
    }
}
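
The flatMap above assumes that every Filebeat record is a JSON event whose message field carries one raw log line, and that the logback pattern puts the level in the fourth space-separated token. Here is a quick standalone check of that extraction logic with a made-up sample event (adjust it to whatever your actual log pattern produces):

import com.alibaba.fastjson.JSONObject;

// Standalone check of the level-extraction logic used in the flatMap above.
// The event below is invented for illustration; the real "message" layout depends on your logback pattern.
public class MessageParseCheck {
    public static void main(String[] args) {
        String event = "{\"@timestamp\":\"2021-04-01T08:07:00.000Z\","
                + "\"message\":\"2021-04-01 16:07:00.123 [http-nio-8080-exec-1] [ERROR] c.i.TestController - boom\"}";
        String[] split = JSONObject.parseObject(event).getString("message").split(" ");
        // tokens: date(0), time(1), [thread](2), [LEVEL](3)
        System.out.println("level token = " + split[3]);   // prints [ERROR] for this pattern
    }
}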

For the alert notifications I set up a DingTalk custom-robot webhook; if you need one, look it up online, it's very easy to do.

developers.dingtalk.com/document/ap…

package com.iolo.common.util;

import lombok.extern.slf4j.Slf4j;
import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

import java.io.IOException;

/**
 * @author iolo
 * @date 2021/3/30
 * https://developers.dingtalk.com/document/app/custom-robot-access/title-jfe-yo9-jl2
 **/
@Slf4j
public class DingDingUtil {
    private static final String url = "https://oapi.dingtalk.com/robot/send?access_token=REPLACE_WITH_YOUR_TOKEN";

    /**
     * Post a text message to the DingTalk custom-robot webhook (sent with a "FlinkLog:" prefix).
     *
     * @param text message content
     * @author fengxinxin
     * @date 2021/3/30 5:03 PM
     **/
    public static void dingdingPost(String text) throws Exception {
        MediaType JSON = MediaType.parse("application/json");
        OkHttpClient client = new OkHttpClient();
        String json = "{\"msgtype\": \"text\",\"text\": {\"content\": \"FlinkLog:" + text + "\"}}";
        RequestBody body = RequestBody.create(JSON, json);
        Request request = new Request.Builder()
                .url(url)
                .post(body)
                .build();
        try (Response response = client.newCall(request).execute()) {
            String responseBody = response.body().string();
            log.info(responseBody);
        } catch (IOException e) {
            log.error(e.getMessage());
        }
    }
}

Then you can start this main method directly from the IDE.

(Screenshot: image-20210401160753799.png)

Spring Boot

Download it directly from the Gitee repository; I won't go over it in detail.

Test endpoint: http://127.0.0.1:8080/test/log?level=error&count=10
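
The Gitee project itself isn't reproduced in this post, so here is a minimal sketch of what such a test controller could look like (package, class and message text are my own guesses, not the actual repository code); it simply writes count log lines at the requested level so Filebeat has something to ship:

package com.iolo.flinklog.controller;

import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical test endpoint: GET /test/log?level=error&count=10
@Slf4j
@RestController
public class LogTestController {

    @GetMapping("/test/log")
    public String log(@RequestParam String level, @RequestParam(defaultValue = "1") int count) {
        for (int i = 0; i < count; i++) {
            switch (level.toLowerCase()) {
                case "error":
                    log.error("test error log {}", i);
                    break;
                case "warn":
                    log.warn("test warn log {}", i);
                    break;
                case "debug":
                    log.debug("test debug log {}", i);
                    break;
                default:
                    log.info("test info log {}", i);
            }
        }
        return "ok";
    }
}

Whatever the real implementation looks like, the important part is that the service's logging config writes these lines to the file that Filebeat watches.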

Kafka

The commands below are all run from Kafka's bin directory; the ZooKeeper used here is the one bundled with Kafka.

# Start ZooKeeper
./zookeeper-server-start.sh ../config/zookeeper.properties
# Start Kafka
./kafka-server-start.sh ../config/server.properties
# Create the topic sever_log_to_flink_consumer
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sever_log_to_flink_consumer
# Check that it was created
./kafka-topics.sh --list --zookeeper localhost:2181
# Console producer (for testing)
./kafka-console-producer.sh --broker-list localhost:9092 --topic sever_log_to_flink_consumer
# Console consumer (for testing)
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic sever_log_to_flink_consumer --from-beginning

Elasticsearch

From here on I use Docker (you can set up the environment however you like; Docker will show up more and more in later posts). Straight to the command:

docker run \
--name fxx-es \
-p 9200:9200 \
-p 9300:9300 \
-v /Users/iOLO/dev/docker/es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-e "discovery.type=single-node" \
docker.elastic.co/elasticsearch/elasticsearch:7.12.0

Verify it is up (for example, open 127.0.0.1:9200 in a browser):

(Screenshot: es5.png)

Kibana

docker run \
--name fxx-kibana \
--link fxx-es:elasticsearch \
-p 5601:5601 \
docker.elastic.co/kibana/kibana:7.12.0

I went into the container to switch the UI to Chinese; you can skip this step.

To do it, add i18n.locale: "zh-CN" to the kibana.yml config file (in the official image it lives at /usr/share/kibana/config/kibana.yml).

To verify, open 127.0.0.1:5601.

The actual Kibana usage is walked through with screenshots later on.

Filebeat

Download: www.elastic.co/cn/download…

Pick the build for your own OS; I'm on macOS.

After extracting it, edit the config file (filebeat.yml); the full config is below. Then start Filebeat, e.g. with ./filebeat -e -c filebeat.yml.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration. (This is one of the settings you need to change.)
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  # Point this at the log files of the Spring Boot service you downloaded.
  paths:
    - /Users/iOLO/dev/Java/flinklog/logs/flink-log.*.log

# ------------------------------ Kafka Output -------------------------------
output.kafka:
  # initial brokers for reading cluster metadata; the Kafka address. This block is copied straight from the official docs:
  # https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html
  hosts: ["127.0.0.1:9092"]

  # message topic selection + partitioning; this is the topic Flink consumes. Everything else keeps the official defaults.
  topic: 'sever_log_to_flink_consumer'
  partition.round_robin:
    reachable_only: false

  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000

4 - Hands-on

Once the environment and the programs are ready, don't forget to start the Spring Boot service.

Then generate logs by calling the endpoint 127.0.0.1:8080/test/log?level=error&count=10.
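
If you prefer to fire the request from code instead of a browser, here is a small sketch using the okhttp dependency that is already in the pom (same endpoint and parameters as above):

import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

// Calls the test endpoint once; bump count or add a loop if you want more log volume.
public class TriggerTestLogs {
    public static void main(String[] args) throws Exception {
        OkHttpClient client = new OkHttpClient();
        Request request = new Request.Builder()
                .url("http://127.0.0.1:8080/test/log?level=error&count=10")
                .get()
                .build();
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.code() + " " + response.body().string());
        }
    }
}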

Then check DingTalk to see whether the alert came through.

(Screenshot: image-20210401161645083.png)

DingTalk works!

Next, the Kibana side.

Here is the result page:

(Screenshot: es1.png)

And the steps to get there:

Step 1: go to the ES index selection.

(Screenshot: es2.png)

Step 2: click where the red box indicates to reach the ES index query page.

(Screenshot: es3.png)

Finally, search for the index you just wrote to in the input box (in our code it is my-log) and follow the prompts through the next steps. I had already created it, so I won't demo that part again. In the end you get the page shown earlier.

(Screenshot: es4.png)

5 - Closing words

That's the whole thing. Many people will surely object that Filebeat + ELK alone can achieve the same result. Fine, no need to argue: this is me inventing my own business scenario to practice Flink after learning it. If you have a better solution, go with yours.

If there are scenarios you'd like to see, or ones you've run into yourself, feel free to suggest them; I'll weigh them up and turn the good ones into future hands-on posts.

Thanks for reading.
