ELK is a data-analysis stack that can ingest data from virtually any source and search, analyze, and visualize it in near real time. This post is a quick record of setting up an ELK environment; the Linux and Mac procedures are nearly identical, and the steps here are performed on macOS.
Prerequisites: Java 1.8.0, Node 8.14.0, Maven 3.5.0
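You can verify the base environment from a terminal before starting (exact patch versions matter less than the major/minor versions):
java -version
node -v
mvn -v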
Target files (a cloud-drive link is at the end of the post, or download directly):
************************
ES 6.5.0 https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz
Logstash 6.5.0 https://artifacts.elastic.co/downloads/logstash/logstash-6.5.0.tar.gz
Kibana 6.5.0 https://artifacts.elastic.co/downloads/kibana/kibana-6.5.0-darwin-x86_64.tar.gz
IK https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
***********************
As of January 18, 2019, the latest ES release (https://www.elastic.co/downloads/elasticsearch) is 6.5.4, but the IK analysis plugin currently only supports versions up to 6.5.0, so the whole ELK stack is pinned at 6.5.0.
1. ES + IK
About Elasticsearch: Elasticsearch is an open-source distributed search engine covering three core jobs: collecting, analyzing, and storing data. Its features include distributed operation, zero configuration, automatic discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, and automatic search load balancing.
The download URLs are already listed under the target files above. Taking ES as the example, here is how such a URL is found; the same steps apply to Logstash and Kibana.
1.1 Get the ES download URL
On the elastic site https://www.elastic.co/ click the "Downloads" button at the top right to reach the downloads page, follow Elasticsearch -> Download -> Past Releases to the version-selection page (https://www.elastic.co/downloads/past-releases), pick version 6.5.0, and copy the download URL.
1.2 Download and install ES
Create an ELK folder and download the ES tarball into it:
mkdir /opt/soft/ELK
cd /opt/soft/ELK/
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz
Once the download finishes, extract it in place, then enter the directory and start ES as a test:
tar -zxvf elasticsearch-6.5.0.tar.gz
rm elasticsearch-6.5.0.tar.gz
cd elasticsearch-6.5.0
./bin/elasticsearch
When the log prints "started", ES is running. The default ES port is 9200; open http://localhost:9200/ in a browser to check.
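The same check works from a terminal; a healthy node answers the root endpoint with a small JSON document (abbreviated here, and the name field will differ per machine):
curl http://localhost:9200/
{
  "name" : "...",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "6.5.0",
    ...
  },
  "tagline" : "You Know, for Search"
}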
1.3 Install the IK analysis plugin
Stop ES with Control+C, then create an ik folder under plugins:
cd plugins
mkdir ik
Go back to the ELK directory and download IK:
cd ../..
wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
Extract the zip file. If the unzip command is missing, install it with brew on macOS, or with yum/apt-get on Linux:
unzip master.zip
cd elasticsearch-analysis-ik-master
After IK is extracted, enter the folder and build the package with Maven:
mvn clean package
The build takes a while. Once it succeeds, move "elasticsearch-analysis-ik-6.5.0.zip" from target/releases/ into the ik folder under the ES plugins directory and extract it:
mv target/releases/elasticsearch-analysis-ik-6.5.0.zip /opt/soft/ELK/elasticsearch-6.5.0/plugins/ik/
cd /opt/soft/ELK/elasticsearch-6.5.0/plugins/ik/
unzip elasticsearch-analysis-ik-6.5.0.zip
After extracting, start ES again and check the IK plugin status.
If the log shows "loaded plugin [analysis-ik]" and then "started", the IK plugin is installed successfully.
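You can also confirm the installation without reading the log: ES's bundled plugin tool lists everything under the plugins directory, and the IK plugin should appear in its output.
./bin/elasticsearch-plugin list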
1.4 Test ES + IK
Open a new terminal, create an index named index, and run a segmentation test on the sentence "你果然后面有戏份":
curl -XPUT http://localhost:9200/index
curl -H "Content-Type: application/json" -XGET 'http://localhost:9200/index/_analyze?pretty=true' -d '
{
  "analyzer" : "ik_max_word",
  "text" : "你果然后面有戏份"
}'
The terminal output:
{
  "tokens" : [
    {
      "token" : "你",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "果然",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "然后",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "后面",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "面有",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "有戏",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "戏份",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 6
    }
  ]
}
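For contrast, ES's built-in standard analyzer splits Chinese text into single characters, which is exactly why a dedicated plugin like IK is needed. Rerunning the request with the standard analyzer shows each character coming back as its own token ("你", "果", "然", ...):
curl -H "Content-Type: application/json" -XGET 'http://localhost:9200/index/_analyze?pretty=true' -d '
{
  "analyzer" : "standard",
  "text" : "你果然后面有戏份"
}'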
Now test another sentence, "落花有意流水无情":
curl -H "Content-Type: application/json" -XGET 'http://localhost:9200/index/_analyze?pretty=true' -d '
{
  "analyzer" : "ik_max_word",
  "text" : "落花有意流水无情"
}'
Output:
{
  "tokens" : [
    {
      "token" : "落花有意流水无情",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "落花有意",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "落花",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "有意",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "流水无情",
      "start_offset" : 4,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "流水",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "无情",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 6
    }
  ]
}
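IK also ships a coarser-grained analyzer, ik_smart, which keeps only the most likely segmentation instead of enumerating every dictionary word. Swapping the analyzer name in the same request returns noticeably fewer tokens (which ones exactly depends on the IK dictionary in use):
curl -H "Content-Type: application/json" -XGET 'http://localhost:9200/index/_analyze?pretty=true' -d '
{
  "analyzer" : "ik_smart",
  "text" : "落花有意流水无情"
}'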
Chinese word segmentation is clearly working. With that, ES + IK is installed and verified.
2. Logstash + input_jdbc
About Logstash: Logstash is primarily a tool for collecting, parsing, and filtering logs, and it supports a large number of data sources. It usually runs in a client/server fashion: a client on each host that needs log collection ships events to a server, which filters and transforms the events from all nodes and forwards them to Elasticsearch.
2.1 Install Logstash
Download and extract Logstash in the ELK folder:
cd /opt/soft/ELK
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.5.0.tar.gz
tar -zxvf logstash-6.5.0.tar.gz
2.2 Test Logstash
Enter the extracted folder and run Logstash:
cd logstash-6.5.0/
./bin/logstash -e 'input { stdin { } } output { stdout {} }'
Once the log shows "Successfully started", Logstash is up. Type hello world to test the pipeline:
[2019-01-18T17:07:16,543][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-01-18T17:07:16,789][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
hello world
{
  "message" => "hello world",
  "@timestamp" => 2019-01-18T09:07:23.194Z,
  "host" => "zhangyanandeMacBook-Pro.local",
  "@version" => "1"
}
2.3 Install the logstash-input-jdbc plugin
Install it with Logstash's bundled plugin command:
./bin/logstash-plugin install logstash-input-jdbc
When "Installation successful" appears, the plugin is installed.
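A quick double check with the plugin tool's list subcommand; logstash-input-jdbc should appear in the output:
./bin/logstash-plugin list | grep jdbc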
2.4 Sync a MySQL database with Logstash
Create a sqlconfig.conf file:
vim sqlconfig.conf
input {
  jdbc {
    # path to the MySQL driver jar
    jdbc_driver_library => "/opt/soft/ELK/mysql-connector-java-5.1.47-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # connection string for the target database
    jdbc_connection_string => "jdbc:mysql://localhost:3306/infodata?useUnicode=true&characterEncoding=utf8&useSSL=true"
    # database user
    jdbc_user => "root"
    # database password
    jdbc_password => "Ac123456"
    # cron-style schedule: run once a minute
    schedule => "* * * * *"
    # the SQL whose result set is pushed to ES (statement_filepath can load it from a file instead)
    statement => "SELECT id,name,content,crtime from zztest"
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    # target index
    index => "sqlindex"
    # use the queried id column as the document _id
    document_id => "%{id}"
  }
  stdout {
    codec => json_lines
  }
}
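One caveat: as written, the statement re-pulls the whole table every minute and stays idempotent only because document_id overwrites existing documents. If the table only ever grows, a common refinement (a sketch, assuming an auto-increment id column) is to let the plugin track the last value it has seen:

input {
  jdbc {
    # ...same driver/connection settings as above...
    use_column_value => true
    tracking_column => "id"
    tracking_column_type => "numeric"
    # :sql_last_value holds the highest id seen on the previous run
    statement => "SELECT id,name,content,crtime FROM zztest WHERE id > :sql_last_value"
  }
}

Either variant can be syntax-checked without starting the pipeline via ./bin/logstash -f sqlconfig.conf --config.test_and_exit.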
Save the file, then run Logstash again, this time loading sqlconfig.conf:
./bin/logstash -f sqlconfig.conf
Watch the log to see whether the DB rows get synced into ES:
Sending Logstash logs to /opt/soft/ELK/logstash-6.5.0/logs which is now configured via log4j2.properties
[2019-01-18T17:26:41,208][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-01-18T17:26:41,227][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.5.0"}
[2019-01-18T17:26:44,471][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-01-18T17:26:45,001][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2019-01-18T17:26:45,010][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2019-01-18T17:26:45,289][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2019-01-18T17:26:45,388][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2019-01-18T17:26:45,392][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2019-01-18T17:26:45,430][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2019-01-18T17:26:45,456][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2019-01-18T17:26:45,474][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2019-01-18T17:26:45,598][INFO ][logstash.outputs.elasticsearch] Installing elasticsearch template to _template/logstash
[2019-01-18T17:26:45,964][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x2b242f8a run>"}
[2019-01-18T17:26:46,011][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-01-18T17:26:46,326][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2019-01-18T17:27:01,682][INFO ][logstash.inputs.jdbc ] (0.027233s) SELECT id,name,content,crtime from zztest
{"@timestamp":"2019-01-18T09:27:01.820Z","name":"zhyn","content":"elktest","crtime":"2019-01-17T08:03:00.000Z","@version":"1","id":1}
{"@timestamp":"2019-01-18T09:27:01.841Z","name":"张哈哈","content":"同步成功","crtime":"2019-01-17T08:31:08.000Z","@version":"1","id":3}
{"@timestamp":"2019-01-18T09:27:01.840Z","name":"zhyn4j","content":"eltest333","crtime":"2019-01-17T08:16:35.000Z","@version":"1","id":2}
Use the ES search API to list everything in the sqlindex index:
curl -H "Content-Type: application/json" -XPOST 'localhost:9200/sqlindex/_search?pretty' -d '
{
  "query": { "match_all": {} }
}'
The response:
{
  "took" : 62,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "sqlindex",
        "_type" : "doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2019-01-18T09:30:00.157Z",
          "name" : "zhyn4j",
          "content" : "eltest333",
          "crtime" : "2019-01-17T08:16:35.000Z",
          "@version" : "1",
          "id" : 2
        }
      },
      {
        "_index" : "sqlindex",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2019-01-18T09:30:00.156Z",
          "name" : "zhyn",
          "content" : "elktest",
          "crtime" : "2019-01-17T08:03:00.000Z",
          "@version" : "1",
          "id" : 1
        }
      },
      {
        "_index" : "sqlindex",
        "_type" : "doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2019-01-18T09:30:00.157Z",
          "name" : "张哈哈",
          "content" : "同步成功",
          "crtime" : "2019-01-17T08:31:08.000Z",
          "@version" : "1",
          "id" : 3
        }
      }
    ]
  }
}
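Beyond match_all, a targeted match query confirms full-text search works on the synced fields (a quick check; sqlindex is using ES's default dynamic mapping since none was defined):
curl -H "Content-Type: application/json" -XPOST 'localhost:9200/sqlindex/_search?pretty' -d '
{
  "query": { "match": { "name": "zhyn" } }
}'
This should return just the row whose name is zhyn (id 1).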
With that, Logstash is syncing the database successfully.
3. Kibana
About Kibana: Kibana is a free, open-source tool that provides a friendly web UI for the log analysis done by Logstash and Elasticsearch, helping you aggregate, analyze, and search important log data.
3.1 Install Kibana
Install Kibana under the ELK directory:
cd /opt/soft/ELK
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.5.0-darwin-x86_64.tar.gz
tar -zxvf kibana-6.5.0-darwin-x86_64.tar.gz
3.2 Run Kibana
cd kibana-6.5.0-darwin-x86_64
./bin/kibana
Kibana's default port is 5601.
Open http://localhost:5601 in a browser to confirm it works.
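If you prefer the terminal, Kibana also exposes a status endpoint that returns JSON once the server is ready:
curl http://localhost:5601/api/status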
Commonly used items in the Kibana sidebar are:
Discover: search and inspect data
Visualize: build charts
Dashboard: assemble dashboards
Timelion: advanced visualization and analysis of time-series data
DevTools: developer tools
Management: configuration
Go to the Discover tab and check the indices: if sqlindex shows up (you may need to create an index pattern for it under Management > Index Patterns first), Kibana is connected to ES and the installation is complete.
4. Summary
The ELK environment is now fully deployed. I hope to cover hands-on large-scale data collection and analysis in later posts, to save newcomers some detours. The files mentioned in this post, including the MySQL driver jar, are on the cloud drive; grab them if you need them:
https://pan.baidu.com/s/1VVs667OiDvZph8oq9yBk3Q 密码:br7n