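Logstash is an open-source data collection engine with real-time pipelining capabilities. In the ELK Stack, the lighter-weight Filebeat is usually chosen to collect logs, which are then shipped to Logstash for processing; Logstash in turn writes the processed logs to a configured destination (Elasticsearch, Kafka, and so on).
A Logstash event passes through a pipeline of three stages, inputs → filters → outputs, and each stage can be extended with custom plugins. This article covers how to develop a filter plugin, the plugin type for which custom requirements come up most often.
Installing Logstash is not covered in detail here; it can be downloaded from https://www.elastic.co/downloads/logstash.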
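To make the three-stage pipeline concrete, here is a toy model in plain Ruby. It is only a sketch: events are modeled as plain hashes, and the method names read_input, apply_filters, and write_output are invented for illustration and are not part of Logstash.

```ruby
# Toy model of the inputs -> filters -> outputs pipeline.
# Plain Ruby only; none of these method names come from Logstash itself.

# Input stage: turn raw lines into events (plain hashes in this sketch).
def read_input(lines)
  lines.map { |line| { "message" => line } }
end

# Filter stage: enrich each event, here by adding the message length.
def apply_filters(events)
  events.map { |event| event.merge("length" => event["message"].length) }
end

# Output stage: deliver events to a destination (an array in this sketch).
def write_output(events, sink)
  events.each { |event| sink << event }
  sink
end

sink = write_output(apply_filters(read_input(["hello", "logstash"])), [])
```

A filter plugin corresponds to the middle stage: it receives one event at a time and modifies it before it reaches the outputs.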
Generating a filter plugin
cd into the Logstash root directory and use bin/logstash-plugin to generate a filter plugin template:
bin/logstash-plugin generate --type filter --name test --path vendor/localgems
vendor/localgems can be changed to a path of your own.
The generated filter plugin has the following directory structure:
$ tree logstash-filter-test
├── Gemfile
├── LICENSE
├── README.md
├── Rakefile
├── lib
│   └── logstash
│       └── filters
│           └── test.rb
├── logstash-filter-test.gemspec
└── spec
    ├── filters
    │   └── test_spec.rb
    └── spec_helper.rb
A first look at the filter plugin
Code structure
Logstash plugins are written in Ruby. The file lib/logstash/filters/test.rb looks like this:
# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
# This filter will replace the contents of the default
# message field with whatever you specify in the configuration.
#
# It is only intended to be used as an example.
class LogStash::Filters::Test < LogStash::Filters::Base
  # Setting the config_name here is required. This is how you
  # configure this filter from your Logstash config.
  #
  # filter {
  #   test {
  #     message => "My message..."
  #   }
  # }
  #
  config_name "test"
  # Replace the message with this value.
  config :message, :validate => :string, :default => "Hello World!"
  public
  def register
    # Add instance variables
  end # def register
  public
  def filter(event)
    if @message
      # Replace the event message with our message as configured in the
      # config file.
      event.set("message", @message)
    end
    # filter_matched should go in the last line of our successful code
    filter_matched(event)
  end # def filter
end # class LogStash::Filters::Test
UTF-8 encoding
Logstash relies on UTF-8 encoding, so the following line must appear at the top of the plugin source:
# encoding: utf-8
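require
The template requires "logstash/filters/base" and "logstash/namespace" by default. If the plugin depends on other code or gems, require them here as well; the MySQL query code later in this article is an example.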
Plugin name
The plugin name is set as follows:
config_name "test"
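test is the plugin name, and it is the name used inside the filter block of the Logstash configuration.
Plugin parameters
Plugin parameters are declared as follows: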
config :message, :validate => :string, :default => "Hello World!"
message is an optional parameter of the test plugin, with a default value of "Hello World!". The general form of a parameter declaration is:
config :variable_name, :validate => :variable_type, :default => "Default value", :required => boolean, :deprecated => boolean, :obsolete => string
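- :variable_name: the parameter name
- :validate: the type the value is validated against, such as :string, :password, :boolean, :number, :array, :hash, or :path
- :required: whether the parameter must be set
- :default: the default value
- :deprecated: whether the parameter is deprecated
- :obsolete: declares that the parameter is no longer used, usually with a message describing the upgrade path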
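The options above can be combined as in the following sketch. So that it runs without Logstash, it uses a small stand-in class, SketchBase, that merely records each declaration; in a real plugin the config method is inherited from LogStash::Filters::Base. The parameter names table, cache_ttl, fields, and verbose are hypothetical.

```ruby
# Stand-in for the config DSL so this sketch runs without Logstash.
# In a real plugin, `config` comes from LogStash::Filters::Base.
class SketchBase
  # Record a declared setting and its options.
  def self.config(name, opts = {})
    settings[name] = opts
  end

  # Per-class registry of declared settings.
  def self.settings
    @settings ||= {}
  end
end

class LookupFilterSketch < SketchBase
  # Required string setting with no default.
  config :table, :validate => :string, :required => true
  # Optional number with a default value.
  config :cache_ttl, :validate => :number, :default => 60
  # Optional list of field names, empty by default.
  config :fields, :validate => :array, :default => []
  # Boolean kept for backwards compatibility and flagged as deprecated.
  config :verbose, :validate => :boolean, :default => false, :deprecated => true
end
```

In the Logstash configuration, such a filter would then be expected to receive at least the required table setting, while the other settings fall back to their defaults.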
Plugin methods
A Logstash filter plugin must implement two methods: register and filter.
The register method looks like this:
public
def register
  # Add instance variables
end # def register
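register acts as the initialization method and does not need to be called manually. Configuration values such as @message are available inside it, and it is also the place to initialize your own instance variables.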
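A common use of register is to do expensive setup once, so that the per-event code only reuses the result. The sketch below mimics that split with a plain Ruby class; PrefixFilterSketch and its matches? method are invented for illustration, and the constructor argument plays the role of a configuration value.

```ruby
# Sketch only: a plain class mimicking the register / per-event split.
class PrefixFilterSketch
  def initialize(prefix)
    @prefix = prefix # plays the role of a config value such as @message
  end

  # In a real plugin this runs once at startup: build the pattern here
  # instead of rebuilding it for every event.
  def register
    @pattern = Regexp.new("\\A" + Regexp.escape(@prefix))
  end

  # Per-event work: reuse what register prepared.
  def matches?(message)
    !!(@pattern =~ message)
  end
end
```

Calling register once and then matches? for each message mirrors how Logstash calls register at startup and filter for every event.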
The filter method looks like this:
public
def filter(event)
  if @message
    # Replace the event message with our message as configured in the
    # config file.
    event.set("message", @message)
  end
  # filter_matched should go in the last line of our successful code
  filter_matched(event)
end # def filter
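The filter method holds the plugin's data-processing logic. The event variable wraps the data flowing through the pipeline, and its contents are accessed through the event API; see https://www.elastic.co/guide/en/logstash/5.1/event-api.html for details. The last statement calls filter_matched, which ensures that the add_field, remove_field, add_tag, and remove_tag options in the Logstash configuration are applied correctly.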
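As an illustration of working with the event API, the sketch below reads a field with get, writes fields with set, and tags events that lack the expected field. FakeEvent is a hash-backed stand-in written for this example so that the code runs without Logstash; it only imitates get, set, and tag. The tag name _no_message and the field message_length are hypothetical.

```ruby
# Hash-backed stand-in for LogStash::Event, supporting only get/set/tag.
class FakeEvent
  def initialize(data = {})
    @data = data
  end

  def get(field)
    @data[field]
  end

  def set(field, value)
    @data[field] = value
  end

  # Append a tag to the "tags" field without duplicating it.
  def tag(value)
    @data["tags"] = ((@data["tags"] || []) + [value]).uniq
  end
end

# Body of a hypothetical filter method: normalize "message" and record its length.
def filter_sketch(event)
  message = event.get("message")
  if message.nil?
    event.tag("_no_message") # hypothetical tag name
    return
  end
  stripped = message.strip
  event.set("message", stripped)
  event.set("message_length", stripped.length)
  # In a real plugin, filter_matched(event) would be called here.
end
```

Returning early without calling filter_matched means the add_field and add_tag options are not applied to events the filter did not actually process.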
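Using other libraries in a plugin
Querying MySQL from a plugin serves as the example here. MySQL is accessed through JDBC, which requires the jdbc-mysql gem. The steps are as follows.
Add the Logstash environment variables: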
export LOGSTASH_HOME=/opt/logstash-5.2.1
export PATH=$PATH:$LOGSTASH_HOME/vendor/jruby/bin
Install jdbc-mysql:
gem install jdbc-mysql
MySQL is queried through sequel (for its code and documentation, see vendor/bundle/jruby/1.9/gems/sequel-4.43.0). First, add a dependency on sequel to the logstash-filter-test.gemspec file:
# Gem dependencies
s.add_runtime_dependency "logstash-core-plugin-api", "~> 2.0"
s.add_runtime_dependency 'sequel'
s.add_development_dependency 'logstash-devutils'
Then require the relevant code in test.rb:
require "sequel"
require "sequel/adapters/jdbc"
Add a :jdbc_driver_library parameter to test.rb to configure the path of the JDBC driver library. On my machine the path is "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar".
config :jdbc_driver_library, :validate => :string, :required => true
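The register method does two things: it initializes a few instance variables, and it requires the JDBC driver library. A brief note on the instance variables: @logger is used for logging; @connection_retry_attempts and @connection_retry_attempts_wait_time control database connection retries; @connection_wait_timeout sets the MySQL session timeout so that connections to MySQL do not pile up. The timeout is a second line of defense: normally MySQL has a global timeout configured and we disconnect explicitly once the query completes (see the fetch_info method), so @connection_wait_timeout only comes into play when the disconnect fails and the MySQL timeout is too long.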
public
def register
  # Add instance variables
  @logger = self.logger
  @connection_retry_attempts = 5
  @connection_retry_attempts_wait_time = 1
  @connection_wait_timeout = 10
  begin
    require @jdbc_driver_library
  rescue LoadError => e
    # A failed require raises LoadError, which a bare rescue (StandardError) does not catch.
    @logger.error("Failed to load #{@jdbc_driver_library}", :exception => e)
  end
end # def register
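One Ruby detail is worth keeping in mind when guarding a require: a failed require raises LoadError, which descends from ScriptError rather than StandardError, so a bare rescue does not catch it. The following plain-Ruby sketch demonstrates the difference; the method names and the nonexistent library path are made up for the demonstration.

```ruby
# Rescue LoadError explicitly: returns :missing instead of raising.
def try_require(path)
  require path
  :loaded
rescue LoadError
  :missing
end

# Shows that a bare `rescue` (which only handles StandardError)
# lets the LoadError from a failed require pass through.
def bare_rescue_catches_load_error?
  begin
    require "no/such/library/for/this/sketch"
  rescue
    return true # not reached: LoadError is not a StandardError
  end
  false
rescue LoadError
  false # the error escaped the inner bare rescue and is handled here
end
```

For a plugin that cannot work without its driver, logging the error and then re-raising it may be preferable to continuing, since later calls would otherwise fail with a less obvious message.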
Create the db instance:
private
def create_db(conn_str)
  db = nil
  retry_attempts = @connection_retry_attempts
  while retry_attempts > 0 do
    retry_attempts -= 1
    begin
      tmp_db = Sequel.connect(conn_str)
    rescue Sequel::PoolTimeout => e
      if retry_attempts <= 0
        @logger.error("Failed to connect to database. 5 second timeout exceeded. Tried #{@connection_retry_attempts} times.")
        raise e
      else
        @logger.error("Failed to connect to database. 5 second timeout exceeded. Trying again.")
      end
    rescue Sequel::Error => e
      if retry_attempts <= 0
        @logger.error("Unable to connect to database. Tried #{@connection_retry_attempts} times", :error_message => e.message)
        raise e
      else
        @logger.error("Unable to connect to database. Trying again", :error_message => e.message)
      end
    else
      db = tmp_db
      break
    end
    sleep(@connection_retry_attempts_wait_time)
  end
  db
end
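The retry loop in create_db follows a general pattern: try, and on a specific error either give up after the last attempt or wait and try again. As a sketch of the same idea in isolation, the hypothetical helper with_retries below (not part of Logstash or Sequel) expresses it with Ruby's retry keyword.

```ruby
# Run the block up to `attempts` times, sleeping `wait` seconds between tries.
# Only errors of `error_class` trigger a retry; the last one is re-raised.
def with_retries(attempts, wait, error_class = StandardError)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue error_class => e
    raise e if tries >= attempts
    sleep(wait)
    retry
  end
end
```

Under these assumptions, create_db could be reduced to something like with_retries(@connection_retry_attempts, @connection_retry_attempts_wait_time, Sequel::Error) { Sequel.connect(conn_str) }, with logging added inside the helper if needed.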
Query the data:
private
def fetch_info(db, sql, key)
  all_info = {}
  retry_attempts = @connection_retry_attempts
  while retry_attempts > 0 do
    retry_attempts -= 1
    begin
      db.fetch(sql) do |row|
        all_info[row[key]] = row
      end
      db.run "set wait_timeout = " + @connection_wait_timeout.to_s
    rescue Sequel::DatabaseConnectionError, Sequel::DatabaseError => e
      if retry_attempts <= 0
        @logger.error("Exception when executing JDBC query", :exception => e)
        raise e
      else
        @logger.warn("Failed to execute query. Trying again.", :error_message => e.message)
      end
    else
      break
    end
    sleep(@connection_retry_attempts_wait_time)
  end
  all_info
ensure
  # Disconnect even when the final retry raises, so the connection is not leaked.
  db.disconnect
end
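With that in place, create_db and fetch_info can be used from register and filter as needed.
Note: querying MySQL is only an illustration here. When processing Logstash events, consider the impact on performance and throughput.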
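One way to limit the cost of database lookups is to load the data once and refresh it only periodically, rather than querying for every event. The following is a minimal sketch of such a cache in plain Ruby; TtlCache is a hypothetical helper, not something Logstash or Sequel provides, and it is not thread-safe as written.

```ruby
# Small time-based cache: re-runs the loader at most once per `ttl` seconds.
class TtlCache
  def initialize(ttl, &loader)
    @ttl = ttl
    @loader = loader # block that returns a hash of key => row
    @loaded_at = nil
    @data = {}
  end

  # Return the cached value for key, reloading first if the data is stale.
  def fetch(key)
    refresh if stale?
    @data[key]
  end

  private

  def stale?
    @loaded_at.nil? || (Time.now - @loaded_at) > @ttl
  end

  def refresh
    @data = @loader.call
    @loaded_at = Time.now
  end
end
```

In a plugin, register might build the cache with a block that calls fetch_info(create_db(conn_str), sql, key), and filter would then only call fetch on it; conn_str, sql, and key are placeholders here. Because Logstash can run filters on several worker threads, a real implementation would likely need a Mutex around refresh.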
Configuring the custom plugin in Logstash
cd into the Logstash root directory and add the following line to the Gemfile:
gem "logstash-filter-test", :path => "vendor/localgems/logstash-filter-test"
Starting Logstash
Start Logstash with our custom test plugin configured:
bin/logstash -e 'input { beats { port => "5043" } } filter { test { jdbc_driver_library => "/usr/local/lib/ruby/gems/2.3.0/gems/jdbc-mysql-5.1.40/lib/mysql-connector-java-5.1.40-bin.jar" } } output { stdout { codec => rubydebug }}'
Alternatively, put the same content that was passed to -e into a configuration file and start Logstash with that file.
For details on starting Logstash, see https://www.elastic.co/guide/en/logstash/5.1/running-logstash-command-line.html.