Following the previous post on integrating a MySQL data source into the ELK Stack, I realized I should really get file-based sources sorted out first.
This time I'm implementing Logstash's native file input. The ELK stack actually has a lightweight shipper dedicated to files, called Filebeat, which lowers the resource cost of scanning files and improves performance, but I'm skipping it for now (truth be told, I couldn't get it running yet).
Let's start by configuring Logstash's basic features.
Below are the configuration files used in today's exercise:
First, logstash.conf:
input {
  # tail application log files; type is used later to route events
  file {
    path => "/mnt/logs/mesocollection/*/*.log"
    type => "meso_sys_filelog"
    start_position => "beginning"
  }
  # pull rows from MySQL every minute (carried over from the previous post)
  jdbc {
    type => "meso_sys_log"
    jdbc_driver_library => "/mnt/lib/mysql-connector-java-5.1.33.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://mysqldb:3306/meso_collections"
    jdbc_user => "root"
    jdbc_password => "123456"
    statement => "SELECT * FROM sys_logs"
    schedule => "* * * * *"
  }
}
filter {
  # parse the application log lines into structured fields
  if [type] == "meso_sys_filelog" {
    grok {
      match => { "message" => "(?<Logtime>(%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})) - %{WORD:Logger} - %{WORD:Level} - (?<message>[^\r\n]+)" }
    }
  }
}
output {
  # events from the jdbc input
  if [type] == "meso_sys_log" {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "meso_sys_log"
      document_id => "%{sequence_id}"
    }
    stdout { codec => rubydebug }
  }
  # events from the file input
  if [type] == "meso_sys_filelog" {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "meso_sys_filelog"
    }
    stdout { codec => rubydebug }
  }
}
In the input section I added a file block with type set to meso_sys_filelog, which the filter and output sections use to tell events apart. The path option points at the log directory; if your logs sit under subdirectories, for example a per-date directory layout that a single path can't cover, you can use a glob the way I did, logs/*/*.log. That structure works and is quite convenient. The file input actually supports many more options, such as excluding certain filenames or tuning the scan interval; the details are in the official documentation (link).
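As a rough illustration of those extra options (the option names come from the file input documentation; the exclude patterns and interval values below are made-up examples, not my actual setup):

file {
  path => "/mnt/logs/mesocollection/*/*.log"
  type => "meso_sys_filelog"
  start_position => "beginning"
  # skip files whose names match these globs (hypothetical example)
  exclude => ["*.gz", "debug_*.log"]
  # how often (seconds) to check watched files for new content
  stat_interval => 5
  # how often (seconds) to look for new files matching the path glob
  discover_interval => 30
}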
The output section sends everything to Elasticsearch, so apart from giving each type its own index name, the settings are pretty much identical.
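Incidentally, since the elasticsearch output lets you reference event fields in the index name, the two branches could probably be collapsed into one; a minimal sketch (my own variation, not what I actually ran, and the jdbc events' document_id setting would still need its own branch):

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    # take the index name from each event's type field
    index => "%{type}"
  }
  stdout { codec => rubydebug }
}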
The filter section adds a grok expression, which is really the point of this post. A grok expression is written to follow the structure of the log line, and getting it right takes some back-and-forth testing. Here are a couple of handy sites where you can try your pattern; if it parses successfully, the extracted fields appear below the input. Pick whichever interface you like, the functionality is about the same; what matters is that it actually parses =.= (I've also put a minimal local test pipeline right after the expression below.)
http://grokdebug.herokuapp.com/
http://grokconstructor.appspot.com/do/match#result
Using my case as an example, here are the source log lines I parse and the expression I used, for reference:
Log structure:
2017-07-28 12:28:25,462 - MC_Entry - INFO - entry log : uuid:9285a6a0-19e2-45a2-a7d4-209b5b2c0f77-1501216105.4624586, behavior:insert_system_user_rdb, request_data={"$data$": "{\"passwordDecrypt\": \"8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92\", \"id\": \"69fa7aa0-e45f-4fdb-94fa-5b2705b900d4-1501216079.78184\", \"idNo\": \"A123456789\", \"account\": \"paul\"}"}, files_count=0
2017-07-28 12:28:25,642 - MC_Entry - INFO - entry log : uuid:1f16eb8d-5ae3-4523-962a-020c1193c281-1501216105.6422715, behavior:update_system_user_rdb, request_data={"$query$": "{\"id\": \"69fa7aa0-e45f-4fdb-94fa-5b2705b900d4-1501216079.78184\", \"account\": \"paul\"}", "$data$": "{\"email\": \"[email protected]\", \"firstName\": \"I-Ping\", \"lastName\": \"Huang\"}"}, files_count=0
Grok expression:
(?<Logtime>(%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})) - %{WORD:Logger} - %{WORD:Level} - (?<message>[^\r\n]+)
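Besides the online debuggers, you can also test the expression locally with a throwaway pipeline: save something like the following as test.conf, run it with bin/logstash -f test.conf, and paste a sample log line on stdin (a minimal sketch; test.conf is just a name I picked):

# test.conf - parse one pasted log line and dump the resulting fields
input {
  stdin { }
}
filter {
  grok {
    match => { "message" => "(?<Logtime>(%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})) - %{WORD:Logger} - %{WORD:Level} - (?<message>[^\r\n]+)" }
  }
}
output {
  stdout { codec => rubydebug }
}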
Once logstash.conf is ready, restart the logstash container. If the Logstash logs show no errors,
we can check from the server whether Elasticsearch is actually receiving the data:
#curl -XPOST http://yourhost:9200/meso_sys_filelog/_search?pretty=true
If you've only just started Logstash and want to see results right away, make sure new log lines have actually been written under the watched path; otherwise you'll only get a 404, index not found.
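One more caveat, based on my understanding of how the file input tracks position (not something I verified in this run): start_position => "beginning" only applies to files Logstash has never seen before; once a file's offset is recorded in the sincedb, restarting the container won't re-read the old lines. For repeated testing you can point sincedb_path somewhere disposable, roughly like this:

file {
  path => "/mnt/logs/mesocollection/*/*.log"
  type => "meso_sys_filelog"
  start_position => "beginning"
  # throw away position tracking so every restart re-reads the files (testing only)
  sincedb_path => "/dev/null"
}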
Once logs have started coming in, you should see something like this:
{
"_index" : "meso_sys_filelog",
"_type" : "meso_sys_filelog",
"_id" : "AV2I13eaVf3DdKZYkEUQ",
"_score" : 1.0,
"_source" : {
"path" : "/mnt/logs/mesocollection/20170728/MC_Entry_12.log",
"@timestamp" : "2017-07-28T10:56:48.391Z",
"Logtime" : "2017-07-28 18:56:48,189",
"@version" : "1",
"host" : "9bc64fb8bae4",
"Level" : "INFO",
"message" : [
"2017-07-28 18:56:48,189 - MC_Entry - INFO - entry log : uuid:9366d611-ef07-4502-badd-3ff23293aed6-1501239408.1892407, behavior:update_system_user_rdb, request_data={\"$query$\": \"{\\\"id\\\": \\\"ddaa3395-e908-4f1c-b216-01688fbba151-1501239382.1135163\\\", \\\"account\\\": \\\"paul\\\"}\", \"$data$\": \"{\\\"firstName\\\": \\\"I-Ping\\\", \\\"lastName\\\": \\\"Huang\\\", \\\"email\\\": \\\"[email protected]\\\"}\"}, files_count=0",
"entry log : uuid:9366d611-ef07-4502-badd-3ff23293aed6-1501239408.1892407, behavior:update_system_user_rdb, request_data={\"$query$\": \"{\\\"id\\\": \\\"ddaa3395-e908-4f1c-b216-01688fbba151-1501239382.1135163\\\", \\\"account\\\": \\\"paul\\\"}\", \"$data$\": \"{\\\"firstName\\\": \\\"I-Ping\\\", \\\"lastName\\\": \\\"Huang\\\", \\\"email\\\": \\\"[email protected]\\\"}\"}, files_count=0"
],
"type" : "meso_sys_filelog",
"Logger" : "MC_Entry"
}
},
{
"_index" : "meso_sys_filelog",
"_type" : "meso_sys_filelog",
"_id" : "AV2I13sPVf3DdKZYkEUX",
"_score" : 1.0,
"_source" : {
"path" : "/mnt/logs/mesocollection/20170728/MC_Return_12.log",
"@timestamp" : "2017-07-28T10:56:49.410Z",
"Logtime" : "2017-07-28 18:56:48,574",
"@version" : "1",
"host" : "9bc64fb8bae4",
"Level" : "INFO",
"message" : [
"2017-07-28 18:56:48,574 - MC_Return - INFO - entry log : uuid:13d93dc7-0956-4dd2-8117-653bf514165c-1501239408.5744486 is_success:True, message:, response_data={\"id\": \"ddaa3395-e908-4f1c-b216-01688fbba151-1501239382.1135163\", \"account\": \"[email protected]\"}",
"entry log : uuid:13d93dc7-0956-4dd2-8117-653bf514165c-1501239408.5744486 is_success:True, message:, response_data={\"id\": \"ddaa3395-e908-4f1c-b216-01688fbba151-1501239382.1135163\", \"account\": \"[email protected]\"}"
],
"type" : "meso_sys_filelog",
"Logger" : "MC_Return"
}
}
]
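Notice that message comes back as an array holding both the original line and the captured tail. That's because my grok pattern captures into a field that is also named message, so grok appends the new value instead of replacing it. If you prefer a single value, grok's overwrite option should handle it (a sketch, not what I actually ran):

grok {
  match => { "message" => "(?<Logtime>(%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})) - %{WORD:Logger} - %{WORD:Level} - (?<message>[^\r\n]+)" }
  # replace the original message field with the captured value instead of appending
  overwrite => ["message"]
}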
If the Elasticsearch API returns JSON, Kibana should be able to pick up the matching index for our new index pattern.
Sure enough, the index pattern can be created, and the fields we extracted with grok show up in the fields list, which means we can query and analyze them further.
Yes, the file input really is that simple; only the filter takes some effort to configure, and if your log formats aren't uniform, the amount of configuration grows accordingly.
Next time I still want to take on Filebeat. The performance gains and lower resource usage alone make it a must; it's just a pity I couldn't get it working today. With the typhoon keeping me home for the next couple of days, maybe I'll get a chance to knock it out.