Harnessing the power of ELK stack for log visualization.

It has been a while since we started using the ELK stack (Elasticsearch, Logstash and Kibana) for log visualization, so I decided to write a blog post on how to set it up and get data visualization for your systems.

I am not going to explain how to install the ELK components, as there are plenty of guides available on the internet for that.

I am assuming that you already have the required components installed and have a basic understanding of each component.

Let’s get started with the configuration of each component.

Elasticsearch:

The configuration file for elasticsearch is located at /etc/elasticsearch/elasticsearch.yml. Let’s start by editing it.

sudo vim /etc/elasticsearch/elasticsearch.yml

We need to secure elasticsearch so that outsiders can’t read data from it or interact with it via the API. To do this, find the line with network.host and uncomment it. Replace the value with localhost so that it looks like below.

network.host: localhost
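If you would rather script this edit than make it in vim, here is a minimal sketch using GNU sed. It assumes the stock file contains a network.host line (commented or not), and the helper name set_network_host is my own, not something shipped with elasticsearch. Try it on a copy of the file first.

```shell
#!/usr/bin/env bash
# Sketch: uncomment (or rewrite) the network.host line in an elasticsearch config file.
# set_network_host is a hypothetical helper; arguments are the file to edit and the host value.
set_network_host() {
  local file="$1" host="$2"
  if grep -Eq '^[[:space:]]*#?[[:space:]]*network\.host:' "$file"; then
    # An existing (possibly commented) line was found: replace it in place.
    sed -i -E "s|^[[:space:]]*#?[[:space:]]*network\.host:.*|network.host: ${host}|" "$file"
  else
    # No such line in the file: append one.
    printf 'network.host: %s\n' "$host" >> "$file"
  fi
}

# Usage (as root, after taking a backup):
#   cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
#   set_network_host /etc/elasticsearch/elasticsearch.yml localhost
```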

Save the config file and start elasticsearch.

[root@server101 ~]# systemctl start elasticsearch.service
[root@server101 ~]# systemctl status elasticsearch.service
● elasticsearch.service - Elasticsearch
 Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
 Active: active (running) since Tue 2016-08-23 17:13:43 IST; 6s ago
 Docs: http://www.elastic.co
 Process: 2929 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-pre-exec (code=exited, status=0/SUCCESS)
 Main PID: 2931 (java)
 CGroup: /system.slice/elasticsearch.service
 └─2931 /bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+Use...

Aug 23 17:13:43 server101 systemd[1]: Starting Elasticsearch...
Aug 23 17:13:43 server101 systemd[1]: Started Elasticsearch.
Aug 23 17:13:46 server101 elasticsearch[2931]: [2016-08-23 17:13:46,721][INF...]
Aug 23 17:13:46 server101 elasticsearch[2931]: [2016-08-23 17:13:46,721][INF....
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,562][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,708][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,709][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,709][WAR...]
Hint: Some lines were ellipsized, use -l to show in full.
[root@server101 ~]#

Then we need to enable the service so that it starts automatically at boot. To do that, execute the command below.

[root@server101 ~]# systemctl enable elasticsearch.service
Created symlink from /etc/systemd/system/multi-user.target.wants/elasticsearch.service to /usr/lib/systemd/system/elasticsearch.service.
[root@server101 ~]#

Kibana:

The config file for kibana is located at /opt/kibana/config/kibana.yml.

Open the file, find the line that says server.host (uncomment it first if it is commented out) and replace the default IP “0.0.0.0” with “localhost”. It should look like below.

server.host: "localhost"

Notice the double quotes around localhost.

Save and exit the editor. Because kibana now serves only on localhost, we need apache or nginx to work as a reverse proxy in front of it.

Now start and enable kibana.

[root@server101 ~]# systemctl start kibana
[root@server101 ~]# systemctl enable kibana
Created symlink from /etc/systemd/system/multi-user.target.wants/kibana.service to /usr/lib/systemd/system/kibana.service.
[root@server101 ~]#

Nginx:

In the previous step we configured kibana to listen only on localhost, so we need to configure nginx as a reverse proxy in order to access kibana from other machines.

Edit the nginx config file /etc/nginx/nginx.conf and change the server and location blocks to look like below.

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name server101;
    # root /usr/share/nginx/html;

    # Load configuration files for the default server block.
    # include /etc/nginx/default.d/*.conf;

    location / {
        # Forward all requests to kibana listening on localhost
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

Of course, I am assuming that you do not have any other site configured on your nginx. If you do, you may need to create a separate config file under /etc/nginx/conf.d/.

Now start and enable the nginx web server.

[root@server101 ~]# service nginx start
Redirecting to /bin/systemctl start nginx.service
[root@server101 ~]# chkconfig nginx on
Note: Forwarding request to 'systemctl enable nginx.service'.
Created symlink from /etc/systemd/system/multi-user.target.wants/nginx.service to /usr/lib/systemd/system/nginx.service.
[root@server101 ~]#

Now configure logstash as below.

Logstash:

The config files for logstash need to be created under /etc/logstash/conf.d/.

Logstash uses its own JSON-like configuration format with three sections: input, filter and output.

Let’s create the config files, starting with the input that receives data from filebeat.

[root@server101 ~]# vim /etc/logstash/conf.d/10-filebeat-input.conf
input {
  beats {
    port => 5044
  }
}

This file defines the beats input, which listens on port 5044.

Next we create the configuration file for the filter.

[root@server101 ~]# vim /etc/logstash/conf.d/14-syslog-filter.conf
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Save and exit. This filter applies to events that filebeat labels with the type syslog. It uses grok to parse each syslog line into structured fields.
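To get a feel for what the grok pattern above extracts, here is a rough approximation of it as a bash regular expression. This is only a sketch: it is not logstash's grok engine, the helper name parse_syslog_line is made up, and the sample log line in the usage comment is invented for illustration.

```shell
#!/usr/bin/env bash
# Rough bash approximation of the grok pattern (NOT logstash's own matcher).
# parse_syslog_line is a hypothetical helper; the printed names mirror the grok captures.
parse_syslog_line() {
  local line="$1"
  # timestamp, hostname, program, optional [pid], then ": " and the message
  local re='^([A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2}) ([^ ]+) ([^:[]+)(\[([0-9]+)\])?: (.*)$'
  if [[ $line =~ $re ]]; then
    echo "syslog_timestamp=${BASH_REMATCH[1]}"
    echo "syslog_hostname=${BASH_REMATCH[2]}"
    echo "syslog_program=${BASH_REMATCH[3]}"
    echo "syslog_pid=${BASH_REMATCH[5]}"
    echo "syslog_message=${BASH_REMATCH[6]}"
  else
    # In logstash, an event that fails grok is tagged with _grokparsefailure.
    echo "no match" >&2
    return 1
  fi
}

# Usage with an invented sample line:
#   parse_syslog_line 'Aug 23 17:13:43 server101 sshd[2931]: Accepted password for root'
```

If a line from your own /var/log/messages does not match here, that is a hint (not proof) that the grok filter may also fail on it.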

Next we need the config file for elasticsearch output

[root@server101 ~]# vim /etc/logstash/conf.d/17-elastic-output.conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Save and exit this output configuration. It tells logstash to store the beats data in elasticsearch, which listens on port 9200 by default and which we bound to localhost earlier.

Now we will test the logstash config with the command below.
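The index setting in the output block creates one index per beat per day. As a small illustration, the sketch below computes the index name for a given date. The helper name beat_index_name is hypothetical, and it relies on GNU date for the -d option.

```shell
#!/usr/bin/env bash
# Sketch: compute the daily index name produced by "%{[@metadata][beat]}-%{+YYYY.MM.dd}".
# beat_index_name is a hypothetical helper; it requires GNU date (-d).
beat_index_name() {
  local beat="$1" day="${2:-now}"
  # logstash formats the date part from the event's @timestamp, which is in UTC.
  printf '%s-%s\n' "$beat" "$(date -u -d "$day" +%Y.%m.%d)"
}

beat_index_name filebeat 2016-08-23   # prints filebeat-2016.08.23

# Once data is flowing, the matching indices can be listed with:
#   curl 'http://localhost:9200/_cat/indices/filebeat-*?v'
```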

[root@server101 ~]# service logstash configtest
Configuration OK
[root@server101 ~]#

It should report Configuration OK. If you get an error, it is most likely a syntax problem in one of the config files; correct it before proceeding further.

Now restart and enable logstash.

[root@server101 ~]# systemctl restart logstash
[root@server101 ~]# systemctl enable logstash
logstash.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig logstash on
[root@server101 ~]#

Next we need to load the filebeat index pattern. Elastic provides sample kibana dashboards for beats that include it, so we download those.

[root@server101 ~]# wget https://download.elastic.co/beats/dashboards/beats-dashboards-1.1.0.zip

Now unzip the downloaded file, change into the extracted directory and execute the load.sh shell script. This script loads the sample dashboards, visualizations and beats index patterns into elasticsearch.

[root@server101 ~]# unzip beats-dashboards-1.1.0.zip
[root@server101 ~]# cd beats-dashboards-1.1.0
[root@server101 beats-dashboards-1.1.0]# ./load.sh

Since filebeat will be shipping logs into elasticsearch, we also need to load the filebeat index template.

[root@server101 ~]# wget https://gist.githubusercontent.com/thisismitch/3429023e8438cc25b86c/raw/d8c479e2a1adcea8b1fe86570e42abab0f10f364/filebeat-index-template.json

Now we need to post this template to elasticsearch. We can use curl to send the JSON template data.

[root@server101 ~]# curl -XPUT 'http://localhost:9200/_template/filebeat?pretty' -d@filebeat-index-template.json
{
 "acknowledged" : true
}
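If you are scripting this step, you may want to check the response instead of reading it by eye. Here is a minimal sketch: is_acknowledged is a hypothetical helper that simply greps the JSON on stdin, so it assumes the response has the shape shown above.

```shell
#!/usr/bin/env bash
# is_acknowledged is a hypothetical helper: it reads an elasticsearch JSON response
# on stdin and succeeds only if it contains "acknowledged" : true.
is_acknowledged() {
  grep -Eq '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Usage with the same curl command as above:
#   curl -s -XPUT 'http://localhost:9200/_template/filebeat?pretty' -d@filebeat-index-template.json \
#     | is_acknowledged && echo "template loaded" || echo "template NOT loaded"
```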

Now the ELK stack is almost ready.

Filebeat:

Next we need to set up filebeat on the client server. I am doing this on the same system for the purposes of this blog post, but these steps need to be performed on every client server whose logs you want to ship to the ELK stack.

Now create and edit the filebeat config file as below.

[root@server101 ~]# vim /etc/filebeat/filebeat.yml

Under the prospectors section we can define the prospector properties. By default filebeat ships every log file matching /var/log/*.log, but we don’t need that in our setup, so we comment out that line and add paths for the secure and messages logs as below.

 paths:
 - /var/log/secure
 - /var/log/messages
# - /var/log/*.log

Next find the line document_type:, uncomment it and change the value to syslog as below.

 document_type: syslog

Next we need to configure our ELK server IP. Find the line elasticsearch: under the output section. We are not going to send data to elasticsearch directly, so make sure to comment out that entire elasticsearch part. Then find and uncomment the logstash section and configure it as below.

  logstash:
    # The Logstash hosts
    hosts: ["ELKserverIP:5044"]
    bulk_max_size: 1024

Now save and exit the file.
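Before starting filebeat, it can save some head scratching to confirm that the client can actually reach the logstash port. A minimal sketch, assuming bash and the coreutils timeout command are available on the client; port_open is a hypothetical helper name, and ELKserverIP stands for your own server address as in the config above.

```shell
#!/usr/bin/env bash
# port_open is a hypothetical helper: it succeeds if a TCP connection to host:port can be opened.
port_open() {
  local host="$1" port="$2"
  # /dev/tcp/HOST/PORT is a bash feature, not a real file; timeout comes from coreutils.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage:
#   port_open ELKserverIP 5044 && echo "logstash reachable" || echo "cannot reach logstash"
```

A failure here usually points at a firewall rule or at logstash not listening yet, rather than at the filebeat config itself.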

Next start and enable the filebeat service.

[root@server101 ~]# systemctl start filebeat
[root@server101 ~]# systemctl enable filebeat
Created symlink from /etc/systemd/system/multi-user.target.wants/filebeat.service to /usr/lib/systemd/system/filebeat.service.
[root@server101 ~]#

Now connect to kibana. It will prompt you to set a default index pattern; you can select any of the index patterns we loaded earlier.

 

 

Fun with Hadoop-Part 1

I previously deployed an OpenStack private cloud at my home.

I have completed the series on my company’s official blog at http://pythian.com/blog/author/bhagat/

I have been playing with big data and hadoop for some time now and decided to make use of my cloud infra. So I deployed a 7-node hadoop cluster and did some fun stuff with it.

The first problem I faced, however, was computing power.

I had only one compute node and I was short on computing power to run the cluster. To overcome this I added 2 additional compute nodes to the cloud infra. But that is for another series.

I want to talk about hadoop, so let’s get started with it.

I decided to use Hortonworks hadoop with ambari server, as the documentation and resources available to me were mostly for Hortonworks hadoop.

I first played with it on the AWS free tier but quickly realized that it might be a costly option for me just to play with and learn hadoop.

So that made me decide to deploy my own hadoop cluster at my own place.

I am writing the contents of the follow-up post; stay tuned for it.

On request from my company, I have completed the series on the official Pythian blog.

You can find it here once it is published.

 

OpenStack private cloud deployment – Part 1 Setting up network

For the private cloud there are plenty of open source solutions available.
However, for my setup I used the OpenStack Icehouse release.

As you can see in the diagram (Home Network Setup), I have:
 

  • Openfiler NAS storage box, which also serves DNS, DHCP and PXE in my network.
  • CentOS 6.2 KVM hypervisor, which I have been using so far to install sandbox VMs.
  • Home workstation PC, which runs Fedora 20.

OpenStack has a vast number of configuration options and can be deployed in different variations to suit your needs.
Ideally, it is advisable to deploy OpenStack with a minimum of 3 nodes.

  • Controller: As the name suggests, this is the controller node, which runs most of the control services.
  • Network: As the name suggests, this is the network node, which handles virtual networking.
  • Compute: This is the hypervisor node, which runs your VMs.

However, due to the lack of resources in my home network, I decided to use OpenStack legacy networking,
which only requires Controller and Compute nodes.

I converted the Fedora 20 workstation into the controller node and the KVM hypervisor into the compute node.

I will be doing follow-up posts on the configurations and setup.

I have completed the series on the official Pythian website. You can find it here.