The blog's latency was quite high on the old hosting provider, and it was slow to respond most of the time. So when my subscription with the old hosting provider was coming to an end, instead of renewing it I decided to migrate the blog to my self-hosted Kubernetes cluster. Below is a comparison of blog latency before and after the migration.
Blog migrated to Kubernetes
My subscription with the old hosting provider was expiring and it had been a few years since I updated my blog, so I decided to migrate the blog and get rid of the outdated infra.
I already run most of my applications on Kubernetes in my own infrastructure, and I decided this was the time to move this blog, along with a few others, onto it as well.
I will write a follow-up post later on how I did the migration and what my current infrastructure looks like.
Installing and using docker containers on Fedora 24
Recently a lot of organizations have started migrating their environments to a microservices architecture using containerization tools such as Docker. The advantage of container tools is that they are fast, robust, and give you fine control over their behavior. I am going to show you some basics of working with Docker containers, but first we need to have Docker installed on the system. I am using Fedora 24 as my workstation, so this guide is written with F24 in mind, but it should work on other Linux flavors as well.
Installing docker
The Docker website has very good documentation about installing and using Docker, which can be found here. To save time I will list the steps below.
First get the Docker repo file, or create one (it lives under /etc/yum.repos.d/) with the contents below.
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/fedora/$releasever/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
Now you can use dnf to install the docker engine.
#dnf install docker-engine
Next you need to start the docker service and enable it to run at boot.
#systemctl start docker
#systemctl enable docker.service
This will allow you to run Docker containers as the root user, but if you want to allow normal users to run containers, you need to add them to the docker group. Make sure the users have sudo privileges as well.
#usermod -aG docker username
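As a quick sanity check (a minimal sketch; "username" is a placeholder, and it assumes the user has logged out and back in so the new group membership takes effect), the user should now be able to run a container without sudo:
$ docker run --rm hello-world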
Running nginx container
Now that we have Docker installed, let's deploy an nginx container. There are a few ways to do it, but for simplicity we will use the official nginx image from the Docker repository. To get the nginx container image, run the command below.
$ docker pull nginx
Using default tag: latest
latest: Pulling from library/nginx
8ad8b3f87b37: Pull complete
c6b290308f88: Pull complete
f8f1e94eb9a9: Pull complete
Digest: sha256:aa5ac743d65e434c06fff5ceaab6f35cc8519d80a5b6767ed3bdb330f47e4c31
Status: Downloaded newer image for nginx:latest
As we have not specified any tag, it defaults to latest and pulls the latest nginx container image available from the Docker repository. You can verify the images available locally using the command below.
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
nginx               latest              4a88d06e26f4        4 days ago          183.5 MB
Let's say we want to run the container in detached mode and forward port 8080 on our host to port 80 on the nginx container; run the command below.
$ docker run -itd --name web -p 8080:80 nginx:latest
8ab540475170566d15f4576b1f93b8193947c80cf60bf57d9448f858a15a8410
Docker container operations
We can check the running Docker containers using the command below.
$ docker ps
CONTAINER ID        IMAGE               COMMAND                   CREATED             STATUS              PORTS                           NAMES
8ab540475170        nginx:latest        "nginx -g 'daemon off"    40 seconds ago      Up 37 seconds       443/tcp, 0.0.0.0:8080->80/tcp   web
As you can see, host port 8080 is forwarded to container port 80.
Let's hit nginx on the local port to see if it works.
$ curl localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
If we want to stop the container, just run the command below.
$ docker stop web
web
And if you check again, no container should be running on the system now.
$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
$
There is a lot more that can be done with Docker containers, but we will revisit that next time.
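For instance, here are a few common operations you could try against the "web" container from this post (a quick sketch, not covered in detail here):
$ docker logs web          # view the nginx logs from the container
$ docker exec -it web bash # get a shell inside a running container
$ docker rm web            # remove the container once it is stopped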
Harnessing the power of the ELK stack for log visualization
It's been a while since we started using the ELK stack (Elasticsearch, Logstash and Kibana) for log visualization. I decided to write a blog post on how to set it up to get data visualization on your systems.
I am not going to explain how to install the ELK components on your systems, as there are plenty of guides available on the internet for that.
I am assuming that you already have the required components installed and have a basic understanding of each of them.
Let's get started with the configuration of the components.
Elasticsearch:
The configuration file for Elasticsearch is located at /etc/elasticsearch/elasticsearch.yml. Let's start editing it.
sudo vim /etc/elasticsearch/elasticsearch.yml
We need to secure Elasticsearch so that outsiders can't read data from it or interact with it via the API. To do this, find the line with network.host and uncomment it, then replace the value with localhost so it looks like below.
network.host: localhost
Save the config file and start Elasticsearch.
[root@server101 ~]# systemctl start elasticsearch.service
[root@server101 ~]# systemctl status elasticsearch.service
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-08-23 17:13:43 IST; 6s ago
     Docs: http://www.elastic.co
  Process: 2929 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-pre-exec (code=exited, status=0/SUCCESS)
 Main PID: 2931 (java)
   CGroup: /system.slice/elasticsearch.service
           └─2931 /bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+Use...

Aug 23 17:13:43 server101 systemd[1]: Starting Elasticsearch...
Aug 23 17:13:43 server101 systemd[1]: Started Elasticsearch.
Aug 23 17:13:46 server101 elasticsearch[2931]: [2016-08-23 17:13:46,721][INF...]
Aug 23 17:13:46 server101 elasticsearch[2931]: [2016-08-23 17:13:46,721][INF....
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,562][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,708][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,709][INF...]
Aug 23 17:13:47 server101 elasticsearch[2931]: [2016-08-23 17:13:47,709][WAR...]
Hint: Some lines were ellipsized, use -l to show in full.
[root@server101 ~]#
Then we need to enable the service so it starts automatically at boot; to do that, execute the command below.
[root@server101 ~]# systemctl enable elasticsearch.service
Created symlink from /etc/systemd/system/multi-user.target.wants/elasticsearch.service to /usr/lib/systemd/system/elasticsearch.service.
[root@server101 ~]#
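To confirm Elasticsearch is answering locally (a quick check, assuming it is listening on its default port 9200), a simple curl request should return the cluster information:
[root@server101 ~]# curl -XGET 'http://localhost:9200/?pretty'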
Kibana:
The config file for Kibana is located at /opt/kibana/config/kibana.yml.
Open the file, find the line that says server.host, and replace the default IP "0.0.0.0" with "localhost". It should look like below.
server.host: "localhost"
Notice the double quotes around localhost.
Save and exit the config file. Since Kibana now serves only on localhost, we need Apache or Nginx to work as a reverse proxy in front of it.
Now start and enable kibana.
[root@server101 ~]# systemctl start kibana
[root@server101 ~]# systemctl enable kibana
Created symlink from /etc/systemd/system/multi-user.target.wants/kibana.service to /usr/lib/systemd/system/kibana.service.
[root@server101 ~]#
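To verify that Kibana is up and listening (a quick check; port 5601 is an assumption based on the Kibana defaults), you can do something like:
[root@server101 ~]# ss -tlnp | grep 5601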
Nginx:
In the previous step we configured Kibana to listen only on localhost, so we need to configure Nginx to act as a reverse proxy so that we can reach Kibana from outside.
Edit the Nginx config file /etc/nginx/nginx.conf and change the server and location blocks to look like below.
server {
    listen       80 default_server;
    listen       [::]:80 default_server;
    server_name  server101;
    # root         /usr/share/nginx/html;

    # Load configuration files for the default server block.
    # include /etc/nginx/default.d/*.conf;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
Of course, I am assuming that you do not have any other server blocks configured in your Nginx. If you do, then you may need to create a separate config file under /etc/nginx/conf.d/.
Now enable and start nginx web server.
[root@server101 ~]# service nginx start
Redirecting to /bin/systemctl start nginx.service
[root@server101 ~]# chkconfig nginx on
Note: Forwarding request to 'systemctl enable nginx.service'.
Created symlink from /etc/systemd/system/multi-user.target.wants/nginx.service to /usr/lib/systemd/system/nginx.service.
[root@server101 ~]#
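It doesn't hurt to validate the Nginx configuration and test the proxy path (a quick sanity check; run on the ELK server itself):
[root@server101 ~]# nginx -t
[root@server101 ~]# curl -I http://localhost/
Once both Nginx and Kibana are up, the second command should return a response served by Kibana through the proxy.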
Now configure logstash as below.
Logstash:
The config file for logstash needs to be created under /etc/logstash/conf.d/
Logstash uses a JSON-like configuration format with three sections: inputs, filters and outputs.
Let's start by creating a config file for the Beats input.
[root@server101 ~]# vim /etc/logstash/conf.d/10-filebeat-input.conf

input {
  beats {
    port => 5044
  }
}
This file specifies the Beats input, which will listen on port 5044.
Next we create a configuration file for the filter.
[root@server101 ~]# vim /etc/logstash/conf.d/14-syslog-filter.conf

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
Save and exit. This filter applies to events that Filebeat has labeled with the syslog type, and it uses grok to parse the syslog messages into structured fields.
Next we need the config file for the Elasticsearch output.
[root@server101 ~]# vim /etc/logstash/conf.d/17-elastic-output.conf

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
Save and exit this output configuration. We have configured Logstash to store the Beats data in Elasticsearch, which we earlier configured to listen on localhost on the default port 9200.
Now we will test the Logstash config with the command below.
[root@server101 ~]# service logstash configtest
Configuration OK
[root@server101 ~]#
It should report that the configuration is OK. If you get an error, it is most likely a syntax problem in one of the config files; correct it before proceeding further.
Now restart and enable logstash.
[root@server101 ~]# systemctl restart logstash
[root@server101 ~]# systemctl enable logstash
logstash.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig logstash on
[root@server101 ~]#
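After a few seconds Logstash should be listening for Beats connections; a quick way to confirm (a sketch, using the input port 5044 from the config above) is:
[root@server101 ~]# ss -tlnp | grep 5044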
We also need to load the Filebeat index pattern; Elastic provides a set of sample Kibana dashboards that include the Beats index patterns.
[root@server101 ~]# wget https://download.elastic.co/beats/dashboards/beats-dashboards-1.1.0.zip
Now unzip the downloaded file, change into the extracted directory, and execute the load.sh shell script. This script will load the sample dashboards, visualizations and Beats index patterns into Elasticsearch.
[root@server101 ~]# cd beats-dashboards-1.1.0
[root@server101 beats-dashboards-1.1.0]# ./load.sh
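To confirm the dashboards and index patterns were loaded (a quick check using the Elasticsearch cat API), list the indices and look for the .kibana and dashboard-related entries:
[root@server101 beats-dashboards-1.1.0]# curl -XGET 'http://localhost:9200/_cat/indices?v'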
Next we will get Filebeat to ship logs into Elasticsearch, so we need the Filebeat index template as well.
[root@server101 ~]# wget https://gist.githubusercontent.com/thisismitch/3429023e8438cc25b86c/raw/d8c479e2a1adcea8b1fe86570e42abab0f10f364/filebeat-index-template.json
Now we need to load this template into Elasticsearch. We can use curl to PUT the JSON template data.
[root@server101 ~]# curl -XPUT 'http://localhost:9200/_template/filebeat?pretty' -d@filebeat-index-template.json
{
  "acknowledged" : true
}
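If you want to double-check that the template was stored (an optional verification step), you can fetch it back:
[root@server101 ~]# curl -XGET 'http://localhost:9200/_template/filebeat?pretty'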
Now the ELK stack is almost ready.
Filebeat:
Next we need to set up Filebeat on the client server. I am doing this on the same system for the purpose of this blog post, but these steps need to be performed on every client server whose logs you want to ship to the ELK stack.
Now edit the Filebeat config file as below.
[root@server101 ~]# vim /etc/filebeat/filebeat.yml
Under the prospectors section we can define the prospector properties. By default Filebeat ships everything under /var/log, but we don't need that in our setup, so comment out that line as below and add paths for the secure and messages logs.
      paths:
        - /var/log/secure
        - /var/log/messages
      #  - /var/log/*.log
Next find the document_type: line, uncomment it, and change the value to syslog as below.
document_type: syslog
Next we need to configure our ELK server's IP, so find the elasticsearch: line under the output section. We are not going to talk to Elasticsearch directly, so make sure to comment out that entire elasticsearch block. Then find and uncomment the logstash section and configure it as below.
  logstash:
    # The Logstash hosts
    hosts: ["ELKserverIP:5044"]
    bulk_max_size: 1024
Now save and exit the file.
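Depending on the Filebeat version, you may be able to validate the file before starting the service (this flag is an assumption based on the 1.x releases, so check the filebeat help output if it errors out):
[root@server101 ~]# filebeat -configtest -c /etc/filebeat/filebeat.yml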
Next, start and enable the Filebeat service.
[root@server101 ~]# systemctl start filebeat
[root@server101 ~]# systemctl enable filebeat
Created symlink from /etc/systemd/system/multi-user.target.wants/filebeat.service to /usr/lib/systemd/system/filebeat.service.
[root@server101 ~]#
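Within a minute or so, Filebeat data should start landing in Elasticsearch. A quick way to check (run on the ELK server, assuming the filebeat-* index naming from the Logstash output above) is:
[root@server101 ~]# curl -XGET 'http://localhost:9200/filebeat-*/_search?pretty'
If the hits total is greater than zero, log events are flowing end to end.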
Now connect to Kibana; it will prompt you to set a default index pattern, and you can select any of the index patterns loaded earlier.
Fun with Hadoop-Part 1
I previously deployed an OpenStack private cloud at my home.
I have completed the series on my company’s official blog at http://pythian.com/blog/author/bhagat/
I have been playing with big data and Hadoop for some time now and decided to make use of my cloud infra, so I deployed a 7-node Hadoop cluster and did some fun stuff with it.
The first problem I faced, however, was computing power.
I had only one compute node and was short on computing power to run the cluster. To overcome this I added 2 additional compute nodes to the cloud infra, but that is a story for another series.
I want to talk about Hadoop, so let's get started with it.
I decided to use Hortonworks Hadoop with Ambari server, as the documentation and resources available to me were mostly for the Hortonworks distribution.
I first played with it on the AWS free tier but quickly realized that it might be a costly option just to play with and learn Hadoop.
That made me decide to deploy my own Hadoop cluster at my own place.
I am writing the follow-up post now, so stay tuned for it.
At my company's request, I have completed the series on the official Pythian blog.
You can find it here once it is published.
OpenStack private cloud deployment – Part 1 Setting up network
For a private cloud there are plenty of open source solutions available.
However, for my setup I used the OpenStack Icehouse release.
As you can see in the diagram, my network consists of:
- An Openfiler NAS storage box, which also serves DNS, DHCP and PXE in my network.
- A CentOS 6.2 KVM hypervisor, which I have been using so far to run sandbox VMs.
- A home workstation PC which runs Fedora 20.
OpenStack has a vast number of configuration options and can be deployed in different variations to suit your needs.
Ideally it is advisable to deploy OpenStack with a minimum of 3 nodes:
- Controller :- as the name suggests, this node runs most of the control services.
- Network :- as the name suggests, this node handles the virtual networking.
- Compute :- this is the hypervisor node which runs your VMs.
However, due to a lack of resources in my home network, I decided to use OpenStack legacy networking, which only requires Controller and Compute nodes.
I converted the Fedora 20 workstation into the controller node and the KVM hypervisor into the compute node.
I will be doing follow-up posts on the configurations and setup.
I have completed the series on Pythian official website. You can find it here.
Why I deployed a private cloud at my home, and how
For a long time I had been thinking about deploying a private cloud at my home.
This was because, as part of my day-to-day job, I come across some "interesting" issues to troubleshoot and fix.
Since we cannot make modifications and changes on a trial-and-error basis on production boxes, we always need to check things out in a sandbox before proceeding to production.
Having a sandbox also helps sharpen your skills and lets you practise the technologies you want to develop.
I already have PXE/TFTP, KVM and NAS storage at my place, which I have been using to build different kinds of scenarios in a virtual environment.
For a long time I wanted to convert the KVM-only virtualization into a private cloud, for two reasons: it gives me my own private cloud, and deploying a scenario even in a VM takes time (installing the OS, cloning and configuring it), whereas in a cloud, once you launch an instance you don't have to worry about installing the OS since it comes up ready to roll on the fly.
I will be doing follow-up posts on how my home network is set up and how I achieved private cloud computing at home.
PXE Boot server on Fedora 15
Recently I planned to upgrade my laptop from Fedora 13 to Fedora 16.
Earlier I used to burn DVDs from the downloaded Fedora ISO images.
This time I decided to set up a PXE boot server on my desktop, which is running Fedora 15.
I found a good post of setting up PXE boot server here.
I used it as a reference and created the PXE server as described, but somehow it didn't work for me as expected.
So I made some modifications to the steps described in the post, since it was written for FC4.
I am listing them here to save you time.
As I was doing this in my home network only, I did not isolate the network further; it was already isolated, with only my desktop and laptop connected via an ADSL router. (The router was also serving as the DHCP server, but I disabled that, since the PXE boot server on my desktop carries its own DHCP server configuration.)
Then I installed the required packages, which are tftp-server, dhcp, syslinux and httpd, via yum.
yum install tftp-server dhcp syslinux httpd -y
Then, as the second step, I configured my DHCP server as below.
In the reference post the dhcpd config path was /etc/dhcpd.conf, which was correct for FC4; on Fedora 15 it has changed to /etc/dhcp/dhcpd.conf.
The contents of the dhcpd.conf file are below.
ddns-update-style interim;

subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.10 192.168.1.254;
    default-lease-time 3600;
    max-lease-time 4800;
    option routers 192.168.1.1;
    option domain-name-servers 192.168.1.1;
    option subnet-mask 255.255.255.0;
    option domain-name "home.local";
    option time-offset -8;
}

host lap0 {
    hardware ethernet 04:4B:EE:80:FF:03;
    fixed-address 192.168.1.254;
    option host-name "lap0";
    next-server 192.168.1.2;
    filename "pxelinux.0";
}
What I did here is set up the DHCP server so that it assigns the IP address 192.168.1.254 to the laptop with MAC address 04:4B:EE:80:FF:03 and points it at the TFTP server (next-server) for the PXE boot file.
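Before moving on, it is worth validating the config and restarting dhcpd (a quick sketch; Fedora 15 uses systemd, so systemctl should work here):
dhcpd -t -cf /etc/dhcp/dhcpd.conf
systemctl restart dhcpd.service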
Now for the third step I did as follows.
Here I configured the TFTP server, which will serve the PXE boot loader and kernel to the PXE-capable laptop NIC for network booting.
The steps are as follows
Open and edit the /etc/xinetd.d/tftp file and make the changes as follows.
Change the line with disable=yes to disable=no
Change the line with server_args = -s /var/lib/tftpboot to server_args = -s /tftpboot
Here we enabled the TFTP service and changed its root directory from /var/lib/tftpboot to /tftpboot.
For some reason this didn't work for me the first time, so I uninstalled tftp-server and reinstalled it, and that fixed the issue.
We are almost done. Now we only need to copy the necessary files and set up Apache and the TFTP server to complete it.
Step four is as follows.
Create the directory /tftpboot/pxelinux.cfg
Now go to the pxelinux.cfg directory and create a file named default.
This is our default boot option file; put the following contents in it.
prompt 1
default Fedora 16 x64 Install
timeout 100

label Fedora 16 x64 Install
    kernel vmlinuz
    append initrd=initrd.img ramdisk_size=9216 noapic acpi=off install=http://192.168.1.2/linux
Copy the file pxelinux.0 from /usr/share/syslinux to /tftpboot
Now we need the vmlinuz and initrd images for booting. For that, mount the ISO image and copy the vmlinuz and initrd.img files from the isolinux directory on the DVD into /tftpboot.
After that I created a Fedora install directory at /tftpboot/fedora-install to hold the installation media contents.
Now we need to make it available via HTTP, so I did as follows.
I created a file /etc/httpd/conf.d/fedora_install.conf and added the following contents.
Alias /linux /tftpboot/fedora-install

<Directory /tftpboot/fedora-install>
    Options Indexes
    AllowOverride None
</Directory>
Now restart Apache and xinetd for the TFTP/PXE boot setup to take effect.
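On Fedora 15 (which uses systemd) that would be something along these lines (a sketch; service names assume the stock httpd and xinetd packages):
systemctl restart httpd.service
systemctl restart xinetd.service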
That is it. All I did then was plug my laptop into the network, turn it on to boot via the network, and voila.
The Fedora 16 installation screen presented itself so I could complete the install.
You can also use the NFS or FTP install methods in a similar way.
That’s it for now. Please post your comments.
Find files using forfiles in Windows
A few days back I had to accomplish yet another task: finding some files in a large data set.
The problem was that the built-in Windows search function was very slow and requires indexing (which we had disabled to save resources on the server). On Linux you can use the find command to locate the files you are looking for, like this:
find /opt/data -name "*.c"
This finds all the files with a .c extension under the /opt/data directory on Linux. Unfortunately I was on Windows and had to improvise, so I remembered the forfiles utility and used it to find files under the D:\data directory with the following syntax.
forfiles /P d:\data /M *.c /S /C "cmd /c echo @path"
I am sharing this syntax in the hope that it helps somebody.
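If you want to keep the results instead of just printing them, redirecting the output to a file should work too (a small variation on the same command; the output path is just an example):
forfiles /P d:\data /M *.c /S /C "cmd /c echo @path" > c:\found_c_files.txt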
Please comment with your views so that I can also learn a trick or two.
Migrating data with robocopy
A few days back I had a situation where I needed to migrate data from one Windows 2003 server to another.
The data contained 2 levels of directories with thousands of files in them.
The data also needed to be copied with its ACLs intact.
So I wrote a script on the source server, as below, to speed up the copy.
D:
cd D:\data
For /D %%r in (*) do (
    For /D %%a in (*) do (
        cd %%a
        start robocopy.exe D:\data\%%r\%%a \\destination\data\%%r\%%a *.* /e /s /w:3 /r:3 /XX /XO /SEC /LOG:c:\robocopy_%%r_%%a.log
    )
)
I created a log file per directory to retain the list of directories and files that were copied.
What the script does is enumerate all the directories in D:\data, and then enumerate the directories under each of those.
Then it uses the robocopy utility from the Windows Server resource kit, launched with the start command, so that each directory copies at the same time.
If you don’t put start in front of robocopy it will still copy, but it will copy one directory at a time.
I hope this solution helps other people who may need it.
Please comment with suggestions for improvement, or with the solution you would use to complete this kind of task.