How to set up and configure Elasticsearch, Logstash and Kibana on various distributions


Background

This is the first part of a series of tutorials on how to install, configure and set up Elasticsearch, Logstash and Kibana on Debian Jessie using the VPSie SSD VPS service.
Elastic is the company behind the three open-source projects – Elasticsearch, Logstash, and Kibana – which are designed to take data from any source and search, analyze, and visualize it in real time. From stock quotes to Twitter streams, Apache logs to WordPress blogs, these products extend what is possible with data, delivering on the promise that good things come from connecting the dots.

Expectations

I am assuming that you already have a VPS service and that you know how to deploy a VPS. For this tutorial series I will be using VPSie as the service and will also refer to the VPSes as vpsies.
This tutorial will not cover the installation of Debian Jessie on the vpsie; for that you can check their blog. With all of that clarified, I am also assuming that you know how to SSH into your VPS, which means your VPS needs fully working networking configured.

ELASTICSEARCH

First we need to install the dependencies. Elasticsearch requires Java to be installed on your system before you install Elasticsearch itself. On Debian/Ubuntu you will need to do the following:

 


#echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" > /etc/apt/sources.list.d/webupd8team-java.list 
#echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" >>  /etc/apt/sources.list.d/webupd8team-java.list
#apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EEA14886 
#apt-get update 
#apt-get install -y oracle-java8-installer
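
Once the installer finishes you can quickly confirm that Java is available (the exact version string will vary):

#java -version

It should report a Java 1.8 runtime.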

On CentOS/Fedora you can run the following command:

#yum install -y http://download.oracle.com/otn-pub/java/jdk/8u65-b17/jdk-8u65-linux-x64.rpm

Use these commands to add the apt key and the apt repository to the VPSie server on Debian/Ubuntu:

#wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
#echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-2.x.list

Then update the apt package lists so the new repository is read, and install Elasticsearch.

#apt-get update
#apt-get install -y elasticsearch

On CentOS/Fedora:

#rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

Create a file called elasticsearch.repo in the folder /etc/yum.repos.d/ with the content:

[elasticsearch-2.x] 
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos 
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1

Install Elasticsearch by running:

#yum install elasticsearch

Let’s edit the Elasticsearch configuration file and add a cluster name.

Note: If the cluster name is not specified, the default name will be used.

Note: If you want separate Elasticsearch clusters for different services, it is highly recommended to use separate cluster names, because otherwise the nodes will automatically join any existing cluster that uses the default name.

Now let’s configure the Elasticsearch cluster name: edit /etc/elasticsearch/elasticsearch.yml and set a value for cluster.name, as in this example:

cluster.name: vpsie.io

You can also set the node.name variable; otherwise Elasticsearch will assign the node a random name from a list of superheroes. To do so, set the node.name variable in the same configuration file:

node.name: "Superman"

There are plenty of other variables that can be set, but we will only cover the basic configuration. Save your configuration file, then set Elasticsearch to start automatically at boot time and start the service.

#systemctl enable elasticsearch
#systemctl start elasticsearch

If Elasticsearch has started properly, running netstat -ntl will show it listening on two different ports: 9200 and 9300. Port 9200 is the one you will use to query data from your Elasticsearch cluster.
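
A quick way to confirm that the cluster answers on port 9200 is to query it with curl; the JSON responses include the node name, the cluster name we just configured and the cluster health:

#curl http://localhost:9200/
#curl 'http://localhost:9200/_cluster/health?pretty'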

It might also be useful to install Elasticsearch plugins such as head and bigdesk.

#cd /usr/share/elasticsearch
#bin/plugin install mobz/elasticsearch-head

Restart Elasticsearch for the changes to take effect:

#systemctl restart elasticsearch

You can access the plugins by browsing to your Elasticsearch IP on port 9200, for example http://<elasticsearch_ip>:9200/_plugin/head for the head plugin; the general form is:

http://<elasticsearch_ip>:9200/_plugin/plugin_name

For head you will get a page something like this:

[Screenshot: elasticsearch-head overview]

For bigdesk, the node view:

[Screenshot: bigdesk node view]

and the cluster view:

[Screenshot: bigdesk cluster view]

LOGSTASH

We need to add the repository to install Logstash. On Debian/Ubuntu:

#echo "deb http://packages.elastic.co/logstash/2.1/debian stable main" | sudo tee -a /etc/apt/sources.list

Let’s update the package lists and install Logstash:

#apt-get update
#apt-get install -y logstash

On CentOS/Fedora systems, create the repository file by running:

#cat > /etc/yum.repos.d/logstash.repo << EOF
[logstash-2.1] 
name=Logstash repository for 2.1.x packages 
baseurl=http://packages.elastic.co/logstash/2.1/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF
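
With the repository file in place, install Logstash:

#yum install -y logstash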

The Logstash configuration file consists of three main parts. The first is the input, where you define the sources of the information you want to load into Elasticsearch.

The second part holds the filters and codecs, where you define the filters you want to run on the information you have loaded, and the third is the output, where you configure where the loaded and parsed information should be sent.

Since we will be configuring Logstash to read information from a remote server, we will use an input plugin called lumberjack.

Let’s create an SSL certificate using the following commands (on Debian the /etc/pki/tls directories may not exist yet, hence the mkdir):

#mkdir -p /etc/pki/tls/certs /etc/pki/tls/private
#cd /etc/pki/tls
#openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout private/vpsie.io.key -out certs/vpsie.io.crt -subj /CN=*.vpsie.io

Note: You can use a full domain name, an IP address or a wildcard domain as the CN. Importantly, if you use an IP address as the CN, it has to be the IP address of the VPS running Logstash. If your Logstash is behind NAT (Network Address Translation), I strongly suggest using a full domain or a wildcard domain, because otherwise the connections will be dropped when the certificate does not match. In this tutorial we will make Elasticsearch store the logs from the nginx access log.
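
If you want to double-check the certificate that was generated, openssl can print its subject and validity period. Keep in mind that openssl req -x509 issues a certificate valid for only 30 days by default, so you may want to add something like -days 3650 to the command above:

#openssl x509 -in /etc/pki/tls/certs/vpsie.io.crt -noout -subject -dates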

Logstash reads its configuration from files located in /etc/logstash/conf.d/.

We will name our file logstash.conf to keep confusion to a minimum.

input {
    lumberjack {
            port => 5000
            type => "logs"
            ssl_certificate => "/etc/pki/tls/certs/vpsie.io.crt"
            ssl_key => "/etc/pki/tls/private/vpsie.io.key"
    }
}
filter {
    if [type] == "nginx-access" {
            grok {
                    match => { 'message' => '%{IPORHOST:clientip} %{NGUSER:indent} %{NGUSER:agent} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{URIPATHPARAM:request}(?: HTTP/%{NUMBER:httpversion})?|)\" %{NUMBER:answer} (?:%{NUMBER:byte}|-) (?:\"(?:%{URI:referrer}|-))\" (?:%{QS:referree}) %{QS:agent}' }
            }
      }
}
output {
    stdout {
            codec => rubydebug
      }
    elasticsearch {
            hosts => ["127.0.0.1:9200"]
            flush_size => 2000
      }
}

There are nginx grok patterns to be found on the internet, but the information in the logs can differ depending on how nginx is configured, so you can use grok to create your own patterns that match the log files you are generating.
There are two great tools for creating grok patterns, or for checking whether a grok pattern will work with your log files. The first one is called Grok Debugger and the second one is the Grok Incremental Constructor, which is great for building your own patterns incrementally.
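
One thing to watch out for: %{NGUSER} used in the filter above is not one of Logstash’s built-in core patterns, so if grok complains about an unknown pattern you can define it yourself in a custom patterns file and point grok at it. A minimal sketch (the patterns directory and the NGUSER regex here are just an example):

#mkdir -p /etc/logstash/patterns
#echo 'NGUSER [a-zA-Z\.\@\-\+_%]+' > /etc/logstash/patterns/nginx

Then add patterns_dir => ["/etc/logstash/patterns"] inside the grok block so Logstash knows where to find it.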

Looking at the logstash.conf

The input section uses the lumberjack plugin, which contains:

port – the port logstash-forwarder will use to connect to Logstash
type – the type assigned to the information sent to Logstash
ssl_certificate – the certificate generated for the connection to Logstash
ssl_key – the key for the certificate generated for the connection to Logstash

In the filter section we check whether the type is nginx-access; if it is, we apply the grok pattern to the log line.

In the output section we first use the rubydebug codec on stdout, which acts like a debug log, and we also send the results to Elasticsearch on localhost.
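
Before starting the service it is worth testing the configuration syntax. With the 2.x packages the Logstash binary normally lives under /opt/logstash (adjust the path if yours differs):

#/opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/logstash.conf

If everything is fine it should report that the configuration is OK.
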
When your Logstash is configured properly, enable it to start at boot and start it up.

#systemctl enable logstash
#systemctl start logstash

Once Logstash has started, running netstat -ntlp will show that it is listening on port 5000.

[Screenshot: netstat output showing Logstash listening on port 5000]

KIBANA

Now that we have Logstash and Elasticsearch up and running, it is time to install Kibana.
We need to download Kibana to our server using the following command:

#wget https://download.elastic.co/kibana/kibana/kibana-4.1.1-linux-x64.tar.gz

Once Kibana is downloaded we need to extract it to /opt:

#tar -xzvf kibana-4.1.1-linux-x64.tar.gz -C /opt

Now we need to rename the folder and set up an init script so we can run Kibana as a service:

#mv /opt/kibana-* /opt/kibana
#wget -O /etc/init.d/kibana4 https://gist.githubusercontent.com/thisismitch/8b15ac909aed214ad04a/raw/bce61d85643c2dcdfbc2728c55a41dab444dca20/kibana4
#chmod +x /etc/init.d/kibana4
#sed -i '/^NAME/d' /etc/init.d/kibana4
#sed -i '/^KIBANA_BIN/a NAME=kibana4' /etc/init.d/kibana4

Now that we have everything in place, we need to edit the Kibana configuration file and point it at the IP address where Elasticsearch is listening. If you are running Kibana on the same server as Elasticsearch, you don’t need to change anything. The configuration file is located at /opt/kibana/config/kibana.yml; edit the following line, replacing localhost with your Elasticsearch IP address:

elasticsearch_url: "http://localhost:9200"
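
For example, if Elasticsearch is running on a separate vpsie with the (hypothetical) address 10.0.0.5, the line would become:

elasticsearch_url: "http://10.0.0.5:9200"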

Let’s enable the startup script and start the kibana4 service:

#update-rc.d kibana4 defaults
#service kibana4 start

You will be able to access Kibana by browsing to the following address:

http://{your IP address}:5601

Next, let’s install logstash-forwarder on the server where nginx is running and set it up to send the logs to Logstash.

LOGSTASH-FORWARDER

Connect to your remote server and download the certificate we generated (vpsie.io.crt) from the Logstash server, placing it in /etc/pki/tls/certs:

#scp root@{logstash server IP}:/etc/pki/tls/certs/vpsie.io.crt /etc/pki/tls/certs/

Download the logstash-forwarder package matching the distribution you are using. Since we assume the nginx server is running Debian, I will be downloading the deb file; for CentOS download the RPM file.

For Debian run the following commands:

#wget https://download.elastic.co/logstash-forwarder/binaries/logstash-forwarder_0.4.0_amd64.deb
#dpkg -i logstash-forwarder_0.4.0_amd64.deb

For CentOS run the following commands:

#wget https://download.elastic.co/logstash-forwarder/binaries/logstash-forwarder-0.4.0-1.x86_64.rpm
#yum install -y logstash-forwarder-0.4.0-1.x86_64.rpm

or you can install it directly with:

#yum install -y https://download.elastic.co/logstash-forwarder/binaries/logstash-forwarder-0.4.0-1.x86_64.rpm

The configuration file is located at /etc/logstash-forwarder.conf.
Now it’s time to set it up to read the nginx log files and send them to Logstash.

{
  "network": {
    "servers": [ "elk.vpsie.io:5000" ],
    "ssl ca": "/etc/pki/tls/certs/vpsie.io.crt",
    "timeout": 15
  },
  "files": [
    {
       "paths": [ "/var/log/nginx/access.log" ],
      "fields": { "type": "nginx-access" }
    }
  ]
}
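
Before starting the forwarder you can check from the nginx server that the lumberjack port on the Logstash server is reachable and that the certificate is accepted (elk.vpsie.io is the hostname used in this example):

#openssl s_client -connect elk.vpsie.io:5000 -CAfile /etc/pki/tls/certs/vpsie.io.crt

A successful connection should end with Verify return code: 0 (ok).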

As I mentioned in the Logstash section, I created the certificate with a wildcard domain CN because the ELK stack is behind NAT and cannot be accessed directly by its own IP address from the server on which nginx is running.

Now I have started logstash-forwarder with nohup:

#nohup /opt/logstash-forwarder/bin/logstash-forwarder -c "/etc/logstash-forwarder.conf" -spool-size=100 -t=true &

and set up a crontab entry to run it at startup.

#crontab -e
@reboot nohup /opt/logstash-forwarder/bin/logstash-forwarder -c "/etc/logstash-forwarder.conf" -spool-size=100 -t=true &
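
Once the forwarder is shipping events, Logstash writes them to daily logstash-YYYY.MM.DD indices (the default index name of the elasticsearch output), which you can check from the ELK server with:

#curl 'http://localhost:9200/_cat/indices?v'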

Now that logstash-forwarder is set up to read access.log and send the log lines to Logstash, it’s time to set up Kibana to make graphs out of those logs.

You can start browsing Kibana at http://IP_address:5601

[Screenshot: Kibana index pattern setup]

When configuring the index pattern (the default logstash-* works for this setup), make sure @timestamp is selected as the Time-field name and click Create; Kibana will then use the logs indexed in Elasticsearch.
After some time you will see the logs appear in the Discover menu:

[Screenshot: Kibana Discover view]

This covers the basic configuration of the ELK stack. I will be adding more advanced configurations in the next few days, such as GeoIP mapping and how to create visualizations and dashboards in Kibana. Soon I will also show you how to make the ELK stack load MySQL general logs.
