Our own ELK, or the story of how we “reinvented the wheel” for gathering statistics and analyzing a project.

Hello! My name is Ivan Melnichuk, and I’m the head of the programming department at Nexteum. In this article, I’ll tell you about an interesting case. For the business, it was a breath of fresh air and, at the same time, an ambitious goal and an opportunity to scale up. For the development team, it was a real challenge: within a few months we had to launch 8 business directions on one online platform, and each of them had to sell, attract new customers and work smoothly.

What were the main challenges?

First of all, to please the customer. In 2017 the business decided to actively scale the online store. Our goal was to create 8 trading directions on one online platform. That meant a rapid expansion of the assortment, more content and a sharp rise in traffic.

Since our web platforms are high-load online stores, the main task was to keep the website stable under high traffic and to keep page response times short.

At short notice, we needed not only to increase the number of catalogues on the one platform but also to solve the analytics and statistics problem, because Google Analytics hides some data and sometimes gives only 5% of the reliable information.

What were the problems and why?

With the active expansion of an online platform, there comes a moment when everything fails at once: the database goes down, the cache breaks, and the website becomes unstable or slow. The reasons are pretty obvious. Our website had become a multifunctional catalogue with a much heavier load, and its life had changed.

The culprits of the problems were:

  • a huge amount of content;
  • growth of organic traffic;
  • omnipresent bots.

How did we solve the problem?

In two days we designed and configured our own ELK stack (Elasticsearch, Logstash, Kibana), which gave us access to reliable statistics and became an excellent tool for comprehensive tracking of the status of the project.

Visually, it looks like this:

You can plot anything you want on this graph. But the most important indicators for me were:

  • the difference from the previous week;
  • percentiles of response time;
  • the average number of requests;
  • the number of cache connections;
  • cache timing.
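
To make these indicators concrete, here is a minimal Python sketch of how the percentiles, the average and the week-over-week difference can be calculated from raw samples. It is only an illustration, not the way Kibana computes them internally: the function names, the sample response times and the weekly request counts are all made up for the example. Cache timings could be aggregated in the same way.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def week_over_week_diff(current, previous):
    """Relative change versus the previous week, in percent."""
    if previous == 0:
        raise ValueError("previous week has no data")
    return (current - previous) / previous * 100


# Hypothetical response times in seconds (not real project data).
response_times = [0.12, 0.10, 0.35, 0.09, 1.80, 0.15, 0.11, 0.22, 0.14, 0.13]

p50 = percentile(response_times, 50)   # median response time
p95 = percentile(response_times, 95)   # dominated by the slowest request
avg = mean(response_times)

# Hypothetical request counts for the current and the previous week.
diff = week_over_week_diff(current=12500, previous=10000)

print(f"p50={p50}s p95={p95}s avg={avg:.3f}s diff={diff:+.1f}%")
```

The example also shows why percentiles matter more than the average here: a single slow request of 1.80 s pulls the average far above the median, while the 95th percentile points directly at it.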

This is how the flowchart of our ELK looks:

Once all the data is collected in syslog-ng, it is quickly parsed by Logstash. As soon as the data appears in Elasticsearch, Kibana displays it on the graph.
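
The parsing step can be illustrated with a short Python sketch. In our pipeline this work is done by Logstash, so the code below is only a stand-in that shows the idea: a raw log line is turned into a structured document with typed fields, which is what makes percentiles and averages possible later. The key=value line format and the field names (ts, host, status, request_time, upstream_time, uri) are assumptions made for the example, not our actual log format.

```python
import json

# Fields that should become numbers so they can be aggregated later.
# The field names are hypothetical.
NUMERIC_FIELDS = {"status": int, "request_time": float, "upstream_time": float}


def parse_line(line):
    """Turn a 'key=value key=value ...' log line into a dict with typed fields."""
    doc = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip tokens that are not key=value pairs
        convert = NUMERIC_FIELDS.get(key, str)
        try:
            doc[key] = convert(value)
        except ValueError:
            doc[key] = None  # a placeholder such as "-" instead of a number
    return doc


# Hypothetical line as it might arrive from the web server through syslog.
line = (
    "ts=2017-06-01T12:00:00+03:00 host=shop.example.com status=200 "
    "request_time=0.123 upstream_time=- uri=/catalog/phones"
)

doc = parse_line(line)
print(json.dumps(doc))  # a JSON document of this kind is what gets indexed
```

Converting the timings to numbers at this stage is the important design choice: if they stayed as strings, the graphs could count requests but could not calculate response-time percentiles.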

Pros and additional features of this solution

This system is very easy to implement and quick to roll out (as I said, we needed only two days to launch it). It is also flexible thanks to the Apache and Nginx log variables, and it can log third-party services as well.

Our ELK gives access to many useful features. With its help, you can log timings of:

  • memcached;
  • response to requests;
  • connection to the database.
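
As an illustration of how such timings can be collected on the application side, here is a minimal Python sketch. It is not our production code: the timed helper, the operation names and the sleep calls that stand in for real cache and database calls are all hypothetical. The idea is simply to measure each operation and write the result as a key=value log line that the rest of the pipeline can parse.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("timings")


@contextmanager
def timed(operation, sink=None):
    """Measure the wrapped block and log its duration in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("op=%s duration=%.6f", operation, elapsed)
        if sink is not None:
            sink.append((operation, elapsed))  # keep a copy for inspection


timings = []

with timed("memcached_get", sink=timings):
    time.sleep(0.01)  # stand-in for a real cache request

with timed("db_connect", sink=timings):
    time.sleep(0.01)  # stand-in for opening a database connection
```

To send these lines to syslog instead of the console, a logging.handlers.SysLogHandler from the Python standard library could be attached to the logger; the address to use depends on the server setup.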

 

Instead of an epilogue:

Our ELK has made everyone happy: the customer, who received reliable statistics; the users, who get uninterrupted service; our team leaders, who can keep the situation under control; and, of course, our SEO department, which got answers to all its questions about traffic.