It was another morning in the office. I went over the daily rides report and noticed a slight decline in the total number of rides from our clients in Russia. As CTO of GetTaxi (also known as Gett), a mobile service that allows users to order taxis in a single click, going over reports like this was a daily part of my job.
I immediately started to investigate why the decline occurred. Knowing that collecting business activity data was critical to the success of any web service, I needed to ensure that this data was accessible in our reports and dashboards.
Real Time Data; Delayed Diagnosis
The first step was to determine whether the dip in the report was significant and, if so, where and why it occurred. After several hours of drilling into the different metrics related to rides in Russia, we found that the dip was indeed significant: the number of rides had dropped because a subset of users did not receive the SMS messages they needed to validate their rides. It took several more hours to trace the missing text messages to one of the SMS providers, and several more to fix the problem. The whole issue took 48 hours to resolve and resulted in a real loss of revenue.
This kind of challenge was not unique. I noticed that we typically received our business insights at a latency of at least 12-24 hours, and needed another 24 hours or so to understand and react to each insight. This prompted me to start looking for a solution that could provide real-time insights that were much simpler to investigate. After all, we collected all the data in real time. If required, we could graph any of this data with our existing metric/dashboard infrastructure, if only we knew which data to look at.
Collect, Visualize and Track All Metrics on a Large Scale
There were plenty of solutions for collecting, visualizing and generating reports on our data, both open source and commercial, but none of them solved my problem – we still needed to know what questions to ask in order to gain insights about all aspects of our service.
The solutions that looked the most promising implemented automated machine learning, and more specifically, anomaly detection. Open source tools such as Etsy’s Skyline and Oculus, coupled with Graphite, seemed to be what we needed. However, implementing them would require at least one data scientist and additional development effort. Even then, it wasn’t clear that they would solve our problem.
We needed a product that would do it all for us: collect, visualize, and track all metrics on a large scale. The solution we sought would also need to alert us with automatically prioritized insights, thereby minimizing the potential flood of false alarms and low-priority issues.
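To make the idea of anomaly detection with prioritized alerts concrete, here is a minimal sketch in Python. It is only an illustration under simple assumptions, not the method used by Skyline, Oculus, or any other product mentioned here: it scores each point by how many standard deviations it sits from the mean of a trailing window, then sorts the flagged points so the most extreme come first. The function name `find_anomalies`, the `window` and `threshold` parameters, and the hourly ride counts are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(series, window=24, threshold=3.0):
    """Flag points that deviate strongly from the trailing window.

    Returns (index, value, score) tuples, highest score first, where
    score is the distance from the window mean in standard deviations.
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat window gives no scale to compare against.
            continue
        score = abs(series[i] - mu) / sigma
        if score >= threshold:
            anomalies.append((i, series[i], score))
    # Prioritize: the most extreme deviations are reported first.
    return sorted(anomalies, key=lambda a: a[2], reverse=True)


# Hypothetical hourly ride counts: 24 ordinary hours, then a sudden dip.
rides = [100, 104, 98, 101, 97, 103, 99, 102] * 3 + [60]
alerts = find_anomalies(rides)
for index, value, score in alerts:
    print(f"hour {index}: {value} rides (score {score:.1f})")
```

A fixed threshold on a single metric like this is exactly the kind of rule that still requires knowing which data to look at; the appeal of an automated service is applying such checks to every metric without hand-tuning each one.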
When Operations Meets Innovation
That’s when I met my partners: Ira Cohen and Shay Lang. Ira was a chief data scientist in HP’s software business, and Shay was an old friend and an R&D manager at a security software company. We realized that by combining our expertise (my operational experience as CTO of several companies, Ira’s experience in inventing and applying machine learning algorithms on time series data, and Shay’s R&D skills), we could create the service that I saw as so critical to our market’s needs. And that’s how Anodot was born.
More Than 200 Users and Growing
In our 14 months of existence, we have already created a metrics service platform that tracks and visualizes metrics, detects issues and insights affecting our users’ businesses, and alerts more than 200 users to them. These businesses include ad-tech companies, web services, e-commerce, and even industrial Internet of Things (IoT) companies.
And that’s just the beginning…