It is well known that when it comes to gaining insights from your BI system, the more granular your data, the more accurate the insights you will gain. While most existing BI solutions can process and store a huge amount of data with many dimensions, they don’t offer an easy way to get insights from that data. In fact, BI solutions leave the “I” (the intelligence) completely in the hands and minds of the data analysts. The human brain is limited to processing no more than a few dozen signals, which is why you typically find organizations looking at the big picture and possibly missing issues that impact a specific segment or product because the root cause gets lost in the average.
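To make the "lost in the average" point concrete, here is a minimal sketch in Python. The segment names and order counts are made up purely for illustration and do not come from any real system; the point is only that a small change in the total can hide a severe change in one segment.

```python
# Hypothetical weekly order counts per segment (illustrative numbers only).
last_week = {"desktop": 5000, "mobile_ios": 3000, "mobile_android": 1500, "tablet": 500}
this_week = {"desktop": 5100, "mobile_ios": 3050, "mobile_android": 1520, "tablet": 200}


def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# The aggregate KPI an executive dashboard would typically show.
overall_change = pct_change(sum(last_week.values()), sum(this_week.values()))

# The same KPI broken down by segment.
segment_change = {seg: pct_change(last_week[seg], this_week[seg]) for seg in last_week}

# The segment with the largest drop.
worst_segment = min(segment_change, key=segment_change.get)

print(f"overall: {overall_change:+.1f}%")
for seg, change in segment_change.items():
    print(f"{seg}: {change:+.1f}%")
print(f"worst segment: {worst_segment}")
```

In this made-up data the total moves by only about one percent, while one small segment loses more than half of its volume. Anyone watching only the top-level number would have little reason to investigate.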
You’ve probably seen dashboards like this one with multiple KPIs to show sales figures, customer satisfaction score, churn, etc. These kinds of dashboards (often called executive dashboards) are designed to provide high-level visibility of different business metrics for senior management. Often, one of the KPIs will show a negative value and the data analysts will be tasked with providing explanations. It is up to the investigative skills of the data analyst to ask the right questions and get to the answers that explain the shift in the KPI. This task is very time-consuming and frequently feels like finding a needle in a haystack.
I have always been a big fan of detective stories, so it’s not surprising that my favorite characters are Sherlock Holmes and Doctor Gregory House. For those of you who haven’t seen the show, a typical episode of “House” begins with someone getting really sick and being rushed to the hospital, where he or she is referred to the diagnostic department headed by Dr. House. This is when the fun begins… House and his team review the different symptoms, make a call on the most probable cause (It’s not Lupus!) and start treatment. When the symptoms worsen, they look for more clues by searching the person’s home for toxic substances, finding out the family history, running more tests and trying a new treatment, which usually fails. Eventually something unrelated triggers House to correlate the different signals with the huge amount of data he stores in his brain, a light bulb goes on, and he finds the real root cause and is able to order the proper treatment, which saves the person’s life. House is probably the best example of human correlation and anomaly detection, making him an anomaly that stands out as an unconventional and misanthropic genius. Imagine how the world of medicine could benefit from automating House’s diagnostic capabilities…well, with machine learning and automated anomaly detection, this day isn’t really so far off anymore.
Let’s take a real example from PayPal’s command center, which relies heavily on multiple dashboards powered by its homegrown monitoring solution named “Sherlock” to detect major site incidents early. The Technical Duty Officers (TDOs, AKA the “diagnosticians”) constantly scan the large wall with seven big HD screens and dozens of signals to identify abnormal behavior. The PayPal monitoring system collects more than 350 thousand signals per second, and the high-value signals are displayed on the wall while the rest go to a simple alert mechanism. You can read more about this here.
In case of a significant drop in volume or a spike in error rate, the TDO on duty will try to figure out if this is a real issue by correlating different signals displayed on the screens. In effect, they are applying human correlation to determine whether the different signals are indicative of a real issue. This is a time-consuming process that depends on the capabilities of the person on duty, and occasionally, when the signals are not that clear or strong, it can lead to a miss.
This is an example of a BI solution based on visualizations that require humans to have the intelligence to make a decision based on multiple signals. With the explosion in complexity of the environment, the move to the cloud (public and private) and software designed as microservices deployed in containers, we are experiencing a surge in the number of different signals that are collected and need analysis to gain insights.
We are seeing more and more organizations face the challenge of getting real-time business insights from the enormous amount of data they collect. Data analysts simply can’t keep up with the increasing demand to crunch all the data to find ways for the business to improve its KPIs. This reality is driving the new paradigm of “monitoring by exception” that surfaces the most relevant anomalies so the business analysts can investigate them further.
The human brain is limited in the number of data points it can process and correlate and this is exactly where the next wave of BI solutions comes in handy. With highly scalable machine learning-based algorithms, we now have software that can learn the normal pattern of any number of data points and correlate different signals to accurately identify anomalies that require action or investigation.
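As a rough illustration of what "learning the normal pattern" of a signal can mean, here is a minimal Python sketch. It is not the algorithm used by Sherlock or by any particular BI product; it swaps in a deliberately simple technique, a trailing-window z-score on a single signal, and the function name, window size, threshold and sample values are all assumptions chosen for the example. It also does not attempt the cross-signal correlation described above.

```python
from statistics import mean, stdev


def find_anomalies(values, window=12, threshold=3.0):
    """Return indices of points that deviate from the trailing-window mean
    by more than `threshold` standard deviations.

    `window` and `threshold` are illustrative defaults, not tuned values.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]   # the "normal" baseline seen so far
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat baseline: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical transactions-per-minute signal with one sudden drop at index 14.
signal = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 102, 100, 98, 40, 101, 100]

print(find_anomalies(signal))
```

A real system would also need to handle seasonality and trends, and to relate anomalies across many signals, which this sketch ignores. The appeal of the approach is that the same check can run on every collected signal, so people only need to look at the handful that are flagged.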