Anodot Resources Page 48


Blog Post 3 min read

Closing the Loop on Anomalies, Alerts and Dashboards

Team Anodot is always busy working on new features and capabilities for our users. Our most recent version upgrade rolled out yesterday and we've already received great feedback. So what's all the fuss? We just closed the loop between your metrics, anomalies, alerts and dashboards!

Almost every BI and visualization tool provides a dashboard... it's a familiar and logical way to keep track of the metrics you're interested in. Our newest version upgrade takes the dashboard concept to the next level. By showing anomaly alerts directly in your dashboard tiles, we're making it even easier to uncover and access business insights in real time. Not only will you receive traditional email/JSON/webhook alerts on anomalies in the data streams that interest you, you'll now also see these alerts in the context of the relevant dashboards.

Get Started

So how does it work? You've created a dashboard with graphs and meters... now click the settings icon in the upper right corner of a tile to display the options. Clicking "Create Alert" tells the system that you are interested in receiving alerts whenever any of the metrics in the tile are anomalous. Once you've created the alert, a small bell outline icon will appear in the top left corner of the tile (see image below). From now on, if the alert bell is completely black, it means that anomalies occurred within the time frame you're looking at. This is in addition to the regular alert notification you would receive, but may have missed.

Anomalies Can Hide in Plain Sight!

The alert bell will appear even if the anomalies on the dashboard are not obvious to the human eye. In this example, the alert notification icon clearly shows that anomalies occurred in the selected data, but from a quick glance at the dashboard, it is not possible to actually SEE the anomalies.

Drill Down to Investigate Root Cause

To investigate further, you can easily see the full list of alert notifications on the right-hand side. Click each notification to drill down into the Anomap page, where you'll find information about the individual anomalies that were alerted on, along with correlated events for other metrics that may not have been displayed on the dashboard. In the example below, the anomalies that were not obvious in the high-level view are easy to understand when you look more closely at the individual alerts and correlations: an increase in Payment API Failures caused the Revenue metrics to decrease.

For full documentation, visit our Support entry, where you'll find detailed information about creating and editing alerts as well as viewing dashboard tile alert events. Got ideas for new features you'd love to see? Drop us an email at [email protected] and let us know. We'd love to hear from you.
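If you consume the JSON/webhook alerts mentioned above programmatically, handling them can be as simple as the sketch below. The payload field names here ("metric", "significance") are assumptions for illustration, not Anodot's documented alert schema:

```python
import json

# Hypothetical webhook payload handling. The field names "metric" and
# "significance" are illustrative assumptions, not Anodot's actual schema.
def handle_alert(raw_payload: str) -> str:
    alert = json.loads(raw_payload)
    metric = alert.get("metric", "unknown")
    score = float(alert.get("significance", 0))
    if score >= 0.8:
        return f"page on-call: anomaly on {metric} (significance {score})"
    return f"logged: anomaly on {metric} (significance {score})"

example = '{"metric": "payment_api.failures", "significance": 0.92}'
print(handle_alert(example))  # page on-call: anomaly on payment_api.failures ...
```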
Videos & Podcasts 11 min read

Rich Galan of Rubicon Project: The Need for Real-Time Anomaly Detection

Rich Galan of Rubicon Project presents the need for real-time anomaly detection at Innovation Enterprise CTO Conference.
Blog Post 4 min read

Website down? How to Track the Impact on Your Bottom Line

Costs of Unavailability

Availability is one of the key measurements for every company with an online presence. Customer expectations are constantly increasing: today, users expect access to the service at any time, from any device. They expect nothing less than 100% availability. Measuring availability is a difficult but critical task. No matter how difficult it is, I strongly advise that you take the time to define what availability means for your business and start tracking it. The following table will help you understand the effect of different availability service level agreements (SLAs) in terms of potential downtime:

            99.9%          99.95%         99.99%
Daily       1m 26.4s       43.2s          8.6s
Weekly      10m 4.8s       5m 2.4s        1m 0.5s
Monthly     43m 49.7s      21m 54.9s      4m 23.0s
Yearly      8h 45m 57.0s   4h 22m 58.8s   52m 35.7s

Below, I share some of the potential impacts of unavailability. The emphasis you put on each of these factors will depend on the service being offered and your own circumstances.

Lost Revenue

If you conduct business over the internet, every minute of downtime translates directly into lost revenue. There are different ways to calculate lost revenue:

- Determine how much revenue you make per hour, and use this as the cost to the enterprise of unavailability per hour or minute. For example, in this article, Google's cost of downtime was calculated at $108,000 per minute based on its Q2 2013 revenue of $14.1 billion. In another article, Facebook's downtime cost was calculated at $22,453 per minute. This is the simple method, but it is not very accurate, since revenue varies by time of day, day of week, etc.
- Consider seasonality and recovered revenue, comparing actual behavior week over week against the expected behavior from the previous week. This is a more accurate method. In the following example, we see a significant drop in transaction volume for about 10 minutes. Let's assume that revenue dropped by $110,000, and once the service was restored, users retried and completed their transactions, resulting in an increase of $80,000. We can then calculate the real impact as recovered revenue minus lost revenue: $80,000 - $110,000 = -$30,000 for those 10 minutes of downtime.

Contractual Penalties

Some organizations face financial penalties in the event of downtime. If your partners rely on your service being available, there is probably an SLA in place to guarantee certain availability. If this is not met, the provider must compensate the partner.

Negative Brand Impact

Almost every online service, and certainly every mature service, has competition: Uber vs. Lyft, Airbnb vs. VRBO, hotels.com vs. booking.com, and so on. If one service is not available, it is very easy for customers to switch to the competition. Customers today expect the service to be available all the time.

In a previous post, we discussed the different elements of an incident life cycle. Major incidents are detected very easily, even with very basic monitoring in place. The real challenge is getting to the root cause of the issue and fixing it quickly. Even if you have the right set of signals across the entire technology stack, including infrastructure, application and business metrics, the data most likely resides in silos. Because of this, the person who triages the issue doesn't have complete visibility, so different teams must investigate the root cause simultaneously. Adopting machine learning-based anomaly detection enables the processing of all relevant metrics in a single system.
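Returning for a moment to the availability and lost-revenue discussion above: both calculations are easy to automate. Here is a minimal sketch (my own illustration, not Anodot code) that turns an SLA percentage into a downtime budget and computes the real impact as recovered revenue minus lost revenue:

```python
# Minimal sketch (not Anodot code) of the two calculations described above.
PERIOD_SECONDS = {
    "daily": 86_400,
    "weekly": 7 * 86_400,
    "monthly": 365.25 * 86_400 / 12,  # average month; reproduces the table above to within rounding
    "yearly": 365.25 * 86_400,
}

def allowed_downtime_seconds(sla_percent: float, period: str) -> float:
    """Downtime budget implied by an SLA, e.g. 99.9% daily -> ~86.4 seconds (1m 26.4s)."""
    return PERIOD_SECONDS[period] * (1 - sla_percent / 100.0)

def net_revenue_impact(lost: float, recovered: float) -> float:
    """Real impact of an outage: recovered revenue minus lost revenue."""
    return recovered - lost

print(allowed_downtime_seconds(99.95, "weekly"))           # ~302.4s -> 5m 2.4s
print(net_revenue_impact(lost=110_000, recovered=80_000))  # -30000
```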
With all relevant metrics processed in a single system, if an anomaly is detected in one of them, it is easier to correlate it with all the other metrics and uncover the root cause much faster. In fact, a good anomaly detection system not only detects the issue faster and more accurately than traditional threshold-based alerts, it also correlates across all relevant metrics and provides visibility into all other related anomalies.

Let's look at an example of a drop in volume of a specific product in a specific country. In this case, the system sends an alert that an anomaly was detected on conversion rates in that country and provides visibility into signals that may have caused the issue, such as:

- Events that happened at the same time, like a code push
- Another anomaly that occurred at the same time on the DB metrics
- Network metrics that might indicate a DDoS attack

The idea is that along with an anomaly alert, we also receive other correlated events and anomalies that help us get to the root cause much faster, shortening the time it takes to triage the issue and thus reducing the impact on the business.
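To make that correlation step concrete, here is a toy sketch (illustrative only, not Anodot's implementation) that joins an alerted anomaly with events and other anomalies that started within a short time window; the metric and event names are made up:

```python
from datetime import datetime, timedelta

# Illustrative only: correlate an alerted anomaly with events and other
# anomalies that started within +/- 15 minutes of it, to speed up triage.
WINDOW = timedelta(minutes=15)

def correlated(anomaly_start: datetime, candidates: list) -> list:
    """Return candidates whose start time falls within WINDOW of the anomaly."""
    return [c for c in candidates if abs(c["start"] - anomaly_start) <= WINDOW]

alert_start = datetime(2017, 3, 1, 12, 10)
candidates = [
    {"name": "code push",             "start": datetime(2017, 3, 1, 12, 5)},
    {"name": "DB latency anomaly",    "start": datetime(2017, 3, 1, 12, 8)},
    {"name": "network traffic spike", "start": datetime(2017, 3, 1, 9, 0)},
]
print(correlated(alert_start, candidates))  # code push and DB latency anomaly
```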
Videos & Podcasts 30 min read

Disrupt the Static Nature of BI with Predictive Anomaly Detection

Anodot's Uri Moaz discusses how predictive anomaly detection can identify revenue-impacting business incidents in minutes (!), not days or weeks.
Documents 1 min read

PART 3: Ultimate Guide to Anomaly Detection - Correlating Abnormal Behavior

Part 3 of our Ultimate Guide to Anomaly Detection explains the process of identifying, ranking and correlating abnormal behavior in time series. Read on to find out.
Blog Post 2 min read

The More Things Change...The More You Need Anomaly Detection

In my last post, I talked about the importance of considering timeliness and scale in the design of an anomaly detection system. In this post, I will discuss the impact of a company's rate of change on detecting anomalies.

Online Business: A Constantly Changing Ecosystem

A slow rate of change, as pictured below, is normally seen in closed systems, where outside events have no effect and any changes take place slowly. In this example, we see that over the course of a week, the metrics remain relatively stable. This is typical, for example, of automated manufacturing processes. For slow-changing processes, a system can be trained on a year's worth of data to learn its normal behavior, and the resulting model may not have to be updated for a long time.

In contrast, online businesses are constantly making changes to improve and increase revenue and keep up with the demands of their audience. Whether they release new products or new versions of applications, their environment changes rapidly. In the example below, a set of metrics displays a sudden, rapid and drastic change, which is typical after new releases.

Because of this, dynamic online businesses need an anomaly detection system that adapts to changes efficiently and effectively. To achieve this, it is vital that the system have adaptive algorithms, enabling it to keep collecting data and adapting what it considers normal to the needs of the business. In other words, the more things change... the more you need an adaptive anomaly detection solution.

For more information on how anomaly detection is pertinent to your business, and how these systems are designed, see our white paper: Building a Large Scale, Machine Learning-Based Anomaly Detection System, Part 1: Design Principles.
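To make "adaptive" concrete, here is a toy sketch (my own illustration, not Anodot's algorithm) of a baseline that keeps updating as data arrives, so a persistent shift, like the one after a release, gradually becomes the new normal:

```python
# Toy adaptive baseline: exponentially weighted mean/variance that keeps
# updating, so "normal" follows a fast-changing metric instead of staying
# frozen at training time. Illustrative only.
class AdaptiveBaseline:
    def __init__(self, alpha: float = 0.05, threshold: float = 3.0):
        self.alpha = alpha          # higher alpha = faster adaptation
        self.threshold = threshold  # flag if |x - mean| > threshold * std
        self.mean = None
        self.var = 0.0

    def update(self, x: float) -> bool:
        """Ingest one sample; return True if it deviates from the current baseline."""
        if self.mean is None:
            self.mean = x
            return False
        deviation = x - self.mean
        anomalous = self.var > 0 and abs(deviation) > self.threshold * self.var ** 0.5
        # Adapt regardless, so a persistent shift (e.g. after a release) becomes normal.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return anomalous

baseline = AdaptiveBaseline()
for value in [100, 102, 99, 101, 100, 180, 181, 179, 182]:
    # The jump to ~180 is flagged at first, then gradually absorbed as the new normal.
    print(value, baseline.update(value))
```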
Blog Post 5 min read

Who owns the “I” in BI?

[Image: © Kate07lyn / Wikimedia Commons / CC-BY-SA-3.0 / GFDL]

It is well known that when it comes to gaining insights from your BI system, the more granular the data you have, the more accurate the insights you will gain. While most existing BI solutions can process and store a huge amount of data with many dimensions, they don't offer an easy way to get insights from that data. In fact, BI solutions have left the "I" - the intelligence - completely in the hands and minds of the data analysts.

The human brain is limited to processing no more than a few dozen signals, which is why you typically find organizations looking at the big picture and possibly missing issues that impact a specific segment or product, because the root cause gets lost in the average. You've probably seen dashboards like this one, with multiple KPIs showing sales figures, customer satisfaction score, churn, etc. These kinds of dashboards (often called executive dashboards) are designed to provide high-level visibility into different business metrics for senior management. Often, one of the KPIs will show a negative value and the data analysts will be tasked with providing explanations. It is up to the investigation capabilities of the data analyst to ask the right questions and get to the right answers that explain the shift in the KPI. This task is very time consuming and frequently feels like finding a needle in a haystack.

[Image: Missing Hugh II by Little Miss no Name]

I was always a big fan of detective stories, so it's not surprising that my favorite characters are Sherlock Holmes and Doctor Gregory House. For those of you who haven't seen the show, a typical episode of "House" begins with someone getting really sick and rushed to the hospital, where he or she is referred to the diagnostic department headed by Dr. House. This is when the fun begins... House and his team review the different symptoms, make a call on the most probable cause (it's not Lupus!) and start treatment. When the symptoms worsen, they look for more clues by searching the person's home for toxic substances, finding out the family history, running more tests and trying a new treatment, which usually fails. Eventually something unrelated triggers House to correlate the different signals with the huge amount of data he stores in his brain, a light bulb goes on, and he finds the real root cause and is able to order the proper treatment that saves the person's life. House is probably the best example of human correlation and anomaly detection, which makes him an anomaly himself: an unconventional and misanthropic genius. Imagine how the world of medicine could benefit from automating House's diagnostic capabilities... well, with machine learning and automated anomaly detection, this day isn't really so far off anymore.

[Image: PayPal's command center in San Jose, California, by Kristen Fortier]

Let's take a real example from PayPal's command center, which relies heavily on multiple dashboards powered by its homegrown monitoring solution, "Sherlock", to detect major site incidents early. The Technical Duty Officers (TDOs, a.k.a. the "diagnosticians") constantly scan a large wall with seven big HD screens and dozens of signals to identify abnormal behavior.
The PayPal monitoring system collects more than 350 thousand signals per second; the high-value signals are displayed on the wall, while the rest go to a simple alert mechanism. You can read more about this here. In the case of a significant drop in volume or spike in error rate, the TDO on duty will try to figure out whether this is a real issue by correlating the different signals displayed on the screens. In fact, they are applying human correlation to determine whether the different signals are indicative of a real issue or not. This is a time-consuming process that depends on the capabilities of the person on duty, and occasionally, when the signals are not that clear or strong, it can lead to a miss. This is an example of a BI solution based on visualizations that require humans to supply the intelligence and make a decision based on multiple signals.

With the explosion in the complexity of the environment, as organizations move into the cloud (public and private) and design software as microservices deployed in containers, we are experiencing a surge in the number of different signals that are collected and need analysis to yield insights. We are seeing more and more organizations face the challenge of getting real-time business insights from the enormous amount of data they collect. Data analysts simply can't keep up with the increasing demand to crunch all the data to find ways for the business to improve its KPIs. This reality is driving the new paradigm of "monitoring by exception", which surfaces the most relevant anomalies so the business analysts can investigate them further. The human brain is limited in the number of data points it can process and correlate, and this is exactly where the next wave of BI solutions comes in. With highly scalable machine learning-based algorithms, we now have software that can learn the normal pattern of any number of data points and correlate different signals to accurately identify anomalies that require action or investigation.
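To illustrate what "monitoring by exception" might look like in code, here is a rough sketch (my own, not PayPal's "Sherlock" or Anodot's product) that scores each metric's deviation from its baseline and surfaces only the most significant ones; the metric names and numbers are made up:

```python
# Rough sketch of monitoring by exception: instead of asking humans to watch
# every signal, rank metrics by how far they deviate from their baseline and
# surface only the top offenders. Illustrative only.
def surface_exceptions(metrics: dict, top_n: int = 5) -> list:
    """metrics maps name -> {"value", "mean", "std"}; returns top_n by |z-score|."""
    scored = []
    for name, m in metrics.items():
        z = abs(m["value"] - m["mean"]) / (m["std"] or 1e-9)
        scored.append((name, round(z, 1)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

signals = {
    "payments.volume":   {"value": 310.0, "mean": 500.0, "std": 20.0},   # big drop
    "api.error_rate":    {"value": 0.09,  "mean": 0.01,  "std": 0.005},  # big spike
    "logins.per_minute": {"value": 1010,  "mean": 1000,  "std": 50},     # normal
}
print(surface_exceptions(signals, top_n=2))  # error-rate spike and payment drop first
```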
Documents 1 min read

Ultimate Guide to Building a Machine Learning Anomaly Detection System, Part 2 - Learning Normal Time Series Behavior

Part 2 of our Ultimate Guide to Anomaly Detection presents a general framework for learning normal behavior for time series. Can a seasonal pattern be assumed? What is the importance of modeling seasonality? Why does real-time detection at scale require online adaptive learning algorithms? Read on to find out.
Blog Post 4 min read

Two Secrets of Swift and Scalable Anomaly Detection

Streamlining the way your digital business works – ensuring customers get what they need, conversions occur without a glitch, and so on – is the goal of any company with an online presence, but it can be challenging with so many moving parts. As we discussed in a previous post, detecting anomalies is one of the best ways to make sure that everything is running smoothly and to keep up with current trends.

In digital businesses, many processes happen simultaneously, and each activity may be monitored by a different person or team. Changes in one department, or even at an external partner, can show up as an unexpected change in a totally different area, but the association might never be made if the metrics are not analyzed holistically. The solution? An anomaly detection system that can understand all these different types of metrics, identify normal behavior and alert when something has changed.

When designing an anomaly detection system, certain design principles are essential to its success. This post gives an overview of two of those secrets to success: timeliness and scale. In future posts, we'll take a look at the other three key principles: Rate of Change, Conciseness and Definition of Incidents.

How quickly do you need your anomalies detected?

In anomaly detection, there are two types of decision making. First, detection can be done in non-real time, meaning that the results are seen by the user retroactively. In this case, the anomalies are used for a retrospective analysis of what happened, which helps in making decisions about the future. The other option is real-time detection, where you see the results for your metrics as they happen.

When would you want non-real-time decision making? This model is useful for long-term planning, where the data is not relevant to the immediate situation of the company and is not needed for immediate action. Examples include reviewing data from marketing campaigns to plan future strategy, scheduled maintenance, budget planning, and so on. In this situation, data is collected over a period of time, and when that period finishes, a batch machine learning algorithm can be used to find out what anomalies occurred during that time. Viewing these results in non-real time lets your business see the outcome of a longer course of action and make non-urgent decisions about future action.

However, most online businesses are in dire need of real-time decision making. For example, sudden spikes or dips in purchases could present opportunities for action that would generate more sales. Knowing exactly what is going on with your digital business at the moment it is happening enables you to take advantage of real-time trends in furtherance of your business goals. Online machine learning algorithms are the best way to process data in real time. Using these algorithms also helps with our next point.

Scaling for Growth

Online machine learning algorithms are easily scalable, making them ideal for large data sets. They are not without their faults, however: they tend to be more prone to false positives. Still, if your company is continuously growing, scalability is a valid concern, so online machine learning algorithms remain the best option for businesses with many metrics and large data sets.
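To illustrate the difference, here is a small sketch (my own, not Anodot's algorithms): a batch detector that analyzes a completed period after the fact, and an online detector that decides on each point as it arrives using running statistics. The online version scales to streams and answers immediately, but with little history it is more prone to false positives:

```python
import statistics

def batch_anomalies(series: list, k: float = 2.5) -> list:
    """Non-real-time: flag points of a completed period retrospectively."""
    mean = statistics.mean(series)
    std = statistics.pstdev(series) or 1e-9
    return [i for i, x in enumerate(series) if abs(x - mean) > k * std]

def online_anomalies(stream, k: float = 2.5, warmup: int = 5):
    """Real-time: flag each point as it arrives (Welford's running mean/variance)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        if n >= warmup:  # with too few points the running std is unreliable (false positives)
            std = (m2 / n) ** 0.5 or 1e-9
            if abs(x - mean) > k * std:
                yield x  # the decision is available immediately
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)

data = [10, 11, 9, 10, 12, 10, 45, 11, 10]
print(batch_anomalies(data))         # [6] -> index of the spike (45), known only after the period ends
print(list(online_anomalies(data)))  # [45] -> flagged the moment it arrived
```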
There are ways to reduce false positives, which we discuss in our white paper, "Building a Large Scale, Machine Learning-Based Anomaly Detection System, Part 1: Design Principles."

Conclusion

Online machine learning algorithms are a viable solution to the needs of businesses in the digital age. As we've explained, real-time decision making and the ability to scale are two of the secrets of building a successful online machine learning anomaly detection system. For more information about the design principles of an anomaly detection system, read the full white paper: Building a Large Scale, Machine Learning-Based Anomaly Detection System, Part 1: Design Principles.