Anodot CEO and Co-Founder David Drai joined Amazon Web Services and Xandr to discuss the shift to machine learning-based anomaly detection in business monitoring. Xandr Chief Technology Officer Ben John shared how their advertising marketplace is using the Anodot platform to cut detection time from “up to a week to less than a day.”
Watch the Webinar
You can watch the webinar at the link above or read on for the highlights of that talk.
Xandr’s Story
Xandr, formerly AppNexus, was launched as AT&T’s advertising company in September 2018. It’s a massive-scale marketplace – “the Uber of advertising” – that connects the demand side to the supply side in the advertising ecosystem. Xandr built an exchange to monetize content and uses its platform to help marketers reach the right audience. The platform that brings the two sides together is complex and dynamic, and it has to stay precisely balanced despite the scale of its operations.
“We handle billions of transactions and serve billions of ads every single day. Just for you to understand the scale, we handle something like 45 million transactions per second,” John said. “Our systems process them all – 175+ terabytes of data – and these platforms make a lot of complex business decisions to reach the right consumers for the market. It is very complex, it is very large scale, so we were looking for a machine learning-based autonomous monitor that could help run our platform and, most importantly, detect, identify, and resolve anomalies.”
Before Anodot, John’s team was using quite a few monitoring solutions, but for the problems that they were looking to solve, the fragmented tool set created more complexity and confusion.
“We had to install these agents and run hundreds if not thousands of servers and applications across our global data centers,” John said. “When a business-critical incident happened, people had to look at the logs, at some of the monitors, and at alerts. They would try to correlate it all to understand the business incident or business impact. That is really hard, and every minute we were losing revenue, and also our customers were losing the revenue, so time is of the essence.”
Why Xandr Chose Anodot
John said he needed an automated solution that could scale to meet Xandr’s needs, and yet could detect anomalies happening for a single customer in a single region of a global business. He also wanted a cloud-based solution so he wouldn’t have to manage the infrastructure. Another requirement was for the solution to work straight out of the box, with all the data integration already built in. Anodot met the criteria, and then some.
“Anodot is a machine learning-based system that reads the data, collects the correlation, and generates the feedback, so we were able to identify the business metrics that we wanted to solve. We were able to push that into Anodot and we got better results,” John said. “You want Anodot to learn your business, learn your metrics, learn the behaviors. What day of the week is the traffic? What day of the month is peak? What hour of the day is different than the normal day? Things like that, the system will learn and optimize by itself. We didn’t need to do much work there.”
John also stated, “The more data you identify, integrate, and push into Anodot, and then have the system learn and optimize and get better results, the better it is for the business and for the customers.”
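To make the idea of a system that learns "what day of the week" and "what hour of the day" looks normal more concrete, here is a minimal sketch of a seasonal baseline. It is not Anodot's algorithm, which the webinar does not describe; it is a deliberately simple stand-in that groups past values by weekday and hour and flags values that fall far from the average for that slot. The function names, the threshold of 3 standard deviations, and the synthetic traffic numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, stdev


def build_baseline(history):
    """Group past (timestamp, value) pairs by (weekday, hour) and
    return the mean and sample standard deviation for each slot."""
    slots = defaultdict(list)
    for ts, value in history:
        slots[(ts.weekday(), ts.hour)].append(value)
    return {
        slot: (mean(values), stdev(values))
        for slot, values in slots.items()
        if len(values) >= 2  # stdev needs at least two samples
    }


def is_anomalous(baseline, ts, value, threshold=3.0):
    """Return True when value is more than `threshold` standard
    deviations away from the mean of its (weekday, hour) slot."""
    slot = (ts.weekday(), ts.hour)
    if slot not in baseline:
        return False  # no history for this slot, so no verdict
    mu, sigma = baseline[slot]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Synthetic history (hypothetical numbers): four weeks of hourly traffic,
# busy on weekdays and quiet on weekends, drifting slightly week to week.
start = datetime(2024, 1, 1)  # a Monday
history = []
for h in range(24 * 7 * 4):
    ts = start + timedelta(hours=h)
    base = 1000 if ts.weekday() < 5 else 400
    drift = (h // (24 * 7)) * 5  # 0, 5, 10, 15 across the four weeks
    history.append((ts, base + drift))

baseline = build_baseline(history)

monday = datetime(2024, 1, 29, 10)   # a Monday morning
saturday = datetime(2024, 2, 3, 10)  # a Saturday morning
print(is_anomalous(baseline, monday, 1008))   # False: typical weekday traffic
print(is_anomalous(baseline, monday, 400))    # True: far too low for a Monday
print(is_anomalous(baseline, saturday, 400))  # False: normal for a weekend
```

The same value, 400, is flagged on a Monday but not on a Saturday, which is the point of learning the pattern per time slot instead of setting one static threshold.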
He emphasized the importance of defining specific use cases for the Anodot system. For his business, that included revenue monitoring:
“Somewhere in Europe, in some country, a specific region is serving blank ads. If you go to a website on a mobile application and you see a blank spot, that is not good. It’s not a good customer experience, not good monetization for the customers, as well as Xandr is losing revenue. We should be able to identify if a campaign is serving ads, and there are blanks. We should be able to capture that so we set specific use cases like that.”
Lessons Learned
As the Xandr team began working with Anodot’s real-time business monitoring and anomaly detection, they found that having the following elements in place helped maximize results:
- Clear ownership: A Directly Responsible Individual (DRI) on both the tech side and the business side is critical for success
- Defined use cases: Business and customer-specific use cases are critical
- Measurement criteria: Define KPIs and outcomes
- Ongoing feedback: Incorporate a feedback loop into the Product Development Lifecycle
- Change/Release management: Capture any change/release and correlate with alerts
- Integration: Possible integration with customer systems
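As one illustration of the change/release management point in the list above, the sketch below pairs each alert with any release deployed shortly before it. It is a hypothetical example, not something shown in the webinar: the release names, metric names, timestamps, and the two-hour lookback window are all invented for illustration.

```python
from datetime import datetime, timedelta


def releases_before_alert(alert_time, releases, window=timedelta(hours=2)):
    """Return the names of releases deployed within `window` before the alert."""
    return [
        name
        for deployed_at, name in releases
        if timedelta(0) <= alert_time - deployed_at <= window
    ]


# Hypothetical release log: (deployment time, release name)
releases = [
    (datetime(2024, 3, 5, 9, 0), "bidder-v2.1"),
    (datetime(2024, 3, 5, 13, 30), "ad-server-v7.4"),
]

# Hypothetical alerts: (alert time, metric name)
alerts = [
    (datetime(2024, 3, 5, 14, 10), "blank_ad_rate"),
    (datetime(2024, 3, 5, 20, 0), "revenue_per_region"),
]

suspects = {metric: releases_before_alert(ts, releases) for ts, metric in alerts}
print(suspects)
# {'blank_ad_rate': ['ad-server-v7.4'], 'revenue_per_region': []}
```

An alert with a recent release attached gives the on-call person a first place to look; an alert with none points toward causes outside the release process, such as a partner integration.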
“We learned how important it is to have clear ownership on both the tech side as well as the business side,” advised John. “Somebody on the technology side or the engineering side should be able to say, ‘Let me do the integration job, work with the Anodot folks, collect the data from our business and our internal data systems, and push the team.’ That person should wake up and think about this every single day. The same is needed on the business side.”
John stressed the importance of using metrics to evaluate the success of using Anodot. “I have certain questions I ask my team when it’s time for renewal of the Anodot license,” John said. “How many incidents did we catch? How much revenue were we able to save? What percent of the customers are now happy? How would our business function if we didn’t have Anodot? Questions like these build an open conversation for me with my teams every time we talk about Anodot.”
Continuous optimization of use cases is important, as are KPIs and outcomes.
“This is critical. You’ve got to make sure anything that the system learned about the root cause of a problem gets fixed,” John said. “Go and fix the product fundamentally so those issues don’t show up again. Anodot will start creating a different baseline based on the new learnings and based on the new data it collects. You can see the difference between the changes of the performance of your business and platform before the release and after the release.”
He also sees potential for Anodot to work directly with his customers, saving him time and money. “An exciting idea that David Drai and I talked about is integration with customers’ systems. I don’t need to be in the middle. If a customer configures something, uses my platform for something, and they’re expecting or seeing a different result, Anodot can send a notification directly to the customer with the data, then they can configure. I cut multiple layers, save a ton of money on my resources and timing. The customers can also do that by themselves.”
Impact: “From Up to a Week to Less Than a Day”
John shared the results of what they have achieved using Anodot. “We reduced the time to detection of root causes from up to a week to less than a day. The complexity of our platform makes manual detection incredibly difficult,” he said. “Before Anodot, it could take up to a week because our platform integrates with so many partners. Now, this data helps us find so many incidents within a few hours or within a day, compared to multiple days and weeks.”
Anodot caught events that resulted in savings of thousands of dollars per event. “Each campaign going through the Xandr platform configures hundreds of thousands, if not millions of ads, and if things go wrong, it can have a significant financial impact. We were able to save lots of money for both Xandr and our customers,” John said.
Concluding his presentation, John said, “The benefits are definitely worth the investment that we put in and the continuous commitment that we have with Anodot. It’s a great product and platform that is definitely working for our scale and for our needs. We are not done yet. There are new ideas that we are exploring and we know there’s more work to be done.”
Learn More
We covered much more in the webinar that may help you determine whether your business has the infrastructure in place to benefit from automated business monitoring and anomaly detection. You can watch the webinar here or reach out with any questions you may have by clicking here.