June 9, 2021 – Datanami.com

If your CFO is in a state of shock over a recent AWS bill, she is not alone. According to a recent survey by Anodot, 30% of data and analytics professionals reported nearly a 50% increase in their monthly cloud bills over a six-month period last year, and one out of five saw their cloud bills double. Better monitoring and alerting to sudden spikes, Anodot says, is a potential solution.

At a macro level, COVID-19 and the work-from-home mandate clearly were big drivers in cloud adoption in 2020. Gartner says cloud spending in 2021 is set to rise 18% from last year, hitting nearly $305 billion. As a percentage of total IT spending, cloud is set to grab a 14% share, up from 9% last year.

While the big-picture migration of data and applications to the cloud is becoming crystal clear, that storyline doesn’t reflect the difficult circumstances that many individual organizations are facing as they adopt new cloud services and move existing IT processes into public cloud. Getting a better handle on the challenges the cloud poses to these companies is what drove Anodot in March to survey nearly 110 data and analytics professionals about their organizations’ cloud spending patterns.

The results, which Anodot shared exclusively with Datanami before they were set to be released today, show that a large number of organizations are facing rapidly growing cloud costs, including large and sudden spikes, and that they’re struggling to get a handle on this.

About 55% of respondents to Anodot’s survey say they have been “surprised” by cloud costs or had an incident where cloud costs suddenly spiked. About two-thirds said cloud costs were a consideration in day-to-day business activities, while slightly more than 50% said they were able to remediate cloud usage and cost issues in real time.

When spikes in cloud usage occur, only about 20% said they were able to detect them immediately, while about 25% said they could detect them in a few hours. About 35% said it takes them “a few days” to detect cloud usage spikes, while another 20% said it takes a few weeks. Fewer than 10% indicated it takes months to detect the spikes.

(Chart not shown. Source: Anodot cloud survey)

“2020 was a really rough year with cloud costs skyrocketing, and it’s just going to continue,” an Anodot spokesperson said. “Sometimes, the costs are due to mistakes and glitches. Sometimes they’re just using a lot of cloud applications. They’re working from home, and they need access to corporate data.”

The migration to the cloud is in full swing, but many organizations are struggling with the move. According to Anodot’s survey, only 10% of respondents reported having an “extremely smooth transition” to the cloud and about 25% had a “relatively smooth transition,” whereas 30% said they had a “challenging transition.” Around 40% said “it went OK.”

Anodot develops a machine learning environment that is geared toward monitoring business metrics. The offering, which specializes in automatically detecting anomalies in time-series data, has been used extensively in the telecommunications and financial services industries. With the recent spike in cloud costs, the company is pivoting its focus a bit to address that unanticipated concern.

Misconfigured in the Cloud

Count Kenshoo, an Israel-based provider of campaign management software and services, among the companies struggling to get a handle on cloud costs. According to Danny Zalkind, Kenshoo’s DevOps group manager, the company is very concerned about losing control of its costs as it migrates more applications to AWS.

“Once you go to the cloud, it’s very easy to lose track,” Zalkind told Datanami. “Some of it is slow leaks and some of it could be human mistakes, misconfiguration. You could easily, in a couple of days or a week, pay large amounts of money for a simple misconfiguration.”

Kenshoo recently had an incident where one of its employees selected the wrong EC2 instance, and it cost the company nearly $40,000 over the course of a couple of days, Zalkind said. At other times, internal users have left cloud-based GPUs spinning after work on them has stopped.

“We use GPU-based instances for all kinds of machine learning and data labeling processes, so that can be very costly if you start using those expensive GPU instances and you don’t turn it off on time or once you finish,” Zalkind added. “We’ve had some experience with that.”

(Chart not shown: AWS shows steady revenue growth. Source: Statista)

The SaaS company runs its data analytics and data science workloads in the cloud, but it’s planning to migrate other parts of its core campaign management application, including the Web serving and database configuration management components, to AWS too. That has caused Zalkind and his colleagues in DevOps to focus more attention on addressing the cloud spikes and the unanticipated costs that are associated with them.

Before the cloud bill started to grow, the company was spending around $150,000 to $200,000 per month on the cloud, Zalkind said. But several times in recent months, that bill has risen to $500,000, which Zalkind attributes to a mix of cloud instance misconfigurations as well as natural customer-driven growth.

Overall, Kenshoo’s cloud bill is growing around 10% to 15% per month. That figure reflects the unanticipated spikes as well as natural growth in its business; there was also an acquisition, according to Zalkind, who indicated the monthly cloud spending should be about $300,000.

“We’re constantly monitoring usage because it can very easily just go week by week, and just increase slowly,” Zalkind said. “It’s very easy to not notice and lose track, and one day you look at the bill at the end of the month and you ask yourself, how did that happen?”

Kenshoo recently started using Anodot’s machine learning tool to monitor its cloud usage. The company is relying on Anodot’s anomaly detection for time-series data to remove the natural seasonality from its cloud workloads and detect when something has actually gone awry.

“Before we were actively managing or monitoring and reacting to alerts, we used to grow up to 15% per month in certain services or total cost,” Zalkind said. “Now it’s very rare to see a misconfiguration that lasts more than a day. And we’re also now pretty stable on the cost. We’re proud of that.”

An AI-based approach to monitoring and alerting for cloud costs is superior to static, BI-based methods because it is more adaptable, according to Anodot. Kenshoo found that the static approach didn’t work, either.

“The main thing is, more accuracy,” Zalkind said. “The static threshold method creates a lot of false positives. There’s really no other way to monitor those types of seasonal datatypes without looking at seasonality and without breaking it down to multiple dimensions. Pretty much at a certain size, it’s not possible without that.”
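To make Zalkind’s point concrete, here is a minimal Python sketch comparing a fixed threshold with a simple seasonality-aware check. It is an illustration only and is not Anodot’s algorithm: the hourly cost figures are synthetic, and the function names, the 24-hour period and the tuning parameters `k` and `min_history` are all assumptions chosen for the example.

```python
from statistics import median

PERIOD = 24  # assumed seasonality: one cycle per day, hourly samples


def static_alerts(costs, threshold):
    """Flag every index whose cost exceeds one fixed threshold."""
    return [i for i, cost in enumerate(costs) if cost > threshold]


def seasonal_alerts(costs, period, k=5.0, min_history=3):
    """Flag points far above the median of the same slot in earlier cycles.

    Hypothetical helper: k and min_history are illustrative knobs.
    """
    alerts = []
    for i, cost in enumerate(costs):
        # Values observed at the same hour of day on previous days.
        history = costs[i % period:i:period]
        if len(history) < min_history:
            continue  # not enough history to judge this slot yet
        baseline = median(history)
        mad = median(abs(h - baseline) for h in history)
        # Floor the spread so a perfectly flat history still tolerates noise.
        spread = max(mad, 0.05 * baseline)
        if cost - baseline > k * spread:
            alerts.append(i)
    return alerts


def hourly_profile(hour):
    """Synthetic daily pattern: busy from 09:00 to 17:00, quiet otherwise."""
    return 100.0 if 9 <= hour < 17 else 40.0


# Seven days of synthetic hourly spend with one overnight anomaly.
costs = [hourly_profile(h % PERIOD) for h in range(7 * PERIOD)]
spike_at = 5 * PERIOD + 3      # day 5, 03:00
costs[spike_at] = 90.0         # abnormal for the night, below the daytime peak

static_hits = static_alerts(costs, threshold=95.0)
seasonal_hits = seasonal_alerts(costs, PERIOD)

print(len(static_hits), spike_at in static_hits)
print(seasonal_hits)
```

On this synthetic data, the fixed threshold of 95 fires on every normal business hour and still misses the overnight spike, because the spike stays below the daytime peak. The seasonal check compares each hour only with the same hour on earlier days, so it flags the spike alone. A production system would also have to handle trend, multiple overlapping seasonalities and per-dimension breakdowns, which this sketch leaves out.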

To get a copy of Anodot’s survey, go to www.anodot.com/blog/cloud-cost-survey.
