Blog Post | 6 min read

AWS Bedrock Pricing: Your 2025 Guide to Amazon Bedrock Costs

by Perry Tapiero

The future is AI. That’s a fact, and all the major cloud corporations are taking notice and investing in generative AI offerings to serve their customers better.

Microsoft Azure has invested in OpenAI‘s ChatGPT, Google has Vertex AI, and Amazon has created Bedrock.

But what exactly is AWS Bedrock? And, most importantly, how much will it cost? Will this generative AI be an easy investment, or will you have to break the budget to squeeze it in?

What is AWS Bedrock?

AWS Bedrock Pricing: Your 2025 Guide to Amazon Bedrock Costs

Source: Amazon

To get started, let’s define some basic terms. AWS Bedrock is a fully managed service designed for developers to create generative AI applications.

Since AWS Bedrock comes with foundation models sourced from other leading AI companies like Anthropic, Meta, and Cohere, developers don’t have to worry about the learning process for your new generative AI application. Depending on your chosen foundation model, you can access different tools and infrastructure to build new models.

And no, you don’t need to worry about data security. AWS Bedrock prides itself on its integrity and user privacy; all data is kept 100% secure 100% of the time.

How does Amazon Bedrock work?

AWS Bedrock works by supplying dev teams with a wide range of foundation models to create generative AI tools that can generate anything from images to code to copy and more.

All you need to do to get started is:

Pick the foundation model that best fits your need (ex: if you’re looking to build a custom model to train bank customer onboarding, you’ll probably want to go with an Anthropic model like Claude 3 Sonnet or Titan).
Send an API request to ensure your data has been sent to the model
Ensure the model has received your input.
Wait for your model to generate your next blog copy, line of JavaScript, or branding image.

Yes, it’s that easy!

AWS Bedrock pricing

The harder part is AWS Bedrock pricing.

Like most cloud providers, AWS Bedrock offers two pricing models: on-demand, a pay-as-you-go structure, and provisioned throughput, a pricing structure where you pay for a one-month or six-month commitment for a set amount of services.

We’ll go into more detail about these different structures below, but before we get into that, know that the following are all factors that might negatively influence your bill:

Foundation model provider: Different FMs charge different amounts for input/output tokens. For example, image FMs typically cost a bit more.

Input and output token usage: Every time you send a request into an FM, you’ll be charged an input token, and everytime you receive that output in the form of copy, an image, or a chatbot conversation, you’ll be charged an ouput token.

Model usage and customization: Model customization prices are based on the number of processed tokens and overall model storage.

On-demand

On-demand prices look the same as your typical pay-as-you-go plan: you’re paying more for the freedom to stop anytime. Your prices will vary depending on your model of choice and the content you’re trying to generate.

The following typically costs more per token:

Text generation
Image generation
Embedding models

With that said, here’s a breakdown of how much you should expect to pay for common models and services based on 2024 U.S. prices:

Model	Price per 1000 Input Tokens	Price per 1000 Output Tokens
Claude 3.5 Sonnet	$0	$0.02
Claude 2.0	$0	$0.02
Command R+	$0	$0.02
Embed – English	$0	n/a
Jamba-Instruct	$0	$0.00
Jurassic-2 Ultra	$0.02	$0.02
Llama 3 70B	$0	$0.00
Mistral 7B	$0.00	$0.00
Titan Text Lite	$0.00	$0.00

Here’s how much you should expect to pay for image-generation AI assets:

Model	Image resolution	Cost Per Image – Standard Quality (<51 steps)	Cost Per Image – Premium Quality (>51 steps)
SDXL 0.8 (Stable Diffusion)	512X512 or smaller	$0.02	$0.04
SDXL 0.8 (Stable Diffusion)	Larger than 512X512	$0.04	$0
SDXL 1.0 (Stable Diffusion)	1024X1024 or smaller	$0.04	$0
Titan Image Generator (Standard)	512X512	$0.01	$0
Titan Image Generator (Standard)	1024X1024	$0.01	$0
Titan Image Generator (Custom Models)	512X512	$0.02	$0.02
Titan Image Generator (Custom Models)	1024X1024	$0.02	$0

Provisioned Throughput

Buying via provisioned throughput is the best option if you can commit to a one—or six-month period.

There are options to buy certain model units and specific throughput amounts. These will be measured based on the maximum number of input/out tokens for each minute and charged on an hourly basis. This plan is best for those with consistent workloads and who can commit to longer periods of use.

Here’s roughly what your costs should look like:

Model	Price per Hour per Model Unit With No Commitment (Max One Custom Model Unit Inference)	Price per Hour per Model Unit With a One Month Commitment (Includes Inference)	Price per Hour per Model Unit With a Six Month Commitment (Includes Inference)
Claude 2.0/2.1	$70	$63.00	$35.00
Command	$50	$39.60	$24
Command – Light	$9	$6.85	$4
Embed – English	$7	$6.76	$6
Embed – Multilingual	$7	$6.76	$6
Llama 2 Pre-Trained and Chat (13B)	N/A	$21.18	$13.08
SDXL1.0 (Stable Diffusion)	N/A	$49.86	$46
Titan Embeddings	N/A	$6.40	$5.10
Titan Image Generator (Standard)	N/A	$16.20	$13.00

Model customization

Model customization is one of the factors that influences you AWS Bedrock costs.

Your model customization cost is based on amount of model storage and number of processed tokens you use.

Pro tip: you can only get interference on multiple models if you use Provisioned Throughput.

Model	Price to Train 1000 Tokens or Price per Image Seen	Price for Storage per Custom Model per Month	Price to Infer from a Custom Model for One Model Unit per Hour with No Commitment
Command	$0	$1.95	$49.50
Command Light	$0	$1.95	$9
Llama 2 Pre-trained (13B)	$0	$1.95	$24
Llama 2 Pre-trained (70B)	$0	$1.95	$24
Titan Image Generator	$0	$1.95	$23
Titan Multimodal Embeddings	$0.00	$1.95	$9.38
Titan Text Lite	$0	$1.95	$7
Titan Text Express	$0.01	$1.95	$20.50

Melissa Abecasis
Director of Customer Success & Sr. Cloud FinOps Engineer, Anodot

Melissa brings a wealth of experience in customer success, cloud financial operations, and program management, with a demonstrated work history in the Information Technology and healthcare industry.

TIPS FROM THE EXPERT

1. Choose the right pricing model for your workload
For variable or short-term projects, use the On-Demand model for flexibility. For consistent, long-term workloads, the Provisioned Throughput model offers significant cost savings over time.

2. Monitor token usage closely
Input and output token usage drives costs. Implement real-time monitoring to detect anomalies or unexpected token spikes, ensuring your usage remains within budget.

3. Consolidate workloads with Provisioned Throughput
If you have multiple teams using Bedrock, centralize their workloads to share provisioned throughput allocations. This maximizes resource utilization and reduces costs per model unit.

4. Leverage training and customization selectively
Only train custom models when existing foundation models cannot meet your needs. Custom training and storage add significant costs, so use it strategically for unique requirements.

5. Use embedding models for cost-effective search and classification
Embedding models like Titan Multimodal Embeddings are less expensive for tasks such as semantic search and classification. Employ them for specific functionalities instead of general-purpose LLMs.

How to optimize AWS Bedrock pricing

The pricing for Amazon Bedrock might be the best option for developers compared to other choices on the market, such as AWS Bedrock; now the question is how it fits in the 2025 budget for cloud cost?

Cloud cost management tools like Anodot are designed optimize all of your cloud spend of MSPs and enterprises. We do this by giving you 100% visibility into your entire mulitcloud environment capturing spend down to the hour across all of your cloud platforms, with up to a two year retention period.

Other Anodot features include:

AI-powered cloud management, forecasting, and recommendations. Start saving with the click of a button.
Customizable multicloud dashboards that capture your cloud spend across any and all cloud devices.
24/7 automated budget monitoring that alerts you when your cloud spend exceeds certain threshholds.
Easy integration with your other cloud services.

Since 2014, Anodot has worked with FinOps organizations, MSPs, and enterprises of all sizes and worldwide, demystifying cloud costs.

Want a proof of concept? Talk to us to learn how much you can save on cloud cost with Anodot’s tools.

Written by Perry Tapiero

Perry Tapiero is an experienced marketer specializing in demand generation across diverse B2B verticals such as AdTech, FinTech, and Cyber. With a focus on driving revenue and growth, Perry excels in developing and executing effective Go-To-Market strategies.

You'll believe it when you see it

Featured resources

Blog Post 7 min read

Unveiling Azure’s Hidden Costs: What You Need to Know

So, you’re new to the cloud or just starting off with Azure. You’re probably starting your first project and using the Azure Calculator to help estimate your monthly run rate. The problem is that Azure, like all clouds, has hidden costs. So why does the cloud have hidden costs? Well, while we call them hidden […]

Blog Post 7 min read

To Commit or Not to Commit: Making Sense of Cloud Savings Options

So we all have commitment issues. Let’s face it, “commitment” is probably one of the most scary words in the English lexicon. Jokes aside, what does it actually mean when we talk about cloud commitments? What Are Cloud Commitments and How Do They Work? Well, basically it’s a model where you can commit to […]

Blog Post 3 min read

Anodot vs. Cast AI: Which FinOps Platform Delivers All-Inclusive Value?

There’s no doubt about Kubernetes’ importance for success in the cloud. It offers a cost-efficient, scalable, and automated platform for managing containerized applications while simplifying operations. Cast AI is a well-established platform specializing in Kubernetes optimization, including workload rightsizing and cluster autoscaling. But is that enough for MSPs and enterprises prioritizing cloud costs? And is […]