The future is AI. That’s a fact, and all the major cloud corporations are taking notice and investing in generative AI offerings to serve their customers better.
Microsoft Azure has invested in OpenAI‘s ChatGPT, Google has Vertex AI, and Amazon has created Bedrock.
But what exactly is AWS Bedrock? And, most importantly, how much will it cost? Will this generative AI be an easy investment, or will you have to break the budget to squeeze it in?
What is AWS Bedrock?
To get started, let’s define some basic terms. AWS Bedrock is a fully managed service designed for developers to create generative AI applications.
Since AWS Bedrock comes with foundation models sourced from other leading AI companies like Anthropic, Meta, and Cohere, developers don’t have to worry about the learning process for your new generative AI application. Depending on your chosen foundation model, you can access different tools and infrastructure to build new models.
And no, you don’t need to worry about data security. AWS Bedrock prides itself on its integrity and user privacy; all data is kept 100% secure 100% of the time.
How does Amazon Bedrock work?
AWS Bedrock works by supplying dev teams with a wide range of foundation models to create generative AI tools that can generate anything from images to code to copy and more.
All you need to do to get started is:
- Pick the foundation model that best fits your need (ex: if you’re looking to build a custom model to train bank customer onboarding, you’ll probably want to go with an Anthropic model like Claude 3 Sonnet or Titan).
- Send an API request to ensure your data has been sent to the model
- Ensure the model has received your input.
- Wait for your model to generate your next blog copy, line of JavaScript, or branding image.
Yes, it’s that easy!
AWS Bedrock pricing
The harder part is AWS Bedrock pricing.
Like most cloud providers, AWS Bedrock offers two pricing models: on-demand, a pay-as-you-go structure, and provisioned throughput, a pricing structure where you pay for a one-month or six-month commitment for a set amount of services.
We’ll go into more detail about these different structures below, but before we get into that, know that the following are all factors that might negatively influence your bill:
- Foundation model provider: Different FMs charge different amounts for input/output tokens. For example, image FMs typically cost a bit more.
- Input and output token usage: Every time you send a request into an FM, you’ll be charged an input token, and everytime you receive that output in the form of copy, an image, or a chatbot conversation, you’ll be charged an ouput token.
- Model usage and customization: Model customization prices are based on the number of processed tokens and overall model storage.
On-demand
On-demand prices look the same as your typical pay-as-you-go plan: you’re paying more for the freedom to stop anytime. Your prices will vary depending on your model of choice and the content you’re trying to generate.
The following typically costs more per token:
- Text generation
- Image generation
- Embedding models
With that said, here’s a breakdown of how much you should expect to pay for common models and services based on 2024 U.S. prices:
Model | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
Claude 3.5 Sonnet | $0 | $0.02 |
Claude 2.0 | $0 | $0.02 |
Command R+ | $0 | $0.02 |
Embed – English | $0 | n/a |
Jamba-Instruct | $0 | $0.00 |
Jurassic-2 Ultra | $0.02 | $0.02 |
Llama 3 70B | $0 | $0.00 |
Mistral 7B | $0.00 | $0.00 |
Titan Text Lite | $0.00 | $0.00 |
Here’s how much you should expect to pay for image-generation AI assets:
Model | Image resolution | Cost Per Image – Standard Quality (<51 steps) | Cost Per Image – Premium Quality (>51 steps) |
SDXL 0.8 (Stable Diffusion)
|
512X512 or smaller | $0.02 | $0.04 |
Larger than 512X512 | $0.04 | $0 | |
SDXL 1.0 (Stable Diffusion) | 1024X1024 or smaller | $0.04 | $0 |
Titan Image Generator (Standard)
|
512X512 | $0.01 | $0 |
1024X1024 | $0.01 | $0 | |
Titan Image Generator (Custom Models)
|
512X512 | $0.02 | $0.02 |
1024X1024 | $0.02 | $0 |
Provisioned Throughput
Buying via provisioned throughput is the best option if you can commit to a one—or six-month period.
There are options to buy certain model units and specific throughput amounts. These will be measured based on the maximum number of input/out tokens for each minute and charged on an hourly basis. This plan is best for those with consistent workloads and who can commit to longer periods of use.
Here’s roughly what your costs should look like:
Model | Price per Hour per Model Unit With No Commitment (Max One Custom Model Unit Inference) | Price per Hour per Model Unit With a One Month Commitment (Includes Inference) | Price per Hour per Model Unit With a Six Month Commitment (Includes Inference) |
Claude 2.0/2.1 | $70 | $63.00 | $35.00 |
Command | $50 | $39.60 | $24 |
Command – Light | $9 | $6.85 | $4 |
Embed – English | $7 | $6.76 | $6 |
Embed – Multilingual | $7 | $6.76 | $6 |
Llama 2 Pre-Trained and Chat (13B) | N/A | $21.18 | $13.08 |
SDXL1.0 (Stable Diffusion) | N/A | $49.86 | $46 |
Titan Embeddings | N/A | $6.40 | $5.10 |
Titan Image Generator (Standard) | N/A | $16.20 | $13.00 |
Model customization
Model customization is one of the factors that influences you AWS Bedrock costs.
Your model customization cost is based on amount of model storage and number of processed tokens you use.
Pro tip: you can only get interference on multiple models if you use Provisioned Throughput.
Model | Price to Train 1000 Tokens or Price per Image Seen | Price for Storage per Custom Model per Month | Price to Infer from a Custom Model for One Model Unit per Hour with No Commitment |
Command | $0 | $1.95 | $49.50 |
Command Light | $0 | $1.95 | $9 |
Llama 2 Pre-trained (13B) | $0 | $1.95 | $24 |
Llama 2 Pre-trained (70B) | $0 | $1.95 | $24 |
Titan Image Generator | $0 | $1.95 | $23 |
Titan Multimodal Embeddings | $0.00 | $1.95 | $9.38 |
Titan Text Lite | $0 | $1.95 | $7 |
Titan Text Express | $0.01 | $1.95 | $20.50 |
Melissa Abecasis
Director of Customer Success & Sr. Cloud FinOps Engineer, Anodot
Melissa brings a wealth of experience in customer success, cloud financial operations, and program management, with a demonstrated work history in the Information Technology and healthcare industry.
TIPS FROM THE EXPERT
1. Choose the right pricing model for your workload
For variable or short-term projects, use the On-Demand model for flexibility. For consistent, long-term workloads, the Provisioned Throughput model offers significant cost savings over time.
2. Monitor token usage closely
Input and output token usage drives costs. Implement real-time monitoring to detect anomalies or unexpected token spikes, ensuring your usage remains within budget.
3. Consolidate workloads with Provisioned Throughput
If you have multiple teams using Bedrock, centralize their workloads to share provisioned throughput allocations. This maximizes resource utilization and reduces costs per model unit.
4. Leverage training and customization selectively
Only train custom models when existing foundation models cannot meet your needs. Custom training and storage add significant costs, so use it strategically for unique requirements.
5. Use embedding models for cost-effective search and classification
Embedding models like Titan Multimodal Embeddings are less expensive for tasks such as semantic search and classification. Employ them for specific functionalities instead of general-purpose LLMs.
How to optimize AWS Bedrock pricing
The pricing for Amazon Bedrock might be the best option for developers compared to other choices on the market, such as AWS Bedrock; now the question is how it fits in the 2025 budget for cloud cost?
Cloud cost management tools like Anodot are designed optimize all of your cloud spend of MSPs and enterprises. We do this by giving you 100% visibility into your entire mulitcloud environment capturing spend down to the hour across all of your cloud platforms, with up to a two year retention period.
Other Anodot features include:
- AI-powered cloud management, forecasting, and recommendations. Start saving with the click of a button.
- Customizable multicloud dashboards that capture your cloud spend across any and all cloud devices.
- 24/7 automated budget monitoring that alerts you when your cloud spend exceeds certain threshholds.
- Easy integration with your other cloud services.
Since 2014, Anodot has worked with FinOps organizations, MSPs, and enterprises of all sizes and worldwide, demystifying cloud costs.
Want a proof of concept? Talk to us to learn how much you can save on cloud cost with Anodot’s tools.