6 Steps to Cost Optimization on AWS Cloud

Keeping track of the resources you are investing in is just one of many important considerations when you become cloud native. Finding the IT budget to drive new business initiatives can be a challenge. Similarly, knowing the appropriate IT spend for your business and how to optimize it is an issue all its own. This article aims to help with some of those pain points. With our six steps to cost optimization on the AWS Cloud, you’ll learn how to improve your infrastructure and free up budget to drive innovation and move your business forward.

Mastering AWS Cost Optimization in 6 Steps

Following our six steps provides immediate impact by offering valuable insight into your current costs and showing you which areas to optimize.

Content Summary

Step 1: Understand Your Cost
Step 2: Monitor Cost Trends
Step 3: Comparing Cost and Utilization
Step 4: Examine Your Ratios
Step 5: The Gold Standard: Show & Chargeback Cost Tags
Step 6: Utilizing Financial Instruments
Bonus Tip

Step 1: Understand Your Cost

While it should be as simple as glancing at your bill, understanding your cost in AWS is not always easy. A typical AWS detailed billing record (DBR) will have two to eight unique line items for every dollar of cost. If you utilize cost allocation tags, the data becomes even more granular. With over 90 AWS services, 19 regions, and thousands of unique usage types, AWS cost management can get convoluted very quickly.

Due to this complexity, cost management in AWS should be treated as a data problem. The primary challenge is to identify actionable cost trend information (for example, costs for specific services or types of usage) within the larger overall cost trend. The first step toward managing AWS costs is to tackle the data problem with a data solution: dashboards.

A good cost dashboard will give you the ability to track spend down to meaningful subcomponents, which should help you establish baseline cost, as well as understand variation and trends.

Some of the items your dashboard should track include:

  • Cost by Service
  • Cost by Region
  • Cost by Unique AWS Accounts
  • Cost by Allocations
  • Day-to-day cost trends
  • Month-to-month cost trends

By breaking out cost tracking in this manner, you’re able to understand which services and assets in play have the greatest impact, identify items that are no longer in use, and track trends to establish cost expectations for your environment.
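The breakdowns listed above can be sketched as a simple aggregation over billing line items. This is a minimal illustration, not a real billing parser: the `LINE_ITEMS` records, their field names, and the `cost_by` helper are all hypothetical, and a real billing export has many more columns.

```python
from collections import defaultdict

# Hypothetical, heavily simplified billing line items (assumed field names).
LINE_ITEMS = [
    {"service": "AmazonEC2", "region": "us-east-1", "account": "111111111111", "cost": 120.0},
    {"service": "AmazonEC2", "region": "us-west-2", "account": "222222222222", "cost": 80.0},
    {"service": "AmazonS3", "region": "us-east-1", "account": "111111111111", "cost": 30.0},
    {"service": "AmazonRDS", "region": "us-east-1", "account": "222222222222", "cost": 70.0},
]


def cost_by(line_items, dimension):
    """Sum cost per distinct value of one dimension, largest total first."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item[dimension]] += item["cost"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    # One breakdown per dashboard panel: service, region, account.
    for dimension in ("service", "region", "account"):
        print(dimension, cost_by(LINE_ITEMS, dimension))
```

Sorting by total puts the items with the greatest impact at the top of each panel, which is usually where a review should start.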

Step 2: Monitor Cost Trends

Much of your cost can be broken down into trends. When you understand those trends, you can better predict your costs and identify problems much earlier.

Having helped a number of clients with their cost optimization, we have identified a few common trends to watch for. These trends are indicators of certain patterns in cloud architecture that can lead to cost concerns.

“The Shark’s Fin”: The shark’s fin trend is most common when building out new services. Optimization is enacted initially, but cost eventually spikes. Optimization is then assessed again, reining costs in. This pattern continues to cycle.

The Shark’s Fin

“The Climb”: The climb comes in periods of architecture growth. Cost is expected to increase as usage does, but as both continue to grow, cost goes unchecked, masking any waste or mis-provisioned resources.

The Climb

“The Science Experiment”: When new development projects are implemented and costs are unknown, the result is a sudden increase followed by a leveling off.

The Science Experiment

“The Delayed Mistake”: This occurs when something new is implemented, but its effectiveness is not immediately obvious. In this case, there may be some time before a reaction to the result can occur.

The Delayed Mistake

Being aware of these trend patterns when monitoring your architecture can save significant time and headaches, because it lets you catch cost issues early instead of after they have compounded.
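One simple way to spot the sudden rises behind patterns such as the shark’s fin or the science experiment is to compare each day’s cost with a trailing average. The sketch below is an assumption-laden illustration rather than a recommended detector: the `flag_spikes` name, the 7-day window, and the 1.5x threshold are all hypothetical choices, and the daily figures are made up.

```python
def flag_spikes(daily_costs, window=7, threshold=1.5):
    """Return indexes of days whose cost exceeds `threshold` times
    the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_costs)):
        baseline = sum(daily_costs[i - window:i]) / window
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            flagged.append(i)
    return flagged


# Hypothetical daily costs in dollars: a steady week, then a jump on day 7.
DAILY = [100, 102, 98, 101, 99, 100, 100, 180, 105]

if __name__ == "__main__":
    print(flag_spikes(DAILY))
```

A trailing window adapts to gradual growth, so on its own it will not catch “The Climb”; that pattern is better checked by comparing cost growth against usage growth.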

Step 3: Comparing Cost and Utilization

Another common consumption pitfall is the over-provisioning of resources. Whether undergoing migration or developing a new application, it can be difficult to know what’s necessary for your infrastructure to run efficiently.

This can occur in a number of ways including:

  • Horizontal Over-Scaling: The use of too many resources when fewer would do the job.
  • Vertical Over-Scaling: Running the right number of resources but the wrong type, with resources sized larger than needed. (This is more commonly seen in lift and shift migrations.)
  • Keeping unused resources active, or running older-generation resources, which may be less cost-efficient.

This is why it’s incredibly important to have a grasp of the services and resources active in your environment. By taking a holistic approach to observing services, resources, and other assets both active and idle within your AWS infrastructure today, troubles can be circumvented down the road.

In an elastic cloud environment like AWS, you only pay for what you use. However, you still pay for resources that sit idle as long as they remain provisioned in your account. Beyond being aware of some of the common over-usage scenarios, being proactive is key to avoiding issues around cost vs. utilization.

Eliminating resources when they aren’t being used (such as unused Elastic Load Balancers, Elastic Block Store volumes, and Elastic IP addresses) is a good start. You can delete unused assets entirely or, where the data may be needed later, move it to Amazon S3 (Simple Storage Service), a cheaper storage option from which you can retrieve it at a later date.
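Comparing cost with utilization can be sketched as a small classification pass over per-instance metrics. Everything here is illustrative: the `InstanceUsage` record, the instance IDs, and especially the CPU thresholds (2% for idle, 20% for oversized) are assumptions, and sensible cutoffs depend on the workload.

```python
from dataclasses import dataclass

# Assumed thresholds on average CPU utilization (percent); tune per workload.
IDLE_BELOW = 2.0
OVERSIZED_BELOW = 20.0


@dataclass
class InstanceUsage:
    instance_id: str
    monthly_cost: float      # dollars
    avg_cpu_percent: float   # average over the review period


def classify(usage):
    """Label an instance as 'idle', 'oversized', or 'ok'."""
    if usage.avg_cpu_percent < IDLE_BELOW:
        return "idle"
    if usage.avg_cpu_percent < OVERSIZED_BELOW:
        return "oversized"
    return "ok"


def review_candidates(fleet):
    """Return (instance_id, label, monthly_cost) for flagged instances,
    most expensive first."""
    flagged = [
        (u.instance_id, classify(u), u.monthly_cost)
        for u in fleet
        if classify(u) != "ok"
    ]
    return sorted(flagged, key=lambda row: row[2], reverse=True)


# Hypothetical fleet data.
FLEET = [
    InstanceUsage("i-aaa", 300.0, 1.0),
    InstanceUsage("i-bbb", 600.0, 12.0),
    InstanceUsage("i-ccc", 150.0, 55.0),
]

if __name__ == "__main__":
    for row in review_candidates(FLEET):
        print(row)
```

CPU alone is a rough signal; memory, network, and disk activity would normally be checked before resizing or deleting anything.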

Step 4: Examine Your Ratios

Looking at cost ratios is a more time-consuming effort, but it can offer some of the most valuable insights into the health and efficiency of your architecture. Cost ratios are highly contextual: they depend heavily on your knowledge of your spend and your architecture needs. There is no one right ratio for every company. Without a strong cost dashboard, cost ratios can be a challenge to assess, but they provide important indicators of the cost efficiency of your environment.

Here are some common ratios that are important to look at:

  • On-Demand Cost vs. Reserved Instance Cost – Generally important for more mature and extensive environments, your ratio of on-demand to reserved instances indicates cost-effectiveness. In such environments, reserved instances should be used at higher rates given the cost-savings involved. If reserved instances are a low proportion of cost, then a strong reserved instance strategy isn’t being considered, and you may be losing out on savings.
  • EBS Volume Cost vs. EC2 Instance Cost – In a cloud native environment, EBS volume cost is typically 5 – 20% of EC2 instance cost (save for more database-driven applications). If it is closer to 50% of EC2 instance cost, then there are probably optimization opportunities.
  • EBS Snapshot Cost vs. EBS Volume Cost – Similar to EBS and EC2, EBS Snapshot Cost should be around 10% compared to EBS Volume Cost. If they are equal, then there’s a need to optimize.

Again, ratios may be hard to navigate, but they can be an important tool for cost optimization. There’s no set rule for what your ratios should look like. Depending on whether you run a database-backed service or a web-based service, your ratios may look completely different.
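The three ratios above reduce to simple divisions once the underlying costs are broken out. The sketch below uses made-up monthly figures and hypothetical names (`COSTS`, `cost_ratios`); it only computes the ratios and leaves the interpretation to you, since the right values depend on your architecture.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def cost_ratios(costs):
    """Compute the reserved share of EC2 cost, EBS-to-EC2, and snapshot-to-EBS ratios."""
    ec2_total = costs["ec2_on_demand"] + costs["ec2_reserved"]
    return {
        "reserved_share_of_ec2": ratio(costs["ec2_reserved"], ec2_total),
        "ebs_to_ec2": ratio(costs["ebs_volumes"], ec2_total),
        "snapshot_to_ebs": ratio(costs["ebs_snapshots"], costs["ebs_volumes"]),
    }


# Hypothetical monthly costs in dollars.
COSTS = {
    "ec2_on_demand": 6000.0,
    "ec2_reserved": 2000.0,
    "ebs_volumes": 4000.0,
    "ebs_snapshots": 400.0,
}

if __name__ == "__main__":
    for name, value in cost_ratios(COSTS).items():
        print(f"{name}: {value:.0%}")
```

With these sample numbers, EBS volume cost is half of EC2 cost, which by the rule of thumb above would be worth investigating, while the snapshot ratio sits at the suggested 10%.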

Here’s an example of an Amazon S3 Ratio:

  • Storage should be the dominant cost, making up more than 75% of cost
  • Requests should be some small amount between 5 and 10%
  • Data transfer costs should also be around 5 to 10%
  • The remainder should be other costs
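A breakdown like this one can be checked mechanically by comparing each component’s share of the total against an expected range. The ranges below are taken from the list above; the function name `out_of_range`, the component keys, and the sample costs are hypothetical.

```python
# Expected share of total S3 cost per component, as (low, high) fractions.
EXPECTED_SHARES = {
    "storage": (0.75, 1.00),
    "requests": (0.05, 0.10),
    "data_transfer": (0.05, 0.10),
}


def out_of_range(costs, expected=EXPECTED_SHARES):
    """Return {component: share} for components outside their expected range."""
    total = sum(costs.values())
    if total <= 0:
        return {}
    findings = {}
    for component, (low, high) in expected.items():
        share = costs.get(component, 0.0) / total
        if not low <= share <= high:
            findings[component] = round(share, 3)
    return findings


# Hypothetical monthly S3 costs in dollars.
S3_COSTS = {"storage": 600.0, "requests": 250.0, "data_transfer": 80.0, "other": 70.0}

if __name__ == "__main__":
    print(out_of_range(S3_COSTS))
```

A finding is a prompt to look closer, not a verdict: an unusually high request share, for instance, may be perfectly normal for a bucket that serves many small objects.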

Step 5: The Gold Standard: Show & Chargeback Cost Tags

Showback or chargeback comes down to attributing cost to the products and teams that incur it (for example, breaking out R&D costs separately), so that you can split your environment into distinct services, each managed for cost optimization. An example of this is using separate accounts for production and development.

What this comes down to is leveraging cost tags to indicate the differences in usage. Tagging can be painful and tedious, and it requires strong attention to detail. It also requires team alignment around a common tagging policy. But tagging is integral to managing costs, and having a simple but well-defined tagging policy can help keep your accounts in order.

Common tag keys include but are not limited to:

  • Cost Center
  • Environment
  • Application
  • Owner
  • Function
  • Name

Keeping the set of keys this small keeps things simple for cross-tagging purposes. Each key should also have a range of applicable values. Values help you break out costs within the tags.

Spelling and capitalization matter within tagging, as do spaces. For this reason, using automation where appropriate helps save time and promotes consistency when tagging.

Remember that some tags can be inherited, so be sure to put some thought into how tagging is set up.
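Because spelling, capitalization, and spaces all matter, one place automation helps is in checking tags against the policy before they reach the bill. The following is a minimal sketch of such a check; the key list comes from the common keys above, but which keys are required, and the names `check_tags` and `KNOWN_KEYS`, are assumptions made for illustration.

```python
# Keys from the tagging policy, spelled exactly as they should appear.
KNOWN_KEYS = ["Cost Center", "Environment", "Application", "Owner", "Function", "Name"]


def _canonical(text):
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


_CANONICAL_KEYS = {_canonical(key): key for key in KNOWN_KEYS}


def check_tags(tags, required=("Cost Center", "Environment", "Owner")):
    """Return a list of human-readable problems found in a resource's tags."""
    problems = []
    for key in tags:
        expected = _CANONICAL_KEYS.get(_canonical(key))
        if expected is None:
            problems.append(f"unknown key: {key!r}")
        elif expected != key:
            problems.append(f"misspelled key: {key!r} should be {expected!r}")
    for key in required:
        if key not in tags:
            problems.append(f"missing key: {key!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical resource tags with a lowercase key and a trailing space.
    sample = {"cost center": "1234", "Environment ": "production", "Owner": "data-team"}
    for problem in check_tags(sample):
        print(problem)
```

Note that a near-miss key is reported twice, once as misspelled and once as missing, because a cost report grouping on the exact key would not see it at all.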

Step 6: Utilizing Financial Instruments

We’ve touched on some of the components featured here, but financial instruments are an integral part of reining in cost issues in your AWS environment.

Financial instruments are one of the simplest topics in cost ops because they can be applied no matter the situation. The most common of these are reserved instances (RIs).

There are many kinds of reserved instances, and these instances have their own metrics that can be tracked. An RI offers two major benefits: it guarantees you access to a particular instance type, and in return for the commitment, AWS provides anywhere from a 10% to 55% discount on an instance.

There are 4 types of RIs:

  • No Upfront – Pay no money upfront, but make a contractual commitment to AWS for either one or three years.
  • Partial Upfront – Pay a small upfront payment to AWS and then a small monthly payment for the term of the instance for either one or three years.
  • All Upfront – Pay a total upfront payment to AWS for either one or three years.
  • Convertible – Pay either a partial or all upfront payment and gain the flexibility to change your instance type, only for three year options.

Tracking utilization (% of reserved hours you’re actually using) and coverage (% of all runtime hours “covered” by reservations) can help you develop a strategy for applying RIs. Maximizing both helps you maximize your savings.
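The two metrics defined above can be written out as a short calculation. This is a sketch using made-up hours for a single month; the function and parameter names are hypothetical.

```python
def ri_utilization(reserved_hours_used, reserved_hours_purchased):
    """Share of purchased reserved hours that were actually used."""
    if reserved_hours_purchased <= 0:
        return 0.0
    return reserved_hours_used / reserved_hours_purchased


def ri_coverage(reserved_hours_used, total_running_hours):
    """Share of all instance runtime hours that were covered by reservations."""
    if total_running_hours <= 0:
        return 0.0
    return reserved_hours_used / total_running_hours


if __name__ == "__main__":
    # Hypothetical month: 10 reservations x 720 hours purchased, 6,480 of
    # those hours matched running instances, 10,800 instance hours in total.
    purchased, used, running = 7200, 6480, 10800
    print(f"utilization: {ri_utilization(used, purchased):.0%}")
    print(f"coverage:    {ri_coverage(used, running):.0%}")
```

Low utilization means you are paying for reservations you do not use; low coverage means on-demand hours remain that a reservation could discount. The two move somewhat against each other, which is why both need tracking.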

Another notable distinction for EC2 RIs is convertible vs. standard. Convertible RIs allow you to change which instance types the RI covers. They are less risky, especially over three years, and support more aggressive coverage, giving them a major advantage in most situations.

EC2 is the only service that offers this kind of RI. Other services offer standard RIs with the option of one or three years, with three years offering a higher discount but also a higher longer term risk and upfront commitment.

A final thought on RI Strategy: Purchasing instances within the same instance family will give your organization the flexibility to move up and down in size without penalty. This is great for companies that are growing fast and have more dynamic computing needs.

Bonus Tip

Make sure the business case justifies the technology!

Understanding the business and technology case, and then leveraging other AWS strategies to make your infrastructure more efficient is key to cost optimization.

All the previous cost strategies and decision points are backed by data and can be found through analysis of your AWS account. The following, however, are proven strategies that can make your applications running on AWS infrastructure even more powerful.

Some examples include:

  • Auto scaling: An elastic environment that scales with demand, aligning your infrastructure to your computing needs.
  • Lambda: Run code in response to events without keeping any servers up and running.
  • Spot instances: Instances that are up for bid and not guaranteed, but can save up to 90% off the price of an instance. Best for major batch loads that need to run quickly and can tolerate interruption.

Source: Onica eBook – Mastering Cost Optimization on AWS in 6 Steps