Updated 26 March 2026

Cloud Cost Savings Strategies

Eight proven optimisation techniques with typical savings ranges, effort levels, and step-by-step implementation guidance. Works for AWS, Azure, and GCP.

  • 60-90%: max savings on Spot compute
  • 30-60%: typical RI/SP discount
  • 65%: saved by scheduling non-prod
  • Hours: time to first quick wins

1. Rightsize Oversized Instances

Compute

Typical savings: 15-30% · Effort: low · Time to value: days

Most cloud instances are provisioned with excess capacity. Average CPU utilisation across enterprise cloud accounts is 10-15%, meaning many instances could comfortably run one or two sizes smaller.

How to implement

  1. Enable AWS Compute Optimizer, Azure Advisor, or GCP Recommender for your accounts
  2. Review recommendations filtered to 'high confidence' only
  3. For each recommendation, verify with 30-day CPU and memory metrics
  4. Schedule a maintenance window to change instance types - typically zero-downtime for stateless apps
  5. Set a monthly reminder to review new recommendations as workloads change
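
The filtering in steps 2-3 can be sketched in a few lines of Python. The field names, thresholds, and example fleet below are illustrative assumptions, not any provider's API:

```python
from dataclasses import dataclass

@dataclass
class InstanceMetrics:
    instance_id: str
    instance_type: str
    p95_cpu_pct: float      # 95th-percentile CPU over 30 days
    p95_memory_pct: float   # 95th-percentile memory over 30 days

def rightsizing_candidates(fleet, cpu_threshold=40.0, mem_threshold=50.0):
    """Flag instances whose 30-day p95 CPU AND memory both sit below
    the thresholds -- safer candidates for a size down."""
    return [
        m.instance_id
        for m in fleet
        if m.p95_cpu_pct < cpu_threshold and m.p95_memory_pct < mem_threshold
    ]

fleet = [
    InstanceMetrics("i-0aaa", "m5.2xlarge", 12.0, 35.0),  # oversized
    InstanceMetrics("i-0bbb", "m5.xlarge", 22.0, 78.0),   # memory-bound: keep
    InstanceMetrics("i-0ccc", "c5.4xlarge", 85.0, 40.0),  # CPU-bound: keep
]
print(rightsizing_candidates(fleet))  # ['i-0aaa']
```

Note the second instance: low CPU but high memory use, exactly the memory-constrained case the "Watch out" below warns about.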

Best suited for

All workloads, especially development and staging environments

Watch out

Verify memory as well as CPU. Some workloads are memory-constrained, not CPU-constrained.

2. Purchase Reserved Instances or Savings Plans

Commitment Discounts

Typical savings: 30-60% on committed resources · Effort: medium · Time to value: immediate (discount applied at billing)

Reserved Instances (AWS, Azure) and Committed Use Discounts (GCP) offer 30-60% off on-demand prices in exchange for one- or three-year commitments. This is often the single highest-ROI FinOps action for stable production workloads.

How to implement

  1. Identify workloads that have run consistently for at least 90 days
  2. Use Cost Explorer Savings Plans recommendations (AWS) or Advisor (Azure) to see recommended commitment amounts
  3. Start with 1-year Compute Savings Plans (AWS) or general-purpose reservations - they cover more instance types and regions
  4. Commit incrementally: start at 50% coverage, review quarterly, and increase as confidence grows
  5. Track your RI/SP utilisation rate - aim for above 90%
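
The incremental-commitment idea in step 4 can be sketched as a small calculation. The sample spend figures, 50% coverage ratio, and 30% discount below are illustrative assumptions:

```python
def recommended_commitment(hourly_on_demand_spend, coverage=0.5):
    """Size an initial Savings Plan commitment at `coverage` of the
    steady-state baseline (the minimum observed hourly spend), so the
    commitment stays fully used even in the quietest hour."""
    baseline = min(hourly_on_demand_spend)
    return round(baseline * coverage, 2)

# Hypothetical 90-day sample, reduced to a few hourly data points ($/hour)
spend = [42.0, 55.0, 48.0, 40.0, 61.0, 44.0]
commit = recommended_commitment(spend)
print(commit)  # 20.0

# At a ~30% discount, that commitment covers usage that would otherwise
# cost commit / (1 - 0.30) on demand:
print(round(commit / 0.70, 2))  # 28.57
```

Sizing against the minimum rather than the average is the conservative choice: unused commitment is pure waste, while under-coverage simply bills the remainder at on-demand rates.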

Best suited for

Stable production workloads, databases, long-running analytics jobs

Watch out

Over-committing can leave you paying for unused reservations. Never commit more than 12 months of steady-state usage.

3. Use Spot and Preemptible Instances

Compute

Typical savings: 60-90% versus on-demand · Effort: high · Time to value: weeks (requires workload changes)

Spot Instances (AWS), Spot VMs (Azure), and Preemptible VMs (GCP) use spare cloud capacity at 60-90% discounts. Workloads must tolerate interruption at short notice - two minutes on AWS, as little as 30 seconds on Azure and GCP. The savings are exceptional for suitable workloads.

How to implement

  1. Identify interruption-tolerant workloads: batch processing, CI/CD build agents, ML training, data pipelines
  2. Implement checkpointing in batch jobs so they can resume after interruption
  3. Use multiple instance types and Availability Zones to reduce interruption probability
  4. Consider Spot.io (NetApp) or Granulate for automated Spot management
  5. Set Spot instance diversification across 5+ instance types to maintain availability
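
The checkpointing in step 2 can be sketched as a resumable loop. The checkpoint file format and the simulated interruption below are purely illustrative:

```python
import json, os, tempfile

def process_items(items, checkpoint_path, fail_after=None):
    """Process `items`, persisting progress after every item so a Spot
    interruption (simulated here via `fail_after`) loses almost no work."""
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)["done"]
    results = []
    for i in range(done, len(items)):
        if fail_after is not None and i >= fail_after:
            raise InterruptedError("spot reclaim")  # interruption notice fired
        results.append(items[i] * 2)                # the "work"
        with open(checkpoint_path, "w") as f:
            json.dump({"done": i + 1}, f)
    return results

path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    process_items([1, 2, 3, 4], path, fail_after=2)  # interrupted mid-run
except InterruptedError:
    pass
resumed = process_items([1, 2, 3, 4], path)  # picks up at item 3
print(resumed)  # [6, 8]
```

Real jobs would write checkpoints to object storage rather than local disk, since the instance's disk disappears with the instance.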

Best suited for

Batch jobs, CI/CD runners, ML training, rendering, data processing pipelines

Watch out

Not suitable for databases, stateful services, or customer-facing apps without significant architecture changes.

4. Implement Storage Lifecycle Policies

Storage

Typical savings: 30-60% on storage costs · Effort: low · Time to value: days to weeks

Object storage (S3, Azure Blob, GCS) costs can be dramatically reduced by moving infrequently accessed data to cheaper storage tiers automatically. Most organisations have no lifecycle policies at all.

How to implement

  1. Audit S3 buckets or Azure containers for last-access dates using Storage Lens or similar
  2. Set lifecycle rules to move objects to Infrequent Access after 30-90 days of no access
  3. Move further to Glacier or Archive after 180 days for long-term retention data
  4. Delete objects that have no retention requirement after an expiry period
  5. Enable S3 Intelligent-Tiering for datasets with unknown access patterns
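
Steps 2-4 map directly onto an S3 lifecycle configuration. A minimal sketch in the shape boto3's `put_bucket_lifecycle_configuration` expects - the day counts, prefix, and bucket name are illustrative and should match your own access patterns:

```python
# Lifecycle rule: tier logs down over time, then expire them.
lifecycle = {
    "Rules": [
        {
            "ID": "tier-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                {"Days": 180, "StorageClass": "GLACIER"},     # archive tier
            ],
            "Expiration": {"Days": 730},  # delete after the retention window
        }
    ]
}

# Applying it requires AWS credentials and would look like:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-log-bucket", LifecycleConfiguration=lifecycle)
print([t["StorageClass"] for t in lifecycle["Rules"][0]["Transitions"]])
```

The same tier-then-expire shape exists in Azure Blob Storage management policies and GCS lifecycle rules, with different key names.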

Best suited for

Log storage, backup buckets, static assets, compliance archives

Watch out

Retrieval costs apply to archived data. Ensure lifecycle rules align with your access patterns and compliance requirements.

5. Delete Unattached and Unused Resources

Waste Elimination

Typical savings: 5-15% from quick wins · Effort: low · Time to value: hours

Unattached EBS volumes, unused Elastic IPs, idle load balancers, and forgotten RDS snapshots accumulate silently. A one-time audit typically finds $5,000-$50,000 of annual savings with no operational impact.

How to implement

  1. Run AWS Trusted Advisor or Azure Advisor Idle Resources check
  2. Query for EBS volumes with no attachment or 0 IOPS for 30+ days
  3. List all Elastic IPs not associated with running instances ($0.005/hour each adds up)
  4. Identify load balancers with no active target groups or backends
  5. Review old AMIs and snapshots - check if anything still references them before deleting
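
The query in step 2 boils down to a simple filter. A sketch over hypothetical volume records - real ones would come from `describe_volumes` combined with CloudWatch idle metrics:

```python
def cleanup_candidates(volumes, min_idle_days=30):
    """Volumes with no attachment for at least `min_idle_days`:
    snapshot them first, then delete."""
    return [
        v["VolumeId"]
        for v in volumes
        if v["Attachments"] == [] and v["IdleDays"] >= min_idle_days
    ]

# Illustrative records, not real API output
volumes = [
    {"VolumeId": "vol-111", "Attachments": [], "IdleDays": 90},
    {"VolumeId": "vol-222", "Attachments": ["i-0abc"], "IdleDays": 0},
    {"VolumeId": "vol-333", "Attachments": [], "IdleDays": 5},  # too recent
]
print(cleanup_candidates(volumes))  # ['vol-111']
```

The 30-day floor is the safety margin: a volume detached last week may simply belong to an instance mid-migration.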

Best suited for

All cloud accounts, especially long-running accounts without regular cleanup

Watch out

Verify nothing references a resource before deleting. Create a snapshot before deleting volumes from unknown sources.

6. Auto-Scale Instead of Over-Provision

Compute

Typical savings: 20-40% on web and app tier · Effort: medium · Time to value: weeks

Many production environments are sized for peak load 24/7. Auto-scaling allows you to run at minimum required capacity most of the time, scaling out for traffic spikes and scaling back down automatically.

How to implement

  1. Identify services with predictable or irregular traffic patterns using CPU/request metrics
  2. Enable EC2 Auto Scaling Groups or Azure VMSS with target tracking policies
  3. For containers: configure Kubernetes HPA (Horizontal Pod Autoscaler) with CPU and custom metrics
  4. Set scheduled scaling for known peaks (business hours, weekly peaks)
  5. Use KEDA for event-driven scaling based on queue depth or custom metrics
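
A simplified model of how a target-tracking policy (step 2) picks desired capacity: scale in proportion to how far the metric sits from its target. Real autoscalers add cooldowns and smoothing, so treat this as a sketch with illustrative bounds:

```python
import math

def target_tracking_desired(current_capacity, metric_value, target,
                            min_size=2, max_size=20):
    """Proportional scaling rule: if the metric is 50% above target,
    add roughly 50% more capacity; clamp to the group's bounds."""
    desired = math.ceil(current_capacity * metric_value / target)
    return max(min_size, min(max_size, desired))

# 4 instances at 75% CPU against a 50% target -> scale out to 6
print(target_tracking_desired(4, metric_value=75.0, target=50.0))  # 6
# 4 instances at 20% CPU -> scale in, but never below min_size
print(target_tracking_desired(4, metric_value=20.0, target=50.0))  # 2
```

The `min_size` floor is what guards against the cold-start problem mentioned in the "Watch out" below.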

Best suited for

Web applications, API services, containerised microservices

Watch out

Cold start latency can affect user experience if minimum capacity is too low. Test scale-out speed under realistic load.

7. Shut Down Non-Production Environments

Scheduling

Typical savings: 10-25% overall · Effort: low · Time to value: days

Development, staging, and QA environments often run 24/7 but are only used during business hours. Scheduling shutdown outside working hours reduces their cost by 65-75% with minimal friction.

How to implement

  1. Tag all non-production resources with Environment=dev, staging, or test
  2. Use AWS Instance Scheduler, Azure Automation, or a Lambda function to stop tagged instances outside hours
  3. Schedule shutdown at 7pm and startup at 8am Monday to Friday (65% reduction)
  4. Consider full weekend shutdown for environments not needed on weekends
  5. Give developers a simple way to request an override for late-night deploys
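
The scheduling logic in steps 1-5 can be sketched as one decision function. The tag names and hours follow the steps above but are otherwise illustrative:

```python
from datetime import datetime

def should_be_running(tags, now):
    """Run tagged non-prod instances only 08:00-19:00 Monday-Friday;
    anything else (including production) stays on."""
    if tags.get("Environment") not in ("dev", "staging", "test"):
        return True
    if tags.get("Override") == "keep-on":  # developer escape hatch (step 5)
        return True
    business_hours = 8 <= now.hour < 19
    weekday = now.weekday() < 5            # Mon=0 ... Fri=4
    return business_hours and weekday

# A Tuesday evening vs. a Tuesday morning for a dev instance:
print(should_be_running({"Environment": "dev"}, datetime(2026, 3, 24, 22, 0)))  # False
print(should_be_running({"Environment": "dev"}, datetime(2026, 3, 24, 10, 0)))  # True
```

This schedule keeps non-prod up 55 of 168 hours a week (about 33%), which is where the roughly 65% reduction quoted above comes from.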

Best suited for

Development, staging, QA, and demo environments

Watch out

Ensure no critical jobs (nightly builds, end-of-day processing) run during shutdown windows.

8. Optimise Data Transfer and Egress Costs

Networking

Typical savings: 10-30% for data-intensive architectures · Effort: high · Time to value: weeks to months

Data egress is one of the most overlooked cloud costs. Transferring data out of a cloud provider or between regions can cost $0.05-$0.09/GB, adding up to tens of thousands per month for data-intensive workloads.

How to implement

  1. Audit your top 10 data transfer line items in your cloud bill
  2. Co-locate compute with data: ensure processing happens in the same region as storage
  3. Use CloudFront (AWS), Azure CDN, or Cloud CDN to serve static assets - CDN egress is cheaper
  4. Consider AWS Direct Connect or Azure ExpressRoute for large on-premises data transfers
  5. Review whether cross-AZ traffic is necessary - it costs $0.01/GB per AZ hop
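
A rough way to model the audit in step 1: price each transfer category, then compare architectures before and after the changes in steps 2-3. The per-GB rates below are illustrative, not current price-sheet values:

```python
# Illustrative $/GB rates; check your provider's current price sheet.
RATES = {
    "internet_egress": 0.09,  # direct egress to the internet
    "cross_region":    0.02,  # between regions
    "cross_az":        0.01,  # per AZ hop
    "cdn_egress":      0.05,  # via CDN, typically cheaper than direct
}

def monthly_transfer_cost(gb_by_category):
    """Total monthly transfer cost for a {category: GB} breakdown."""
    return round(sum(RATES[k] * gb for k, gb in gb_by_category.items()), 2)

before = {"internet_egress": 50_000, "cross_az": 100_000}
after  = {"cdn_egress": 50_000, "cross_az": 20_000}  # CDN + co-located compute
print(monthly_transfer_cost(before))  # 5500.0
print(monthly_transfer_cost(after))   # 2700.0
```

Even at these rough rates, moving static assets behind a CDN and eliminating avoidable cross-AZ chatter roughly halves the monthly bill in this example.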

Best suited for

Applications with high data volumes, multi-region architectures, hybrid cloud

Watch out

Reducing cross-region redundancy to save egress costs can impact resilience. Model the trade-off carefully.

Recommended Implementation Order

Not all strategies are equal. Here is the recommended order to maximise early ROI while managing risk.

Week 1-2: Quick Wins

  • Delete unattached volumes and unused IPs
  • Enable auto-shutdown for non-production
  • Apply storage lifecycle policies to log buckets

Expected savings: 5-15%

Month 1-3: Core Optimisation

  • Rightsize top 20 most expensive instances
  • Purchase Savings Plans for stable workloads
  • Implement auto-scaling for web tier

Expected savings: 20-40%

Month 3+: Advanced

  • Migrate batch and CI/CD to Spot instances
  • Optimise data transfer architecture
  • Expand RI/SP coverage to 70%+

Expected savings: 30-50%+
