Cloud Cost Optimization Tips


Cloud costs are often the second-largest expense after payroll for SaaS companies. Without active management, spending grows faster than revenue. This guide covers practical optimization strategies that can reduce bills by 30-50% without sacrificing performance.





Right-Sizing Instances





The most common waste is over-provisioned resources. Use cloud provider tools to analyze utilization:




* **AWS Compute Optimizer**: Analyzes CPU, memory, and network utilization to recommend instance types.

* **GCP Rightsizing Recommendations**: Built into the Compute Engine console.

* **Azure Advisor**: Provides cost recommendations across all services.




Target utilization rules of thumb:





| Resource | Target Utilization |
|----------|--------------------|
| CPU | 40-70% average |
| Memory | 60-80% average |
| Disk IOPS | Below 80% of provisioned |





Downsize instances that consistently run below 20% utilization. For variable workloads, consider scaling horizontally rather than vertically.
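
An effective first pass is to let the provider's recommendation API list the over-provisioned instances for you. A minimal sketch using the AWS CLI, assuming Compute Optimizer is already opted in for the account (the filter value follows Compute Optimizer's finding names, and the `--query` projection is just illustrative):

```bash
# List instances Compute Optimizer flags as over-provisioned, with current and recommended types
aws compute-optimizer get-ec2-instance-recommendations \
  --filters name=Finding,values=Overprovisioned \
  --query 'instanceRecommendations[].[instanceArn, currentInstanceType, recommendationOptions[0].instanceType]' \
  --output table
```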





Reserved Instances and Savings Plans





Commit to usage in exchange for discounts:





| Option | Discount vs. On-Demand | Commitment |
|--------|------------------------|------------|
| AWS Reserved Instances | 40-60% | 1 or 3 years |
| AWS Savings Plans | 40-60% | 1 or 3 years ($/hour) |
| GCP Committed Use | 40-57% | 1 or 3 years |
| Azure Reserved | 40-60% | 1 or 3 years |





Start with 1-year commitments for baseline workloads (30-50% of your total compute). Use 3-year commitments for stable, predictable workloads. Combine Savings Plans with Spot Instances for maximum flexibility.
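
Cost Explorer can also propose a commitment size from recent usage before you buy anything. A hedged sketch with the AWS CLI; the term, payment option, and lookback window are example values, not recommendations:

```bash
# Ask Cost Explorer for a Compute Savings Plan recommendation based on the last 30 days
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type COMPUTE_SP \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT \
  --lookback-period-in-days THIRTY_DAYS
```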





Spot and Preemptible Instances





Use Spot Instances (AWS), Spot or preemptible VMs (GCP), or Spot VMs (Azure, formerly low-priority VMs) for fault-tolerant workloads:






```bash
# AWS: mix Spot capacity into an Auto Scaling group (names, sizes, and subnets are placeholders)
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name workers \
  --min-size 1 --max-size 10 \
  --vpc-zone-identifier "subnet-aaaa,subnet-bbbb" \
  --mixed-instances-policy file://spot-policy.json

# GCP: create a preemptible VM (zone is a placeholder)
gcloud compute instances create worker \
  --zone us-central1-a \
  --preemptible
```







Ideal workloads: batch processing, CI/CD runners, stateless web workers, data analytics, rendering.





Savings are typically 60-90% compared to on-demand pricing. Pair spot capacity with interruption handling (checkpointing and graceful shutdown) so reclaimed instances don't lose work.
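
On AWS, the two-minute interruption notice is exposed through instance metadata, so graceful shutdown is straightforward to script. A minimal sketch using IMDSv2; the `worker.service` unit at the end is a placeholder for whatever drain logic your workload needs:

```bash
#!/usr/bin/env bash
# Poll EC2 instance metadata for a Spot interruption notice (IMDSv2).
# AWS publishes the notice roughly two minutes before reclaiming the instance.
while true; do
  TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
  if curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" \
      "http://169.254.169.254/latest/meta-data/spot/instance-action" > /dev/null; then
    echo "Spot interruption notice received; draining"
    # Placeholder: checkpoint state, stop accepting jobs, deregister from the load balancer
    systemctl stop worker.service
    break
  fi
  sleep 5
done
```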





Storage Optimization





Storage costs accumulate silently. Audit your storage regularly:




* **Delete unused volumes and snapshots**: Unattached EBS volumes and snapshots of long-deleted volumes accumulate charges indefinitely.

* **Use lifecycle policies**: Move infrequently accessed data to colder tiers.

* **Object storage tiers**:




| Tier | Cost/GB/Month (us-east-1) | Use Case |
|------|---------------------------|----------|
| S3 Standard | $0.023 | Active data |
| S3 Standard-IA | $0.0125 | Accessed monthly |
| S3 Glacier Flexible Retrieval | $0.0036 | Archived data |
| S3 Glacier Deep Archive | $0.00099 | Regulatory retention |





Set S3 Lifecycle rules to transition objects automatically:






```json
{
  "Rules": [
    {
      "ID": "MoveToIA",
      "Filter": {"Prefix": "logs/"},
      "Status": "Enabled",
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 365}
    }
  ]
}
```
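
Once the rules are saved locally (e.g. as `lifecycle.json`), they can be attached to a bucket with one call; the bucket name below is a placeholder:

```bash
# Attach the lifecycle configuration above to a bucket
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-log-bucket \
  --lifecycle-configuration file://lifecycle.json
```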







Network Egress Costs





Data transfer out of cloud providers is expensive. Minimize egress:




* **Use the same region**: Keep services that communicate frequently in the same region. Cross-region traffic is billed.

* **CloudFront/CDN**: Serve static assets through a CDN. CloudFront data transfer to the internet is cheaper than S3 direct access.

* **Leverage direct connect**: For large data transfers, use AWS Direct Connect or equivalent.

* **NAT Gateway costs**: For high-volume traffic, self-managed NAT instances can cut per-GB processing charges by 70-80% compared with NAT Gateway, at the cost of operating them yourself.
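
Before optimizing, it helps to see which line items the egress dollars actually map to. A sketch using Cost Explorer from the CLI; the dates are placeholders, and grouping by usage type surfaces the `DataTransfer-Out-Bytes`-style items:

```bash
# Break last month's spend down by usage type to spot data-transfer charges
aws ce get-cost-and-usage \
  --time-period Start=2024-05-01,End=2024-06-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=USAGE_TYPE
```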




Autoscaling





Scale resources to match demand:






```bash
# AWS Auto Scaling: target-tracking policy that keeps average CPU near 60%
# (the group name "workers" is a placeholder)
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name workers \
  --policy-name cpu-target-60 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
    "TargetValue": 60.0
  }'
```







For containerized workloads, combine pod-level autoscaling with a node autoscaler such as Kubernetes Cluster Autoscaler or Karpenter:






```bash
# Pod-level: a Horizontal Pod Autoscaler keeps replicas matched to load;
# Cluster Autoscaler or Karpenter then adds and removes nodes to fit the scheduled pods
kubectl autoscale deployment api-server --cpu-percent=60 --min=2 --max=10
```







Karpenter and similar tools scale nodes based on actual pod resource requests, eliminating node-level waste.
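
For illustration, a minimal NodePool sketch, assuming Karpenter v1 on EKS with an `EC2NodeClass` named `default` already in place (the pool name, capacity types, and CPU limit are placeholder values); the `disruption` block is what consolidates away underutilized nodes:

```bash
# Hypothetical Karpenter NodePool: allows Spot and on-demand capacity, consolidates idle nodes
cat <<'EOF' | kubectl apply -f -
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  limits:
    cpu: "200"
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
EOF
```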





Database Cost Optimization





Databases are often the most expensive service:




* **Serverless databases**: Use Aurora Serverless, Cloud SQL auto-scaling, or Azure SQL Serverless for variable workloads (see the sketch after this list).

* **Read replicas**: Add replicas for read-heavy workloads instead of upscaling a single instance.

* **Connection pooling**: Use PgBouncer or RDS Proxy to handle thousands of connections without provisioning for peak.

* **Delete old data**: Archive historical data to object storage.
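
As a concrete example of the serverless option above, a hedged sketch that creates an Aurora Serverless v2 PostgreSQL cluster with the AWS CLI; the identifiers, engine version, and capacity range are placeholder values:

```bash
# Aurora Serverless v2: capacity scales between 0.5 and 8 ACUs with demand
aws rds create-db-cluster \
  --db-cluster-identifier app-db \
  --engine aurora-postgresql \
  --engine-version 15.4 \
  --master-username appadmin \
  --manage-master-user-password \
  --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=8

# The cluster still needs at least one instance that uses the serverless instance class
aws rds create-db-instance \
  --db-instance-identifier app-db-writer \
  --db-cluster-identifier app-db \
  --db-instance-class db.serverless \
  --engine aurora-postgresql
```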




Reserved Capacity with Spot





Combine reserved capacity for baseline with spot for spikes:






```hcl
# Terraform: Mix OD and Spot in ASG
resource "aws_autoscaling_group" "app" {
  mixed_instances_policy {
    launch_template {
      launch_template_specification { ... }
      override {
        instance_type = "t3.medium"
      }
    }
    instances_distribution {
      on_demand_base_capacity                  = 2
      on_demand_percentage_above_base_capacity = 50 # Rest from Spot
      spot_allocation_strategy                 = "capacity-optimized"
    }
  }
}
```







Monitoring and Budgets





Set up cost monitoring to catch anomalies early:






```bash
# AWS budget with alert notifications
# (the account ID is a placeholder; the budget name "Monthly-Infra" is set inside monthly-budget.json)
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://monthly-budget.json \
  --notifications-with-subscribers file://alert-config.json

# GCP budget alert (the Pub/Sub topic path, including "my-project", is a placeholder)
gcloud billing budgets create \
  --billing-account=XXXXXX \
  --display-name="Monthly Budget" \
  --budget-amount=5000USD \
  --threshold-rule=percent=0.5 \
  --threshold-rule=percent=0.9 \
  --all-updates-rule-pubsub-topic=projects/my-project/topics/budget-alerts
```







Set up alerts at 50%, 80%, and 90% of budget. Tag resources by cost center and review spending weekly.
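
One way to do that weekly review from the CLI, assuming a `cost-center` cost allocation tag has been activated in the billing console (the dates are placeholders):

```bash
# Last month's spend broken down by the cost-center tag
aws ce get-cost-and-usage \
  --time-period Start=2024-05-01,End=2024-06-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=TAG,Key=cost-center
```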





Summary





Cloud cost optimization is an ongoing process, not a one-time cleanup. Start with right-sizing (the quickest wins), add reserved capacity for baseline workloads, use spot instances for fault-tolerant work, and configure autoscaling to match demand. Monitor storage lifecycle, minimize data egress, and optimize database tiering. The most effective approach is a recurring, roughly 20%-time investment each quarter: review spending, tag resources, and implement the top three savings opportunities. Most organizations can reduce their cloud bill by 30-50% within three months by following these practices.