Cloud costs are often the second-largest expense after payroll for SaaS companies. Without active management, spending grows faster than revenue. This guide covers practical cost optimization strategies that can reduce bills by 30-50% without sacrificing performance.


Right-Sizing Instances


The most common waste is over-provisioned resources. Use cloud provider tools to analyze utilization:


  • **AWS Compute Optimizer**: Analyzes CPU, memory, and network utilization to recommend instance types.
  • **GCP Rightsizing Recommendations**: Built into the Compute Engine console.
  • **Azure Advisor**: Provides cost recommendations across all services.

Target utilization rules of thumb:


| Resource | Target Utilization |
|----------|-------------------|
| CPU | 40-70% average |
| Memory | 60-80% average |
| Disk IOPS | Below 80% of provisioned |


Downsize instances that consistently run below 20% utilization. For variable workloads, consider scaling horizontally rather than vertically.
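As a sketch of how that 20% rule can be automated: the check below flags downsize candidates from average CPU utilization. In practice the samples would come from CloudWatch or the provider's monitoring API; the data here is invented.

```python
# Hypothetical right-sizing check: flag instances whose average CPU
# utilization sits below the downsize threshold. Sample data is made up.
def downsize_candidates(utilization, threshold=20.0):
    """Return IDs of instances averaging below `threshold` percent CPU."""
    return sorted(
        instance_id
        for instance_id, samples in utilization.items()
        if samples and sum(samples) / len(samples) < threshold
    )

samples = {
    "web-1": [12.0, 9.5, 14.2],   # consistently idle: downsize candidate
    "web-2": [55.0, 62.0, 48.0],  # inside the 40-70% target band: keep
}
print(downsize_candidates(samples))  # ['web-1']
```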


Reserved Instances and Savings Plans


Commit to usage in exchange for discounts:


| Option | Discount | Commitment |
|--------|----------|------------|
| AWS Reserved Instances | 40-60% | 1 or 3 years |
| AWS Savings Plans | 40-60% | 1 or 3 years ($/hour) |
| GCP Committed Use | 40-57% | 1 or 3 years |
| Azure Reserved | 40-60% | 1 or 3 years |


Start with 1-year commitments for baseline workloads (30-50% of your total compute). Use 3-year commitments for stable, predictable workloads. Combine Savings Plans with Spot Instances for maximum flexibility.
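To make the trade-off concrete, a back-of-envelope calculation using the discount range from the table above (the hourly rate is illustrative, not a quote):

```python
# Annual savings from committing a baseline instance to a reserved rate.
# The $0.10/hour price and 40% discount are illustrative assumptions.
def annual_savings(on_demand_hourly, discount, hours=8760):
    """Difference between a year of on-demand and a year at the discounted rate."""
    reserved_hourly = on_demand_hourly * (1 - discount)
    return (on_demand_hourly - reserved_hourly) * hours

print(round(annual_savings(0.10, 0.40), 2))  # 350.4 dollars per instance per year
```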


Spot and Preemptible Instances


Use Spot Instances (AWS), Spot VMs (GCP, formerly preemptible), or Spot Virtual Machines (Azure, formerly low-priority) for fault-tolerant workloads:


    
```shell
# AWS: request Spot capacity through an Auto Scaling group
# (group name and sizes are placeholders)
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name workers \
  --min-size 0 --max-size 10 \
  --mixed-instances-policy file://spot-policy.json

# GCP: create a preemptible VM (zone is a placeholder)
gcloud compute instances create worker \
  --zone=us-central1-a \
  --preemptible
```
    
    

Ideal workloads: batch processing, CI/CD runners, stateless web workers, data analytics, rendering.


Savings: 60-90% compared to on-demand pricing. Plan for interruptions with checkpointing and graceful shutdown.
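One way to sketch that interruption handling: a worker that checkpoints when it receives SIGTERM, the signal typically delivered during a spot reclaim. The local file here stands in for durable storage such as S3.

```python
import json
import signal

class CheckpointingWorker:
    """Toy spot worker: stops after the current item on SIGTERM, then checkpoints."""

    def __init__(self, path="checkpoint.json"):
        self.path = path
        self.position = 0
        self.interrupted = False
        signal.signal(signal.SIGTERM, self._on_interrupt)

    def _on_interrupt(self, signum, frame):
        self.interrupted = True  # finish the current item, then stop

    def checkpoint(self):
        with open(self.path, "w") as f:
            json.dump({"position": self.position}, f)

    def run(self, items):
        for item in items:
            if self.interrupted:
                break
            self.position += 1  # real work would happen here
        self.checkpoint()        # always persist progress on exit
```

A replacement instance would read the checkpoint and resume from `position` instead of starting over.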


Storage Optimization


Storage costs accumulate silently. Audit your storage regularly:


  • **Delete unused volumes**: Remove unattached EBS volumes and snapshots of volumes that no longer exist.
  • **Use lifecycle policies**: Move infrequently accessed data to colder tiers.
  • **Object storage tiers** (approximate S3 prices, us-east-1):

| Tier | Cost/GB/Month | Use Case |
|------|--------------|----------|
| S3 Standard | $0.023 | Active data |
| S3 Infrequent Access | $0.0125 | Accessed monthly |
| S3 Glacier | $0.0036 | Archived data |
| S3 Deep Archive | $0.001 | Regulatory retention |
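The per-tier prices above translate into large absolute differences at scale. A quick sketch, counting storage cost only (retrieval and transition request fees are ignored):

```python
# Monthly storage cost of 1 TB (1024 GB) at each tier from the table above.
TIERS = {
    "Standard": 0.023,
    "Infrequent Access": 0.0125,
    "Glacier": 0.0036,
    "Deep Archive": 0.001,
}

def monthly_cost(gb, price_per_gb):
    return round(gb * price_per_gb, 2)

for tier, price in TIERS.items():
    print(f"{tier}: ${monthly_cost(1024, price)}")
# Standard comes to $23.55/month vs $1.02 for Deep Archive -- roughly 23x
```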


Set S3 Lifecycle rules to transition objects automatically:


    
```json
{
  "Rules": [
    {
      "Id": "MoveToIA",
      "Filter": {"Prefix": "logs/"},
      "Status": "Enabled",
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"}
      ],
      "Expiration": {"Days": 365}
    }
  ]
}
```
    
    

Network Egress Costs


Data transfer out of cloud providers is expensive. Minimize egress:


  • **Use the same region**: Keep services that communicate frequently in the same region. Cross-region traffic is billed.
  • **CloudFront/CDN**: Serve static assets through a CDN. CloudFront data transfer to the internet is cheaper than S3 direct access.
  • **Use Direct Connect**: For sustained large transfers, AWS Direct Connect or an equivalent dedicated link offers lower per-GB rates than internet egress.
  • **NAT Gateway costs**: Use NAT instances instead of NAT Gateway for high-volume traffic (cost savings of 70-80%).
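The NAT Gateway point is driven by its per-GB data processing fee. A rough comparison, using approximate us-east-1 prices that should be re-checked against current pricing (the t3.medium rate is an illustrative stand-in):

```python
# NAT Gateway: hourly charge plus a per-GB data processing fee.
# NAT instance: just the instance's hourly cost. Standard data transfer
# charges apply to both and are omitted here. Prices are approximate.
def nat_gateway_monthly(gb, hourly=0.045, per_gb=0.045, hours=730):
    return round(hourly * hours + per_gb * gb, 2)

def nat_instance_monthly(hourly=0.0416, hours=730):  # e.g. a t3.medium
    return round(hourly * hours, 2)

print(nat_gateway_monthly(10_000))  # 482.85 for 10 TB/month
print(nat_instance_monthly())       # 30.37
```

At high traffic volumes the per-GB fee dominates, which is why the savings from self-managed NAT grow with throughput.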

Autoscaling


Scale resources to match demand:


    
```shell
# AWS target-tracking scaling policy: keep average CPU near 60%
# (group and policy names are placeholders)
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name workers \
  --policy-name cpu-target-60 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0
  }'
```
    
    

For containerized workloads, use the Kubernetes Cluster Autoscaler or Karpenter:


    
```shell
# Karpenter on AWS EKS, installed via Helm (cluster-specific IAM and
# settings omitted); it then adds and removes nodes automatically to
# fit pending pods -- no manual scaling commands needed
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace
```
    
    

Karpenter and similar tools scale nodes based on actual pod resource requests, eliminating node-level waste.


Database Cost Optimization


Databases are often the most expensive service:


  • **Serverless databases**: Use Aurora Serverless, Cloud SQL auto-scaling, or Azure SQL Serverless for variable workloads.
  • **Read replicas**: Add replicas for read-heavy workloads instead of upscaling a single instance.
  • **Connection pooling**: Use PgBouncer or RDS Proxy to handle thousands of connections without provisioning for peak.
  • **Delete old data**: Archive historical data to object storage.
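The archive-then-delete pattern from the last bullet can be sketched with SQLite standing in for the production database and a local JSON-lines file standing in for object storage (in production, the write would be an S3/GCS upload):

```python
import json
import sqlite3

def archive_old_rows(conn, cutoff_day, archive_path):
    """Copy rows older than cutoff_day to a JSON-lines file, then delete them."""
    rows = conn.execute(
        "SELECT id, day, payload FROM events WHERE day < ?", (cutoff_day,)
    ).fetchall()
    with open(archive_path, "a") as f:
        for row_id, day, payload in rows:
            f.write(json.dumps({"id": row_id, "day": day, "payload": payload}) + "\n")
    conn.execute("DELETE FROM events WHERE day < ?", (cutoff_day,))
    conn.commit()
    return len(rows)

# Demo: three rows, archive everything before day 100
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, day INTEGER, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)",
                 [(1, 10, "a"), (2, 50, "b"), (3, 200, "c")])
print(archive_old_rows(conn, 100, "archive.jsonl"))  # 2
```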

Reserved Capacity with Spot


Combine reserved capacity for baseline with spot for spikes:


    
```hcl
# Terraform: mix On-Demand and Spot in one Auto Scaling group
resource "aws_autoscaling_group" "app" {
  mixed_instances_policy {
    launch_template {
      launch_template_specification { ... }
      override {
        instance_type = "t3.medium"
      }
    }
    instances_distribution {
      on_demand_base_capacity                  = 2
      on_demand_percentage_above_base_capacity = 50  # rest from Spot
      spot_allocation_strategy                 = "capacity-optimized"
    }
  }
}
```
    
    

Monitoring and Budgets


Set up cost monitoring to catch anomalies early:


    
```shell
# AWS budget with alert subscribers (account ID is a placeholder;
# the budget name and amount are defined inside monthly-budget.json)
aws budgets create-budget \
  --account-id 111122223333 \
  --budget file://monthly-budget.json \
  --notifications-with-subscribers file://alert-config.json

# GCP budget alert (thresholds are fractions; repeat --threshold-rule
# per threshold; the Pub/Sub topic project is a placeholder)
gcloud billing budgets create \
  --billing-account=XXXXXX \
  --display-name="Monthly Budget" \
  --budget-amount=5000USD \
  --threshold-rule=percent=0.5 \
  --threshold-rule=percent=0.8 \
  --threshold-rule=percent=0.9 \
  --notifications-rule-pubsub-topic=projects/my-project/topics/budget-alerts
```
    
    

Set up alerts at 50%, 80%, and 90% of budget. Tag resources by cost center and review spending weekly.
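The threshold logic behind those alerts is simple enough to sketch (the spend and budget figures are illustrative):

```python
# Which budget thresholds has month-to-date spend crossed?
def crossed_thresholds(spend, budget, thresholds=(0.5, 0.8, 0.9)):
    fraction = spend / budget
    return [t for t in thresholds if fraction >= t]

print(crossed_thresholds(4200, 5000))  # [0.5, 0.8]: the 90% alert has not fired
```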


Summary


Cloud cost optimization is an ongoing process, not a one-time cleanup. Start with right-sizing (the quickest wins), add reserved capacity for baseline workloads, use spot instances for fault-tolerant work, and configure autoscaling to match demand. Monitor storage lifecycle, minimize data egress, and optimize database tiering. The most effective approach is a 20% time investment per quarter: review spending, tag resources, and implement the top three savings opportunities. Most organizations can reduce their cloud bill by 30-50% within three months by following these practices.