Minimizing EC2 Costs

Katherine Daniels

As a growing startup, it’s important for us to keep our costs as low as possible. Recently, we came to the conclusion that our Amazon Web Services costs were higher than we thought they should be.

Digging into things a bit further, we noticed that some EC2 instances were up and running (and costing us money) but weren’t being used. Some of them had our code running on them, but weren’t configured behind the Elastic Load Balancers that would enable them to actually get traffic and be useful. Others didn’t have our code and weren’t even in our configuration management system, Chef. Servers like these should either be stopped or configured properly, so that we aren’t throwing away money on instances that provide no value.

This could have been done by hand, but that would have been tedious, time-consuming, and not easily repeatable. I’m a strong proponent of the theory that nothing should be done by humans that could be more effectively done by computers, so I wrote a script to do this for us. For each of our running EC2 instances, it checks that the instance is registered in Chef and that it is configured behind the correct load balancer.
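The two checks boil down to set differences. Here is a minimal sketch, assuming the three lists of instance IDs have already been fetched from EC2, Chef, and the load balancers; the function name `audit_instances` and the sample IDs are made up for illustration and are not taken from the actual script. It also ignores environments for simplicity.

```python
def audit_instances(running_ids, chef_ids, load_balanced_ids):
    """Report running instances that fail either check.

    All three arguments are iterables of EC2 instance ID strings.
    """
    running = set(running_ids)
    # Running, but unknown to configuration management.
    not_in_chef = running - set(chef_ids)
    # Running, but not registered with any load balancer.
    not_behind_elb = running - set(load_balanced_ids)
    return {
        "not_in_chef": sorted(not_in_chef),
        "not_behind_elb": sorted(not_behind_elb),
    }


# Hypothetical sample data, for illustration only.
report = audit_instances(
    running_ids=["i-aaa", "i-bbb", "i-ccc"],
    chef_ids=["i-aaa", "i-bbb"],
    load_balanced_ids=["i-aaa"],
)
print(report)
```

Any instance that shows up in either list is a candidate to be stopped or configured properly.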

While this was relatively straightforward, it was complicated a bit by the fact that we have three different environments: production, pre-production, and staging. So instead of simply checking ‘is this web server behind the web load balancer’, the script has to be aware of which environment the server is configured in as well. Unfortunately, it’s not possible to ask a server whether it’s behind a load balancer, because the server instances don’t know that. Instead, the script has to query each of the load balancers for all the instances that are behind it. And because Python, I used some fancy comprehensions.

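# ELB_NAMES maps environment -> role name -> actual load balancer name.
# ebs_conn is the load balancer API connection (its setup is not shown here).
# Result: {env: {role name: load balancer object}}
elbs = {
    env: {
        name: ebs_conn.get_all_load_balancers(load_balancer_names=[ELB_NAMES[env][name]])[0]
        for name in ELB_NAMES[env]
    } for env in ENVIRONMENTS
}
# Result: {env: {role name: [IDs of the instances behind that load balancer]}}
elb_instances = {
    env: {
        elb: [instance_info.id for instance_info in elbs[env][elb].instances]
        for elb in elbs[env]
    } for env in ENVIRONMENTS
}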
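Since a server can’t say which load balancer it sits behind, one way to answer that question is to invert a mapping shaped like `elb_instances`. The sketch below assumes that `{environment: {load balancer: [instance IDs]}}` shape; the helper name `index_by_instance` and the sample data are hypothetical.

```python
def index_by_instance(elb_instances):
    """Invert {env: {elb: [instance_id, ...]}} into
    {instance_id: [(env, elb), ...]}."""
    index = {}
    for env, elbs_in_env in elb_instances.items():
        for elb, instance_ids in elbs_in_env.items():
            for instance_id in instance_ids:
                index.setdefault(instance_id, []).append((env, elb))
    return index


# Hypothetical sample data, for illustration only.
sample = {
    "production": {"web": ["i-aaa", "i-bbb"]},
    "staging": {"web": ["i-ccc"]},
}
index = index_by_instance(sample)
print(index.get("i-ccc", []))  # where i-ccc is registered
print(index.get("i-zzz", []))  # empty list: not behind any load balancer
```

An instance with an empty result isn’t receiving traffic from any load balancer, and an instance whose environment doesn’t match the one it was configured for is behind the wrong one.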

Two statements are all it takes to get an organized set of all the instances behind all the load balancers for all the environments, and now we have an automatic check in place to make sure we’re not spending money on servers that aren’t being used.