We do a good deal of cacheing on our web properties here at Cooper-Hewitt.
Our web host, PHPFog adds a layer of cacheing for free known as Varnish cache. Varnish Cache sits in front of our web servers and performs what is known as reverse proxy cacheing. This type of cacheing is incredibly important as it adds the ability to quickly serve cached files to users on the Internet vs. continually recreating dynamic web-pages by making calls into the database.
For static assets such as images, javascripts, and css files, we turn to Amazon’s CloudFront CDN. This type of technology ( which I’ve mentioned in a number of other posts here ) places these static assets on a distributed network of “edge” locations around the world, allowing quicker access to these assets geographically speaking, and as well, it removes a good deal of burden from our application servers.
However, to go a bit further, we thought of utilizing memcache. Memcache is an in-memory database key-value type cacheing application. It helps to speed up calls to the database by storing as much of that information in memory as possible. This has been proven to be extremely effective across many gigantic, database intensive websites like Facebook, Twitter, Tumblr, and Pinterest ( to name just a few ). Check this interesting post on scaling memcached at Facebook.
To get started with memcache I turned to Amazon’s Elasticache offering. Elasticache is essentially a managed memcache server. It allows you to spin up a memcache cluster in a minute or two, and is super easy to use. In fact, you could easily provision a terabyte of memcache in the same amount of time. There is no installation, configuration or maintenance to worry about. Once your memcache cluster is up and running you can easily add or remove nodes, scaling as your needs change on a nearly real-time basis.
Check this video for a more in-depth explanation.
Elasticache also works very nicely with our servers at PHPFog as they are all built on Amazon EC2, and are in fact in the same data center. To get the whole thing working with our www.cooperhewitt.org blog, I had to do the following.
- Create a security group. In order for PHPFog to talk to your own Elasticache cluster, you have to create a security group that contains PHPFog’s AWS ID. There is documentation on the PHPFog website on how to do this for use with an Amazon RDS server, and the same steps apply for Elasticache.
- Provision an Elasticache cluster. I chose to start with a single node, m1.large instance which gives me about 7.5 Gig of RAM to work with at $0.36 an hour per node. I can always add more nodes in the future if I want, and I can even roll down to a smaller instance size by simply creating another cluster.
- Let things simmer for a minute. It takes a minute or two for your cluster to initialize.
- On WordPress install the W3TC plugin. This plugin allows you to connect up your Elasticache server, and as well offers tons of configurable options for use with things like a CloudFront CDN and more. Its a must have! If you are on Drupal or some other CMS. there are similar modules that achieve the same result.
- In W3TC enable whatever types of cacheing you wish to do and set the cache type to memcache. In my case, I chose page cache, minify cache, database cache, and object cache, all of which work with memcache. Additionally I set up our CloudFront CDN from within this same plugin.
- In each cache types config page, set your memcache endpoint to the one given by your AWS control panel. If you have multiple nodes, you will have to copy and paste them all into each of these spaces. There is a test button you can hit to make sure your installation is communicating with your memcache server.
That last bit is interesting. You can have multiple clusters with multiple nodes serving as cache servers for a number of different purposes. You can also use the same cache cluster for multiple sites, so long as they are all reachable via your security group settings.
Once everything is configured and working you can log out and let the cacheing being. It helps to click through the site to allow the cache to build up, but this will happen automatically if your site gets a decent amount of traffic. In the AWS control panel you can check on your cache cluster in the CloudWatch tab where you can keep track of how much memory and cpu is being utilized at any given time. You can also set up alerts so that if you run out of cache, you get notified so you can easily add some nodes.
We hope to employ this same cacheing cluster on our main cooperhewitt.org website, as well as a number of our other web properties in the near future.
I want before and after stats! How many requests / second with everything cached like that? Do you guys have everything in the cloud now? Jealous. I spent today replace a motherboard… 🙁
Nate,
Yeah, I was thinking about doing a little study. We are currently only using it on this labs blog ( WordPress ) and another WP site which is still in development. I hope to roll this out on our Drupal CMS for the main site when a few things fall in to place in the next few weeks. New Relic charts might offer the best view of a before and after… they even report the avg. response time of memcache itself, which is interesting to look at.
BTW – Nate, you can see the memcache response time for the labs blog at https://www.cooperhewitt.org/performance/ — click the icons for each item in the key to filter out what you want to look at. I think its just the last three hours…
Micah,
Great to see you using ElastiCache. I’m a member of the ElastiCache team at AWS and I’d love to chat with you about your experience. Please email me if you’re open to the idea.
Cheers,
Rahul
Rahul,
Feel free to reach me at walterm@si.edu.
Micah
Thanks Micah for the post. I have published a blog on Caching Architectures using Amazon ElastiCache for the AWS community. Can be found at : http://harish11g.blogspot.in/2012/11/amazon-elasticache-memcached-ec2.html