Learning from Netflix's move to Cloud Computing

If you follow the cloud computing world, you may have noticed a few articles a while back highlighting Netflix's decision to move to Amazon Web Services. There wasn't much detail at the time on exactly what was moving to Amazon's cloud, nor on when and how the move would be made. What was clear, however, was that it was a cost-cutting move for Netflix compared with building out new data centers to hold its infrastructure. A recent interview with Adrian Cockcroft, Netflix's Cloud Architect, fills in a little more of the picture.

According to Cockcroft, "Encoding movies for streaming, log analysis, production web site and API, most everything that scales with customers and streaming usage" is in the Amazon Cloud. Keep in mind the key phrase "most everything that scales with customers and streaming usage." The elastic scalability of the cloud clearly fits usage scenarios in which demand fluctuates. Being able to ramp up resources when needed and shut them off when they are no longer needed is of immense value in a utility computing context where billing is by usage. Compare this to the traditional model of building out data centers and trying to estimate peak capacity so that you meet demand without over-provisioning. The cloud just makes sense.
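To make the peak-provisioning argument concrete, here is a minimal sketch comparing the two models over one day of fluctuating demand. The demand curve and hourly rate are invented for illustration; they are not Netflix or AWS figures.

```python
# Illustrative only: compares peak-provisioned capacity against
# pay-per-use elastic capacity for a fluctuating demand curve.
# The demand numbers and hourly rate are made up for this example.

HOURLY_RATE = 0.10  # assumed cost per server-hour, same for both models

# Hypothetical servers needed in each hour of a day (peaks in the evening)
demand = [10, 8, 6, 5, 5, 6, 10, 20, 35, 40, 45, 50,
          55, 60, 70, 85, 95, 100, 90, 70, 50, 35, 20, 12]

# Traditional model: provision for the highest hour, pay for it all day
fixed_cost = max(demand) * len(demand) * HOURLY_RATE

# Elastic model: pay only for the servers actually running each hour
elastic_cost = sum(demand) * HOURLY_RATE

print(f"fixed:   ${fixed_cost:.2f}")
print(f"elastic: ${elastic_cost:.2f}")
print(f"savings: {1 - elastic_cost / fixed_cost:.0%}")
```

The spikier the demand curve, the wider the gap between the two numbers, which is exactly the "scales with customers and streaming usage" workload Cockcroft describes.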

Beyond Amazon, Netflix uses other cloud services for content delivery. "We use Akamai, Limelight and Level3 CDNs for streaming the movies, which is a cloud based service. There is an AWS CDN service, but they aren’t a big enough player in this space at this point" says Cockcroft. I suspect this will change over time as Amazon achieves economies of scale on the CDN side. It would be great to see Amazon join the list of large CDNs, but it is still wise to spread the load over several CDNs for obvious reasons.
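Spreading load across CDNs might look something like the sketch below: weighted-random selection among whichever providers are currently healthy. The hostnames and weights are invented for illustration, not Netflix's actual configuration.

```python
import random

# Hypothetical CDN endpoints and traffic weights -- the hostnames and
# weights are invented for illustration, not Netflix's actual setup.
CDNS = [("akamai.example.net", 0.5),
        ("limelight.example.net", 0.3),
        ("level3.example.net", 0.2)]

def pick_cdn(healthy, rng=random):
    """Weighted-random choice among the CDNs currently marked healthy."""
    candidates = [(host, w) for host, w in CDNS if host in healthy]
    if not candidates:
        raise RuntimeError("no healthy CDN available")
    hosts, weights = zip(*candidates)
    return rng.choices(hosts, weights=weights, k=1)[0]

# If one provider has an outage, traffic simply shifts to the others.
healthy = {"limelight.example.net", "level3.example.net"}
print(pick_cdn(healthy))
```

The "obvious reasons" for multi-CDN fall out of the design: no single provider outage can take streaming down, and the weights give you a lever for cost and performance negotiation.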

One area where Cockcroft made what I consider a huge point is private cloud. Keep in mind that Oracle is a huge proponent of private cloud. Netflix was running Oracle on IBM hardware, which is very expensive. Couple that with the recent exodus of Fortune 500 companies from Oracle and Sun products, and you have even more reason to get off the Oracle platform. Not only did AWS look cheaper in comparison when considering hardware and software costs, but "AWS costs are fully burdened, and we could not have hired enough SAs and DBAs to build out our own datacenter this fast. We have added 4-5x as many systems in the cloud as the total we have in our datacenter over the last year" says Cockcroft. Furthermore, "costs are elastic, you start paying for a resource just before it goes live, and if you stop using a resource you stop paying for it. If you own a resource it sits around a long time waiting to be delivered and installed, and if you no longer want to use that type of resource you are still paying for it for three years. When Amazon cuts prices, your installed capacity gets cheaper. When they install new instance types you can be running on them in hours, technology refresh in real time."

The bottom line, and I agree, is that Cockcroft doesn't believe "private clouds can compete with public on price, however if you have a bunch of empty datacenter space or want to re-organize your internal systems to be automated and API driven then there are real cost savings to building your own private cloud. I think VMware and Microsoft are going to own the private cloud space, but Amazon is going to continue to disrupt both of them at a lower price point for public cloud."

I believe the most important point raised in the interview was how cloud will displace enterprise IT just as enterprise IT displaced the mainframe. I always point to Google when I discuss the ephemeral nature of hardware and services and how fault tolerance, redundancy, availability and load balancing must be handled higher up in the software architecture. To this, Cockcroft says "The key challenge is to get into the same mind-set as the Google’s of this world, the availability and robustness of your apps and services has to be designed into your software architecture, you have to assume that the hardware and underlying services are ephemeral, unreliable and may be broken or unavailable at any point, and that the other tenants in the multi-tenant public cloud will add random congestion and variance. In reality you always had this problem at scale, even with the most reliable hardware, so cloud ready architecture is about taking the patterns you have to use at large scale, and using them at a smaller scale to leverage the lowest cost infrastructure."
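That "assume everything is ephemeral" mind-set translates into patterns like bounded retries with graceful degradation. Here is a minimal sketch under those assumptions; `fetch_recommendations`, its failure rate, and the fallback response are all invented for illustration.

```python
import random
import time

def call_with_fallback(fn, fallback, attempts=3, backoff=0.01):
    """Try fn() a few times; if every attempt fails, degrade gracefully."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            time.sleep(backoff * (2 ** i))  # exponential backoff between tries
    return fallback  # serve a degraded but still valid response

def fetch_recommendations():
    # Simulates a service on an ephemeral instance that fails ~half the time
    if random.random() < 0.5:
        raise ConnectionError("instance went away")
    return ["personalized", "row", "of", "titles"]

result = call_with_fallback(fetch_recommendations, fallback=["popular", "titles"])
print(result)
```

The point of the pattern is that the caller never sees the failure: it gets either the personalized answer or a sensible default, which is how availability moves up into the software architecture rather than living in the hardware.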

If you are in need of consulting in the Cloud Computing or Virtualization space, drop me a line via the Contact page and I'll be happy to assist. I regularly travel around the US, Canada, Mexico and Western Europe for Architecture and Deployment of Cloud Computing and Virtualization projects.

You can read more of the interview at the Cloud Scaling Blog.
