Cloud Tech is the largest gathering of cloud technologists & engineers in the Bay Area. Our speakers include the top cloud computing entrepreneurs & experts.
Come join us Saturday, October 6th, from 9am to 6pm at the Computer History Museum in Mountain View, CA for a full 8 hours of learning directly from great minds sharing their secrets!
Special thanks to our sponsors who made this all possible. They are: CloudStack, Scalr, VMware, Rackspace, HP, DataStax, AWS, Canonical, Puppet, and General Catalyst.
At Facebook, we use various types of databases and storage systems to satisfy the needs of different applications. The solutions built around these data stores share a common set of requirements: they have to be highly scalable, maintenance costs should be low, and they have to perform efficiently. We use a sharded MySQL+memcache solution to support real-time access to tens of petabytes of data, and we use TAO to provide consistency of this web-scale database across geographical distances. We use the Haystack datastore to store the 3 billion new photos we host every week. We use Apache Hadoop to mine intelligence from 100 petabytes of click logs and combine it with the power of Apache HBase to store all Facebook Messages.
This talk describes why each of these databases is appropriate for its workload, and the design decisions and tradeoffs that were made while implementing these solutions. We touch upon the consistency, availability and partition tolerance of each of these solutions, and upon the reasons why some of these systems need ACID semantics while others do not. We also describe the evolution of our MySQL databases to a pure SSD-based deployment.