Big Data at CallFire
Vijesh Mehta (Co-Founder and CTO)
Agenda
• A little about CallFire
• CallFire’s technical challenges
• How CallFire deals with data
• Summary
Some background about myself
• I am one of the founders of CallFire.
  – Started in 2005 in a small apartment
  – Now 28 people
  – Bootstrapped and profitable
• I’ve been writing software primarily in the Java space for 12 years. CallFire is all Java.
  – We use: Wicket, Guice, Hibernate, MySQL, Cassandra, ActiveMQ, Xen, Puppet
About CallFire
• We are a cloud telephony provider.
  – Outbound phone calls
  – Phone numbers
  – SMS through long and short codes
  – IVR (Interactive Voice Response)
  – Power dialing
• CallFire’s call volume can get large very quickly.
  – Hurricane Sandy: 1.9 million emergency calls
• 4 engineers and 1 system administrator manage operations and new features.
• We hired 7 more engineers this year, and we are still hiring!
Technical Challenges by Numbers
• 1.4 billion calls and texts
  – Growing exponentially
• Over 50,000 accounts
• Over 6 million campaigns
• 80 million sound files
• 14 TB in storage (NFS)
• MySQL: over 10,000 qps at peak
Big data isn’t always a big-company problem!
Growing faster each day
[Chart: Campaigns over Time; vertical axis from 0 to 7,000,000 campaigns]
The first challenge
• Problem: We outgrew our datacenter. New systems need access to central storage. Replication runs across a 1 Gb/s interconnect.
• Needed solution:
  – Must work across datacenters
  – Must scale as demand increases
  – Must be fault tolerant
  – Must deal with over 80 million sound files
  – The cheaper the better
Solutions Considered (2010)

Criterion | NFS | GLUSTER | HDFS | CASSANDRA
Fault tolerant | Yes, if configured | Yes | Yes | Yes
Datacenter replication | Maybe. Rsync isn’t fun with lots of files. | Not at the time | Yes | Yes
Easy to add storage | No | Not at the time | Yes | Yes
No single point of failure | No | Yes | Not exactly, NameNode. | Yes
Data always accessible easily | No, hard to sort through file systems. | No, same as a file system | Yes | Yes
Notes | Not working for us. Too much management and downtime. | Looks good, tried it for a while. Easy at first because it was a file system. | Didn’t like the name node issue. May have been a good way to go. | Everything we need, quick to learn. We went all in!

* Only LAN solutions considered. Calls had too much latency in the cloud, or even across datacenters.
Cassandra
• Storage isn’t the best use of Cassandra.
• Do not exceed 50% of drive space.
  – Compaction needs the space. Hard lesson learned.
• Fault tolerance: replication factor of 3.
• Result
  – 1 TB of data = 6 TB of storage needed! (3 replicas, each on drives kept no more than 50% full)
  – CallFire has a 74 TB Cassandra cluster
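The 6x figure on this slide follows from two numbers: a replication factor of 3 and a 50% ceiling on drive usage. A minimal sketch of that arithmetic is shown here in Python for brevity. The function names are hypothetical and not taken from CallFire's code.

```python
def raw_storage_needed_tb(data_tb, replication_factor=3, max_fill=0.5):
    """Raw disk (in TB) needed to hold `data_tb` of unique data.

    Every byte is stored `replication_factor` times, and drives are only
    filled to `max_fill` so that compaction has room to work.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    return data_tb * replication_factor / max_fill


def usable_capacity_tb(raw_tb, replication_factor=3, max_fill=0.5):
    """Inverse calculation: unique data (in TB) that fits on `raw_tb` of disk."""
    return raw_tb * max_fill / replication_factor


if __name__ == "__main__":
    # The slide's example: 1 TB of data needs 6 TB of raw storage.
    print(raw_storage_needed_tb(1))
    # Assumption: if the 74 TB cluster figure is raw disk, this is the
    # unique data it could hold under the same two rules.
    print(round(usable_capacity_tb(74), 1))
```

Changing either input shifts the multiplier quickly. For example, a replication factor of 2 with the same 50% ceiling would need 4 TB of raw disk per TB of data.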
Extending the scope
• We like SQL and Hibernate.
  – Pros: Easy, flexible, ad-hoc queries, locks
  – Cons: Scaling
• Solution: Sharding, with Cassandra for universal data
[Diagram: Shard 1, Shard 2, and Shard 3 alongside a Cassandra Cluster]
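One way to picture this layout is a small router: account data lives on one of three SQL shards, and a universal store (Cassandra in the deck) holds the data every node needs in order to find the right shard. The sketch below is only an illustration in Python. The modulo placement rule, the record fields, and the in-memory dict standing in for Cassandra are all assumptions, since the slides do not say how accounts are mapped to shards.

```python
NUM_SHARDS = 3  # the diagram on the slide shows three shards

# Stand-in for the Cassandra cluster: universal data keyed by login name.
# A plain dict is used so that the sketch runs without a database.
universal_store = {}


def shard_for_account(account_id, num_shards=NUM_SHARDS):
    """Pick a shard (numbered from 1) for an account.

    Assumption: simple modulo placement, chosen only for illustration.
    """
    return account_id % num_shards + 1


def register(login, account_id, password_hash):
    """Record universal data for a user, including where their data lives."""
    universal_store[login] = {
        "account_id": account_id,
        "password_hash": password_hash,
        "shard": shard_for_account(account_id),
    }


def locate(login):
    """Return the shard number holding this user's data, or None if unknown."""
    record = universal_store.get(login)
    return record["shard"] if record else None


if __name__ == "__main__":
    register("alice", 7, "hypothetical-hash")
    print(locate("alice"))   # shard number for account 7
    print(locate("nobody"))  # unknown login
```

Because the lookup goes through the universal store rather than through any single shard, a request can be routed before any SQL connection is opened.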
Sharding + Big Data
• Cassandra makes sharding easier
  – Easy to store universal data (e.g., authentication).
  – Performs very well
• Tungsten Replicator (Big Data with SQL)
  – Sharding makes joins impossible, so fan your data into central places.
  – NoSQL can’t handle ad-hoc queries. No worries, you can still have SQL.
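The fan-in idea can be illustrated with a toy example: rows from several shard databases are copied into one central database, where a single SQL query can span all of them. This sketch uses Python's built-in `sqlite3` module with in-memory databases. It is not Tungsten Replicator's API, and it does a one-off copy where a replicator would stream changes continuously. The `calls` table and its columns are hypothetical.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory shard database holding (account_id, duration_sec) rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE calls (account_id INTEGER, duration_sec INTEGER)")
    db.executemany("INSERT INTO calls VALUES (?, ?)", rows)
    db.commit()
    return db


def fan_in(shards):
    """Copy every shard's rows into one central database, tagged by shard number."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE calls (shard INTEGER, account_id INTEGER, duration_sec INTEGER)"
    )
    for shard_no, db in enumerate(shards, start=1):
        rows = db.execute("SELECT account_id, duration_sec FROM calls").fetchall()
        central.executemany(
            "INSERT INTO calls VALUES (?, ?, ?)",
            [(shard_no, account_id, duration) for account_id, duration in rows],
        )
    central.commit()
    return central


if __name__ == "__main__":
    shards = [
        make_shard([(1, 30), (4, 45)]),
        make_shard([(2, 60)]),
        make_shard([(3, 15), (3, 20)]),
    ]
    central = fan_in(shards)
    # An ad-hoc query across all shards, which no single shard could answer.
    total_calls, total_seconds = central.execute(
        "SELECT COUNT(*), SUM(duration_sec) FROM calls"
    ).fetchone()
    print(total_calls, total_seconds)
```

Tagging each row with its source shard keeps the central copy traceable back to where the data originated.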
Big Data Summary
• Not just for big companies: data grows rapidly in today’s environment.
  – Nice article about Obama’s data crunchers:
  – http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/
• NoSQL systems have easier scaling and fault tolerance mechanisms.
  – Not uncommon to see small teams with 10-20 node clusters.
• SQL is still a big part of the equation. (Tungsten)
  – Fan in information across partitions
  – Replicate across datacenters
  – Keep your ad-hoc dreams alive!