Introduction to Big Data Infrastructure
Vancouver SMAC (Social, Mobile, Analytics & Cloud) Meetup
Oct 22, 2014 
Ganesh Swami 
www.silota.com
Hi 
• Programming professionally for 10+ years 
• x86 assembly, STL, boost, python-boost, python 
• Built emacs-wiki-blog: first blogging engine for Emacs!
What is Big Data? 
3V: Volume, Velocity, Variety
The Big Data Zoo 
Amazon Kinesis, Riak, Cassandra, Hive,
Apache Spark, Apache Hadoop, Pig, Apache Storm,
Kibana, Tableau, Apache Kafka,
Elasticsearch, Amazon EMR, Redshift,
DynamoDB, HBase
The Zoo Organized
Data → Ingest → Store → Process/Enrich → Visualize → Answers
Ingest    Store     Process/Enrich  Visualize
Kafka     S3        Hive/Pig/EMR    Tableau
Kinesis   DynamoDB  Spark           Kibana
Flume     HDFS      Storm
Scribe    Redshift
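The four stages above can be sketched in a few lines of Python. This is a toy illustration only: the function names, the event shape, and the in-memory queue and list are assumptions standing in for the real systems named in each column, not how any of those tools is actually driven.

```python
from collections import Counter, deque


def ingest(raw_events, queue):
    """Ingest: append raw events to a buffer (stand-in for Kafka/Kinesis)."""
    for event in raw_events:
        queue.append(event)


def store(queue, log):
    """Store: drain the buffer into a list (stand-in for S3/HDFS)."""
    while queue:
        log.append(queue.popleft())


def process(log):
    """Process/Enrich: count stored events by type (stand-in for Hive/Spark/Storm)."""
    return Counter(event["type"] for event in log)


def visualize(counts):
    """Visualize: render counts as text bars (stand-in for Kibana/Tableau)."""
    return [f"{name:<10} {'#' * n} {n}" for name, n in sorted(counts.items())]


# Hypothetical sample data flowing from "Data" to "Answers".
raw = [{"type": "click"}, {"type": "view"}, {"type": "click"}]
queue, log = deque(), []
ingest(raw, queue)
store(queue, log)
counts = process(log)
report = visualize(counts)
print("\n".join(report))
```

Each stage only depends on the output of the previous one, which is why tools in the same column can, in principle, be swapped for one another.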
Data Ingestion
Mobile Apps, Websites, Internet of Things → Ingest Layer
Elasticsearch 
Open-source search and analytics solution
Kibana
Amazon Redshift 
Petabyte-scale data warehouse solution
What is Silota?
• Building blocks of analytics
• A simple REST API
  • to ingest
  • to analyze
  • to export
• Based on Kafka, Storm and Elasticsearch
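To make the three verbs concrete, here is a minimal in-memory sketch. The class `EventStore`, its method signatures, and the event shape are all hypothetical and chosen for illustration; they are not Silota's actual API, which the slides do not show, and a real service would sit behind HTTP endpoints rather than a Python object.

```python
import json
from collections import defaultdict


class EventStore:
    """Hypothetical in-memory stand-in for an ingest/analyze/export API."""

    def __init__(self):
        self._events = []

    def ingest(self, name, properties):
        """Accept one event and keep a copy of its properties."""
        event = {"name": name, "properties": dict(properties)}
        self._events.append(event)
        return event

    def analyze(self, name, group_by):
        """Count events with the given name, grouped by one property."""
        counts = defaultdict(int)
        for event in self._events:
            if event["name"] == name:
                counts[event["properties"].get(group_by)] += 1
        return dict(counts)

    def export(self):
        """Dump every stored event as newline-delimited JSON."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._events)


events = EventStore()
events.ingest("signup", {"plan": "free"})
events.ingest("signup", {"plan": "pro"})
events.ingest("signup", {"plan": "free"})
events.ingest("login", {"plan": "pro"})

by_plan = events.analyze("signup", "plan")
exported = events.export()
print(by_plan)
```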
Silota vs. Mixpanel
• Mixpanel is for product people
  • great UI
  • cookie-cutter analysis for verticals (gaming, e-commerce)
• Silota is an API
  • more low-level, full-power
  • first-class API: responses, pagination, errors, etc.
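As a rough idea of what consistent responses, pagination, and errors can look like, the sketch below wraps a list in a response envelope. The function `paginate`, the field names (`status`, `data`, `pagination`, `error`), and the error code are assumptions made for this example; the slides do not specify Silota's response format.

```python
def paginate(items, page=1, per_page=2):
    """Return a response envelope for one page of items, or an error envelope."""
    if page < 1 or per_page < 1:
        # Errors use the same top-level shape so clients can branch on "status".
        return {
            "status": 400,
            "error": {
                "code": "invalid_parameter",
                "message": "page and per_page must be >= 1",
            },
        }

    total = len(items)
    start = (page - 1) * per_page
    has_more = start + per_page < total
    return {
        "status": 200,
        "data": items[start:start + per_page],
        "pagination": {
            "page": page,
            "per_page": per_page,
            "total": total,
            "next_page": page + 1 if has_more else None,
        },
    }


# Hypothetical usage: walk all pages until next_page is None.
rows = ["a", "b", "c", "d", "e"]
collected, page = [], 1
while page is not None:
    response = paginate(rows, page=page)
    collected.extend(response["data"])
    page = response["pagination"]["next_page"]
print(collected)
```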
Keep in Touch! 
Ganesh Swami 
ganesh@silota.com 
www.silota.com
