@yourFriendDhruv
For Startup Saturday : Jan 2016
Big Data
for Startups
– An introductory session about
What, Why, When & When Not of Big Data for Startups
Yes, you may not like it, but it's really my Twitter ID
Dhruv Gohil
@yourFriendDhruv
Welcome!
Why do you care to hear my opinion?
What is this “big data”?
Why should “startups” care about it?
When to “do big data”?
When “NOT to do big data”?
Seems too serious?
So, let's change the font!
Now, this is much better!
OK... Why do you care to hear this from me?
Meet me after the session, to compare favorites
OK... So what questions will I try to answer?
Big is not only ‘big’.
Why does a startup need 'Big data'?
What 'Big data' is NOT?
Fear of Big data? Kick it off!
Big Data for “small startups”?
Let me tell you a story...
http://en.wikipedia.org/wiki/Information_Management_System
If you were thinking about RDBMS just now... then
everything you have been taught in academia about
databases is ALL WRONG.
http://slideshot.epfl.ch/play/suri_stonebraker
Big Data is...
http://www.ibmbigdatahub.com/infographic/four-vs-big-
Big Data is not only ‘big’
Volume, Velocity, Variety
GB/TB vs PB/EB
Centralized vs Distributed
Structured vs Semi-Structured/Unstructured
Data Model vs Schema
Known relationships vs Flexible associations
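To make the GB/TB vs PB/EB line concrete, a back-of-envelope estimate can help. The sketch below is only an illustration: the customer count, events per day, and event size are made-up inputs, and the helper names `yearly_bytes` and `human` are hypothetical. It uses decimal units (1 GB = 10^9 bytes).

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def yearly_bytes(customers, events_per_customer_per_day, bytes_per_event, days=365):
    """Raw data produced in one year, ignoring indexes, replicas and compression."""
    return customers * events_per_customer_per_day * bytes_per_event * days


def human(n_bytes):
    """Format a byte count with decimal units (steps of 1000)."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


# Hypothetical startup: 1,000 customers, 100 events each per day, 1,000 bytes per event.
total = yearly_bytes(1000, 100, 1000)
print(human(total))
```

With these assumed inputs the yearly volume lands in the tens of gigabytes, which sits firmly on the GB/TB side of the comparison.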
What 'Big data' is NOT?
Hadoop exists because of Big Data; Big Data does not exist because of Hadoop!
What 'Big data' is NOT?
Applying for funding here?
Anything less than Hadoop is as good as an insult!
What 'Big data' is NOT?
Why do Hadoop and other technologies always come to mind with big data?
What else should we know?
Tools vs Methodologies
Being too futuristic vs. being practical/economical
Big Data in your startup
Cost of tools/software decreases, but cost of knowledge increases
Being agile is the only way to deal with competition
Are you working with...
Social networking and media
Mobile devices
Internet transactions
Networked devices and sensors
Big Data in your product/service
Think in terms of access vs. storage
Design based on when/where data is used vs. when/where data is produced
Use redundancy, accepting a higher storage cost in exchange
Understand NoSQL = Not Only SQL
Streams
In memory analytics
Massively parallel processing (Data crunching)
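The "access vs. storage" and "use redundancy" points can be sketched with a toy feed. This is a minimal illustration, not a recommended design: the `Timeline` class and its method names are hypothetical, and a plain in-memory dict stands in for a real data store. Each post is copied into every follower's feed at write time, so storage grows, but a read becomes a single lookup.

```python
from collections import defaultdict


class Timeline:
    """Toy read-optimized store: data is laid out where it will be read."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> readers following that author
        self.feeds = defaultdict(list)     # reader -> [(author, text)], oldest first

    def follow(self, reader, author):
        self.followers[author].add(reader)

    def post(self, author, text):
        # Write path: store one redundant copy per follower.
        for reader in self.followers[author]:
            self.feeds[reader].append((author, text))

    def read_feed(self, reader, limit=10):
        # Read path: one lookup, no joins or scans over all posts.
        return self.feeds[reader][-limit:]


timeline = Timeline()
timeline.follow("asha", "dhruv")
timeline.follow("ravi", "dhruv")
timeline.post("dhruv", "hello")
print(timeline.read_feed("asha"))
```

The trade-off is visible in `post`: one logical post costs as many stored copies as the author has followers.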
Big Data in your startup
Random research says...
99% of the clients of Big Data startups ended up with fewer total paying customers than you have fingers.
A startup hits business scalability much, much earlier than technical scalability.
Big Data for your clients
Business first - technology second
Current reality for client projects:
Use big data tools which work at small scale :-)
Design with the domain in mind, not the database the client suggests.
Always design with read optimization in mind (the golden rule)
Big Data project for small data startups
If you can do it in PostgreSQL, then do it in PostgreSQL
(the blue elephant rule)
For Tech centric startups - The CAP theorem
Read a lot about the design of a database before using any non-traditional database.
Or read good negative posts to know when NOT to use it.
e.g. : http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
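One place the consistency trade-off becomes a concrete setting is in replicated stores with tunable read and write quorums. The sketch below is an illustration added here, not something from the slides: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to share at least one replica exactly when R + W > N. The function names are hypothetical, and the brute-force check exists only to show the rule holds for small N.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True when any w-replica write set must intersect any r-replica read set."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


def overlap_by_enumeration(n, w, r):
    """Check the same property by trying every possible write set and read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# Three replicas: a quorum of 2 for both reads and writes always overlaps,
# while reading and writing a single replica can miss the latest write.
print(quorums_overlap(3, 2, 2), quorums_overlap(3, 1, 1))
```

Smaller quorums answer faster and tolerate more unavailable replicas, at the price of possibly stale reads, which is the kind of design detail worth reading about before adopting a store.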
And Now... Quick Tips
Why Big Data?
Data == VALUE MONEY $$$!
It's a buzzword, but ride on it like you mean it.
Your competitors do it. Or claim to do it.
Think of your growth/exit strategy, again!
Yes, I have never owned or worked at a startup, and I'm still advising you!
And Now... Quick Tips
When to actually do Big Data?
The purpose of the data your startup has changes (or should change) == to PIVOT
Do it for “unfair advantage”, not for UVP www://leanstack.com
See, I did it again.
And Now... Quick Tips
How to do Big Data?
Big Data Storage
Use Big Data patterns, but don't use Big Data tools/technologies (yet)
Fact/Event based system design
CQRS (Command Query Responsibility Segregation)
Easy RDBMS, but with NON-relational design
Big Data Analytics
Until you hit 1K customers, use analytics-as-a-service
IBM Watson
Prediction.io
Even more! I am liking it; not sure about you, though.
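The storage patterns named on this slide (fact/event based design, CQRS, and a plain RDBMS used non-relationally) can be sketched together. This is a minimal illustration under stated assumptions: SQLite from the Python standard library stands in for whatever RDBMS you already run, and the table layout, the event kinds `deposited` and `withdrawn`, and the function names are all hypothetical.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory database with a single append-only events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"
        " payload TEXT NOT NULL)"  # payload is JSON text, not relational columns
    )
    return conn


def record(conn, kind, **payload):
    """Command side: only ever append a fact, never update or delete."""
    conn.execute(
        "INSERT INTO events (kind, payload) VALUES (?, ?)",
        (kind, json.dumps(payload)),
    )
    conn.commit()


def balances(conn):
    """Query side: fold the event log into a read model (balance per account)."""
    totals = {}
    rows = conn.execute("SELECT kind, payload FROM events ORDER BY id")
    for kind, payload in rows:
        data = json.loads(payload)
        if kind == "deposited":
            delta = data["amount"]
        elif kind == "withdrawn":
            delta = -data["amount"]
        else:
            continue  # unknown event kinds do not affect this read model
        totals[data["account"]] = totals.get(data["account"], 0) + delta
    return totals


conn = open_store()
record(conn, "deposited", account="a1", amount=100)
record(conn, "withdrawn", account="a1", amount=30)
print(balances(conn))
```

Because the log keeps every fact, a new read model can be added later by writing another fold over the same events, without migrating stored data.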
And Now... Quick Tips
How NOT to do Big Data?
If you are not selling your startup in the NEXT 6 months
Don't start with technology; start with a business case on NON-BIGDATA-TECHNOLOGY
If you have not pivoted even once!
Even more! I am liking it; not sure about you, though.
A few references used, AND this is not the last slide
Basic Hadoop introductory material: http://www.coreservlets.com/hadoop-tutorial/
Evaluate Hadoop without installation: http://go.cloudera.com/cloudera-live.html
PostgreSQL good parts: http://www.slideshare.net/Aveic/postgresql-34323147
PostgreSQL as NoSQL column store: http://postgresguide.com/sexy/hstore.html
PostgreSQL for basic Elasticsearch functionality: http://blog.lostpropertyhq.com/postgres-full-text-search-is-good-enough/
Good big data compatible OSS software: http://netflix.github.io/
Practical HBase usage: https://www.facebook.com/UsingHbase
Why Big Data technologies are on Linux: https://www.youtube.com/watch?v=njos57IJf-0
Using Cassandra for write-heavy applications: http://www.datastax.com/1-million-writes
Online analytics in Storm: http://hortonworks.com/hadoop/storm/
E-commerce domain-specific use case: http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
Good use case of selecting a data store based on a proper understanding of the CAP theorem: http://tech-blog.flipkart.net/2013/01/nosql-for-a-user-engagement-platform/
Recommendation engine in Big Data scenarios: http://www.slideshare.net/hava101/recommendations-play-flipkart-14115791
High-volume log processing: http://www.splunk.com/view/product-tour/SP-CAAAAGV
Open source alternatives: http://logstash.net/ and http://graylog2.org/
Yes, it's unreadable and even incomplete, and has irrelevant YouTube video links!
Questions & Answers
Ask the question now if your question:
Is a one-liner
Is not personal, from either side.
Ask it after the session today
If your context is specific
If it's personal and you don't wanna be humiliated in public
If it's technical, then attend the next 2 sessions
Ask in any café @Prahladnagar
We advise startups and individuals for free (not joking this time)
Don't ask
Why is Melody so chocolaty?
“No question unanswered” is not copyrighted by me, yet.

 

Why, How, When and When Not of Big Data For Startups

  • 1. @yourFriendDhruv For Startup Saturday: Jan 2016. Big Data for Startups – An introductory session about the What, Why, When & When Not of Big Data for Startups. Yes, you may not like it, but it's really my Twitter ID. Dhruv Gohil @yourFriendDhruv
  • 2. Welcome! Why do you care to hear my opinion? What is this “big data”? Why should startups care about it? When to “do big data”? When “NOT to do big data”?
  • 3. Seems too serious? Now, this is much better! So, let's change the font!
  • 4. OK... Why should you care to hear this from me? Meet me after the session to compare favorites.
  • 5. OK... So what questions will I try to answer? Big is not only ‘big’. Why does a startup need 'Big data'? What is 'Big data' NOT? Fear of Big data? Kick it off! Big Data for “small startups”?
  • 6. Let me tell you a story... http://en.wikipedia.org/wiki/Information_Management_System
  • 7. If you were thinking about RDBMS just now... then everything you were taught in academia about databases is ALL WRONG. http://slideshot.epfl.ch/play/suri_stonebraker
  • 10. Big Data is not only ‘big’: Volume, Velocity, Variety. GB/TB vs PB/EB. Centralized vs Distributed. Structured vs Semi-Structured/Unstructured. Data Model vs Schema. Known relationships vs Flexible associations.
  • 11. What is 'Big data' NOT? Hadoop exists because of Big Data; Big Data does not exist because of Hadoop!
  • 12. What is 'Big data' NOT? Applying for funding here? Anything less than Hadoop is as good as an insult!
  • 13. What is 'Big data' NOT? Why do Hadoop and other technologies always come to mind with big data? What else should we know? Tools vs Methodologies. Being too futuristic vs. being practical/economical.
  • 14. Big Data in your startup: The cost of tools/software decreases, but the cost of knowledge increases. Being agile is the only way to deal with competition. Are you working with... social networking and media, mobile devices, Internet transactions, networked devices and sensors?
  • 15. Big Data in your product/service: You have to think in terms of access rather than storage. Design based on when/where data is used rather than when/where it is produced. Use redundancy, even at the expense of storage cost. Understand that NoSQL = Not Only SQL. Streams. In-memory analytics. Massively parallel processing (data crunching).
  • 16. Big Data in your startup: Random research says... 99% of the clients of Big Data startups ended up with fewer paying customers than you have fingers. A startup hits business scalability much, much earlier than technical scalability.
  • 17. Big Data for your clients: Business first, technology second. Current reality for client projects: use big data tools that work at small scale :-) Design with the domain in mind, not the database the client suggests. Always design with read optimization in mind (the golden rule).
  • 18. Big Data project for small-data startups: If you can do it in PostgreSQL, then do it in PostgreSQL (the blue elephant rule).
  • 19. For tech-centric startups: the CAP theorem. Read a lot about the design of a database before using any non-traditional database. Or read good negative posts to know when NOT to use it, e.g.: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
  • 20. And now... Quick Tips. Why Big Data? Data == VALUE MONEY $$$! It's a buzzword, but ride it like you mean it. Your competitors do it (or claim to do it). Think of your growth exit strategy, again! Yes, I have never owned or worked at a startup, yet I am still advising you!
  • 21. And now... Quick Tips. When to actually do Big Data? When the purpose of the data your startup has changes, or should change, i.e., when you PIVOT. Do it for an “unfair advantage”, not for the UVP (unique value proposition). www://leanstack.com See, I did it again.
  • 22. And now... Quick Tips. How to do Big Data? Big Data storage: use Big Data patterns, but don't use Big Data tools/technologies (yet): fact/event-based system design; CQRS (Command Query Responsibility Segregation); a simple RDBMS, but with a non-relational design. Big Data analytics: until you hit 1K customers, use analytics-as-a-service: IBM Watson, Prediction.io. Even more! I am liking it; not sure about you, though.
  • 23. And now... Quick Tips. How NOT to do Big Data? If you are not selling your startup in the NEXT 6 months. Don't start with technology; start with the business case on NON-BIG-DATA technology. If you have not pivoted even once! Even more! I am liking it; not sure about you, though.
  • 24. A few references used (AND this is not the last slide):
      • Basic Hadoop introductory material: http://www.coreservlets.com/hadoop-tutorial/
      • Evaluate Hadoop without installation: http://go.cloudera.com/cloudera-live.html
      • PostgreSQL, the good parts: http://www.slideshare.net/Aveic/postgresql-34323147
      • PostgreSQL as a NoSQL column store: http://postgresguide.com/sexy/hstore.html
      • PostgreSQL for basic Elasticsearch functionality: http://blog.lostpropertyhq.com/postgres-full-text-search-is-good-enough/
      • Good big-data-compatible OSS software: http://netflix.github.io/
      • Practical HBase usage: https://www.facebook.com/UsingHbase
      • Why Big Data technologies are on Linux: https://www.youtube.com/watch?v=njos57IJf-0
      • Using Cassandra for write-heavy applications: http://www.datastax.com/1-million-writes
      • Online analytics in Storm: http://hortonworks.com/hadoop/storm/
      • E-commerce domain-specific use case: http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
      • Good use case of selecting a data store based on a proper understanding of the CAP theorem: http://tech-blog.flipkart.net/2013/01/nosql-for-a-user-engagement-platform/
      • Recommendation engine in Big Data scenarios: http://www.slideshare.net/hava101/recommendations-play-flipkart-14115791
      • High-volume log processing: http://www.splunk.com/view/product-tour/SP-CAAAAGV
      • Open source alternatives: http://logstash.net/ and http://graylog2.org/
    Yes, it's unreadable and even incomplete, and it has irrelevant YouTube video links!
  • 25. Questions & Answers. Ask your question now if it: is a one-liner; is not personal, from either side. Ask it after the session today if: your context is specific; it's personal and you don't want to be humiliated in public. If it's technical, attend the next 2 sessions. Ask in any café @Prahladnagar: we advise startups and individuals for free (not joking this time). Don't ask “Why is Melody so chocolaty?” “No question unanswered” is not copyrighted by me, yet.
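
The storage advice on slide 22 (fact/event-based design and CQRS on a plain RDBMS, without Big Data tools) can be sketched roughly as below. This is only an illustration under stated assumptions, not the speaker's implementation: it uses Python's built-in sqlite3 in place of PostgreSQL so that it runs without a server, and the table names (events, customer_totals) and function names (open_store, record_order, total_for, rebuild_read_model) are hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the append-only event log and one read-optimized table."""
    conn = sqlite3.connect(path)
    # Write side: immutable facts, stored as JSON text (non-relational payload).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    # Read side: a redundant, precomputed view shaped for the query we serve.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals ("
        " customer TEXT PRIMARY KEY,"
        " total_cents INTEGER NOT NULL)"
    )
    return conn


def _apply_order(conn, customer, amount_cents):
    """Fold one order fact into the read model."""
    cur = conn.execute(
        "UPDATE customer_totals SET total_cents = total_cents + ?"
        " WHERE customer = ?",
        (amount_cents, customer),
    )
    if cur.rowcount == 0:  # first order seen for this customer
        conn.execute(
            "INSERT INTO customer_totals (customer, total_cents) VALUES (?, ?)",
            (customer, amount_cents),
        )


def record_order(conn, customer, amount_cents):
    """Command: append the fact and update the read model in one transaction."""
    payload = json.dumps({"customer": customer, "amount_cents": amount_cents})
    with conn:
        conn.execute(
            "INSERT INTO events (kind, payload) VALUES (?, ?)",
            ("order_placed", payload),
        )
        _apply_order(conn, customer, amount_cents)


def total_for(conn, customer):
    """Query: read only from the read model, never from the event log."""
    row = conn.execute(
        "SELECT total_cents FROM customer_totals WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0] if row else 0


def rebuild_read_model(conn):
    """Replay every stored fact to regenerate the read model from scratch."""
    with conn:
        conn.execute("DELETE FROM customer_totals")
        rows = conn.execute(
            "SELECT payload FROM events WHERE kind = 'order_placed' ORDER BY id"
        ).fetchall()
        for (payload,) in rows:
            data = json.loads(payload)
            _apply_order(conn, data["customer"], data["amount_cents"])
```

Because the events table is never updated in place, the read model can be dropped and rebuilt whenever the questions asked of the data change, which is one way to read the "design for read optimization" and "use redundancy" points from the earlier slides.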