For Ahmedabad Java Meetup Group (300+ members strong now!)
Big Data Workshop
– An introduction
and workshop launch session
10th May, 2014
Dhruv Gohil
From Ishi systems
Welcome!
● Why a workshop and not a presentation?
● What should you do in the workshop?
● What is expected from you in this session?
● What should you expect from this session?
● What are the upcoming sessions going to be like?
Seems too serious?
Now, This is much better!
So, let's change the font!
OK... So what are we gonna do today?
➔ Workshop setup and series introduction
➔ Already done! (See, it's easy!)
➔ Big is not only ‘big’.
➔ Why do we need 'Big data'?
➔ What is 'Big data' NOT?
➔ Fear of Big data? Kick it off!
Let me tell you a story..
http://en.wikipedia.org/wiki/Information_Management_System
If you still think about 'Entities' and 'Tables'
Everything you have been taught in college about databases is ALL WRONG.
http://slideshot.epfl.ch/play/suri_stonebraker
Big Data is...
Big Data is not only ‘big’
Volume, Velocity, Variety
GB/TB vs PB/EB
Centralized vs Distributed
Structured vs Semi-Structured/Unstructured
Data Model vs Schema
Known relationships vs Flexible associations
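To make the "Structured vs Semi-Structured" and "Data Model vs Schema" contrast concrete, here is a minimal sketch in Python. The event records, field names, and both helper functions are hypothetical, invented only for illustration. It shows records whose fields vary from one to the next, with the structure discovered when the data is read rather than fixed up front.

```python
import json

# Hypothetical event records: each line is JSON, and the fields vary per record.
RAW_EVENTS = [
    '{"user": "asha", "type": "click", "page": "/home"}',
    '{"user": "ravi", "type": "purchase", "amount": 499, "items": ["book"]}',
    '{"user": "asha", "type": "sensor", "reading": {"temp_c": 31.5}}',
]


def fields_seen(raw_lines):
    """Discover the schema on read: count how often each top-level field appears."""
    counts = {}
    for line in raw_lines:
        record = json.loads(line)
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts


def total_purchases(raw_lines):
    """Sum purchase amounts, tolerating records that lack the 'amount' field."""
    return sum(json.loads(line).get("amount", 0) for line in raw_lines)
```

A fixed table design would need a column (or a join) for every field in advance. Here the reader decides which fields matter at query time.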
What 'Big data' is NOT?
Hadoop exists because Big data exists; Big data does not exist because Hadoop exists!
What 'Big data' is NOT?
Applying for a job here?
Anything less than Hadoop is as good as an insult!
What 'Big data' is NOT?
Why does Hadoop always come to mind with big data?
What else should we know?
Tools vs Methodologies
Being too futuristic vs. being practical/economical
Big Data in your organization
http://www.fakingnews.firstpost.com/2014/04/transcript-of-rahul-gandhis-interview-for-job-of-a-c-programmer/
We brought RTSC. Right To Source Code.
Now, deal with it.
Big Data in your organization
➢ Cost of tools/software decreases, but cost of knowledge increases
➢ Being agile is the only way to deal with competition
➢ Are you working with:
✔ Social networking and media
✔ Mobile devices
✔ Internet transactions
✔ Networked devices and sensors
Big Data in your product/service
● Change your thinking: consider access, not only storage
● Design based on when/where data is used, not when/where data is produced
● Accept redundancy, even at the cost of extra storage
● Understand NoSQL = Not Only SQL
✔ Streams
✔ In memory analytics
✔ Massively parallel processing (Data crunching)
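The points about designing for where data is used and accepting redundancy can be sketched as follows. This is a toy in-memory example in Python, not a recommendation of any particular product. The `OrderStore` class and its method names are hypothetical and appear nowhere in the workshop material.

```python
class OrderStore:
    """Toy in-memory store that keeps a redundant, read-optimized copy of its data."""

    def __init__(self):
        self.orders = []              # data in the shape it is produced (append-only log)
        self.total_by_customer = {}   # redundant view, shaped for how the data is read

    def add_order(self, customer, amount):
        # Pay a little extra on every write, and in storage ...
        self.orders.append((customer, amount))
        self.total_by_customer[customer] = (
            self.total_by_customer.get(customer, 0) + amount
        )

    def total_scan(self, customer):
        # ... so that reads do not have to scan every order, as this version does.
        return sum(amount for name, amount in self.orders if name == customer)

    def total_fast(self, customer):
        # A single lookup against the precomputed view.
        return self.total_by_customer.get(customer, 0)
```

Both read methods return the same answer. The difference is that `total_scan` gets slower as orders accumulate, while `total_fast` does not, at the price of storing the totals twice over.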
Big Data in your project
Random Research says..
➔ 99% of your clients who asked for a Big Data project
ended up with fewer paying customers than you have fingers.
A project hits the limits of business scalability much, much
earlier than the limits of technical scalability.
Big Data for your clients
➢ Business first - technology second
➢ Current reality for client projects:
✔ Use big data tools that also work at small scale :-)
✔ Design with the domain in mind, not the database the client suggests.
➢ Always design with read optimization in mind
(the golden rule)
Big Data project for small data customers
If you can do it in PostgreSQL, then do it in PostgreSQL
(the blue elephant rule)
Few important tips..
The CAP theorem: basics of NoSQL databases
Read a lot about the design of a database before
using any non-traditional database. Or read
good negative posts to know when NOT to use it.
e.g. : http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
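As a rough illustration of the trade-off the CAP theorem describes, here is a toy two-replica register in Python. It is a sketch under stated assumptions, not a model of any real database. The `TwoNodeStore` class and the "CP"/"AP" mode labels are hypothetical names chosen for this example. During a network partition the store must either refuse the write (stay consistent, lose availability) or accept it on one side (stay available, lose consistency).

```python
class Replica:
    """One copy of a single stored value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy register replicated on two nodes; mode is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True while the two replicas cannot talk

    def write(self, value):
        """Handle a write that arrives at replica a; return True if accepted."""
        if not self.partitioned:
            # Healthy network: update both copies together.
            self.a.value = value
            self.b.value = value
            return True
        if self.mode == "CP":
            # Refuse the write: replicas stay identical, but the client is turned away.
            return False
        # AP: accept the write locally, so the replicas now disagree.
        self.a.value = value
        return True

    def consistent(self):
        return self.a.value == self.b.value
```

In use: create one store per mode, write once, set `partitioned = True`, then write again. The "CP" store rejects the second write and `consistent()` stays true; the "AP" store accepts it and `consistent()` becomes false until the replicas are reconciled, which this sketch does not attempt.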
Now... the good parts !
It's your time to speak now!
Workshop session:
About practical selection of technology and
design for real-world use cases.
All references used in the workshop
➔ Basic Hadoop introductory material: http://www.coreservlets.com/hadoop-tutorial/
➔ Evaluate Hadoop without installation: http://go.cloudera.com/cloudera-live.html
➔ PostgreSQL good parts: http://www.slideshare.net/Aveic/postgresql-34323147
➔ PostgreSQL as a NoSQL key-value store (hstore): http://postgresguide.com/sexy/hstore.html
➔ PostgreSQL for basic Elasticsearch-style full-text search: http://blog.lostpropertyhq.com/postgres-full-text-search-is-good-enough/
➔ Good big data compatible OSS software: http://netflix.github.io/
➔ Practical HBase usage: https://www.facebook.com/UsingHbase
➔ Using Cassandra for write-heavy applications: http://www.datastax.com/1-million-writes
➔ Online analytics in Storm: http://hortonworks.com/hadoop/storm/
➔ E-commerce domain-specific use case: http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
➔ Good use case of selecting a data store based on a proper understanding of the CAP theorem: http://tech-blog.flipkart.net/2013/01/nosql-for-a-user-engagement-platform/
➔ Recommendation engine in Big Data scenarios: http://www.slideshare.net/hava101/recommendations-play-flipkart-14115791
➔ High-volume log processing: http://www.splunk.com/view/product-tour/SP-CAAAAGV
➔ Open source alternatives: http://logstash.net/ and http://graylog2.org/
