How To Make Life Suck
       Less!
    (when building scalable systems)

       Bradford Stephens
    c: www.DrawnToScaleHQ.com
       b: www.roadtofailure.com
           t: @lusciouspear
About Me

• Founder, Drawn to Scale. Lead Engineer,
  Visible Technologies
• CS Degree, University of North FL
• Former careers in politics, music, finance,
  consulting
Drawn to Scale

• Building the “Big Data” platform: ingestion,
  processing, storage, search
• Products coming: Big Log, Big Search
  (faceted), Big Message...
Topics

• Overview
• Operations
• Engineering
• Process
Everything Changes
      with Big Data

• Bar is set higher: a previously niche field,
  few standard stacks (like LAMP)
• You need to have better engineering for
  minimum success
Scalability Matters

• “Web-Scale” data is unstructured and
  exponentially interconnected
• Social Media: Catalyst
• All data is important
• Data Size != Business Size
The Traditional DB
• Excel with highly structured, normalizable
  data
• Non-Linear Scale Cost
• More data = fewer features
• Optimized for single-node
• 90% of utility is 5% of capability
Ergo, Distributed

• Optimize for the problems, no Swiss-Army
  knife
• Shared-nothing, commodity boxes
• Linear scale cost
The State of Things

• Order changed from 20 years ago:
• Cust. Experience is paramount
• Engineers are precious
• Fast I/O is expensive
• Storage is cheap
Recovery-Oriented
      Computing

1. Seamlessly Partitioned
2. Synchronously Redundant
3. Heavily Monitored
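
A minimal sketch of the three properties above, assuming a toy in-memory key-value store. The class name `PartitionedStore`, its parameters, and the `metrics` counters are hypothetical illustrations, not anything from the talk: keys are hash-partitioned across nodes, a write is acknowledged only once every replica has it, and simple counters stand in for monitoring.

```python
import hashlib


class PartitionedStore:
    """Toy store: hash-partitioned keys, synchronous replication, basic counters."""

    def __init__(self, node_count=4, replicas=2):
        if replicas > node_count:
            raise ValueError("replicas cannot exceed node_count")
        self.nodes = [dict() for _ in range(node_count)]  # one dict per "box"
        self.alive = [True] * node_count
        self.replicas = replicas
        self.metrics = {"puts": 0, "gets": 0, "failovers": 0}

    def _owners(self, key):
        # Stable hash, so the same key always maps to the same nodes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        owners = self._owners(key)
        # Synchronous redundancy: do not acknowledge unless every replica is up.
        if not all(self.alive[n] for n in owners):
            raise RuntimeError("a replica is down; refusing to acknowledge write")
        for n in owners:
            self.nodes[n][key] = value
        self.metrics["puts"] += 1

    def get(self, key):
        self.metrics["gets"] += 1
        for attempt, n in enumerate(self._owners(key)):
            if self.alive[n] and key in self.nodes[n]:
                if attempt > 0:
                    self.metrics["failovers"] += 1  # served by a backup replica
                return self.nodes[n][key]
        raise KeyError(key)
```

Marking a node dead by setting `store.alive[n] = False` lets a read fall through to the next replica, and the `failovers` counter records that it happened.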
Operations

Moving the Box: Sysadmin ratio from 2:1 to
            200:1 to 2000:1


   (yes devs, you’ll care about this too)
Ops vs. Eng

• Engineers build, Ops manages
• Fixing problems: devs code+automate, ops
  hire
• Want something fixed? Call devs at 2 AM.
Config is Important

• Configuration is not 2nd-class anymore
• Needs to be tackled by Engineers
• New frameworks = months of
  configuration and experimentation
• Chef is a good start, but...
Production = Test

• Surprise! You don’t have a Test environment
  any more.
• Test Cost => Prod Cost
• Anything that’s not your data center is an
  approximation. Switches, cable, power,
  boxes, etc...
You’re Always Testing

• Constantly simulate failures and brownouts
  of boxes, racks, switches...
• “Canary in the Coal Mine”: run a box and
  rack at 175% current load.
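
One way to rehearse failures continuously is to wrap each node in a fault injector. The sketch below is an assumption-laden illustration: `FlakyNode`, `call_with_failover`, and `canary_load` are hypothetical names, and a real setup would inject faults at the network or hardware level rather than in-process.

```python
import random


class FlakyNode:
    """Wraps a request handler and injects failures at a configurable rate."""

    def __init__(self, handler, failure_rate, rng=None):
        if not 0.0 <= failure_rate <= 1.0:
            raise ValueError("failure_rate must be between 0 and 1")
        self.handler = handler
        self.failure_rate = failure_rate
        self.rng = rng or random.Random()

    def call(self, request):
        # random() is in [0, 1): a rate of 1.0 always fails, 0.0 never fails.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        return self.handler(request)


def call_with_failover(nodes, request):
    """Try each node in order and return the first successful response."""
    last_error = None
    for node in nodes:
        try:
            return node.call(request)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError("all nodes failed") from last_error


def canary_load(current_rps, factor=1.75):
    """Target request rate for the canary box: 175% of current load by default."""
    return current_rps * factor
```

Raising `failure_rate` on a few nodes in a live cluster shows whether the failover path holds up before a real outage tests it.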
Deployment


• Deploy gradually: 1 box, 2 boxes, 1 rack...
• Code granularly, backwards-compatible
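
The gradual rollout above can be sketched as a generator of growing batches. This is a hypothetical illustration: `rollout_stages`, `deploy`, and the `deploy_batch` / `healthy` callbacks are assumed names, and the rule for halting is a simplification.

```python
def rollout_stages(racks):
    """Yield batches: 1 box, 2 boxes, the rest of the first rack, then rack by rack.

    `racks` is a list of racks, each a list of hostnames.
    """
    if not racks:
        return
    first = racks[0]
    for batch in (first[:1], first[1:3], first[3:]):
        if batch:
            yield list(batch)
    for rack in racks[1:]:
        if rack:
            yield list(rack)


def deploy(racks, deploy_batch, healthy):
    """Deploy stage by stage; stop at the first batch with an unhealthy host.

    Returns (hosts_deployed_so_far, finished_cleanly).
    """
    done = []
    for batch in rollout_stages(racks):
        deploy_batch(batch)
        done.extend(batch)
        if not all(healthy(host) for host in batch):
            return done, False  # halt before the blast radius grows
    return done, True
```

Because old and new code run side by side during such a rollout, each release has to stay backwards-compatible with the one before it.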
Built to Fail
• “It’s working” isn’t binary
• Acting weird? Shoot it.
• Multi-system failure is common: be
  topology aware
• Avoid false negatives: something’s wrong, you
  don’t know it, and you lose customer data
• This is empowering!
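
A rough sketch of "acting weird? shoot it", assuming per-node latency is the health signal. The function names, the median-based threshold, and the `tolerance` default are illustrative assumptions rather than the speaker's method. The second function hints at topology awareness: if every node in a rack looks bad, the rack or its switch is the more likely culprit.

```python
from statistics import median


def nodes_to_shoot(latencies_ms, tolerance=3.0):
    """Return nodes whose latency exceeds `tolerance` times the fleet median."""
    if not latencies_ms:
        return []
    typical = median(latencies_ms.values())
    return sorted(n for n, ms in latencies_ms.items() if ms > tolerance * typical)


def suspect_racks(flagged, rack_members):
    """Return racks in which every member node has been flagged."""
    flagged = set(flagged)
    return sorted(
        rack
        for rack, members in rack_members.items()
        if members and all(m in flagged for m in members)
    )
```

Comparing against the fleet median rather than a fixed limit treats health as relative, so a node can be up and still be flagged for behaving unlike its peers.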
Engineering


This is Systems Software, not Applications
                 Software
This is Hard :(
• Engineering at scale is very different from
  writing a 3-tier webapp
• Care about garbage collection, election
  algorithms, data structures, access patterns,
  etc...
• CS knowledge is required, not a luxury
• DBA/RDBMS skills pretty useless
• CAP is law
Not Everything’s a Table

• Structure your data according to how it
  needs to be used
• Unstructured massive files, graphs, KV-stores
• The more your problem narrows, the
  easier it is to scale
Big Data is BIG

• Imagine your test passes taking hours
• What works at 1.5 TB may fail at 10 MB or
  2 TB
• Many tests, simple code
• Soft Delete Only
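
"Soft delete only" can be sketched as marking rows deleted instead of removing them, so a bad job or bad deploy can be undone. `SoftDeleteStore` and its method names are hypothetical; a real system would typically add a retention policy for eventually purging old tombstones.

```python
import time


class SoftDeleteStore:
    """Key-value store where delete only sets a timestamp; data is never dropped."""

    def __init__(self):
        self._rows = {}  # key -> {"value": ..., "deleted_at": None or timestamp}

    def put(self, key, value):
        self._rows[key] = {"value": value, "deleted_at": None}

    def delete(self, key):
        # Mark the row instead of removing it, so the delete is reversible.
        self._rows[key]["deleted_at"] = time.time()

    def undelete(self, key):
        self._rows[key]["deleted_at"] = None

    def get(self, key):
        row = self._rows[key]
        if row["deleted_at"] is not None:
            raise KeyError(key)  # hidden from readers, still on disk
        return row["value"]
```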
“No, I won’t give you a
        repro”

• Often impossible to repro a bug on
  demand in a cluster
• Either fix your logging or your bug
• Log everything (we have a product for this!)
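
When a bug cannot be reproduced, the logs are the repro. A minimal sketch using Python's standard `logging` module to emit one JSON object per event, tagged with the node and request it belongs to; the field names `node` and `request_id` and the helper `make_logger` are assumptions chosen for illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line that is easy to search."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set via `extra=`; None when the caller did not provide them.
            "node": getattr(record, "node", None),
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload, sort_keys=True)


def make_logger(name, stream):
    """Build a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

A call such as `logger.info("write acked", extra={"node": "box-17", "request_id": "r-42"})` then lets a single request be traced across boxes by filtering on `request_id`.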
Avoiding Impedance
       Mismatch

• High vs. Low Latency vs. Throughput
• A lot of data eventually, or a little now
• MapReduce vs. Sharding/Indexing
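
The contrast above can be sketched on toy data. The batch function scans every document to answer "a lot of data eventually"; the sharded index does its work up front to answer "a little now". Both are in-process illustrations with hypothetical function names, not Hadoop or Lucene APIs.

```python
from collections import Counter, defaultdict


def map_reduce_word_count(documents):
    """Batch path: scan every document (map), then merge the counts (reduce)."""
    mapped = (Counter(text.lower().split()) for text in documents.values())
    total = Counter()
    for partial in mapped:
        total.update(partial)
    return total


def build_sharded_index(documents, shard_count=2):
    """Indexing path: build an inverted index, sharded by integer document id."""
    shards = [defaultdict(set) for _ in range(shard_count)]
    for doc_id, text in documents.items():
        shard = shards[doc_id % shard_count]
        for word in text.lower().split():
            shard[word].add(doc_id)
    return shards


def search(shards, word):
    """Low-latency lookup: ask each shard for the word and merge the hits."""
    hits = set()
    for shard in shards:
        hits |= shard.get(word.lower(), set())
    return sorted(hits)
```

The batch path pays its full cost on every question, while the index pays once at write time; picking the wrong one for a workload is the mismatch the slide warns about.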
Simple Workflow

[Diagram: workflow in three tiers, boxes listed left to right]
• Hadoop: Collect, Semantic Analysis, Unstructured Analysis, Structured Analysis
• Hadoop + HBase: Store in HBase, Indexing, Store in Hadoop
• Lucene + Solr + Katta: Pull Indexes, Load/Replicate Shards, Search
Biz + Process


The softer side of distributed computing
Hiring


• Plan for more engineers, fewer ops
• Be aware of “context switch cost” when
  training RDBMS-folks
It’s Not Just Coding
• Be aware of research cost
• Much more time spent experimenting, not
  coding
• Coding all this from scratch is horrific
• Nailing together 10+ OSS projects is a pain
• Open source anything not “Secret sauce”
Solve your Core
         Problem

• “Making your own electricity doesn’t create
  better tasting beer”
• Plan to use an end-to-end platform in the
  future (hint: ours!)
In Summary

• Plan for everything to fail
• Test constantly in production
• Systems Software requires Computer
  Science
• Don’t build it if you don’t have to
Thanks!

• Y’all
• Road to Failure Readers
• James Hamilton, Amazon/MS
• Bradford Cross, Flightcaster
• Ryan Rawson, HBase/Stumbleupon
Useful Resources

• www.roadtofailure.com
• www.highscalability.com
• perspectives.mvdirona.com
