How to Improve Design Skills
- How Jeffery can improve his design skills
- JefferyYuan
Disclaimer
This is a summary of what I have learned.
The specific knowledge matters less than how to
approach and learn it.
Agenda
- How to Design
- System Design Principles
- Learning from Open Source
- Learning from Existing Products
- System Design Practices
How to Design (more content to be added)
Take time to think about your design
- Minimize upfront design or YAGNI
- It doesn't mean you don't take time to design the component
Related components
Impact to other components
What are alternatives?
Welcome different approaches and discussion
How to Design
Estimation
- back-of-the-envelope calculation
Estimated data size, QPS
Take time to design data schema
- Because it is difficult to change after deploying to production
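As a minimal sketch of the back-of-the-envelope step above, the following estimates average QPS, peak QPS, and yearly storage. Every input (daily requests, record size, write ratio, peak factor) is a made-up assumption for illustration, not a figure from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_requests: int, bytes_per_record: int,
             write_ratio: float, peak_factor: float = 2.0) -> dict:
    """Rough capacity numbers derived from a few assumed inputs."""
    avg_qps = daily_requests / SECONDS_PER_DAY
    daily_writes = daily_requests * write_ratio
    storage_per_year_gb = daily_writes * bytes_per_record * 365 / 1e9
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "storage_per_year_gb": storage_per_year_gb,
    }


# Hypothetical workload: 86.4M requests/day, 10% writes, 1 KB per record.
numbers = estimate(daily_requests=86_400_000, bytes_per_record=1_000,
                   write_ratio=0.1)
print(numbers)
```

The point is the order of magnitude, not precision: numbers like these help decide early whether one node is enough or the data must be sharded.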
Reflection – Lesson Learned
What mistakes we made
- Where to store data: DynamoDB or not?
- The key for Solr schema
Why they happened:
Did not consider near-future requirements
Made decisions carelessly
Reflection – Lesson Learned
Better client library
- Contains only the libraries and code that the client needs
Package shared configuration in the library
System Design Principles
Idempotent
Policy to expire/archive data - Less data
Optimize data for read
- Denormalization
Read Heavy vs Write Heavy
Design to Be Disabled - feature toggle
Isolate Faults - Circuit breaker
Throttling - Rate limit
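To make the "Isolate Faults" item above concrete, here is a minimal circuit breaker sketch. The class name, thresholds, and the injectable `clock` parameter are assumptions for illustration; production libraries add more states and metrics.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so the logic can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of piling up on a dependency that is already struggling.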
System Design Principles
Stateless
Asynchronous
- Back pressure with exponential backoff
- Message queues
Cache
Visibility – monitoring
Separation of concerns
- Separate read and write
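The exponential backoff mentioned under "Asynchronous" above can be sketched as follows. The function names, default delays, and the "full jitter" choice are assumptions for illustration, not something the slides prescribe.

```python
import random
import time


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=5):
    """Yield the delay ceiling for each retry: base * factor**n, capped at cap."""
    for n in range(attempts):
        yield min(cap, base * factor ** n)


def retry(func, attempts=5, base=0.1, cap=5.0, sleep=time.sleep, rng=random.random):
    """Call func until it succeeds or attempts run out, backing off between tries."""
    delays = list(backoff_delays(base=base, cap=cap, attempts=attempts - 1))
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # "Full jitter": sleep a random fraction of the ceiling so that
            # many clients do not retry in lockstep.
            sleep(delays[attempt] * rng())
```

The `sleep` and `rng` parameters exist only so the behaviour can be exercised without real waiting.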
System Design Principles
CAP
Graceful Degradation
- Be robust - hide errors as much as possible
- Be conservative in what you send, be liberal in what you accept
- Make your apps do something reasonable even if not all is right
Learning from Open Source
What makes them popular
When to use them and when not to
Cassandra - Learning from Open Source
LSM (Log-Structured Merge) trees
- append-only
SSTable
MemTable - in-memory write buffer, flushed to disk as an SSTable
How C* handles deletes: tombstones (grace period)
Merkle trees
Bloom Filter
Index
CommitLog
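As an illustration of the Bloom filter item above, here is a minimal sketch. Cassandra uses Bloom filters to skip SSTables that cannot contain a key; this toy version is not Cassandra's implementation, and the sizes and the SHA-256 based hashing are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))
```

A negative answer is always correct, so a read can skip the disk lookup entirely; a positive answer may be a false positive and still needs the real lookup.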
Cassandra - Learning from Open Source
Serialize cache data (row-cache, key cache) to avoid cold restart
Session Coordinator
Gossip protocol
Seed nodes
Consistent Hashing
Eventual Consistency
- W + R > N (read and write quorums overlap)
Local Index (vs Global Index)
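The consistent hashing and W + R > N items above can be sketched together. This is a toy ring with virtual nodes, not Cassandra's partitioner; the node names, the vnode count, and the SHA-256 based token function are assumptions for illustration.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node owns several points ("virtual nodes") on the ring.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first token after the key's token, wrapping around.
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._ring[idx][1]


def overlaps(n: int, w: int, r: int) -> bool:
    """True when every read quorum intersects every write quorum (W + R > N)."""
    return w + r > n
```

Adding a node to the ring only moves the keys that the new node takes over, which is why it suits clusters that grow and shrink. With N = 3, W = 2 and R = 2 satisfy the overlap condition, while W = 1 and R = 1 do not.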
Kafka - Learning from Open Source
Why it is fast
Sequential read/write vs random read/write
Memory Mapped File
Zero copy
Batch data (compressed)
Partition: ordered, immutable, replicated
Consumer group
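To illustrate the partition and consumer group items above, here is a toy in-memory model: each partition is an append-only list, records with the same key land in the same partition, and each consumer group tracks its own offset. The class and method names are hypothetical, and CRC32 is only a stand-in for Kafka's real partitioner.

```python
import zlib
from collections import defaultdict


class Topic:
    def __init__(self, num_partitions=3):
        self.partitions = [[] for _ in range(num_partitions)]
        # Next offset to read, per (group, partition); starts at 0.
        self.offsets = defaultdict(int)

    def produce(self, key: str, value: str) -> tuple:
        p = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)        # append-only, never updated in place
        return p, len(self.partitions[p]) - 1   # (partition, offset)

    def poll(self, group: str, partition: int, max_records=10) -> list:
        start = self.offsets[(group, partition)]
        records = self.partitions[partition][start:start + max_records]
        self.offsets[(group, partition)] = start + len(records)
        return records
```

Because offsets belong to the group rather than the log, two groups can read the same partition independently, and ordering is guaranteed only within a partition.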
Kafka - Learning from Open Source
Kafka evolves fast
Stream, KTable
Stateful stream, exactly once delivery
- How Kafka implements them
Message Queue
Asynchronous
Separation of concerns
Scale separately
Evening out traffic spikes
Learning from Existing Products
Twitter/FB timeline
Pull/Push/Mixed Model
FB Haystack/Photo storage
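The pull and push timeline models above can be contrasted in a small in-memory sketch. The class and its fields are hypothetical and ignore storage, pagination, and scale; they only show where the work happens in each model.

```python
import heapq
import itertools
from collections import defaultdict


class Timeline:
    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> [(seq, text)]
        self.inbox = defaultdict(list)      # user -> [(seq, author, text)]
        self._seq = itertools.count()       # global ordering of posts

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, text, push=True):
        seq = next(self._seq)
        self.posts[author].append((seq, text))
        if push:  # push model: fan out on write into each follower's inbox
            for user in self.followers[author]:
                self.inbox[user].append((seq, author, text))

    def read_push(self, user, limit=10):
        # Cheap read: the inbox was precomputed at write time.
        return [text for _, _, text in sorted(self.inbox[user], reverse=True)[:limit]]

    def read_pull(self, user, limit=10):
        # Pull model: merge the followed authors' posts at read time.
        merged = [(seq, text) for a in self.following[user] for seq, text in self.posts[a]]
        return [text for _, text in heapq.nlargest(limit, merged)]
```

A mixed model would call `post(..., push=False)` for accounts with very many followers and merge `read_push` with a pull over just those accounts.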
System Design Practices
URL shortener
- read heavy
- able to disable write functions
Design key-value store
Crawler
Re-crawling
Next crawl at cur+2t if the page is unchanged, or cur+t/2 if it changed (t = current interval)
Design search engine
In-memory version: Data structure
Distributed: Solr Cloud internal design
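The re-crawl rule from the crawler item above (cur+2t or cur+t/2) can be written as a small function, reading it as: back off when a page is unchanged, speed up when it changed. The minimum and maximum intervals are assumptions added here so the interval cannot collapse to zero or grow without bound.

```python
def next_crawl(cur: float, interval: float, changed: bool,
               min_interval: float = 60.0,
               max_interval: float = 30 * 86400.0) -> tuple:
    """Return (next_crawl_time, new_interval), all in seconds."""
    # Changed pages are revisited sooner; unchanged pages less often.
    new_interval = interval / 2 if changed else interval * 2
    # Clamp to the assumed bounds.
    new_interval = max(min_interval, min(max_interval, new_interval))
    return cur + new_interval, new_interval
```

For example, a page crawled hourly that did not change is next visited two hours later, while one that changed is visited again after thirty minutes.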
DB - Learning from Open Source
Sharding
Replication
Master/Slave, Multi-master
System Design Practices
Design score/rank system for social game
Search nearby places: GeoHash
Design Chat app
Design logging collection and analysis system
Design shopping cart
- guest cart
Design Hit Counter
Design rate limiter
Design Miao Sha (flash sale)
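For the rate limiter exercise above, one common approach is a token bucket. The sketch below is a single-process version under assumed parameters (`rate`, `capacity`, and an injectable `clock`); a distributed limiter would keep this state in a shared store.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, up to capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The bucket allows short bursts up to `capacity` while holding the long-run rate to `rate` requests per second.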
Resources
Designing Data-Intensive Applications
Scalability Rules: Principles for Scaling Web Sites
The Art of Scalability
Shameless plug
http://lifelongprogrammer.blogspot.com/search/label/System%20Design
