Massively Scalable Applications - TechFerry


Massively Scalable Applications - from TechFerry. Covers distributed computing, concurrent programming, parallel programming, and symmetric multiprocessing.

Published in: Technology

  1. Massively Scalable Applications
     Deepansh Malik
  2. Introductions
     /TechFerry /@techferry
     Deepansh Malik, CEO at TechFerry, @DeepanshMalik
     TechFerry: Analytics, IT Innovation, R&D Company
     Specialization in
     o Growth Analytics
     o HealthCare Analytics
     o Massively Scalable Applications and Rich UI
  3. Massively Scalable Applications
     Benchmark: 1 million transactions (TRX) per second
     ● 1 million requests per second
     ● 1 million messages per second
     ● 1 million DB transactions per second
     1 million/sec = 1 billion TRX in about 17 minutes = 86.4 billion TRX a day
  4. Scale out or scale up?
     Scale out -> add more hardware.
     Assume 1 CPU core = 1,000 requests/sec. To scale massively (1 million requests/sec), we need 1,000 cores: 50 machines with 20 cores each.
     Good idea or stupid idea? What about the costs?
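
The arithmetic on slides 3 and 4 can be checked with a few lines of Python. This is only a sketch: the per-core rate and the machine size are the slide's own assumptions, and the function and variable names are made up for illustration.

```python
import math

TARGET_PER_SEC = 1_000_000   # benchmark from slide 3
PER_CORE_PER_SEC = 1_000     # slide 4's assumed throughput of one core
CORES_PER_MACHINE = 20       # slide 4's assumed machine size


def seconds_to_reach(total, rate_per_sec):
    """Time needed to process `total` transactions at a fixed rate."""
    return total / rate_per_sec


def machines_needed(target, per_core, cores_per_machine):
    """Cores and machines required if throughput scaled linearly with cores."""
    cores = math.ceil(target / per_core)
    return cores, math.ceil(cores / cores_per_machine)


minutes_to_billion = seconds_to_reach(1_000_000_000, TARGET_PER_SEC) / 60
per_day = TARGET_PER_SEC * 24 * 60 * 60
cores, machines = machines_needed(TARGET_PER_SEC, PER_CORE_PER_SEC, CORES_PER_MACHINE)

print(f"1 billion TRX takes {minutes_to_billion:.1f} minutes")
print(f"{per_day:,} TRX per day")
print(f"{cores} cores on {machines} machines")
```

The linear-scaling assumption is the optimistic case; the rest of the deck asks what happens when the per-core figure itself is raised instead.
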
  5. Scale up?
     Can one machine scale to a million transactions per second? The answer is YES. Our commodity hardware is very powerful.
     What is the bottleneck, then? What do we need in order to stop wasting tons of money on scaling out?
  6. Let us begin architecting massively scalable apps
  7. Computing Spectrum
     Symmetric Multi Processing
     ● A single problem or task (e.g. a DB query) takes 2 milliseconds on one core. Can I use two cores and complete this single task in 1 ms?
     Distributed Computing
     ● Distribute load across multiple machines. Make sure there are no bottlenecks or single points of failure.
     ● Can we achieve end-to-end distribution, from messaging to processing to databases?
     Concurrent Programming
     ● One CPU core currently handles 1,000 trx/sec. Can one core handle 1,000 trx in a millisecond instead? That is 1M trx/sec.
     ● Can we remove context-switching overhead and idling on synchronous I/O?
     Parallel Programming
     ● Throw more CPU cores at different tasks.
  8. Distributed Computing
     Distribute workload between two or more computing devices or machines connected by some type of network.
     ● For example, a clustered architecture with multiple machines.
     However, in real-life web applications, we need to distribute workload across
     ● application servers,
     ● database servers,
     ● real-time computations or analytics.
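
One simple way to spread workload over several machines is to hash a request key and take it modulo the number of nodes. The sketch below assumes four hypothetical application servers (`app-1` to `app-4`) and made-up user keys; it is not taken from the slides, and real clusters often use consistent hashing or a dedicated load balancer instead.

```python
import hashlib
from collections import Counter
from typing import Sequence


def pick_node(key: str, nodes: Sequence[str]) -> str:
    """Map a key to one node using a stable hash (same key -> same node)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


NODES = ["app-1", "app-2", "app-3", "app-4"]      # hypothetical server names
request_keys = [f"user-{i}" for i in range(10_000)]  # hypothetical request keys

# Count how many requests each node would receive.
load = Counter(pick_node(key, NODES) for key in request_keys)
for node in NODES:
    print(node, load[node])
```

Because the mapping depends on `len(nodes)`, adding or removing a node reshuffles most keys; that is the main limitation of this simple scheme.
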
  9. Distributed Computing
     ● Distributed Storage
     ● Distributed Messaging
     ● Distributed Analytics (Real Time and Batch)
  10. Traditional vs New
      Spot the bottleneck node / single point of failure.
      Traditional: Load Balancer (L), Master DB (M) | New: ??
      Diagram labels: Traditional, New, Load balancing, App Servers, Master Slave DB Architecture
  11. Distributed Computing - Tools
      ➔ Distributed Messaging
        ◆ Apache Kafka, RabbitMQ, Apache ActiveMQ
        ◆ A detailed comparison from LinkedIn is available at
      ➔ Distributed Analytics
        ◆ Apache Storm (Real Time), Apache Spark (Batch)
      ➔ Distributed Storage
        ◆ Cassandra
  12. Use Cases
      Highly suitable for
      ● real-time analytics of high-velocity Big Data
      ● Machine to Machine (M2M) or Internet of Things (IoT)
      M2M, IoT and real-time analytics
  13. Concurrent Programming
      A form of computing in which several computations execute during overlapping time periods (concurrently) instead of sequentially.
      Software code that facilitates the performance of multiple computing tasks at the same time.
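
The difference between sequential execution and overlapping time periods can be seen in a small `asyncio` sketch. The request handler and the 50 ms delay are hypothetical stand-ins for I/O-bound work; the point is only that the concurrent version waits for all requests at once on a single thread.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay: float) -> str:
    # asyncio.sleep stands in for a non-blocking I/O wait (hypothetical workload).
    await asyncio.sleep(io_delay)
    return f"request {request_id} done"


async def run_sequentially(count: int, io_delay: float) -> list:
    # Each request finishes before the next one starts.
    return [await handle_request(i, io_delay) for i in range(count)]


async def run_concurrently(count: int, io_delay: float) -> list:
    # All requests are in flight during overlapping time periods.
    return await asyncio.gather(*(handle_request(i, io_delay) for i in range(count)))


def timed(make_coroutine):
    start = time.perf_counter()
    results = asyncio.run(make_coroutine())
    return results, time.perf_counter() - start


seq_results, seq_time = timed(lambda: run_sequentially(5, 0.05))
con_results, con_time = timed(lambda: run_concurrently(5, 0.05))

print(f"sequential: {seq_time:.3f}s, concurrent: {con_time:.3f}s")
```
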
  14. Architectural Concepts
      ● Events, Threads or Actors?
      ● Asynchronous Programming
      ● Functional Programming
      ● Concurrent Programming
  15. Events vs Threads, Actors
      NodeJS vs J2EE: performance comparison of multithreaded, synchronous technology using Spring/Hibernate vs event-based, single-process, asynchronous technology using NodeJS.
      Independent research report from TechFerry Innovation Lab
  16. Asynchronous Programming
      End-to-end asynchronous programming: non-blocking callbacks not just at the application layer but also at the UI or database layers.
      Pick asynchronous programming at the application, database or UI layer based on your use case.
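
A non-blocking callback at the database layer might look roughly like the following. `fake_db_query` is a hypothetical stand-in for an asynchronous database driver, not a real API; the sketch only shows that the caller keeps running and the callback fires once the result is ready.

```python
import asyncio


async def fake_db_query(sql: str) -> list:
    # Hypothetical stand-in for an async database call (simulated round trip).
    await asyncio.sleep(0.01)
    return [("row", sql)]


async def main() -> list:
    events = []

    def on_rows(task: asyncio.Task) -> None:
        # Invoked by the event loop when the query has completed.
        events.append(("callback", task.result()))

    task = asyncio.create_task(fake_db_query("SELECT 1"))
    task.add_done_callback(on_rows)

    # The caller is not blocked while the query is in flight.
    events.append(("caller continues", None))

    await task
    await asyncio.sleep(0)  # give the loop a turn so the callback has run
    return events


events = asyncio.run(main())
for name, payload in events:
    print(name, payload)
```
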
  17. Functional Programming
      A programming paradigm, a style of building the structure and elements of computer programs, that treats computation as the evaluation of mathematical functions and avoids changing state and mutable data.
      Routines can easily be moved to a different CPU core.
      Scala/Akka Actors
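
Why side-effect-free routines are easy to move between workers can be sketched in Python (used here instead of Scala/Akka purely for illustration). `order_total` and the sample orders are hypothetical. Because the function touches no shared state, the same `map` call gives the same answer whether it runs inline or on a pool of workers.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def order_total(order):
    """Pure function: the result depends only on the input; nothing is mutated."""
    price_cents, quantity = order
    return price_cents * quantity


def totals(orders, executor: Executor):
    """Hand each order to whichever worker the executor picks."""
    return list(executor.map(order_total, orders))


ORDERS = [(250, 4), (999, 1), (120, 10)]  # hypothetical (price in cents, quantity)

with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = totals(ORDERS, pool)

sequential = [order_total(order) for order in ORDERS]
print(parallel == sequential)
```

A thread pool keeps the sketch portable. In CPython, spreading CPU-bound work over separate cores would usually mean passing a `ProcessPoolExecutor` instead, which works here because `order_total` is a top-level function with no shared state.
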
  18. Innovation Labs @ TechFerry
  19. Symmetric Multi Processing
      Symmetric Multi Processing (SMP) is the processing of programs by multiple processors that share a common operating system and memory. The processors share memory and the I/O bus or data path. A single copy of the operating system is in charge of all the processors.
  20. Asymmetric vs Symmetric
      Asymmetric Multiprocessing
      ● Different CPUs take on different jobs.
      Symmetric Multi Processing (SMP)
      ● All CPUs run in parallel as equals, each able to do the same jobs.
      ● CPUs share the same memory.
  21. Thank You
      Contact Information: +1 408-337-6607
      /techferry /@techferry