This document discusses various techniques for scaling web applications, including horizontal scaling by adding more servers behind a load balancer, using a session store like Redis for shared sessions, centralized logging, and continuous integration to deploy updates. It also covers load balancing with HAProxy, monitoring with Zabbix, caching with Varnish, database scaling with master-slave replication or sharding in MongoDB, and using queues like RabbitMQ. The key is to think of the application as independent workers that can run on multiple servers rather than a single instance.
Massively Scaled High Performance Web Services with PHPDemin Yin
Over the years, people have questioned if PHP is a good choice for building web services. In this talk, I will share how we use PHP on the backend for Glu Mobile’s flagship mobile game Design Home, enabling it to regularly rank amongst the top free mobile games in the Apple App Store and the Google Play Store. We will deep dive into the thought processes, development, testing, and deployment strategy, showcasing what we have achieved with PHP.
Slides from my talk "Node.js Patterns for Discerning Developers" given at Pittsburgh TechFest 2013. This talk detailed common design pattern for Node.js, as well as common anti-patterns to avoid.
Node.js and How JavaScript is Changing Server Programming Tom Croucher
Node.js is a highly concurrent JavaScript server written on top of the V8 JavaScript runtime. This is awesome for a number of reasons. Firstly Node.js has re-architected some of the core module of V8 to create a server implementation that is non-blocking (similar to other event driven frameworks like Ruby’s Event Machine or Python’s Twisted). Event driven architectures are a natural fit for JavaScript developers because it’s already how the browser works. By using an event driven framework Node is not only intuitive to use but also highly scalable. Tests have shown Node instances running tens of thousands of simultaneous users.
This session will explore the architectural basics of Node.js and how it’s different from blocking server implementations such as PHP, Rail or Java Servlets. We’ll explore some basic examples of creating a simple server, dealing with HTTP requests, etc.
The bigger question is once we have this awesome programming environment, what do we do with it? Node already has a really vibrant collection of modules which provide a range of functionality. Demystifying what’s available is pretty important to actually getting stuff done with Node. Since Node itself is very low level, lot’s of things people expect in web servers aren’t automatically there (for example, request routing). In order to help ease people into using Node this session will look at a range of the best modules for Node.js.
All you need to know about the JavaScript event loopSaša Tatar
Learn the difference between JavaScript Engine, JavaScript Runtime, what is JavaScript event loop and why we should care.
At the end the presentation goes through a couple of examples and implementations of throttle and debounce utility functions.
This is a presentation I prepared for a local meetup. The audience is a mix of web designers and developers who have a wide range of development experience.
Massively Scaled High Performance Web Services with PHPDemin Yin
Over the years, people have questioned if PHP is a good choice for building web services. In this talk, I will share how we use PHP on the backend for Glu Mobile’s flagship mobile game Design Home, enabling it to regularly rank amongst the top free mobile games in the Apple App Store and the Google Play Store. We will deep dive into the thought processes, development, testing, and deployment strategy, showcasing what we have achieved with PHP.
Slides from my talk "Node.js Patterns for Discerning Developers" given at Pittsburgh TechFest 2013. This talk detailed common design pattern for Node.js, as well as common anti-patterns to avoid.
Node.js and How JavaScript is Changing Server Programming Tom Croucher
Node.js is a highly concurrent JavaScript server written on top of the V8 JavaScript runtime. This is awesome for a number of reasons. Firstly Node.js has re-architected some of the core module of V8 to create a server implementation that is non-blocking (similar to other event driven frameworks like Ruby’s Event Machine or Python’s Twisted). Event driven architectures are a natural fit for JavaScript developers because it’s already how the browser works. By using an event driven framework Node is not only intuitive to use but also highly scalable. Tests have shown Node instances running tens of thousands of simultaneous users.
This session will explore the architectural basics of Node.js and how it’s different from blocking server implementations such as PHP, Rail or Java Servlets. We’ll explore some basic examples of creating a simple server, dealing with HTTP requests, etc.
The bigger question is once we have this awesome programming environment, what do we do with it? Node already has a really vibrant collection of modules which provide a range of functionality. Demystifying what’s available is pretty important to actually getting stuff done with Node. Since Node itself is very low level, lot’s of things people expect in web servers aren’t automatically there (for example, request routing). In order to help ease people into using Node this session will look at a range of the best modules for Node.js.
All you need to know about the JavaScript event loopSaša Tatar
Learn the difference between JavaScript Engine, JavaScript Runtime, what is JavaScript event loop and why we should care.
At the end the presentation goes through a couple of examples and implementations of throttle and debounce utility functions.
This is a presentation I prepared for a local meetup. The audience is a mix of web designers and developers who have a wide range of development experience.
Xin Wang(Apache Storm Committer/PMC member)'s topic covered the relations between streaming and messaging platform, and the challenges and tips in Storm usage.
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
Este es el último seminario web de la serie Conceptos básicos, en la que se realiza una introducción a la base de datos MongoDB. En este seminario web le guiaremos por el despliegue en producción.
Right-Sizing your SQL Server Virtual Machineheraflux
Virtualizing your top-tier production SQL Servers is not as easy as P2V’ing it. Sometimes allocating more resources to the VM is the wrong approach, and getting it wrong will silently hurt performance. What is the most effective method for determining the ‘right’ amount of resources to allocate? What happens if the workload changes a month from now?
The methods for understanding the performance of your mission-critical SQL Servers gathered over the past ten years of SQL Server virtualization will be addressed, and valuable processes for performance statistic collection and analysis will be displayed. Come learn how to properly ‘right-size’ the resources allocated to a VM, improve the performance of your SQL Servers, and keep it maximized well into the future.
HPC control systems are evolving into the future. This presentation looks at where this evolution may lead, and describes how the control system of the future might be constructed.
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...Spark Summit
So you know you want to write a streaming app but any non-trivial streaming app developer would have to think about these questions:
How do I manage offsets?
How do I manage state?
How do I make my spark streaming job resilient to failures? Can I avoid some failures?
How do I gracefully shutdown my streaming job?
How do I monitor and manage (e.g. re-try logic) streaming job?
How can I better manage the DAG in my streaming job?
When to use checkpointing and for what? When not to use checkpointing?
Do I need a WAL when using streaming data source? Why? When don’t I need one?
In this talk, we’ll share practices that no one talks about when you start writing your streaming app, but you’ll inevitably need to learn along the way.
An introduction to message queues with PHP. We'll focus on RabbitMQ and how to leverage queuing scenarios in your applications. The talk will cover the main concepts of RabbitMQ server and AMQP protocol and show how to use it in PHP. The RabbitMqBundle for Symfony2 will be presented and we'll see how easy you can start to use message queuing in minutes.
Presented at Symfony User Group Belgium: http://www.meetup.com/Symfony-User-Group-Belgium/events/169953362/
Large-scale projects development (scaling LAMP)Alexey Rybak
This 8-hours tutorial was given at various conferences including Percona conference (London), DevConf (Moscow), Highload++ (Moscow).
ABSTRACT
During this tutorial we will cover various topics related to high scalability for the LAMP stack. This workshop is divided into three sections.
The first section covers basic principles of shared nothing architectures and horizontal scaling for the app//cache/database tiers.
Section two of this tutorial is devoted to MySQL sharding techniques, queues and a few performance-related tips and tricks.
In section three we will cover the practical approach for measuring site performance and quality, porviding a "lean" support philosophy, connecting buesiness and technology metrics.
In addition we will cover a very useful Pinba real-time statistical server, it's features and various use cases. All of the sections will be based on real-world examples built in Badoo, one of the biggest dating sites on the Internet.
Denser, cooler, faster, stronger: PHP on ARM microserversJez Halford
This is the story of how I helped make arm.com run on a small collection of state-of-the art ARM-based microservers, and the lessons I learned along the way, from how not to crash your entire multi-node MySQL database several times a day, to how not to have nginx consume all your disk space within a few minutes, all the way through to how to build a rock solid PHP application that is *truly* scalable to almost any size.
How to get the maximum performance from your AEP server. This will discuss ways to improve execution time of short running jobs and how to properly configure the server depending on the expected number of users as well as the average size and duration of individual jobs. Included will be examples of making use of job pooling, Database connection sharing, and parallel subprotocol tuning. Determining when to make use of cluster, grid, or load balanced configurations along with memory and CPU sizing guidelines will also be discussed.
Two popular tools for doing Machine Learning on top of JVM ecosystem is H2O and SparkML. This presentation compares these two tools as Machine Learning libraries (Didn't consider Spark's Data Munjing perspective). This work was done during June of 2018.
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingApache Apex
Presenter: Devendra Tagare - DataTorrent Engineer, Contributor to Apex, Data Architect experienced in building high scalability big data platforms.
Apache Apex is a next generation native Hadoop big data platform. This talk will cover details about how it can be used as a powerful and versatile platform for big data.
Apache Apex is a native Hadoop data-in-motion platform. We will discuss architectural differences between Apache Apex features with Spark Streaming. We will discuss how these differences effect use cases like ingestion, fast real-time analytics, data movement, ETL, fast batch, very low latency SLA, high throughput and large scale ingestion.
We will cover fault tolerance, low latency, connectors to sources/destinations, smart partitioning, processing guarantees, computation and scheduling model, state management and dynamic changes. We will also discuss how these features affect time to market and total cost of ownership.
Some of the most common questions we hear from users relate to capacity planning and hardware choices. How many replicas do I need? Should I consider sharding right away? How much RAM will I need for my working set? SSD or HDD? No one likes spending a lot of cash on hardware and cloud bills can just be as painful. MongoDB is different from traditional RDBMSs in its resource management, so you need to be mindful when deciding on the cluster layout and hardware. In this talk we will review the factors that drive the capacity requirements: volume of queries, access patterns, indexing, working set size, among others. Attendees will gain additional insight as we go through a few real-world scenarios, as experienced with MongoDB Inc customers, and come up with their ideal cluster layout and hardware.
Pragmatic Monolith-First, easy to decompose, clean architecturePiotr Pelczar
Designing systems architecture corresponding to business needs in long future is like a reading tea leaves. There is no common way to design systems. Making decision to start project with microservices may make refactoring much harder and introduce too much complexity in the infrastructure layer and finally slow down development. However maintaining a monolith is a tough nut to crack.
Let’s see how to build a system starting from well organized monolith with well marked technical and business scopes that enables to make a decision in with way it should be decomposed and how to deliver it. Strategic and tactical techniques from Domain-Driven Design and Hexagonal Architecture will be used. I will show you how to monitor accidential complexity using different tools during CI.
I invite you if you are interested in building systems with complex business domains.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
3. Think about your app as a worker
not single instance
OS
Load balancer
App
Server #1
App #1
App #2
Server #2
App #3
App #4
Server #3
App #5
4. Think about your app as a worker
not single instance
Load balancer
Server #1
App #1
Server #3
Load balancer
App #2
Server #2
App #3
App #4
App #5
Server #n
7. Sessions - Redis
•
•
•
•
•
Key-value in memory database (hash-tabled)
Scalable up to 1k nodes
Partitioning with Query routing
Non blocking M-S replication on nodes
Clustered (currently not production ready)
http://athlan.pl/symfony2-redis-session-handler/
8. Redis - Partitioning with Query routing
Query
random
node
Miss
Node #1
Hit, abort
Node #2
Node #3
Also supported:
• Client-side partitioning (app calls appropriate
node)
• Proxy assisted partitioning (proxy selects
appropriate node)
9. Centralized Logging
• Logs should be centrailzed to avoid taking
notice to each node separately
• Approaches:
– File replication (rsync + cron)
– syslog (easy to integrate with log4j)
• syslogd over UDP p:514
• rsyslog over TCP, stores data in db
10. Common storage, no local changes!
• Keep storage avaliable to all nodes
– Symfony2 Gaufrette Bundle
•
•
•
•
•
FTP
Amazon S3
OpenCloud
AzureBlobStorage
Rackspace
12. Continuous Integration
• To keep all nodes up-to-date, you need CI
• Automatize disabling nodes, building,
deploying
– Jenkins CI
13. Contineous Integration
1. Disable service on node
2. Deploy/build app
1. Copy files
2. Update db schema (liquibase, ORM schema
update)
3. Execute scripts
3. Re-run service
14. Balance the payload - HAProxy
Yeah guys, this is logo :)
But no schema is needed
just imagine how it works.
• Very, very fast proxy!
• Software TCP/HTTP load balancer
• Different node selecting algorithms:
– roudrobin (limit 4128)
– static-rr
– leastconn (lowest number of connections)
15. Balance the payload - HAProxy
• You can check node’s status by pinging
• Dead node is excluded from balancing strategy
vi /etc/haproxy/haproxy.cfg
option httpchk HEAD /check.txt HTTP/1.0
server webA 192.168.0.102:80 check
server webB 192.168.0.103:80 check
16. Balance the payload - HAProxy
• Monitor node’s status by read stats from
socket via socat.
echo "show stat" | socat
/tmp/haproxy.sock stdio
17. Balance the payload - HAProxy
• Monitor node’s status by native stats webapp
console
28. Varnish and ESI
<!DOCTYPE html>
<html>
<body>
<!-- ... some content -->
<!-- Embed the content of another page here -->
<esi:include src="http://..." />
<!-- ... more content -->
</body>
</html>
29. Scaling databases - Master slave
Write
Master
Slave
Read
• All data redundancy
Slave
Slave
30. MongoDB scaling
• Common models to spread data over nodes:
– range keys
– hash keys
• Many nodes on cheap machines
• No all data redundancy in each node
31. MongoDB – range-based keys
http://docs.mongodb.org
• Awesome for range queries (grab data from min nodes –
Query isolation)
• Not good enough to distribute data over nodes in case of
monotinic incemental
32. MongoDB – hash-based keys
http://docs.mongodb.org
• Take notice: not good for range queries while
merge-sorting, no Query isolation in this case
• Write scaling – Write to many nodes simultaneously (take
notice to readers-writer lock, where write is exclusive)
34. CQRS
• Command Query Responsibility Segregation
– separate application service layers for writing and
readng from DB (possibility to use different data
sources like RAM or DB)
35. CQRS
• Examples
– post-insert population cache
• all SELECTs are from cache (even invalid)
• consider LFU instead of LRU to invaidate cache
– pre-insert into memory
• dump results periodicaly
In both approaches there is convenient to use
Queues or data bus !
36. Queues, RabbitMQ
• RabbitMQ is based on AMQP (Advanced
Message Queuing Protocol)
– point-to-point
– publish-and-subscribe
– queueing, routing
• AMQP is not JMS (Java Message Service is an
API, not protocol)
• Happy Rabit is empty Rabbit
– do not try to store any data (messages) in queue
system in persistent mode to keep HA
38. Box vs spread architecture.
• Box architecture
– no scaling
– easy to maintenance
Server
Webapp
Redis
RabbitMQ
Varnish
DB
39. Box vs spread architecture.
• Spread architecture
– High availability
– more integrations, more administrative
Server #1
RabbitMQ
Redis
HAProxy
Server #2
Server #3
Webapp
Webapp
DB shard
Varnish
DB shard
Varnish