The document discusses Parse's process for benchmarking MongoDB upgrades by replaying recorded production workloads on test servers. They found a 33-75% drop in throughput when upgrading from 2.4.10 to 2.6.3 due to query planner bugs. Working with MongoDB, they identified and helped fix several bugs, improving performance in 2.6.5 but still below 2.4.10 levels initially. Further optimization work increased throughput above 2.4.10 levels when testing with more workers and operations.
How Parse Built a Mobile Backend as a Service on AWS (MBL307) | AWS re:Invent...Amazon Web Services
Parse is a BaaS for mobile developers that is built entirely on AWS. With over 150,000 mobile apps hosted on Parse, the stability of the platform is our primary concern, but it coexists with rapid growth and a demanding release schedule. This session is a technical discussion of the current architecture and the design decisions that went in to scaling the platform rapidly and robustly over the past year and a half. We talk about some of the lessons learned managing and scaling MongoDB, Cassandra, Redis, and MySQL in the cloud. We also discuss how Parse went from launching individual instances using chef to managing clusters of hosts with Auto Scaling groups, with instance discovery and registry handled by ZooKeeper, thus enabling us to manage vastly larger sets of services with fewer human resources. This session is useful to anyone who is trying to scale up from startup to established platform without sacrificing agility.
Application Performance Troubleshooting 1x1 - Part 2 - Noch mehr Schweine und...rschuppe
Application Performance doesn't come easy. How to find the root cause of performance issues in modern and complex applications? All you have is a complaining user to start with?
In this presentation (mainly in German, but understandable for english speakers) I'd reprised the fundamentals of trouble shooting and have some new examples on how to tackle issues.
Follow up presentation to "Performance Trouble Shooting 101 - Schweine, Schlangen und Papierschnitte"
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsDatabricks
This talk will cover some practical aspects of Apache Spark monitoring, focusing on measuring Apache Spark running on cloud environments, and aiming to empower Apache Spark users with data-driven performance troubleshooting. Apache Spark metrics allow extracting important information on Apache Spark’s internal execution. In addition, Apache Spark 3 has introduced an improved plugin interface extending the metrics collection to third-party APIs. This is particularly useful when running Apache Spark on cloud environments as it allows measuring OS and container metrics like CPU usage, I/O, memory usage, network throughput, and also measuring metrics related to cloud filesystems access. Participants will learn how to make use of this type of instrumentation to build and run an Apache Spark performance dashboard, which complements the existing Spark WebUI for advanced monitoring and performance troubleshooting.
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Flink Forward
http://flink-forward.org/kb_sessions/multi-tenant-flink-as-a-service-on-yarn/
Since June 2016, Flink-as-a-service has been available to researchers and companies in Sweden from the Swedish ICT SICS Data Center at www.hops.site using the HopsWorks platform. Flink applications can be either deployed as jobs (batch or streaming) or written and run directly from Apache Zeppelin on YARN. Flink applications are run within a project on a YARN cluster with the novel property that Flink applications are metered and charged to projects. Projects are also securely isolated from each other and include support for project-specific Kafka topics that are protected from access by users that are not members of the project. Hopsworks is entirely UI-driven, is open-source, and Flink applications that include Kafka topics can be created in a few mouse clicks. In this talk we will discuss the challenges in building a metered version of Flink-as-a-Service for YARN, experiences with Flink-on-YARN, and some of the possibilities that Hopsworks opens up for building secure, multi-ten
How Parse Built a Mobile Backend as a Service on AWS (MBL307) | AWS re:Invent...Amazon Web Services
Parse is a BaaS for mobile developers that is built entirely on AWS. With over 150,000 mobile apps hosted on Parse, the stability of the platform is our primary concern, but it coexists with rapid growth and a demanding release schedule. This session is a technical discussion of the current architecture and the design decisions that went in to scaling the platform rapidly and robustly over the past year and a half. We talk about some of the lessons learned managing and scaling MongoDB, Cassandra, Redis, and MySQL in the cloud. We also discuss how Parse went from launching individual instances using chef to managing clusters of hosts with Auto Scaling groups, with instance discovery and registry handled by ZooKeeper, thus enabling us to manage vastly larger sets of services with fewer human resources. This session is useful to anyone who is trying to scale up from startup to established platform without sacrificing agility.
Application Performance Troubleshooting 1x1 - Part 2 - Noch mehr Schweine und...rschuppe
Application Performance doesn't come easy. How to find the root cause of performance issues in modern and complex applications? All you have is a complaining user to start with?
In this presentation (mainly in German, but understandable for english speakers) I'd reprised the fundamentals of trouble shooting and have some new examples on how to tackle issues.
Follow up presentation to "Performance Trouble Shooting 101 - Schweine, Schlangen und Papierschnitte"
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsDatabricks
This talk will cover some practical aspects of Apache Spark monitoring, focusing on measuring Apache Spark running on cloud environments, and aiming to empower Apache Spark users with data-driven performance troubleshooting. Apache Spark metrics allow extracting important information on Apache Spark’s internal execution. In addition, Apache Spark 3 has introduced an improved plugin interface extending the metrics collection to third-party APIs. This is particularly useful when running Apache Spark on cloud environments as it allows measuring OS and container metrics like CPU usage, I/O, memory usage, network throughput, and also measuring metrics related to cloud filesystems access. Participants will learn how to make use of this type of instrumentation to build and run an Apache Spark performance dashboard, which complements the existing Spark WebUI for advanced monitoring and performance troubleshooting.
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Flink Forward
http://flink-forward.org/kb_sessions/multi-tenant-flink-as-a-service-on-yarn/
Since June 2016, Flink-as-a-service has been available to researchers and companies in Sweden from the Swedish ICT SICS Data Center at www.hops.site using the HopsWorks platform. Flink applications can be either deployed as jobs (batch or streaming) or written and run directly from Apache Zeppelin on YARN. Flink applications are run within a project on a YARN cluster with the novel property that Flink applications are metered and charged to projects. Projects are also securely isolated from each other and include support for project-specific Kafka topics that are protected from access by users that are not members of the project. Hopsworks is entirely UI-driven, is open-source, and Flink applications that include Kafka topics can be created in a few mouse clicks. In this talk we will discuss the challenges in building a metered version of Flink-as-a-Service for YARN, experiences with Flink-on-YARN, and some of the possibilities that Hopsworks opens up for building secure, multi-ten
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Till Rohrmann
Modern stream processing engines not only have to process millions of events per second at sub-second latency but also have to cope with constantly changing workloads. Due to the dynamic nature of stream applications where the number of incoming events can strongly vary with time, systems cannot reliably predetermine the amount of required resources. In order to meet guaranteed SLAs as well as utilizing system resources as efficiently as possible, frameworks like Apache Flink have to adapt their resource consumption dynamically. In this talk, we will take a look under the hood and explain how Flink scales stateful application in and out. Starting with the concept of key groups and partionable state, we will cover ways to detect bottlenecks in streaming jobs and discuss efficient strategies how to scale out operators with minimal down-time.
Webinar: Queues with RabbitMQ - Lorna MitchellCodemotion
Queues are a great addition to any application that has some tasks that need processing asynchronously. This could be sending a confirmation email, resizing an avatar, or recalculating a running total of some kind; in all those cases it would be cool to send the response back to the user and then sort out that task later. This session looks at how to use a RabbitMQ job queue in your application. It also looks at how to design elegant and robust long-running workers that will consume the jobs from the queue and process them. This session is ideal for technical leads, developers and architects alike.
Strata+Hadoop 2015 NYC End User Panel on Real-Time Data AnalyticsSingleStore
Strata+Hadoop 2015 NYC End User Panel on Real-Time Data Analytics: Novus, DigitalOcean, Akamai.
Building Predictive Applications with Real-Time Data Pipelines and Streamliner. Eric Frenkiel, CEO and Co-Founder, MemSQL
Erlang factory SF 2011 "Erlang and the big switch in social games"Paolo Negri
talk given at erlang factory 2011 about using erlang to build social games backends
Watch the video of this presentation http://vimeo.com/22144057#at=0
Degrading Performance? You Might be Suffering From the Small Files SyndromeDatabricks
No matter if your data pipelines are handling real-time event-driven streams, near-real-time streams, or batch processing jobs. When you work with a massive amount of data made out of small files, specifically parquet, your system performance will degrade.
A small file is one that is significantly smaller than the storage block size. Yes, even with object stores such as Amazon S3, Azure Blob, etc., there is minimum block size. Having a significantly smaller object file can result in wasted space on the disk since the storage is optimized to support fast read and write for minimal block size.
To understand why this happens, you need first to understand how cloud storage works with the Apache Spark engine. In this session, you will learn about Parquet, the Storage API calls, how they work together, why small files are a problem, and how you can leverage DeltaLake for a more straightforward, cleaner solution.
Nagios Conference 2012 - Nicolas Brousse - Optimizing your Monitoring and Tre...Nagios
Nicolas Brousse's presentation on using Nagios to monitor a dynamic cloud environment.
The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Slides of erlang factory 2011 London talk "Designing for performance with erlang"
Video of this presentation available at http://vimeo.com/26715793#at=0
The reality of software systems is that the business needs they intend to service will change over time. So applications must be created that are able to evolve and follow these changing needs.
Welcome to the world where we are building high-volume distributed streaming applications using systems like Apache Flink, Spark and Kafka. Applications that are assumed to run 'forever' never go down.
So what happens if a business need changes? How can you make a streaming application that can evolve without breaking all the downstream applications that depend on it? Roll out the new producers first? Roll out the new consumers first? How do I avoid going down? But wait! Systems like Kafka persist records for weeks; so how do you handle the fact that there can be several different schemas in the Kafka topic at a certain point in time? Can you deploy a new application that reads both formats?
In this presentation Niels Basjes (Avro PMC) will go into the ways bol.com has chosen to handle these effects in a practical way. He will describe how the "Message" format and the schema evolution features of Apache Avro are used in conjuction with Apache Flink to make applications really 'evolvable'.
How do we make sure all applications are able to find the schema specifications, what can we do to ensure schemas stay 'evolvable,' and what were the pitfalls we ran into? Join us and find out.
The feature we always hear about whenever Java 9 is in the news is Jigsaw, modularity. But this doesn't scratch the
same developer itch that Java 8's lambdas and streams did, and we're left with a vague sensation that the next version might not be that interesting.
Java 9 actually has a lot of great additions and changes to make development a bit nicer. These features can't be lumped under an umbrella term like Java 8's lambdas and streams, the changes are scattered throughout the APIs and language features that we regularly use.
In this presentation Trisha will show, via live coding:
- How we can use the new Flow API to utilise Reactive Programming
- How the improvements to the Streams API make it easier to control real-time streaming data
- How to the Collections convenience methods simplify code
Along the way we'll bump into other Java 9 features, including some of the additions to interfaces and changes to deprecation. We’ll see that once you start using Java 9, you can't go back to Before.
How does the Cloud Foundry Diego Project Run at Scale, and Updates on .NET Su...Amit Gupta
The Cloud Foundry Diego team at Pivotal has been hard at work for the past few months exploring and improving Diego's performance at scale and under stress. This talk covers the goals, tools, and results of the experiments to date, as well as a glimpse of what's next.
And finally, a brief teaser about the current state of .NET support in Diego
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
MongoDB presentation from Silicon Valley Code Camp 2015.
Walkthrough developing, deploying and operating a MongoDB application, avoiding the most common pitfalls.
Silicon Valley Code Camp 2014 - Advanced MongoDBDaniel Coupal
MongoDB presentation from Silicon Valley Code Camp 2014.
Walkthrough developing, deploying and operating a MongoDB application, avoiding the most common pitfalls.
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Till Rohrmann
Modern stream processing engines not only have to process millions of events per second at sub-second latency but also have to cope with constantly changing workloads. Due to the dynamic nature of stream applications where the number of incoming events can strongly vary with time, systems cannot reliably predetermine the amount of required resources. In order to meet guaranteed SLAs as well as utilizing system resources as efficiently as possible, frameworks like Apache Flink have to adapt their resource consumption dynamically. In this talk, we will take a look under the hood and explain how Flink scales stateful application in and out. Starting with the concept of key groups and partionable state, we will cover ways to detect bottlenecks in streaming jobs and discuss efficient strategies how to scale out operators with minimal down-time.
Webinar: Queues with RabbitMQ - Lorna MitchellCodemotion
Queues are a great addition to any application that has some tasks that need processing asynchronously. This could be sending a confirmation email, resizing an avatar, or recalculating a running total of some kind; in all those cases it would be cool to send the response back to the user and then sort out that task later. This session looks at how to use a RabbitMQ job queue in your application. It also looks at how to design elegant and robust long-running workers that will consume the jobs from the queue and process them. This session is ideal for technical leads, developers and architects alike.
Strata+Hadoop 2015 NYC End User Panel on Real-Time Data AnalyticsSingleStore
Strata+Hadoop 2015 NYC End User Panel on Real-Time Data Analytics: Novus, DigitalOcean, Akamai.
Building Predictive Applications with Real-Time Data Pipelines and Streamliner. Eric Frenkiel, CEO and Co-Founder, MemSQL
Erlang factory SF 2011 "Erlang and the big switch in social games"Paolo Negri
talk given at erlang factory 2011 about using erlang to build social games backends
Watch the video of this presentation http://vimeo.com/22144057#at=0
Degrading Performance? You Might be Suffering From the Small Files SyndromeDatabricks
No matter if your data pipelines are handling real-time event-driven streams, near-real-time streams, or batch processing jobs. When you work with a massive amount of data made out of small files, specifically parquet, your system performance will degrade.
A small file is one that is significantly smaller than the storage block size. Yes, even with object stores such as Amazon S3, Azure Blob, etc., there is minimum block size. Having a significantly smaller object file can result in wasted space on the disk since the storage is optimized to support fast read and write for minimal block size.
To understand why this happens, you need first to understand how cloud storage works with the Apache Spark engine. In this session, you will learn about Parquet, the Storage API calls, how they work together, why small files are a problem, and how you can leverage DeltaLake for a more straightforward, cleaner solution.
Nagios Conference 2012 - Nicolas Brousse - Optimizing your Monitoring and Tre...Nagios
Nicolas Brousse's presentation on using Nagios to monitor a dynamic cloud environment.
The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Slides of erlang factory 2011 London talk "Designing for performance with erlang"
Video of this presentation available at http://vimeo.com/26715793#at=0
The reality of software systems is that the business needs they intend to service will change over time. So applications must be created that are able to evolve and follow these changing needs.
Welcome to the world where we are building high-volume distributed streaming applications using systems like Apache Flink, Spark and Kafka. Applications that are assumed to run 'forever' never go down.
So what happens if a business need changes? How can you make a streaming application that can evolve without breaking all the downstream applications that depend on it? Roll out the new producers first? Roll out the new consumers first? How do I avoid going down? But wait! Systems like Kafka persist records for weeks; so how do you handle the fact that there can be several different schemas in the Kafka topic at a certain point in time? Can you deploy a new application that reads both formats?
In this presentation Niels Basjes (Avro PMC) will go into the ways bol.com has chosen to handle these effects in a practical way. He will describe how the "Message" format and the schema evolution features of Apache Avro are used in conjuction with Apache Flink to make applications really 'evolvable'.
How do we make sure all applications are able to find the schema specifications, what can we do to ensure schemas stay 'evolvable,' and what were the pitfalls we ran into? Join us and find out.
The feature we always hear about whenever Java 9 is in the news is Jigsaw, modularity. But this doesn't scratch the
same developer itch that Java 8's lambdas and streams did, and we're left with a vague sensation that the next version might not be that interesting.
Java 9 actually has a lot of great additions and changes to make development a bit nicer. These features can't be lumped under an umbrella term like Java 8's lambdas and streams, the changes are scattered throughout the APIs and language features that we regularly use.
In this presentation Trisha will show, via live coding:
- How we can use the new Flow API to utilise Reactive Programming
- How the improvements to the Streams API make it easier to control real-time streaming data
- How to the Collections convenience methods simplify code
Along the way we'll bump into other Java 9 features, including some of the additions to interfaces and changes to deprecation. We’ll see that once you start using Java 9, you can't go back to Before.
How does the Cloud Foundry Diego Project Run at Scale, and Updates on .NET Su...Amit Gupta
The Cloud Foundry Diego team at Pivotal has been hard at work for the past few months exploring and improving Diego's performance at scale and under stress. This talk covers the goals, tools, and results of the experiments to date, as well as a glimpse of what's next.
And finally, a brief teaser about the current state of .NET support in Diego
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
MongoDB presentation from Silicon Valley Code Camp 2015.
Walkthrough developing, deploying and operating a MongoDB application, avoiding the most common pitfalls.
Silicon Valley Code Camp 2014 - Advanced MongoDBDaniel Coupal
MongoDB presentation from Silicon Valley Code Camp 2014.
Walkthrough developing, deploying and operating a MongoDB application, avoiding the most common pitfalls.
I'm talking about how Ansible helps Backbase establish testing pipeline to ensure the quality of Customer Experience Platform - the leading horizontal portal software. This is done by utilizing the concept of immutable infrastructure to provision on-demand infrastructure use it and the dispose.
This session introduces tools that can help you analyze and troubleshoot performance with SharePoint 2013. This sessions presents tools like perfmon, Fiddler, Visual Round Trip Analyzer, IIS LogParser, Developer Dashboard and of course we create Web and Load Tests in Visual Studio 2013.
At the end we also take a look at some of the tips and best practices to improve performance on SharePoint 2013.
Migrare la tua applicazione verso il cloud è estremamente semplice, sulla carta. La dura verità è che l'unico modo per sapere con certezza come si comporterà è testare con attenzione. Estrarre un benchmark on premises è già abbastanza difficile, ma il benchmarking nel cloud può diventare davvero complicato a causa delle restrizioni negli ambienti PaaS e per la mancanza di strumenti.
Raggiungimi in questa sessione e scopri come catturare un carico di lavoro da produzione, riprodurlo nel tuo database cloud e confrontare le prestazioni. Ti illustrerò la metodologia e gli strumenti per portare il tuo database nel cloud senza battere ciglio.
By Gianluca Sartori
Presented at STPCon 2016. With the extensive amount of testing performed nightly on large software projects, test and verification teams often experience lengthy wait times for the availability of test results of the latest build. As we strive to identify and resolve issues as fast as possible, alternative methods of test execution have to be found. Learn how to use Jenkins to launch tests in parallel across a number of Virtual Machines, monitor execution health, and process results. Learn about various Jenkins plugins and how they contributed to the solution. Learn how to trigger downstream jobs, even if they are on separate Jenkins instances.
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
Morningstar’s Risk Model project is created by stitching together statistical and machine learning models to produce risk and performance metrics for millions of financial securities. Previously, we were running a single version of this application, but needed to expand it to allow for customizations based on client demand. With the goal of running hundreds of custom Risk Model runs at once at an output size of around 1TB of data each, we had a challenging technical problem on our hands! In this presentation, we’ll talk about the challenges we faced replatforming this application to Spark, how we solved them, and the benefits we saw.
Some things we’ll touch on include how we created customized models, the architecture of our machine learning application, how we maintain an audit trail of data transformations (for rigorous third party audits), and how we validate the input data our model takes in and output data our model produces. We want the attendees to walk away with some key ideas of what worked for us when productizing a large scale machine learning platform.
Hangfire
An easy way to perform background processing in .NET and .NET Core applications. No Windows Service or separate process required.
Why Background Processing?
Lengthy operations like updating lot of records in DB
Checking every 2 hours for new data or files
Invoice generation at the end of every billing period
Monthly Reporting
Rebuild data, indexes or search-optimized index after data change
Automatic subscription renewal
Regular Mailings
Send an email due to an action
Background service provisioning
Performance Benchmarking: Tips, Tricks, and Lessons LearnedTim Callaghan
Presentation covering 25 years worth of lessons learned while performance benchmarking applications and databases. Presented at Percona Live London in November 2014.
Benchmarking, Load Testing, and Preventing Terrible DisastersMongoDB
"Have you ever crossed your fingers before performing an upgrade or switching storage engines, because you weren't quite sure what would happen? Have you ever been bitten by a slight change in behavior that turned out to be unexpectedly significant for your workload? At Parse we have developed a workflow that lets us repeatedly capture and replay real production workloads offline. This has allowed us to confidently perform upgrades across a large fleet with a minimum amount of canarying, and has helped us load test a variety of storage engines with real workloads so we can compare and understand the performance tradeoffs.
In this talk we will cover best practices for upgrades and migrations, and we will walk through how to use our open-sourced tooling to demonstrate how you can do the same. We will also share some fun war stories about various disasters found and averted *before* putting them into production thanks to offline benchmarking."
Using Riak for Events storage and analysis at Booking.comDamien Krotkine
At Booking.com, we have a constant flow of events coming from various applications and internal subsystems. This critical data needs to be stored for real-time, medium and long term analysis. Events are schema-less, making it difficult to use standard analysis tools.This presentation will explain how we built a storage and analysis solution based on Riak. The talk will cover: data aggregation and serialization, Riak configuration, solutions for lowering the network usage, and finally, how Riak's advanced features are used to perform real-time data crunching on the cluster nodes.
Laying the Foundation for Ionic Platform Insights on SparkIonic Security
The Ionic Analytics team shares insights about the system they built using Spark and Databricks to enable low cost, flexible reporting and lay a foundation for advanced analytics.
These slides were originally presented at the Databricks Data+ML Workshop entitled "Unify Data Pipelines with Machine Learning" on Tuesday September 11 2018 in Atlanta, GA.
MongoDB 3.2 introduces a host of new features and benefits, including encryption at rest, document validation, MongoDB Compass, numerous improvements to queries and the aggregation framework, and more. To take advantage of these features, your team needs an upgrade plan.
In this session, we’ll walk you through how to build an upgrade plan. We’ll show you how to validate your existing deployment, build a test environment with a representative workload, and detail how to carry out the upgrade. By the end, you should be prepared to start developing an upgrade plan for your deployment.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
2. Parse?
• Parse is a backend service for mobile apps
• Data Storage
• Server-side code
• Push Notifications
• Analytics
• … all by dropping an SDK into your app
3. Parse Stats
• Parse has 400,000 apps
• Rapidly growing MongoDB deployment with:
• 500 databases
• 2.5M collections
• 8M indexes
• 50T storage (excluding replication)
• We have all kinds of workloads!
4. Variety is Fun
• We support just about any kind workload you can
imagine
• Games, social networking, events, travel, music, etc
• Apps that are read heavy or write heavy
• Heavy push users (time sensitive notifications)
• Apps that store large objects
• Apps that use us for backups
• Inefficient queries
5. 2.6 - Why Upgrade?
• General desire to stay current, precursor
for 2.8 and pluggable storage engines
• Specific features in 2.6
• Background indexing on secondaries
• Index intersection
• query plan summary logging
6. Upgrading is Scary
• In the early days, we just upgraded
• Put a new version on a secondary
• ???
• Upgrade primaries
• ???
• Fix bugs as we find them - LIVE!
7. Upgrading
• We’re too big now to cowboy it up
• Upgrading blindly is a potential catastrophe
• In particular, we want to avoid:
• Significant performance regressions
• Unexpected bugs that break customer
apps
8. Benchmarking
• We know that:
• Benchmarking can detect performance
regressions between versions
• Tools and sample workloads (sysbench, YCSB,
…) already exist
• MongoDB runs its own benchmarks
• Our workload is complex - we want more
confidence
9. A Customized Approach
• Why not test with production
workloads?
• Flashback: https://github.com/
ParsePlatform/flashback
• Record - python tool to record ops
• Replay - go tool to play back ops
10. Record
• Record leverages mongo’s profiling and oplog
• Profiling is enabled on all DBs
• Inserts are collected from the oplog
• All other ops taken from profile db
• Ops are recorded for specified time period
(24H) and then merged
• Produces a JSON file of ops to feed the replay
tool
12. Base Snapshot
• Need to replay prod ops on prod data
• It’s best to play back ops on a consistent copy of the data,
otherwise:
• inserts are duplicate key errors
• deletes are no-ops
• queries don’t return the right data
• Using EBS snapshots, we grab a copy of the db during the
recording
• Discard ops before the snapshot
14. Base Snapshot
• Snapshot is restored to our benchmark server(s)
• EBS volume has to be “warmed” because snapshot
blocks are not instantiated
• Multi TB volumes can take a few hours to warm
• After warming we create an LVM snapshot
• We can “rewind” (merge) after each playback,
iterating faster
15. Playback
1. Freeze the LVM volume
2. Start the version of mongo being tested
3. Adjust replay parameters
• # workers
• # num ops
• timestamp to start at (when base snapshot was taken)
4. Go!
5. Client-side results are logged to file, server-side collected
from monitoring tools
17. Our Workload
• 24h of ops collected
• 10M ops at a time, as fast as possible
• 10 workers
• No warming of RS
• LVM snapshot reset, mongod restarted for
each version
• Rinse and repeat for multiple replica sets
20. Results
• 33% loss in throughput.
• A second workload showed a 75% drop
in throughput
• 3669.73 ops/sec vs 975.64 ops/sec
• Ouch! What do we do next?
23. Bug Hunt!
• Old fashioned troubleshooting begins
• Began isolating query patterns and collections
with high max times
• Reproduced issue, confirmed slowness in 2.6
• Lots of documentation and log gathering,
including extremely verbose QLOG
• Started investigation with the Mongo team that ran
several weeks
24. What we found
• Basically, new query planner in 2.6 meets Parse
auto-indexer
• We create lots of indexes automatically
• More indexes to score and potentially race
• Increased likelihood of running into query
planner bugs
25. Example 1
Remove op on “Installation”
{ "installationId": {"$ne": ? }, "appIdentifier": "?",
"deviceToken": “?”}
• 9M documents
• installationId is UUID, unique value
• "installationId": {"$ne": ? } matches most documents
• deviceToken is a unique token identifying the device
26. { "installationId": {"$ne": ? }, "appIdentifier": "?", "deviceToken": “?”}
• Three candidate indexes:
{installationId: 1, deviceToken: 1}
{deviceToken: 1, installationId: 1}
{deviceToken: 1}
• The second and third indexes are clearly better candidates
for this query, since the device token is a simple point lookup.
• Mongo bug where the work required to skip keys was not
factored in to the plan ranking, causing the inefficient plan to
sometimes tie
• Since it’s a remove op, held the write lock for the DB
• Fixed in: https://jira.mongodb.org/browse/SERVER-14311
27. Example 2
Query on “Activity”:
{ $or: [ { _p_project: “?" }, { _p_newProject: “?”} ], acl: { $in: [ "a", “b”, “c" ] } } }
• 25M documents
• _p_project and _p_newProject are pointers to unique IDs of other objects
• acl matches most documents
• Four candidate indexes for this query
{ _p_newProject: 1 }
{ _p_project: 1 }
{ _p_project: 1, _created_at: 1 }
{ acl: 1 }
28. { $or: [ { _p_project: “?" }, { _p_newProject: “?”} ], acl: { $in: [ "a", “b”, “c" ] } } }
• Query Planner would race multiple plans using indexes
• Due to a bug, one of the raced indexes would do a full
index scan (acl)
• Index scan was non-yielding, tying up the lock until it had
completed
• Parse query killer job kills non-yielding queries after 45s
• Query planner would fail to cache plan, and would re-run
on next query with the same pattern
• Fixed: https://jira.mongodb.org/browse/SERVER-15152
29. Example 3
Query on “Activity”: { $or: [ { _p_project: “?" }, { _p_newProject: “?”} ], acl:
{ $in: [ "a", “b”, “c" ] } } } (same as previous example)
• Usually fast, but occasionally saw high nscanned and query time > 60s
• Since there were indexes on all fields in AND condition, this was a
candidate for index intersection
• planSummary: IXSCAN { _p_project: 1 }, IXSCAN
{ _p_newProject: 1 }, IXSCAN { acl: 1.0 }
• acl was not selective, but _p_project and _p_newProject would
sometimes match 0 documents during race
• intersection-based query plan would get cached, subsequent queries
slow
• Fixed in https://jira.mongodb.org/browse/SERVER-14961
31. Comparison
2.4.10
P99
2.4.10
MAX
2.6.4
P99
2.6.4
MAX
2.6.5
P99
2.6.5
MAX
query
18
ms
20,953
ms
19
ms
60,001
ms
10
ms
4,352
ms
insert
23
ms
6,290
ms
50
ms
48,837
ms
24
ms
2,225
ms
update
22
ms
3,835
ms
21
ms
48,776
ms
23
ms
4,535
ms
FAM
22
ms
6,159
ms
24
ms
49,254
ms
23
ms
4,353
ms
33. What now?
• 2.6 has a green light on performance
• Working through functionality testing
• Unit/integration testing catching
majority of issues
• Bonus: Flashback error log helping us to
identify problems not caught by tests
34. Wrap Up
• Benchmarking with something representative of your
production workload is worth the time
• Saved us from discovering slowness in production and
inevitable and painful rollbacks
• Using actual production data is even better
• Helped us avoid new bugs
• Learned a lot about our own service (indexing
algorithms need some work)
• Initial work can be reused to efficiently test future versions