A random list of Apache Cassandra anti-patterns. There is a lot of information on what to use Cassandra for and how, but not much on what not to do. This presentation works towards filling that gap.
Strange Loop 2012: Apache Cassandra Anti-Patterns
2. C* on a SAN
● fact: C* was designed, from the start, for
commodity hardware
● more than just not requiring a SAN, C*
actually performs better without one
● SPOF
● unnecessary (large) cost
● “(un)coordinated” IO from nodes
● SANs were designed to solve problems C*
doesn’t have
3. Commit Log + Data Directory
(on the same volume)
● conflicting IO patterns
● commit log is 100% sequential append only
● data directory is (usually) random on reads
● commit log is essentially serialized
● massive difference in write
throughput under load
● NB: does not apply to SSDs or EC2
4. Oversize JVM Heaps
● 4 – 8 GB is good
(assuming sufficient ram on your boxen)
● 10 – 12 GB is not bad
(and often “correct”)
● 16GB == max
● > 16GB => badness
● heap >= boxen RAM => badness
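The thresholds above can be written down as a small check. This is only a sketch of the slide's rules of thumb: the function name `classify_heap` is hypothetical, and the handling of sizes the slide does not mention (below 4 GB, or between the named bands) is an assumption.

```python
def classify_heap(heap_gb: float, ram_gb: float) -> str:
    """Classify a JVM heap size against the slide's rule-of-thumb thresholds."""
    if heap_gb >= ram_gb:
        return "bad: heap >= RAM"
    if heap_gb > 16:
        return "bad: over 16 GB"
    if 4 <= heap_gb <= 8:
        return "good"
    if 8 < heap_gb <= 16:
        # Slide calls 10-12 GB "not bad" and 16 GB the max; treating the
        # whole 8-16 GB band as acceptable is an assumption.
        return "acceptable"
    # The slide says nothing about heaps under 4 GB.
    return "unrated"


if __name__ == "__main__":
    for heap, ram in [(8, 32), (12, 64), (24, 64), (8, 8)]:
        print(heap, ram, classify_heap(heap, ram))
```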
6. not using -pr on scheduled repairs
● -pr is kind of new
● only applies to scheduled repairs
● reduces work to 1/RF (e.g. 1/3)
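The 1/RF figure follows from counting how often each token range gets repaired in one full sweep of the cluster. A rough sketch, assuming one token range per node and a repair run on every node (the function name is hypothetical):

```python
def range_repairs_per_sweep(nodes: int, rf: int, primary_only: bool) -> int:
    """Count range repairs when repair is run once on every node.

    Without -pr, each node repairs every range it holds a replica of (rf
    ranges), so each range is repaired rf times across the sweep.
    With -pr, each node repairs only its primary range.
    """
    if nodes < 1 or rf < 1:
        raise ValueError("nodes and rf must be positive")
    return nodes if primary_only else nodes * rf


if __name__ == "__main__":
    full = range_repairs_per_sweep(6, 3, primary_only=False)
    pr = range_repairs_per_sweep(6, 3, primary_only=True)
    print(full, pr, pr / full)
```

With 6 nodes and RF = 3 this gives 18 range repairs without -pr and 6 with it, i.e. one third of the work.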
7. low file handle limit
● C* requires lots of file handles
(sorry, deal with it)
● Sockets and SSTables mostly
● 1024 (common default) is not sufficient
● fails in horribly, miserably unpredictable ways
(though clear from the logs after the fact)
● 32K - 128K is common
● unlimited is also common, but personally I
prefer some sort of limit ...
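A quick way to see what limit a process actually gets is Python's standard `resource` module (Unix only). This sketch inspects the limit of the process running the script, not of a running C* node, and the 32K floor is simply the low end of the range quoted above; `nofile_ok` is a hypothetical helper name.

```python
import resource

# Low end of the slide's "32K - 128K" range, used here as an assumed floor.
MIN_RECOMMENDED = 32 * 1024


def nofile_ok(soft_limit: int, minimum: int = MIN_RECOMMENDED) -> bool:
    """Return True if an open-file soft limit meets the assumed floor."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited
    return soft_limit >= minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} ok={nofile_ok(soft)}")
```

Run it as the same user, and from the same kind of session, that starts C* so the reported limit is the one that user would receive.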
8. Load Balancers
(in front of C*)
● clients will load balance
(C* has no master so this can work reliably)
● SPOF
● performance bottleneck
● unneeded complexity
● unneeded cost
9. restricting clients to a single node
● why?
● no really, I don’t understand how
this was thought to be a good idea
● thedailywtf.com territory
10. Unbalanced Ring
● used to be the number one
problem encountered
● OPSC automates the resolution of
this to two clicks (do it + confirm)
even across multiple data centers
● related: don’t let C* auto pick your
tokens, always specify initial_token
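Evenly spaced tokens are easy to compute by hand. The sketch below assumes RandomPartitioner (token space 0 to 2**127) and a single data center; the slide names neither, and multi-data-center layouts need per-DC offsets that are not covered here. `balanced_tokens` is a hypothetical helper name.

```python
def balanced_tokens(node_count: int) -> list:
    """Return evenly spaced initial_token values for node_count nodes.

    Assumes RandomPartitioner, whose token space runs from 0 to 2**127.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(balanced_tokens(4)):
        print(f"node {node}: initial_token: {token}")
```

Each value goes into the `initial_token` setting of the corresponding node before it first joins the ring.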
11. Row Cache + Slice Queries
● the row cache is a row cache, not a query cache or
slice cache or magic cache or WTF-ever-you-thought-
it-was cache
● for the obviously impaired: that’s why we called it a
row cache – because it caches rows
● laughable performance difference in some extreme
cases (e.g. 100X increase in throughput, 10X drop in
latency, maxed cpu to under 10% average)
12. Row Cache + Large Rows
● 2GB row? yeah, let’s cache that !!!
● related: wtf are you doing trying to
read a 2GB row all at once anyway?
13. OPP/BOP
● if you think you need BOP, check
again
● no seriously, you’re doing it wrong
● if you use BOP anyway:
● IRC will mock you
● your OPS team will plan your disappearance
● I will set up an auto reply for your entire domain
that responds solely with “stop using BOP”
14. Unbounded Batches
● batches are sent as a single message
● they must fit entirely in memory
(both server side and client side)
● best size is very much an empirical
exercise depending on your HW, load,
data model, moon phase, etc
(start with 10 – 100 and tune)
● NB: streaming transport will address
this in future releases
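One way to keep batches bounded on the client side is to split the stream of mutations into fixed-size chunks before sending. A minimal sketch: the default of 50 is just a starting point inside the 10 – 100 range above, and how each chunk is sent depends on your client library, so no send call is shown.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(mutations: Iterable[T], batch_size: int = 50) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items from mutations, in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(mutations)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for batch in chunked(range(5), batch_size=2):
        print(batch)  # each list would be sent as one bounded batch
```

Because it consumes the input lazily, only one chunk at a time has to fit in client memory.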
15. Bad Rotational Math
● rotational disks require seek time
● 5ms is a fast seek time for a rotational disk
● you cannot get thousands of random seeks
per second from rotational disks
● caches/memory alleviate this, SSDs solve it
● maths are teh hard? buy SSDs
● everything fits in memory? I don’t care what
disks you buy
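The math in question is short. This sketch counts seek time only and ignores rotational latency and transfer time, so the real numbers are worse; the function names and the 5000 IOPS target are illustrative assumptions.

```python
import math


def max_random_seeks_per_second(seek_ms: float) -> float:
    """Upper bound on random seeks per second for one disk, seek time only."""
    if seek_ms <= 0:
        raise ValueError("seek_ms must be positive")
    return 1000.0 / seek_ms


def disks_needed(target_iops: float, seek_ms: float) -> int:
    """Minimum number of disks to reach target_iops of uncached random reads."""
    if seek_ms <= 0:
        raise ValueError("seek_ms must be positive")
    return math.ceil(target_iops * seek_ms / 1000.0)


if __name__ == "__main__":
    print(max_random_seeks_per_second(5.0))  # one disk with a 5 ms seek
    print(disks_needed(5000, 5.0))           # disks for 5000 random reads/s
```

At a 5 ms seek a single disk tops out at 200 random seeks per second, so 5000 uncached random reads per second would need at least 25 such disks.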
16. 32 Bit JVMs
● C* deals (usually) with BigData
● 32 bits cannot address BigData
● mmap, file offsets, heaps, caches
● always wrong? no, I guess not ...
17. EBS volumes
● nice in theory, but ...
● not predictable
● freezes common
● outages common
● stripe ephemeral drives instead
● provisioned IOPS EBS?
future hazy, ask again later
18. Non-Sun (err, Oracle) JVM
● at least u22, but in general the
latest release
(unless you have specific reasons otherwise)
● this is changing
● some people (successfully) use
OpenJDK anyway
19. Super Columns
● 10-15 percent overhead on reads and writes
● entire super column is always held in memory
at all stages
● most C* devs hate working on them
● C* and DataStax are committed to maintaining
the API going forward, but they should be
avoided for new projects
● composite columns are an alternative
20. Not Running OPSC
● extremely useful for postmortems
● trivial (usually) to set up
● DataStax offers a free version
(you have no excuse now)