The introduction of DateTieredCompactionStrategy in late 2014 was a significant step forward in providing a viable compaction strategy for time series data, especially time series data that will be TTL'd out. DateTieredCompactionStrategy's introduction was met with genuine excitement, and its rapid adoption is testament to developers' and operators' desire to have data compacted in a way that better matches their write patterns.
However, DateTieredCompactionStrategy's features come with significant limitations. This talk will review our real-world benchmarking and use cases for DTCS as a vehicle to discuss the implications of DateTieredCompactionStrategy for operational tasks such as repair, read-repair, bootstrapping, and especially DR recovery scenarios. It will also discuss how those limitations led us to propose an operations-friendly alternative to DateTieredCompactionStrategy.
Using Time Window Compaction Strategy For Time Series Workloads - Jeff Jirsa
Cassandra is a great fit for high-write use cases, which makes it a popular choice for storing time series and sensor-collection workloads. At Crowdstrike, we've been using Cassandra for just that purpose, collecting petabytes of expiring time series data. In this talk, I'll discuss compaction in time series workloads, and the TimeWindowCompactionStrategy we developed specifically for this purpose. I'll detail TWCS-specific configuration properties, some lesser-known compaction sub-properties that apply to all compaction strategies, and also cover other general tricks and tuning that are useful for very large time-series workloads.
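To make the idea of time windows concrete, here is a minimal Python sketch of how write timestamps could be grouped into fixed windows. It is a simplified illustration, not Cassandra's code: the helper names `window_start` and `group_by_window` are hypothetical, and the `unit` and `size` arguments only mirror the TWCS options `compaction_window_unit` and `compaction_window_size`.

```python
# Simplified illustration of time-window bucketing (not Cassandra's implementation).
# Assumption: windows are aligned to the Unix epoch and timestamps are in seconds.

UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_start(ts_seconds: int, unit: str = "DAYS", size: int = 1) -> int:
    """Return the start (in epoch seconds) of the window containing ts_seconds."""
    span = UNIT_SECONDS[unit] * size
    return ts_seconds - (ts_seconds % span)


def group_by_window(timestamps, unit: str = "DAYS", size: int = 1) -> dict:
    """Group timestamps by the window they fall into."""
    buckets = {}
    for ts in timestamps:
        buckets.setdefault(window_start(ts, unit, size), []).append(ts)
    return buckets


if __name__ == "__main__":
    # Two writes on the first day after the epoch, one on the second day.
    print(group_by_window([10, 20, 86410], unit="DAYS", size=1))
```

With one-day windows, data written on the same day lands in the same bucket, so data with the same TTL tends to expire together.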
Cassandra Summit 2015: Real World DTCS For Operators - Jeff Jirsa
From Divided to United - Aligning Technical and Business Teams - Dominica DeGrandis
This is a true story of one SaaS company's journey to gain alignment across business and technical teams by changing how four important factors were viewed: customer demand, work prioritization, team metrics, and communication etiquette.
Webinar: Diagnosing Apache Cassandra Problems in Production - DataStax Academy
This session covers diagnosing and solving common problems encountered in production, using performance profiling tools. We’ll also give a crash course in basic JVM garbage collection tuning. Viewers will leave with a better understanding of what they should look for when they encounter problems with their in-production Cassandra cluster.
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D... - DataStax
In this presentation, we will look into JIRAs, JavaDocs and system log entries to gain a deeper understanding of how LCS works under the hood. We will explain what scenarios don't work well for LCS and (more importantly) why. We will leverage legacy TRACE/DEBUG level logs for compaction-related objects as well as some newer compaction logging information introduced in C* 3.6 (CASSANDRA-10805) to gain better insights.
About the Speakers
Wei Deng Solutions Architect, DataStax
Solutions Architect for DataStax. I have a strong interest in big data, cloud application and distributed computing practices.
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ... - DataStax
Time series data has long been a natural use case for Cassandra, with plenty of write-ups showing you how to store mock data "at scale". Unfortunately, warnings of wide rows and examples of storing numeric-only data aren't sufficient to guide your organization through the realities of running these workloads. Instead you find yourself implementing anti-patterns like rotating clusters, only to be taken in by the siren song of DTCS, your hopes dashed across the rocks of ever-expanding disk utilization.
We will share the lessons learned at Threat Stack, a continuous security monitoring platform, about how to scale a large volume of bulky transactions totaling terabytes and petabytes on AWS while holding yourself to a sane budget and a DBA-free operational life.
Specifics to include "break the glass" operational maneuvers, making DTCS function properly, data modeling, and living in a polyglot data platform.
About the Speaker
Sam Bisbee CTO, Threat Stack
As the CTO at Threat Stack, Sam is responsible for leading the company's strategic technology roadmap for its continuous security monitoring service, purpose-built for cloud environments. Sam brings highly relevant experience in distributed systems in public, private, and hybrid cloud environments, as well as proven success scaling SaaS startups. Sam was most recently the CXO at Cloudant (acquired by IBM in Feb. 2014), a leader in the Database-as-a-Service space.
What We Learned About Cassandra While Building go90 (Christopher Webster & Th... - DataStax
Go90 is a mobile entertainment platform offering access to live and on-demand videos. We built the web services platform and social features like the activity feed for go90 by making heavy use of Cassandra and Scala, and would like to share what we learned during development and while operating go90. In this presentation, we cover our data model evolution from the initial prototypes to the current production version and the significant performance gain from using a better data model. We will explain how we apply time series data modeling and the benefits of using expiring columns with DateTieredCompactionStrategy. We will also talk about interesting experiences related to table modifications, tombstones and table pagination. On the operations side, we will discuss our findings on Java driver usage, performance, monitoring, cluster maintenance, version upgrades, 2-way SSL and many more. We hope you can learn from our mistakes instead of making them yourself!
About the Speakers
Christopher Webster Software Engineer, AOL
Christopher Webster works on the web services platform for the go90 AOL project. Previously he was a Computer Scientist for the Mission Control Technologies project at NASA Ames Center. Chris worked as a senior staff engineer at Sun Microsystems for Project zembly, the cloud development and deployment environment as well as technical lead in many NetBeans projects. Chris is an author of the NetBeans Field Guide and Assemble the Social Web With Zembly.
Thomas Ng Software Engineer, AOL
Thomas Ng is a software engineer at AOL, building web services for the go90 mobile entertainment platform using Cassandra, Scala and Kafka.
PlayStation and Cassandra Streams (Alexander Filipchik & Dustin Pham, Sony) |... - DataStax
After reading the topic you are probably asking yourself: “Why have I never heard of Cassandra Streams?”. The reason is that Cassandra didn’t have any streams support. Until now.
The project started as preparation for a multi-regional deployment, so we needed to test Cassandra and answer several simple questions:
· what will be the replication lag, or how long will it take for each mutation to propagate?
· will we lose any mutations?
· will we see any additional load and/or other problems?
· and zillions of other questions
To answer them, we decided to integrate Cassandra with Amazon Kinesis, so we can track any individual mutation and analyze replication stats. That is how our Cassandra Streams integration was born. Currently it supports several stream platforms, like Kinesis and Kafka.
During this talk you will learn how we did it, how we used it to test Cassandra in a multi-regional setup, and what other applications of this concept are possible.
About the Speaker
Dustin Pham, Sony
Dustin is part of the small team who built the core infrastructure which delivered the PlayStation 3 store and then subsequently core services for the PlayStation 4. He has been with Sony for over 4 years and continues to focus on providing entertainment experiences to Sony customers. Dustin is an avid gamer and finds enjoyment in solving large scale problems.
Cassandra Day Atlanta 2015: Diagnosing Problems in Production - DataStax Academy
This session covers diagnosing and solving common problems encountered in production, using performance profiling tools. We’ll also give a crash course in basic JVM garbage collection tuning. Attendees will leave with a better understanding of what they should look for when they encounter problems with their in-production Cassandra cluster. This talk is intended for people with a general understanding of Cassandra, but experience running it in production is not required.
Webinar: Getting Started with Apache Cassandra - DataStax
Would you like to learn how to use Cassandra but don’t know where to begin? Want to get your feet wet but you’re lost in the desert? Longing for a cluster when you don’t even know how to set up a node? Then look no further! Rebecca Mills, Junior Evangelist at DataStax, will guide you in the webinar “Getting Started with Apache Cassandra...”
You'll get an overview of Planet Cassandra’s resources to get you started quickly and easily. Rebecca will take you down the path that's right for you, whether you are a developer or administrator. Join if you are interested in getting Cassandra up and working in the way that suits you best.
Cassandra Summit 2014: Diagnosing Problems in Production - DataStax Academy
This session covers diagnosing and solving common problems encountered in production, using performance profiling tools. We’ll also give a crash course in basic JVM garbage collection tuning. Attendees will leave with a better understanding of what they should look for when they encounter problems with their in-production Cassandra cluster.
How can you successfully migrate to hosted private cloud 2020 - OVHcloud
OVHcloud teams are pleased to offer you this webinar dedicated to migrating to HPC2020:
• What is HPC2020, and what are its features?
• What are the migration paths and steps?
• What resources are made available to you?
• Q&A
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops - DataStax Academy
The internal battle has been fought, and Cassandra is your group's NoSQL platform of choice! Hooray! But now what? This talk will introduce you to all the basic operations concepts you need to know to start your foray into the wonderful world of Cassandra off right. Or even if you have already started but are looking for a solid holistic overview... this is the talk for you!
Unless you have a problem which scales to many independent tasks easily, e.g. web services, you may find that the best way to improve throughput is by reducing latency. This talk starts with Little's Law and its consequences for high performance computing.
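As a brief worked illustration of why latency bounds throughput, Little's Law states that the average number of items in a system equals the arrival rate multiplied by the average time each item spends in it (L = λW). The Python sketch below rearranges that to estimate throughput for a fixed number of in-flight requests; the function name and the example numbers are hypothetical and are not taken from the talk.

```python
# Little's Law: L = lambda * W, so lambda = L / W.
# Assumption: a fixed number of requests is in flight at any time (e.g. one per thread).


def throughput_per_second(in_flight: int, latency_ms: float) -> float:
    """Estimate sustained throughput from concurrency and average latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return in_flight * 1000 / latency_ms


if __name__ == "__main__":
    # With 8 requests in flight, halving latency doubles throughput.
    for latency_ms in (10, 5):
        rate = throughput_per_second(8, latency_ms)
        print(f"latency={latency_ms} ms -> {rate:.0f} requests/s")
```

When the work cannot be spread across more independent tasks, the in-flight count is effectively fixed, so lower latency is the only remaining lever for raising throughput.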
Responding rapidly when you have 100+ GB data sets in Java - Peter Lawrey
One way to speed up your application is to bring more of your data into memory. But how do you handle hundreds of GB of data in a JVM, and what tools can help you?
Mentions: Speedment, Azul, Terracotta, Hazelcast and Chronicle.
Co-Founder and CTO of Instaclustr, Ben Bromhead's presentation at the Cassandra Summit 2016, in San Jose.
This presentation will show how to create truly elastic Cassandra deployments on AWS, allowing you to scale and shrink your large Cassandra deployments multiple times a day. Leveraging a combination of EBS-backed disks, JBOD, token pinning and our previous work on bootstrapping from backups, you will be able to dramatically reduce costs per cluster by scaling to match your daily workloads.
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter... - DataStax Academy
iland has built a global data warehouse across multiple data centers, collecting and aggregating data from core cloud services including compute, storage and network as well as chargeback and compliance. iland's warehouse brings actionable intelligence that customers can use to manipulate resources, analyze trends, define alerts and share information.
In this session, we would like to present the lessons learned around Cassandra, both at the development and operations level, but also the technology and architecture we put in action on top of Cassandra such as Redis, syslog-ng, RabbitMQ, Java EE, etc.
Finally, we would like to share insights on how we are currently extending our platform with Spark and Kafka and what our motivations are.
Presenter: Adam Zeglin, CTO of Instaclustr
In this presentation we discuss a method of provisioning and running an Apache Cassandra deployment split between multiple heterogeneous data centers which, rather than allocating per-node public IPv4 addresses or configuring mesh VPNs, uses Port Address Translation (PAT) for node↔internet connectivity and is self-configuring and discoverable via DNS Service Discovery (DNS-SD or wide-area Bonjour). While Cassandra has built-in support for AWS EC2 multi-region/data center topologies (via Ec2MultiRegionSnitch, etc.), the existing solution requires the wasteful allocation of public IPv4 addresses per node. Additionally, there is little support for topologies that are either a mix of, or deployed completely on, alternative infrastructure providers. Our solution uses a single public IP address per data center, is provider-agnostic, doesn’t introduce the configuration and management overheads of a mesh VPN between data centers, and allows nodes to automatically discover each other.
Apache Cassandra operations have a reputation for being simple on single-datacenter deployments and/or low-volume clusters, but they become far more complex on high-latency multi-datacenter clusters with high volume and/or high throughput: basic Apache Cassandra operations such as repairs, compactions or hints delivery can have dramatic consequences even on a healthy high-latency multi-datacenter cluster.
In this presentation, Julien will first go through Apache Cassandra multi-datacenter concepts, then show multi-datacenter operations essentials in detail: bootstrapping new nodes and/or datacenters, repair strategy, Java GC tuning, OS tuning, Apache Cassandra configuration and monitoring.
Based on his 3 years of experience managing a multi-datacenter cluster on Apache Cassandra 2.0, 2.1, 2.2 and 3.0, Julien will give you tips on how to anticipate and prevent or mitigate issues related to basic Apache Cassandra operations with a multi-datacenter cluster.
About the Speaker
Julien Anguenot VP Software Engineering, iland Internet Solutions, Corp
Julien currently serves as iland's Vice President of Software Engineering. Prior to joining iland, Mr. Anguenot held tech leadership positions at several open source content management vendors and tech startups in Europe and in the U.S. Julien is a long-time open source software advocate, contributor and speaker. He is a Zope, ZODB and Nuxeo contributor and a member of the Zope and OpenStack foundations. His talks include ApacheCon, Cassandra Summit, OpenStack Summit, The WWW Conference and EuroPython.
Apache Cassandra operations have a reputation for being quite simple on single-datacenter clusters and/or low-volume clusters, but they become far more complex on high-latency multi-datacenter clusters: basic operations such as repair, compaction or hints delivery can have dramatic consequences even on a healthy cluster.
In this presentation, Julien will go through Cassandra operations in detail: bootstrapping new nodes and/or datacenters, repair strategies, compaction strategies, GC tuning, OS tuning, removal of large batches of data and Apache Cassandra upgrade strategy.
Julien will give you tips and techniques on how to anticipate issues inherent to multi-datacenter clusters: how and what to monitor, hardware and network considerations, as well as data model and application-level bad design and anti-patterns that can affect your multi-datacenter cluster performance.
GumGum relies heavily on Cassandra for storing different kinds of metadata. Currently GumGum reaches 1 billion unique visitors per month using 3 Cassandra datacenters in Amazon Web Services spread across the globe.
This presentation will detail how we scaled out from one local Cassandra datacenter to a multi-datacenter Cassandra cluster and all the problems we encountered and choices we made while implementing it.
How did we architect multi-region Cassandra in AWS? What were our experiences in implementing multi-datacenter Cassandra? How did we achieve low latency with multi-region Cassandra and the Datastax Driver? What are the different Cassandra use cases at GumGum? How did we integrate our Cassandra with Spark?
Strategies for multi-data center deployment. Diving into the details of deploying MongoDB across multiple data centers.
Covers the advantages of a multi data center deployment for read/write locality, the various deployment strategies, and disaster preparedness and recovery.
In addition, we’ll look at the MongoDB roadmap and planned enhancements around data center awareness.
This presentation was given at MongoNYC 2012. The animations didn’t survive the transformation to the web, so not all the meaning carries over perfectly.
Ficstar Software: Cassandra Installation to Optimization - DataStax Academy
A general rule-of-thumb talk aimed at late bloomers, managers, directors and architects who have yet to adopt Cassandra.
Covers:
- what not to do.
- operational setup
- data modeling
- performance tuning
- capacity planning
- advanced use cases
At Target, we serve millions of transactions through our APIs each month. These are backed by Cassandra. During peak season, we see a 10x traffic increase, which presents some interesting scaling issues. This is our performance tuning journey for Cassandra, both in our own datacenters and in the cloud.
DataStax: Extreme Cassandra Optimization: The Sequel - DataStax Academy
Al has been using Cassandra since version 0.6 and has spent the last few months doing little else but tuning Cassandra clusters. In this talk, Al will show how to tune Cassandra for efficient operation using multiple views into system metrics, including OS stats, GC logs, JMX, and cassandra-stress.
Azure + DataStax Enterprise Powers Office 365 Per User Store - DataStax Academy
We will present our O365 use case scenarios, why we chose Cassandra + Spark, and walk through the architecture we chose for running DataStax Enterprise on Azure.
Instaclustr has a diverse customer base including Ad Tech, IoT and messaging applications ranging from small start ups to large enterprises. In this presentation we share our experiences, common issues, diagnosis methods, and some tips and tricks for managing your Cassandra cluster.
About the Speaker
Brooke Jensen VP Technical Operations & Customer Services, Instaclustr
Instaclustr is the only provider of fully managed Cassandra as a Service in the world. Brooke Jensen manages our team of Engineers that maintain the operational performance of our diverse fleet of clusters, as well as providing 24/7 advice and support to our customers. Brooke has over 10 years' experience as a Software Engineer, specializing in performance optimization of large systems and has extensive experience managing and resolving major system incidents.
This presentation will show how to create truly elastic Cassandra deployments on AWS, allowing you to scale and shrink your large Cassandra deployments multiple times a day.
Leveraging a combination of EBS backed disks, JBOD, token pinning and our previous work on bootstrapping from backups you will be able to dramatically reduce costs per cluster by scaling to match your daily workloads.
Warning: This presentation will probably contain some references to late-2000s pop group LMFAO.
About the Speaker
Ben Bromhead CTO, Instaclustr
Ben Bromhead is the CTO of Instaclustr, where he is responsible for working closely with his engineering team and customers to build highly available, scalable applications on top of Cassandra. Instaclustr is the only multi-cloud, self-service Cassandra as a Service provider in the world and is dedicated to providing world-class support.
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store - DataStax Academy
We will present our Office 365 use case scenarios, why we chose Cassandra + Spark, and walk through the architecture we chose for running DSE on Azure.
The presentation will feature demos on how you too can build similar applications.
Choosing the right parallel compute architecture - corehard_by
Multi-core architecture is how the market is addressing the limitations of Moore’s law, now and in the future. Development is heading toward multi-core workstations, high-performance computers, GPUs, and hybrid/public cloud technologies for offloading and scaling applications. Leveraging multiple cores to increase application performance and responsiveness is expected, especially for classic high-throughput workloads such as rendering, simulations, and heavy calculations. Choosing the correct multi-core strategy for your software requirements is essential; making the wrong decision can have serious implications for software performance, scalability, memory usage, and other factors. In this overview, we will inspect various considerations for choosing the correct multi-core strategy for your application’s requirements and investigate the pros and cons of multi-threaded development vs. multi-process development. For example, Boost’s GIL (Generic Image Library) lets you code image processing algorithms efficiently. However, deciding whether your algorithms should be executed as multi-threaded or multi-process has a high impact on your design, coding, future maintenance, scalability, performance, and other factors.
A partial list of considerations to take into account before taking this architectural decision includes:
- How big are the images I need to process?
- What risks can I have in terms of race-conditions, timing issues, sharing violations – does it justify multi-threading programming?
- Do I have any special communication and synchronization requirements?
- How much time would it take my customers to execute a large scenario?
- Would I like to scale processing performance by using the cloud or cluster?
We will then examine these issues in real-world environments: we will look at common development and testing environments we use in our daily work and compare the multi-core strategies they have implemented to promote higher development productivity.
Performance Optimization of Cloud Based Applications by Peter Smith, ACL - TriNimbus
Peter Smith, PhD, Principal Software Engineer at ACL, talks about Performance Optimization of Cloud Based Applications at TriNimbus' 2017 Canadian Executive Cloud & DevOps summit in Vancouver.
TidalScale has created a software defined computer.
At TidalScale, we have created a simple cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers through a single operating system instance as if it were a single supercomputer. This dramatically simplifies development, while reducing software scaling complexity not to mention a dramatic cost saving in hardware and software.
We configure hosted hardware into one or more TidalPods. Each TidalPod is a virtual supercomputer comprising a set of commodity servers configured with the TidalScale HyperKernel. What the user sees is standard Linux, FreeBSD or Windows running with the sum of all memory, processors, networks, and I/O. The secret sauce is the HyperKernel that fools the guest OS into thinking it’s running directly on a huge, expensive machine when in fact it’s running on a set of smaller, less expensive servers.
We offer an incredibly simple user experience.
• Define the computer size you want (number of CPUs, amount of memory), boot the virtual machine, then log in to the computer…
Thus, we enable a simple cost-effective way for a data scientist, an analyst, an engineer, a scientist, a database administrator, or a software developer to access a group of servers in a Datacenter through a single operating system instance as if it were a single supercomputer. This dramatically simplifies development, while reducing software scaling complexity not to mention a dramatic cost saving in hardware and software.
How jKool Analyzes Streaming Data in Real Time with DataStax - DataStax
In this webinar, Charles Rich, VP of Product Management at jKool will share their journey with DataStax; how jKool knew from the start that traditional relational databases wouldn’t work for the scalability and availability demands of time-series data, and why they turned to DataStax Enterprise for blazing performance and powerful enterprise search and analytics capabilities.
How jKool Analyzes Streaming Data in Real Time with DataStax - jKool
jKool provides an application analytics SaaS for DevOps. These slides illustrate some of the choices we had to make and the architectural decisions to build a system for both real-time and historical application analytics.
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft - DataStax Academy
Companies today are innovating with real-time data to deliver truly amazing customer experiences in the moment. Real-time data management for real-time customer experience is core to staying ahead of competition and driving revenue growth. Join Trays to learn how Comcast is differentiating itself from its own historical reputation with Customer Experience strategies.
Introduction to DataStax Enterprise Graph Database - DataStax Academy
DataStax Enterprise (DSE) Graph is built to manage, analyze, and search highly connected data. DSE Graph, built on NoSQL Apache Cassandra, delivers continuous uptime along with predictable performance and scales for modern systems dealing with complex and constantly changing data.
Download DataStax Enterprise: Academy.DataStax.com/Download
Start free training for DataStax Enterprise Graph: Academy.DataStax.com/courses/ds332-datastax-enterprise-graph
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra - DataStax Academy
DataStax Enterprise Advanced Replication supports one-way distributed data replication from remote database clusters that might experience periods of network or internet downtime, benefiting use cases that require a 'hub and spoke' architecture.
Learn more at http://www.datastax.com/2016/07/stay-100-connected-with-dse-advanced-replication
Advanced Replication docs – https://docs.datastax.com/en/latest-dse/datastax_enterprise/advRep/advRepTOC.html
Data Modeling is one of the first things to sink your teeth into when trying out a new database. That's why we are going to cover this foundational topic in enough detail for you to get dangerous. Data Modeling for relational databases is more than a touch different from the way it's approached with Cassandra. We will address the quintessential query-driven methodology through a couple of different use cases, including working with time series data for IoT. We will also demo a new tool to get you bootstrapped quickly with MovieLens sample data. This talk should give you the basics you need to get serious with Apache Cassandra.
Hear about how Coursera uses Cassandra as the core of its scalable online education platform. I'll discuss the strengths of Cassandra that we leverage, as well as some limitations that you might run into in practice.
In the second part of this talk, we'll dive into how best to use the DataStax Java drivers. We'll dig into how the driver is architected, and use this understanding to develop best practices to follow. I'll also share a couple of interesting bugs we've run into at Coursera.
Cassandra @ Sony: The good, the bad, and the ugly part 1 - DataStax Academy
This talk covers scaling Cassandra to a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
Cassandra @ Sony: The good, the bad, and the ugly part 2 - DataStax Academy
This talk covers scaling Cassandra to a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
This is a two part talk in which we'll go over the architecture that enables Apache Cassandra’s linear scalability as well as how DataStax Drivers are able to take full advantage of it to provide developers with nicely designed and speedy clients extendable to the core.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Expect practical tips and strategies for successful relationship building that leads to closing the deal.
The Metaverse and AI: how can decision-makers harness the Metaverse for their... - Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
The New Frontiers of AI in RPA with UiPath Autopilot™ - UiPathCommunity
In this free online event, organized by the UiPath Italian Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of using Autopilot in different tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
This talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Enhancing Performance with Globus and the Science DMZ - Globus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Elevating Tactical DDD Patterns Through Object Calisthenics - Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses.
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... - SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
2. An Introduction to CrowdStrike
We Are CyberSecurity Technology Company
We Detect, Prevent And Respond To All Attack Types In Real Time, Protecting Organizations From Catastrophic Breaches
We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre & Post IR Services
NEXT-GEN ENDPOINT
INCIDENT RESPONSE
THREAT INTEL