The document discusses performance improvements in Apache Cassandra 3.0's storage engine. Key improvements include delta encoding, variable integer encoding, clustering columns written only once, aggregated cell metadata, and cell presence bitmaps. This reduces storage size and improves read performance. The write path involves committing to the commit log and merging into the memtable. The read path can use clustering index filters to short-circuit searching based on deletion times, column names, or clustering ranges to avoid reading unnecessary SSTables.
Cassandra sf 2015 - Steady State Data Size With Compaction, Tombstones, and TTL aaronmorton
Slides from my talk at Cassandra Summit 2015
http://cassandrasummit-datastax.com/agenda/steady-state-data-size-with-compaction-tombstones-and-ttl/
thelastpickle.com
Cassandra Community Webinar: Back to Basics with CQL3DataStax
Cassandra is a distributed, massively scalable, fault tolerant, columnar data store, and if you need the ability to make fast writes, the only thing faster than Cassandra is /dev/null! In this fast-paced presentation, we'll briefly describe big data, and the area of big data that Cassandra is designed to fill. We will cover Cassandra's unique, every-node-the-same architecture. We will reveal Cassandra's internal data structure and explain just why Cassandra is so darned fast. Finally, we'll wrap up with a discussion of data modeling using the new standard protocol: CQL (Cassandra Query Language).
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
We built an application based on the principles of CQRS and Event Sourcing using Cassandra and Spark. During the project we encountered a number of challenges and problems with Cassandra and the Spark Connector.
In this talk we want to outline a few of those problems and our actions to solve them. While some problems are specific to CQRS and Event Sourcing applications most of them are use case independent.
About the Speakers
Matthias Niehoff IT-Consultant, codecentric AG
works as an IT-Consultant at codecentric AG in Germany. His focus is on big data & streaming applications with Apache Cassandra & Apache Spark. Yet he does not lose track of other tools in the area of big data. Matthias shares his experiences on conferences, meetups and usergroups.
Stephan Kepser Senior IT Consultant and Data Architect, codecentric AG
Dr. Stephan Kepser is an expert on cloud computing and big data. He wrote a couple of journal articles and blog posts on subjects of both fields. His interests reach from legal questions to questions of architecture and design of cloud computing and big data systems to technical details of NoSQL databases.
Cassandra sf 2015 - Steady State Data Size With Compaction, Tombstones, and TTL aaronmorton
Slides from my talk at Cassandra Summit 2015
http://cassandrasummit-datastax.com/agenda/steady-state-data-size-with-compaction-tombstones-and-ttl/
thelastpickle.com
Cassandra Community Webinar: Back to Basics with CQL3DataStax
Cassandra is a distributed, massively scalable, fault tolerant, columnar data store, and if you need the ability to make fast writes, the only thing faster than Cassandra is /dev/null! In this fast-paced presentation, we'll briefly describe big data, and the area of big data that Cassandra is designed to fill. We will cover Cassandra's unique, every-node-the-same architecture. We will reveal Cassandra's internal data structure and explain just why Cassandra is so darned fast. Finally, we'll wrap up with a discussion of data modeling using the new standard protocol: CQL (Cassandra Query Language).
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
We built an application based on the principles of CQRS and Event Sourcing using Cassandra and Spark. During the project we encountered a number of challenges and problems with Cassandra and the Spark Connector.
In this talk we want to outline a few of those problems and our actions to solve them. While some problems are specific to CQRS and Event Sourcing applications most of them are use case independent.
About the Speakers
Matthias Niehoff IT-Consultant, codecentric AG
works as an IT-Consultant at codecentric AG in Germany. His focus is on big data & streaming applications with Apache Cassandra & Apache Spark. Yet he does not lose track of other tools in the area of big data. Matthias shares his experiences on conferences, meetups and usergroups.
Stephan Kepser Senior IT Consultant and Data Architect, codecentric AG
Dr. Stephan Kepser is an expert on cloud computing and big data. He wrote a couple of journal articles and blog posts on subjects of both fields. His interests reach from legal questions to questions of architecture and design of cloud computing and big data systems to technical details of NoSQL databases.
This talk presents a number of conceptual and technical challenges that we discovered while building JRebel. At first, the JVM wasn't designed for live updates, so we will talk about the engine that mitigates the problem. Secondly, the diversity of Java ecosystem, created by the variety of application servers, the frameworks and tools, makes it challenging in creating a generic solution that would fit the majority of developers. We will see, how Java platform itself allows us to develop a solution by applying bytecode instrumentation mechanism.
JRebel does live code reloading to ensure that the developer can keep instantly alternating between the developing environment and the web browser, to save wasted time and increase the productivity flow.
Is your profiler speaking the same language as you? -- Docklands JUGSimon Maple
Profilers are absolute beasts. And profilers might prove useful to pinpoint the performance issues in your Java applications.
By using profilers, developers are fortunate to find the root cause of an issue at hand. However, it requires effort to actually comprehend the data collected by the profiler. Due to the inherent complexity of the data, one has to understand how this data is collected. And thus understand how the profiler actually works.
During this talk we will go through the classic profiler features. What is a hotspot? What is the difference between sampling and instrumentation from the profiler point of view? What are the problems with either of those methods? What is the time budget of the application? And more!
I will also showcase a new kid on the block among the profiling tools: XRebel. This tool provides insight into application behaviour and permits the developers to discover application level issues.
Scylla Summit 2022: Making Schema Changes Safe with RaftScyllaDB
ScyllaDB adopted Raft as a consensus protocol in order to dramatically improve our operational aspects as well as provide strong consistency to the end-user. This talk will explain how Raft behaves in Scylla Open Source 5.0 and introduce the first end-user visible major improvement: schema changes. Learn how cluster configuration resides in Raft, providing consistent cluster assembly and configuration management. This makes bootstrapping safer and provides reliable disaster recovery when you lose the majority of the cluster.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...DataStax
Successfully running Apache Cassandra in production often means knowing what configuration settings to change and which ones to leave as default. Over the years the cassandra.yaml file has grown to provide a number of settings that can improve stability and performance. While the file contains plenty of helpful comments, there is more to be said about the settings and when to change them.
In this talk Edward Capriolo, Consultant at The Last Pickle, will break down the parameters in the configuration files. Looking at those that are essential to getting started, those that impact performance, those that improve availability, the exotic ones, and the ones that should not be played with. This talk is ideal for someone someone setting up Cassandra for the first time up to people with deployments in productions and wondering what the more exotic configuration options do.
About the Speaker
Edward Capriolo Consultant, The Last Pickle
Long time Apache Cassandra user, big data enthusiast.
A brief, but action-packed introduction to DataStax Enterprise Search. In this deck, we'll get an overview of DSE Search's value proposition, see some example CQL search queries, and dive into the details of the indexing and query paths.
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...Continuent
Join us for this training session where we discuss tools for backing up MySQL databases within Tungsten Clustering, and how Tungsten Clustering makes it easy to reprovision databases and recover a cluster. This training is for all engineers using Tungsten Clustering and will explain the tools needed to create a sound backup and recovery strategy, and is also valuable for those who wish to review their current MySQL backup and MySQL recovery procedures.
AGENDA
- Methods and tools for taking a MySQL backup from within a Tungsten cluster
- mysqldump
- xtrackbackup
- Snapshots (lvm or other)
- File copy
- tungsten_provision_slave
- Restoring a MySQL backup from within a Tungsten cluster
- Verifying the backup contains the last binary log position, and the importance of having this
- Restoring backups into a cluster
- Provisioning slaves from an existing datasource
A lot of data is best represented as time series: Operational data, financial data, and even in data warehouses the dominant dimension is often time. We present Chronix, a time series database based on Apache Solr and Spark which is able to handle trillions of time series data points and perform interactive queries. Chronix Spark is open source software and battle-proven at a German car manufacturer and an international telco.
We demonstrate several use cases of Chronix from real-life. Afterwards we lift the curtain and deep-dive into the Chronix architecture esp. how we're using Solr to store time series data and how we've hooked up Solr with Spark. We provide some benchmarks showing how Chronix has outperformed other time series databases in both performance and storage-efficiency.
Chronix is open source under the Apache License (http://chronix.io).
Effective testing for spark programs Strata NY 2015Holden Karau
This session explores best practices of creating both unit and integration tests for Spark programs as well as acceptance tests for the data produced by our Spark jobs. We will explore the difficulties with testing streaming programs, options for setting up integration testing with Spark, and also examine best practices for acceptance tests.
Unit testing of Spark programs is deceptively simple. The talk will look at how unit testing of Spark itself is accomplished, as well as factor out a number of best practices into traits we can use. This includes dealing with local mode cluster creation and teardown during test suites, factoring our functions to increase testability, mock data for RDDs, and mock data for Spark SQL.
Testing Spark Streaming programs has a number of interesting problems. These include handling of starting and stopping the Streaming context, and providing mock data and collecting results. As with the unit testing of Spark programs, we will factor out the common components of the tests that are useful into a trait that people can use.
While acceptance tests are not always part of testing, they share a number of similarities. We will look at which counters Spark programs generate that we can use for creating acceptance tests, best practices for storing historic values, and some common counters we can easily use to track the success of our job.
Relevant Spark Packages & Code:
https://github.com/holdenk/spark-testing-base / http://spark-packages.org/package/holdenk/spark-testing-base
https://github.com/holdenk/spark-validator
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
The 3.0 storage engine re-write is the biggest and most exciting change to ever happen in Apache Cassandra. The new storage engine can efficiently store and read data from disk using the same concepts present in the CQL 3 language. This has delivered large space savings, and creates new performance characteristics.
In this talk Aaron Morton, Co Founder at The Last Pickle and Apache Cassandra Committer, will discuss the 3.0 storage engine, it's layout and performance characteristics.
About the Speaker
Aaron Morton CEO, The Last Pickle
Aaron Morton is the Co Founder & CEO at The Last Pickle (thelastpickle.com). A professional services company that works with clients to deliver and improve Apache Cassandra based solutions. He's based in New Zealand, is an Apache Cassandra Committer and a DataStax MVP for Apache Cassandra.
Apache Cassandra, part 2 – data model example, machineryAndrey Lomakin
Aim of this presentation to provide enough information for enterprise architect to choose whether Cassandra will be project data store. Presentation describes each nuance of Cassandra architecture and ways to design data and work with them.
This talk presents a number of conceptual and technical challenges that we discovered while building JRebel. At first, the JVM wasn't designed for live updates, so we will talk about the engine that mitigates the problem. Secondly, the diversity of Java ecosystem, created by the variety of application servers, the frameworks and tools, makes it challenging in creating a generic solution that would fit the majority of developers. We will see, how Java platform itself allows us to develop a solution by applying bytecode instrumentation mechanism.
JRebel does live code reloading to ensure that the developer can keep instantly alternating between the developing environment and the web browser, to save wasted time and increase the productivity flow.
Is your profiler speaking the same language as you? -- Docklands JUGSimon Maple
Profilers are absolute beasts. And profilers might prove useful to pinpoint the performance issues in your Java applications.
By using profilers, developers are fortunate to find the root cause of an issue at hand. However, it requires effort to actually comprehend the data collected by the profiler. Due to the inherent complexity of the data, one has to understand how this data is collected. And thus understand how the profiler actually works.
During this talk we will go through the classic profiler features. What is a hotspot? What is the difference between sampling and instrumentation from the profiler point of view? What are the problems with either of those methods? What is the time budget of the application? And more!
I will also showcase a new kid on the block among the profiling tools: XRebel. This tool provides insight into application behaviour and permits the developers to discover application level issues.
Scylla Summit 2022: Making Schema Changes Safe with RaftScyllaDB
ScyllaDB adopted Raft as a consensus protocol in order to dramatically improve our operational aspects as well as provide strong consistency to the end-user. This talk will explain how Raft behaves in Scylla Open Source 5.0 and introduce the first end-user visible major improvement: schema changes. Learn how cluster configuration resides in Raft, providing consistent cluster assembly and configuration management. This makes bootstrapping safer and provides reliable disaster recovery when you lose the majority of the cluster.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...DataStax
Successfully running Apache Cassandra in production often means knowing what configuration settings to change and which ones to leave as default. Over the years the cassandra.yaml file has grown to provide a number of settings that can improve stability and performance. While the file contains plenty of helpful comments, there is more to be said about the settings and when to change them.
In this talk Edward Capriolo, Consultant at The Last Pickle, will break down the parameters in the configuration files. Looking at those that are essential to getting started, those that impact performance, those that improve availability, the exotic ones, and the ones that should not be played with. This talk is ideal for someone someone setting up Cassandra for the first time up to people with deployments in productions and wondering what the more exotic configuration options do.
About the Speaker
Edward Capriolo Consultant, The Last Pickle
Long time Apache Cassandra user, big data enthusiast.
A brief, but action-packed introduction to DataStax Enterprise Search. In this deck, we'll get an overview of DSE Search's value proposition, see some example CQL search queries, and dive into the details of the indexing and query paths.
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...Continuent
Join us for this training session where we discuss tools for backing up MySQL databases within Tungsten Clustering, and how Tungsten Clustering makes it easy to reprovision databases and recover a cluster. This training is for all engineers using Tungsten Clustering and will explain the tools needed to create a sound backup and recovery strategy, and is also valuable for those who wish to review their current MySQL backup and MySQL recovery procedures.
AGENDA
- Methods and tools for taking a MySQL backup from within a Tungsten cluster
- mysqldump
- xtrackbackup
- Snapshots (lvm or other)
- File copy
- tungsten_provision_slave
- Restoring a MySQL backup from within a Tungsten cluster
- Verifying the backup contains the last binary log position, and the importance of having this
- Restoring backups into a cluster
- Provisioning slaves from an existing datasource
A lot of data is best represented as time series: Operational data, financial data, and even in data warehouses the dominant dimension is often time. We present Chronix, a time series database based on Apache Solr and Spark which is able to handle trillions of time series data points and perform interactive queries. Chronix Spark is open source software and battle-proven at a German car manufacturer and an international telco.
We demonstrate several use cases of Chronix from real-life. Afterwards we lift the curtain and deep-dive into the Chronix architecture esp. how we're using Solr to store time series data and how we've hooked up Solr with Spark. We provide some benchmarks showing how Chronix has outperformed other time series databases in both performance and storage-efficiency.
Chronix is open source under the Apache License (http://chronix.io).
Effective testing for spark programs Strata NY 2015Holden Karau
This session explores best practices of creating both unit and integration tests for Spark programs as well as acceptance tests for the data produced by our Spark jobs. We will explore the difficulties with testing streaming programs, options for setting up integration testing with Spark, and also examine best practices for acceptance tests.
Unit testing of Spark programs is deceptively simple. The talk will look at how unit testing of Spark itself is accomplished, as well as factor out a number of best practices into traits we can use. This includes dealing with local mode cluster creation and teardown during test suites, factoring our functions to increase testability, mock data for RDDs, and mock data for Spark SQL.
Testing Spark Streaming programs has a number of interesting problems. These include handling of starting and stopping the Streaming context, and providing mock data and collecting results. As with the unit testing of Spark programs, we will factor out the common components of the tests that are useful into a trait that people can use.
While acceptance tests are not always part of testing, they share a number of similarities. We will look at which counters Spark programs generate that we can use for creating acceptance tests, best practices for storing historic values, and some common counters we can easily use to track the success of our job.
Relevant Spark Packages & Code:
https://github.com/holdenk/spark-testing-base / http://spark-packages.org/package/holdenk/spark-testing-base
https://github.com/holdenk/spark-validator
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
The 3.0 storage engine re-write is the biggest and most exciting change to ever happen in Apache Cassandra. The new storage engine can efficiently store and read data from disk using the same concepts present in the CQL 3 language. This has delivered large space savings, and creates new performance characteristics.
In this talk Aaron Morton, Co Founder at The Last Pickle and Apache Cassandra Committer, will discuss the 3.0 storage engine, it's layout and performance characteristics.
About the Speaker
Aaron Morton CEO, The Last Pickle
Aaron Morton is the Co Founder & CEO at The Last Pickle (thelastpickle.com). A professional services company that works with clients to deliver and improve Apache Cassandra based solutions. He's based in New Zealand, is an Apache Cassandra Committer and a DataStax MVP for Apache Cassandra.
Apache Cassandra, part 2 – data model example, machineryAndrey Lomakin
Aim of this presentation to provide enough information for enterprise architect to choose whether Cassandra will be project data store. Presentation describes each nuance of Cassandra architecture and ways to design data and work with them.
Introduction to apache_cassandra_for_developezznate
A presentation for Data Day Austin on January 29th, 2011
Introduces how to effectively use Apache Cassandra for Java developers using the Hector Java client: http://github.com/rantav/hector
Cassandra Community Webinar | Introduction to Apache Cassandra 1.2DataStax
Title: Introduction to Apache Cassandra 1.2
Details: Join Aaron Morton, DataStax MVP for Apache Cassandra and learn the basics of the massively scalable NoSQL database. This webinar is will examine C*’s architecture and its strengths for powering mission-critical applications. Aaron will introduce you to core concepts such as Cassandra’s data model, multi-datacenter replication, and tunable consistency. He’ll also cover new features in Cassandra version 1.2 including virtual nodes, CQL 3 language and query tracing.
Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010, he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.
NET Systems Programming Learned the Hard Way.pptxpetabridge
What is a thread quantum and why is it different on Windows Desktop and Windows Server? What's the difference between a blocking call and a blocking flow? Why did our remoting benchmarks suddenly drop when we moved to .NET 6? When should I try to write lock-free code? What does the `volatile` keyword mean?
Welcome to the types of questions my team and I are asked, or ask ourselves, on a regular basis - we're the makers of Akka.NET, a high performance distributed actor system library and these are the sorts of low-level questions we need to answer in order to build great experiences for our own users.
In this talk we're going to learn about .NET systems programming, the low level components we hope we can take for granted, but sometimes can't. In particular:
- The `ThreadPool` and how work queues operate in practice;
- Synchronization mechanisms - including `lock`-less ones;
- Memory management, `Span<T>`, and garbage collection;
- `await`, `Task`, and the synchronization contexts; and
- Crossing user-code and system boundaries in areas such as sockets.
This talk will help .NET developers understand why their code works the way it does and what to do in scenarios that demand high performance.
SignalFx engineer Rajiv Kurian's presentation on why we wrote our own Kafka consumer, the performance goals, and the performance gains achieved.
Download the slides to see animations showing hardware details. These slides were converged from Keynote to Powerpoint, so there may be some oddness with slide transitions!
Groovy's AST Transformation is a compile time meta-programming technique and allows developer to hook into compilation process and add new fields or methods, or modify existing methods.
Stata cheat sheet: programming. Co-authored with Tim Essam (linkedin.com/in/timessam). See all cheat sheets at http://bit.ly/statacheatsheets. Updated 2016/06/04
Writing a TSDB from scratch_ performance optimizations.pdfRomanKhavronenko
I'm one of maintainers of the open source TimeSeries database VictoriaMetrics, written in Go. It is used for APM or Kubernetes monitoring. The average VictoriaMetrics installation is processing 2-4 million samples/s on the ingestion path, 20-40 million samples/s on the read path. The biggest installations have more than 100 million samples/s on the ingestion path for a single cluster. This requires being clever with data processing to keep it efficient and scalable. In the talk, I'll cover the following optimizations for keeping the database fast:
1. String interning for lowering GC pressure. We use string interning for storing time series metadata (aka labels). However, this approach has downside of increased memory usage. When it is worth it to use string interning?
2. Metadata processing may require many regular expression matching and strings modification operations. Caching results of such operations helps to save CPU. But the downside could be increased memory usage. Which operations should be cached and which are not?
3. Limiting the number of concurrently running goroutines with CPU-bound load by the number of available CPU cores. This helps to control the memory usage on load spikes (which is a frequent event in monitoring). The limit also improves the processing speed of each goroutine, since it reduces the number of context switches. The downside of the approach is its complexity - it is easy to make a mistake and end up with a deadlock or inefficient resource utilization.
4. The better understanding of `sync.Pool`. For us, `sync.Pool` shows itself the best when used in CPU-bound code, while in IO-bound code it leads to excessive memory usage. The CPU-bound code has short ownership over the objects retrieved from the pool. In combination with p.3 (limited number of goroutines processing CPU-bound code) it gives the most efficient processing speed and memory usage since the chance to get a "hot" object from the pool is much higher.
Similar to Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X (20)
Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandraaaronmorton
Slides from my talk at Cassandra Summit 2015
http://cassandrasummit-datastax.com/agenda/repeatable-scalable-reliable-observable-cassandra/
thelastpickle.com
My talk from http://wdcnz.com 2012.
I took a brief look at Cassandra and then stepped through building a twitter clone. Very rough code is at https://github.com/amorton/wdcnz-2012-site
Building a distributed Key-Value store with Cassandraaaronmorton
Slides from my talk at Kiwi Pycon in 2010.
Covers why we chose Cassandra, overview of it's feature and data model, and how we implemented our application.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
1. SF CASSANDRA USERS MARCH 2016
CQL PERFORMANCE WITH APACHE
CASSANDRA 3.0
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
2. AboutThe Last Pickle.
Work with clients to deliver and improve Apache Cassandra
based solutions.
Apache Cassandra Committer and DataStax MVPs.
Based in New Zealand,Australia, France & USA.
3. How We Got Here
Storage Engine 3.0
Write Path
Read Path
57. ClusteringIndexNamesFilter
1. Get Partition From Memtables.
2. Filter named columns into a temporary
result.
3. Select SSTables that may contain Partition
Key.
4. Order in descending timestamp order.
5. Read from SSTables in order.
58. Names Filter Short Circuits
If result has a Partition Deletion
newer than next SSTable max
timestamp.
Stop Search.
59. Names Filter Short Circuits
If read all Columns and max
timestamp of next SSTable less than
selected Columns min timestamp.
Stop Search.
60. Names Filter Short Circuits
Note: list of Columns
remaining to select is pruned
after every SSTable is read
based on max timestamp.
61. Names Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
62. Names Filter Short Circuits
If SSTable Cell not in search set.
Skip reading value.
65. ClusteringIndexSliceFilter
1. Get Partition From Memtables.
2. Create Iterators for Partitions.
3. Select SSTables that may contain Partition
Key.
4. Order in reverse max timestamp order.
5. Create Iterators for SSTables in order.
66. Slice Filter Short Circuits
If SSTable max timestamp is before
max seen Partition Deletion
timestamp.
Stop Search.
67. Names Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.