Many relational databases are adding NoSQL features to their products. So what happens when you can get direct access to the data as a key/value pair, or store an entire document in a column of a relational table, and more?
What Your Database Query is Really Doing | Dave Stokes
Do you ever wonder what your database server is REALLY doing with that query you just wrote? This is a high-level overview of the process of running a query.
Five Database Mistakes and how to fix them -- Confoo Vancouver | Dave Stokes
Very few developers are learning Structured Query Language (about 2%) but then wonder why their database queries stink. This presentation covers five common database problems and how to fix them.
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)... | DataStax
Cassandra is a distributed database with features including, but not limited to, Secondary Indexes, UDFs, Materialized Views, etc., and not-so-strict hardware requirements.
It is important to use those features and select hardware correctly to make sure the use of Cassandra in your business can be as painless as possible.
I will address how these features are used in the wrong way, how hardware should be selected, and how to make Cassandra work in the best possible way.
Learning Objective #1:
Learn that Cassandra hardware requirements exist (and why) and the shortcomings of some features (Secondary Indexes, Compaction Strategies, etc.).
Learning Objective #2:
The most misused features and common hardware errors, and how they might seem harmless at first (on either a small cluster or even a single node).
Learning Objective #3:
How to correctly use Cassandra and its features and go for perfect operation.
About the Speaker
Carlos Rolo Cassandra Consultant, Pythian
Carlos Rolo is a Cassandra MVP and has deep expertise with distributed architecture technologies. Carlos is driven by challenge and enjoys the opportunities to discover new things. He has become known and trusted by customers and colleagues for his ability to understand complex problems and to work well under pressure. When Carlos isn't working he can be found playing water polo or enjoying his local community.
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016 | DataStax
Large partitions shall no longer be a nightmare. That is the goal of CASSANDRA-11206.
100MB and 100,000 cells per partition is the recommended limit for a single partition in Cassandra up to 3.5. Exceeding these limits can cause a lot of trouble. Repairs and compactions could fail and reads cause out-of-memory failures.
This talk provides a deep dive into the reasons for the previous limitations, why exceeding these limitations caused trouble, how the improvements in Cassandra 3.6 help with big partitions, and why you should not blindly let your partitions get huge.
About the Speaker
Robert Stupp Solution Architect, DataStax
Robert works as a Solutions Architect at DataStax and is also a Committer to Apache Cassandra. Before joining DataStax he worked with his customers to architect and build distributed systems using Cassandra, and he has long experience in building distributed backend systems, mostly using Java as the language of choice.
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum... | DataStax
Migration from the relational world to no-SQL solutions not only impacts architectures; it also requires a change in mindset to understand and adjust to no-SQL design. We're going to talk about the challenges, both technical and personal (such as resistance to change, endorse collisions, be faster, data model re-design for no-SQL, big data analytics, etc.), that we ran into during our quest to take major Symantec and Norton products away from the land of transactions. This is how we learned to stop worrying and love the eventual consistency...
About the Speakers
Ilya Sokolov Technical Director / Architect, Symantec
I was working as a software developer and architect in many companies across the world. Being always fascinated with information security I enjoy challenges in identity, digital protection and data storage space helping millions of Symantec customers feel safe online. If I'm not working on a new Norton component or looking for software vulnerabilities at Symantec CyberWar games, you'll likely find me evangelizing new information technology architectures and designs.
Shu Zhang Director, Database Engineering, Symantec
I started my career as an Oracle PL/SQL programmer and soon became an Oracle DBA. I was an Oracle DBA for over twelve years, including the seven years when I worked at Oracle USA. About three years ago, I became fascinated with the NoSQL world. I have learned MongoDB, Elasticsearch, Couchbase and Cassandra. Now I lead a team of fourteen database engineers, responsible for choosing the most appropriate database technology and for implementing and supporting all Norton database backends.
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons | DataStax
We'll be covering some aspects of our architecture, highlighting differences between MongoDB and Cassandra. We'll go in depth to explain why Cassandra is a better choice for our general purpose Application Platform (SHIFT) as well as our Media Buying Analytics tool (the SHIFT Media Manager). We'll be going over common design patterns people might be familiar with coming from a background with MongoDB and highlight how Cassandra would be used as a better alternative. We'll also touch more on cqlengine which is nearing feature completeness as the Cassandra object mapper for Python.
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ... | DataStax
Why do data breaches occur even though we protect data at rest and in flight? What if the Cassandra admin's credentials get compromised? Why is it so hard to make encryption work for real world applications? If I encrypt my customer's data, do I have to turn it over when the authorities come calling? The answer may be to keep data encrypted... always, let the customer own the keys, and make data breaches irrelevant.
In this talk, Ameesh Divatia, Co-founder at Baffle.io, will talk about a way to encrypt individual fields in a Cassandra database while continuing to let them be available for CQL access. From deterministic to random algorithms, key management and integration into DataStax drivers, this talk will introduce attendees to the steps to follow in order to protect an existing Cassandra database with field-level granularity ensuring protection against data breaches.
About the Speaker
Ameesh Divatia President & CEO, Baffle, Inc.
Ameesh is a serial entrepreneur with over 25 years of operating experience in storage, security and networking infrastructure. He specializes in conceiving and implementing startup business plans that create new product categories by leveraging innovation in existing markets to its adjacencies. He co-founded Baffle in May 2015 to address the challenge of preventing data breaches in cloud infrastructure.
Run Cloud Native MySQL NDB Cluster in Kubernetes | Bernd Ocklin
The more your database aligns with Cloud Native principles such as resilience, scaling, auto-healing and data consistency across all nodes, the better it also runs as DBaaS in Kubernetes. I walk through running databases in Kubernetes and demo manual deployment and deployment with an NDB operator.
This talk was given at the MySQL Dev Room FOSDEM 2021.
Demystifying the Distributed Database Landscape | ScyllaDB
What is the state of the art of high performance, distributed databases as we head into 2022, and which options are best suited for your own development projects?
The data-intensive applications leading this next tech cycle are typically powered by multiple types of databases and data stores — each satisfying specific needs and often interacting with a broader data ecosystem. Even the very notion of “a database” is evolving as new hardware architectures and methodologies allow for ever-greater capabilities and expectations for horizontal and vertical scalability, performance, and reliability.
In this webinar, ScyllaDB Director of Technology Advocacy Peter Corless will survey the current landscape of distributed database systems and highlight new directions in the industry.
This talk will cover different database and database-adjacent technologies as well as describe their appropriate use cases, patterns and antipatterns with a focus on:
- Distributed SQL, NewSQL and NoSQL
- In-memory datastores and caches
- Streaming technologies with persistent data storage
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora... | Insight Technology, Inc.
Migrating Oracle based applications to MariaDB has become easier and economically advantageous with the feature set of MariaDB 10.2 and the upcoming 10.3 release. We’ll present details of the features that led DBS Bank to migrate mission critical applications to MariaDB.
Wide Column Store NoSQL vs SQL Data Modeling | ScyllaDB
NoSQL schemas are designed with very different goals in mind than SQL schemas. Where SQL normalizes data, NoSQL denormalizes. Where SQL joins ad-hoc, NoSQL pre-joins. And where SQL tries to push performance to the runtime, NoSQL bakes performance into the schema. Join us for an exploration of the core concepts of NoSQL schema design, using Scylla as an example to demonstrate the tradeoffs and rationale.
A Step by Step Introduction to the MySQL Document Store | Dave Stokes
Looking for a fast, flexible NoSQL document store? And one that runs with the power and reliability of MySQL? This is an intro on how to use the MySQL Document Store.
Third normal form? That’s so 20th century. Learn the newest techniques to make your Cassandra database sing from the rafters in performance and scalability. AND it uses concepts that you already know and apply every day. You can do this. This is the must-see half hour of your professional life! These developers found a new way to work with databases. First you will be shocked, then you will be inspired!
Primary and Clustering Keys should be one of the very first things you learn about when modeling Cassandra data. Most people coming from a relational background automatically think, "Yeah, I know what a Primary Key is", and gloss right over it. Because of this, there always seems to be a lot of confusion around the topic of Primary Keys in Cassandra. This presentation will demystify that confusion. I will cover what the different types of Keys are, how they can be used, what their purpose is, and how they affect your queries.
For this presentation, I will be using CrossFit gym locations as my subject matter. I will explain the differences between Primary Keys, Compound Keys, Clustering Keys, & Composite Keys. I will also show how the data behind each type differs as stored on disk. Lastly, I will show what queries each type of key will support.
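As a rough illustration of the distinction the abstract draws, the sketch below models a table in plain Python: the partition key selects a bucket, and the clustering key orders rows inside that bucket. This is not Cassandra code and not the speaker's material. `ToyTable`, its methods, and the gym names are hypothetical, and real on-disk storage is considerably more involved.

```python
from bisect import insort
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory stand-in for a table with a compound primary key.

    The partition key picks the bucket; rows inside a bucket stay sorted by
    the clustering key. This only mimics the query behaviour, not real storage.
    """

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        rows = self._partitions[partition_key]
        # Writing the same full primary key again overwrites the row (upsert).
        for i, (ck, _) in enumerate(rows):
            if ck == clustering_key:
                rows[i] = (clustering_key, value)
                return
        insort(rows, (clustering_key, value))

    def select(self, partition_key, start=None, end=None):
        # A query must name the partition; the clustering key may be ranged.
        rows = self._partitions.get(partition_key, [])
        return [
            (ck, v)
            for ck, v in rows
            if (start is None or ck >= start) and (end is None or ck <= end)
        ]


# Assumed layout: state is the partition key, city is the clustering key.
gyms = ToyTable()
gyms.insert("TX", "Dallas", "Gym B")
gyms.insert("TX", "Austin", "Gym A")
gyms.insert("CA", "Fresno", "Gym C")
print(gyms.select("TX"))
print(gyms.select("TX", start="B"))
```

The point of the sketch is that lookups within one partition come back ordered by the clustering key and can be range-restricted, while anything that does not name the partition would have to scan every bucket.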
About the Speaker
Adam Hutson Data Architect, DataScale
Adam is Data Architect for DataScale, Inc. He is a seasoned data professional with experience designing & developing large-scale, high-volume database systems. Adam previously spent four years as Senior Data Engineer for Expedia building a distributed Hotel Search using Cassandra 1.1 in AWS. Having worked with Cassandra since version 0.8, he was early to recognize the value Cassandra adds to Enterprise data storage. Adam is also a DataStax Certified Cassandra Developer.
Join us as we talk about the current state as well as the future of DSE Search. Nick Panahi will discuss high level architecture while Ariel will dive deep into some of the integration. We'll talk about future features, improvements and enhancements as well as some of the challenges of our custom integration and what that means for scale and availability.
About the Speakers
Nick Panahi Sr. Product Manager, DSE Search, DataStax
I am the product manager for DSE Search. Prior to product management, I was a solution architect for DataStax.
Ariel Weisberg Software Engineer, DataStax
Ariel is currently a Cassandra contributor and DataStax employee, and a former lead architect for VoltDB. Ariel aspires to be or considers himself a shared-nothing database expert depending on the time of day and whether Benedict is in the room, and has a passion for things measured in nanoseconds. Ariel has presented at events like Strangeloop, PAX Dev, OpenSQL camp Boston, NYC MySQL Meetup, and Boston New Technology Group meetup.
NoSQL and NewSQL: Tradeoffs between Scalable Performance & Consistency | ScyllaDB
This webinar compares NoSQL and NewSQL databases. We will look at the significant architectural differences between the two, tradeoffs between availability, scalable performance and consistency, data models, and share benchmark results to display the performance implications of NoSQL versus NewSQL.
Many NoSQL DBaaS vendors limit what cloud platform you can run on, the size of the data you can run and require you to over-provision cloud infrastructure resources while failing to deliver performance and low latency at scale.
In this session, we will compare the performance and Total Cost of Ownership (TCO) of competing NoSQL DBaaS offerings. We will also review how to migrate to Scylla Cloud, our fully managed database service.
You will learn:
- The true cost of ownership for selected NoSQL DBaaS offerings
- The 8 essentials for selecting a NoSQL DBaaS
- Migration options from Apache Cassandra, DynamoDB and other databases
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian... | DataStax
Cassandra is moving away from Thrift to the CQL protocol. Along with this change, Thrift-based client drivers are no longer actively supported, nor are they exposed to new Cassandra features. This leaves many existing Cassandra users needing to migrate their Thrift-based applications to CQL. A complete solution to Thrift-to-CQL migration requires changes in 3 areas: application code, data model, and existing data. In this session we will focus on how to migrate existing data effectively to reflect data model changes, and give you the necessary tools to build on what you have learned and apply it in your environments.
Learning Objectives:
1) Gain insight into what the Cassandra storage engine looks like for version 2.2 and earlier
2) Get a better idea of how the differences between Thrift and CQL affect table design
3) Explore an approach to effectively migrate dynamically generated (Thrift) data into a statically defined (CQL) table
About the Speaker
Yabin Meng Apache Cassandra / DataStax Enterprise Consultant, Pythian
Yabin is a DataStax certified Architect, Administrator, and Developer. He has been in the IT industry for more than 15 years, and much of his career has centered on database-related technologies. He has been working with Cassandra for about 2 years and is currently a Cassandra/DSE consultant at Pythian.
Development teams are facing an increased demand for building faster, smarter, more complex systems in record time. The DataStax development tools have been architected from the ground up to enable teams to deliver intelligent, large-scale distributed, high-performance applications while remaining familiar and easy to use.
This session will provide an overview of the DataStax Developer Tools, where and how they fit into the development cycle.
About the Speaker
Alex Popescu Senior Product Manager, DataStax
I'm a developer turned product manager building developer tools for Apache Cassandra and DSE. With an eye for simplicity, I focus on creating friendly developer solutions that enable building high-performance, scalable, and fault-tolerant applications. I'm passionate about open source, and over the years I have made numerous contributions to major projects like TestNG and Groovy.
How to Analyze and Tune MySQL Queries for Better Performance | oysteing
Tutorial at Oracle Open World 2015:
Performance of SQL queries plays a big role in application performance. If some queries execute slowly, these queries or the database schema may need tuning. This tutorial covers query processing, optimization methods, and how the MySQL optimizer chooses a specific plan to execute SQL. See demonstrations on how to use tools such as EXPLAIN (including the JSON-based variant), optimizer trace, and performance schema to analyze query plans. See how the Visual Explain functionality in MySQL Workbench helps you to visualize these plans. Based on the analysis, the tutorial covers how to take the next steps for performance tuning. It might mean forcing a particular index, changing the schema, or modifying configuration parameters.
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...DataStax
Why do data breaches occur even though we protect data at rest and in flight? What if the Cassandra admin's credentials get compromised? Why is it so hard to make encryption work for real world applications? If I encrypt my customer's data, do I have to turn it over when the authorities come calling? The answer may be in keeping data encrypted. . .always, let the customer own the keys and make data breaches irrelevant.
In this talk, Ameesh Divatia, Co-founder at Baffle.io, will talk about a way to encrypt individual fields in a Cassandra database while continuing to let them be available for CQL access. From deterministic to random algorithms, key management and integration into DataStax drivers, this talk will introduce attendees to the steps to follow in order to protect an existing Cassandra database with field-level granularity ensuring protection against data breaches.
About the Speaker
Ameesh Divatia President & CEO, Baffle, Inc.
Ameesh is a serial entrepreneur with over 25 years of operating experience in storage, security and networking infrastructure. He specializes in conceiving and implementing startup business plans that create new product categories by leveraging innovation in existing markets to its adjacencies. He co-founded Baffle in May 2015 to address the challenge of preventing data breaches in cloud infrastructure.
Run Cloud Native MySQL NDB Cluster in KubernetesBernd Ocklin
The more your database aligns with Cloud Native principles such as resilience, scaling, auto-healing and data consistency across all nodes, the better it also runs as DBaaS in Kubernetes. I walk through running databases in Kubernetes and demos manual deployment and deployment with an NDB operator.
This talk was given at the MySQL Dev Room FOSDEM 2021.
Demystifying the Distributed Database LandscapeScyllaDB
What is the state of the art of high performance, distributed databases as we head into 2022, and which options are best suited for your own development projects?
The data-intensive applications leading this next tech cycle are typically powered by multiple types of databases and data stores — each satisfying specific needs and often interacting with a broader data ecosystem. Even the very notion of “a database” is evolving as new hardware architectures and methodologies allow for ever-greater capabilities and expectations for horizontal and vertical scalability, performance, and reliability.
In this webinar, ScyllaDB Director of Technology Advocacy Peter Corless will survey the current landscape of distributed database systems and highlight new directions in the industry.
This talk will cover different database and database-adjacent technologies as well as describe their appropriate use cases, patterns and antipatterns with a focus on:
- Distributed SQL, NewSQL and NoSQL
- In-memory datastores and caches
- Streaming technologies with persistent data storage
[db tech showcase Tokyo 2017] C34: Replacing Oracle Database at DBS Bank ~Ora...Insight Technology, Inc.
Migrating Oracle based applications to MariaDB has become easier and economically advantageous with the feature set of MariaDB 10.2 and the upcoming 10.3 release. We’ll present details of the features that led DBS Bank to migrate mission critical applications to MariaDB.
Wide Column Store NoSQL vs SQL Data ModelingScyllaDB
NoSQL schemas are designed with very different goals in mind than SQL schemas. Where SQL normalizes data, NoSQL denormalizes. Where SQL joins ad-hoc, NoSQL pre-joins. And where SQL tries to push performance to the runtime, NoSQL bakes performance into the schema. Join us for an exploration of the core concepts of NoSQL schema design, using Scylla as an example to demonstrate the tradeoffs and rationale.
A Step by Step Introduction to the MySQL Document StoreDave Stokes
Looking for a fast, flexible NoSQL document store? And one that runs with the power and reliability of MySQL. This is an intro on how to use the MySQL Document Store
Third normal form? That’s so 20th century. Learn the newest techniques to make your Cassandra database sing from the rafters in performance and scalability. AND it uses concepts that you already know and apply every day. You can do this. This is the must-see half hour of your professional life! These developers found a new way to work with databases. First you will be shocked, then you will be inspired!
Primary and Clustering Keys should be one of the very first things you learn about when modeling Cassandra data. Most people coming from a relational background automatically think, ""Yeah, I know what a Primary Key is"", and gloss right over it. Because of this, there always seems to be a lot of confusion around the topic of Primary Keys in Cassandra. This presentation will demystify that confusion. I will cover what the different types of Keys are, how they can be used, what their purpose is, and how they affect your queries.
For this presentation, I will be using CrossFit gym locations as my subject matter. I will explain the differences between Primary Keys, Compound Keys, Clustering Keys, & Composite Keys. I will also show how the data behind each type differs as stored on disk. Lastly, I will show what queries each type of key will support.
About the Speaker
Adam Hutson Data Architect, DataScale
Adam is Data Architect for DataScale, Inc. He is a seasoned data professional with experience designing & developing large-scale, high-volume database systems. Adam previously spent four years as Senior Data Engineer for Expedia building a distributed Hotel Search using Cassandra 1.1 in AWS. Having worked with Cassandra since version 0.8, he was early to recognize the value Cassandra adds to Enterprise data storage. Adam is also a DataStax Certified Cassandra Developer.
Join us as we talk about the current state as well as the future of DSE Search. Nick Panahi will discuss high level architecture while Ariel will dive deep into some of the integration. We'll talk about future features, improvements and enhancements as well as some of the challenges of our custom integration and what that means for scale and availability.
About the Speakers
Nick Panahi Sr. Product Manager, DSE Search, DataStax
I am the product manager for DSE search, prior to product management, I was a solution architect for DataStax.
Ariel Weisberg Software Engineer, DataStax
Ariel is currently a Cassandra contributor and Datastax employee and former lead architect for VoltDB. Ariel aspires to be or considers himself a shared-nothing database expert depending on the time of day and whether Benedict is in the room, and has a passion for things measured in nanoseconds. Ariel has presented at events like Strangeloop, PAX Dev, OpenSQL camp Boston, NYC MySQL Meetup, and Boston New Technology Group meetup.
NoSQL and NewSQL: Tradeoffs between Scalable Performance & ConsistencyScyllaDB
This webinar compares NoSQL and NewSQL databases. We will look at the significant architectural differences between the two, tradeoffs between availability, scalable performance and consistency, data models, and share benchmark results to display the performance implications of NoSQL versus NewSQL.
Many NoSQL DBaaS vendors limit what cloud platform you can run on, the size of the data you can run and require you to over-provision cloud infrastructure resources while failing to deliver performance and low latency at scale.
In this session, we will compare the performance and Total Cost of Ownership (TCO) of competing NoSQL DBaaS offerings. We will also review how to migrate to Scylla Cloud, our fully managed database service.
You will learn:
- The true cost of ownership for selected NoSQL DBaaS offerings
- The 8 essentials for selecting a NoSQL DBaaS
- Migration options from Apache Cassandra, DynamoDB and other databases
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...DataStax
Cassandra is moving away from Thrift to CQL protocol. Along with this change, Thrift based client drivers are not actively supported any more, nor are they exposed to new Cassandra features. This brings many existing Cassandra users into a situation that they need to migrate their Thrift based application to CQL based. A complete solution to Thrift-to-CQL migration requires changes in 3 areas: application code, data model, and existing data. In this session we will focus on how we can migrate existing data effectively in order to reflect data model changes and give you the necessary tools to build from what you have learned to apply it in your environments.
Learning Objectives:
1) Gain an insight of what Cassandra storage engine looks like for version 2.2 and downward
2) Get better idea of how Thrift and CQL difference affects table design
3) Explore an approach to effectively migrate dynamically generated (Thrift) data into a static defined (CQL) table
About the Speaker
Yabin Meng Apache Cassandra / DataStax Enterprise Consultant, Pythian
Yabin is a DataStax certified Architect, Administrator, and Developer. He has been in IT industry for more than 15 years and much of his career is around database related technologies. He has been working with Cassandra for about 2 years and is currently a Cassandra/DSE consultant at Pythian.
Development teams are facing an increased demand for building faster, smarter, more complex systems in record time. The DataStax development tools have been architected from the ground up to enable teams to deliver intelligent,
large-scale distributed, high-performance applications while remaining familiar and easy to use.
This session will provide an overview of the DataStax Developer Tools, where and how they fit into the development cycle.
About the Speaker
Alex Popescu Senior Product Manager, DataStax
I'm a developer turned product manager building developer tools for Apache Cassandra and DSE. With an eye for simplicity, I focus on creating friendly developer solutions that enable building high-performance, scalable, and fault tolerant applications. I'm passionate about open source and over years I made numerous contributions to major projects like TestNG and Groovy.
How to Analyze and Tune MySQL Queries for Better Performance - oysteing
Tutorial at Oracle Open World 2015:
Performance of SQL queries plays a big role in application performance. If some queries execute slowly, these queries or the database schema may need tuning. This tutorial covers query processing, optimization methods, and how the MySQL optimizer chooses a specific plan to execute SQL. See demonstrations on how to use tools such as EXPLAIN (including the JSON-based variant), optimizer trace, and performance schema to analyze query plans. See how the Visual Explain functionality in MySQL Workbench helps you to visualize these plans. Based on the analysis, the tutorial covers how to take the next steps for performance tuning. It might mean forcing a particular index, changing the schema, or modifying configuration parameters.
Tips on how to prepare MySQL 5.7 GIS databases for the upgrade to MySQL 8.0 and the introduction of geography support.
Presentation given at the Pre-FOSDEM MySQL Day in Brussels, February 3, 2017.
The technology world has almost written off MySQL in favor of fancy new NoSQL databases like MongoDB and Cassandra, or even Hadoop for aggregation. But MySQL has a lot to offer in terms of ACID compliance, performance, and simplicity. For many use cases MySQL works well. In this week's ShareThis workshop we discuss different tips and techniques to improve performance and extend the lifetime of your MySQL deployment.
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPU, more RAM, or quicker disks), or you can use parallel replication to remove its single-threaded limitation; but lag can still happen. This talk is not about making replication faster; it is about how to deal with its asynchronous nature, including the (in-)famous lag.
We will start by explaining the consequences of asynchronous replication and how and when lag can happen. Then, we will present the solution used at Booking.com to both avoid creating lag and minimize the consequences of stale reads on slaves (hint: this solution does not mean reading from the master, because that does not scale).
Once all of the above is well understood, we will discuss how Booking.com’s solution can be improved: this solution was designed years ago and we would do this differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.
SwanseaCon 2017 presentation on Making MySQL Agile-ish. Relational Databases are not usually considered part of the Agile Programming movement but there are many new features in MySQL to make it easier to include it. This presentation covers how MySQL is moving to help support agile development while maintaining the traditional 'non agile' stability expected from a database.
PHP, The X DevAPI, and the MySQL Document Store Presented January 23rd, 20... - Dave Stokes
This presentation is from the January 2019 Benelux PHP conference and covers use of the MySQL Document Store via the X DevAPI so that MySQL can be used as a NoSQL JSON document store as well as a relational database, providing the best of both worlds.
MySQL Without the SQL -- Oh My! Longhorn PHP Conference - Dave Stokes
You can now use MySQL without needing to know Structured Query Language (SQL) with the MySQL Document Store. Access JSON documents and/or relational tables using the new X DevAPI.
MySQL Document Store - A Document Store with all the benefits of a Transactona... - Olivier DASINI
MySQL Document Store allows developers to work with SQL relational tables and schema-less JSON collections. To make that possible, MySQL has created the X DevAPI, which puts a strong focus on CRUD by providing a fluent API allowing you to work with JSON documents in a natural way. The X Protocol is highly extensible and is optimized for CRUD as well as SQL API operations.
Scaling the Content Repository with Elasticsearch - Nuxeo
This talk will explain how to leverage Elasticsearch capabilities to make your content repository scale to the sky while still relying on standard SQL based technologies and ensuring data security and integrity. The design choices behind this hybrid Elasticsearch / PgSQL architecture will be discussed and the technical integration with Elasticsearch will be demonstrated.
Watch the recorded webinar: http://www.nuxeo.com/resources/scaling-the-document-repository-with-elasticsearch/
Fewer developers each year are getting training in Structured Query Language (SQL), yet their code is more dependent on relational data. The MySQL Document Store allows programmers to use the full power of MySQL without needing SQL! Built on the X DevAPI, the MySQL Document Store is a powerful tool for working with JSON documents or relational tables. Best of both NoSQL and SQL worlds!
MySQL is a ubiquitous open source database, but do you know how to make it secure? This talk is from the 2022 Texas Cyber Summit on how to do just that. Make sure your data and database are secure.
MySQL Indexes and Histograms - RMOUG Training Days 2022 - Dave Stokes
Nobody complains when the database is too fast. But they do gripe when it slows down. The two most popular ways to increase query speed are indexes and histograms. But there are dozens of options for indexes and lots of bad information on how to use them. Histograms are great but not for all types of data. This session covers the hows and whys of both approaches.
Develop PHP Applications with MySQL X DevAPI - Dave Stokes
The X DevAPI provides a way to use MySQL as a NoSQL JSON Document Store and this presentation covers how to use it with the X DevAPI PHP PECL extension. And it also works with traditional relational tables. Presented at Oracle CodeOne 24 October 2018
The Proper Care and Feeding of MySQL Databases - Dave Stokes
Many Linux System Administrators are 'also' accidental database administrators. This is a guide for them to keep their MySQL database instances happy, healthy, and glowing.
MySQL can now be used as a NoSQL JSON Document store so you get the best of the NoSQL and SQL worlds. This talk covers the features of the X DevAPI, the MySQL Document Store, and how to use relational tables with the new SQL features.
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018 - Dave Stokes
The MySQL Document Store allows developers to use MySQL as a JSON Document Store -- no normalizing of data, no setting up relational tables, and you do not have to use SQL to query data. And you get both the SQL and NoSQL worlds on one server.
Presentation Skills for Open Source Folks - Dave Stokes
Do you want to present at a Linuxfest or other open source conference but do not know where or how to start? Follow these recommendations and you will be on your way to being a speaking all-star. Discover how to write your presentation, what tools you need, and other items of note.
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated) - Dave Stokes
MySQL 8 has many new features, and this presentation covers the new data dictionary, improved JSON functions, roles, histograms, and much more. Updated after SunshinePHP 2018 based on feedback.
ConFoo MySQL Replication Evolution : From Simple to Group Replication - Dave Stokes
MySQL Replication has been around for many years, but how well do you understand it? Do you know about read/write splitting, RBR vs SBR style replication, and InnoDB cluster?
This presentation is an INTRODUCTION to intermediate MySQL query optimization for the audience of PHP World 2017. It covers some of the more intricate features in a cursory overview.
The very basics of programming in PHP to store/retrieve data on a relational database management system (RDBMS). For those looking for intermediate to advanced material, please see 'What Your Database Query is Really Doing'.
MySQL Replication Evolution -- Confoo Montreal 2017 - Dave Stokes
MySQL Replication has evolved since the early days of simple async master/slave replication, adding better security, high availability, and now InnoDB Cluster.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, AI, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
The Building Blocks of QuestDB, a Time Series Database - javier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps as just another data type. However, when performing real-time analytics, timestamps should be first-class citizens, and we need rich time semantics to get the most out of our data. We also need to deal with ever-growing datasets while staying performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Adjusting primitives for graph : SHORT REPORT / NOTES - Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Quantitative Data Analysis: Reliability Analysis (Cronbach Alpha) Common Method... - 2023240532
Quantitative data Analysis
Overview
Reliability Analysis (Cronbach Alpha)
Common Method Bias (Harman Single Factor Test)
Frequency Analysis (Demographic)
Descriptive Analysis
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... - pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
2. "THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL
PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION
PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY
CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY
MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE
RELIED UPON IN MAKING PURCHASING DECISIONS. THE
DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR
FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS
REMAINS AT THE SOLE DISCRETION OF ORACLE."
Safe Harbor
3. 21 Years Old
MySQL has been part of
Oracle’s family of databases
for six years.
MySQL 8
MySQL 5.7 is the current release
but the next version will be
MySQL 8. The big feature is a
real-time data dictionary
Group Replication
Active master-master
replication.
JSON
A new native JSON datatype to
store documents in a column of
a table
Document Store
Programmers do not know SQL
but need a database? The X DevAPI
allows them to use an RDBMS from
their language of choice
Encryption
Use Oracle Key Vault to
encrypt your data at rest.
7. The relational database model has been established
for decades. Originally designed for efficient storage
of normalized data when disk drive space
was expensive, relational databases can now consume
unnormalized data or bypass their own optimizers and
syntax checking restraints for previously unimagined
speed.
8. Why! What would force well-behaved software of outstanding utility
To change …
To morph …
To take on new SuperPowers?????
10. The relational model works well but not all
data fits easily into schemas.
Sometimes the data is too big (or messy),
ephemeral, the team lacks DBA skills, or
the programmer is #%*@ lazy and wants
to pile stuff like a teenager!!!
11. NoSQL -- Broad topic made
up of many technologies
looped together like
indolent cattle and branded
NoSQL
12. × Graph Databases -- relations, 6° of Kevin Bacon
× Map/Reduce -- Filter/Sort/Count
× Key/Value -- <attribute name/ value>
× Document store -- key/value with document data
× Most designed to scale horizontally
× Give up consistency
13. Adding NoSQL
The relational database vendors
have been busy adding NoSQL
features to SQL. ACID on NoSQL still
very hard.
15. The InnoDB memcached plugin provides an integrated memcached
daemon that automatically stores and retrieves data from InnoDB
tables, turning the MySQL server into a fast “key-value store”. Instead
of formulating queries in SQL, you can use simple get, set, and incr
operations that avoid the performance overhead associated with
SQL parsing and constructing a query optimization plan. You can also
access the same InnoDB tables through SQL for convenience,
complex queries, bulk operations, and other strengths of traditional
database software.
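To make the get, set, and incr operations concrete, here is a minimal sketch of the requests a client would send using the standard memcached text protocol. It only builds the request bytes, so it runs without a server; the helper names and the sample keys are assumptions made for illustration, and actually sending the bytes would require a server with the plugin enabled and listening (memcached's default port is 11211).

```python
# Sketch: build memcached text-protocol requests. No server is contacted.
# Helper names and sample keys/values are hypothetical.

def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    # Wire format: "set <key> <flags> <exptime> <byte count>" + CRLF + data + CRLF
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    # Wire format: "get <key>" + CRLF
    return f"get {key}\r\n".encode("ascii")


def build_incr(key: str, delta: int = 1) -> bytes:
    # Wire format: "incr <key> <delta>" + CRLF (the stored value must be numeric)
    return f"incr {key} {delta}\r\n".encode("ascii")


if __name__ == "__main__":
    for request in (build_set("AA", b"HELLO"), build_get("AA"), build_incr("hits", 5)):
        print(request)
```

None of these requests involves SQL parsing or query planning, which is where the plugin's speed advantage described above comes from.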
16.
17. Plugin
One-line command
to install the shared
object.
Script
Run a script to set up
example tables. Use
as a template to fit
your data, e.g.
column separator
character defaults
to ‘|’ (pipe).
Cache
You can now use
MySQL/Memcached
as a consistent
cache or customize
to your needs.
18. JavaScript Object Notation, or JSON, is a
popular way of storing information in a fashion
that is relatively easy for humans to consume.
19.
20. DON’T WANT TO NORMALIZE YOUR DATA?
THEN STORE AN
ENTIRE
DOCUMENT
IN A SINGLE
COLUMN!
22. You are breaking the first
rule of data normalization!
So you need new functions
to be able to take care of
the document in the column!
23. See JSON DATA TYPE http://slideshare.net/davidmstokes for details & examples
24. You can use GENERATED columns to
extract values from a JSON column
and those columns can be indexed
for fast SQL searches.
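As a rough illustration of the generated-column idea, the sketch below keeps a MySQL 5.7-style DDL statement as a string and then models the same behaviour in plain Python: a value is derived from each JSON document and an index is kept over the derived value. The table name, column names, and sample documents are invented for this example, and the Python part is only a conceptual model, not how the server implements indexes.

```python
import json
from collections import defaultdict

# MySQL 5.7+ DDL expressing the idea; table and column names are hypothetical.
# Run it through any MySQL client; this script itself does not connect anywhere.
MYSQL_DDL = """
CREATE TABLE countries (
  doc  JSON,
  name VARCHAR(64) GENERATED ALWAYS AS
       (JSON_UNQUOTE(JSON_EXTRACT(doc, '$.name'))) VIRTUAL,
  INDEX idx_name (name)
);
"""


def build_name_index(rows):
    """Conceptual model: map each extracted 'name' to the row positions holding it."""
    index = defaultdict(list)
    for position, raw in enumerate(rows):
        name = json.loads(raw).get("name")  # plays the role of the generated column
        index[name].append(position)
    return index


# Invented sample documents, stored as JSON text the way a JSON column would hold them.
rows = [
    '{"name": "Iceland", "demo_value": 100}',
    '{"name": "Japan", "demo_value": 200}',
]
name_index = build_name_index(rows)

if __name__ == "__main__":
    print(name_index["Japan"])  # positions of rows whose document has name == "Japan"
```

With the real generated column in place, a query such as `SELECT doc FROM countries WHERE name = 'Japan'` can use `idx_name` instead of parsing every document.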
25. SQL skills == no? What if your programmers do not know
Structured Query Language???
26. Relational databases such as MySQL usually require a
document schema to be defined before documents can be
stored. You can use MySQL as a document store, which is a
schema-less, and therefore schema-flexible, storage system for
documents. When using MySQL as a document store, to create
documents describing products you do not need to know and
define all possible attributes of any products before storing
them and operating with them. This differs from working with a
relational database and storing products in a table, when all
columns of the table must be known and defined before adding
any products to the database.
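The product example above can be sketched with a toy in-memory collection. This is not the X DevAPI or any MySQL client; `TinyCollection` and its methods are hypothetical names used only to show that documents with different attributes can live side by side without a predefined column list.

```python
# Sketch: a toy schema-less collection. TinyCollection is a hypothetical
# stand-in for illustration, NOT the X DevAPI.

class TinyCollection:
    def __init__(self):
        self._docs = []

    def add(self, doc: dict) -> None:
        # No schema check: any set of attributes is accepted.
        self._docs.append(dict(doc))

    def find(self, predicate):
        # Return copies of the documents for which predicate(doc) is true.
        return [dict(d) for d in self._docs if predicate(d)]


products = TinyCollection()
products.add({"name": "laptop", "ram_gb": 16})
products.add({"name": "t-shirt", "size": "M", "color": "blue"})  # new attributes, no ALTER TABLE

medium_items = products.find(lambda d: d.get("size") == "M")

if __name__ == "__main__":
    print(medium_items)
```

In a relational table, the second product would have required `size` and `color` columns to exist before the insert; here the attributes simply travel with the document.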
27. New Shell & New
protocol
Developers can do CRUD
(create/read/update/delete)
From language of choice
(SQL hidden so as not to frighten
them)
29. The X DevAPI wraps powerful concepts in a
simple API.
● A new high-level session concept
enables you to write code that can
transparently scale from single
MySQL Server to a multiple server
environment.
● Read operations are simple and
easy to understand.
● Non-blocking, asynchronous calls
follow common host language
patterns.
The X DevAPI introduces a new, modern
and easy-to-learn way to work with your data.
● Documents are stored in
Collections and have their
dedicated CRUD operation set.
● Work with your existing domain
objects or generate code based on
structure definitions for strictly
typed languages.
● Focus is put on working with data
● Modern practices and syntax
styles are used to get away from
traditional SQL-String-Building
30. mysqlsh -u root --py
Enter password: ****
mysql-py> db.createCollection("flags")
<Collection:flags>
mysql-py> db.getCollections()
[
<Collection:CountryInfo>,
<Collection:flags>
]
mysql-py> db.CountryInfo.find("GNP > 500000")
...[output removed]
10 documents in set (0.00 sec)
37. More/faster memory for in memory DBs
Spill to disk, pin to memory
Durable to disk (ACID Compliant)
Gets cheaper every year (~40%)
Cache is the new RAM, RAM the new disk, disk the new tape, etc
No random writes leverage SSDs
NVRAM on the way, may be cheaper