Distributed Vector Databases - What, Why, and How - Steve Pousty, VMware
In the last two years, AI and machine learning have exploded in prominence. One of the key concepts used in AI modeling and storage is the vector. Feeling like you should learn more about vectors and how you would use them in your data work? Wondering how you would run them distributed on Kubernetes? Then have I got a talk for you! We will start by explaining the concept of (embedding) vectors and how they are used in the AI life cycle. From there we will go into putting them into a database. We will cover the use cases where this technology makes sense. As opposed to an RDBMS, vector databases are more tightly focused and optimized for particular use cases. To ground this discussion in something more concrete, there will be hands-on demos throughout the talk. You will see the advantages of running distributed vector databases on Kubernetes infrastructure. Bring your favorite Kube infrastructure and leave with hands-on experience running AI infrastructure on Kubernetes.
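The embedding-and-retrieval idea at the heart of the talk can be sketched in a few lines of plain Python. This is a toy brute-force index, not any particular vector database; the names and example vectors are made up for illustration:

```python
import math

# Toy in-memory "vector index": store (id, embedding) pairs and answer
# nearest-neighbor queries by cosine similarity. Real vector databases
# replace this linear scan with approximate indexes (e.g. HNSW) and
# shard the data across nodes.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class ToyVectorIndex:
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def search(self, query, k=1):
        # Score every stored vector against the query, highest first.
        scored = [(cosine(query, v), doc_id) for doc_id, v in self.items]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]

index = ToyVectorIndex()
index.add("cat", [1.0, 0.1, 0.0])
index.add("dog", [0.9, 0.2, 0.1])
index.add("car", [0.0, 0.1, 1.0])
print(index.search([1.0, 0.0, 0.0], k=2))  # -> ['cat', 'dog']
```

Distributing such an index is mostly a matter of partitioning `self.items` across nodes and merging each node's top-k results, which is exactly the kind of work a Kubernetes-hosted vector database automates.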
Part 2 of a 2-part presentation that I gave in 2009; this presentation covers more about unstructured data and operational data vault components. YES, even then I was commenting on how this market would evolve. IF you want to use these slides, please let me know, and add: "(C) Dan Linstedt, all rights reserved, http://LearnDataVault.com" in a VISIBLE fashion on your slides.
Efficient and reliable hybrid cloud architecture for big database - ijccsa
The objective of our paper is to propose a Cloud computing framework which is feasible and necessary for handling huge data. In our prototype system we considered the national ID database structure of Bangladesh, which is prepared by the Election Commission of Bangladesh. Using this database, we propose an interactive graphical user interface for Bangladeshi People Search (BDPS) that uses a hybrid structure of cloud computing handled by Apache Hadoop, where the database is implemented in HiveQL. The infrastructure is divided into two parts: a locally hosted cloud based on "Eucalyptus" and a remote cloud implemented on the well-known Amazon Web Services (AWS). Some common problems in the Bangladesh context, including data traffic congestion, server timeouts and server downtime, are also discussed.
Cloud computing has been the most widely adopted technology in recent times, and databases have now also moved to cloud computing, so we will look into the details of database as a service and its functioning. This paper includes all the basic information about database as a service. The working of database as a service and the challenges it faces are discussed with appropriate examples. The structure of the database in cloud computing and its working in collaboration with nodes is examined under database as a service. This paper also highlights the important things to consider before adopting the database-as-a-service provider that is best among the others. The advantages and disadvantages of database as a service will let you decide whether or not to use it. Database as a service has already been adopted by many e-commerce companies, and those companies are benefiting from this service.
Efficient and scalable multitenant placement approach for in-memory database ... CSIT - iaesprime
Of late, the multitenant model with in-memory databases has become a prominent research area. The paper uses the advantages of multitenancy to reduce the cost of hardware and labor and to improve storage availability by sharing database memory and file execution. The purpose of this paper is to give an overview of the proposed Supple architecture for implementing an in-memory database backend with multitenancy, applicable in public and private cloud settings. The in-memory database backend uses a column-oriented approach with a dictionary-based compression technique. We used a dedicated sample benchmark for the workload processing and also adopted an SLA penalty model. In particular, we present two approximation algorithms, multi-tenant placement (MTP) and best-fit greedy, to show the quality of tenant placement. The experimental results show that the MTP algorithm is scalable and efficient in comparison with the best-fit greedy algorithm over the proposed architecture.
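As a rough illustration of the best-fit greedy baseline the abstract mentions, here is a sketch under assumed inputs (tenant sizes and server capacities are invented; the paper's actual MTP algorithm and SLA penalty model are more involved):

```python
# Best-fit greedy tenant placement: put each tenant on the server whose
# remaining memory is the smallest that still fits, minimizing wasted
# capacity per placement decision.

def best_fit_greedy(tenants, capacities):
    remaining = list(capacities)
    placement = {}
    for tenant, size in tenants:
        # Candidate servers that can still host this tenant.
        fits = [(remaining[i], i) for i in range(len(remaining)) if remaining[i] >= size]
        if not fits:
            placement[tenant] = None  # no server fits (an SLA penalty case)
            continue
        _, best = min(fits)           # tightest fit wins
        remaining[best] -= size
        placement[tenant] = best
    return placement, remaining

# Hypothetical workload: four tenants, two servers with 8 GB and 10 GB free.
tenants = [("t1", 4), ("t2", 7), ("t3", 3), ("t4", 5)]
placement, remaining = best_fit_greedy(tenants, capacities=[8, 10])
print(placement)  # -> {'t1': 0, 't2': 1, 't3': 1, 't4': None}
```

The greedy choice is locally optimal but can strand capacity (tenant `t4` fails here even though 4 GB remain on server 0), which is the kind of placement-quality gap an algorithm like MTP aims to close.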
Software Design Patterns: Consider a company migrating to a third-p.pdf - arorastores
Software Design Patterns:
Consider a company migrating to a third-party cloud-based solution from an internally
maintained ecosystem of applications utilizing one current-generation database system, as well
as a legacy system for older data. They plan to migrate all data to the cloud based solution in
time. But, for now, they are going to transition to the new cloud-based applications and the
cloud-based database for new data, but will rely upon the existing and legacy database for older
data. The databases have approximately the same functionality, but different interfaces and
languages.
What design pattern highlights the most significant challenge associated with integrating the
different databases (as well as one way of addressing it)?
What is that challenge?
Briefly, and in plain English, describe how the pattern teaches us to approach this problem.
In other words, what is the pattern that we should follow for the solution?
Solution
Design patterns such as the Factory pattern, the Singleton pattern, etc. provide solutions to general
problems that software developers face during the development phase. These patterns
do not play any role in data migration.
There are four stages in data migration. They are:
1. Semantic data models, which comprise dimensional models, semantic models, and mapping to semantic building blocks.
2. Data mapping specifications, which are used to translate source data to target data.
3. KPIs and data lineage, which are useful in establishing the data lineage for the organization and other legitimate requirements.
4. End-to-end scope of data models, which is used to standardise the data that is loaded into the data warehouse.
Please follow the list of steps below when migrating data to the cloud:
1. Assess the requirements, then plan.
2. Disentangle the dependencies after the initial assessments.
3. Redesign, re-program and reintegrate.
4. Test the newly migrated components.
5. Fine-tune and train.
However, there can be technical issues during data migration. Many firms that migrate
data to the cloud proceed in a hybrid model, keeping key elements of their infrastructure
in-house and under their control while outsourcing less sensitive, non-core components.
Cloud vendors generally expect customers to jointly provide or develop a virtual image
that specifies their basic server configuration, which is offered as a service after being built
inside the cloud. The IT team is also required to have the skill set to create a VM template
that includes the infrastructure, application and security required by the enterprise.
Challenges, Management and Opportunities of Cloud DBA - inventy
Research Inventy provides an outlet for research findings and reviews in areas of Engineering and Computer Science found to be relevant for national and international development. Research Inventy is an open-access, peer-reviewed international journal with a primary objective to publish research and applications related to Engineering, to stimulate new research ideas, and to foster practical application of research findings. The journal publishes original research of such high quality as to attract contributions from the relevant local and international communities.
A presentation on best practices for J2EE scalability from requirements gathering through to implementation, including design and architecture along the way.
Challenges for running Hadoop on AWS - Advanced - AWS Meetup - Andrei Savu
Nowadays we've got all the tools we need to spin up and tear down clusters with hundreds of nodes in minutes, and this puts more pressure on the tools we use to configure and monitor our applications. The challenge is even more interesting when we have to deal with long-running distributed data storage and processing systems like Hadoop. In this talk we will look into some of the challenges of creating and managing Hadoop clusters in AWS, discuss improvement opportunities in monitoring (e.g. detecting and dealing with instance failure, resource contention & noisy neighbors), and talk a bit about the future and how we should go about disconnecting workload dispatch from cluster lifecycle.
Machine learning in the cloud influences operations company-wide. Learn from data scientist Ahmed Sherif how to leverage cloud offerings including AWS, IBM Cloud, and Microsoft Azure.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Is It Safe? Security Hardening for Databases Using Kubernetes Operators - DoKC
Is It Safe? Security Hardening for Databases Using Kubernetes Operators - Robert Hodges, Altinity
Thanks to the Operator Pattern, Kubernetes is now an outstanding platform to run databases. But to quote Marathon Man, "is it safe?" This talk is a top-level review of the database security problem in Kubernetes, standard ways that operators can mitigate threats, and a wallet-sized checklist of security features you should look for in any operator you use. Our talk is practical and focused on needs of Kubernetes developers. Join us!
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery - DoKC
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery - Shivani Gupta, Elotl & Sergey Pronin, Percona
Disaster Recovery (DR) is critical for business continuity in the face of widespread outages taking down entire data centers or cloud provider regions. DR relies on deployment to multiple locations, data replication, monitoring for failure, and failover. The process is typically manual, involves several moving parts, and, even in the best case, entails some downtime for end-users. A multi-cluster K8s control plane presents the opportunity to automate the DR setup as well as the failure detection and failover. Such automation can dramatically reduce RTO and improve availability for end-users. This talk (and demo) describes one such setup using the open-source Percona Operator for PostgreSQL and a multi-cluster K8s orchestrator. The orchestrator uses policy-driven placement to replicate the entire workload on multiple clusters (in different regions), detects failure using pluggable logic, and performs failover by promoting the standby and redirecting application traffic.
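The detect-and-promote loop described in the abstract can be sketched conceptually in plain Python. This is illustrative pseudologic only: the cluster names and `is_healthy` callback are stand-ins, and in the real setup the Percona Operator and the multi-cluster orchestrator handle promotion and traffic redirection:

```python
# Conceptual failover step for a primary/standby pair across two regions.
# `is_healthy` is a stand-in for pluggable failure detection.

def failover_step(clusters, active, is_healthy):
    """Return the cluster that should serve traffic after one health check."""
    if is_healthy(active):
        return active  # no action needed
    standbys = [c for c in clusters if c != active and is_healthy(c)]
    if not standbys:
        raise RuntimeError("no healthy standby available")
    promoted = standbys[0]
    # In a real deployment: promote the standby's database to primary,
    # then repoint application traffic (DNS / load balancer / service mesh).
    return promoted

clusters = ["us-east", "eu-west"]
health = {"us-east": False, "eu-west": True}  # simulated regional outage
print(failover_step(clusters, "us-east", lambda c: health[c]))  # -> eu-west
```

Automating exactly this loop, plus the replication that makes the standby promotable, is what reduces RTO compared with a manual runbook.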
More Related Content
Similar to Distributed Vector Databases - What, Why, and How
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da... - DoKC
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Data Mesh - Rakesh Subramanian Suresh & Jainik Vora, Intuit
This presentation explores how Intuit uses Kubernetes with Domain-Driven Design and Data Mesh principles to transform its data processing landscape, crucial for its AI-driven expert platform. We will discuss the importance of clean data in developing robust generative artificial intelligence and how Intuit is addressing this through the creation of paved paths for data platforms running on Kubernetes. We'll examine the challenges and solutions in managing 100,000 data pipelines and 1000+ engineers interacting with data, highlighting the need for scalable solutions. We'll also discuss how Intuit uses Kubernetes to build its batch and stream processing platform, overcoming hurdles in data pipeline deployment, scheduling, orchestration, and dependency management. We'll conclude by emphasizing how this transformation, based on treating data as a product, has improved decision-making speed and accuracy across the organization and fostered a more efficient, collaborative data culture.
The State of Stateful on Kubernetes - Stateful Workloads in Kubernetes: A Deep Dive - Kaslin Fields & Michelle Au, Google
As a platform for distributed computing, Kubernetes enables users to run their workloads across machines. However data has gravity, and when workloads in Kubernetes have to share data with other applications, managing the application’s requirements can get more tricky. In this talk, we will explore what "Stateful" means from Kubernetes' perspective. We will discuss the different types of stateful workloads, and the challenges of deploying them on Kubernetes. We will also look at the features that exist in Kubernetes to support stateful workloads, as well as the features that are in the works. Key Takeaways: What is a stateful workload from Kubernetes’ perspective? What are the challenges of deploying stateful workloads on Kubernetes? What features exist in Kubernetes to support stateful workloads? What features are in the works to support stateful workloads better in the future?
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ... - DoKC
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource Utilization - He Cao, ByteDance
Recently, more and more data workloads are running on top of Kubernetes, such as ETL processes, Spark and Flink jobs, and more. These workloads typically exhibit high resource utilization and remain relatively stable over time. In contrast, web services often exhibit tidal patterns, characterized by significant fluctuations in resource utilization. The resource model of vanilla Kubernetes is static, which can lead to low resource utilization accumulated over 24 hours. In this talk, He will introduce how ByteDance uses Katalyst to colocate data workloads and online services on Kubernetes to improve resource utilization. In addition, He will explain how Katalyst ensures the QoS of these workloads through QoS-aware scheduling, service profiling, multi-dimensional resource isolation, real-time container resource adjustment, and more. In ByteDance, Katalyst has been deployed on 500,000+ nodes with tens of millions of cores, and has improved daily resource utilization from 20% to 60%.
Make Your Kafka Cluster Production-Ready - Jakub Scholz, Red Hat
Kubernetes has become the de facto standard for running cloud-native applications, and more and more users are turning to it to run stateful applications such as Apache Kafka. While there are different tools such as Helm charts or operators which can get you up and running quickly, there is often still a long way to go to make sure the Kafka cluster is production-ready. This talk will take you through the main aspects you should consider for your Kafka cluster, covering things such as resource management, storage, scheduling, rolling updates, and reliability. It will show you how to do it using the Strimzi operator, but the lessons learned will apply to any other Kafka cluster as well. If you are interested in production-ready Apache Kafka on Kubernetes, this is a talk for you.
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W... - DoKC
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo Workflows and Argo Events - Ovidiu Valeanu, AWS & Vara Bonthu, Amazon
Are you eager to build and manage large-scale Spark clusters on Kubernetes for powerful data processing? Whether you are starting from scratch or considering migrating Spark workloads from existing Hadoop clusters to Kubernetes, the challenges of configuring storage, compute, networking, and optimizing job scheduling can be daunting. Join us as we unveil the best practices to construct a scalable Spark clusters on Kubernetes, with a special emphasis on leveraging Argo Workflows and Argo Events. In this talk, we will guide you through the journey of building highly scalable Spark clusters on Kubernetes, using the most popular open-source tools. We will showcase how to harness the potential of Argo Workflows and Argo Events for event-driven job scheduling, enabling efficient resource utilization and seamless scalability. By integrating these powerful tools, you will gain better control and flexibility for executing Spark jobs on Kubernetes.
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud - DoKC
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud - Sagy Volkov, Lightbits
PostgreSQL as a SQL engine can accommodate a very high-transaction rate, but as your data grows and the number of connections and queries increases, there is a challenge for the storage to keep up with the SQL engine.
To the rescue comes NVMe over TCP (NVMe/TCP). Developed by Lightbits Labs in 2016 and donated to the Linux community, it is the next evolution of using NVMe-based storage over a TCP fabric. NVMe/TCP simplifies how you interact with remote NVMe devices (targets) and allows your PostgreSQL storage layer to consume fast storage very easily.
In this session I will explain the core concept of the NVMe/TCP protocol, current storage providers that can use it, how you can consume it in Kubernetes (super easy), and discuss the possibilities of using NVMe/TCP in the cloud.
The session will also include a performance comparison of a few storage options available in AWS, and even a live demo of how PostgreSQL can run super fast - warp speed fast - in AWS.
Link: https://www.youtube.com/watch?v=D8kJCvsHD9Q&list=PLHgdNuGxrJt04Fwaip9aDYvXrbRSmc5HZ&index=12
https://go.dok.community/slack
https://dok.community/
From DoK Day NA 2022 (https://www.youtube.com/watch?v=YWTa-DiVljY&list=PLHgdNuGxrJt04Fwaip9aDYvXrbRSmc5HZ)
In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we’ve started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them.
In this talk, we’ll examine two different databases, TiDB and Apache Cassandra, in order to identify what it means for a database to be Kubernetes native and why it matters. We’ll look at points including:
- The differences between cloud native, Kubernetes native, and serverless
- How databases become Kubernetes native
- Benefits of Kubernetes native databases
- How Kubernetes can better support databases
-----
Jeff has worked as a software engineer and architect in multiple industries and as a developer advocate helping engineers get up to speed on Apache Cassandra. He's involved in multiple open source projects in the Cassandra and Kubernetes ecosystems including Stargate and K8ssandra. Jeff is the author of the O’Reilly books “Cassandra: The Definitive Guide" and “Managing Cloud Native Data on Kubernetes".
ING Data Services hosted on ICHP, DoK Amsterdam 2023 - DoKC
An explanation of how ING deals with local persistence at scale in a secure and compliant manner for Elastic and Prometheus workloads today, and for other Data Services in the future.
In more detail, we will elaborate on the following topics:
How we solve local persistence
Type of workloads now and in the future
Typical requirements for a banking environment
Automation
Scale
Resilience
Security / Compliance
Service offering / demarcation
About Tor and Luuk:
Tor and Luuk are experienced engineers working at ING for over 10 years and working in the Kubernetes area for the last 5 years. They are specialized in and responsible for the Data Services OpenShift clusters in ING and have a strong focus on resilience, automation and security.
Implementing data and databases on K8s within the Dutch governmentDoKC
A small walkthrough of projects within the Dutch government running data(bases) on OpenShift. This talk shares success stories, provides a proven recipe to `get it done`, and debunks some of the FUD.
About Sebastiaan:
I have always been a weird DBA, trying to combine databases with out-of-the-box thinking and a DevOps mindset. Around 2016 I fell in love with both Postgres and Kubernetes, and I then committed myself to enabling Dutch organisations to run their database workloads cloud-natively.
Over the last few years I have worked as a private contractor for 2 large government agencies doing exactly that, and I want to share my and others' success stories, hoping to enable and inspire Data on Kubernetes adoption.
https://go.dok.community/slack
https://dok.community/
Link: https://youtu.be/n_thXwyJNSU
ABSTRACT OF THE TALK
Deploying stateless applications is easy, but this is not the case for stateful applications. StatefulSets are the K8s API object that helps manage stateful applications. Learn what StatefulSets are, how to create them, and how they differ from Deployments.
KEY TAKE-AWAYS FROM THE TALK
This talk focuses on the basics of StatefulSets, how a StatefulSet differs from a Deployment, and how to manage a stateful app using a StatefulSet.
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do... - DoKC
Link: https://youtu.be/cegd3Exg05w
https://go.dok.community/slack
https://dok.community/
Gabriele Bartolini - Vice President/CTO of Cloud Native and Kubernetes, EDB
ABSTRACT OF THE TALK
Imagine this: you have a virtual infrastructure based on Kubernetes, made up of virtual data centers, possibly spread across multiple Kubernetes clusters and regions. Your infrastructure could even be hosted on premises or on different cloud service providers. Infrastructure as Code is a requirement. You’ve been tasked to run Postgres databases, alongside your applications.
The good news is that you can leverage a fully open source stack with Kubernetes, PostgreSQL and the CloudNativePG operator, and deploy your Postgres database in the same way you deploy applications.
Join me in this webinar to discover the key role you play in making this succeed, from day 0 through day 2 operations.
I’ll share some examples and best practices for running Postgres databases in Kubernetes, before peeking at the new features we are developing for the months to come.
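To give a flavour of the declarative approach the abstract describes, a minimal CloudNativePG Cluster resource can look like the sketch below (the name and sizes are illustrative; consult the operator documentation for the full spec):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo        # hypothetical name
spec:
  instances: 3         # one primary plus two replicas, managed by the operator
  storage:
    size: 10Gi         # each instance gets its own volume of this size
```

Applying this with kubectl is enough for the operator to bootstrap a highly available Postgres cluster, which is exactly the "deploy your database like an application" workflow the talk covers.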
Analytics with Apache Superset and ClickHouse - DoK Talks #151DoKC
Link: https://youtu.be/Y-1uFVKDfgY
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
This talk concerns performing analytical tasks with Apache Superset, using ClickHouse as the data backend. ClickHouse is a super-fast database for analytical tasks, and Apache Superset is an Apache Software Foundation project for data visualization and exploration. Performing analytical tasks with this combination is very fast, since both systems are designed to be scalable and capable of handling data at petabyte scale.
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...DoKC
Link: https://youtu.be/EFaRyl4HmmE
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
If you are running or planning a multi-cloud or even a multi-cluster environment, there are several considerations in implementing a data protection solution – especially if you plan on an organic home-grown, do-it-yourself option. This talk will highlight challenges and best practices around centralized management of configuration, credentials, compliance across multiple accounts, regions, providers etc. We will also highlight the deviations in CSI driver implementations of various storage vendors and cloud providers. Finally, we will cover the various recovery options available in the market today.
Kubernetes cloud services are popular since they mitigate, but do not eliminate, the difficulties of operating a Kubernetes environment. This is especially true for protecting the stateful configuration and data of your Kubernetes applications, where the inherent high availability and infrastructure as code are not a substitute for having cloud-native backup and disaster recovery capabilities. Further, many companies now have multi-cloud strategies for their cloud-native applications. These challenges can be addressed with backup applications that are aware of both Kubernetes managed services and multiple clouds, in order to snapshot, copy, restore, and migrate Kubernetes workloads (resources and data) running on AKS, EKS and GKE. Capturing information from cloud accounts and how the cluster and storage resources are configured allows: 1) centralized visibility into all cloud accounts and the clusters and resources in those accounts, including for compliance; 2) cross-account, cross-cluster, and cross-region data restores; 3) automation of cluster and data restores, including for Dev, Test, and Production recovery use cases.
BIO
Sebastian Glab is a Cloud Architect for CloudCasa and he resides in Poland. He is responsible for integrating the different cloud providers with the CloudCasa service, and making sure that all clusters in the cloud service get discovered and protected. In his free time, he plays volleyball and develops his own projects.
Martin Phan is the Field CTO in North America for CloudCasa by Catalogic Software. With over 20 years of experience in the software industry, he takes pride in supporting, developing, implementing, and selling enterprise software and data protection solutions to help customers solve their backup and recovery challenges.
KEY TAKE-AWAYS FROM THE TALK
1) Challenges and best practices around centralized management of configuration, credentials, compliance across multiple accounts, regions, providers etc.
2) Advantages of cloud awareness and Kubernetes managed service awareness for application and data recovery and security
3) Examples of overcoming Container Storage Interface (CSI) deviations
4) Various recovery options available in the market today.
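One concrete place where the CSI deviations mentioned in the takeaways surface is the VolumeSnapshotClass: each storage vendor or cloud provider plugs in its own driver name and parameters, so the same snapshot workflow needs per-provider configuration. A minimal, illustrative example (the name is hypothetical and the driver varies by platform):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass       # hypothetical name
driver: ebs.csi.aws.com     # provider-specific, e.g. pd.csi.storage.gke.io on GKE,
                            # disk.csi.azure.com on AKS
deletionPolicy: Retain      # keep the underlying snapshot if the K8s object is deleted
```

A multi-cloud backup tool has to account for these per-driver differences (and for drivers that do not support snapshots at all) when it snapshots and restores workloads across AKS, EKS and GKE.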
Evaluating Cloud Native Storage Vendors - DoK Talks #147DoKC
Link: https://youtu.be/YVXEpcSclwY
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
In a continuation of a talk given at DoK Day at KubeCon EU 2022, join Dinesh Majrekar, Civo's CTO, as they walk through their evaluation of the CNCF storage landscape.
Civo offers managed Kubernetes clusters powered by K3s to customers around the world. We manage thousands of Virtual Machines and stateful customer data within multiple data centres across several continents.
In late 2021, Civo had the opportunity to evaluate the CNCF storage landscape to move to a new technology stack. During the migration project, Civo evaluated Mayastor, Ondat, Ceph and Longhorn against the following metrics:
Scalability
Performance
Ease of Support
Attendees will see practical examples of how they could carry out their own similar evaluation, and see some of the results of Civo's research project.
BIO
Dinesh is CTO at Civo. Having worked in the hosting industry for many years, Dinesh has a passion for creating solutions that operate at scale. This applies not only to the technology stack, but also to nurturing engineers through their careers.
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...DoKC
Link: https://youtu.be/qUW8LkxYayc
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
How do you make sure your Stateful Workloads remain available when your Kubernetes infrastructure updates? This talk will discuss different strategies of upgrading a Kubernetes cluster, and how you can manage risk for your workload. The talk will showcase demos of each upgrade strategy.
BIO
Peter is a Senior Software Engineer on GKE at Google. He works on improving Kubernetes for Stateful workloads. His main focus is on enhancing the Kubernetes ecosystem for high availability applications.
KEY TAKE-AWAYS FROM THE TALK
The mechanics of different upgrade strategies, when to apply a particular upgrade strategy depending on your Stateful workload and how to mitigate risk to your application’s availability.
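One common way to mitigate availability risk while nodes are drained during a cluster upgrade (offered here as a general technique, not necessarily the exact one demoed in the talk) is a PodDisruptionBudget that caps voluntary disruptions for your stateful workload. A minimal sketch with placeholder names:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: db-pdb              # hypothetical name
spec:
  maxUnavailable: 1         # never evict more than one replica at a time
  selector:
    matchLabels:
      app: demo-db          # must match your workload's pod labels
```

With this in place, the eviction API used by node drains will refuse to take down a second replica until the first one is rescheduled and ready, keeping quorum-based databases available through rolling upgrades.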
We will Dok You! - The journey to adopt stateful workloads on k8sDoKC
Link: https://youtu.be/AjvwG53yLMY
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
Stateful workloads are the heart of any application, yet they remain confusing and complicated even to daily K8s practitioners. That’s why many organizations shy away from migrating their data - their prized possession - to the unfamiliar stateful realm of Kubernetes.
After meeting with many organizations in the adoption phase, I discovered what works best, what to avoid, and how critical it is to gain confidence and the right knowledge in order to successfully adopt stateful workloads.
In this talk I will demonstrate how to optimally adopt Kubernetes and stateful workloads in a few steps, based on what I’ve learned from observing dozens of different adoption journeys. If you are taking your first steps in data on K8s or contemplating where to start - this talk is for you!
BIO
- A Developer turned Solution Architect.
- Working at Komodor, a startup building the first K8s-native troubleshooting platform.
- Love everything in infrastructure: storage, networks & security - from 70’s era mainframes to cloud-native.
- All about “plan well, sleep well”.
KEY TAKE-AWAYS FROM THE TALK
- Understand how critical stateful workloads are for any system, and that the key challenges to migrating them to Kubernetes are knowledge and confidence.
- How to build the foundational knowledge required to overcome adoption challenges by creating a learning path for individuals and teams.
- How to gain confidence to run stateful workloads on Kubernetes with support from the community (and yourself!)
Mastering MongoDB on Kubernetes, the power of operators DoKC
Link: https://youtu.be/Pi5ueyl_1jU
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
During my first talk for the DoK community I want to walk you through the world of the NoSQL database MongoDB and its Kubernetes Operators: the Community Edition, the Enterprise Edition (MongoDB and Ops Manager on K8s), and the Atlas Operator. I will highlight the most important capabilities and talk about use cases and challenges; the theory will be mixed with live demos!
BIO
I'm an SRE / NoSQL / DevOps professional. I hold the CKA, CKAD, and CKS certifications, and I'm a MongoDB Certified DBA and a MongoDB Champion. I have experience with multiple cloud providers, Kubernetes, and different types of K8s operators (Strimzi, RabbitMQ Cluster Operator), but especially the MongoDB K8s Operator. I also work with KEDA. Since 2017, I have been a speaker at MongoDB conferences all around the world (USA, China, Europe).
KEY TAKE-AWAYS FROM THE TALK
I would like to share best practices for running the NoSQL database MongoDB on Kubernetes, and show how to manage Atlas (MongoDB's cloud service) via the K8s operator.
https://www.mongodb.com/developer/community-champions/arkadiusz-borucki/
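As a taste of the operator-driven approach the talk describes, a MongoDB Community Operator deployment is declared with a MongoDBCommunity custom resource. The sketch below is abridged and uses placeholder names and secrets; see the operator's sample manifests for a complete spec:

```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo-demo                  # hypothetical name
spec:
  members: 3                        # replica set size
  type: ReplicaSet
  version: "6.0.5"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: admin-password        # pre-created Secret holding the password
      roles:
        - name: clusterAdmin
          db: admin
      scramCredentialsSecretName: admin-scram
```

The operator reconciles this resource into a StatefulSet, Services, and SCRAM credentials, so scaling or upgrading the replica set becomes an edit to the manifest rather than a manual procedure.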
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...DoKC
Link: https://youtu.be/KUipuM3UJF4
https://go.dok.community/slack
https://dok.community/
DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)
Kubernetes comes with a lot of useful features like Volumes and StatefulSets, which make running stateful workloads simple. Interestingly, when combined with the right tools, these features can make Kubernetes very valuable for developers wanting to run massive production databases in development! This is exactly what was seen at "Extendi".
The developers at Extendi deal with a large amount of data in their production Kubernetes clusters. But when developing locally, they didn't have an easy way of replicating this data. This replication was needed because it allowed developers to test new features instantly without worrying about whether they would work as expected when pushed to production. But replicating a 100 GB+ production database for development wasn't turning out to be an easy task!
This is where leveraging Kubernetes + remote development environments came to the rescue. Running data on Kubernetes turned out to be way faster than any of the traditional approaches because of Kubernetes' ability to handle stateful workloads exceptionally well. And since Extendi already used Kubernetes in production - the setup process was fairly simple.
This talk will cover practical steps on how leveraging Kubernetes based development environments allowed dev teams at Extendi to run production data on Kubernetes during development using features like Volume Snapshots, having a huge positive impact on developer productivity.
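The Volume Snapshot workflow described above can be sketched with two resources (names, classes and sizes are illustrative, not taken from Extendi's setup): first snapshot the production PVC, then hydrate a fresh development PVC from that snapshot.

```yaml
# Take a CSI snapshot of the production data volume...
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: prod-db-snap              # hypothetical names throughout
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: prod-db-data
---
# ...then create a dev PVC pre-populated from the snapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dev-db-data
spec:
  dataSource:
    name: prod-db-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```

Because the clone happens at the storage layer, this is typically far faster than dumping and restoring a 100 GB+ database, which is the productivity win the talk highlights.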
-----
Arsh is a Developer Experience Engineer at Okteto. He is an active contributor to the upstream Kubernetes project and was awarded the Kubernetes Contributor Award for his contributions in 2021. Arsh has written blogs and spoken about different topics in the cloud-native ecosystem at various conferences before, including KubeCon + CloudNativeCon + Open Source Summit China 2021. He has also been on the Kubernetes Release Team since the 1.23 release. He also serves as the New Contributor Ambassador for the Documentation Special Interest Group of the Kubernetes project and continuously mentors new folks in the community. Previously, he worked at VMware and was an active contributor to other CNCF projects, including cert-manager and Kyverno.
-----
Lapo is a Software Engineer currently leading the development team of a Social Listening and Audience Intelligence platform. He started coding at the early age of 14 and, since turning his passion into a real job, has always looked to expand his knowledge by constantly researching new technologies.
Active in Ruby open source projects.
-----
Ramiro Berrelleza is one of the founders of Okteto. He has spent most of his career (and his free time) building cloud services and developer tools. Before starting Okteto, Ramiro was an Architect at Atlassian and a Software Engineer at Microsoft Azure.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.