With AWS, you can choose the right storage service for the right use case. This session shows the range of AWS choices - object storage to block storage - that are available to you. We include specifics about real-world deployments from customers who are using Amazon S3, Amazon EBS, Amazon Glacier, and AWS Storage Gateway.
Speakers:
Matt McClean, AWS Solutions Architect
Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, re-sizable capacity for an industry-standard relational database and manages common database administration tasks.
This is an introduction to PostgreSQL that provides a brief overview of PostgreSQL's architecture, features, and ecosystem. It was delivered at NYLUG on Nov 24, 2014.
http://www.meetup.com/nylug-meetings/events/180533472/
This presentation is for those of you who are interested in moving your on-prem SQL Server databases and servers to Azure virtual machines (VMs) in the cloud so you can take advantage of all the benefits of being in the cloud. This is commonly referred to as a “lift and shift” as part of an Infrastructure-as-a-Service (IaaS) solution. I will discuss the various Azure VM sizes and options, migration strategies, storage options, high availability (HA) and disaster recovery (DR) solutions, and best practices.
One of the major trends in data warehousing/data engineering is the transition from click-based ETL tools to using code for defining data pipelines. Nowadays, the vast majority of projects either start with a set of simple shell/bash scripts or with platforms such as Luigi or Apache Airflow, with the latter clearly becoming the dominant player. In the past 6 years, Project A also followed this approach when building data warehouses for more than 20 of its portfolio companies, and we are now open-sourcing the underlying infrastructure (https://github.com/mara). Basically, it is a lightweight, opinionated Airflow, with a focus on transparency and complexity reduction. In this talk, I will guide you through some of the design decisions behind the platform and some general learnings for setting up successful data engineering teams.
When building microservices with containers, you need an effective way to manage those containers and services. This session introduces Amazon EC2 Container Service and EC2 Container Registry, which let you flexibly manage and monitor container environments. We also take a detailed look at effective resource and log management, as well as microservice management, in Amazon ECS/ECR environments.
Business competition is a battleground where companies clash over innovative technology, and AI/ML is the most impactful of these innovations. This session introduces AWS AI/ML services that can drive business innovation and, drawing on real customer cases, looks at hyper-personalization, customer experience innovation, document processing, demand forecasting, and more. Finally, we discuss the considerations and preparation needed to deliver responsible AI.
State, Local and Education customers are using the AWS cloud to enable faster disaster recovery of their mission critical IT systems without incurring the infrastructure expense of a second physical site. Join us for an informative webinar on how AWS cloud supports many popular disaster recovery (DR) architectures from “pilot light” environments that are ready to scale up at a moment’s notice to “hot standby” environments that enable rapid failover. With infrastructure centers in 10 regions around the world, AWS provides a set of cloud-based DR services that enable rapid recovery of your IT infrastructure and data.
In the event of a disaster, you need to be able to recover lost data quickly to ensure business continuity. For critical applications, keeping your time to recover and data loss to a minimum and optimizing your overall capital expense can be challenging. This session presents AWS features and services along with disaster recovery architectures that you can leverage when building highly available and disaster-resilient strategies.
by Mahesh Pakal, AWS
PostgreSQL is a powerful, enterprise class open source object-relational database system with an emphasis on extensibility and standards-compliance. PostgreSQL boasts many sophisticated features and runs stored procedures in more than a dozen programming languages. We’ll explore the advantages and limitations of PostgreSQL, examples of where it is best suited for use, and examples of who is using PostgreSQL to power their applications.
Azure SQL Database Managed Instance is a new flavor of Azure SQL Database that is a game changer. It offers near-complete SQL Server compatibility and network isolation to easily lift and shift databases to Azure (you can literally back up an on-premises database and restore it into an Azure SQL Database Managed Instance). Think of it as an enhancement to Azure SQL Database that is built on the same PaaS infrastructure and maintains all its features (e.g., active geo-replication, high availability, automatic backups, database advisor, threat detection, intelligent insights, vulnerability assessment, etc.) but adds support for databases up to 35 TB, VNET, SQL Agent, cross-database querying, replication, etc. So, you can migrate your databases from on-prem to Azure with very little migration effort, which is a big improvement over the current Singleton or Elastic Pool flavors, which can require substantial changes.
KB Kookmin Card - A Cloud-Based Analytics Platform Innovation Journey - Speakers: Changyong Park (박창용), Manager, Data Strategy Division, AI Innovation Department, KB Kookmin Card | Byungeok Kang (강병억), Soluti... - Amazon Web Services Korea
An on-premises analytics platform faces limitations on many fronts: the cost of adding capacity, the cost of managing resources, and the lead time for procuring new resources and configuring environments. To overcome the limitations of its existing analytics platform while creating synergy with it, KB Kookmin Card designed and adopted a cloud-based analytics platform. This case study walks through KB Kookmin Card's data innovation journey and the know-how gained along the way.
Building an MLOps Pipeline on Amazon Forecast to Automate Time-Series Forecasting - Juyoung Kim (김주영) and Dongmin Lee (이동민), AWS Solutions Architects :... - Amazon Web Services Korea
AI and machine learning projects aim to continuously deliver optimal results by iterating over the entire process, from data ingestion through training to model validation and serving. In this hands-on lab, you will learn how to build an automated MLOps pipeline that lets Amazon Forecast continuously train on incoming data and serve time-series forecasting models.
Building a Complex, Real-Time Data Management Application - Jonathan Katz
Congratulations: you've been selected to build an application that will manage whether or not the rooms for PGConf.EU are being occupied by a session!
On the surface, this sounds simple, but we will be managing the rooms of PGConf.EU, so we know that a lot of people will be accessing the system. Therefore, we need to ensure that the system can handle all of the eager users that will be flooding the PGConf.EU website checking to see what availability each of the PGConf.EU rooms has.
To do this, we will explore the following PostgreSQL features:
* Data types and their functionality, such as:
* Date/Time types
* Ranges
* Indexes such as:
* GiST
* SP-GiST
* Common Table Expressions and Recursion
* Set generating functions and LATERAL queries
* Functions and PL/pgSQL
* Triggers
* Logical decoding and streaming
We will be writing our application primarily with SQL, though we will sneak in a little bit of Python and use Kafka to demonstrate the power of logical decoding.
At the end of the presentation, we will have a working application, and you will be happy knowing that you provided a wonderful user experience for all PGConf.EU attendees made possible by the innovation of PostgreSQL!
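The range-type and GiST index combination listed above is the core trick for a reservation system like this: a single exclusion constraint rejects double-bookings at the database level. A minimal sketch in SQL follows; the table, columns, and data are illustrative, not taken from the talk.

```sql
-- btree_gist is needed to mix a scalar column (room_id) with a
-- range column (reserved) in one GiST exclusion constraint.
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_reservation (
    room_id  int NOT NULL,
    reserved tstzrange NOT NULL,
    -- Reject any two reservations for the same room whose
    -- time ranges overlap (&&).
    EXCLUDE USING gist (room_id WITH =, reserved WITH &&)
);

INSERT INTO room_reservation VALUES
    (1, tstzrange('2019-10-15 09:00+00', '2019-10-15 10:00+00'));

-- A conflicting booking fails with an exclusion-constraint violation:
-- INSERT INTO room_reservation VALUES
--     (1, tstzrange('2019-10-15 09:30+00', '2019-10-15 10:30+00'));
```

Because the constraint is enforced by the GiST index itself, every concurrent writer gets a consistent answer without application-level locking.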
A detailed look at the Amazon EKS architecture, including CNI/networking, IAM, provisioning, the shared responsibility model, Project Calico, load balancing, logging/metrics, and CI/CD using AWS CodePipeline, CodeCommit, CodeBuild, Lambda, Amazon ECR, and Parameter Store, plus the use of Spot Instances, which can yield savings of 70-90% versus conventional on-demand EC2 instances.
AWS offers you the ability to add additional layers of security to your data at rest in the cloud, providing access control as well as scalable and efficient encryption features. Flexible key management options allow you to choose whether to have AWS manage the encryption keys or to keep complete control over the keys yourself. In this session, you will learn how to secure data when using AWS services. We will discuss Key Management Service, S3, access controls, and database platform security features.
Breaking Down the Economics and TCO of Migrating to AWS - Toronto - Amazon Web Services
This session is for anyone interested in understanding the financial costs associated with migrating workloads to AWS. By presenting real cases from AWS Professional Services and directly from a customer, we explore how to measure value, improve the economics of a migration project, and manage migration costs and expectations through large-scale IT transformations. We’ll also look at automation tooling that can further assist and accelerate the migration process.
The success of application deployment in the cloud depends a lot on the architecture style, which in turn depends on your business needs. This presentation talks about commonly used architecture styles and business use cases.
Operating PostgreSQL at Scale with Kubernetes - Jonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concepts and technologies have made their way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
All this sounds great, but if you are new to the world of containers, it can be very overwhelming to find a place to start. In this talk, which centers around demos, we will see how you can get PostgreSQL up and running in a containerized environment with some advanced sidecars in only a few steps! We will also see how it extends to a larger production environment with Kubernetes, and what the future holds for PostgreSQL in a containerized world.
We will cover the following:
* Why containers are important and what they mean for PostgreSQL
* Create a development environment with PostgreSQL, pgadmin4, monitoring, and more
* How to use Kubernetes to create your own "database-as-a-service"-like PostgreSQL environment
* Trends in the container world and how it will affect PostgreSQL
At the conclusion of the talk, you will understand the fundamentals of how to use container technologies with PostgreSQL and be on your way to running a containerized PostgreSQL environment at scale!
Using PostgreSQL With Docker & Kubernetes - July 2018 - Jonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concepts and technologies have made their way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing PostgreSQL along with pgadmin4 and monitoring
- Running PostgreSQL on Kubernetes with a Demo
- Trends in the container world and how it will affect PostgreSQL
Sanger, upcoming OpenStack for Bio-informaticians - Peter Clapham
Delivery of a new Bio-informatics infrastructure at the Wellcome Trust Sanger Center. We include how to programmatically create, manage, and provide provenance for images used both at Sanger and elsewhere using open source tools and continuous integration.
Cask Webinar
Date: 08/10/2016
Link to video recording: https://www.youtube.com/watch?v=XUkANr9iag0
In this webinar, Nitin Motgi, CTO of Cask, walks through the new capabilities of CDAP 3.5 and explains how your organization can benefit.
Some of the highlights include:
- Enterprise-grade security - Authentication, authorization, secure keystore for storing configurations. Plus integration with Apache Sentry and Apache Ranger.
- Preview mode - Ability to preview and debug data pipelines before deploying them.
- Joins in Cask Hydrator - Capabilities to join multiple data sources in data pipelines
- Real-time pipelines with Spark Streaming - Drag & drop real-time pipelines using Spark Streaming.
- Data usage analytics - Ability to report application usage of data sets.
- And much more!
HPC and Cloud Distributed Computing, as a Journey - Peter Clapham
Introducing an internal cloud brings new paradigms, tools, and infrastructure management. When placed alongside traditional HPC, the new opportunities are significant. But getting to the new world of micro-services, autoscaling, and auto-healing is a journey that cannot be achieved in a single step.
During this webinar, we will review best practices and lessons learned from working with large and mid-size companies on their deployment of PostgreSQL. We will explore the practices that helped industry leaders move through these stages quickly, and get as much value out of PostgreSQL as possible without incurring undue risk.
We have identified a set of levers that companies can use to accelerate their success with PostgreSQL:
- Application Tiering
- Collaboration between DBAs and Development Teams
- Evangelizing
- Standardization and Automation
- Balance of Migration and New Development
Get a glimpse of the main features supported in Nuxeo Platform LTS 2015.
With this LTS version of the Nuxeo Platform, we’re changing how we assign product version names and numbers. The name for each LTS version is now based on the release year. Nuxeo Platform LTS 2015 is the result of the four Fast Track releases throughout the past year.
Highlights of Nuxeo Platform LTS 2015 include:
- Nuxeo Live Connect: Native Integration with Google Drive & Dropbox
- Content Analytics & Data Visualisation
- Elasticsearch: API Passthrough, Hints for NXQL, Security
- Massive Scalability with MongoDB Integration
- New Document Viewer
- Automation Scripting
- Nuxeo Drive 2
- Automated Media Conversions
In this open marketing meeting, we discuss the major features for the Grizzly release, coming April 4, as well as a preview of the Summit and upcoming events.
QuerySurge Slide Deck for Big Data Testing Webinar - RTTS
This is a slide deck from QuerySurge's Big Data Testing webinar.
Learn why testing is pivotal to the success of your Big Data strategy.
Learn more at www.querysurge.com
The growing variety of new data sources is pushing organizations to look for streamlined ways to manage complexities and get the most out of their data-related investments. The companies that do this correctly are realizing the power of big data for business expansion and growth.
Learn why testing your enterprise's data is pivotal for success with big data, Hadoop and NoSQL. Learn how to increase your testing speed, boost your testing coverage (up to 100%), and improve the level of quality within your data warehouse - all with one ETL testing tool.
This information is geared towards:
- Big Data & Data Warehouse Architects
- ETL Developers
- ETL Testers, Big Data Testers
- Data Analysts
- Operations teams
- Business Intelligence (BI) Architects
- Data Management Officers & Directors
You will learn how to:
- Improve your Data Quality
- Accelerate your data testing cycles
- Reduce your costs & risks
- Provide a huge ROI (as high as 1,300%)
Geek Sync | Deployment and Management of Complex Azure Environments - IDERA Software
You can watch the replay of this Geek Sync webinar in the IDERA Resource Center: http://ow.ly/pg7N50A4svf.
Today's data management professional is finding their landscape changing. They have multiple database platforms to manage, multi-OS environments and everyone wants it now.
Join IDERA and Kellyn Pot’Vin-Gorman as she discusses the power of auto deployment in Azure when faced with complex environments, along with tips to increase the knowledge you need at the speed of light. Kellyn will cover scripting basics, advanced Portal features, opportunities to lessen the learning curve, and how multi-platform and multi-tier doesn't have to mean multi-cloud.
Attendees can expect to learn how to build automation scripts efficiently, even if you have little scripting experience, and how to work with Azure automation deployments. This session will allow you to begin building a repository of multi-platform development scripts to use as needed.
About Kellyn: Kellyn Pot’Vin-Gorman is a member of the Oak Table Network and an IDERA ACE and Oracle ACE Director alumna. She is the newest Technical Solution Professional in Power BI with AI in the EdTech group at Microsoft. Kellyn is known for her extensive work with multi-database platforms, DevOps, cloud migrations, virtualization, visualizations, scripting, environment optimization tuning, automation, and architecture design. She has spoken at numerous technical conferences for Oracle, Big Data, DevOps, Testing, and SQL Server. Her blog, http://dbakevlar.com, and her social media activity under the handle DBAKevlar are well respected for their insight and content.
New enhancements for security and usability in EDB 13EDB
EDB 13 is here and it enhances our flagship database server and tools. This webinar will explore its security, usability, and portability updates. Join us to learn how EDB 13 can help you improve your PostgreSQL productivity and data protection.
Webinar highlights include:
- New security features such as SCRAM and the encryption of database passwords and traffic between Failover Manager agents
- Usability updates that automate partitioning, verify backup integrity, and streamline the management of failover and backups
- Portability improvements that simplify running PostgreSQL across on-premises and cloud environments
Robert Bates, SVP Sales Engineering of Crunchy Data explains how you can tackle Data Gravity, Kubernetes, and strategies/best practices to run, scale, and leverage stateful containers in production.
Agile Oracle to PostgreSQL migrations (PGConf.EU 2013)Gabriele Bartolini
Migrating an Oracle database to Postgres is never an automated operation. And it rarely (never?) involves just the database. Experience brought us to develop an agile methodology for the migration process, involving schema migration, data import, migration of procedures and queries, up to the generation of unit tests for QA.
Pitfalls, technologies, and main migration opportunities will be outlined, focusing on reducing the total cost of ownership and management of a database solution over the medium-to-long term (without reducing quality and business continuity requirements).
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio - Alluxio, Inc.
Alluxio Bay Area Meetup March 14th
Join the Alluxio Meetup group: https://www.meetup.com/Alluxio
Alluxio Community slack: https://www.alluxio.org/slack
Vectors are the new JSON in PostgreSQL (SCaLE 21x) - Jonathan Katz
Vectors are a centuries-old, well-studied mathematical concept, yet they pose many challenges around efficient storage and retrieval in database systems. The heightened ease-of-use of AI/ML has led to a surge of interest in storing vector data alongside application data, leading to some unique challenges. PostgreSQL has seen this story before with JSON, when JSON became the lingua franca of the web. So how can you use PostgreSQL to manage your vector data, and what challenges should you be aware of?
In this session, we'll review what vectors are, how they are used in applications, and what users are looking for in vector storage and search systems. We'll then see how you can search for vector data in PostgreSQL, including looking at best practices for using pgvector, an extension that adds additional vector search capabilities to PostgreSQL. Finally, we'll review ongoing development in both PostgreSQL and pgvector that will make it easier and more performant to search vector data in PostgreSQL.
There are parallels between storing JSON data in PostgreSQL and storing vectors that are produced from AI/ML systems. This lightning talk briefly covers the similarities in use-cases in storing JSON and vectors in PostgreSQL, shows some of the use-cases developers have for querying vectors in Postgres, and some roadmap items for improving PostgreSQL as a vector database.
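The pgvector workflow described above fits in a few lines of SQL. This is a minimal sketch assuming the pgvector extension is installed; the table, dimension, and data are illustrative.

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(3)  -- real embeddings are typically hundreds of dimensions
);

INSERT INTO documents (content, embedding) VALUES
    ('first doc',  '[1, 0, 0]'),
    ('second doc', '[0, 1, 0]');

-- Nearest-neighbor search by Euclidean distance (the <-> operator);
-- an ivfflat or hnsw index can accelerate this at scale.
SELECT content
FROM documents
ORDER BY embedding <-> '[1, 0.1, 0]'
LIMIT 1;
```

As with JSON a decade earlier, the appeal is keeping vector search next to the application data it describes, inside the same transactional store.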
This talk explores PostgreSQL 15 enhancements (along with some history) and looks at how they improve developer experience (MERGE and SQL/JSON), optimize support for backups and compression, logical replication improvements, enhanced security and performance, and more.
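As one concrete taste of the developer-experience improvements above: MERGE, new in PostgreSQL 15, folds an upsert-style reconciliation into a single statement. The tables below are hypothetical, not from the talk.

```sql
-- Reconcile a staging feed into a target table in one command:
-- update rows that already exist, insert the rest.
MERGE INTO inventory AS i
USING incoming_stock AS s
    ON i.sku = s.sku
WHEN MATCHED THEN
    UPDATE SET quantity = i.quantity + s.quantity
WHEN NOT MATCHED THEN
    INSERT (sku, quantity) VALUES (s.sku, s.quantity);
```

Before MERGE, the same logic typically required INSERT ... ON CONFLICT or procedural code.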
Build a Complex, Realtime Data Management App with Postgres 14! - Jonathan Katz
Congratulations: you've been selected to build an application that will manage reservations for rooms!
On the surface, this sounds simple, but you are building a system for managing a high traffic reservation web page, so we know that a lot of people will be accessing the system. Therefore, we need to ensure that the system can handle all of the eager users that will be flooding the website checking to see what availability each room has.
Fortunately, PostgreSQL is prepared for this! And even better, we will be using Postgres 14 to make the problem even easier!
We will explore the following PostgreSQL features:
* Data types and their functionality, such as:
* Data/Time types
* Ranges / Multiranges
* Indexes such as:
* GiST
* Common Table Expressions and Recursion (though multiranges will make things easier!)
* Set generating functions and LATERAL queries
* Functions and PL/pgSQL
* Triggers
* Logical decoding and streaming
We will be writing our application primarily in SQL, though we will sneak in a little bit of Python and use Kafka to demonstrate the power of logical decoding.
At the end of the presentation, we will have a working application, and you will be happy knowing that you provided a wonderful user experience for all users made possible by the innovation of PostgreSQL!
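The range and multirange features listed above map directly onto the reservations problem. A minimal sketch of the core idea, with hypothetical table and column names (not from the talk itself); it assumes PostgreSQL 14+ for multiranges and the btree_gist extension for the exclusion constraint:

```sql
-- Sketch only: table and column names are illustrative.
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_reservations (
    room_id int NOT NULL,
    reserved_during tstzrange NOT NULL,
    -- Prevent two overlapping reservations for the same room.
    EXCLUDE USING gist (room_id WITH =, reserved_during WITH &&)
);

-- Find the free gaps in a day by subtracting the booked ranges from the
-- whole day, using a multirange (Postgres 14+):
SELECT tstzmultirange(tstzrange('2024-06-01 00:00+00', '2024-06-02 00:00+00'))
     - range_agg(reserved_during) AS free_time
FROM room_reservations
WHERE room_id = 1;
```

The exclusion constraint pushes the concurrency problem into the database itself: two eager users can never double-book a room, no matter how many web servers are hammering it.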
Get Your Insecure PostgreSQL Passwords to SCRAM by Jonathan Katz
Passwords: they just seem to work. You connect to your PostgreSQL database and you are prompted for your password. You type in the correct character combination, and presto! you're in, safe and sound.
But what if I told you that all was not as it seemed. What if I told you there was a better, safer way to use passwords with PostgreSQL? What if I told you it was imperative that you upgraded, too?
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), defined in RFC 5802, as a way to securely authenticate passwords. The SCRAM algorithm lets a client and server validate a password, using a series of cryptographic methods, without ever sending the password to each other, whether in plaintext or hashed form.
In this talk, we will look at:
* A history of the evolution of password storage and authentication in PostgreSQL
* How SCRAM works with a step-by-step deep dive into the algorithm (and convince you why you need to upgrade!)
* SCRAM channel binding, which helps prevent MITM attacks during authentication
* How to safely set and modify your passwords, as well as how to upgrade to SCRAM-SHA-256 (which we will do live!)
all of which will be explained by some adorable elephants and hippos!
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL driver supports it, how to upgrade your passwords to SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
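The upgrade path the talk describes can be sketched in a few statements; the role name here is hypothetical, and the pg_hba.conf line is just one example policy:

```sql
-- 1. Hash new passwords with SCRAM-SHA-256 instead of md5:
SET password_encryption = 'scram-sha-256';

-- 2. Re-set each password so it is stored as a SCRAM verifier
--    (in psql, \password prompts interactively so the password
--     does not end up in server logs):
ALTER ROLE app_user PASSWORD 'change-me';

-- 3. Check which roles still carry legacy md5 verifiers:
SELECT rolname FROM pg_authid WHERE rolpassword LIKE 'md5%';

-- 4. Finally, require SCRAM at connection time in pg_hba.conf, e.g.:
--    host  all  all  10.0.0.0/8  scram-sha-256
```

Only once every role has been re-hashed should the pg_hba.conf method be switched, or clients with old md5 verifiers will be locked out.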
Safely Protect PostgreSQL Passwords - Tell Others to SCRAM by Jonathan Katz
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), defined in RFC 5802, as a way to securely authenticate passwords. The SCRAM algorithm lets a client and server validate a password, using a series of cryptographic methods, without ever sending the password to each other, whether in plaintext or hashed form.
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL driver supports it, how to upgrade your passwords to SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
An Introduction to Using PostgreSQL with Docker & Kubernetes by Jonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concepts and technologies have made their way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing a PostgreSQL container
- Extending your setup with a pgadmin4 container
- Container orchestration: What this means, and how to use Kubernetes to leverage database-as-a-service with PostgreSQL
- Trends in the container world and how it will affect PostgreSQL
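The "PostgreSQL container plus pgadmin4 container" setup the talk covers can be sketched with the official images; the image tags, ports, and credentials below are illustrative choices, not prescriptions from the talk:

```shell
# Run PostgreSQL with a named volume so data survives container restarts:
docker volume create pgdata
docker run -d --name pg \
  -e POSTGRES_PASSWORD=change-me \
  -v pgdata:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:16

# Add a pgAdmin 4 container alongside it:
docker run -d --name pgadmin \
  -e PGADMIN_DEFAULT_EMAIL=admin@example.com \
  -e PGADMIN_DEFAULT_PASSWORD=change-me \
  -p 8080:80 \
  dpage/pgadmin4
```

The named volume is the key detail for a stateful workload: the container is disposable, the data directory is not.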
Developing and Deploying Apps with the Postgres FDW by Jonathan Katz
I couldn't wait to use the Postgres Foreign Data Wrapper (postgres_fdw) in a project; imagine being able to read and write data to many databases all from a single database! I finally found a project where it made sense to use this amazing technology.
I mapped out my architecture and began to code, and realized there were some things that did not work as expected: I could not call remote functions or insert into a table with a serial primary key and have it autoupdate. I found workarounds (which I will share), so the project went on.
We tested the setup, everything seemed to work well, and then we went to deploy to production. And then the real fun began.
Despite the title, I still love the Postgres FDW but wanted to provide some cautionary tales from a hybrid developer/DBA perspective on how to properly use them in your working environment. This talk will cover:
* Basic Postgres FDW setup in a development environment vs. production environment
* Handling some common FDW use cases that you think are trivial but are not
* Working with advanced Postgres constructs such as schemas and sequences with FDWs
* Putting it all together to make sure your production application is safe with your FDWs
* ...and when you really, really need to make a remote call and it is not supported by a FDW, how to do that too!
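For orientation, the basic postgres_fdw setup the talk builds on looks like this; the server, schema, and credential names are hypothetical:

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER remote_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', port '5432', dbname 'appdb');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_db
    OPTIONS (user 'app', password 'secret');

-- Pull in the remote table definitions rather than declaring them by hand:
CREATE SCHEMA remote_app;
IMPORT FOREIGN SCHEMA public FROM SERVER remote_db INTO remote_app;

-- Queries against the foreign tables execute on the remote server:
SELECT count(*) FROM remote_app.orders;
```

The surprises the talk warns about start right here: for example, an INSERT into a foreign table backed by a serial column does not behave the same as a local one.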
What's the great thing about a database? Why, it stores data of course! However, one feature that makes a database useful is the different data types that can be stored in it, and the breadth and sophistication of the data types in PostgreSQL is second-to-none, including some novel data types that do not exist in any other database software!
This talk will take an in-depth look at the special data types built right into PostgreSQL version 9.4, including:
* INET types
* UUIDs
* Geometries
* Arrays
* Ranges
* Document-based Data Types:
* Key-value store (hstore)
* JSON (text [JSON] & binary [JSONB])
We will also have some cleverly concocted examples to show how all of these data types can work together harmoniously.
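In that spirit, here is one small concocted example of the types above working together; the table and key names are invented for illustration:

```sql
CREATE TABLE events (
    id        uuid PRIMARY KEY,
    source_ip inet,
    seats     int4range,
    tags      text[],
    payload   jsonb
);

-- Range overlap (&&), array containment (@>), and JSONB containment (@>)
-- compose in a single WHERE clause:
SELECT id
FROM events
WHERE seats && int4range(10, 20)
  AND tags @> ARRAY['vip']
  AND payload @> '{"status": "confirmed"}';
```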
Accelerating Local Search with PostgreSQL (KNN-Search) by Jonathan Katz
KNN-GiST indexes were added in PostgreSQL 9.1 and greatly accelerate some common queries in the geospatial and textual search realms. This presentation will demonstrate the power of KNN-GiST indexes on geospatial and text searching queries, but also their present limitations, drawn from some of my experiments. I will also discuss some of the theory behind KNN (k-nearest neighbor) as well as some of the applications this feature can be applied to.
To see a version of the talk given at PostgresOpen 2011, please visit http://www.youtube.com/watch?v=N-MD08QqGEM
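The core KNN-GiST pattern can be sketched in a few lines; the table and coordinates below are hypothetical:

```sql
CREATE TABLE places (
    name     text,
    location point
);
CREATE INDEX places_location_idx ON places USING gist (location);

-- The <-> distance operator lets the GiST index return rows in
-- nearest-first order, instead of computing the distance to every row:
SELECT name
FROM places
ORDER BY location <-> point '(40.7, -73.9)'
LIMIT 5;
```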
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies by Jonathan Katz
All data is relational and can be represented through relational algebra, right? Perhaps, but there are other ways to represent data, and the PostgreSQL team continues to work on making it easier and more efficient to do so!
With the upcoming 9.4 release, PostgreSQL is introducing the "JSONB" data type, which allows for fast, compressed storage of JSON-formatted data and quick retrieval. And JSONB comes with all the benefits of PostgreSQL, like its data durability, MVCC, and of course, access to all the other data types and features in PostgreSQL.
How fast is JSONB? How do we access data stored with this type? What can it do with the rest of PostgreSQL? What can't it do? How can we leverage this new data type and make PostgreSQL scale horizontally? Follow along with our presentation as we try to answer these questions.
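As a taste of how JSONB data is accessed, here is a minimal sketch; the table and document shape are invented for illustration:

```sql
CREATE TABLE profiles (
    id  serial PRIMARY KEY,
    doc jsonb
);
-- A GIN index accelerates containment (@>) and key-existence (?) queries:
CREATE INDEX profiles_doc_idx ON profiles USING gin (doc);

INSERT INTO profiles (doc)
VALUES ('{"name": "Alice", "interests": ["databases", "music"]}');

-- ->> extracts a field as text; @> tests JSON containment:
SELECT doc->>'name'
FROM profiles
WHERE doc @> '{"interests": ["databases"]}';
```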
PostgreSQL comes built-in with a variety of indexes, some of which are further extensible to build powerful new indexing schemes. But what are all these index types? What are some of the special features of these indexes? What are the size & performance tradeoffs? How do I know which ones are appropriate for my application?
Fortunately, this talk aims to answer all of these questions as we explore the whole family of PostgreSQL indexes: B-tree, expression, GiST (of all flavors), GIN and how they are used in theory and practice.
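Two of those index flavors in miniature, on a hypothetical users table (the pg_trgm trigram index is one common GIN application, offered here as an illustrative assumption):

```sql
CREATE TABLE users (
    id    serial PRIMARY KEY,
    email text NOT NULL
);

-- B-tree expression index: supports case-insensitive lookups such as
--   WHERE lower(email) = lower($1)
CREATE INDEX users_email_lower_idx ON users (lower(email));

-- GIN trigram index for substring/LIKE searches:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX users_email_trgm_idx ON users USING gin (email gin_trgm_ops);
```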
Key Trends Shaping the Future of Infrastructure.pdf by Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... by UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Transcript: Selling digital books in 2024: Insights from industry leaders - T... by BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Connector Corner: Automate dynamic content and events by pushing a button by DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Essentials of Automations: Optimizing FME Workflows with Parameters by Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Epistemic Interaction - tuning interfaces to provide information for AI support by Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... by James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Neuro-symbolic is not enough, we need neuro-*semantic* by Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
GraphRAG is All You need? LLM & Knowledge Graph by Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Elevating Tactical DDD Patterns Through Object Calisthenics by Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
JMeter webinar - integration with InfluxDB and Grafana by RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
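The pipeline discussed above typically starts from a headless JMeter run, with a Backend Listener inside the test plan streaming live metrics to InfluxDB for Grafana to chart; the file names here are placeholders:

```shell
# Non-GUI run: the .jmx test plan carries the Backend Listener configuration.
jmeter -n -t test.jmx -l results.jtl
```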
2. Market Leading Data Security
Crunchy Data is the leader in PostgreSQL security. Common Criteria certification and essential security enhancements make Crunchy Certified PostgreSQL the trusted open source PostgreSQL distribution for the enterprise.
Cloud Ready Data Management
Whether deploying to public or private clouds, Crunchy Data provides market leading, open source, Kubernetes-based technology solutions, giving your team the choice and flexibility for how you deploy your data.
Leader in Open Source Enterprise PostgreSQL
Crunchy Data gives organizations the technology, support, and confidence to enjoy the power and efficiency of open source PostgreSQL.
3. • Vice President of Platform Engineering, Crunchy Data
• Previously: Engineering leadership in startups
• PostgreSQL Major Contributor
• Advocacy & various committees for PostgreSQL Global Development Group
• @postgresql + .org content
• Director, PgUS
• Conference organization + speaking
• @jkatz05
About Me
5. • The PostgreSQL relational database is over 30 years old and has been open source since 1996
• Open source like Linux: no single vendor; flexible license; no one can own PostgreSQL
• Recent PostgreSQL releases have brought it to feature parity with popular proprietary databases
• PostgreSQL's extensibility allows it to handle NoSQL workloads, accommodate developer language preferences, and provide full transaction safety and data integrity
• PostgreSQL’s stability, consistency, and robustness have made it the trusted open source database in the enterprise.
PostgreSQL: An Oldie Stays a Goodie
6. • Stateful workloads have only one job: maintain state
• If you lose or corrupt all your data, you're done
• Specific knowledge is required about a stateful application (e.g. PostgreSQL) to perform state modification operations such as provisioning, failover, or recovery. A Kubernetes Operator can do this.
• Steps to do state modification operations range from simple to tedious
• Adding a user to a database is simple
• Adding a user to 1,000 databases is tedious
• Proper Operator design allows for autonomous managed workloads: distributed consensus HA, systematic backups, etc.
The Need for Kubernetes Operators
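The "simple" single-database step from the slide really is a couple of statements; the role and database names below are hypothetical:

```sql
CREATE ROLE app_user LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE appdb TO app_user;
-- Repeating exactly this, consistently, across 1,000 clusters is the
-- tedious part that an Operator automates.
```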
7. • Open Source! GA March 2017, Apache 2.0 Licensed
• https://github.com/CrunchyData/postgres-operator
• Current version: 4.4.1; 4.5.0 beta 1
• OperatorHub.io: Level 5 “Auto Pilot” Capabilities
• Supports essential database-as-a-service functionality:
• Provisioning (create, delete clusters, clone existing clusters)
• High-availability (distributed consensus based, leverages pod anti-affinity, supports synchronous replication)
• Elasticity (add/remove replicas)
• Disaster recovery (backup, restore geared towards terabyte scale DBs, automatically schedule backups)
• Administration (PostgreSQL software updates, user management)
• Connection pooling via pgBouncer
• Monitoring via pgMonitor
Crunchy PostgreSQL Operator
8. • Automation: Complex, multi-step tasks reduced to one-line commands
• Standardization: Many customizations, same workflow
• Ease-of-Use: Simple API. Can add on with CLI, UI
• Scale
• Provision & manage quickly amongst thousands of instances
• High-Availability, Load balancing, disaster recovery, security policies, deployment specifications
• Security: Sandboxed environments, RBAC, mass grant/revoke policies
• Flexibility: Run your stateful workload in any environment where you have a node
Why Use an Operator?
16. Crunchy PostgreSQL Operator: Production Deployments from a Single Command
pgo create cluster hacluster --metrics --pgbadger --pgbouncer --pgbackrest-storage-type="local,s3"