What do you give up – and gain – when moving to a fully-managed cloud database?
Now that Database-as-a-Service (DBaaS) offerings have been “battle tested” in production, how is the reality matching up to the expectation? What can teams thinking of adopting a fully managed DBaaS learn from teams who have years of experience working with this deployment model?
Join this webinar to dive into the reality of working with various high-performance DBaaS offerings. We’ll cover the following topics, all supported with real-world examples:
- Developer flexibility
- Cost variability
- Security & privacy
- Performance impact
- Transparency & troubleshooting
This is the presentation I delivered at the Hadoop User Group Ireland meetup in Dublin on Nov 28, 2015. It covers, at a glance, the architecture of GPDB and, most importantly, its features. Apologies for the colors - SlideShare renders PDFs poorly.
Data Mesh is a new socio-technical approach to data architecture, first described by Zhamak Dehghani and popularised through a guest blog post on Martin Fowler's site.
Since then, community interest has grown, due to Data Mesh's ability to explain and address the frustrations that many organisations are experiencing as they try to get value from their data. The 2022 publication of Zhamak's book on Data Mesh further provoked conversation, as have the growing number of experience reports from companies that have put Data Mesh into practice.
So what's all the fuss about?
On one hand, Data Mesh is a new approach in the field of big data. On the other hand, Data Mesh is an application of the lessons we have learned from domain-driven design and microservices to a data context.
In this talk, Chris and Pablo will explain how Data Mesh relates to current thinking in software architecture and the historical development of data architecture philosophies. They will outline what benefits Data Mesh brings, what trade-offs it comes with and when organisations should and should not consider adopting it.
DAS Slides: Building a Future-State Data Architecture Plan - Where to Begin? - DATAVERSITY
With technology changing at an ever more rapid pace and business requirements ever-evolving to meet the needs of the market, building a future-state Data Architecture plan can be a challenge. Join this webinar to learn practical ways to balance technology and business needs as you develop your future-state architecture for the coming years.
Data Lakehouse, Data Mesh, and Data Fabric (r2) - James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a modern data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. They all may sound great in theory, but I'll dig into the concerns you need to be aware of before taking the plunge. I’ll also include use cases so you can see what approach will work best for your big data needs. And I'll discuss Microsoft's version of the data mesh.
Got data?… now what? An introduction to modern data platforms - JamesAnderson599331
What are Data Analytics Platforms? What decision points are necessary in creating a modern, unified analytics data platform? What benefits are there to building your analytics data platform on Google Cloud Platform? Susan Pierce walks us through it all.
Introduction to Data Governance
Seminar hosted by Embarcadero Technologies, where Christopher Bradley presented a session on Data Governance covering:
Drivers for Data Governance & Benefits
Data Governance Framework
Organization & Structures
Roles & responsibilities
Policies & Processes
Programme & Implementation
Reporting & Assurance
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013 - Amazon Web Services
Get the most out of Amazon Redshift by learning about cutting-edge data warehousing implementations. Desk.com, a Salesforce.com company, discusses how they maintain a large concurrent user base on their customer-facing business intelligence portal powered by Amazon Redshift. HasOffers shares how they load 60 million events per day into Amazon Redshift with a 3-minute end-to-end load latency to support ad performance tracking for thousands of affiliate networks. Finally, Aggregate Knowledge discusses how they perform complex queries at scale with Amazon Redshift to support their media intelligence platform.
The three main qualities that any firm seeks in its applications are flexibility, scalability, and ease of customization, and a SaaS architecture is well placed to satisfy all three. Learn about its architecture types, models, and advantages for a successful market expansion.
Review existing data management maturity models to identify core set of characteristics of an effective data maturity model:
DMBOK (Data Management Body of Knowledge) from DAMA (Data Management Association)
MIKE2.0 (Method for an Integrated Knowledge Environment) Information Maturity Model (IMM)
IBM Data Governance Council Maturity Model
Enterprise Data Management Council Data Management Maturity Model
Data Vault Modeling and Methodology introduction that I presented at a Montreal event in September 2011. It covers an introduction and overview of the Data Vault components for Business Intelligence and Data Warehousing. I am Dan Linstedt, the author and inventor of Data Vault Modeling and methodology.
If you use the images anywhere in your presentations, please credit http://LearnDataVault.com as the source (me).
Thank you kindly,
Daniel Linstedt
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou... - Splunk
With the acceleration of customer and business demands, site reliability engineers and IT Ops analysts now require operational visibility into their entire architecture, something that traditional APM tools, dev logging tools, and SRE tools aren’t equipped to provide. Observability enables you to inspect and understand your IT stack on premises and in the cloud(s); it’s no longer just about whether your system works (monitoring), but about being able to ask why it is not working (observability). This presentation will outline key steps to take to move from monitoring to observability.
Prov International - Our Service-Now ITOM Delivery Capabilities - Sonny Nnamchi (Ph.D)
ProV International, Inc. (www.provintl.com) is a global IT solution provider and a Service-now business partner with very strong ITOM services delivery capabilities that can help your organization meet or exceed its ITOM tools deployment and custom integration needs using our Service-now implementation best practices. Our dedicated IT Operations Management (ITOM) team has the required knowledge (certifications/accreditations) and hands-on experience needed to ensure your ITOM projects are delivered successfully. This presentation attempts to capture some of our capabilities and best practices in this regard. To learn more about how we can help you deliver and support a new or existing ITOM tools investment, contact us at info@provintl.com.
How We Reduced Performance Tuning Time by Orders of Magnitude with Database O...ScyllaDB
Doing performance tuning on a massively distributed database is never an easy task. This is especially true for TiDB, an open-source, cloud-native NewSQL database for elastic scale and real-time analytics, because it consists of multiple components and each component has plenty of metrics.
Like many distributed systems, TiDB uses Prometheus to store the monitoring and performance metrics and Grafana to visualize them. Thanks to these two open source projects, it is easy for TiDB developers to add monitoring and performance metrics. However, as the number of metrics grows, the learning curve becomes steeper for TiDB users trying to gain performance insights. In this talk, we will share how we measure latency in a distributed system using a top-down (holistic) approach, and why we introduced "tuning by database time" and "tuning by color" into TiDB. The new methodologies and Grafana dashboard reduce both the time and the expertise required for performance tuning by orders of magnitude.
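Since the approach above leans on standard Prometheus histograms, here is a minimal sketch of what such a latency drill-down looks like outside Grafana, querying Prometheus's HTTP API for a P99 estimate. The endpoint and metric name are illustrative assumptions, not TiDB's exact names.

```python
# Minimal sketch: ask Prometheus for a P99 latency estimate over 5 minutes.
# The metric name below is a hypothetical placeholder for a real histogram.
import requests

PROM = "http://localhost:9090/api/v1/query"
QUERY = (
    "histogram_quantile(0.99, "
    "sum(rate(tidb_query_duration_seconds_bucket[5m])) by (le))"
)

resp = requests.get(PROM, params={"query": QUERY}, timeout=10)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    ts, p99_seconds = series["value"]
    print(series["metric"], p99_seconds)
```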
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi... - Denodo
Watch Alberto's session from Fast Data Strategy on-demand here: https://buff.ly/2wByS41
Gartner’s recently published report “Data Catalogs Are the New Black in Data Management and Analytics” emphasizes the importance of data catalogs.
Watch this session to learn more about:
• The vision behind the Denodo Data Catalog
• How to maximize information value with the Denodo Data Catalog
• Why it is essential to combine data delivery with a data catalog
SQL is a popular database language for modern applications, given its flexibility in modelling workloads and how widely it is understood by developers. However, most modern applications running in the cloud require fault tolerance, the ability to scale out, and geographic distribution of data. These are hard to achieve with traditional SQL databases, which is paving the way for distributed SQL databases.
Google Spanner is arguably the world's first truly distributed SQL database. Given its fully decentralized architecture, it delivers higher performance and availability for geo-distributed SQL workloads than other specialized transactional databases such as Amazon Aurora. Now, there are a number of open source derivatives of Google Spanner such as YugaByte DB, CockroachDB and TiDB. This talk will focus on the common architectural paradigms that these databases are built on (using YugaByte DB as an example). Learn about the concepts these databases leverage, how to evaluate if these will meet your needs and the questions to ask to differentiate among these databases.
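To make the idea concrete: these databases present a single logical SQL database to the application while replicating and sharding data underneath. YugaByte DB's YSQL layer, for example, is PostgreSQL wire-compatible, so an ordinary PostgreSQL client can talk to the cluster. A minimal sketch, with the connection details and schema as placeholder assumptions:

```python
# Minimal sketch: a PostgreSQL client talking to a distributed SQL cluster.
# Port 5433 is YugaByte DB's default for YSQL; host and schema are assumptions.
import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5433,
                        user="yugabyte", dbname="yugabyte")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            id     BIGSERIAL PRIMARY KEY,
            region TEXT NOT NULL,
            total  NUMERIC NOT NULL
        )
    """)
    # Ordinary SQL; replication and sharding happen below this layer.
    cur.execute("INSERT INTO orders (region, total) VALUES (%s, %s)",
                ("eu-west", 42.50))
    cur.execute("SELECT region, count(*) FROM orders GROUP BY region")
    print(cur.fetchall())
conn.close()
```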
Data Vault 2.0 DeMystified with Dan Linstedt and WhereScapeWhereScape
Join Dan Linstedt and WhereScape to learn the benefits that Data Vault 2.0 offers to data warehousing teams, what it is and isn't, and how data vault automation can help teams implement Data Vault 2.0 more quickly and successfully.
How to apply machine learning into your CI/CD pipelineAlon Weiss
A quick introduction to AIOps, the business reasons why the CI/CD pipeline needs to constantly improve, and how this can be accomplished with data that's already available with existing Machine Learning and other algorithms.
Graph databases provide the ability to quickly discover and integrate key relationships between enterprise data sets. Business use cases such as recommendation engines, social networks, enterprise knowledge graphs, and more provide valuable ways to leverage graph databases in your organization. This webinar will provide an overview of graph database technologies, and how they can be used for practical applications to drive business value.
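To make the recommendation-engine use case concrete, here is a minimal sketch of the underlying traversal using the networkx library; the data and the "people who bought what you bought" heuristic are invented for illustration, and a native graph database would express the same walk as a graph query.

```python
# Minimal sketch: recommendations from a tiny user-item graph.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("alice", "book:dune"), ("alice", "book:foundation"),
    ("bob",   "book:dune"), ("bob",   "book:hyperion"),
])

def recommend(user):
    """Items bought by users who share at least one purchase with `user`."""
    owned = set(G.neighbors(user))
    recs = set()
    for item in owned:
        for peer in G.neighbors(item):      # users who also bought `item`
            recs.update(G.neighbors(peer))  # everything those peers bought
    return recs - owned

print(recommend("alice"))  # {'book:hyperion'}
```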
DataOps is a methodology and culture shift that brings the successful combination of development and operations (DevOps) to data processing environments. It breaks down silos between developers, data scientists, and operators, resulting in lean data feature development processes with quick feedback. In this presentation, we will explain the methodology, and focus on practical aspects of DataOps.
In this lecture we discuss data quality in general and data quality in Linked Data. This 50-minute lecture was given to masters students at Trinity College Dublin (Ireland), and had the following contents:
1) Defining Quality
2) Defining Data Quality - What, Why, Costs
3) Identifying problems early - using a simple semantic publishing process as an example
4) Assessing Linked (big) Data quality
5) Quality of LOD cloud datasets
References can be found at the end of the slides
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0) International License.
RWDG Webinar: Data Steward Definition and Other Data Governance Roles - DATAVERSITY
The role of the Data Steward is critical to the success of a Data Governance program. There are several approaches to Stewardship, including assigning people to be Data Stewards, identifying existing Data Stewards, and recognizing Data Stewards according to their relationship to the data they define, produce, and use. However, Stewards are only one of several Data Governance roles that must be considered.
In this month’s RWDG webinar, Bob Seiner will discuss several approaches to defining the role of the Data Steward, as well as the other roles necessary for Data Governance program success. Data Governance roles must include operational, tactical, strategic, and supporting levels of responsibility. Spend an hour with Bob as he shares a customizable Operating Model of Data Governance roles and responsibilities.
In this webinar, Bob will discuss:
• Several approaches to defining Data Stewards and Stewardship
• How to select the Stewardship approach that is right for you
• Different levels of Stewards required for a successful program
• An Operating Model of DG Roles that can be molded to fit in any culture
• Why the approach to defining DG roles can make or break the program
ML Products have become a prolific and integral part of taking the insights of Data Science from theory to reality. Oddly though, the path from conception to implementation is often unclear, with seemingly few similar examples to work from. The result is often a sea of agony between sliding deadlines, heroic efforts of people working through unforeseen challenges, and haphazard innovation. Each time a beautiful model makes its impact on the business bottom line, something worked. In this talk we present the ML Playbook. It pulls together the best aspects from a variety of successful ML Product launches into a cohesive strategy to Plan, Build, Test, Learn, and Release ML Products. We'll demonstrate the ML Playbook in action with the story of launching an alert monitoring product for the world's most powerful jet engines, the GE90-115B.
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018 - Amazon Web Services
In this session, we provide an overview of the PostgreSQL options available on AWS, and do a deep dive on Amazon Relational Database Service (Amazon RDS) for PostgreSQL, a fully managed PostgreSQL service, and Amazon Aurora, a PostgreSQL-compatible database with up to 3x the performance of standard PostgreSQL. Learn about the features, functionality, and many innovations in Amazon RDS and Aurora, which give you the background to choose the right service to solve different technical challenges, and the knowledge to easily move between services as your requirements change over time.
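As a small illustration of the managed-service model the session describes, here is a hedged sketch of provisioning an RDS PostgreSQL instance with boto3. The identifier, instance class, and credentials are placeholder assumptions; in practice, credentials belong in a secrets manager, not in code.

```python
# Minimal sketch: create a managed PostgreSQL instance on Amazon RDS.
import boto3

rds = boto3.client("rds", region_name="us-east-1")
rds.create_db_instance(
    DBInstanceIdentifier="example-pg",          # placeholder name
    Engine="postgres",
    DBInstanceClass="db.t3.medium",
    AllocatedStorage=20,                        # GiB
    MasterUsername="app_admin",
    MasterUserPassword="change-me-please",      # use a secrets manager instead
    MultiAZ=True,                               # managed standby and failover
)
# Block until the instance is ready to accept connections.
rds.get_waiter("db_instance_available").wait(
    DBInstanceIdentifier="example-pg")
```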
ITIL Practical Guide - Continual Service Improvement (CSI) - Axios Systems
To view this complimentary webcast in full, visit: http://forms.axiossystems.com/LP=272
This video provides a run-through of the lifecycle stage, which manages the day-to-day operation of IT services for the identification and reporting of interruptions in the delivery of services and the handling of service requests at agreed levels.
Data Warehouse or Data Lake, Which Do I Choose? - DATAVERSITY
Today’s data-driven companies have a choice to make – where do we store our data? As the move to the cloud continues to be a driving factor, the choice becomes either the data warehouse (Snowflake et al.) or the data lake (AWS S3 et al.). There are pros and cons to each approach. The data warehouse gives you strong data management with analytics, but it handles semi-structured and unstructured data poorly, tightly couples storage and compute, and often comes with expensive vendor lock-in. Data lakes, on the other hand, let you store all kinds of data and are extremely affordable, but they’re only meant for storage and by themselves provide no direct value to an organization.
Enter the Open Data Lakehouse, the next evolution of the data stack that gives you the openness and flexibility of the data lake with the key aspects of the data warehouse like management and transaction support.
In this webinar, you’ll hear from Ali LeClerc, who will discuss the data landscape and why many companies are moving to an open data lakehouse. Ali will share more perspective on how you should think about what fits best based on your use case and workloads, and how some real-world customers are using Presto, a SQL query engine, to bring analytics to the data lakehouse.
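To make the query-engine part concrete: lakehouse tables live in open formats in object storage, but the application simply issues SQL to the engine. Here is a minimal sketch using the Trino Python client (Trino is the renamed continuation of the original Presto project; a PrestoDB deployment uses a slightly different client). The host, catalog, and table names are placeholder assumptions.

```python
# Minimal sketch: SQL over lakehouse tables via Trino's DB-API client.
import trino

conn = trino.dbapi.connect(
    host="presto.example.internal", port=8080,   # placeholder endpoint
    user="analyst", catalog="hive", schema="lakehouse",
)
cur = conn.cursor()
cur.execute("""
    SELECT event_date, count(*) AS events
    FROM clickstream          -- open-format table in object storage
    GROUP BY event_date
    ORDER BY event_date DESC
    LIMIT 7
""")
for row in cur.fetchall():
    print(row)
```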
History of cloud, types of cloud, popular cloud service providers, business models around cloud (IaaS, PaaS, SaaS), distributed and parallel computing, Eucalyptus Cloud, Nimbus, OpenNebula, and CloudSim.
Elevate Your IT Operations: How DevOps as a Service Can Transform Your Business - basilmph
DevOps as a Service (DaaS) is a model where an external provider manages software development operations, including the necessary infrastructure, tools, and processes. Increasingly significant in the modern business landscape, DaaS fosters global accessibility, enabling remote teams to function seamlessly and collaborate in real time. DaaS automates routine tasks, freeing up resources for more productive activities.
Basics of cloud computing, including examples of SaaS, PaaS, and IaaS. The advantages and disadvantages are reviewed, as well as a plan for migrating to the cloud.
A perspective on cloud computing and enterprise SaaS applications - George Milliken
A perspective on Cloud computing and SaaS for Enterprise applications by a SaaS industry veteran.
Please make sure you read the speaker's notes; there's a significant amount of content there.
With extended support for SQL Server 2005 expiring on April 12, 2016, many organisations are beginning to look at their upgrade options.
This presentation is from a webinar broadcast in August 2015 hosted by SoftwareONE, Microsoft and Ridgian.
The presentation highlights key milestone dates, the potential impact of not upgrading prior to April 12, 2016 and upgrade options from SQL Server 2005 to 2014 and hybrid opportunities with Microsoft Azure.
Top Considerations When Deciding Between Cloud Apps, Cloud Infrastructure or ... - Datavail
Which cloud is right for you? OCI, AWS, Azure… This presentation looks at how the right cloud can move your company forward. It applies to Oracle: JDE, PeopleSoft, Hyperion, OBIEE, and EBS.
JD Edwards in the Cloud - Flipbook: What are your peers doing? - ManageForce
What’s Inside:
Get the facts in 15 minutes. Use the planning information to get started.
Benchmark
Learn what your peers are doing (OAUG survey)
57% are using cloud services, and the number is growing.
Triggers
Explore cloud adoption scenarios
Survey: The impetus is coming from IT, and 35% are seeing unexpected benefits.
Options
Navigate cloud adoption options
Everything "as-a-service" explained, along with private/hybrid/public options--independent of provider bias.
Plan
Orchestrate your move considering the whole stack
No two organizations are handling their infrastructure the same way, and complex variables are at play. Explore windows of opportunity for incremental progress and cross-organization drivers.
Resources
Define Point B and determine next steps
With so many different cloud providers and solutions, how can organizations know which one meets their needs? Check out this presentation and learn more about this essential decision.
Blue Crystal Solutions (BCS) Database Migration Services (DMS) helps you migrate or transform databases smoothly and securely. The service can be used in a variety of use cases and can provide you with new business capabilities with minimal risk and upfront investment.
Comprehensive Information on Software as a Service - HTS Hosting
Software as a service (SaaS) is a delivery model of cloud computing that is used by many business applications. It entails licensing software, which is centrally hosted, on a subscription basis.
Choosing DevOps as a Service for Outsourcing: A Decision-Maker's Guide - basilmph
Introducing DevOps as a Service, or DaaS, an innovative and agile approach to software development that programmatically blends development and operational tasks. DaaS has gained critical traction in the outsourcing world, primarily due to its ability to streamline the software development lifecycle (SDLC), hence increasing productivity and reducing time-to-market.
Organizations looking to the cloud now have more vendor offerings and architecture choices available to them than ever before. In order to correctly select and implement the most appropriate cloud-based DBMS architecture for their shops, technology pros must create and execute a well-thought-out, detailed analysis of the competing offerings.
In addition, they must consider the impact cloud-based DBMS systems, like any new architecture, will have on their support environment. Changes to policies and procedures, security controls, staff roles and responsibilities, change management processes, and support documentation must all be evaluated.
The internet has enabled different delivery models for business software, such as Software as a Service (SaaS). The emergence of this model has led to an important decision for many companies: whether to “buy” on-premise software or “rent” their technologies through a SaaS vendor.
While many enterprises consider cloud computing the savior of their data strategy, there is a process they should follow when looking to leverage database-as-a-service. This includes understanding their own data requirements, selecting the right cloud computing candidate, and then planning for the migration and operations. A huge number of issues and obstacles will inevitably arise, but fortunately best practices are emerging. This presentation will take you through the process of moving data to cloud computing providers.
Similar to DBaaS in the Real World: Risks, Rewards & Tradeoffs
Optimizing NoSQL Performance Through Observability - ScyllaDB
ScyllaDB has the potential to deliver impressive performance and scalability. The better you understand how it works, the more you can squeeze out of it. But before you squeeze, make sure you know what to monitor!
Watch our experienced Postgres developer work through monitoring and performance strategies that help him understand what mistakes he’s made moving to NoSQL. And learn with him as our database performance expert offers friendly guidance on how to use monitoring and performance tuning to get his sample Rust application on the right track.
This webinar focuses on using monitoring and performance tuning to discover and correct mistakes that commonly occur when developers move from SQL to NoSQL. For example:
- Common issues getting up and running with the monitoring stack
- Using the CQL optimizations dashboard
- Common issues causing high latency in a node
- Common issues causing replica imbalance
- What a healthy system looks like in terms of memory
- Key metrics to keep an eye on
This isn’t “Death-by-Powerpoint.” We’ll walk through problems encountered while migrating a real application from Postgres to ScyllaDB – and try to fix them live as well.
Event-Driven Architecture Masterclass: Challenges in Stream Processing - ScyllaDB
Discuss the core tradeoffs and considerations involved in order-free and ordered stream processing. Brian Taylor walks through the pros and cons of three different approaches: no data dependency, deferred inter-event data dependency, and streaming inter-event data dependency.
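To make the distinction concrete, here is a minimal Python sketch of the buffering that an inter-event data dependency forces on a consumer: events with no dependency can be processed in arrival order, while ordered processing must hold back out-of-order events until the gap is filled. The event shape and sequence numbers are invented for illustration.

```python
# Minimal sketch: releasing events only once the sequence is contiguous.
import heapq

def emit_in_order(events):
    """Buffer out-of-order events; yield them in sequence-number order."""
    next_seq, pending = 0, []
    for e in events:
        heapq.heappush(pending, (e["seq"], e["value"]))
        while pending and pending[0][0] == next_seq:
            _, value = heapq.heappop(pending)
            next_seq += 1
            yield value

stream = [{"seq": 1, "value": "debit"},
          {"seq": 0, "value": "open account"},
          {"seq": 2, "value": "close account"}]
print(list(emit_in_order(stream)))
# ['open account', 'debit', 'close account'] despite arrival order
```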
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac... - ScyllaDB
We start by establishing common ground: why relational databases fall short, and common EDA characteristics such as the need for real-time response times and schemaless approaches that adapt to recurring changes and onboard new use cases. Next, interact with a sample Rust-based application: a social network app demonstrating an integration of both ScyllaDB and Redpanda.
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance... - ScyllaDB
Discover how to avoid common pitfalls when shifting to an event-driven architecture (EDA) in order to boost system recovery and scalability. We cover Kafka Schema Registry, in-broker transformations, event sourcing, and more.
Developer Data Modeling Mistakes: From Postgres to NoSQL - ScyllaDB
See where an RDBMS pro's intuition leads him astray – and learn practical tips for the data modeling transition
ScyllaDB has the potential to deliver impressive performance and scalability. The better you understand how it works, the more you can squeeze out of it. However, developers new to high-performance NoSQL intuitively shoot themselves in the foot with respect to things like table design, query design, indexing, and partitioning.
Watch where our experienced Postgres developer intuitively falls into traps that hurt performance and scalability. And learn with him as our database performance expert offers friendly guidance on navigating all the unexpected behaviors that tend to trip up RDBMS experts.
This webinar focuses on common data modeling and querying mistakes that occur when developers move from SQL to NoSQL. For example:
- Understanding query first design principles
- Planning for schema evolution
- Steering clear of common pitfalls and anti-patterns
- Assessing data access patterns
This isn’t “Death-by-Powerpoint.” We’ll walk through problems encountered while migrating a real application from Postgres to ScyllaDB – and try to fix them live as well.
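As a flavor of the "query first" principle the webinar covers: instead of normalizing entities as in an RDBMS, the table is shaped around the query it must answer, with the partition key chosen to match. Below is a minimal sketch using the cassandra-driver (which also speaks to ScyllaDB); the keyspace, schema, and data are placeholder assumptions.

```python
# Minimal sketch: a table designed for "latest readings for one sensor, one day".
import datetime
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo")   # placeholder keyspace
session.execute("""
    CREATE TABLE IF NOT EXISTS readings_by_sensor_day (
        sensor_id text,
        day       date,
        ts        timestamp,
        value     double,
        PRIMARY KEY ((sensor_id, day), ts)  -- partition matches the query
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")
rows = session.execute(
    "SELECT ts, value FROM readings_by_sensor_day "
    "WHERE sensor_id = %s AND day = %s LIMIT 10",
    ("sensor-42", datetime.date(2024, 6, 1)),
)
for row in rows:
    print(row.ts, row.value)
```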
What Developers Need to Unlearn for High Performance NoSQL - ScyllaDB
See where an RDBMS pro's intuition leads him astray – and learn practical tips for the transition
ScyllaDB has the potential to deliver impressive performance and scalability. The better you understand how it works, the more you can squeeze out of it. However, developers new to high-performance NoSQL intuitively shoot themselves in the foot with respect to things like table design, query design, indexing, and partitioning.
Watch where our experienced Postgres developer intuitively falls into traps that hurt performance and scalability. And learn with him as our database performance expert offers friendly guidance on navigating all the unexpected behaviors that tend to trip up RDBMS experts.
Our first webinar of this series will cover common mistakes with practices such as:
- Translating the data model to NoSQL
- Optimizing table design
- Optimizing query performance
- Planning for partitioning
This isn’t “Death-by-Powerpoint.” We’ll walk through problems encountered while migrating a real application from Postgres to ScyllaDB – and try to fix them live as well.
Low Latency at Extreme Scale: Proven Practices & Pitfalls - ScyllaDB
Expert tips on how to maximize your database performance at scale
Untangle the complexity of achieving database performance at scale. Join this webinar to discover commonly overlooked ways to get predictable low latency, even at extreme scale. Our Solution Architects will walk you through the strategies and pitfalls learned by working on thousands of real-world distributed database projects, many reaching 1M OPS with single-digit millisecond latencies.
In addition to offering clear recommendations, we’ll also explain the process behind how we arrived at them – so you can benefit from the lessons learned by other teams.
We’ll cover how to:
- Design and deploy a large-scale distributed database cluster
- Optimize your clients’ interactions with it
- Expand the cluster horizontally and globally
- Ensure it survives whatever disasters the world throws at it
Tackling your own database performance challenges is serious business. For a change of pace, let’s have some fun learning from other teams’ performance predicaments.
Join us for an interactive session where we dissect four specific database performance challenges faced by teams considering or using ScyllaDB. For each dilemma, we'll:
- Examine the context and technical requirements
- Talk about potential solutions and cover the pros and cons of each
- Disclose what approach the team took, and how it worked out
About the speaker:
Felipe is an IT specialist with years of experience on distributed systems and open-source technologies. He is one of the co-authors of "Database Performance at Scale", an Open Access, freely available publication for individuals interested on improving database performance. At ScyllaDB, he works as a Solution Architect.
Beyond Linear Scaling: A New Path for Performance with ScyllaDB - ScyllaDB
Linear scaling (sometimes near-linear scaling) is often cited in benchmarks, articles, and product comparisons as proof that a given technology and its algorithmic optimizations perform better than another. But is that really what performance is all about, and should you even care?
This webinar discusses performance beyond linear scalability, including what typically matters more when running high throughput and low latency workloads at scale. We'll cover how ScyllaDB offers unparalleled performance and share our insights on:
- The hidden aspects of linear scaling
- When linear scaling matters most and when it’s simply irrelevant
- Often overlooked considerations for optimizing and measuring distributed systems performance
Watch now to learn from our experience (and lessons learned) in building the fastest NoSQL database in the world.
Navigating Complex Database Performance Hurdles
Tackling your own database performance challenges is serious business. For a change of pace, let’s have some fun learning from other teams’ performance predicaments.
Join us for an interactive session where we dissect 4 specific database performance challenges faced by teams considering or using ScyllaDB. For each dilemma:
- The presenters will describe the context and technical requirements
- Together, we’ll talk about potential solutions and cover the pros and cons of each
- Finally, we’ll disclose what approach the team took, and how it worked out
Throughout the event, we’ll have opportunities to win ScyllaDB swag and prizes! Come prepared to engage in lively discussions and gain valuable insight into database performance strategies.
Database Performance at Scale Masterclass: Workload Characteristics by Felipe... - ScyllaDB
Felipe Cardeneti Mendes, Solutions Architect at ScyllaDB
Navigating workload-specific performance challenges and tradeoffs.
Felipe Mendes covers how to navigate the top performance challenges and tradeoffs that you’re likely to face with your project’s specific workload characteristics and technical/business requirements.
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya... - ScyllaDB
Pavel Emelyanov, Principal Engineer at ScyllaDB
Botond Denes, C++ Developer at ScyllaDB
What performance-minded engineers need to know.
Hear from Pavel Emelyanov and Botond Dénes on the impact of database internals – specifically, what to look for if you need latency and/or throughput improvements.
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna - ScyllaDB
Piotr Sarna, Software Engineer at Turso
Understanding and tapping your driver’s performance potential.
Piotr Sarna discusses how to get the most out of a driver, particularly from the performance perspective, and select a driver that’s a good fit for your needs.
Technical risks of putting a cache in front of your database – and what to do instead
Teams experiencing subpar latency commonly turn to an external cache to meet the required SLAs. Placing a cache in front of your database might seem like a fast and easy fix, but it often ends up introducing unanticipated complexity, costs, and risks. External caches can be one of the more problematic components of distributed application architecture.
Join this webinar for a technical discussion of the risks associated with using an external cache and a look at how ScyllaDB’s cache implementation simplifies your architecture without compromising latency. We’ll cover:
- Different approaches to caching (pre-caching vs. caching, side cache vs. transparent cache)
- 7 specific reasons why external caching is a bad choice
- Why Linux’s default caching doesn’t work well for databases
- The advantages & architecture of ScyllaDB's specialized row-based cache
- Real-world examples of why and how teams eliminated their external cache with ScyllaDB
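For reference, the pattern being critiqued usually looks like the cache-aside ("side cache") sketch below, here with Redis. `query_database` is a hypothetical stand-in for the real data access layer, and the TTL illustrates one of the risks discussed above: reads can be stale until it expires, and the cache is a second system to keep consistent.

```python
# Minimal sketch of cache-aside: check the cache, fall back to the database,
# then populate the cache for subsequent reads.
import json
import redis

cache = redis.Redis(host="localhost", port=6379)

def query_database(user_id):
    # Hypothetical stand-in for the real database read.
    return {"id": user_id, "name": "example"}

def get_user(user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)          # may be stale for up to ttl_seconds
    row = query_database(user_id)       # cache miss: full database latency
    cache.setex(key, ttl_seconds, json.dumps(row))
    return row

print(get_user(42))
```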
Powering Real-Time Apps with ScyllaDB: Low Latency & Linear Scalability - ScyllaDB
Discover how your team can achieve low latency at the extreme scale that your data-intensive applications require. We’ll walk you through an example of how ScyllaDB scales linearly to achieve 1M and then 2M OPS – with <1ms P99 latency. We’ll cover how this works on a sample real-time app (an ML feature store), share best practices for performance, and talk about the most important tradeoffs you’ll need to negotiate.
Join us to learn:
- Why and how to ensure your database takes full advantage of your cloud infrastructure
- What architectural considerations matter most for high throughput and low latency
- Key factors to consider when selecting a high-performance database
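A note on the numbers: "P99" is the 99th percentile of observed request latencies, i.e. 99% of requests complete at least that fast. Given your own measured latencies, the computation is a one-liner; the data below is synthetic, purely for illustration.

```python
# Minimal sketch: computing latency percentiles from measured samples.
import numpy as np

rng = np.random.default_rng(0)
latencies_ms = rng.lognormal(mean=-1.0, sigma=0.5, size=100_000)  # synthetic

p50, p99, p999 = np.percentile(latencies_ms, [50, 99, 99.9])
print(f"P50={p50:.3f} ms  P99={p99:.3f} ms  P99.9={p999:.3f} ms")
```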
7 Reasons Not to Put an External Cache in Front of Your Database.pptx - ScyllaDB
Teams experiencing subpar latency commonly turn to an external cache to meet the required SLAs. Placing a cache in front of your database might seem like a fast and easy fix, but it often ends up introducing unanticipated complexity, costs, and risks. Caches can be one of the more problematic components of distributed application architecture.
Join this webinar for a technical discussion of the risks associated with using an external cache and a look at an alternative strategy that simplifies your architecture without compromising latency. We’ll cover:
- Different approaches to caching (pre-caching vs. caching, side cache vs. transparent cache)
- 7 specific reasons why external caching can be a bad choice
- Why Linux’s default caching doesn’t work well for databases
- The advantages & architecture of specialized row-based caches
- Real-world examples of why and how teams eliminated their external cache
Expert tips on how to maximize your database potential
If you’re considering or getting started with ScyllaDB, you’re probably intrigued by its potential to achieve high throughput and predictable low latency at a reasonable cost. So how do you ensure that you’re maximizing that potential for your team’s specific workloads and use case?
This webinar offers practical advice for navigating the various decision points you’ll face as you assess whether ScyllaDB is a good fit for your team and later roll it out into production. We’ll cover the most critical considerations, tradeoffs, and recommendations related to:
- Infrastructure selection
- ScyllaDB configuration
- Client-side setup
- Data modeling
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration - ScyllaDB
In this talk, Felipe Mendes, Solutions Architect at ScyllaDB, shares how 4 companies managed their migration. He covers:
Disney+ – No migration needed!
Discord – Shadow cluster
OpenWeb – TTL expiration, Load and Stream
MyHeritage – Counters
ShareChat – Bonus: A bit of everything
In this talk, Lubos discusses tools and methods for a successful migration. He covers:
Methods
Data (re)modeling
APIs
Spark Migrator
DS bulk
Tuning
Testing/monitoring
NoSQL Data Migration Masterclass - Session 1: Migration Strategies and Challenges - ScyllaDB
In this talk, Jon discusses practical strategies and issues to consider. He covers:
Reasons for Migrations
DB Functionality
Cost/Licensing
Outdated Technology
Scaling Problems
Technology Evolution
SQL to NoSQL
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf - Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI-powered automation technology capabilities of UiPath. Hosted with our local partner Marc Ellis, you will enjoy a half-day packed with industry insights and networking with automation peers.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
GraphRAG is All You Need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs – Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe – Paige Cruz
Monitoring and observability aren't traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company's observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra, and SRE teams. This is a mistake – achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
Generative AI Deep Dive: Advancing from Proof of Concept to Production – Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
DBaaS in the Real World: Risks, Rewards & Tradeoffs
1. DBaaS in the Real World: Risks, Rewards & Tradeoffs
Felipe Mendes, Solution Architect at ScyllaDB
Michael Hollander, Director of Product at ScyllaDB
2. + For data-intensive applications that require high throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source solutions
The Database for Gamechangers
"ScyllaDB stands apart... It's the rare product that exceeds my expectations."
– Martin Heller, InfoWorld contributing editor and reviewer
"For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can't touch – and at a fraction of the cost of an in-memory solution."
– Adrian Bridgewater, Forbes senior contributor
3. +400 Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Digital experiences at massive scale
Corporate fleet management
Real-time analytics
2,000,000-SKU e-commerce management
Video recommendation management
Threat intelligence service using JanusGraph
Real-time fraud detection across 6M transactions/day
Uber-scale, mission-critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Global operations – Avon, Body Shop + more
Predictable performance for on-sale surges
GPS-based exercise tracking
Serving dynamic live streams at scale
Powering India's top social media platform
Personalized advertising to players
Distribution of game assets in Unreal Engine
4. Introductions
Felipe Mendes, Solution Architect at ScyllaDB
+ Years of experience with Linux and other distributed systems
+ An open source enthusiast
+ Passionate about helping businesses achieve their most challenging goals
Michael Hollander, Director of Product at ScyllaDB
+ Former full-stack developer at both startups & enterprises
+ Led product in various dev tools companies
7. Often promoted rewards
"I spend too much time and money maintaining my own data center"
"I don't have the experts to implement and set up my database"
"I want to focus on application development and delivery"
"I don't know how to scale the database when my business grows"
Seamless Scale
Security Hardening
24/7 Support
High Availability
Low-latency Network
Ease of Use
8. Shared responsibility
Self-managed: Cluster Management, Provisioning, Regular Upgrades, Backup/Restore, Security, Alert Monitoring, Schema Management, Application Development – all managed by you.
DBaaS: Cluster Management, Provisioning, Regular Upgrades, Backup/Restore, Security, Alert Monitoring – managed by your provider; Schema Management, Application Development – managed by you.
9. Focus
Goals – what does a DBaaS help me achieve?
+ Define your workload needs
+ Understand the vendor roadmap
+ Can I become locked in?
Speed – how easy is it to get started?
+ Learning curve
+ Potential limitations
+ How much time does it save you?
Costs – how much is this going to cost?
+ Do the benefits outweigh the costs?
+ Are there hidden costs?
+ Where do I want to be in the future?
11. Rewards
Ecosystem Integration
A DBaaS typically integrates easily with most stacks, including CI/CD. Zero-impact schema changes are a plus.
Self-Service
The developer is back in control. Self-service capabilities bring more speed and agility to businesses, and allow fast creation and tear-down of environments.
12. Risks
Lock-In
Consider all the aspects involved in moving in and out of the vendor. How easy is it to get access to a backup? Does using a specific API or feature restrict you in any way compared to other solutions?
Technology Refresh
DBaaS vendors manage tons of deployments, and – unless you hit a "stop the world" bug – it is unlikely that you will be the first in line to receive quality service updates.
13. Tradeoffs
Learning Curve
Although it is easy to get started, every DBaaS solution comes with a learning curve. Integrating and switching fully to a new platform may take some time and effort.
Lack of Customization
DBaaS solutions try to be as general-purpose as possible, allowing for very little customization. This may be an impediment if your organization relies on specific features.
15. Rewards
Reduced Infrastructure & Staffing Costs
DBaaS eliminates the need to invest in physical infrastructure and staff, allowing organizations to allocate resources more efficiently and focus on core business operations and innovation.
Deployment Flexibility
DBaaS providers offer a pay-as-you-go pricing model, so you only pay for the resources you use, avoiding overprovisioning and wasted resources. A BYOA (bring your own account) model leverages your existing cloud provider infrastructure to save on infra costs.
16. Risks
Unpredictable Cost Scaling
As your application's workload increases, the cost of scaling your database may not be linear or predictable. Unexpected usage spikes or sudden scaling needs can lead to higher costs.
Hidden or Unanticipated Costs
While DBaaS providers advertise transparent pricing, there can be hidden or unanticipated costs that arise during usage. These could be related to additional services or specific features not covered by the base pricing.
17. Tradeoffs
Limited Cost Optimization Options
DBaaS offerings provide cost savings but may limit your ability to optimize costs to the same extent as self-managed databases. Optimizing hardware configurations or fine-grained performance tuning may be restricted by the constraints of the DBaaS environment.
Cost Comparisons and TCO
Evaluating cost tradeoffs between DBaaS and self-managed databases requires a TCO analysis. Consider hardware, licenses, maintenance, personnel, and other operational expenses, and compare these costs to DBaaS subscription fees and additional expenses.
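To make that TCO comparison concrete, here is a minimal sketch (in Python) of the arithmetic involved. Every figure in it – hardware amortization, licenses, salaries, subscription fees – is a hypothetical placeholder, not a quote from any vendor; substitute your own numbers.

# Minimal TCO comparison sketch. All figures are hypothetical placeholders.

def self_managed_tco(years: float) -> float:
    """Rough cost of running the database in-house over a time horizon."""
    hardware = 120_000 / 4          # servers amortized over 4 years, per year
    licenses = 30_000               # per year, if applicable
    maintenance = 15_000            # power, parts, colocation, per year
    personnel = 2 * 150_000         # two DB/ops engineers, fully loaded
    return years * (hardware + licenses + maintenance + personnel)

def dbaas_tco(years: float) -> float:
    """Rough cost of the managed alternative over the same horizon."""
    subscription = 180_000          # per year, from the vendor quote
    extras = 20_000                 # support tier, egress, backups, per year
    return years * (subscription + extras)

for horizon in (1, 3, 5):
    print(f"{horizon}y  self-managed: ${self_managed_tco(horizon):,.0f}"
          f"  vs  DBaaS: ${dbaas_tco(horizon):,.0f}")

The point is not the specific totals but the shape of the comparison: personnel usually dominates the self-managed side, while subscription fees plus the easily forgotten extras dominate the DBaaS side.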
20. Security & privacy
+ Data stored in the cloud is more likely to be exposed to hacking, breaches, and unauthorized access.
+ To mitigate these risks, look for a DBaaS vendor that has robust security measures in place, such as encryption, multi-factor authentication, and regular security audits.
+ Companies with strict security requirements should also look for a vendor that meets regulatory requirements by providing:
+ Single Sign-On (SSO)
+ Data encryption
+ SOC 2 and ISO 27001 certification
+ As an enterprise org, if you have even stricter security requirements, you can use a BYOA solution, providing an increased degree of control over your data.
22. Rewards
Co-location
Place data close to users and applications, reducing latency. Multi-regional deployments and "always-on" architectures are made simple, allowing for smart replication to selected locations.
Ease of Scale
The Black Friday problem: an ideal DBaaS solution should seamlessly scale to adapt to workload demands, while remaining resilient enough to handle sudden spikes.
23. Risks
Pay per Operation
Sustaining up to a few thousand operations per second is generally fine, but scaling throughput beyond tens of thousands may break the bank.
Hidden Limits / Quotas
Many DBaaS offerings have limits, such as maximum item sizes. Others may even throttle the workload, limiting the number of operations allowed and thus directly impacting your traffic.
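To illustrate why pay-per-operation pricing can "break the bank", the sketch below applies illustrative per-million-request rates (made up for the example, not any specific vendor's price list) to workloads of increasing throughput, assuming an 80/20 read/write mix.

# Illustrative pay-per-operation pricing; the rates are assumptions
# for the example and do not reflect any specific vendor's price list.
PRICE_PER_MILLION_READS = 0.25    # USD, assumed
PRICE_PER_MILLION_WRITES = 1.25   # USD, assumed
SECONDS_PER_MONTH = 30 * 24 * 3600

def monthly_ops_cost(reads_per_sec: float, writes_per_sec: float) -> float:
    reads_m = reads_per_sec * SECONDS_PER_MONTH / 1e6
    writes_m = writes_per_sec * SECONDS_PER_MONTH / 1e6
    return reads_m * PRICE_PER_MILLION_READS + writes_m * PRICE_PER_MILLION_WRITES

# A few thousand ops/s is affordable; tens or hundreds of thousands is not.
for ops in (2_000, 20_000, 200_000):
    print(f"{ops:>7} ops/s -> ${monthly_ops_cost(ops * 0.8, ops * 0.2):,.0f}/month")

Because the bill scales linearly with operations while traffic rarely grows linearly, a 10x traffic spike is also a 10x bill.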
24. Tradeoffs
Limited Deployment Options
A DBaaS provider's primary goal, as a business, is to sell managed solutions and contribute to your "growth". There isn't much flexibility in deployment options, and the offered infrastructure is not always the most performance-oriented solution.
Maintenance and Operations
As you defer the management of your database to a third party, you may lose visibility into internal database operations and maintenance, making performance-related problems more difficult to diagnose and more likely to impact your workloads.
26. Rewards
Expert Consultation
Sizing, data modeling, access patterns, and other types of consultation are generally offered as part of a services agreement.
Observability
Observability is key for any organization operating at scale, and your DBaaS should generally offer mechanisms to extract metrics, define alerts, and integrate with your preferred solution.
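As a sketch of what "extracting metrics and defining alerts" can look like in practice, the snippet below scrapes a Prometheus-style plain-text metrics endpoint and applies a simple threshold rule. The URL and metric name are hypothetical placeholders; your vendor's monitoring documentation will have the real ones.

# Sketch: pull Prometheus-format metrics from a (hypothetical) DBaaS
# endpoint and apply a simple alert rule. URL and metric are placeholders.
import urllib.request

METRICS_URL = "http://db-metrics.example.internal:9180/metrics"   # hypothetical
LATENCY_METRIC = "database_write_latency_p99_us"                   # hypothetical
THRESHOLD_US = 10_000  # alert if p99 write latency exceeds 10 ms

def scrape(url: str) -> dict[str, float]:
    """Parse the plain-text exposition format into a {metric: value} map."""
    samples: dict[str, float] = {}
    with urllib.request.urlopen(url, timeout=5) as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith("#") or not line.strip():
                continue  # skip comments, type hints, and blank lines
            name, _, value = line.rpartition(" ")
            try:
                samples[name] = float(value)
            except ValueError:
                pass  # ignore lines that don't end in a numeric sample
    return samples

p99 = scrape(METRICS_URL).get(LATENCY_METRIC)
if p99 is not None and p99 > THRESHOLD_US:
    print(f"ALERT: p99 write latency {p99 / 1000:.1f} ms exceeds threshold")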
27. Risks
SLAs
Think about this: if a disaster happens, what are the chances YOUR problem will be the first one to be looked at and prioritized from your DBaaS vendor's perspective?
Asking for Help
Many DBaaS service tiers either do not include support or lack qualified personnel to help when a critical problem happens.
28. Tradeoffs
Build Trust
A long-term relationship is only successful when both sides understand their responsibilities. Be sure to select a DBaaS vendor committed to your success.
Problems Happen!
It is unfortunate, but problems WILL eventually happen. They have happened to tech giants, so what would shield you from them? When they happen, your vendor will be your only hope.
29. Q&A
Free NoSQL Database Training: scylladb.com/events
ScyllaDB Cloud – start a free trial: scylladb.com/cloud
October 18 + 19, 2023: p99conf.io
30. Thank you for joining us today.
@scylladb · slack.scylladb.com · scylladb/ · company/scylladb/
Editor's Notes
PRESENTER - Felipe
Welcome, everyone! My name is Felipe. I am a Solution Architect at ScyllaDB and I'll be your host for today's webinar: "DBaaS in the Real World: Risks, Rewards & Tradeoffs".
Media streaming is part of our daily lives nowadays, and the number of key players within the streaming industry has been growing at a very fast pace. With so many options, it is no surprise that one of the business challenges of media streaming is how to attract more consumers to an engaging experience. Therefore, let me pose a question: what aspects will differentiate one platform from another, and how can we help?
PRESENTER - Felipe
For those of you who are not familiar with ScyllaDB yet, it is the database behind gamechangers - organizations whose success depends upon delivering engaging experiences with impressive speed.
ScyllaDB was built with a close-to-the-metal design that squeezes every possible ounce of performance out of modern infrastructure.
This translates to predictable low latency even at high throughputs.
With such consistent innovation, the adoption of our database technology has grown to over 400 key players worldwide.
PRESENTER - Felipe
Many of you will recognize some of the companies among the selection pictured here, such as Starbucks who leverage ScyllaDB for inventory management, Zillow for real-time property listing and updates, and Comcast Xfinity who power all DVR scheduling with ScyllaDB.
As you can see, ScyllaDB is used across many different industries and for entirely different types of use cases. More often than not, your company probably has a use case that is a perfect fit for ScyllaDB – you may just not know it yet!
PRESENTER - Michael
Cost is another big topic when considering a move from self-managed to managed database solutions.
Let’s start by taking a look at some of the rewards related to costs.
DBaaS eliminates the need for physical infrastructure and dedicated staff. With DBaaS, you don't have to invest in hardware or worry about its maintenance. Instead, the infrastructure is provided and managed by the DBaaS vendor, which results in significant cost savings.
By adopting DBaaS, you can allocate your resources more efficiently. You can focus on your core operations, innovation, and also improving the customer experience rather than spending on hardware procurement and management.
In addition, DBaaS reduces staffing costs. With a self-managed database, companies need a dedicated team of devops engineers and developers. DBaaS, on the other hand, shifts these responsibilities to the vendor itself. So there is no need for a specialized database team. And this allows you to optimize your workforce and allocate the relevant staff to more strategic initiatives.
Let’s move on to pricing and deployment.
DBaaS providers usually offer two types of price plans, pay-as-you-go and annual pricing.
The pay-as-you-go pricing model of DBaaS is particularly an advantage here. It eliminates the need for upfront capital and enables cost optimization by aligning expenses with actual database usage. This flexibility is beneficial for you if you’re a startup or an organization with limited resources.
Now, most DBaaS vendors offer a single deployment model, where the customer’s DB sits on the vendor’s cloud provider infrastructure.
There is another model also referred to as “bring your own account”, where the DB remains on your own organization’s cloud provider infrastructure.
This deployment is especially beneficial for enterprises because if you have good relations with your cloud provider, you’ll be able to save costs on your infrastructure by obtaining pre-negotiated discounts.
Also, since the database resources remain on your existing infrastructure, you won’t have to deal with additional security matters.
And lastly, you will be able to manage your cloud provider bills in the same way as other existing infrastructure you are already consuming today.
Let's move on to the risks associated with moving to a DBaaS solution.
The first risk we need to address is the unpredictable nature of cost scaling when using DBaaS. While it offers scalability, the cost of scaling your database may not follow a linear or easily predictable pattern. As your application's workload increases, there may be unexpected usage spikes, or sudden scaling needs, which can potentially lead to higher costs.
When your application gets exposed to a sudden increase in traffic or data volume, the resource requirements for your database may increase significantly. This can result in unexpected expenses as you need to scale up your database to meet the growing demands. So it's essential to closely monitor and analyze the cost implications of scaling, to avoid any surprises in your budget.
Now, while DBaaS providers generally offer transparent pricing, there can still be costs that are not immediately visible. These costs often arise from additional services or specific features that may not be covered by the base pricing.
For example, you may require specialized support or advanced features for your specific use case, and these services might come at an extra cost. It's crucial to carefully review the service-level agreements (SLAs) and pricing documentation provided by the DBaaS provider, so that you can identify any potential hidden costs.
A concrete example comes from one of our customers who switched to ScyllaDB Cloud from another vendor after running into massive additional and unexpected variable costs, mainly associated with network usage. The costs were so unexpected that the engineers could not explain internally why this happened, and some engineers were actually fired as a result.
So, understanding and accounting for these hidden or unanticipated costs is important, both for accurate budgeting but also for cost management. It ensures that you have a comprehensive understanding of the total cost of ownership and that you can make a better informed decision regarding the most cost-effective approach for your organization.
Now, let's dive into the tradeoffs associated with costs when using a DBaaS solution
The first tradeoff we'll discuss is "Limited Cost Optimizations." While DBaaS solutions offer cost savings, they may limit the ability to optimize costs to the same extent as self-managed databases. In a DBaaS environment, optimizing hardware configurations or applying performance tuning may be restricted by constraints imposed by the service provider.
DBaaS solutions also provide a standardized infrastructure that caters to a wide range of use cases. While this simplifies operations, it may limit your ability to implement very customized cost-saving strategies, so it's essential to evaluate the extent to which you can optimize costs within the boundaries of your DBaaS environment.
For example, optimizing hardware configurations or adjusting resource allocation may be restricted in a DBaaS environment, and fine-tuning parameters specific to your workload or implementing specialized caching strategies may also have limitations. So, bottom line: carefully evaluate these considerations to determine the impact on your cost-optimization efforts.
Moving on to the next tradeoff, "Cost Comparisons and TCO."
When comparing the costs between DBaaS and self-managed databases, performing an analysis of the Total Cost of Ownership (TCO) is essential. This analysis involves considering various factors such as hardware, licenses, maintenance, personnel, and other operational expenses associated with managing a database in-house.
It's crucial to compare these costs against the ongoing subscription fees and any additional expenses associated with the DBaaS solution. This evaluation allows you to make an informed decision by understanding the true cost implications of each option.
By conducting a TCO analysis, you can evaluate the long-term financial impact of using a DBaaS solution against managing your database infrastructure in-house. This analysis will provide you with a holistic view of the costs involved and can also help you in making a well-informed decision based on your organization's specific requirements and budget considerations.
So we went through the different cost implications of moving to DBaaS.
Now, I also wanted to highlight the ScyllaDB Cloud pricing calculator we've built, which compares prices across different scenarios between ScyllaDB and other vendors, such as DynamoDB.
Of course, you shouldn't take our word for it – you should compare it for yourself using the calculator, which is available on our website.
You can apply your custom measurements, and then just compare the results.
Moving on to Security and privacy
When migrating a database to the cloud, one of the biggest concerns is security. Data stored in the cloud is more exposed to hacking, breaches, and unauthorized access. To mitigate these risks, your organization should ensure that your DBaaS provider has robust security measures in place, such as encryption, access controls, and multi-factor authentication, and that it undergoes regular security audits.
In addition, DBaaS solutions often offer backup and disaster recovery mechanisms. This can minimize the risk of data loss in the event of a system failure, natural disaster, or even a cyberattack. This reduces the burden on your organization to set up and manage your own data replication and backup strategies.
As a company, you usually have organizational policies and requirements for accessing external tools or platforms. So you should make sure that your DBaaS vendor offers capabilities such as Single Sign-On (SSO), data encryption, and relevant certifications such as SOC 2 compliance, which provides an independent assessment of how well the DB vendor manages data with respect to security, availability, confidentiality, and of course privacy.
Also, some DBaaS vendors utilize shared infrastructure, where multiple clients' databases are hosted on the same physical infrastructure. This shared environment introduces a level of risk, as a security breach or vulnerability in one client's database could potentially impact others. Your organization should assess the vendor’s isolation and segregation mechanisms in order to mitigate these risks and ensure you get consistent data privacy.
Finally, if you are an enterprise organization with strict security requirements, you can use a BYOA deployment model approach which I referred to in an earlier slide, which will allow all of your DB data to be located inside your existing pre-approved infrastructure.
PRESENTER - Felipe
Thank you all very much for attending today. In due time, you will find this presentation available on the ScyllaDB dot com home page in our webinars section for on-demand viewing.
If you would like to weigh in on what we present in the future, please Contact Us, either via the form on our website, or on Twitter. We’d love to hear your ideas.
For now, on behalf of Michael and myself, and all of us at ScyllaDB, enjoy the rest of your day.
PRESENTER - Michael
Network can be very expensive (a data-intensive workload with high throughput will not be cost-effective). Pay-per-operation service models can bring surprising bills. DynamoDB–ScyllaDB comparison: we never talked about multi-region in previous talks/slides.
Prices for global tables (a table in DynamoDB which lives in more than one region) are very high.
Price for cross-region + price on outbound traffic.
Looks like customers like Disney pay lots of money for this, but this may be a hidden cost in DynamoDB. Think about doing a slide on this. (See Avi’s email)
The base service (hosting) costs were well known in advance – they knew them both for DynamoDB and ScyllaDB. I understood the furor was over networking costs. The networking costs have the same model in ScyllaDB and DynamoDB: you pay for every byte that transits the network.
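A back-of-the-envelope sketch of that per-byte model is below. The transfer rate and workload figures are illustrative assumptions, not actual ScyllaDB Cloud or DynamoDB prices; plug in your provider's real rates.

# Back-of-the-envelope multi-region networking cost. All rates are assumed.
CROSS_REGION_USD_PER_GB = 0.02   # illustrative inter-region transfer rate
REPLICA_REGIONS = 2              # copies beyond the home region, assumed

def monthly_replication_cost(writes_per_sec: float, avg_item_kb: float) -> float:
    """Write traffic fanned out to each replica region, priced per GB."""
    gb_per_month = writes_per_sec * avg_item_kb * 30 * 24 * 3600 / 1e6
    return gb_per_month * REPLICA_REGIONS * CROSS_REGION_USD_PER_GB

# 50k writes/s of 1 KB items replicated to two extra regions:
print(f"${monthly_replication_cost(50_000, 1.0):,.0f}/month in replication traffic alone")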
BYOA: If maintaining special prices from your cloud provider is important to your org, find a DBaaS solution that can provide a model called BYOA (bring your own account). If your company is big enough, your CFO is probably looking at your cloud provider bill; you'd like to simplify and have all of your AWS spending tallied up in one place. Plus, within your own AWS account you likely have pre-negotiated discounts. With ScyllaDB Cloud BYOA, we provide a fully managed NoSQL database-as-a-service (DBaaS) that runs in your AWS account. You pay only the subscription fees for ScyllaDB; all of your infrastructure expenses are paid directly to AWS, through your existing accounts.
PRESENTER - Michael
In general, it is recommended to use both regions and availability zones to achieve the highest level of availability and reliability for your applications and services.
A region is a physical location around the world where AWS hosts one of its multiple data centers. Each region is divided into multiple availability zones (AZs), which are isolated data centers within a region. An AZ is a separate data center that contains its own power, networking, and connectivity resources; each is designed to be independent of the others to minimize the risk of a single point of failure.
However, using multiple regions or AZs typically results in a higher overall cloud computing bill, due to the cost of hosting redundant workloads and data transfer fees when moving data between regions or AZs. AWS does not charge for data transfer between resources within the same AZ: if you have resources such as EC2 instances or S3 buckets deployed in the same AZ, you will not be charged for data transfer between them.
In general, it is recommended to use multiple AZs when deploying applications in AWS to improve availability, fault tolerance, and resilience. However, in some scenarios you may want to move to a single AZ instead of using cross-AZ traffic, to save costs:
Cost restrictions: Deploying resources across multiple AZs incurs additional costs for data transfer and storage replication. If you have budget constraints and your application does not require high availability, you may consider using a single AZ to reduce costs.
Low-risk and low-latency applications: If your application is not mission-critical and can tolerate some downtime, a single AZ may be appropriate – for example, a development or test environment. In some cases, deploying in a single AZ can also yield lower latency and faster response times, such as a high-performance computing application that requires low-latency communication between nodes.
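A quick sketch of that cross-AZ tradeoff, under the commonly cited assumption that same-AZ traffic is free while cross-AZ traffic is billed per GB on each side (the rate here is illustrative; check your own bill):

# Cross-AZ vs single-AZ transfer bill. The rate is an assumption.
CROSS_AZ_USD_PER_GB_EACH_WAY = 0.01   # assumed; billed on send and receive

def cross_az_bill(gb_per_month: float) -> float:
    """Monthly charge for traffic crossing an AZ boundary within a region."""
    return gb_per_month * 2 * CROSS_AZ_USD_PER_GB_EACH_WAY

for tb in (1, 10, 100):
    print(f"{tb:>4} TB/month cross-AZ -> ${cross_az_bill(tb * 1_000):,.0f} (vs $0 single-AZ)")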
References: Troubleshoot and optimize AWS cross-AZ traffic (lightlytics.com)
PRESENTER - Michael
Can you have better control over costs?
I would argue you can – selecting an IaaS and DBaaS vendor should not be a lifelong bond!
You should be able to exercise your buying power. Using an IaaS and DBaaS deployment strategy that is cloud-vendor agnostic will leave the choice of deployment in your hands.
Use Kubernetes and a cloud-agnostic DBaaS deployment to keep the choice of target deployment platform in your hands, and use a multi-cloud vendor approach. The multi-cloud strategy will also help with user experience, as you will be able to deploy applications near your customers, with whichever cloud vendor has resources available near them.
Another aspect of controlling DBaaS TCO is the ability to consolidate existing workloads into the new DBaaS offering.
Users of Scylla Cloud were able to consolidate dozens of Cassandra clusters into a single, fully managed cluster without interfering with the current application!
Scylla Cloud users use both the DynamoDB and the CQL APIs to interact with the application, which means there is no need for application changes.
This stability in the code base helps keep development costs down!
And again, it leaves control over the application TCO in your hands.
I'll go through multiple factors you should consider when looking at cost variability – especially cost savings, but also potential revenue enhancements – and for each one I will talk about what makes ScyllaDB Cloud a cheaper alternative to other managed DB vendors out there.
There are 3 main factors that come into play here:
Higher Throughput
Lower Consistent Latency
Deployment Type
Let’s go through the first factor, Higher Throughput.
Why does "Higher Throughput" translate to lower costs?
It's pretty straightforward. Let's imagine you have an Apache Cassandra deployment of 24 servers or nodes.
By moving to ScyllaDB and having each server work 10x faster, you can simply reduce the number of servers by a factor of 10 while still being able to handle the same amount of capacity or throughput.
So this immediately translates into great savings. It's a quick way to reduce your costs.
More than that, Scylla was designed for multi-core deployments from scratch. So if you want to, you can switch to a smaller number of larger, multi-core servers.
This, by the way, will sometimes not immediately save you costs because, in AWS for example, size and price are usually proportional: you get a bigger server with more storage, but it can also be more expensive.
So to recap, higher throughput > less hardware > lower cost of operation
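A tiny sketch of that recap, using the 24-server example above. The 10x per-node speedup and the replication-factor floor are assumptions for illustration:

import math

# "higher throughput > less hardware > lower cost", as arithmetic.
current_nodes = 24        # the Apache Cassandra cluster from the example
per_node_speedup = 10     # assumed per-server throughput gain
replication_factor = 3    # assumed floor: keep at least RF nodes

needed = max(replication_factor, math.ceil(current_nodes / per_node_speedup))
print(f"{current_nodes} nodes -> {needed} nodes at the same total throughput")
# 24 / 10 = 2.4, rounded up to 3: an ~87% reduction in server count.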
The second factor which may be a little bit more tricky to grasp, is consistent latency.
In some cases it can translate into higher revenue. This, of course, depends on your domain and application.
You might write an application for which latency is completely irrelevant, of course, so this might not apply to you.
But in many cases – and I brought here a few screen captures from various blogs and websites which cover this – lower latency can actually translate into higher revenue.
In some industries it's more straightforward: for example, with algo trading, or with ad-tech where you have a bidding mechanism in place, latency is critical. This is also the case for the IoT industry, where you have a lot of events going into the same machine. And there are additional domains where this is true, such as gaming or social apps.
So I hope I established the fact that lower latency can in some cases, maybe in your case, translate into higher revenue.
—
latency sensitive domains:
Bids and Ad Tech
Algo trading
IoT (insert)
Gaming
Chats and social (Discord)
Moving on to the 3rd factor, Deployment Type.
Most DBaaS vendors offer a single deployment model, where the customer’s DB sits on the vendor’s cloud provider infrastructure.
There is another model also referred to as “bring your own account”, where the DB remains on the customer’s cloud provider infrastructure.
This deployment is especially beneficial for enterprises because if you have good relations with your cloud provider, you’ll be able to save costs on your infrastructure by obtaining pre-negotiated discounts.
Also, since the database resources remain on your existing infrastructure, you won’t have to deal with additional security matters.
And lastly, you will be able to manage your cloud provider bills in the same way as other existing infrastructure you are consuming today.
This is how, with a workload that has a similar ratio between read and write operations, higher throughput translates into lower costs.
With DynamoDB and Google Bigtable, for example, we don't count the number of servers or anything like that, but you can and should compare the price you pay per workload.
As you can see, if you compare Scylla to DynamoDB, in this use case we get a fifth of the cost of DynamoDB.
But we also compare ScyllaDB to other vendors, and for each of them, the higher throughput translates immediately into lower costs.