In this talk we describe the features of Cassandra that set it above the pack, and how to get the most out of them, depending on your application. In particular, we'll describe de-normalization, and detail how the algorithms behind Cassandra leverage awesome write speed to accelerate reads; and we'll explain how Cassandra achieves multi-datacenter support, tunable consistency and no single point of failure, to give a great solution for highly available systems.
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski... | DataStax
Cassandra's support for multiple data centers can bring massive benefits to an organization, but it can also bring painful operational lessons. While there is no recipe for trouble-free multi-DC clusters, the best approach is to understand why you are using one, what Cassandra supports, and how it does it. With this knowledge in your toolkit you will have a better chance of fixing the sort of gremlins that can trouble a globally distributed database.
In this talk Alexander Dejanovski, Consultant at The Last Pickle, will outline the motivations people typically have for running a multi-DC cluster. He will also look at how multiple DCs are supported through all areas of Cassandra, how they affect your application and operations, and how you can always blame the network.
About the Speaker
Alexander Dejanovski, Consultant, The Last Pickle
Alexander has been working as a software developer for the last 18 years, mainly for the French leader in express shipments. There he led the effort to build a Cassandra-based architecture and migrate services to it from traditional RDBMSs. He is involved in the Cassandra community through the development of a JDBC wrapper for the DataStax Java Driver. He recently joined The Last Pickle as a Cassandra consultant and now helps customers get the best out of it.
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le... | DATAVERSITY
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a comprehensive platform designed to address multi-faceted needs by offering multi-function data management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion.
In this research-based session, I’ll discuss the components of multiple modern enterprise analytics stacks (e.g., dedicated compute, storage, data integration, and streaming) and focus on total cost of ownership.
The complete cost of machine learning infrastructure for the first modern use case at a midsize to large enterprise will be anywhere from $3 million to $22 million. Get this data point as you take the next steps on your journey into the highest spend and return item for most companies in the next several years.
Data at the Speed of Business with Data Mastering and Governance | DATAVERSITY
Do you ever wonder how data-driven organizations fuel analytics, improve customer experience, and accelerate business productivity? They are successful by governing and mastering data effectively so they can get trusted data to those who need it faster. Efficient data discovery, mastering, and democratization are critical for swiftly linking accurate data with business consumers. When business teams can quickly and easily locate, interpret, trust, and apply data assets to support sound business judgment, it takes less time to see value.
Join data mastering and data governance experts from Informatica—plus a real-world organization empowering trusted data for analytics—for a lively panel discussion. You’ll hear more about how a single cloud-native approach can help global businesses in any economy create more value—faster, more reliably, and with more confidence—by making data management and governance easier to implement.
What is data literacy? Which organizations, and which workers in those organizations, need to be data-literate? There are seemingly hundreds of definitions of data literacy, along with almost as many opinions about how to achieve it.
In a broader perspective, companies must consider whether data literacy is an isolated goal or one component of a broader learning strategy to address skill deficits. How does data literacy compare to other types of skills or “literacy” such as business acumen?
This session will position data literacy in the context of other worker skills as a framework for understanding how and where it fits and how to advocate for its importance.
Building a Data Strategy – Practical Steps for Aligning with Business Goals | DATAVERSITY
Developing a Data Strategy for your organization can seem like a daunting task – but it’s worth the effort. Getting your Data Strategy right can provide significant value, as data drives many of the key initiatives in today’s marketplace – from digital transformation, to marketing, to customer centricity, to population health, and more. This webinar will help demystify Data Strategy and its relationship to Data Architecture and will provide concrete, practical ways to get started.
Uncover how your business can save money and find new revenue streams.
Driving profitability is a top priority for companies globally, especially in uncertain economic times. It's imperative that companies reimagine growth strategies and improve process efficiencies to help cut costs and drive revenue – but how?
By leveraging data-driven strategies layered with artificial intelligence, companies can achieve untapped potential and help their businesses save money and drive profitability.
In this webinar, you'll learn:
- How your company can leverage data and AI to reduce spending and costs
- Ways you can monetize data and AI and uncover new growth strategies
- How different companies have implemented these strategies to achieve cost optimization benefits
Data Catalogs Are the Answer – What Is the Question? | DATAVERSITY
Organizations with governed metadata made available through their data catalog can answer questions their people have about the organization’s data. These organizations get more value from their data, protect their data better, gain improved ROI from data-centric projects and programs, and have more confidence in their most strategic data.
Join Bob Seiner for this lively webinar where he will talk about the value of a data catalog and how to build the use of the catalog into your stewards’ daily routines. Bob will share how the tool must be positioned for success and viewed as a must-have resource that is a steppingstone and catalyst to governed data across the organization.
In this webinar, Bob will focus on:
- Selecting the appropriate metadata to govern
- The business and technical value of a data catalog
- Building the catalog into people’s routines
- Positioning the data catalog for success
- Questions the data catalog can answer
Because every organization produces and propagates data as part of its day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “Big Data,” “NoSQL,” “Data Scientist,” and so on. Few realize that all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, data modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving its engineering and architecture activities become. This webinar illustrates data modeling as a key activity upon which so much technology and business investment depends.
Specific learning objectives include:
- Understanding what types of challenges require data modeling to be part of the solution
- How automation requires standardization derivable via data modeling techniques
- Why only a working partnership between data and the business can produce useful outcomes
Analytics play a critical role in supporting strategic business initiatives. Despite the obvious value to analytic professionals of providing the analytics for these initiatives, many executives question the economic return of analytics as well as data lakes, machine learning, master data management, and the like.
Technology professionals need to calculate and present business value in terms business executives can understand. Unfortunately, most IT professionals lack the knowledge required to develop comprehensive cost-benefit analyses and return on investment (ROI) measurements.
This session provides a framework to help technology professionals research, measure, and present the economic value of a proposed or existing analytics initiative, no matter what form the business benefit takes. The session will provide practical advice about how to calculate ROI, which formulas to use, and how to collect the necessary information.
How a Semantic Layer Makes Data Mesh Work at Scale | DATAVERSITY
Data Mesh is a trending approach to building a decentralized data architecture by leveraging a domain-oriented, self-service design. However, the pure definition of Data Mesh lacks a center of excellence or central data team and doesn’t address the need for a common approach for sharing data products across teams. The semantic layer is emerging as a key component to supporting a Hub and Spoke style of organizing data teams by introducing data model sharing, collaboration, and distributed ownership controls.
This session will explain how data teams can define common models and definitions with a semantic layer to decentralize analytics product creation using a Hub and Spoke architecture.
Attend this session to learn about:
- The role of a Data Mesh in the modern cloud architecture.
- How a semantic layer can serve as the binding agent to support decentralization.
- How to drive self service with consistency and control.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re... | DATAVERSITY
Change is hard, especially in response to negative stimuli or what is perceived as negative stimuli. So organizations need to reframe how they think about data privacy, security and governance, treating them as value centers to 1) ensure enterprise data can flow where it needs to, 2) prevent – not just react to – internal and external threats, and 3) comply with data privacy and security regulations.
Working together, these roles can accelerate access to approved, relevant and higher quality data – and that means more successful use cases, faster speed to insights, and better business outcomes. However, both new information and tools are required to make the shift from defense to offense, reducing data drama while increasing its value.
Join us for this panel discussion with experts in these fields as they discuss:
- Recent research about where data privacy, security and governance stand
- The most valuable enterprise data use cases
- The common obstacles to data value creation
- New approaches to data privacy, security and governance
- Their advice on how to shift from a reactive to resilient mindset/culture/organization
You’ll be educated, entertained and inspired by this panel and their expertise in using the data trifecta to innovate more often, operate more efficiently, and differentiate more strategically.
Emerging Trends in Data Architecture – What’s the Next Big Thing? | DATAVERSITY
With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Data Governance Trends - A Look Backwards and Forwards | DATAVERSITY
As DATAVERSITY’s RWDG series hurtles into its 12th year, this webinar takes a quick look behind us, evaluates the present, and predicts the future of Data Governance. Based on webinar numbers, hot Data Governance topics have evolved over the years from policies and best practices, roles and tools, data catalogs and frameworks, to supporting data mesh and fabric, artificial intelligence, virtualization, literacy, and metadata governance.
Join Bob Seiner as he reflects on the past and what has and has not worked, while sharing examples of enterprise successes and struggles. In this webinar, Bob will challenge the audience to stay a step ahead by learning from the past and blazing a new trail into the future of Data Governance.
In this webinar, Bob will focus on:
- Data Governance’s past, present, and future
- How trials and tribulations evolve to success
- Leveraging lessons learned to improve productivity
- The great Data Governance tool explosion
- The future of Data Governance
Data Governance Trends and Best Practices To Implement Today | DATAVERSITY
Would you share your bank account information on social media? How about shouting your social security number on the New York City subway? We didn’t think so either – that’s why data governance is consistently top of mind.
In this webinar, we’ll discuss the common Cloud data governance best practices – and how to apply them today. Join us to uncover Google Cloud’s investment in data governance and learn practical and doable methods around key management and confidential computing. Hear real customer experiences and leave with insights that you can share with your team. Let’s get solving.
Topics that you will hear addressed in this webinar:
- Understanding the basics of Cloud Incident Response (IR) and anticipated data governance trends
- Best practices for key management and for applying data governance to your day-to-day work
- The next wave of Confidential Computing and how to get started, including a demo
It is a fascinating, explosive time for enterprise analytics.
It is from the position of analytics leadership that the enterprise mission will be executed and company leadership will emerge. The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver analytics. After all, no matter what business you’re in, you’re in the business of analytics.
The coming years will be full of big changes in enterprise analytics and data architecture. William will kick off the fifth year of the Advanced Analytics series with a discussion of the trends winning organizations should build into their plans, expectations, vision, and awareness now.
Too often I hear the question “Can you help me with our data strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component: the data strategy itself. A more useful request is: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (much less perfect) data strategy on the first attempt is generally not productive – particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” This program refocuses efforts on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. It also contributes to three primary organizational data goals. Learn how to improve the following:
- Your organization’s data
- The way your people use data
- The way your people use data to achieve your organizational strategy
This will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs as organizations identify prioritized areas where better assets, literacy, and support (data strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why data strategy is necessary for effective data governance
- An overview of prerequisites for effective strategic use of data strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
Who Should Own Data Governance – IT or Business? | DATAVERSITY
The question is asked all the time: “What part of the organization should own your Data Governance program?” The typical answers are “the business” and “IT (information technology).” Another answer to that question is “Yes.” The program must be owned and reside somewhere in the organization. You may ask yourself if there is a correct answer to the question.
Join this new RWDG webinar with Bob Seiner where Bob will answer the question that is the title of this webinar. Determining ownership of Data Governance is a vital first step. Figuring out the appropriate part of the organization to manage the program is an important second step. This webinar will help you address these questions and more.
In this session Bob will share:
- What is meant by “the business” when it comes to owning Data Governance
- Why some people say that Data Governance in IT is destined to fail
- Examples of IT positioned Data Governance success
- Considerations for answering the question in your organization
- The final answer to the question of who should own Data Governance
It is clear that Data Management best practices exist and so does a useful process for improving existing Data Management practices. The question arises: Since we understand the goal, how does one design a process for achieving Data Management goals? This program describes what must be done at the programmatic level to achieve better data use and a way to implement this as part of your data program. The approach combines DMBoK content and CMMI/DMM processes, giving organizations the opportunity to benefit from the best of both. It also permits organizations to understand:
- Their current Data Management practices
- Strengths that should be leveraged
- Remediation opportunities
MLOps – Applying DevOps to Competitive Advantage | DATAVERSITY
MLOps is a practice for collaboration between Data Science and operations to manage the production machine learning (ML) lifecycles. As an amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling the delivery of ML-based innovation at scale to result in:
- Faster time to market of ML-based solutions
- More rapid rate of experimentation, driving innovation
- Assurance of quality, trustworthiness, and ethical AI
MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps: the major offerings are Microsoft Azure ML and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D... | DATAVERSITY
With the explosive growth of DataOps to drive faster and more confident business decisions, proactively understanding the quality and health of your data is more important than ever. Data observability is an emerging discipline within data quality used to expose anomalies in data by continuously monitoring and testing data using artificial intelligence and machine learning to trigger alerts when issues are discovered.
Join Julie Skeen and Shalaish Koul from Precisely to learn how data observability can be used as part of a DataOps strategy to improve data quality and reliability, prevent data issues from wreaking havoc on your analytics, and ensure that your organization can confidently rely on the data used for advanced analytics and business intelligence.
Topics you will hear addressed in this webinar:
- Data observability – what it is and how it can complement your data quality strategy
- Why now is the time to incorporate data observability into your DataOps strategy
- How data observability helps prevent data issues from impacting downstream analytics
- Examples of how data observability can be used to prevent real-world issues
Empowering the Data Driven Business with Modern Business Intelligence | DATAVERSITY
By consolidating data engineering, data warehouse, and data science capabilities under a single fully-managed platform, BigQuery can accelerate computation, reduce data analysis costs, and streamline data management.
Following in-depth interviews with a security services provider and a telecommunications company, Nucleus Research found that customers moving to Google Cloud BigQuery from on-premises data warehouse solutions accelerate data processing by over 75 percent while reducing ongoing data administration expenses by over 25 percent.
As BigQuery continues to optimize its platform architecture for compute efficiency and multicloud support, Nucleus expects the vendor to see rapid adoption and further penetrate the data warehouse market.
Enterprise Architecture vs. Data Architecture | DATAVERSITY
Enterprise Architecture (EA) provides a visual blueprint of the organization, and shows key interrelationships between data, process, applications, and more. By abstracting these assets in a graphical view, it’s possible to see key interrelationships, particularly as they relate to data and its business impact across the organization. Join us for a discussion on how Data Architecture is a key component of an overall Enterprise Architecture for enhanced business value and success.
Data Governance Best Practices, Assessments, and Roadmaps | DATAVERSITY
When starting or evaluating the present state of your Data Governance program, it is important to focus on best practices so that you don’t take a ready, fire, aim approach. To be selected for your organization, a best practice needs to be practical and doable, and the program must be at risk if the best practice is not achieved.
Join Bob Seiner for an important webinar focused on industry best practice around standing up formal Data Governance. Learn how to assess your organization against the practices and deliver an effective roadmap based on the results of conducting the assessment.
In this webinar, Bob will focus on:
- Criteria to select the appropriate best practices for your organization
- How to define the best practices for ultimate impact
- Assessing against selected best practices
- Focusing the recommendations on program success
- Delivering a roadmap for your Data Governance program
Including All Your Mission-Critical Data in Modern Apps and AnalyticsDATAVERSITY
To stay competitive, you need to swiftly deliver innovative web and mobile apps and analytics solutions that include all your critical data—including mainframe and IBM i. Join us to hear how forward-thinking companies are using modern cloud-based platforms to deliver solutions that drive better customer experiences and greater insight—all while extending the value of their core systems.
Assessing New Database Capabilities – Multi-ModelDATAVERSITY
Today’s enterprises have an unprecedented variety of data store choices to meet the needs of the varied workloads of an enterprise because there is no one-size-fits-all when it comes to data stores. Putting in place data stores to support a modern enterprise that is now reliant on data can lead to confusion and chaos.
Enterprises have many needs for databases, including for cache, operational, data warehouse, master data, ERP, analytical, graph data, data lake, time series data, and numerous other specific needs.
Today’s enterprises have an unprecedented variety of data store choices to meet the needs of the varied workloads of an enterprise because there is no one-size-fits-all when it comes to data stores. Putting in place data stores to support a modern enterprise that is now reliant on data can lead to confusion and chaos.
Enterprises have many needs for databases, including for cache, operational, data warehouse, master data, ERP, analytical, graph data, data lake, time series data, and numerous other specific needs.
While vendor offerings have exploded in recent years, in due time frameworks will integrate components into what amounts to, for practical purposes, a single offering for multiple workloads, perhaps even for the enterprise.
A multi-model database is a database that can store, manage, and query data in multiple models, such as relational, document-oriented, key-value, graph (triplestore), and column store.
An enterprise will find reduced overhead and other synergies from choosing a single vendor for these workloads.
This session will explore the multi-model option and some criteria that decision makers should evaluate when choosing a multi-model solution.
Likely lots of well-organized data. Periodically, it is useful to interact with your Data Governance group to reevaluate the relative value of the various collections in the warehouse. More and more organizations are using warehousing as a strategy and focusing less on the actual technology. This program will provide a refocus on data warehousing as a capability that supports BI activities, enables more effective business analyses and decision-making, and provides some contribution to innovation initiatives. What are the capabilities required and how does their operation compare to cloud-based options?
Learning objectives:
- Warehousing capabilities
- What to use these capabilities in support of
- Where they can be deployed
LA HUG - Video Testimonials with Chynna Morgan - June 2024Lital Barkan
Have you ever heard that user-generated content or video testimonials can take your brand to the next level? We will explore how you can effectively use video testimonials to leverage and boost your sales, content strategy, and increase your CRM data.🤯
We will dig deeper into:
1. How to capture video testimonials that convert from your audience 🎥
2. How to leverage your testimonials to boost your sales 💲
3. How you can capture more CRM data to understand your audience better through video testimonials. 📊
Because every organization produces and propagates data as part of their day-to-day operations, data trends are becoming more and more important in the mainstream business world’s consciousness. For many organizations in various industries, though, comprehension of this development begins and ends with buzzwords: “Big Data,” “NoSQL,” “Data Scientist,” and so on. Few realize that all solutions to their business problems, regardless of platform or relevant technology, rely to a critical extent on the data model supporting them. As such, data modeling is not an optional task for an organization’s data effort, but rather a vital activity that facilitates the solutions driving your business. Since quality engineering/architecture work products do not happen accidentally, the more your organization depends on automation, the more important the data models driving the engineering and architecture activities of your organization. This webinar illustrates data modeling as a key activity upon which so much technology and business investment depends.
Specific learning objectives include:
- Understanding what types of challenges require data modeling to be part of the solution
- How automation requires standardization on derivable via data modeling techniques
- Why only a working partnership between data and the business can produce useful outcomes
Analytics play a critical role in supporting strategic business initiatives. Despite the obvious value to analytic professionals of providing the analytics for these initiatives, many executives question the economic return of analytics as well as data lakes, machine learning, master data management, and the like.
Technology professionals need to calculate and present business value in terms business executives can understand. Unfortunately, most IT professionals lack the knowledge required to develop comprehensive cost-benefit analyses and return on investment (ROI) measurements.
This session provides a framework to help technology professionals research, measure, and present the economic value of a proposed or existing analytics initiative, no matter the form that the business benefit arises. The session will provide practical advice about how to calculate ROI and the formulas, and how to collect the necessary information.
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
Data Mesh is a trending approach to building a decentralized data architecture by leveraging a domain-oriented, self-service design. However, the pure definition of Data Mesh lacks a center of excellence or central data team and doesn’t address the need for a common approach for sharing data products across teams. The semantic layer is emerging as a key component to supporting a Hub and Spoke style of organizing data teams by introducing data model sharing, collaboration, and distributed ownership controls.
This session will explain how data teams can define common models and definitions with a semantic layer to decentralize analytics product creation using a Hub and Spoke architecture.
Attend this session to learn about:
- The role of a Data Mesh in the modern cloud architecture.
- How a semantic layer can serve as the binding agent to support decentralization.
- How to drive self service with consistency and control.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
Change is hard, especially in response to negative stimuli or what is perceived as negative stimuli. So organizations need to reframe how they think about data privacy, security and governance, treating them as value centers to 1) ensure enterprise data can flow where it needs to, 2) prevent – not just react – to internal and external threats, and 3) comply with data privacy and security regulations.
Working together, these roles can accelerate faster access to approved, relevant and higher quality data – and that means more successful use cases, faster speed to insights, and better business outcomes. However, both new information and tools are required to make the shift from defense to offense, reducing data drama while increasing its value.
Join us for this panel discussion with experts in these fields as they discuss:
- Recent research about where data privacy, security and governance stand
- The most valuable enterprise data use cases
- The common obstacles to data value creation
- New approaches to data privacy, security and governance
- Their advice on how to shift from a reactive to resilient mindset/culture/organization
You’ll be educated, entertained and inspired by this panel and their expertise in using the data trifecta to innovate more often, operate more efficiently, and differentiate more strategically.
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
With technological innovation and change occurring at an ever-increasing rate, it’s hard to keep track of what’s hype and what can provide practical value for your organization. Join this webinar to see the results of a recent DATAVERSITY survey on emerging trends in Data Architecture, along with practical commentary and advice from industry expert Donna Burbank.
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
As DATAVERSITY’s RWDG series hurdles into our 12th year, this webinar takes a quick look behind us, evaluates the present, and predicts the future of Data Governance. Based on webinar numbers, hot Data Governance topics have evolved over the years from policies and best practices, roles and tools, data catalogs and frameworks, to supporting data mesh and fabric, artificial intelligence, virtualization, literacy, and metadata governance.
Join Bob Seiner as he reflects on the past and what has and has not worked, while sharing examples of enterprise successes and struggles. In this webinar, Bob will challenge the audience to stay a step ahead by learning from the past and blazing a new trail into the future of Data Governance.
In this webinar, Bob will focus on:
- Data Governance’s past, present, and future
- How trials and tribulations evolve to success
- Leveraging lessons learned to improve productivity
- The great Data Governance tool explosion
- The future of Data Governance
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
Would you share your bank account information on social media? How about shouting your social security number on the New York City subway? We didn’t think so either – that’s why data governance is consistently top of mind.
In this webinar, we’ll discuss the common Cloud data governance best practices – and how to apply them today. Join us to uncover Google Cloud’s investment in data governance and learn practical and doable methods around key management and confidential computing. Hear real customer experiences and leave with insights that you can share with your team. Let’s get solving.
Topics that you will hear addressed in this webinar:
- Understanding the basics of Cloud Incident Response (IR) and anticipated data governance trends
- Best practices for key management and apply data governance to your day-to-day
- The next wave of Confidential Computing and how to get started, including a demo
It is a fascinating, explosive time for enterprise analytics.
It is from the position of analytics leadership that the enterprise mission will be executed and company leadership will emerge. The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data, and projects that will deliver analytics. After all, no matter what business you’re in, you’re in the business of analytics.
The coming years will be full of big changes in enterprise analytics and data architecture. William will kick off the fifth year of the Advanced Analytics series with a discussion of the trends winning organizations should build into their plans, expectations, vision, and awareness now.
Too often I hear the question “Can you help me with our data strategy?” Unfortunately, for most, this is the wrong request because it focuses on the least valuable component: the data strategy itself. A more useful request is: “Can you help me apply data strategically?” Yes, at early maturity phases the process of developing strategic thinking about data is more important than the actual product! Trying to write a good (must less perfect) data strategy on the first attempt is generally not productive –particularly given the widespread acceptance of Mike Tyson’s truism: “Everybody has a plan until they get punched in the face.” This program refocuses efforts on learning how to iteratively improve the way data is strategically applied. This will permit data-based strategy components to keep up with agile, evolving organizational strategies. It also contributes to three primary organizational data goals. Learn how to improve the following:
- Your organization’s data
- The way your people use data
- The way your people use data to achieve your organizational strategy
This will help in ways never imagined. Data are your sole non-depletable, non-degradable, durable strategic assets, and they are pervasively shared across every organizational area. Addressing existing challenges programmatically includes overcoming necessary but insufficient prerequisites and developing a disciplined, repeatable means of improving business objectives. This process (based on the theory of constraints) is where the strategic data work really occurs as organizations identify prioritized areas where better assets, literacy, and support (data strategy components) can help an organization better achieve specific strategic objectives. Then the process becomes lather, rinse, and repeat. Several complementary concepts are also covered, including:
- A cohesive argument for why data strategy is necessary for effective data governance
- An overview of prerequisites for effective strategic use of data strategy, as well as common pitfalls
- A repeatable process for identifying and removing data constraints
- The importance of balancing business operation and innovation
Who Should Own Data Governance – IT or Business?DATAVERSITY
The question is asked all the time: “What part of the organization should own your Data Governance program?” The typical answers are “the business” and “IT (information technology).” Another answer to that question is “Yes.” The program must be owned and reside somewhere in the organization. You may ask yourself if there is a correct answer to the question.
Join this new RWDG webinar with Bob Seiner where Bob will answer the question that is the title of this webinar. Determining ownership of Data Governance is a vital first step. Figuring out the appropriate part of the organization to manage the program is an important second step. This webinar will help you address these questions and more.
In this session Bob will share:
- What is meant by “the business” when it comes to owning Data Governance
- Why some people say that Data Governance in IT is destined to fail
- Examples of IT positioned Data Governance success
- Considerations for answering the question in your organization
- The final answer to the question of who should own Data Governance
It is clear that Data Management best practices exist and so does a useful process for improving existing Data Management practices. The question arises: Since we understand the goal, how does one design a process for Data Management goal achievement? This program describes what must be done at the programmatic level to achieve better data use and a way to implement this as part of your data program. The approach combines DMBoK content and CMMI/DMM processes – permitting organizations with the opportunity to benefit from the best of both. It also permits organizations to understand:
- Their current Data Management practices
- Strengths that should be leveraged
- Remediation opportunities
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
MLOps is a practice for collaboration between Data Science and operations to manage the production machine learning (ML) lifecycles. As an amalgamation of “machine learning” and “operations,” MLOps applies DevOps principles to ML delivery, enabling the delivery of ML-based innovation at scale to result in:
Faster time to market of ML-based solutions
More rapid rate of experimentation, driving innovation
Assurance of quality, trustworthiness, and ethical AI
MLOps is essential for scaling ML. Without it, enterprises risk struggling with costly overhead and stalled progress. Several vendors have emerged with offerings to support MLOps: the major offerings are Microsoft Azure ML and Google Vertex AI. We looked at these offerings from the perspective of enterprise features and time-to-value.
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...DATAVERSITY
With the explosive growth of DataOps to drive faster and more confident business decisions, proactively understanding the quality and health of your data is more important than ever. Data observability is an emerging discipline within data quality used to expose anomalies in data by continuously monitoring and testing data using artificial intelligence and machine learning to trigger alerts when issues are discovered.
Join Julie Skeen and Shalaish Koul from Precisely, to learn how data observability can be used as part of a DataOps strategy to improve data quality and reliability and to prevent data issues from wreaking havoc on your analytics and ensure that your organization can confidently rely on the data used for advanced analytics and business intelligence.
Topics you will hear addressed in this webinar:
Data observability – what is it and how it can complement your data quality strategy
Why now is the time to incorporate data observability into your DataOps strategy
How data observability helps prevent data issues from impacting downstream analytics
Examples of how data observability can be used to prevent real-world issues
Empowering the Data Driven Business with Modern Business IntelligenceDATAVERSITY
4. History
• 2007: Started at Facebook for inbox search
• July 2008: Open sourced by Facebook
• March 2009: Apache Incubator
• February 2010: Apache top-level project
• May 2011: Version 0.8
Monday, 15 August 2011
5. What it’s good for
• Horizontal scalability
• No single-point of failure
• Multi-data centre support
• Very high write workloads
• Tuneable consistency
6. What it’s not so good for
• Transactions
• Read heavy workloads
• Low latency applications
  • compared to in-memory dbs
8. Keyspaces and Column Families
SQL        Cassandra
Database   Keyspace
Table      Column Family
Keyspaces & CFs have different sets of configuration settings
9. Column Family
key: {
column: value,
column: value,
...
}
10. Rows and columns
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
11. Reads
• get
• get_slice: one row, some cols
  • name predicate
  • slice range
• multiget_slice: multiple rows
• get_range_slices
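The read calls listed above can be sketched over a plain in-memory model of a column family. This is only an illustration: the function names mirror the calls named on the slide, but the signatures and the `cf` dictionary are assumptions made for the example, not the real client API.

```python
# Hypothetical in-memory stand-in for a column family: row key -> {column: value}.
cf = {
    "row1": {"col1": "a", "col3": "b", "col6": "c"},
    "row2": {"col1": "d", "col2": "e", "col4": "f"},
    "row3": {"col2": "g", "col5": "h"},
}

def get(cf, key, column):
    """One row, one column."""
    return cf[key][column]

def get_slice(cf, key, names=None, start=None, finish=None):
    """One row, some columns: either a name predicate or a slice range."""
    row = cf.get(key, {})
    if names is not None:
        # Name predicate: pick the listed columns that exist in this row.
        return {c: row[c] for c in names if c in row}
    # Slice range: columns in sorted order between start and finish (inclusive).
    return {c: v for c, v in sorted(row.items())
            if (start is None or c >= start) and (finish is None or c <= finish)}

def multiget_slice(cf, keys, **predicate):
    """Several named rows, with the same column predicate applied to each."""
    return {k: get_slice(cf, k, **predicate) for k in keys}

def get_range_slices(cf, start_key, end_key, **predicate):
    """A contiguous range of row keys, with the same column predicate applied to each."""
    return {k: get_slice(cf, k, **predicate)
            for k in sorted(cf) if start_key <= k <= end_key}
```

For example, `get_slice(cf, "row2", start="col2", finish="col4")` returns only `col2` and `col4`, because rows are sparse and `col3` is absent from `row2`.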
12. get
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
13. get_slice: name predicate
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
14. get_slice: slice range
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
15. multiget_slice: name predicate
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
16. get_range_slices: slice range
col1 col2 col3 col4 col5 col6 col7
row1 x x x
row2 x x x x x
row3 x x x x x
row4 x x x x
row5 x x x x
row6 x
row7 x x x
24. Partitioning + Replication
• Partitioning data on to nodes
  • load balancing
  • row-based
• Replication
  • to protect against failure
  • better availability
25. Partitioning
• Random: take hash of row key
  • good for load balancing
  • bad for range queries
• Ordered: subdivide key space
  • bad for load balancing
  • good for range queries
• Or build your own...
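The trade-off between the two partitioners can be seen in a small sketch. It assumes MD5 as the hash for the random case and the raw key bytes as the token for the ordered case; `random_token` and `ordered_token` are hypothetical names used only for this illustration.

```python
import hashlib

def random_token(key):
    """Random partitioning: the token is a hash of the row key (MD5 assumed here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def ordered_token(key):
    """Ordered partitioning: the token is the key itself, so key order is kept."""
    return key.encode("utf-8")

keys = ["user1", "user2", "user3", "user4"]

# Ordered tokens keep neighbouring keys next to each other on the ring,
# which is what makes range queries cheap and load balancing hard.
by_ordered = sorted(keys, key=ordered_token)

# Hashed tokens place the same keys at effectively arbitrary ring positions,
# which spreads load evenly but destroys key order.
by_random = sorted(keys, key=random_token)
```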
26. Simple Replication
(k, v)
Nodes arranged on a ‘ring’
27. Simple Replication
Primary location
(k, v)
Nodes arranged on a ‘ring’
28. Simple Replication
Primary location
(k, v)
Extra copies are successors on the ring
Nodes arranged on a ‘ring’
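Placing a key on the ring and picking its successors can be sketched as follows. The ring layout, the integer tokens and the name `simple_replicas` are assumptions for illustration, not Cassandra's implementation.

```python
from bisect import bisect_left

def simple_replicas(token, ring, rf):
    """Return the nodes holding a key with the given token.

    ring is a list of (node_token, node_name) sorted by token.
    The primary is the first node whose token is at or after the key's token,
    wrapping around the ring; the extra copies go to its successors.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))  # cannot place more copies than there are nodes
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

# Hypothetical four-node ring with evenly spaced tokens.
ring = [(0, "n1"), (25, "n2"), (50, "n3"), (75, "n4")]
```

With three copies, a key whose token is 30 lands on `n3` as its primary location and on `n4` and `n1` as successors.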
29. Topology-aware Replication
• Snitch: node IP → (DataCenter, rack)
  • EC2Snitch
    • Region → DC; availability_zone → rack
  • PropertyFileSnitch
    • Configured from a file
30. Topology-aware Replication
DC 1 DC 2
(k, v)
r1 r2 r1 r2
31. Topology-aware Replication
DC 1 DC 2
(k, v)
r1 r2 r1 r2
32. Topology-aware Replication
DC 1 DC 2
extra copies to different data center
(k, v)
r1 r2 r1 r2
33. Topology-aware Replication
DC 1 DC 2
extra copies to different data center
(k, v)
spread across racks within a data center
r1 r2 r1 r2
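A rough sketch of the idea, under stated assumptions: the snitch is modelled as a plain dictionary from node IP to (data center, rack), much as a file-configured snitch would provide, and `topology_replicas` is a hypothetical, simplified placement rule loosely following the slides. It is not Cassandra's actual placement code.

```python
def topology_replicas(start, ring, snitch, rf_per_dc):
    """Walk the ring clockwise from index start, taking up to rf_per_dc[dc]
    nodes in each data center and preferring racks not yet used in that DC."""
    order = [ring[(start + i) % len(ring)] for i in range(len(ring))]
    chosen, racks_used, skipped = [], {}, []

    def count_in(dc):
        return sum(1 for n in chosen if snitch[n][0] == dc)

    for node in order:
        dc, rack = snitch[node]
        if count_in(dc) >= rf_per_dc.get(dc, 0):
            continue  # this data center already has enough copies
        if rack in racks_used.setdefault(dc, set()):
            skipped.append(node)  # same rack as an existing copy: keep as fallback
            continue
        chosen.append(node)
        racks_used[dc].add(rack)

    # If a data center ran out of distinct racks, fall back to same-rack nodes.
    for node in skipped:
        dc = snitch[node][0]
        if count_in(dc) < rf_per_dc.get(dc, 0):
            chosen.append(node)
    return chosen

# Hypothetical topology: node IP -> (data center, rack).
snitch = {
    "10.0.1.1": ("DC1", "r1"),
    "10.0.1.2": ("DC1", "r1"),
    "10.0.1.3": ("DC1", "r2"),
    "10.0.2.1": ("DC2", "r1"),
    "10.0.2.2": ("DC2", "r2"),
}
ring = ["10.0.1.1", "10.0.2.1", "10.0.1.2", "10.0.2.2", "10.0.1.3"]
```

With two copies per data center, `10.0.1.2` is passed over because it shares rack `r1` with the first replica, so the DC1 copies end up on different racks.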
35. Consistency Level
• How many replicas must respond in order to declare success
• W/N must succeed for write to succeed
  • write with client-generated timestamp
• R/N must succeed for read to succeed
  • return most recent, by timestamp
36. Consistency Level
• 1, 2, 3 responses
• Quorum (more than half)
• Quorum in local data center
• Quorum in each data center
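The arithmetic behind these levels is small enough to sketch. The helper names are hypothetical; the overlap rule W + R > N is not stated on the slide but is the standard condition under which a read is guaranteed to reach at least one replica that saw the latest successful write.

```python
def quorum(n):
    """More than half of n replicas."""
    return n // 2 + 1

def read_sees_latest_write(w, r, n):
    """True when any R replicas must overlap any W replicas, i.e. W + R > N."""
    return w + r > n

def each_dc_quorum(rf_per_dc):
    """Responses needed in every data center for a quorum in each one."""
    return {dc: quorum(rf) for dc, rf in rf_per_dc.items()}
```

With N = 3, writing and reading at quorum (W = R = 2) gives overlapping replica sets, while W = R = 1 does not.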
38-41. Read repair
• If the replicas disagree on read, send most recent data back
• read k? is sent to replicas n1, n2 and n3
• n1 returns (v, t1); n2 returns not found!; n3 returns (v’, t2)
• the most recent data is then written back: write (k, v’, t2)
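The read-repair example from the slides can be sketched as below. It assumes t2 is later than t1 and models each replica as a dictionary of key -> (value, timestamp); `read_with_repair` is a name invented for this illustration.

```python
def read_with_repair(replicas, key):
    # Ask every replica for the key.
    answers = {name: store.get(key) for name, store in replicas.items()}
    found = [a for a in answers.values() if a is not None]
    if not found:
        return None
    # The most recent answer, by timestamp, wins.
    newest = max(found, key=lambda a: a[1])
    # Send the most recent data back to any replica that disagreed.
    for name, answer in answers.items():
        if answer != newest:
            replicas[name][key] = newest
    return newest[0]

t1, t2 = 1, 2  # assumption: t2 is the later timestamp
replicas = {
    "n1": {"k": ("v", t1)},
    "n2": {},                 # not found!
    "n3": {"k": ("v'", t2)},
}
print(read_with_repair(replicas, "k"))
print(replicas)
```

After the read, n1 and n2 both hold (v', t2), so the next read finds the replicas in agreement.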
42. Hinted handoff
• When a node is unavailable
• Writes can be written to any node as a hint
• Delivered when the node comes back online
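A toy model of hinted handoff, assuming an in-memory cluster: the `Cluster` class, its method names and the choice of which live node holds the hint are all hypothetical and only illustrate the idea of storing a write for an unavailable node and replaying it later.

```python
class Cluster:
    def __init__(self, names):
        self.data = {name: {} for name in names}  # node -> {key: (value, ts)}
        self.up = set(names)
        self.hints = []  # (holder, target, key, value, ts)

    def node_down(self, name):
        self.up.discard(name)

    def write(self, key, value, ts, targets):
        for target in targets:
            if target in self.up:
                self.data[target][key] = (value, ts)
            elif self.up:
                # Target is unavailable: any live node keeps the write as a hint.
                holder = min(self.up)
                self.hints.append((holder, target, key, value, ts))

    def node_up(self, name):
        # The node is back online: deliver its hints, keeping newer data if any.
        self.up.add(name)
        remaining = []
        for holder, target, key, value, ts in self.hints:
            if target != name:
                remaining.append((holder, target, key, value, ts))
                continue
            old = self.data[target].get(key)
            if old is None or old[1] < ts:
                self.data[target][key] = (value, ts)
        self.hints = remaining

cluster = Cluster(["n1", "n2", "n3"])
cluster.node_down("n2")
cluster.write("k", "v", 1, ["n1", "n2", "n3"])
print(cluster.hints)        # one hint waiting for n2
cluster.node_up("n2")
print(cluster.data["n2"])   # the hinted write has been delivered
```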
43. Anti-entropy
• Equivalent to ‘read repair all’
• Requires reading all data (woah)
• (Although only hashes are sent to calculate diffs)
• Manual process
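The point that only hashes need to be exchanged to find the differences can be illustrated with a flat list of bucket hashes. This is a deliberate simplification for the example (a flat list rather than a hash tree), and the bucket count and function names are assumptions.

```python
import hashlib

def bucket_of(key, buckets):
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % buckets

def bucket_hashes(store, buckets):
    # Each replica reads all of its data and hashes it bucket by bucket.
    groups = [[] for _ in range(buckets)]
    for key in sorted(store):
        groups[bucket_of(key, buckets)].append((key, store[key]))
    return [hashlib.md5(repr(g).encode("utf-8")).hexdigest() for g in groups]

def repair(a, b, buckets=8):
    # Only the hashes are compared; data is exchanged just for buckets that differ.
    ha, hb = bucket_hashes(a, buckets), bucket_hashes(b, buckets)
    differing = [i for i in range(buckets) if ha[i] != hb[i]]
    for source, dest in ((a, b), (b, a)):
        for key, (value, ts) in list(source.items()):
            if bucket_of(key, buckets) in differing:
                old = dest.get(key)
                if old is None or old[1] < ts:
                    dest[key] = (value, ts)  # newest timestamp wins
    return differing

a = {"k1": ("v", 1), "k2": ("x", 1)}
b = {"k1": ("v", 1), "k2": ("y", 2), "k3": ("z", 1)}
print(repair(a, b))  # buckets that had to be synchronised
print(a == b)
```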
45. De-normalisation
• Disk space is much cheaper than disk seeks
• Read at 100 MB/s, seek at 100 IO/s
• => copy data to avoid seeks
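The figures above make the case with simple arithmetic. The sketch below uses the slide's 100 MB/s and 100 IO/s; the workload (1000 records of 1 KB each) is an assumption chosen for illustration.

```python
SEQ_BYTES_PER_S = 100 * 10**6  # sequential read: 100 MB/s
SEEKS_PER_S = 100              # random access: 100 IO/s, i.e. 10 ms per seek

def read_time(num_seeks, total_bytes):
    # Time in seconds = time spent seeking + time spent transferring.
    return num_seeks / SEEKS_PER_S + total_bytes / SEQ_BYTES_PER_S

records, record_size = 1000, 1000  # assumed workload: 1000 records of 1 KB

# Records scattered over the disk: one seek per record.
scattered = read_time(records, records * record_size)
# Records copied next to each other: one seek, then a sequential read.
contiguous = read_time(1, records * record_size)

print("scattered:  %.2f s" % scattered)
print("contiguous: %.2f s" % contiguous)
```

One seek costs as much time as reading about 1 MB sequentially, so under these assumptions the scattered layout takes about 10 seconds and the copied, contiguous layout about 0.02 seconds.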
47. Data-centric model
m1: {
sender: user1
content: “Mary had a little lamb”
recipients: user2, user3
}
• but how to do ‘recipients’ for Inbox?
• one-to-many modelled by a join table
48. To join
m1: {
sender: user1
subject: “A rhyme”
content: “Mary had a little lamb”
}
m2: {
sender: user1
subject: “colours”
content: “Its fleece was white as snow”
}
m3: {
sender: user1
subject: “loyalty”
content: “And everywhere that Mary went”
}

user2: {
m1: true
}
user3: {
m1: true
m2: true
}
user4: {
m2: true
m3: true
}
49. .. or not to join
• Joins are expensive, so de-normalise to trade
off space for time
• We can have lots of columns, so think BIG:
• Make message id a time-typed super-column.
• This makes get_slice an efficient way of
searching for messages in a time window
51. De-normalisation + Cassandra
• have to write a copy of the record for each
recipient ... but writes are very cheap
• get_slice fetches columns for a particular
row, so gets received messages for a user
• on-disk column order is optimal for this
query
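The de-normalised inbox can be modelled in memory to show why a time-window query becomes a single slice. This is a sketch of the data layout only: the `Inbox` class, the integer timestamps and the `get_slice` signature are assumptions for the example and do not reproduce the Cassandra client API.

```python
from bisect import bisect_left, bisect_right

class Inbox:
    def __init__(self):
        # One row per recipient; columns are kept sorted by message time.
        self.times = {}     # user -> sorted list of timestamps
        self.messages = {}  # user -> messages, in the same order

    def send(self, ts, sender, subject, content, recipients):
        # De-normalise: write a full copy of the message into each recipient's row.
        message = {"sender": sender, "subject": subject, "content": content}
        for user in recipients:
            times = self.times.setdefault(user, [])
            msgs = self.messages.setdefault(user, [])
            i = bisect_right(times, ts)
            times.insert(i, ts)
            msgs.insert(i, dict(message))

    def get_slice(self, user, start, finish):
        # Messages received by `user` with start <= time <= finish:
        # a contiguous run of columns within one row, no join needed.
        times = self.times.get(user, [])
        lo, hi = bisect_left(times, start), bisect_right(times, finish)
        return self.messages.get(user, [])[lo:hi]

inbox = Inbox()
inbox.send(1, "user1", "A rhyme", "Mary had a little lamb", ["user2", "user3"])
inbox.send(2, "user1", "colours", "Its fleece was white as snow", ["user3", "user4"])
inbox.send(3, "user1", "loyalty", "And everywhere that Mary went", ["user4"])
print([m["subject"] for m in inbox.get_slice("user3", 1, 2)])
```

Each message is stored once per recipient, which costs extra writes and space, but reading a user's messages for a time window touches only that user's row.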
53. What it’s good for
• Horizontal scalability
• No single point of failure
• Multi-data centre support
• Very high write workloads
• Tuneable consistency