● Distributed Database Management Systems Advantages and Disadvantages.
● Characteristics of Distributed Database Management Systems.
● Levels of Data and Process Distribution.
● Distributed Database Transparency Features.
● Transaction Performance and Failure Transparency.
DDBMS, characteristics, Centralized vs. Distributed Database, Homogeneous DDBMS, Heterogeneous DDBMS, Advantages, Disadvantages, What is parallel database, Data fragmentation, Replication, Distribution Transaction
2. Learning Objectives
In this chapter, the student will learn:
About distributed database management systems (DDBMSs) and their components
How database implementation is affected by different levels of data and process distribution
How transactions are managed in a distributed database environment
3. Learning Objectives
In this chapter, the student will learn:
How distributed database design draws on data partitioning and replication to balance performance, scalability, and availability
About the trade-offs of implementing a distributed data system
4. Distributed database
A set of databases in a distributed system that can appear to applications as a single data source.
[Figure: hierarchical arrangement of networked databases forming a homogeneous distributed database]
5. Important considerations
There are two principal approaches to storing a relation in a distributed database system:
Replication: Database replication is the frequent electronic copying of data from a database on one computer or server to a database on another, so that all users share the same level of information.
Fragmentation/Partitioning: Fragmentation is a database server feature that allows you to control, at the table level, where data is stored. Fragmentation enables you to define groups of rows or index keys within a table according to some algorithm or scheme; the server's system catalog records information about fragmented tables and indexes.
6. Distribution scheme for table fragmentation (1/2)
The following example includes a FRAGMENT BY EXPRESSION clause to create a fragmented table with an expression-based distribution scheme:
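The statement itself appears only as an image on the slide. A sketch in Informix-style syntax, with an illustrative table and hypothetical column and partition names, consistent with the layout described on the next slide (three fragments in dbs1; the remaining fragments, including the remainder, named explicitly in dbs2):

```sql
-- Illustrative Informix-style example; table, column, and partition
-- names are assumptions, not taken from the original slide.
CREATE TABLE account (
    acc_num  INTEGER,
    branch   CHAR(20)
)
FRAGMENT BY EXPRESSION
    PARTITION p1 (acc_num <  100)                   IN dbs1,
    PARTITION p2 (acc_num >= 100 AND acc_num < 200) IN dbs1,
    PARTITION p3 (acc_num >= 200 AND acc_num < 300) IN dbs1,
    PARTITION p4 (acc_num >= 300 AND acc_num < 400) IN dbs2,
    PARTITION p5 REMAINDER                          IN dbs2;
```

Each row is routed to the first partition whose expression it satisfies; rows matching no expression fall into the REMAINDER fragment.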
7. Distribution scheme for table fragmentation (2/2)
Here the first three fragments are stored in partitions of the dbs1 dbspace, and the other fragments, including the remainder, are stored in named fragments of the dbs2 dbspace. Explicit fragment names are required in this example, because each dbspace has multiple partitions.
8. How to Check Index Fragmentation on Indexes in a Database
The following is a simple query that will list every index on every table in your database, ordered by the percentage of index fragmentation.
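The query on the slide is an image. On Microsoft SQL Server (an assumption, since the slide does not name the DBMS), a common formulation uses the sys.dm_db_index_physical_stats dynamic management view:

```sql
-- List every index in the current database, most fragmented first.
SELECT OBJECT_NAME(ips.object_id)       AS table_name,
       i.name                           AS index_name,
       ips.avg_fragmentation_in_percent AS fragmentation_pct
FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) AS ips
JOIN   sys.indexes AS i
       ON  i.object_id = ips.object_id
       AND i.index_id  = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

A commonly cited rule of thumb is to reorganize indexes whose fragmentation exceeds roughly 5-30 percent and rebuild those above 30 percent.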
9. Global Name as a Loopback Database Link
You can use the global name of a database as a loopback database link without explicitly creating a database link. When the database link in a SQL statement matches the global name of the current database, the database link is effectively ignored.
For example, assume the global name of a database is db1.example.com. You can run the following SQL statement on this database:
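The statement is not reproduced on the slide; a minimal sketch (hr.employees is an illustrative table name, not from the original) would be:

```sql
-- db1.example.com matches the current database's global name, so the
-- "link" is treated as a loopback and the query resolves locally.
SELECT * FROM hr.employees@db1.example.com;
```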
10. SQL statements that create database links in a local database to the remote sales.us.americas.example_auto.com database
CREATE DATABASE LINK sales.us.americas.example_auto.com USING 'sales_us';
Connects to database: sales, using net service name sales_us
Connects as: connected user
Link type: private connected user
11. SQL statements that create database links in a local database to the remote database
CREATE DATABASE LINK foo CONNECT TO CURRENT_USER USING 'am_sls';
Connects to database: sales, using net service name am_sls
Connects as: current global user
Link type: private current user
12. SQL statements that create database links in a local database to the remote sales.us.americas.example_auto.com database
CREATE DATABASE LINK sales.us.americas.example_auto.com CONNECT TO SAAD IDENTIFIED BY password USING 'sales_us';
Connects to database: sales, using net service name sales_us
Connects as: SAAD, using password password
Link type: private fixed user
13. SQL statements that create database links in a local database to the remote sales.us.americas.example_auto.com database
CREATE PUBLIC DATABASE LINK sales CONNECT TO SULTAN IDENTIFIED BY password USING 'rev';
Connects to database: sales, using net service name rev
Connects as: SULTAN, using password password
Link type: public fixed user
14. SQL statements that create database links in a local database to the remote sales.us.americas.example_auto.com database
CREATE SHARED PUBLIC DATABASE LINK sales.us.americas.example_auto.com CONNECT TO WALEED IDENTIFIED BY password AUTHENTICATED BY USMAN IDENTIFIED BY password1 USING 'sales';
Connects to database: sales, using net service name sales
Connects as: WALEED, using password password, authenticated as USMAN using password password1
Link type: shared public fixed user
15. Distributed processing
The operations that occur when an application distributes its tasks among different computers in a network.
For example, a database application typically distributes front-end presentation tasks to client computers and allows a back-end database server to manage shared access to a database. Consequently, a distributed database application processing system is more commonly referred to as a client/server database application system.
16. Evolution of Database Management Systems
Distributed database management system (DDBMS): governs the storage and processing of logically related data over interconnected computer systems, in which data and processing functions are distributed among several sites.
Centralized database management system: required that corporate data be stored in a single central site, with data access provided through dumb terminals.
19. Naming of Schema Objects Using Database Links
Oracle Database uses the global database name to name schema objects globally. Global object names take the following form:
schema.schema_object@global_database_name
For example, using a database link to database sales.division3.example.com, a user or application can reference remote data as follows:
SELECT * FROM scott.emp@sales.division3.example.com; -- emp table in scott's schema
SELECT loc FROM scott.dept@sales.division3.example.com;
20. For example, assume that you connect to the local database as user SYSTEM:
CONNECT SYSTEM@sales1
You then issue the following statements using database link hq.example.com to access objects in the scott and jane schemas on remote database hq:
SELECT * FROM scott.emp@hq.example.com;
INSERT INTO jane.accounts@hq.example.com (acc_no, acc_name, balance) VALUES (5001, 'BOWER', 2000);
UPDATE jane.accounts@hq.example.com SET balance = balance + 500;
DELETE FROM jane.accounts@hq.example.com WHERE acc_name = 'BOWER';
21. Figure 12.1 - Centralized Database Management System
22. Factors Affecting Centralized Database Systems
Globalization of business operations
Advancement of web-based services
Rapid growth of social and network technologies
Digitization resulting in multiple types of data: structured, unstructured, and semi-structured data; time-stamped data, etc.
Innovative business intelligence through analysis of data
23. An Oracle Distributed Database System
A client can connect directly or indirectly to a database server. A direct connection occurs when a client connects to a server and accesses information from a database contained on that server.
25. Rules for a DDBMS
To the user, a distributed system should look exactly like a nondistributed system.
1. Local Autonomy
2. No Reliance on a Central Site
3. Continuous Operation
4. Location Independence
5. Fragmentation Independence
6. Replication Independence
7. Distributed Query Processing
8. Distributed Transaction Processing
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence
The last four rules are ideals.
28. Remote SQL Statements
A remote update statement is an update that
modifies data in one or more tables, all of which are
located at the same remote node.
For example, the following statement updates the dept
table in the scott schema of the remote sales database:
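A minimal sketch of such a remote update, assuming the dept table and the sales database link used in the earlier examples (the specific column values are illustrative):

```sql
-- A remote update: all modified tables reside at the same remote node
UPDATE scott.dept@sales.division3.example.com
SET loc = 'BOSTON'
WHERE deptno = 10;
```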
29. Distributed SQL Statements
A distributed SQL statement either queries or
modifies data on two or more nodes.
A distributed query statement retrieves information
from two or more nodes.
For example, the following query accesses data from the
local database as well as the remote sales database:
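A sketch of such a distributed query, assuming a local dept table and the sales database link from the earlier examples:

```sql
-- A distributed query: one statement touches the local node and a remote node
SELECT e.ename, d.dname
FROM scott.emp@sales.division3.example.com e  -- remote sales database
JOIN dept d ON e.deptno = d.deptno;           -- local database
```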
30. Distributed UPDATE Statement
A distributed update statement modifies data on two or
more nodes. A distributed update is possible using a PL/SQL
subprogram unit, such as a procedure or trigger, that includes
two or more remote updates accessing data on different
nodes.
For example, the following PL/SQL program unit updates
tables on the local database and the remote sales database:
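A sketch of such a PL/SQL unit, assuming local and remote dept tables as in the earlier examples:

```sql
BEGIN
  UPDATE scott.dept
  SET loc = 'CHICAGO' WHERE deptno = 10;         -- local node
  UPDATE scott.dept@sales.division3.example.com
  SET loc = 'CHICAGO' WHERE deptno = 10;         -- remote node
END;
/
```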
31. Factors That Helped DDBMSs Cope
With Technological Advancement
Acceptance of the Internet as a platform for business
Mobile wireless revolution
Use of applications as a service
Focus on mobile business intelligence
32. Desirability of Distributed DBMS
Over Centralized DBMS
Shortcomings of centralized DBMSs that make data
distribution desirable:
Performance degradation
High costs
Reliability problems
Scalability problems
Organizational rigidity
33. Advantages and Disadvantages of
DDBMS
Advantages
• Data are located near the site of greatest demand
• Faster data access and processing
• Growth facilitation
• Improved communications
• Reduced operating costs
• User-friendly interface
• Less danger of a single-point failure
• Processor independence
Disadvantages
• Complexity of management and control
• Technological difficulty
• Security
• Lack of standards
• Increased storage and infrastructure requirements
• Increased training costs
• Costs incurred due to the requirement of duplicated infrastructure
34. Characteristics of Distributed Database
Management Systems
Application interface
Validation
Transformation
Query optimization
Mapping
I/O interface
Formatting
Security
Backup and recovery
DB administration
Concurrency control
Transaction management
35. Functions of a Distributed DBMS
Receives a request from an application
Validates, analyzes, and decomposes the request
Maps the request
Decomposes the request into several I/O operations
Searches for and validates data
Ensures consistency, security, and integrity
Validates data for specific conditions
Presents data in the required format
36. Figure 12.4 - A Fully Distributed Database
Management System
37. DDBMS Components
Computer workstations or remote devices
Network hardware and software components
Communications media
Transaction processor (TP): software component of a
system that requests data
Also known as transaction manager (TM) or application
processor (AP)
Data processor (DP): software component on a system
that stores and retrieves data from its location
Also known as data manager (DM)
38. Single-Site Processing, Single-Site
Data (SPSD)
Processing is done on a single host computer
Data are stored on the host computer's local disk
Processing is restricted on the end user's side
The DBMS is accessed by dumb terminals
39. Multiple-Site Processing, Single-Site
Data (MPSD)
Multiple processes run on different computers
sharing a single data repository
Requires a network file server running
conventional applications
Accessed through a LAN
Client/server architecture
Reduces network traffic
Processing is distributed
Supports data at multiple sites
40. Figure 12.7 - Multiple-Site Processing,
Single-Site Data
41. Multiple-Site Processing, Multiple-Site
Data (MPMD)
Fully distributed database management system
Supports multiple data processors and transaction
processors at multiple sites
DDBMSs are classified by the level of support for
various types of databases:
Homogeneous: integrate multiple instances of the same
DBMS over a network
Heterogeneous: integrate different types of DBMSs (e.g.,
object-oriented, document, and relational databases)
Fully heterogeneous: support different DBMSs, each
supporting a different data model (e.g., entity-relationship
model, network model)
42. Restrictions of a DDBMS
Remote access is provided on a read-only basis
Restrictions on the number of remote tables that may
be accessed in a single transaction
Restrictions on the number of distinct databases that
may be accessed
Restrictions on the database model that may be
accessed
43. Distributed Database Transparency
Features (cont.)
Distribution transparency
Transaction transparency
Failure transparency
Performance transparency
Heterogeneity transparency
44. Distribution Transparency
Allows management of a physically dispersed database
as if it were centralized
Levels
Fragmentation transparency
Location transparency
Local mapping transparency
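The three levels can be sketched with the same logical query, assuming a hypothetical EMPLOYEE table horizontally fragmented into E1 (stored at site 1) and E2 (stored at site 2); all names are illustrative:

```sql
-- Fragmentation transparency: user references only the logical table
SELECT * FROM employee WHERE emp_dob < DATE '1980-01-01';

-- Location transparency: user must name the fragments, but not their sites
SELECT * FROM e1 WHERE emp_dob < DATE '1980-01-01'
UNION ALL
SELECT * FROM e2 WHERE emp_dob < DATE '1980-01-01';

-- Local mapping transparency: user must name both fragments and sites
SELECT * FROM e1@site1 WHERE emp_dob < DATE '1980-01-01'
UNION ALL
SELECT * FROM e2@site2 WHERE emp_dob < DATE '1980-01-01';
```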
45. Distribution Transparency
Unique fragment: each row is unique, regardless of
the fragment in which it is located
Supported by a distributed data dictionary (DDD) or
distributed data catalog (DDC)
The DDC contains the description of the entire database
as seen by the database administrator
Distributed global schema: common database
schema used to translate user requests into subqueries
46. Transaction Transparency
Ensures database transactions will maintain the
distributed database's integrity and consistency
Ensures a transaction is completed only when all
database sites involved complete their part
Distributed database systems require complex
mechanisms to manage transactions
47. Distributed Requests and Distributed
Transactions
Remote request: a single SQL statement accesses data
processed by a single remote database processor
Remote transaction: accesses data at a single remote
site; composed of several requests
Distributed transaction: requests data from several
different remote sites on the network
Distributed request: a single SQL statement references
data at several DP sites
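A sketch of a distributed request, assuming a hypothetical customer table stored at site B and an invoice table at site C (table, column, and link names are illustrative):

```sql
-- A distributed request: one SQL statement references data at two DP sites
SELECT c.cus_num, c.cus_name, i.inv_total
FROM customer@siteb c
JOIN invoice@sitec i ON i.cus_num = c.cus_num
WHERE i.inv_total > 1000;
```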
48. Distributed Concurrency Control
Concurrency control is especially important in a
distributed database environment
Multi-site, multiple-process operations can create
data inconsistencies and deadlocked transactions
50. Two-Phase Commit Protocol (2PC)
Guarantees that if a portion of a transaction operation
cannot be committed, all changes made at the other
sites will be undone
Maintains a consistent database state
Requires that each DP's transaction log entry be
written before the database fragment is updated
DO-UNDO-REDO protocol: rolls transactions back
and forward with the help of the system's transaction
log entries
51. Two-Phase Commit Protocol (2PC)
Write-ahead protocol: forces the log entry to be
written to permanent storage before the actual
operation takes place
Defines the operations between the coordinator and
its subordinates
Phases of implementation
Preparation
The final COMMIT
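In Oracle, 2PC is transparent to the user: any transaction that modifies data on two or more nodes is ended with an ordinary COMMIT, and the database coordinates the prepare and commit phases automatically. A minimal sketch, assuming the dept table and the sales database link from the earlier examples:

```sql
UPDATE dept SET loc = 'DALLAS' WHERE deptno = 10;   -- local node
UPDATE scott.dept@sales.division3.example.com
SET loc = 'DALLAS' WHERE deptno = 10;               -- remote node
COMMIT;  -- triggers two-phase commit across both nodes
```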
52. Performance and Failure Transparency
Performance transparency: allows a DDBMS to
perform as if it were a centralized database
Failure transparency: ensures the system will
continue to operate in the event of a network failure
Considerations for resolving requests in a distributed
data environment:
Data distribution
Data replication
Replica transparency: the DDBMS's ability to hide the
existence of multiple copies of data from the user
53. Performance and Failure Transparency
Network and node availability
Network latency: delay imposed by the amount of time
required for a data packet to make a round trip
Network partitioning: delay imposed when nodes
become suddenly unavailable due to a network failure
54. Distributed Database Design
Data fragmentation: how to partition the database
into fragments
Data replication: which fragments to replicate
Data allocation: where to locate those fragments
and replicas
55. Data Fragmentation
Breaks a single object into many segments
Fragmentation information is stored in the distributed
data catalog (DDC)
Strategies
Horizontal fragmentation: division of a relation into
subsets (fragments) of tuples (rows)
Vertical fragmentation: division of a relation into
attribute (column) subsets
Mixed fragmentation: combination of horizontal and
vertical strategies
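The two basic strategies can be sketched on a hypothetical customer table (names and predicates are illustrative):

```sql
-- Horizontal fragmentation: split rows by a predicate (here, by state)
CREATE TABLE customer_east AS
SELECT * FROM customer WHERE cus_state IN ('NY', 'NJ');
CREATE TABLE customer_west AS
SELECT * FROM customer WHERE cus_state IN ('CA', 'WA');

-- Vertical fragmentation: split columns; each fragment keeps the key
CREATE TABLE customer_ids AS
SELECT cus_num, cus_name FROM customer;
CREATE TABLE customer_finance AS
SELECT cus_num, cus_credit_limit, cus_balance FROM customer;

-- The original relation is recoverable by UNION ALL (or a join on the key)
CREATE VIEW customer_all AS
SELECT * FROM customer_east UNION ALL SELECT * FROM customer_west;
```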
56. Data Replication
Data copies are stored at multiple sites served by a
computer network
Mutual consistency rule: replicated data fragments
should be identical
Styles of replication
Push replication
Pull replication
Replication helps restore lost data
Supported databases: IBM Db2, Microsoft SQL Server, MongoDB, Oracle,
PostgreSQL
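Pull-style replication can be sketched with an Oracle materialized view that periodically refreshes a local copy from a remote master (the view name, link, and refresh interval are illustrative):

```sql
CREATE MATERIALIZED VIEW dept_copy
REFRESH COMPLETE
START WITH SYSDATE NEXT SYSDATE + 1/24  -- re-pull the copy every hour
AS SELECT * FROM scott.dept@sales.division3.example.com;
```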
57. Types of Data Replication [1/3]
Transactional replication: users receive a full initial
copy of the database and then receive updates as the
data changes.
Data is copied in real time from the publisher to the
receiving database (the subscriber) in the same order
in which the changes occur at the publisher; therefore,
transactional consistency is guaranteed.
Transactional replication is typically used in server-to-
server environments.
It does not simply copy the data changes; it
consistently and accurately replicates each change.
58. Types of Data Replication [2/3]
Snapshot replication: distributes data exactly as it
appears at a specific moment in time and does not
monitor for updates to the data. The entire snapshot
is generated and sent to subscribers. Snapshot
replication is generally used when data changes are
infrequent.
It is a bit slower than transactional replication because
each attempt moves multiple records from one end to
the other.
Snapshot replication is a good way to perform the initial
synchronization between the publisher and the subscriber.
59. Types of Data Replication [3/3]
Merge replication: data from two or more databases
is combined into a single database.
Merge replication is the most complex type of
replication because it allows both the publisher and
the subscriber to make changes to the database
independently.
Merge replication is typically used in server-to-client
environments. It allows changes to be sent from one
publisher to multiple subscribers.
63. Data Replication Scenarios
Fully replicated database: stores multiple copies of
each database fragment at multiple sites
Partially replicated database: stores multiple copies
of some database fragments at multiple sites
Unreplicated database: stores each database
fragment at a single site
64. Data Allocation Strategies
Centralized data allocation: the entire database is
stored at one site
Partitioned data allocation: the database is divided
into two or more disjoint fragments and stored at two
or more sites
Replicated data allocation: copies of one or more
database fragments are stored at several sites
65. The CAP Theorem
CAP stands for:
Consistency: every read receives the most recent write or an
error
Availability: every request receives a (non-error) response,
without the guarantee that it contains the most recent write
Partition tolerance: the system continues to operate despite
an arbitrary number of messages being dropped (or delayed)
by the network between nodes
Basically available, soft state, eventually consistent
(BASE)
Data changes are not immediate but propagate slowly
through the system until all replicas are eventually consistent
67. Key Assumptions of the Hadoop
Distributed File System
High volume
Write-once, read-many
Streaming access
Move computations to the data
Fault tolerance
73. C. J. Date’s Twelve Commandments
for Distributed Databases
Local site independence
Central site independence
Failure independence
Location transparency
Fragmentation transparency
Replication transparency
74. C. J. Date’s Twelve Commandments
for Distributed Databases
Distributed query processing
Distributed transaction processing
Hardware independence
Operating system independence
Network independence
Database independence
Editor's Notes
A homogeneous distributed database has identical software and hardware running all databases instances, and may appear through a single interface as if it were a single database.
A heterogeneous distributed database may have different hardware, operating systems, database management systems, and even data models for different databases.
Consider fragmenting your tables if improving at least one of the following is your goal:
Single-user response time
Concurrency
Availability
Backup-and-restore characteristics
Loading of data
In an expression-based distribution scheme, each fragment expression in a rule specifies a storage space. Each fragment expression in the rule isolates data and aids the database server in searching for rows.
SELECT * FROM hr.employees@db1.example.com;
CREATE PUBLIC DATABASE LINK sales.division3.example.com USING 'sales1';
‘foo’ is a table alias/identifier for the derived query
Distributed database management system (DDBMS): for example, the data input/output (I/O), data selection, and data validation might be performed on one computer, and a report based on that data might be created on another computer.
Autonomous − Each database is independent that functions on its own. They are integrated by a controlling application and use message passing to share data updates.
Non-autonomous − Data is distributed across the homogeneous nodes and a central or master DBMS co-ordinates data updates across the sites.
-----------
A heterogeneous distributed database may have different hardware, operating systems, database management systems, and even data models for different databases.
Federated − The heterogeneous database systems are independent in nature and are integrated together so that they function as a single database system.
Un-federated − The database systems employ a central coordinating module through which the databases are accessed.
A schema is a collection of logical structures of data, or schema objects. A schema is owned by a database user and has the same name as that user. Each user owns a single schema.
A schema_object is a logical data structure such as a table, index, view, synonym, procedure, package, or database link.
A global_database_name is the name that uniquely identifies a remote database. This name must be the same as the concatenation of the remote database initialization parameters DB_NAME and DB_DOMAIN, unless the parameter GLOBAL_NAMES is set to FALSE, in which case any name is acceptable.
A procedure (often called a stored procedure) is a subroutine, like a subprogram in a regular programming language, that is stored in the database.
SQL Server triggers are special stored procedures that are executed automatically in response to database object, database, and server events.
Data Replication is the process of storing data in more than one site or node. It is useful in improving the availability of data. It is simply copying data from a database from one server to another server so that all the users can share the same data without any inconsistency. The result is a distributed database in which users can access data relevant to their tasks without interfering with the work of others.
Atomicity − This property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. There must be no state in a database where a transaction is left partially completed. States should be defined either before the execution of the transaction or after the execution/abortion/failure of the transaction.
Consistency − The database must remain in a consistent state after any transaction. No transaction should have any adverse effect on the data residing in the database. If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates even if the system fails or restarts. If a transaction updates a chunk of data in a database and commits, then the database will hold the modified data. If a transaction commits but the system fails before the data could be written on to the disk, then that data will be updated once the system springs back into action.
Isolation − In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that each transaction will be carried out and executed as if it were the only transaction in the system. No transaction will affect the existence of any other transaction.
At Uber, HDFS was designed as a scalable distributed file system to support thousands of nodes within a single cluster. With enough hardware, scaling to over 100 petabytes of raw storage capacity in one cluster can be easily—and quickly—achieved.