5. Data is raw, unprocessed information,
for example a name, a class, or marks.
In computing, data is information that has
been translated into a form that is efficient
to move and process. In this sense, data is
information in an interchangeable form.
6. A database is an organized collection of data, also called
structured data. It is stored in and accessed through a
computer system, and it is managed by a Database
Management System (DBMS), software designed to manage
data. In short, a database holds related data in a
structured form.
Applications: company information, account information,
manufacturing, banking, financial transactions,
telecommunications.
7. In a database, data is organized into tables consisting
of rows and columns, and it is indexed so that data can be
updated, expanded, and deleted easily. Computer
databases typically hold records such as transfers of
money from one bank account to another, sales and
customer details, student fee details, and product
details. There are different kinds of databases, ranging
from the most prevalent approach, the relational
database, to distributed databases, cloud databases,
and NoSQL databases.
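As a minimal sketch of the table, row, column, and index ideas above, the following uses Python's built-in sqlite3 module. The students table, its columns, and the sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, class TEXT, fee_paid REAL)"
)
# An index on a frequently searched column helps the database locate rows.
conn.execute("CREATE INDEX idx_students_class ON students (class)")

# Each tuple becomes one row; each position maps to one column.
conn.executemany(
    "INSERT INTO students (id, name, class, fee_paid) VALUES (?, ?, ?, ?)",
    [(1, "Asha", "10A", 500.0), (2, "Ravi", "10B", 0.0), (3, "Meena", "10A", 250.0)],
)
# Update one record and delete another.
conn.execute("UPDATE students SET fee_paid = 500.0 WHERE id = 3")
conn.execute("DELETE FROM students WHERE id = 2")

rows = conn.execute(
    "SELECT id, name, fee_paid FROM students WHERE class = '10A' ORDER BY id"
).fetchall()
print(rows)
```

The same statements would look much the same on most relational systems; SQLite is used here only because it needs no server.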
8. Databases can mainly be classified as:
Relational Database: A relational database is made up of a set of tables
with data that fits into a predefined category.
Distributed Database: A distributed database is a database in which
portions of the database are stored in multiple physical locations, and in
which processing is dispersed or replicated among different points in a
network.
Cloud Database: A cloud database is a database that typically runs on a
cloud computing platform. A database service provides access to the
database and makes the underlying software stack transparent to the
user.
Advances in technology have led to new applications of database
systems. New media technology has made it possible to store images and
video clips, and these capabilities have given rise to multimedia databases.
9. A transactional database is a type of database designed to support transactions,
which are sets of operations that must be executed together as a single unit. It
ensures data consistency, integrity, and isolation, following the ACID properties
(Atomicity, Consistency, Isolation, Durability).
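Atomicity can be illustrated with a small sketch, assuming a hypothetical accounts table in SQLite. The transfer function and the CHECK constraint are illustrative choices, not part of the original slides: either both updates are committed or neither is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances (an assumption for this sketch).
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30.0)       # succeeds
failed = transfer(conn, 1, 2, 500.0)  # debit violates the CHECK; credit is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts").fetchall())
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been executed when the debit is rejected, so the unchanged balance of account 2 shows the rollback at work.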
A spatial database is a specialized database designed to store and query data with
spatial or geographic information, such as maps, coordinates, or spatial
relationships. It enables the efficient storage and retrieval of location-based data
for applications like mapping, navigation, and geographic analysis.
A heterogeneous database is a database system that supports the storage and
management of diverse data types and formats, allowing different data
structures and schemas to coexist within the same database environment.
10. A text database is a collection of data stored in a structured format primarily
consisting of text, organized for efficient retrieval and manipulation.
Engineering Design Data
Hypertext Data
Web Data, etc.
Data streams are continuous flows of data that are generated over time
from various sources such as sensors, applications, or devices. They are often
unbounded and can include real-time information, such as stock prices, social
media updates, or sensor readings. Data streams are typically processed and
analyzed in real-time or near real-time to extract insights or respond to events as
they occur.
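A rough sketch of processing a stream as it arrives, assuming a hypothetical sensor source: the generator below keeps only a small window of recent readings instead of storing the whole (potentially unbounded) stream.

```python
from collections import deque
from typing import Iterable, Iterator

def moving_average(stream: Iterable[float], window: int) -> Iterator[float]:
    """Yield the average of the last `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # older readings fall out automatically
    for reading in stream:
        recent.append(reading)
        yield sum(recent) / len(recent)

def sensor_readings() -> Iterator[float]:
    """Hypothetical source; a real stream would not end."""
    yield from [20.0, 22.0, 24.0, 30.0]

averages = list(moving_average(sensor_readings(), window=2))
print(averages)
```

Because the function is a generator, each result is available as soon as its reading arrives, which is the near-real-time behaviour described above.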
11. A data warehouse is a relational database management system (RDBMS)
built for query and analysis rather than for transaction processing. It can
be loosely described as any centralized data repository that can be
queried for business benefit. It is a database that stores information
oriented toward satisfying decision-making requests. Data warehousing is a
group of decision-support technologies aimed at enabling the knowledge
worker (executive, manager, and analyst) to make better decisions. It
provides architectures and tools for business executives to
systematically organize, understand, and use their information to make
strategic decisions.
A data warehouse environment contains an extraction, transformation, and
loading (ETL) solution, an online analytical processing (OLAP) engine,
customer analysis tools, and other applications that handle the process of
gathering information and delivering it to business users.
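The ETL step can be sketched in a few lines. This is only an illustration under stated assumptions: the source records, the transform rules, and the sales table are all hypothetical, and real ETL tools do far more (scheduling, logging, incremental loads).

```python
import sqlite3

# Extract: hypothetical raw records pulled from an operational source.
source_rows = [
    {"customer": " alice ", "amount": "120.50", "date": "2024-01-05"},
    {"customer": "BOB", "amount": "80", "date": "2024-01-06"},
    {"customer": "alice", "amount": "not-a-number", "date": "2024-01-07"},
]

def transform(row):
    """Clean one record; return None if the amount cannot be parsed."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return (row["customer"].strip().title(), amount, row["date"])

# Load: write the cleaned rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT)")
clean = [t for t in map(transform, source_rows) if t is not None]
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)
warehouse.commit()

loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY sale_date"
).fetchall()
print(loaded)
```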
12. The multidimensional data model is a method for organizing data in a
database so that its contents are well arranged and assembled.
OLAP (online analytical processing) and data warehousing use
multidimensional databases, which show multiple dimensions of the data
to users.
The model represents data in the form of data cubes. Data cubes allow data
to be modeled and viewed from many dimensions and perspectives. A cube is
defined by dimensions and facts and is represented by a fact table. Facts
are numerical measures, and the fact table contains the names of the facts,
or measures, together with keys to each of the related dimension tables.
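To make the cube idea concrete, here is a small sketch in plain Python. The fact rows, the three dimensions (time, item, location), and the roll_up helper are assumptions made for illustration; the point is that one set of facts can be summed along any chosen combination of dimensions.

```python
from collections import defaultdict

# Hypothetical fact rows: (time, item, location, units_sold).
facts = [
    ("Q1", "phone", "Delhi", 10),
    ("Q1", "laptop", "Delhi", 4),
    ("Q2", "phone", "Mumbai", 7),
    ("Q2", "phone", "Delhi", 3),
]
# Position of each dimension inside a fact row.
DIMENSIONS = {"time": 0, "item": 1, "location": 2}

def roll_up(facts, dims):
    """Sum the measure over every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]  # units_sold is the numerical measure
    return dict(totals)

by_item = roll_up(facts, ["item"])
by_time_location = roll_up(facts, ["time", "location"])
print(by_item)
print(by_time_location)
```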
13.
14.
15. Subject-Oriented: A data warehouse focuses on the modelling and analysis of
data for decision-makers. Therefore, data warehouses typically provide a
concise and straightforward view of a particular subject, such as customer,
product, or sales, instead of the organization's ongoing operations as a whole.
This is done by excluding data that is not useful for the subject and including
all data needed by the users to understand the subject.
16. Integrated: A data warehouse integrates various heterogeneous data sources
such as RDBMSs, flat files, and online transaction records. Data cleaning and
integration must be performed during data warehousing to ensure consistency in
naming conventions, attribute types, etc., among the different data sources.
17. Time-Variant: Historical information is kept in a data warehouse. For example,
one can retrieve data from 3 months, 6 months, 12 months, or even earlier
from a data warehouse. This contrasts with a transaction system, where
often only the most current data is kept.
18. Non-Volatile: The data warehouse is a physically separate data store, whose
contents are transformed from the source operational RDBMS. Operational updates
of data do not occur in the data warehouse, i.e., update, insert, and delete
operations are not performed. It usually requires only two procedures in data
accessing: initial loading of data and access to data. Therefore, the DW does
not require transaction processing, recovery, and concurrency capabilities,
which allows for a substantial speedup of data retrieval. Non-volatile means
that once data has entered the warehouse, it should not change.
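The time-variant and non-volatile properties can be contrasted with an operational table in a short sketch. The table names (op_price, dw_price_history) and the record_price helper are hypothetical: the operational side overwrites the current value, while the warehouse side only appends dated snapshots.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Operational table: one row per product, holding only the current price.
db.execute("CREATE TABLE op_price (product TEXT PRIMARY KEY, price REAL)")
# Warehouse table: one row per snapshot, never updated or deleted.
db.execute("CREATE TABLE dw_price_history (product TEXT, price REAL, snapshot_date TEXT)")

def record_price(product, price, date):
    # The operational system replaces the old value.
    db.execute("INSERT OR REPLACE INTO op_price VALUES (?, ?)", (product, price))
    # The warehouse load only appends a new dated row.
    db.execute("INSERT INTO dw_price_history VALUES (?, ?, ?)", (product, price, date))

record_price("pen", 10.0, "2024-01-01")
record_price("pen", 12.0, "2024-04-01")

current = db.execute("SELECT price FROM op_price WHERE product = 'pen'").fetchall()
history = db.execute(
    "SELECT snapshot_date, price FROM dw_price_history ORDER BY snapshot_date"
).fetchall()
print(current)
print(history)
```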
19. o Support reporting as well as analysis
o Maintain the organization's historical information
o Be the foundation for decision making
20.
21.
22. OLAP (On-Line Analytical Processing) is characterized by a relatively low
volume of transactions. Queries are often very complex and involve
aggregations. For OLAP operations, response time is the measure of
effectiveness. OLAP applications are widely used by data mining
techniques. An OLAP database holds aggregated, historical information,
stored in multi-dimensional schemas (generally a star schema).
Any type of data warehouse system is an OLAP system. Examples of OLAP
systems in use:
o Spotify analyzes the songs users play to build a personalized
homepage of songs and playlists.
o Netflix's movie recommendation system.
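The kind of aggregating query OLAP relies on might look like the sketch below. The plays table and its data are invented for illustration and are not taken from any real system; the query is read-only, touches many rows, and summarizes them by two dimensions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical listening history; names and values are illustrative.
db.execute("CREATE TABLE plays (listener TEXT, genre TEXT, year INTEGER, minutes REAL)")
db.executemany(
    "INSERT INTO plays VALUES (?, ?, ?, ?)",
    [
        ("u1", "rock", 2023, 30.0),
        ("u1", "jazz", 2023, 10.0),
        ("u2", "rock", 2023, 20.0),
        ("u2", "rock", 2024, 40.0),
    ],
)

# OLAP-style query: aggregate total minutes and distinct listeners per genre and year.
report = db.execute(
    "SELECT genre, year, SUM(minutes), COUNT(DISTINCT listener) "
    "FROM plays GROUP BY genre, year ORDER BY genre, year"
).fetchall()
print(report)
```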
23. OLTP (On-Line Transaction Processing) is characterized by a large number of
short on-line transactions (INSERT, UPDATE, and DELETE). The main emphasis of
OLTP operations is on very fast query processing, maintaining data integrity
in multi-access environments, and effectiveness measured by the number of
transactions per second. An OLTP database holds accurate and current data,
and the schema used to store transactional data is the entity model
(usually 3NF).
An example of an OLTP system is an ATM center: the person who authenticates
first receives the money first, on the condition that the amount to be
withdrawn is available in the ATM. Examples of OLTP systems in use:
o An ATM center is an OLTP application.
o OLTP handles the ACID properties during data transactions via the application.
o It is also used for online banking, online airline ticket booking, sending a
text message, and adding a book to a shopping cart.
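The ATM example can be sketched as one short transaction. The atm and accounts tables, the withdraw function, and the amounts are assumptions for illustration. Requests are handled in the order they arrive, and a withdrawal goes through only if the ATM still holds enough cash.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE atm (id INTEGER PRIMARY KEY, cash REAL)")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
db.execute("INSERT INTO atm VALUES (1, 1000.0)")
db.execute("INSERT INTO accounts VALUES (7, 1500.0)")
db.commit()

def withdraw(account_id, amount, atm_id=1):
    """Dispense cash only if both the ATM and the account can cover it."""
    with db:  # one short transaction: commit on success, roll back on error
        (cash,) = db.execute("SELECT cash FROM atm WHERE id = ?", (atm_id,)).fetchone()
        (balance,) = db.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if amount > cash or amount > balance:
            return False  # nothing was changed
        db.execute("UPDATE atm SET cash = cash - ? WHERE id = ?", (amount, atm_id))
        db.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, account_id)
        )
        return True

first = withdraw(7, 800.0)   # ATM has 1000, so this succeeds
second = withdraw(7, 300.0)  # only 200 left in the ATM, so this is refused
remaining_cash = db.execute("SELECT cash FROM atm WHERE id = 1").fetchone()[0]
print(first, second, remaining_cash)
```

A production system would also need locking between concurrent sessions; this single-connection sketch leaves that out.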
24.
25. Feature | OLAP | OLTP
Purpose | Analytical processing for decision-making | Transaction processing for day-to-day operations
Data Type | Historical and aggregated data | Current and detailed data
Database Design | Star or snowflake schema | Normalized schema
Queries | Complex and involves aggregations | Simple, primarily CRUD operations
Response Time | Longer response time due to complex queries | Shorter response time for quick transactions
Concurrency | Lower concurrency as it deals with read-heavy operations | Higher concurrency for frequent read and write operations
Volume of Data | Handles large volumes of historical data | Handles current, frequently updated data
Data Modifications | Rarely updated or modified | Frequently updated and modified
Data Integrity | Emphasizes data accuracy and consistency | Emphasizes data consistency and integrity
Examples | Data warehouses, business intelligence systems | ERP systems, CRM systems
26.
27.
28. A star schema is the elementary form of a dimensional model, in which data is
organized into facts and dimensions. A fact is an event that is counted or measured,
such as a sale or a login. A dimension contains reference data about the fact, such as
date, item, or customer.
A star schema is a relational schema whose design represents a multidimensional
data model. The star schema is the simplest data warehouse schema. It is known as a
star schema because its entity-relationship diagram resembles a star, with points
radiating from a central table. The center of the schema consists of a large fact
table, and the points of the star are the dimension tables.
The simple structure of the star schema allows for fast query response times and
efficient use of database resources. Additionally, the star schema can be easily
extended by adding new dimension tables or measures to the fact table, making it a
scalable and flexible solution for data warehousing.
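A minimal star schema can be sketched in SQLite as follows. The table names (fact_sales, dim_date, dim_item), columns, and rows are hypothetical; the shape is what matters: one central fact table joined directly to each dimension table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Central fact table: foreign keys to each dimension plus a numeric measure.
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    item_id INTEGER REFERENCES dim_item(item_id),
    amount REAL,
    PRIMARY KEY (date_id, item_id)
);
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_item VALUES (10, 'pen', 'stationery'), (11, 'lamp', 'home');
INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 11, 20.0), (2, 10, 7.5);
""")

# A typical star query: join the fact table to its dimensions, then aggregate.
totals = db.execute("""
    SELECT i.category, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    JOIN dim_item AS i ON i.item_id = f.item_id
    GROUP BY i.category, d.month
    ORDER BY i.category, d.month
""").fetchall()
print(totals)
```

Adding a new dimension would mean one more dimension table and one more foreign key column in the fact table, which is the extensibility noted above.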
29.
30. A fact table is a table in a star schema that contains facts and is
connected to dimensions. A fact table has two types of columns: those that
hold facts and those that are foreign keys to the dimension tables. The
primary key of a fact table is generally a composite key made up of all of
its foreign keys.
A dimension is a structure usually composed of one or more hierarchies that
categorize data. If a dimension has no hierarchies and levels, it is called
a flat dimension or list. The primary key of each dimension table is part
of the composite primary key of the fact table.
Dimensional attributes help to describe the dimensional value. They are
generally descriptive, textual values. Dimension tables are usually smaller
than the fact table.
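The composite key of a fact table can be demonstrated with a small sketch, assuming hypothetical fact_orders, dim_customer, and dim_product tables. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, so the sketch enables that first.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
db.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table's primary key is the combination of its foreign keys.
CREATE TABLE fact_orders (
    customer_id INTEGER NOT NULL REFERENCES dim_customer(customer_id),
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    quantity INTEGER,
    PRIMARY KEY (customer_id, product_id)
);
INSERT INTO dim_customer VALUES (1, 'Asha');
INSERT INTO dim_product VALUES (100, 'notebook');
INSERT INTO fact_orders VALUES (1, 100, 3);
""")

def try_insert(row):
    """Return 'ok' if the fact row is accepted, 'rejected' on a key violation."""
    try:
        with db:
            db.execute("INSERT INTO fact_orders VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_key = try_insert((1, 100, 9))  # same composite key as an existing row
unknown_dim = try_insert((1, 999, 1))    # product 999 is not in dim_product
print(duplicate_key, unknown_dim)
```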
31. The main advantages of star schemas in a decision-support environment are:
o Star schemas are easy for end users and applications to understand and navigate. With a
well-designed schema, users can quickly analyze large, multidimensional data sets.