IT6701 – Information Management
IV Year / VII Semester
OBJECTIVES
• To expose students to the basics of managing information
• To explore the various aspects of database design and modeling
• To examine the basic issues in information governance and information integration
• To give an overview of information architecture
UNIT I DATABASE MODELLING, MANAGEMENT AND DEVELOPMENT
Database design and modelling - Business Rules and
Relationship; Java Database Connectivity (JDBC), Database
Connection Manager, Stored Procedures. Trends in Big Data
systems including NoSQL – Hadoop, HDFS, MapReduce,
Hive, and enhancements.
Introduction
• Database
– Central repository of data
– To store the data in a structured manner.
• Database Design
– A process of defining the structure of a database.
Data Model
• A collection of conceptual tools for describing data, data relationships, and consistency constraints.
• Represents the nature of the data, the business rules governing the data, and how the data will be organized in the database.
Data Model
• Levels:
– Conceptual – describes WHAT the system contains
– Logical – describes HOW the system will be implemented, regardless of the DBMS
– Physical – describes HOW the system will be implemented using a specific DBMS
Data Model - Element
Element – Definition
Entity – a real-world thing, or an interaction between two or more real-world things.
Attribute – a piece of information that we need to know about an entity.
Relationship – how entities depend on each other: why they depend on each other and what that relationship is.
Data Model - Example
• Customer and Product are entities.
– Customer attributes: customer name, customer id
– Product attributes: product name, price
• Sale is a relationship between Customer and Product.
Data Model - Types
• Entity Relationship Models
• Unified Modeling Language
Entity Relationship Models
• A visual representation of data that describes how data items are related to each other.
Components of ER Diagram
Component – Symbol
Entity – rectangle
Relationship – diamond
Attribute of an entity – ellipse
Key attribute of an entity – ellipse with the attribute name underlined
Derived attribute of an entity – dotted ellipse drawn inside the main ellipse
Multivalued attribute of an entity – double ellipse
ER Diagram - Entity
Component – Description
Entity – for example Employee, Manager, Department
Weak entity – an entity that depends on another entity
ER Diagram - Attribute
Component – Description
Attribute – a property or characteristic of an entity (for example Name, Age, Address)
Key attribute – the main characteristic of an entity, which identifies it
Composite attribute – an attribute that has its own attributes
ER Diagram - Relationship
Component – Description
One-to-one relationship – one student can enroll in only one course, and a course has only one student
One-to-many relationship – one student can opt for many courses
Many-to-one relationship – a student enrolls in only one course, but a course can have many students
Many-to-many relationship – one student can enroll in more than one course, and a course can have more than one student enrolled in it
ER Diagram - Example
• College Database: Statements
– A college contains many departments
– Each department can offer any number of courses
– Many instructors can work in a department
– An instructor can work in only one department
– For each department there is a Head
– An instructor can be head of only one department
– Each instructor can teach any number of courses
– A course can be taught by only one instructor
– A student can enroll for any number of courses
– Each course can have any number of students
ER Diagram - Steps
• Identify the entities
• Identify the relationships
• Identify the key attributes
• Identify other relevant attributes
• Draw the complete ER diagram
ER Diagram - Example
• Step 1: Identify the entities
– Department
– Course
– Instructor
– Student
ER Diagram - Example
• Step 2: Identify the relationships
– Department and Course - One to Many (1:N)
– Department and Instructor - One to Many (1:N)
– Department and Head - One to One (1:1)
– Course and Student - Many to Many (M:N)
– Course and Instructor - Many to One (N:1)
ER Diagram - Example
• Step 3: Identify the key attributes
– Department_Name is the key attribute for the "Department" entity.
– Course_ID is the key attribute for the "Course" entity.
– Student_ID is the key attribute for the "Student" entity.
– Instructor_ID is the key attribute for the "Instructor" entity.
ER Diagram - Example
• Step 4: Identify other relevant attributes
– Department: location
– Course: course_name, duration
– Instructor: first_name, last_name, phone
– Student: first_name, last_name, phone
ER Diagram - Example
• Step 5: Draw the complete ER diagram
ER Diagram – College Database
NORMALIZATION
• A process of organizing the data in a database to avoid data redundancy.
• Forms:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
– Boyce-Codd normal form (BCNF)
NORMALIZATION
• First normal form (1NF): an attribute (column) of a table cannot hold multiple values. It should hold only atomic values.
• The table below is not in 1NF, because emp_mobile holds two values for employees 102 and 104.
emp_id  emp_name  emp_address  emp_mobile
101     Herschel  New Delhi    8912312390
102     Jon       Kanpur       8812121212, 9900012222
103     Ron       Chennai      7778881212
104     Lester    Bangalore    9990000123, 8123450987
NORMALIZATION
 First normal form(1NF) :
each attribute of a table must have atomic (single)
values.
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212
102 Jon Kanpur 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123
104 Lester Bangalore 8123450987
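The conversion to 1NF shown above can be sketched in code. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the multi-valued emp_mobile column is assumed to arrive as one comma-separated string.

```java
import java.util.ArrayList;
import java.util.List;

class FirstNormalFormSketch {
    // Splits a row whose last column holds several comma-separated values
    // into one row per value, so that every column holds a single atomic value.
    static List<String[]> toAtomicRows(String id, String name, String address, String mobiles) {
        List<String[]> rows = new ArrayList<>();
        for (String mobile : mobiles.split(",")) {
            rows.add(new String[] {id, name, address, mobile.trim()});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Employee 102 from the example table, with two mobile numbers.
        for (String[] row : toAtomicRows("102", "Jon", "Kanpur", "8812121212, 9900012222")) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

Running `main` prints one line per mobile number, matching the two rows for employee 102 in the 1NF table.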
NORMALIZATION
• Second normal form (2NF): a table is said to be in 2NF if both of the following conditions hold:
– The table is in 1NF (first normal form)
– Every non-prime column depends on the whole of each candidate key, not on just a part of it.
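NORMALIZATION – 2NF
• Candidate key: {teacher_id, subject}
• Non-prime attribute: teacher_age
• teacher_age depends on teacher_id alone, which is only part of the candidate key, so the table is not in 2NF.
teacher_id  subject    teacher_age
111         Maths      38
111         Physics    38
222         Biology    38
333         Physics    40
333         Chemistry  40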
NORMALIZATION – 2NF
• To bring the table into 2NF, split it into two tables. Each resulting table is in 1NF, and every non-key column depends on the whole of that table's primary key.
Teacher table (key: teacher_id):
teacher_id  teacher_age
111         38
222         38
333         40
Teacher-subject table (key: {teacher_id, subject}):
teacher_id  subject
111         Maths
111         Physics
222         Biology
333         Physics
333         Chemistry
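The partial dependency behind this 2NF example can be checked mechanically on the sample rows. The sketch below is an assumption-laden illustration: the class, the `TEACHER` constant and the `determines` helper are hypothetical names, and a check on sample data can only refute a functional dependency, never prove that it holds for all possible data.

```java
import java.util.HashMap;
import java.util.Map;

class PartialDependencySketch {
    // Sample rows from the 2NF example: teacher_id, subject, teacher_age.
    static final String[][] TEACHER = {
        {"111", "Maths", "38"},
        {"111", "Physics", "38"},
        {"222", "Biology", "38"},
        {"333", "Physics", "40"},
        {"333", "Chemistry", "40"},
    };

    // Returns true if, in these rows, column lhs determines column rhs,
    // i.e. no two rows agree on lhs but differ on rhs.
    static boolean determines(String[][] rows, int lhs, int rhs) {
        Map<String, String> seen = new HashMap<>();
        for (String[] row : rows) {
            String previous = seen.putIfAbsent(row[lhs], row[rhs]);
            if (previous != null && !previous.equals(row[rhs])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // teacher_id -> teacher_age holds: age depends on part of the key only.
        System.out.println(determines(TEACHER, 0, 2)); // prints true
        // teacher_id -> subject does not hold: teacher 111 has two subjects.
        System.out.println(determines(TEACHER, 0, 1)); // prints false
    }
}
```

Because teacher_id alone determines teacher_age, moving teacher_age into a table keyed by teacher_id removes the partial dependency.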
NORMALIZATION
• Third normal form (3NF): a table design is said to be in 3NF if both of the following conditions hold:
– The table is in 2NF
– No non-prime attribute is transitively dependent on a super key.
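NORMALIZATION – 3NF
• Here emp_id determines emp_zip, and emp_zip determines emp_state, emp_city and emp_district. These three attributes therefore depend on emp_id only transitively, so the table is not in 3NF.
Emp_id  Emp_name  Emp_zip  Emp_state  Emp_city  Emp_district
1001    John      282005   UP         Agra      Dayal Bagh
1002    Ajeet     222008   TN         Chennai   M-City
1006    Lora      282007   TN         Chennai   Urrapakkam
1101    Lilly     292008   UK         Pauri     Bhagwan
1201    Steve     222999   MP         Gwalior   Ratan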
NORMALIZATION – 3NF
emp_id emp_name emp_zip
1001 John 282005
1002 Ajeet 222008
1006 Lora 282007
1101 Lilly 292008
1201 Steve 222999
emp_zip emp_state emp_city emp_district
282005 UP Agra Dayal Bagh
222008 TN Chennai M-City
282007 TN Chennai Urrapakkam
292008 UK Pauri Bhagwan
222999 MP Gwalior Ratan
NORMALIZATION
• Boyce-Codd normal form (BCNF): a table is in BCNF if it is in 3NF and, for every functional dependency X -> Y, X is a super key of the table.
emp_id  emp_nationality  emp_dept                      dept_type  dept_no_of_emp
1001    Austrian         Production and planning       D001       200
1001    Austrian         stores                        D001       250
1002    American         design and technical support  D134       100
1002    American         Purchasing department         D134       600
NORMALIZATION
• Functional dependencies in the table above:
– emp_id -> emp_nationality
– emp_dept -> {dept_type, dept_no_of_emp}
• Neither emp_id nor emp_dept alone is a key of this table (the key is {emp_id, emp_dept}), so the table is not in BCNF.
NORMALIZATION - BCNF
• emp_nationality table:
emp_id  emp_nationality
1001    Austrian
1002    American
• emp_dept table:
emp_dept                      dept_type  dept_no_of_emp
Production and planning       D001       200
stores                        D001       250
design and technical support  D134       100
Purchasing department         D134       600
• emp_dept_mapping table:
emp_id  emp_dept
1001    Production and planning
1001    stores
1002    design and technical support
1002    Purchasing department
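The BCNF condition asks whether the left-hand side of each functional dependency is a super key. A minimal sketch of that test on the sample rows follows; the class, the `EMP` constant and `isSuperKey` are hypothetical names. Sample data can only show that a column set is not a super key. With just four rows emp_dept happens to be unique, so the sketch checks only emp_id and the pair {emp_id, emp_dept}.

```java
import java.util.HashSet;
import java.util.Set;

class SuperKeySketch {
    // Sample rows: emp_id, emp_nationality, emp_dept, dept_type, dept_no_of_emp.
    static final String[][] EMP = {
        {"1001", "Austrian", "Production and planning", "D001", "200"},
        {"1001", "Austrian", "stores", "D001", "250"},
        {"1002", "American", "design and technical support", "D134", "100"},
        {"1002", "American", "Purchasing department", "D134", "600"},
    };

    // Returns false as soon as two rows share the same values in all of the
    // given columns; in that case the columns cannot be a super key.
    static boolean isSuperKey(String[][] rows, int... columns) {
        Set<String> seen = new HashSet<>();
        for (String[] row : rows) {
            StringBuilder key = new StringBuilder();
            for (int c : columns) {
                key.append(row[c]).append('|'); // '|' does not occur in the sample values
            }
            if (!seen.add(key.toString())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSuperKey(EMP, 0));    // emp_id alone: prints false
        System.out.println(isSuperKey(EMP, 0, 2)); // {emp_id, emp_dept}: prints true
    }
}
```

Since emp_id is the left-hand side of emp_id -> emp_nationality but is not a super key, the original table violates BCNF, which is why emp_nationality is moved to its own table.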
Business Rule
• A brief, precise and unambiguous description of a policy, procedure, or principle within a specific organization.
• Examples of business rules:
– i) A customer may generate many invoices
– ii) An invoice is generated by only one customer
– iii) A training session cannot be scheduled for fewer than 10 employees or for more than 30 employees
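A business rule such as rule iii) usually ends up as a constraint in the data model or in application code. The following sketch shows one possible encoding; the class and method names are hypothetical and not part of the course material.

```java
class TrainingSessionRule {
    // Limits taken from business rule iii).
    static final int MIN_EMPLOYEES = 10;
    static final int MAX_EMPLOYEES = 30;

    // A session may be scheduled only for 10 to 30 employees, inclusive.
    static boolean canSchedule(int employees) {
        return employees >= MIN_EMPLOYEES && employees <= MAX_EMPLOYEES;
    }

    public static void main(String[] args) {
        System.out.println(canSchedule(9));  // prints false
        System.out.println(canSchedule(10)); // prints true
        System.out.println(canSchedule(31)); // prints false
    }
}
```

Rules i) and ii) are different in kind: they describe a one-to-many relationship between customer and invoice and are normally expressed in the ER model rather than in a range check.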
Business Rule - Purpose
• They help standardize the company's view of data
• They can be a communication tool between users and designers
• They allow the designer to understand the nature, role and scope of the data
• They allow the designer to understand business processes
• They allow the designer to develop appropriate relationship participation rules and constraints, and so to create an accurate data model
Java Database Connectivity (JDBC)
• A JDBC driver is a software component that enables a Java application to interact with a database.
• There are 4 types of JDBC drivers:
– JDBC-ODBC bridge driver
– Native-API driver (partially Java driver)
– Network Protocol driver (fully Java driver)
– Thin driver (fully Java driver)
JDBC Driver - JDBC-ODBC bridge driver
• Uses an ODBC driver to connect to the database.
• Converts JDBC method calls into ODBC function calls.
• Example: the bridge driver bundled with older JDKs such as JDK 1.2 (the bridge was removed in Java 8).
JDBC Driver - Native-API driver
• Uses the client-side libraries of the database.
• Converts JDBC method calls into native calls of the database API.
JDBC Driver - Network Protocol driver
• Uses middleware (an application server) that converts JDBC calls directly or indirectly into the vendor-specific database protocol.
JDBC Driver - Thin driver
• Converts JDBC calls directly into the vendor-specific database protocol, which is why it is known as the thin driver.
• It is written entirely in Java.
Steps to connect a Java Application to a Database
• Register the driver
• Create a connection
• Create an SQL statement
• Execute the SQL statement
• Close the connection
Register the Driver
• Class.forName() - registers the driver class by loading it dynamically
• With JDBC 4.0 and later, drivers on the classpath are normally loaded automatically, so this call is often optional.
Syntax:
public static Class<?> forName(String className)
throws ClassNotFoundException
Example:
Class.forName("oracle.jdbc.driver.OracleDriver");
Create the connection object
• The getConnection() method of the DriverManager class is used to establish a connection with the database.
Syntax:
1) public static Connection getConnection(String url)
throws SQLException
2) public static Connection getConnection(String url,
String user, String password) throws SQLException
Example:
Connection con = DriverManager.getConnection(
"jdbc:oracle:thin:@localhost:1521:xe", "system", "password");
Create the Statement object
• The createStatement() method of the Connection interface is used to create a statement.
• The Statement object is used to execute queries against the database.
Syntax:
public Statement createStatement() throws SQLException
Example:
Statement stmt = con.createStatement();
Execute the query
• The executeQuery() method of the Statement interface is used to execute queries against the database.
• It returns a ResultSet object that can be used to read all the records returned by the query.
Syntax:
public ResultSet executeQuery(String sql) throws SQLException
Example:
ResultSet rs = stmt.executeQuery("select * from emp");
while (rs.next()) {
    System.out.println(rs.getInt(1) + " " + rs.getString(2));
}
Close the connection object
close() method of Connection interface
is used to close the connection.
Syntax:
public void close()throws SQLException
Example:
con.close();
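The five steps can be put together in one method. The sketch below is one way to do it, not the only one: it uses try-with-resources so the connection, statement and result set are closed automatically even when an exception is thrown, and it relies on automatic driver loading instead of Class.forName(). The class and method names are hypothetical; the URL, credentials and the emp table are the ones used in the slides and are assumed to exist.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class JdbcStepsSketch {
    // Connects, runs the query from the slides, and returns one string per row.
    static List<String> readEmployees(String url, String user, String password) throws SQLException {
        List<String> rows = new ArrayList<>();
        // Resources are closed in reverse order when the try block ends.
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("select * from emp")) {
            while (rs.next()) {
                rows.add(rs.getInt(1) + " " + rs.getString(2));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        try {
            // Assumed connection details, copied from the example above.
            for (String row : readEmployees("jdbc:oracle:thin:@localhost:1521:xe", "system", "password")) {
                System.out.println(row);
            }
        } catch (SQLException e) {
            // Reached when no driver is on the classpath or the database is unreachable.
            System.out.println("Could not read employees: " + e.getMessage());
        }
    }
}
```

Without a matching driver on the classpath, DriverManager.getConnection throws an SQLException, which `main` reports instead of crashing.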
Stored Procedures
• A stored routine is a set of SQL statements that can be stored in the server.
• Steps:
– Picking a delimiter
– How to work with a stored procedure
– Parameters
– Variables
– Flow control structures
Step 1: Picking a Delimiter
• The delimiter is the character or string of characters that tells the MySQL client that you have finished typing an SQL statement.
• The examples here use "//" as the delimiter.
Step 2: Work with a stored procedure
• Creating a Stored Procedure
DELIMITER //
CREATE PROCEDURE `p2` ()
LANGUAGE SQL
DETERMINISTIC
SQL SECURITY DEFINER
COMMENT 'A procedure'
BEGIN
    SELECT 'Hello World !';
END//
DELIMITER ;
Step 2: Work with a stored procedure
• Calling a Stored Procedure
– Enter the keyword CALL, followed by the name of the procedure and then the parentheses, with all the parameters (variables or values) between them.
– The parentheses are compulsory.
CALL stored_procedure_name (param1, param2, ....)
CALL procedure1(10 , 'string parameter' , @parameter_var);
Step 3: Parameter
• CREATE PROCEDURE proc1 () : the parameter list is empty.
• CREATE PROCEDURE proc1 (IN varname DATA-TYPE) : one input parameter. The word IN is optional because parameters are IN (input) by default.
• CREATE PROCEDURE proc1 (OUT varname DATA-TYPE) : one output parameter.
• CREATE PROCEDURE proc1 (INOUT varname DATA-TYPE) : one parameter which is both input and output.
Step 4: Variables
• DECLARE varname DATA-TYPE DEFAULT defaultvalue;
• DECLARE a, b INT DEFAULT 5;
• DECLARE str VARCHAR(50);
• DECLARE today TIMESTAMP DEFAULT CURRENT_DATE;
• DECLARE v1, v2, v3 TINYINT;
Step 5: Flow Control Structures
DELIMITER //
CREATE PROCEDURE `proc_IF` (IN param1 INT)
BEGIN
    DECLARE variable1 INT;
    SET variable1 = param1 + 1;
    IF variable1 = 0 THEN
        SELECT variable1;
    END IF;
    IF param1 = 0 THEN
        SELECT 'Parameter value = 0';
    ELSE
        SELECT 'Parameter value <> 0';
    END IF;
END //
DELIMITER ;
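A stored procedure such as proc_IF can also be called from Java through JDBC's CallableStatement. The sketch below is an illustration under assumptions: the class and method names are hypothetical, the MySQL URL and credentials in `main` are placeholders, and a MySQL JDBC driver plus a database containing proc_IF are assumed to be available.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class StoredProcedureCallSketch {
    // Calls proc_IF(param1) and collects the first column of every row
    // of every result set that the procedure returns.
    static List<String> callProcIf(String url, String user, String password, int param1)
            throws SQLException {
        List<String> messages = new ArrayList<>();
        try (Connection con = DriverManager.getConnection(url, user, password);
             CallableStatement cs = con.prepareCall("{call proc_IF(?)}")) {
            cs.setInt(1, param1); // bind the IN parameter
            boolean hasResultSet = cs.execute();
            // proc_IF only runs SELECT statements, so stopping at the first
            // non-result-set result is sufficient for this sketch.
            while (hasResultSet) {
                try (ResultSet rs = cs.getResultSet()) {
                    while (rs.next()) {
                        messages.add(rs.getString(1));
                    }
                }
                hasResultSet = cs.getMoreResults();
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        try {
            // Placeholder connection details; adjust to a real MySQL server.
            System.out.println(callProcIf("jdbc:mysql://localhost:3306/test", "user", "password", 0));
        } catch (SQLException e) {
            System.out.println("Call failed: " + e.getMessage());
        }
    }
}
```

With param1 = 0 the procedure sets variable1 to 1, skips the first IF, and selects 'Parameter value = 0' in the second, so a working setup would return that single message.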
BIG DATA
Overview
 “90% of the world’s data was generated in the last few years.”
• a collection of large datasets that
cannot be processed using traditional
computing techniques.
What Comes Under Big Data?
 Big data involves the data produced by different
devices and applications.
Black Box Data
Social Media Data
Stock Exchange Data
Transport Data
Search Engine Data
Characteristics
 Volume – the amount of data handled by the
application.
Velocity – the rate at which the data flows into the
system.
Variety – different types of data generated
Structured data : Relational data.
Semi Structured data : XML data
Unstructured data : Word, PDF, Text, Media Logs
Benefits
• Using the information in social media:
– organizations are learning about the response to their campaigns, promotions, and other advertising mediums.
– product companies and retail organizations are planning their production around their consumers.
• Using data on the previous medical history of patients, hospitals are providing better and quicker service.
Big Data Technologies
• Big data technologies provide more accurate analysis, which may lead to more concrete decision-making, resulting in greater operational efficiencies, cost reductions, and reduced risks for the business.
• They require an infrastructure that can manage and process huge volumes of structured and unstructured data in real time and can protect data privacy and security.
Hadoop - Big Data Solutions
Traditional Approach
Hadoop - Big Data Solutions
Google’s Solution
MapReduce: divides the task into small parts and
assigns those parts to many computers connected over
the network, and collects the results to form the final
result dataset.
Hadoop - Big Data Solutions
Hadoop
• Doug Cutting, Mike Cafarella and team took the solution provided by Google and started an open source project called Hadoop in 2005.
• Two parts:
– Storage – HDFS
– Processing – MapReduce
Hadoop - Components
Hadoop -Architecture
Hadoop -Architecture
Both Master Node and Slave Nodes contain two Hadoop
Components:
HDFS Component
MapReduce Component
 Master Node (HDFS) – Name Node – To store metadata
 Slave Node (HDFS) – Data Node – to store actual data
Hadoop - Ecosystem
Components of Ecosystem
• HDFS – Hadoop Distributed File System – storage
• MapReduce – data processing
• YARN – resource management
• Hive – querying and analyzing large datasets
• Pig – analyzing and querying huge datasets (aimed at programmers)
• HBase – stores structured data in tables
• Sqoop – imports data from external sources
• ZooKeeper – coordinates a large cluster of machines
• Oozie – a workflow scheduler
Hadoop Distributed File System
• Holds very large amounts of data and provides easy access to it.
• To store such huge data, the files are stored across multiple machines.
• The files are stored redundantly to protect the system from data loss in case of failure.
Features of HDFS
It is suitable for the distributed storage and processing.
Hadoop provides a command interface to interact with
HDFS.
The built-in servers of namenode and datanode help users
to easily check the status of cluster.
Streaming access to file system data.
HDFS provides file permissions and authentication.
HDFS Architecture
HDFS Architecture Elements
Namenode:
Manages the file system namespace.
Regulates client’s access to files.
It also executes file system operations such as renaming,
closing, and opening files and directories
 Datanode:
Datanodes perform read-write operations on the file systems,
as per client request.
They also perform operations such as block creation, deletion,
and replication according to the instructions of the namenode
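HDFS Architecture Elements
• Block:
– User data is stored in the files of HDFS.
– A file is divided into one or more segments, which are stored on individual data nodes. These file segments are called blocks.
– Size: 64 MB by default in Hadoop 1.x and 128 MB by default in Hadoop 2.x and later; the block size is configurable.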
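How a file maps onto blocks is simple arithmetic, sketched below. The class and method names are hypothetical, the 300 MB file is an invented example, and replication is ignored. The last block of a file only occupies as much space as the remaining data.

```java
class HdfsBlockMath {
    static final long MB = 1024L * 1024L;

    // Number of blocks needed for a file: the ceiling of fileSize / blockSize.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // Size of the final block, which is usually smaller than the block size.
    static long lastBlockSize(long fileSizeBytes, long blockSizeBytes) {
        long remainder = fileSizeBytes % blockSizeBytes;
        return remainder == 0 ? Math.min(fileSizeBytes, blockSizeBytes) : remainder;
    }

    public static void main(String[] args) {
        long file = 300 * MB; // assumed example file size
        System.out.println(blockCount(file, 128 * MB));         // prints 3
        System.out.println(blockCount(file, 64 * MB));          // prints 5
        System.out.println(lastBlockSize(file, 128 * MB) / MB); // prints 44
    }
}
```

A 300 MB file therefore occupies two full 128 MB blocks and one 44 MB block, each of which can be stored on a different data node.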
Writing a file in a Hadoop cluster
Reading a file in a Hadoop cluster
MapReduce
• Processes huge amounts of data in a parallel, reliable and efficient way in cluster environments.
• Uses the divide-and-conquer technique to process large amounts of data.
• It divides the input task into smaller, manageable sub-tasks and executes them in parallel.
• Steps:
– Map function
– Shuffle function
– Reduce function
MapReduce – Map Function
• Takes the input task and divides it into smaller sub-tasks.
• Sub-steps:
– Splitting - takes the input DataSet from the source and divides it into smaller Sub-DataSets.
– Mapping - takes those smaller Sub-DataSets and performs the required action or computation on each Sub-DataSet.
• The output of the Map function is a set of key-value pairs, written <Key, Value>.
MapReduce – Shuffle Function
• Also known as the Combine function.
• Sub-steps:
– Merging - combines all key-value pairs that have the same key.
– Sorting - takes the output of the Merging step and sorts all key-value pairs by key.
• The Shuffle function returns a list of sorted <Key, List<Value>> pairs to the next step.
MapReduce – Reduce Function
• Takes the list of sorted <Key, List<Value>> pairs from the Shuffle function and performs the reduce operation on it.
MapReduce – Example
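The Map, Shuffle and Reduce steps can be imitated on a single machine with a word count. This sketch does not use the Hadoop API: the class and method names are hypothetical, the input lines are invented, and everything runs sequentially in one JVM instead of in parallel on a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map: turn one input split (here, one line) into <word, 1> pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle: merge the values of equal keys and sort by key (TreeMap keeps keys sorted).
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: collapse each <Key, List<Value>> pair into <Key, sum of values>.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line)); // each line plays the role of one split
        }
        return reduce(shuffle(pairs));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("deer bear river", "car car river", "deer car bear");
        System.out.println(wordCount(lines)); // prints {bear=2, car=3, deer=2, river=2}
    }
}
```

In a real cluster the calls to `map` would run on different nodes at the same time, and the framework would perform the shuffle across the network before the reducers start.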
YARN
• Yet Another Resource Negotiator
• A resource manager that enables Hadoop to improve its distributed processing capabilities.
• ResourceManager: communicates with the clients, tracks resources on the cluster, and manages jobs by assigning tasks to NodeManagers.
• Entities:
– Scheduler – schedules resources for the various tasks.
– ApplicationMaster – negotiates resources from the Scheduler and works with the NodeManagers.
YARN
• NodeManager:
– Launches and tracks tasks on the DataNodes.
– Container: a portion of the NodeManager's capacity, used by the client for running a program.
YARN – Execution Flow
NoSQL
• NoSQL, or "Not only SQL"
• Covers all databases and data stores that are not based on relational database management system (RDBMS) principles.
• Relates to large data sets accessed and manipulated on a Web scale.
• The new classes of database products include column-based data stores, key/value pair databases, and document databases.
NoSQL - Types
Key-Value database: a big hash table of keys and values
Document-based database: stores documents made up of
tagged elements.
Column-based database: each storage block contains data
from only one column.
Graph-based database: a network database that uses nodes
to represent and store data.
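The key-value type is the simplest of the four to picture in code. The sketch below wraps an in-memory hash table behind put, get and delete operations; the class and its methods are hypothetical, the sample key is invented, and real key-value stores add persistence, distribution and replication on top of this idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class KeyValueStoreSketch {
    // The whole "database" is one hash table from keys to values.
    private final Map<String, String> table = new HashMap<>();

    void put(String key, String value) {
        table.put(key, value);
    }

    // Returns an empty Optional when the key is absent.
    Optional<String> get(String key) {
        return Optional.ofNullable(table.get(key));
    }

    // Returns true if a value was stored under the key and has been removed.
    boolean delete(String key) {
        return table.remove(key) != null;
    }

    public static void main(String[] args) {
        KeyValueStoreSketch store = new KeyValueStoreSketch();
        store.put("customer:101", "Herschel");
        System.out.println(store.get("customer:101").orElse("not found")); // prints Herschel
        System.out.println(store.get("customer:999").orElse("not found")); // prints not found
    }
}
```

Lookups go through the key only; there is no query over the values, which is the main trade-off compared with a relational table.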
HIVE
• Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
• It was initially developed by Facebook; later the Apache Software Foundation took it up and developed it further as open source under the name Apache Hive.
HIVE - Architecture
HIVE – Data Flow
• The query is submitted from the UI for execution.
• The driver interacts with the compiler to get a plan.
• The compiler creates the plan for the job to be executed.
• The compiler sends a metadata request to the Meta store.
• The Meta store sends the metadata back to the compiler.
• The compiler sends the proposed plan for executing the query to the driver.
• The driver sends the execution plan to the Execution Engine.
HIVE – Data Flow
• The Execution Engine (EE) acts as a bridge between Hive and Hadoop to process the query. For DFS operations:
– The EE first contacts the Name Node and then the Data Nodes to get the values stored in the tables.
– It collects the actual data related to the query from the Data Nodes.
– It communicates bi-directionally with the Meta store present in Hive to perform DDL (Data Definition Language) operations.
HIVE – Data Flow
• Fetching results from the driver.
• Sending results to the Execution Engine: once the results have been fetched from the Data Nodes to the EE, it sends them back to the driver and on to the UI (front end).
Hive vs Relational Databases
• Relational databases are "schema on write": data is checked against the table schema when it is written.
• Hive is "schema on read": the schema is applied only when the data is read.
• Hive supports the "write once, read many" pattern.
Summary !!!

More Related Content

What's hot

Documenting software architecture
Documenting software architectureDocumenting software architecture
Documenting software architectureHimanshu
 
4+1 View Model of Software Architecture
4+1 View Model of Software Architecture4+1 View Model of Software Architecture
4+1 View Model of Software Architecturebashcode
 
EC8791 Requirement-Specifications-Quality assurance techniques
EC8791 Requirement-Specifications-Quality assurance techniquesEC8791 Requirement-Specifications-Quality assurance techniques
EC8791 Requirement-Specifications-Quality assurance techniquesRajalakshmiSermadurai
 
Formal approaches to software architecture design thesis presentation
Formal approaches to software architecture design   thesis presentationFormal approaches to software architecture design   thesis presentation
Formal approaches to software architecture design thesis presentationNacha Chondamrongkul
 
Unit v -Construction and Evaluation
Unit v -Construction and EvaluationUnit v -Construction and Evaluation
Unit v -Construction and EvaluationDhivyaa C.R
 
Architectural styles and patterns
Architectural styles and patternsArchitectural styles and patterns
Architectural styles and patternsHimanshu
 
Domain specific Software Architecture
Domain specific Software Architecture Domain specific Software Architecture
Domain specific Software Architecture DIPEN SAINI
 
Systems and Technical Architecture
Systems and Technical ArchitectureSystems and Technical Architecture
Systems and Technical ArchitectureRachel Gladdis
 
Reference Data Management
Reference Data ManagementReference Data Management
Reference Data ManagementProfinit
 
Software Architecture Design for Begginers
Software Architecture Design for BegginersSoftware Architecture Design for Begginers
Software Architecture Design for BegginersChinh Ngo Nguyen
 
Unit iv -Documenting and Implementation of Software Architecture
Unit iv -Documenting and Implementation of Software ArchitectureUnit iv -Documenting and Implementation of Software Architecture
Unit iv -Documenting and Implementation of Software ArchitectureDhivyaa C.R
 
SE - Software Requirements
SE - Software RequirementsSE - Software Requirements
SE - Software RequirementsJomel Penalba
 
Platform - Technical architecture
Platform - Technical architecturePlatform - Technical architecture
Platform - Technical architectureDavid Rundle
 
DoD Architecture Framework Overview
DoD Architecture Framework OverviewDoD Architecture Framework Overview
DoD Architecture Framework OverviewAlessio Mosto
 
Evaluating Alternatives for Requirements, Envireonment, and Implemetation
Evaluating Alternatives for Requirements, Envireonment, and ImplemetationEvaluating Alternatives for Requirements, Envireonment, and Implemetation
Evaluating Alternatives for Requirements, Envireonment, and ImplemetationHenhen Lukmana
 

What's hot (20)

Documenting software architecture
Documenting software architectureDocumenting software architecture
Documenting software architecture
 
Software Architecture
Software ArchitectureSoftware Architecture
Software Architecture
 
4+1 View Model of Software Architecture
4+1 View Model of Software Architecture4+1 View Model of Software Architecture
4+1 View Model of Software Architecture
 
EC8791 Requirement-Specifications-Quality assurance techniques
EC8791 Requirement-Specifications-Quality assurance techniquesEC8791 Requirement-Specifications-Quality assurance techniques
EC8791 Requirement-Specifications-Quality assurance techniques
 
Formal approaches to software architecture design thesis presentation
Formal approaches to software architecture design   thesis presentationFormal approaches to software architecture design   thesis presentation
Formal approaches to software architecture design thesis presentation
 
Unit v -Construction and Evaluation
Unit v -Construction and EvaluationUnit v -Construction and Evaluation
Unit v -Construction and Evaluation
 
Architectural styles and patterns
Architectural styles and patternsArchitectural styles and patterns
Architectural styles and patterns
 
Domain specific Software Architecture
Domain specific Software Architecture Domain specific Software Architecture
Domain specific Software Architecture
 
Systems and Technical Architecture
Systems and Technical ArchitectureSystems and Technical Architecture
Systems and Technical Architecture
 
Ch06
Ch06Ch06
Ch06
 
Reference Data Management
Reference Data ManagementReference Data Management
Reference Data Management
 
Chapter04
Chapter04Chapter04
Chapter04
 
Software Architecture Design for Begginers
Software Architecture Design for BegginersSoftware Architecture Design for Begginers
Software Architecture Design for Begginers
 
Class notes
Class notesClass notes
Class notes
 
Unit iv -Documenting and Implementation of Software Architecture
Unit iv -Documenting and Implementation of Software ArchitectureUnit iv -Documenting and Implementation of Software Architecture
Unit iv -Documenting and Implementation of Software Architecture
 
SE - Software Requirements
SE - Software RequirementsSE - Software Requirements
SE - Software Requirements
 
Platform - Technical architecture
Platform - Technical architecturePlatform - Technical architecture
Platform - Technical architecture
 
DoD Architecture Framework Overview
DoD Architecture Framework OverviewDoD Architecture Framework Overview
DoD Architecture Framework Overview
 
Evaluating Alternatives for Requirements, Envireonment, and Implemetation
Evaluating Alternatives for Requirements, Envireonment, and ImplemetationEvaluating Alternatives for Requirements, Envireonment, and Implemetation
Evaluating Alternatives for Requirements, Envireonment, and Implemetation
 
EA foundations (views + repository)
EA foundations (views + repository)EA foundations (views + repository)
EA foundations (views + repository)
 

Similar to IT6701-Information Management Unit 1

Database 3 Conceptual Modeling And Er
Database 3   Conceptual Modeling And ErDatabase 3   Conceptual Modeling And Er
Database 3 Conceptual Modeling And ErAshwani Kumar Ramani
 
Entity relationship modelling - DE L300
Entity relationship modelling - DE L300Entity relationship modelling - DE L300
Entity relationship modelling - DE L300Edwin Ayernor
 
Preparing for BIT – IT2301 Database Management Systems 2001d
Preparing for BIT – IT2301 Database Management Systems 2001dPreparing for BIT – IT2301 Database Management Systems 2001d
Preparing for BIT – IT2301 Database Management Systems 2001dGihan Wikramanayake
 
Syllabus mca 2 rdbms i
Syllabus mca 2 rdbms iSyllabus mca 2 rdbms i
Syllabus mca 2 rdbms iemailharmeet
 
Module 1 session 5
Module 1   session 5Module 1   session 5
Module 1 session 5raghuinfo
 
Sulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSULTHAN BASHA
 
WBC Entity Relationship and data flow diagrams
WBC Entity Relationship and data flow diagramsWBC Entity Relationship and data flow diagrams
WBC Entity Relationship and data flow diagramsArshitSood3
 
Structured system analysis and design
Structured system analysis and design Structured system analysis and design
Structured system analysis and design Jayant Dalvi
 
Database Systems - Relational Data Model (Chapter 2)
Database Systems - Relational Data Model (Chapter 2)Database Systems - Relational Data Model (Chapter 2)
Database Systems - Relational Data Model (Chapter 2)Vidyasagar Mundroy
 
entity-relationship-diagram-chen-&-crow -model.ppt
entity-relationship-diagram-chen-&-crow -model.pptentity-relationship-diagram-chen-&-crow -model.ppt
entity-relationship-diagram-chen-&-crow -model.pptIRWANBINISMAILKPMGur1
 
DBMS-2.pptx
DBMS-2.pptxDBMS-2.pptx
DBMS-2.pptxkingVox
 
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionary
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionarySe 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionary
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionarybabak danyal
 

Similar to IT6701-Information Management Unit 1 (20)

Database 3 Conceptual Modeling And Er
Database 3   Conceptual Modeling And ErDatabase 3   Conceptual Modeling And Er
Database 3 Conceptual Modeling And Er
 
ERD.ppt
ERD.pptERD.ppt
ERD.ppt
 
ERD.ppt
ERD.pptERD.ppt
ERD.ppt
 
Entity relationship modelling - DE L300
Entity relationship modelling - DE L300Entity relationship modelling - DE L300
Entity relationship modelling - DE L300
 
DATABASE MANAGEMENT SYSTEM
DATABASE MANAGEMENT SYSTEMDATABASE MANAGEMENT SYSTEM
DATABASE MANAGEMENT SYSTEM
 
Database part3-
Database part3-Database part3-
Database part3-
 
Erd1
Erd1Erd1
Erd1
 
DataModeling.pptx
DataModeling.pptxDataModeling.pptx
DataModeling.pptx
 
Preparing for BIT – IT2301 Database Management Systems 2001d
Preparing for BIT – IT2301 Database Management Systems 2001dPreparing for BIT – IT2301 Database Management Systems 2001d
Preparing for BIT – IT2301 Database Management Systems 2001d
 
Syllabus mca 2 rdbms i
Syllabus mca 2 rdbms iSyllabus mca 2 rdbms i
Syllabus mca 2 rdbms i
 
Module 1 session 5
Module 1   session 5Module 1   session 5
Module 1 session 5
 
Sulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_Science
 
Cs501 intro
Cs501 introCs501 intro
Cs501 intro
 
WBC Entity Relationship and data flow diagrams
WBC Entity Relationship and data flow diagramsWBC Entity Relationship and data flow diagrams
WBC Entity Relationship and data flow diagrams
 
Structured system analysis and design
Structured system analysis and design Structured system analysis and design
Structured system analysis and design
 
Database Systems - Relational Data Model (Chapter 2)
Database Systems - Relational Data Model (Chapter 2)Database Systems - Relational Data Model (Chapter 2)
Database Systems - Relational Data Model (Chapter 2)
 
Database system
Database system Database system
Database system
 
entity-relationship-diagram-chen-&-crow -model.ppt
entity-relationship-diagram-chen-&-crow -model.pptentity-relationship-diagram-chen-&-crow -model.ppt
entity-relationship-diagram-chen-&-crow -model.ppt
 
DBMS-2.pptx
DBMS-2.pptxDBMS-2.pptx
DBMS-2.pptx
 
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionary
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionarySe 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionary
Se 381 - lec 21 - 23 - 12 may09 - df-ds and data dictionary
 

IT6701-Information Management Unit 1

  • 2. OBJECTIVES • To expose students to the basics of managing information • To explore the various aspects of database design and modeling • To examine the basic issues in information governance and information integration • To understand the overview of information architecture.
  • 3. UNIT I DATABASE MODELLING, MANAGEMENT AND DEVELOPMENT Database design and modelling - Business Rules and Relationship; Java Database Connectivity (JDBC), Database connection Manager, Stored Procedures. Trends in Big Data systems including NoSQL – Hadoop, HDFS, MapReduce, Hive, and enhancements.
  • 4. Introduction • Database – A central repository of data – Stores the data in a structured manner. • Database Design – The process of defining the structure of a database.
  • 5. Data Model • A collection of conceptual tools for describing data, data relationships, and consistency constraints. • Represents the nature of the data, the business rules governing the data, and how the data will be organized in the database.
  • 6. Data Model • Levels: • Conceptual – describes WHAT the system contains • Logical – describes HOW the system will be implemented regardless of the DBMS • Physical – describes HOW the system will be implemented using a specific DBMS.
  • 7. Data Model - Element
      Element        Definition
      Entity         A real-world thing, or an interaction between two or more real-world things.
      Attribute      A piece of information that we need to know about an entity.
      Relationship   How entities depend on each other: why they depend on each other and what that relationship is.
  • 8. Data Model - Example • Customer and Product are entities • Customer: customer name, customer id • Product: product name, price • Sale is the relationship
  • 9. Data Model - Types • Entity Relationship Models • Unified Modeling Language
  • 10. Entity Relationship Models • A visual representation of data that describes how data items are related to each other.
  • 11. Components of ER Diagram
      Component                            Symbol
      Entity                               Rectangle
      Relationship                         Diamond
      Attribute of an entity               Ellipse
      Key attribute of an entity           Ellipse with the attribute name underlined
  • 12. Components of ER Diagram
      Component                            Symbol
      Derived attribute of an entity       Dotted ellipse drawn inside the main ellipse
      Multivalued attribute of an entity   Double ellipse
  • 13. ER Diagram - Entity
      Component     Example
      Entity        Employee, Manager, Department
      Weak entity   An entity that depends on another entity
  • 14. ER Diagram - Attribute
      Component             Description
      Attribute             A property or characteristic of an entity (Name, Age, Address)
      Key attribute         The main characteristic of an entity
      Composite attribute   An attribute that has its own attributes
  • 15. ER Diagram - Relationship
      Component                   Description
      One to One Relationship     One student can enroll in only one course, and a course has only one student
      One to Many Relationship    One student can opt for many courses
      Many to One Relationship    A student enrolls in only one course, but a course can have many students
  • 16. ER Diagram - Relationship
      Component                   Description
      Many to Many Relationship   One student can enroll in more than one course, and a course can have more than one student enrolled in it
  • 17. ER Diagram - Example • College Database: Statements • A college contains many departments • Each department can offer any number of courses • Many instructors can work in a department • An instructor can work in only one department • For each department there is a Head • An instructor can be head of only one department • Each instructor can teach any number of courses • A course can be taught by only one instructor • A student can enroll in any number of courses • Each course can have any number of students
  • 18. ER Diagram - Steps • Identify the entities • Identify the relationships • Identify the key attributes • Identify other relevant attributes
  • 19. ER Diagram - Example • Step 1: Identify the entities: Department, Course, Instructor, Student
  • 20. ER Diagram - Example • Step 2: Identify the relationships • Department and Course - One to Many (1:N) • Department and Instructor - One to Many (1:N) • Department and Head - One to One (1:1) • Course and Student - Many to Many (M:N) • Course and Instructor - Many to One (N:1)
  • 21. ER Diagram - Example • Step 3: Identify the key attributes • Department_Name is the key attribute for the "Department" entity. • Course_ID is the key attribute for the "Course" entity. • Student_ID is the key attribute for the "Student" entity. • Instructor_ID is the key attribute for the "Instructor" entity.
  • 22. ER Diagram - Example • Step 4: Identify other relevant attributes • Department: location • Course: course_name, duration • Instructor: first_name, last_name, phone • Student: first_name, last_name, phone
  • 23. ER Diagram - Example Step 5: Draw complete ER diagram
  • 24. ER Diagram – College Database
  • 25. NORMALIZATION • A process of organizing the data in a database to avoid data redundancy • Forms: First normal form (1NF) • Second normal form (2NF) • Third normal form (3NF) • Boyce-Codd normal form (BCNF)
  • 26. NORMALIZATION • First normal form (1NF): an attribute (column) of a table cannot hold multiple values. It should hold only atomic values. In the table below, emp_mobile holds two values for Jon and for Lester, so the table is not in 1NF.
      emp_id   emp_name   emp_address   emp_mobile
      101      Herschel   New Delhi     8912312390
      102      Jon        Kanpur        8812121212 9900012222
      103      Ron        Chennai       7778881212
      104      Lester     Bangalore     9990000123 8123450987
  • 27. NORMALIZATION • First normal form (1NF): each attribute of a table must have atomic (single) values. The same data in 1NF:
      emp_id   emp_name   emp_address   emp_mobile
      101      Herschel   New Delhi     8912312390
      102      Jon        Kanpur        8812121212
      102      Jon        Kanpur        9900012222
      103      Ron        Chennai       7778881212
      104      Lester     Bangalore     9990000123
      104      Lester     Bangalore     8123450987
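The move from the slide 26 table to the slide 27 table can be sketched in Java. This is only an illustration, assuming the multi-valued emp_mobile cell arrives as a single space-separated string; the class and method names (`FirstNormalFormSketch`, `toAtomicRows`) are hypothetical and not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns one non-1NF row into rows with atomic values.
class FirstNormalFormSketch {

    // Assumes the mobile numbers of one employee are separated by whitespace.
    // Returns one {emp_id, emp_name, emp_address, emp_mobile} row per number.
    static List<String[]> toAtomicRows(String empId, String name,
                                       String address, String mobiles) {
        List<String[]> rows = new ArrayList<>();
        for (String mobile : mobiles.trim().split("\\s+")) {
            rows.add(new String[] {empId, name, address, mobile});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Jon's row from slide 26 becomes the two rows shown on slide 27.
        for (String[] row : toAtomicRows("102", "Jon", "Kanpur",
                                         "8812121212 9900012222")) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

Repeating emp_name and emp_address on every row is the redundancy that the later normal forms go on to remove.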
  • 28. NORMALIZATION • Second normal form (2NF): a table is in 2NF if both of the following conditions hold: • The table is in 1NF (first normal form) • Every non-key column depends on the whole of the table's primary key, not on just a part of it.
  • 29. NORMALIZATION – 2NF • Candidate key: {teacher_id, subject} • Non-prime attribute: teacher_age • teacher_age depends on teacher_id alone, which is only part of the key, so the table is not in 2NF.
      teacher_id   subject     teacher_age
      111          Maths       38
      111          Physics     38
      222          Biology     38
      333          Physics     40
      333          Chemistry   40
  • 30. NORMALIZATION – 2NF • After splitting the table in two, each table is in first normal form and all its columns depend on the table's primary key.
      teacher_id   teacher_age
      111          38
      222          38
      333          40

      teacher_id   subject
      111          Maths
      111          Physics
      222          Biology
      333          Physics
      333          Chemistry
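The 2NF split on slides 29 and 30 can be sketched as two projections of the original rows. This is a minimal illustration, assuming each row is a `{teacher_id, subject, teacher_age}` string array; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the 2NF decomposition of the teacher table.
class SecondNormalFormSketch {

    // teacher_age depends only on teacher_id, so it gets its own table
    // keyed by teacher_id; duplicate ids collapse into one entry.
    static Map<String, String> teacherAges(List<String[]> rows) {
        Map<String, String> ages = new LinkedHashMap<>();
        for (String[] row : rows) {
            ages.put(row[0], row[2]);
        }
        return ages;
    }

    // The remaining table keeps only the key columns {teacher_id, subject}.
    static List<String[]> teacherSubjects(List<String[]> rows) {
        List<String[]> subjects = new ArrayList<>();
        for (String[] row : rows) {
            subjects.add(new String[] {row[0], row[1]});
        }
        return subjects;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"111", "Maths", "38"});
        rows.add(new String[] {"111", "Physics", "38"});
        rows.add(new String[] {"222", "Biology", "38"});
        rows.add(new String[] {"333", "Physics", "40"});
        rows.add(new String[] {"333", "Chemistry", "40"});
        System.out.println(teacherAges(rows));            // one age per teacher
        System.out.println(teacherSubjects(rows).size()); // subject rows kept
    }
}
```

Each teacher's age is now stored once, so changing it no longer requires updating several rows.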
  • 31. NORMALIZATION • Third normal form (3NF): a table design is in 3NF if both of the following conditions hold: • The table is in 2NF • No non-prime attribute has a transitive functional dependency on a super key.
  • 32. NORMALIZATION – 3NF • emp_state, emp_city and emp_district depend on emp_zip, and emp_zip depends on emp_id, so these columns depend on emp_id only transitively.
      emp_id   emp_name   emp_zip   emp_state   emp_city   emp_district
      1001     John       282005    UP          Agra       Dayal Bagh
      1002     Ajeet      222008    TN          Chennai    M-City
      1006     Lora       282007    TN          Chennai    Urrapakkam
      1101     Lilly      292008    UK          Pauri      Bhagwan
      1201     Steve      222999    MP          Gwalior    Ratan
  • 33. NORMALIZATION – 3NF
      emp_id   emp_name   emp_zip
      1001     John       282005
      1002     Ajeet      222008
      1006     Lora       282007
      1101     Lilly      292008
      1201     Steve      222999

      emp_zip   emp_state   emp_city   emp_district
      282005    UP          Agra       Dayal Bagh
      222008    TN          Chennai    M-City
      282007    TN          Chennai    Urrapakkam
      292008    UK          Pauri      Bhagwan
      222999    MP          Gwalior    Ratan
  • 34. NORMALIZATION • Boyce-Codd normal form (BCNF): a table is in BCNF if it is in 3NF and, for every functional dependency X -> Y, X is a super key of the table.
      emp_id   emp_nationality   emp_dept                       dept_type   dept_no_of_emp
      1001     Austrian          Production and planning        D001        200
      1001     Austrian          stores                         D001        250
      1002     American          design and technical support   D134        100
      1002     American          Purchasing department          D134        600
  • 35. NORMALIZATION • Functional dependencies in the table above: • emp_id -> emp_nationality • emp_dept -> {dept_type, dept_no_of_emp} • Candidate key: {emp_id, emp_dept}. Neither emp_id nor emp_dept alone is a super key, so the table is not in BCNF.
  • 36. NORMALIZATION - BCNF
      emp_nationality table:
      emp_id   emp_nationality
      1001     Austrian
      1002     American

      emp_dept table:
      emp_dept                       dept_type   dept_no_of_emp
      Production and planning        D001        200
      stores                         D001        250
      design and technical support   D134        100
      Purchasing department          D134        600

      emp_dept_mapping table:
      emp_id   emp_dept
      1001     Production and planning
      1001     stores
      1002     design and technical support
      1002     Purchasing department
  • 37. Business Rule • A brief, precise and unambiguous description of a policy, procedure, or principle within a specific organization. • Some examples of business rules: i) A customer may generate many invoices ii) An invoice is generated by only one customer iii) A training session cannot be scheduled for fewer than 10 employees or for more than 30 employees.
  • 38. Business Rule - Purpose • They help standardize the company's view of data • They can be a communication tool between users and designers • They allow the designer to understand the nature, role and scope of the data • They allow the designer to understand business processes • They allow the DB designer to develop appropriate relationship participation rules and constraints and to create an accurate data model
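A business rule such as example iii) on slide 37 eventually becomes a constraint that the application or database enforces. A minimal Java sketch of that rule follows; the class name, constant names and method name are hypothetical, and only the limits 10 and 30 come from the slide.

```java
// Hypothetical encoding of business rule iii): a training session needs
// at least 10 and at most 30 employees.
class TrainingSessionRule {
    static final int MIN_EMPLOYEES = 10;
    static final int MAX_EMPLOYEES = 30;

    // Returns true only when the enrolment count satisfies the rule.
    static boolean canSchedule(int enrolledEmployees) {
        return enrolledEmployees >= MIN_EMPLOYEES
            && enrolledEmployees <= MAX_EMPLOYEES;
    }

    public static void main(String[] args) {
        System.out.println(canSchedule(9));   // below the minimum
        System.out.println(canSchedule(25));  // within the allowed range
    }
}
```

In a data model the same rule would typically appear as a participation constraint on the relationship between training sessions and employees.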
  • 39. Java Database Connectivity (JDBC) • A JDBC driver is a software component that enables a Java application to interact with a database. • There are 4 types of JDBC drivers: • JDBC-ODBC bridge driver • Native-API driver (partially Java driver) • Network Protocol driver (fully Java driver) • Thin driver (fully Java driver)
  • 40. JDBC Driver - JDBC-ODBC bridge driver • Uses an ODBC driver to connect to the database. • Converts JDBC method calls into ODBC function calls. • Ex: JDK 1.2
  • 41. JDBC Driver - Native-API driver • Uses the client-side libraries of the database. • Converts JDBC method calls into native calls of the database API.
  • 42. JDBC Driver - Network Protocol driver • Uses middleware (an application server) that converts JDBC calls directly or indirectly into the vendor-specific database protocol.
  • 43. JDBC Driver - Thin driver • The thin driver converts JDBC calls directly into the vendor-specific database protocol. That is why it is known as the thin driver. • It is written entirely in Java.
  • 44. Steps to connect a Java Application to Database • 1. Register the driver • 2. Create a connection • 3. Create an SQL statement • 4. Execute the SQL statement • 5. Close the connection
  • 45. Register the Driver • The forName() method of the Class class registers the driver by loading the driver class dynamically.
      Syntax:  public static Class<?> forName(String className) throws ClassNotFoundException
      Example: Class.forName("oracle.jdbc.driver.OracleDriver");
  • 46. Create the connection object • The getConnection() method of the DriverManager class is used to establish a connection with the database.
      Syntax:  1) public static Connection getConnection(String url) throws SQLException
               2) public static Connection getConnection(String url, String user, String password) throws SQLException
      Example: Connection con = DriverManager.getConnection("jdbc:oracle:thin:@localhost:1521:xe", "system", "password");
  • 47. Create the Statement object • The createStatement() method of the Connection interface is used to create a statement. • The Statement object is used to execute queries against the database.
      Syntax:  public Statement createStatement() throws SQLException
      Example: Statement stmt = con.createStatement();
  • 48. Execute the query • The executeQuery() method of the Statement interface is used to execute queries against the database. • It returns a ResultSet object that can be used to get all the records of a table.
      Syntax:  public ResultSet executeQuery(String sql) throws SQLException
      Example: ResultSet rs = stmt.executeQuery("select * from emp");
               while (rs.next()) {
                   System.out.println(rs.getInt(1) + " " + rs.getString(2));
               }
  • 49. Close the connection object • The close() method of the Connection interface is used to close the connection.
      Syntax:  public void close() throws SQLException
      Example: con.close();
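The five steps on slides 44 to 49 can be combined into one method. The sketch below is one possible arrangement, not the deck's own code: the class name `EmpReader` and method name `readEmployees` are hypothetical, and it assumes an `emp` table whose first column is numeric and whose second is text, as in the slide 48 example. With JDBC 4.0 and later, a driver on the classpath is loaded automatically, so the explicit `Class.forName` call is left out here; try-with-resources performs the close step.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that runs the query from slide 48 and collects the rows.
class EmpReader {

    static List<String> readEmployees(String url, String user, String password)
            throws SQLException {
        List<String> rows = new ArrayList<>();
        // Resources are closed automatically in reverse order (ResultSet,
        // Statement, Connection), even if an exception is thrown.
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("select * from emp")) {
            while (rs.next()) {
                rows.add(rs.getInt(1) + " " + rs.getString(2));
            }
        }
        return rows;
    }
}
```

With the Oracle thin driver on the classpath, a call such as `EmpReader.readEmployees("jdbc:oracle:thin:@localhost:1521:xe", "system", "password")` would use the connection details from slide 46. If no registered driver accepts the URL, `DriverManager.getConnection` throws an `SQLException`.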
  • 50. Stored Procedures • A stored routine is a set of SQL statements that can be stored in the server. • Steps: • 1. Picking a delimiter • 2. Working with a stored procedure • 3. Parameters • 4. Variables • 5. Flow control structures
  • 51. Step 1: Picking a Delimiter • The delimiter is the character or string of characters that tells the MySQL client that you have finished typing an SQL statement. • The examples here use "//".
  • 52. Step 2: Work with a stored procedure • Creating a stored procedure:
      DELIMITER //
      CREATE PROCEDURE `p2` ()
      LANGUAGE SQL
      DETERMINISTIC
      SQL SECURITY DEFINER
      COMMENT 'A procedure'
      BEGIN
          SELECT 'Hello World !';
      END//
  • 53. Step 2: Work with a stored procedure • Calling a stored procedure: enter the word CALL, followed by the name of the procedure, and then the parentheses, with all the parameters (variables or values) between them. The parentheses are compulsory.
      CALL stored_procedure_name (param1, param2, ....)
      CALL procedure1(10 , 'string parameter' , @parameter_var);
  • 54. Step 3: Parameter • CREATE PROCEDURE proc1 () : the parameter list is empty. • CREATE PROCEDURE proc1 (IN varname DATA-TYPE) : one input parameter. The word IN is optional because parameters are IN (input) by default. • CREATE PROCEDURE proc1 (OUT varname DATA-TYPE) : one output parameter. • CREATE PROCEDURE proc1 (INOUT varname DATA-TYPE) : one parameter which is both input and output.
  • 55. Step 4: Variables • DECLARE varname DATA-TYPE DEFAULT defaultvalue; • DECLARE a, b INT DEFAULT 5; • DECLARE str VARCHAR(50); • DECLARE today TIMESTAMP DEFAULT CURRENT_DATE; • DECLARE v1, v2, v3 TINYINT;
  • 56. Step 5: Flow Control Structures
      DELIMITER //
      CREATE PROCEDURE `proc_IF` (IN param1 INT)
      BEGIN
          DECLARE variable1 INT;
          SET variable1 = param1 + 1;
          IF variable1 = 0 THEN
              SELECT variable1;
          END IF;
          IF param1 = 0 THEN
              SELECT 'Parameter value = 0';
          ELSE
              SELECT 'Parameter value <> 0';
          END IF;
      END //
  • 58. Overview • “90% of the world’s data was generated in the last few years.” • Big data is a collection of large datasets that cannot be processed using traditional computing techniques.
  • 59. What Comes Under Big Data? • Big data involves the data produced by different devices and applications: • Black Box Data • Social Media Data • Stock Exchange Data • Transport Data • Search Engine Data
  • 60. Characteristics • Volume – the amount of data handled by the application. • Velocity – the rate at which the data flows into the system. • Variety – the different types of data generated: • Structured data: relational data • Semi-structured data: XML data • Unstructured data: Word, PDF, text, media logs
  • 61. Benefits • Using the information in social media, organizations learn about the response to their campaigns, promotions, and other advertising mediums. • Using what social media reveals about their consumers, product companies and retail organizations plan their production. • Using data on the previous medical history of patients, hospitals provide better and quicker service.
  • 62. Big Data Technologies • Provide more accurate analysis, which may lead to more concrete decision-making, resulting in greater operational efficiencies, cost reductions, and reduced risks for the business. • Require an infrastructure that can manage and process huge volumes of structured and unstructured data in real time and can protect data privacy and security.
  • 63. Hadoop - Big Data Solutions Traditional Approach
  • 64. Hadoop - Big Data Solutions • Google’s Solution • MapReduce: divides the task into small parts, assigns those parts to many computers connected over the network, and collects the results to form the final result dataset.
  • 65. Hadoop - Big Data Solutions • Hadoop • Doug Cutting, Mike Cafarella and team took the solution provided by Google and started an open source project called HADOOP in 2005. • Two parts: • Storage – HDFS • Processing – MapReduce
  • 68. Hadoop - Architecture • Both the Master Node and the Slave Nodes contain two Hadoop components: • HDFS component • MapReduce component • Master Node (HDFS) – Name Node – stores metadata • Slave Node (HDFS) – Data Node – stores the actual data
  • 70. Components of Ecosystem • HDFS (Hadoop Distributed File System) – storage • MapReduce – data processing • YARN – resource management • Hive – querying and analyzing large datasets • Pig – analyzing and querying huge datasets (programmer-oriented) • HBase – stores structured data in tables • Sqoop – imports data from external sources • ZooKeeper – coordinates a large cluster of machines • Oozie – a workflow scheduler
  • 71. Hadoop Distributed File System • Holds very large amounts of data and provides easier access. • To store such huge data, the files are stored across multiple machines. • The files are stored redundantly to protect the system from data loss in case of failure.
  • 72. Features of HDFS • It is suitable for distributed storage and processing. • Hadoop provides a command interface to interact with HDFS. • The built-in servers of the namenode and datanode help users to easily check the status of the cluster. • Streaming access to file system data. • HDFS provides file permissions and authentication.
  • 74. HDFS Architecture Elements • Namenode: manages the file system namespace, regulates clients' access to files, and executes file system operations such as renaming, closing, and opening files and directories. • Datanode: datanodes perform read-write operations on the file system, as per client request. They also perform operations such as block creation, deletion, and replication according to the instructions of the namenode.
  • 75. HDFS Architecture Elements • Block: user data is stored in the files of HDFS. Each file is divided into one or more segments, which are stored on individual datanodes. These segments are called blocks. • Default block size: 64 MB or 128 MB, depending on the Hadoop version.
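The block sizes on slide 75 determine how many blocks a file occupies. A small Java sketch of that arithmetic follows, assuming a fixed block size and ignoring replication; the class and method names are hypothetical.

```java
// Hypothetical helper for the block arithmetic implied by slide 75.
class HdfsBlockMath {

    // Number of blocks needed for a file: ceiling of fileSize / blockSize.
    // An empty file needs no blocks.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes <= 0) {
            return 0;
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 300 MB file with 128 MB blocks: two full blocks plus a 44 MB block.
        System.out.println(blockCount(300 * mb, 128 * mb));
        // The same file with 64 MB blocks: four full blocks plus a 44 MB block.
        System.out.println(blockCount(300 * mb, 64 * mb));
    }
}
```

Larger blocks mean fewer blocks per file, and therefore less block metadata for the namenode to track.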
  • 76. Writing a file in a Hadoop cluster
  • 77. Reading a file in a Hadoop cluster
  • 78. MapReduce  to process huge amount of data in parallel, reliable and efficient way in cluster environments. uses Divide and Conquer technique to process large amount of data. It divides input task into smaller and manageable sub-tasks to execute them in-parallel. Steps: Map function Shuffle function Reduce function
  • 79. MapReduce – Map Function  It takes input tasks and divides them into smaller sub-tasks Sub steps: Splitting - takes input DataSet from Source and divide into smaller Sub-DataSets. Mapping - takes those smaller Sub-DataSets and perform required action or computation on each Sub- DataSet The output of this Map Function is a set of key and value pairs as <Key, Value>
  • 80. MapReduce – Shuffle Function Also called the Combine function. Sub-steps: Merging combines all key-value pairs that have the same key. Sorting takes the output of the Merging step and sorts all key-value pairs by key. The Shuffle function returns a sorted list of <Key, List<Value>> pairs to the next step.
  • 81. MapReduce – Reduce Function Takes the sorted list of <Key, List<Value>> pairs from the Shuffle function and performs the reduce operation on the values of each key.
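The three steps above can be sketched as a word count in plain Python. This is a single-process illustration of the data flow, not the Hadoop API: the function names and the sample input are assumptions, and a real cluster would run the map and reduce calls in parallel on different nodes.

```python
from collections import defaultdict


def map_function(split):
    """Mapping: emit a <word, 1> pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle_function(pairs):
    """Merging and sorting: group values by key, then sort by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Produces the sorted list of <Key, List<Value>> pairs.
    return sorted(grouped.items())


def reduce_function(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(splits):
    # Splitting is assumed to have happened already: `splits` is a list of
    # smaller sub-data sets, each handled by its own map call.
    mapped = [pair for split in splits for pair in map_function(split)]
    return [reduce_function(key, values) for key, values in shuffle_function(mapped)]


# Hypothetical input, already divided into two splits.
result = word_count(["big data big", "data hive"])
```

Here the shuffle step turns the five mapped pairs into three groups (one per distinct word), and the reduce step sums each group.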
  • 83. YARN Yet Another Resource Negotiator: a resource manager that enables Hadoop to improve its distributed processing capabilities. ResourceManager: communicates with clients, tracks resources on the cluster, and manages jobs by assigning tasks to NodeManagers. Entities: Scheduler, which allocates resources to the various tasks; ApplicationMaster, which negotiates resources from the Scheduler and works with the NodeManagers.
  • 84. YARN NodeManager: launches and tracks tasks on the datanodes. Container: a portion of a NodeManager's capacity, used by the client to run a program.
  • 86. NoSQL NoSQL ("Not only SQL") covers all databases and data stores that are not based on Relational Database Management System (RDBMS) principles. It relates to large data sets accessed and manipulated at Web scale. The new classes of database products include column-based data stores, key/value pair databases, and document databases.
  • 87. NoSQL - Types Key-value database: a big hash table of keys and values. Document-based database: stores documents made up of tagged elements. Column-based database: each storage block contains data from only one column. Graph-based database: a network database that uses nodes and edges to represent and store data.
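To make the four types concrete, the sketch below models the same small data set in each style using plain Python structures. It does not use any real NoSQL product; all record values, keys, and the helper function are hypothetical and exist only for illustration.

```python
import json

# Hypothetical sample record.
customer = {"id": 1, "name": "Asha", "city": "Chennai"}

# Key-value: a big hash table; the store sees only a key and an opaque value.
key_value_db = {"customer:1": json.dumps(customer)}

# Document-based: each document is made of tagged (named), possibly nested elements.
document_db = [
    {"_id": 1, "name": "Asha", "orders": [{"product": "Pen", "price": 10.0}]},
]

# Column-based: values of one column are stored together,
# so a query on "city" never has to touch "name".
column_db = {
    "id": [1, 2],
    "name": ["Asha", "Ravi"],
    "city": ["Chennai", "Madurai"],
}

# Graph-based: nodes hold the data, edges hold the relationships.
nodes = {
    "c1": {"type": "Customer", "name": "Asha"},
    "p1": {"type": "Product", "name": "Pen"},
}
edges = [("c1", "BOUGHT", "p1")]


def products_bought(customer_node):
    """Follow BOUGHT edges from a customer node to product names."""
    return [
        nodes[dst]["name"]
        for src, label, dst in edges
        if src == customer_node and label == "BOUGHT"
    ]
```

The choice between the styles depends on the access pattern: lookup by key, retrieval of whole nested records, scans over a single column, or traversal of relationships.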
  • 88. HIVE Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It resides on top of Hadoop to summarize Big Data and makes querying and analysis easy. It was initially developed by Facebook; the Apache Software Foundation later took it up and developed it further as open source under the name Apache Hive.
  • 90. HIVE – Data Flow The query is submitted from the UI to the driver. The driver interacts with the compiler to get the plan. The compiler creates the plan for the job to be executed. The compiler sends a metadata request to the metastore. The metastore sends the metadata back to the compiler. The compiler returns the proposed plan for executing the query to the driver. The driver sends the execution plan to the execution engine.
  • 91. HIVE – Data Flow The execution engine (EE) acts as a bridge between Hive and Hadoop to process the query. For DFS operations, the EE first contacts the namenode and then the datanodes to get the values stored in the tables. It collects the actual data related to the query from the datanodes. It also communicates bi-directionally with the Hive metastore to perform DDL (Data Definition Language) operations.
  • 92. HIVE – Data Flow The UI fetches the results from the driver, which in turn obtains them from the execution engine. Once the EE has fetched the results from the datanodes, it sends them back to the driver, which returns them to the UI (front end).
  • 93. Hive vs. Relational Databases Relational databases are "schema on write": the schema is enforced when data is loaded. Hive is "schema on read": the schema is applied only when the data is queried. Hive supports a "write once, read many" pattern.
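The difference between checking a schema at write time and applying it at read time can be sketched as follows. This is a simplified illustration in plain Python, not Hive or RDBMS code: the schema, function names, and sample rows are assumptions. Returning None for a field that does not fit the schema mirrors the way Hive yields NULL for such fields.

```python
# Hypothetical three-column schema: (column name, type converter).
SCHEMA = [("id", int), ("name", str), ("price", float)]


def write_with_schema(table, row):
    """Schema on write: validate and convert a row before it is stored."""
    if len(row) != len(SCHEMA):
        raise ValueError("row does not match the schema")
    # A bad value (e.g. a non-numeric id) raises ValueError and nothing is stored.
    table.append(tuple(cast(value) for (_, cast), value in zip(SCHEMA, row)))


def read_with_schema(raw_lines):
    """Schema on read: raw text was stored as is; the schema is applied per query."""
    for line in raw_lines:
        fields = line.split(",")
        record = {}
        for index, (name, cast) in enumerate(SCHEMA):
            try:
                record[name] = cast(fields[index])
            except (IndexError, ValueError):
                # Missing or malformed fields become None instead of failing.
                record[name] = None
        yield record


# Schema on write: the bad row is rejected at load time.
table = []
write_with_schema(table, ["1", "Pen", "10.5"])

# Schema on read: both lines were accepted at load time.
records = list(read_with_schema(["1,Pen,10.5", "x,Book"]))
```

Schema on read makes loading fast and tolerant of messy files, at the cost of discovering bad data only when it is queried.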