AADHAAR Card - Database Creation
1. Creating A Unique Identity For Every Resident
Under The Guidance
Dr. T. Nambirajan
Professor
Department of Management Studies
School of Management
BY
GROUP 8
BASIL JOHN
PANCHAMI
SARITHA
GIRIDHARAN
KABILAN
JOEL JOSEPH
2. 2
• A collection of interrelated data
• A set of programs to access the data
• A DBMS contains information about a particular enterprise
• A DBMS provides an environment that is both convenient and efficient to use
• Database management systems were developed to handle the difficulties of typical file-processing systems supported by conventional operating systems
4. 4
Principles of Aadhaar
One-time standardized Aadhaar enrollment establishes uniqueness of resident via
‘biometric de-duplication’
• Only one Aadhaar number per eligible individual
Online Authentication is provided by UIDAI
• Demographic Data (Name, Address, DOB, Gender)
• Biometric Data (Fingerprint)
Aadhaar, subject to online authentication, is proof of ID
Flow: Aadhaar enrolment / update (= KYC) → Aadhaar No. issued and stored in Auth. Server → "Verification" of KYC (Authentication)
5. 5
Features of Aadhaar
1. Aadhaar is a 12-digit number – No Cards
2. Random Number – No Intelligence
3. Standard Attributes – No Profiling or Application Information
4. All Residents, including Children, get Numbers
5. Introducer System
6. Partnership Model
7. Flexible Authentication Interface to Partners
6. 6
Benefits of Aadhaar (I)
• Reduces Leakages – no fakes, no duplicates
• Provides Identities – to all, with special focus on the marginalised and the excluded
• Breaks Silos – enables connectivity among databases; enables consolidation
7. 7
Benefits of Aadhaar (II)
• Enables – financial inclusion; electronic transfer of benefits
• Enhances – security of transactions; access to services
• Ensures – mobility in various applications; building up of applications
8. 8
Registrar On-boarding Process
1. MoU with the State Government
2. Empowered Committee and Implementation Committee
3. Nodal Department and Registrar
4. KYR+ Fields
5. Vendor Selection
6. Enrolment Plan
7. IEC Activity
8. 13th FC Funds
9. Monitoring the enrolment process
10. ICT Infrastructure
11. Aligning UID number to Databases & Government Programmes
11. 11
Demographic data fields captured during Aadhaar enrolment
Field Name                     Comments
Name                           PoI documents required
Date of Birth                  Approximate / Declared / Verified
Gender                         M/F/T
Address                        PoA documents required
Parent/Spouse/Guardian Name    Optional (mandatory for a child below 5 years)
Introducer UID                 Where PoI/PoA not provided
12. 12
Capture Demographic & Biometric Data
Optional data:
• Introducer data for verification, and/or
• Data of a relative who has a UID number or an enrolment number
• Phone no., email address
Biometric data:
• Resident's photograph
• Resident's fingerprints
• Resident's iris
21. 21
Enrolment Data
• 600 to 800 million UIDs in 4 years
• 1 million a day with transaction, durability guarantees
• 350+ trillion matches every day
• ~5MB per resident
• Maps to about 10-15 PB of raw data (2048-bit PKI encrypted)
• About 30 TB I/O every day
• Replication and backup across DCs of about 5+ TB of incremental
data every day
• Lifecycle updates and new enrolments will continue forever
• Enrolment data moves from very hot to cold, needing
multi-layered storage architecture
• Additional process data
• Several million events on an average moving through async
channels (some persistent and some transient)
• Needing insert and update guarantees across data stores
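The enrolment volumes quoted above are mutually consistent; a quick back-of-envelope check using the slide's own numbers (the 10-15 PB raw-store figure will additionally reflect encryption overhead and replicas, which this sketch ignores):

```python
# Consistency check of the enrolment-volume figures quoted above.
enrolments_per_day = 1_000_000      # "1 million a day"
mb_per_resident = 5                 # "~5 MB per resident"
residents = 800_000_000             # upper end of "600 to 800 million UIDs"

# Daily incremental data replicated/backed up across DCs:
daily_tb = enrolments_per_day * mb_per_resident / 1_000_000  # MB -> TB
print(daily_tb)   # 5.0 TB, matching "5+ TB of incremental data every day"

# Raw store at full enrolment, before encryption overhead and replication:
total_pb = residents * mb_per_resident / 1_000_000_000       # MB -> PB
print(total_pb)   # 4.0 PB of plain payload under the 10-15 PB total
```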
22. 22
Authentication Data
• 100+ million authentications per day (10 hrs)
• Possible high variance on peak and average
• Sub second response
• Guaranteed audits
• Multi-DC architecture
• All changes need to be propagated from enrolment data stores to
all authentication sites
• Authentication request is about 4 KB
• 100 million authentications a day
• 1 billion audit records in 10 days (30+ billion a year)
• 4 TB encrypted audit logs in 10 days
• Audit write must be guaranteed
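The authentication figures above also check out arithmetically; a sketch using the slide's stated numbers:

```python
# Back-of-envelope check of the authentication figures quoted above.
auths_per_day = 100_000_000        # "100+ million authentications per day"
window_hours = 10                  # served within a 10-hour window

avg_tps = auths_per_day / (window_hours * 3600)
print(round(avg_tps))              # ~2778 requests/sec on average; peaks run higher

request_kb = 4                     # "authentication request is about 4 K(B)"
audit_records_10d = auths_per_day * 10
print(audit_records_10d)           # 1 billion audit records in 10 days

audit_tb_10d = audit_records_10d * request_kb / 1_000_000_000  # KB -> TB
print(audit_tb_10d)                # 4.0 TB of audit logs in 10 days, as stated
```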
23. 23
Aadhaar Data Stores
• Mongo cluster (all enrolment records/documents – demographics + photo), shards 1-5
  – Low-latency indexed read (documents per sec), high-latency random search (seconds per read)
• MySQL (all UID-generated records – demographics only, track & trace, enrolment status): UID master (sharded) + Enrolment DB
  – Low-latency indexed read (milliseconds per read), high-latency random search (seconds per read)
• Solr cluster (all enrolment records/documents – selected demographics only), shards 0, 2, 6, 9, a, d, f
  – Low-latency indexed read and low-latency random search (documents per sec)
• HDFS (all raw packets), data nodes 1-20
  – High read throughput (MB per sec), high-latency read (seconds per read)
• HBase (all enrolment biometric templates), region servers 1-20
  – High read throughput (MB per sec), low-to-medium-latency read (milliseconds per read)
• NFS (all archived raw packets), LUN 1-4
  – Moderate read throughput, high-latency read (seconds per read)
24. 24
Systems Architecture
• Work distribution using SEDA & messaging
• Ability to scale within the JVM and across JVMs
• Recovery through check-pointing
• Sync HTTP-based Auth gateway
• Protocol Buffers & XML payloads
• Sharded clusters
• Near real-time data delivery to warehouse
• Nightly data-sets used to build dashboards, data marts and reports
• Real-time monitoring using events
25. 25
Enrolment Biometric Middleware
• Distribute and reconcile biometric data extraction and
de-dup requests across multiple vendors (ABISs)
• Biometric data de-referencing/read service (HTTP) over
sharded HDFS and NFS
• Serves bulk of the HDFS read requests (25TB per day)
• Locate data from multiple HDFS clusters
– Sharded by read/write patterns : New, Archive, Purge
• Calculates and maintains Volume allocation, SLA
breach thresholds of ABISs
• Thresholds stored in ZK and pushed to middleware
nodes
26. 26
Event Streams & Sinks
• Event framework supporting different interaction/data
durability patterns
• P2P, Pub-Sub
• Intra-JVM and Queue destinations - Durable / Non-Durable
• Fire & Forget, Ack. after processing
• Event Sinks
• Ephemeral data consumed by counters, metrics (dashboard)
• Rolling file appenders that push data to HDFS
– Primary mechanism for delivering raw fact data from
transactional systems to the warehouse staging area
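The fire-and-forget publish and ack-after-processing consumption described above can be sketched with an in-memory queue standing in for the real intra-JVM/queue destinations (all names here are illustrative, not UIDAI's):

```python
import queue
import threading

# Minimal sketch: producers fire-and-forget events onto a queue;
# a sink consumes them and acks only after processing completes.
events = queue.Queue()
counters = {"processed": 0}   # ephemeral metric of the kind a dashboard reads

def sink():
    while True:
        evt = events.get()            # blocks until an event arrives
        if evt is None:               # sentinel: shut the sink down
            events.task_done()
            break
        counters["processed"] += 1    # the "processing" step
        events.task_done()            # ack only after processing

t = threading.Thread(target=sink)
t.start()
for i in range(5):
    events.put({"type": "enrolment", "seq": i})   # fire & forget publish
events.put(None)
events.join()                         # returns once every event is acked
t.join()
print(counters["processed"])          # 5
```

A durable destination would persist events before acking the producer; this in-memory queue models only the non-durable case.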
27. 27
Data Analysis
• Statistical analysis from millions of events
• View into quality of enrolments – e.g. Enrolment
Agencies, Operators
• Feature introduction – e.g. Based on avg. time taken for
biometric capture, demographic data input
• Enrolment volumes – e.g. by Registrar, Agency, Operator, etc.
– Useful in fraud detection
• Goal to share anonymized data sets for use by
industry and academia – information transparency
• Various reports – Self-serve, Canned, Operational
and/or Aggregates
28. 28
UID BI Platform – Data Analysis Architecture
• Sources: UIDAI systems events (RabbitMQ), server DB (MySQL), Hadoop HDFS
• Staging: event CSVs and raw data land in the data warehouse (HDFS/Hive) as fact data and dimension data
• ETL: Pig, Hive and Pentaho Kettle build datasets, on-demand datasets and datamarts (MySQL); dimension data is kept in MySQL
• A Data Access Framework serves the processed data
• Delivery: canned reports, dashboards and self-service analytics via Pentaho BI and FusionCharts, distributed by e-mail/portal/others
29. 29
PERSONAL DETAILS 1
FIELD NAME       DATA TYPE   KEY
NAME             VARCHAR     CONSTRAINT
MARITAL STATUS   VARCHAR     CONSTRAINT
ADDRESS          VARCHAR     CONSTRAINT
PHONE NUMBER     NUMBER      CONSTRAINT
PINCODE          NUMBER      CONSTRAINT
REGISTER NO      NUMBER      PRIMARY KEY
30. 30
PERSONAL DETAILS 2
FIELD NAME    DATA TYPE   KEY
NAME          VARCHAR     CONSTRAINT
Y.O.B         DATE        CONSTRAINT
GENDER        VARCHAR     CONSTRAINT
REGISTER NO   NUMBER      FOREIGN KEY
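The two tables are linked on REGISTER NO: a primary key in PERSONAL DETAILS 1 and a foreign key in PERSONAL DETAILS 2. A minimal sketch using SQLite for illustration; identifiers are rewritten with underscores because SQL names cannot contain spaces:

```python
import sqlite3

# Sketch of the two-table schema above, linked on register_no.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled

con.execute("""CREATE TABLE personal_details_1 (
    name           VARCHAR(10),
    marital_status VARCHAR(12),
    address        VARCHAR(20),
    phone_number   NUMERIC(10),
    pincode        NUMERIC(7),
    register_no    NUMERIC(17) PRIMARY KEY)""")

con.execute("""CREATE TABLE personal_details_2 (
    name        VARCHAR(10),
    yob         DATE,
    gender      VARCHAR(1),
    register_no NUMERIC(17),
    FOREIGN KEY (register_no) REFERENCES personal_details_1(register_no))""")

con.execute("INSERT INTO personal_details_1 (name, register_no) VALUES ('xxx', 1913)")
con.execute("INSERT INTO personal_details_2 VALUES ('xxx', '1990-01-01', 'F', 1913)")

# A child row whose register_no has no matching parent is rejected:
try:
    con.execute("INSERT INTO personal_details_2 VALUES ('yyy', '1991-01-01', 'M', 9999)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The foreign key guarantees that every row of PERSONAL DETAILS 2 refers to a registered person, which is the point of splitting the details across two tables.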
31. 31
FUNCTIONS USED IN THE DATABASE CREATION
• Data Query Language (DQL)
  • SELECT (retrieve)
• Data Manipulation Language (DML)
  • INSERT
  • UPDATE
  • DELETE
• Data Definition Language (DDL)
  • CREATE
  • ALTER
  • DROP
• Transaction Control Language (TCL)
  • COMMIT
  • ROLLBACK
  • SAVEPOINT
• Data Control Language (DCL)
  • GRANT
  • REVOKE
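The transaction-control statements are the least familiar of these; a minimal sketch of COMMIT, ROLLBACK and SAVEPOINT, using SQLite for illustration (GRANT/REVOKE are omitted because SQLite has no user privileges; they belong to Data Control Language, not transaction control):

```python
import sqlite3

# Sketch of COMMIT / ROLLBACK / SAVEPOINT using SQLite.
con = sqlite3.connect(":memory:", isolation_level=None)  # manage txns manually
con.execute("CREATE TABLE personal_details (name VARCHAR(10))")

con.execute("BEGIN")
con.execute("INSERT INTO personal_details VALUES ('xxx')")
con.execute("SAVEPOINT sp1")                 # mark a point inside the transaction
con.execute("INSERT INTO personal_details VALUES ('yyy')")
con.execute("ROLLBACK TO sp1")               # undo only the work after the savepoint
con.execute("COMMIT")                        # make the surviving work durable

rows = con.execute("SELECT name FROM personal_details").fetchall()
print(rows)  # [('xxx',)] -- 'yyy' was rolled back, 'xxx' was committed
```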
32. 32
PROCESS OF DATABASE CREATION
• CREATION:
CREATE TABLE tablename (columnname datatype(size), columnname datatype(size));
CREATE TABLE personal_details (name VARCHAR(10), marital_status VARCHAR(12), address VARCHAR(20), phone_number NUMBER(10), pincode NUMBER(7), register_no NUMBER(17));
33. 33
• INSERTION:
INSERT INTO tablename [(columnname, columnname)] VALUES (expression, expression);
INSERT INTO personal_details (name, marital_status, address, phone_number, pincode, register_no) VALUES ('xxx', 'w/o yyy', 'no:14, Nehru Street, Kamaraj Nagar, Puducherry', 9123456789, 605011, '3560 2513 1913');
34. 34
• UPDATION:
UPDATE tablename SET columnname = expression, columnname = expression ... WHERE columnname = expression;
UPDATE personal_details SET name = 'www' WHERE pincode = 605011;
35. 35
• RETRIEVAL:
SELECT columnname, columnname FROM tablename;
SELECT name, register_no FROM personal_details;
SELECT name, register_no FROM personal_details WHERE phone_number = 9123456789;
• DELETION:
DELETE FROM tablename;
DELETE FROM personal_details;
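The statements on slides 32-35 can be run end-to-end; a sketch using SQLite for illustration, with identifiers rewritten without spaces and every column stored as text for simplicity (SQLite has no NUMBER type):

```python
import sqlite3

# CREATE / INSERT / UPDATE / SELECT / DELETE from the slides above, end to end.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE personal_details (
    name VARCHAR(10), marital_status VARCHAR(12), address VARCHAR(20),
    phone_number VARCHAR(10), pincode VARCHAR(7), register_no VARCHAR(17))""")

con.execute("""INSERT INTO personal_details
    (name, marital_status, address, phone_number, pincode, register_no)
    VALUES ('xxx', 'w/o yyy', 'no:14, Nehru Street, Kamaraj Nagar, Puducherry',
            '9123456789', '605011', '3560 2513 1913')""")

con.execute("UPDATE personal_details SET name = 'www' WHERE pincode = '605011'")

rows = con.execute("""SELECT name, register_no FROM personal_details
                      WHERE phone_number = '9123456789'""").fetchall()
print(rows)  # [('www', '3560 2513 1913')] -- the UPDATE is visible to the SELECT

con.execute("DELETE FROM personal_details")  # DELETE with no WHERE removes all rows
print(con.execute("SELECT COUNT(*) FROM personal_details").fetchone()[0])  # 0
```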