© 2009 IBM Corporation
Leveraging Information for Smarter Organizational Outcomes

Identity and Biometrics in the Big Data & Analytics Context

Published in: Technology
  • I hear that biometric products, if used with a backup password, are now called “below-one factor authentication”, since they make users less safe than password-only single-factor authentication, just as a house with two entrances is less safe against burglars than a house with one. This means that biometric products must be used without a backup password if security is wanted. Can it be done? It should help a lot if you have a quick look at http://www.slideshare.net/HitoshiKokumai/blind-spot-in-our-mind-eyecatching-experience

  1. Identity and Biometrics in the Big Data & Analytics Context
     Dr. Charles Li, Analytics Solution Center, Washington, DC
     Charles _Li@us.ibm.com
     June, 2013
     IBM Confidential
  2. Topics
     • ID Management, Identity & Biometrics
     • Views on Biometrics Technology and System
     • The Concept of the Big Data, Analytics and Challenges
     • Identity Establishment from All Sources
     • Identity and Biometrics in the Cloud
     • Identity and Biometrics Analytics in Near Real Time
     • Summary
  3. ID Management, Identity and Biometrics
     Diagram labels: Identity Elements; Players; Entitlement(s); Actions; Identity; Trust (Rules); Status (Environment); Reputation (History); Identity Management
  4. Views on biometrics technology and system
     What is missing?
  5. Big Data Concept
     Extract insight from a high volume, variety and velocity of data in a timely and cost-effective manner
     • Variety: Data in many forms – structured, unstructured, text and multimedia
     • Velocity: Data in Motion – Analysis of streaming data to enable decisions within fractions of a second
     • Volume: Data at Scale – from terabytes to zettabytes
  6. Analytics Concept
     Structured Data & Unstructured Content, made consumable and accessible to everyone
     Descriptive Analytics
     • What is happening?
     • What exactly is the problem?
     • How many, how often, where?
     • What actions are needed?
     Predictive Analytics
     • Simulation: What could happen?
     • Forecasting: What if these trends continue?
     • Predictive Modelling: What will happen next if?
     Prescriptive Analytics
     • Optimisation: How can we achieve the best outcome?
     • Stochastic Optimisation: How can we achieve the best outcome and address variability?
     Content Analytics
     • Extracting insight, concepts and relationships
     Visual Analytics
     • Deep insights to improve visualization and marketing interactions
  7. Biometrics Data at Scale – Static & Single Instance
     • 1 billion arrivals worldwide in 2012; United States: 100-200 million international arrivals in 2012. 1 Exabyte of travel data
     • Unique Identification Authority of India (UIDAI) plans to enroll 1.2 billion citizens (UID Program; enroll million /day; half billion by 2014). 3-4 Exabytes of biometric & biographic data
     • Prolific usage of mobile phones: 6 billion mobile phones. 6 Exabytes of behavior data
     • ID cards / border crossings / benefits / multiple instances: 7,000,000,000 x (10 Print 0.5-1MB + Face 200KB + IRIS KB). 7 Exabytes
     • EU VIS Biometrics Matching System (BMS) at 70 million individuals and 100K daily enrollments: ~100 Terabytes
     • US DoS has in the range of 100 million faces & others: at least ~10-50 Terabytes
     • DHS IDENT: over 150 million identities; 125,000 transactions daily: ~100-300 Terabytes
     • FBI NGI: over 100 million fingerprints & more coming, plus faces/iris: ~100-200 Terabytes
     Units: 1 Gigabyte = 1000 MB; 1 Terabyte = 1000 GB; 1 Petabyte = 1000 TB; 1 Exabyte = 1000 PB; 1 Zettabyte = 1000 EB; 1 Yottabyte = 1000 ZB
     Data in reality: many instances, history, transactions, logs…
  8. Big Data Sources
     System Transaction, Log and Transition Data – Several Times More!
  9. Other Big Data examples
     • 150 Exabytes: global size of “Big Data” in healthcare, growing between 1.2 and 2.4 EB / year
     • For every session, the NY Stock Exchange captures 1 Terabyte of trade information
     • AT&T transfers about 30 Petabytes of data through its network daily
     • Hadron Collider at CERN generates 40 Terabytes of usable data / day
     • Facebook processes 500+ Terabytes of data daily
     • Google processes > 24 Petabytes of data in a single day
     • Twitter processes 12 Terabytes of data daily
     • By 2016, annual Internet traffic will reach 1.3 Zettabytes
     We don’t have the most challenging problem!
  10. “Brute Force” De-Duplication
     • Cumulative de-duplication: total number of checks = N(N-1)/2 (the “combination problem”)
     • De-duplicating a 100 million population enrollment requires 4,999,999,950,000,000 checks
     • More than 15 years to complete at 10 million matches per second
     Biometric Accuracy Challenge
     • FMR at 1 identification false match per million
     • 500 false matches with a 1 million enrollment population
     • 5 million false matches with a 100 million enrollment population
     Biometric Performance at Giga Scale* (* Courtesy of Bojan Cukic)
     Prohibitive! We have some unique challenges!
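The arithmetic on slide 10 can be checked with a short script. This is a minimal sketch, not part of the original deck: the function names are hypothetical, comparisons are treated as independent, and the false match rate is a free parameter. Under this simple model, the value `1e-9` per comparison used in the example is an assumption chosen because it reproduces the slide's counts of 500 and 5 million.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def pairwise_checks(n: int) -> int:
    """Number of comparisons needed to de-duplicate n enrollments: N(N-1)/2."""
    return n * (n - 1) // 2


def years_to_complete(n: int, matches_per_second: float) -> float:
    """Wall-clock years for an exhaustive de-duplication at a fixed match rate."""
    return pairwise_checks(n) / matches_per_second / SECONDS_PER_YEAR


def expected_false_matches(n: int, fmr_per_comparison: float) -> float:
    """Expected false matches, assuming independent comparisons (an assumption)."""
    return pairwise_checks(n) * fmr_per_comparison


if __name__ == "__main__":
    n = 100_000_000
    print(pairwise_checks(n))                      # total checks for 100 million
    print(round(years_to_complete(n, 10_000_000), 1))
    print(round(expected_false_matches(1_000_000, 1e-9)))
    print(round(expected_false_matches(n, 1e-9)))
```

Because the check count grows quadratically, a 100-fold larger population needs about 10,000 times as many comparisons, which is why the slide calls exhaustive matching prohibitive.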
  11. Face the Challenges
     • Identity Establishment with All Data Sources - Leverage Entity Resolution Technologies
     • Biometrics Services in the Cloud - Leverage Big Data Infrastructure, Platforms and Software Services
     • Identity and Biometrics Analytics in Motion
  12. Establishing Identity with All Sources
     Sources
     • Biometrics (physical and behavioral)
     • Biographic information
     • Behavior data (social media usage)
     • Travel data (API, PNR)
     • Banking information
     • Web or desktop usage behavior
     • Emails
     • Multimedia
     • Spatial and temporal information
     Entity / Identity Resolution with All Sources
     Entity / identity resolution is a complex process involving the application of sophisticated algorithms across multiple heterogeneous data sources to resolve multiple records into a single fused view of an individual.
     • Reduce search space and computing resources
     • Complement low-quality images
     • Cost and benefits tradeoff
     • Systematic research necessary
     • Successful programs
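The idea on slide 12 of resolving records from several sources into one fused view can be illustrated with a toy example. This is a minimal sketch under strong assumptions: the field names (`passport`, `name`, `email`), the exact-match rule after normalization, and the `resolve` function are all hypothetical, and real entity resolution uses far more sophisticated, probabilistic matching than this.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivially different values compare equal."""
    return " ".join(value.lower().split())


def resolve(records: Sequence[Dict[str, str]], keys: Sequence[str]) -> List[List[int]]:
    """Group record indices that share any normalized key value (union-find)."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen: Dict[tuple, int] = {}
    for idx, rec in enumerate(records):
        for key in keys:
            value = rec.get(key)
            if not value:
                continue
            token = (key, normalize(value))
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    clusters: Dict[int, List[int]] = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return [members for _, members in sorted(clusters.items())]


if __name__ == "__main__":
    records = [
        {"source": "travel", "passport": "X123", "name": "Ana Diaz"},
        {"source": "bank", "name": "ana  diaz", "email": "ana@example.org"},
        {"source": "web", "email": "ANA@example.org"},
        {"source": "travel", "passport": "Y999", "name": "Bo Chen"},
    ]
    print(resolve(records, ["passport", "name", "email"]))
```

Grouping records this way before any biometric comparison is one way to read the slide's "reduce search space" point: a probe only needs to be matched against the biometric samples attached to candidate clusters rather than the whole gallery.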
  13. Biometrics Services in the Cloud - Leverage Big Data Infrastructure, Platform and Software Services
     Cloud stack diagram labels:
     • Infrastructure Platform: Management and Administration; Availability and Performance; Security and Compliance; Usage and Accounting
     • Enterprise Application Services: Application Lifecycle; Application Resources; Application Environments; Application Management; Integration
     • Cloud Services: Infrastructure and Platform as a Service
     • Enterprise+ Cloud Solutions: Software and Business Process as a Service (Smarter Commerce; Smarter Cities; Social Business; Business Analytics and Optimization)
     • Service layers: Infrastructure (IaaS); Platform (PaaS); Software (SaaS); Business Process (BPaaS)
     • Deployment: Private, Public and Hybrid Models
     Biometrics services diagram labels:
     • Standard Interface
     • Process / Data nodes (repeated)
     • Enrolment Service; 1:1; Identification Service; ….
     • Biometric Data: Fingerprint; Iris; Face
     Note: Cloud & Big Data are not the same
  14. Progress
     A Prototype - Leveraging the Cloud for Big Data Biometrics
     • E. Kohlwey et al., “Leveraging the Cloud for Big Data Biometrics”, 2011
     • A prototype system for generalized searching of cloud-scale biometric data, as well as an application of this system to the task of matching a collection of synthetic human iris images
     • Implemented with Hadoop (Map/Reduce framework)
     Successful deployment of identification algorithms for the India UID program
     • Non-traditional matching vendor technologies
     Biometrics as a Service
     • Business process as a service
     • Software as a service
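The map/reduce pattern mentioned on slide 14 can be sketched in a few lines. This is an illustration of the general pattern only: it does not use Hadoop and is not the implementation from Kohlwey et al. The representation of an iris template as a Python integer bit string, the Hamming-distance comparison, and all function names are assumptions made for the example.

```python
from functools import reduce
from typing import Dict, List, Optional, Tuple

Match = Optional[Tuple[int, str]]  # (distance, identity), or None for an empty shard


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two templates stored as integers."""
    return bin(a ^ b).count("1")


def shard(gallery: Dict[str, int], n_shards: int) -> List[Dict[str, int]]:
    """Split the gallery into n_shards partitions (round-robin over sorted ids)."""
    items = sorted(gallery.items())
    return [dict(items[i::n_shards]) for i in range(n_shards)]


def map_best(probe: int, part: Dict[str, int]) -> Match:
    """Map step: best (lowest-distance) candidate within one partition."""
    return min(((hamming(probe, code), ident) for ident, code in part.items()),
               default=None)


def reduce_best(a: Match, b: Match) -> Match:
    """Reduce step: keep the better of two partial results."""
    if a is None:
        return b
    if b is None:
        return a
    return min(a, b)


def identify(probe: int, gallery: Dict[str, int], n_shards: int = 4) -> Match:
    # Each map_best call is independent, so in a real cluster the partitions
    # would be processed on separate workers; here they run sequentially.
    partials = [map_best(probe, part) for part in shard(gallery, n_shards)]
    return reduce(reduce_best, partials, None)


if __name__ == "__main__":
    gallery = {"id-a": 0b10110010, "id-b": 0b01101100, "id-c": 0b11110000}
    print(identify(0b10110011, gallery, n_shards=2))
```

The design point is that the map step touches only its own partition and the reduce step is associative, so the work scales out by adding partitions and workers.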
  15. Challenges
     Focus on Parallelism and Scalability
     • Excellent research and testing areas
     • Bring algorithms into operational environment
     Explore defining biometrics as a service program – new way of thinking about acquisition
     • Business process as a service
     • Software as a service
     Encourage partnership among Big Data & Analytics developers, traditional biometrics solution providers
     • Big Data and Analytics players
  16. Big Data Appliance Examples
     • IBM Netezza
     • Oracle Exadata
     • Teradata
     • EMC2 Greenplum
     • SAP HANA
     • Schooner Appliance MySQL
     Example (CBP): 40 TB of data (per appliance, a few hundred cores) hosted by a little more than a dozen appliances supports 30–40% of DHS’s operations
  17. Identity and Biometrics Analytics in Near Real Time
     ROC curve calibration along the security vs. convenience trade-off
     • Allow systems to dynamically change operating criteria based on the live situation
     • This is a real challenge due to the needed ground truth…
     Quality Feedback to the Collection
     • Avoid collecting ‘bad’ data that degrades the system
     Operating Metrics Monitoring
     • Rates of enrollment, rejection, etc.
     • Geo-location and temporal information
     Fuse all data sources based on real-time feedback
     • Dynamically allocating fusion algorithms and configurations
     Provide controlled parallelism
     • System and algorithm levels
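The ROC calibration point on slide 17 amounts to re-choosing a decision threshold as conditions change. The sketch below shows one simple way to do that, assuming labeled genuine and impostor similarity scores are available (the ground truth the slide notes is hard to obtain). The function names and the "lowest threshold that meets an FMR cap" rule are assumptions for illustration, not a method taken from the deck.

```python
from typing import Optional, Sequence, Tuple


def rates(genuine: Sequence[float], impostor: Sequence[float],
          threshold: float) -> Tuple[float, float]:
    """Return (FMR, FNMR) when scores >= threshold are accepted as matches."""
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr


def pick_threshold(genuine: Sequence[float], impostor: Sequence[float],
                   max_fmr: float) -> Optional[Tuple[float, float, float]]:
    """Lowest observed score usable as a threshold with FMR <= max_fmr.

    FMR falls and FNMR rises as the threshold increases, so the first
    qualifying threshold in ascending order has the lowest FNMR.
    Returns (threshold, fmr, fnmr), or None if no observed score qualifies.
    """
    for t in sorted(set(genuine) | set(impostor)):
        fmr, fnmr = rates(genuine, impostor, t)
        if fmr <= max_fmr:
            return t, fmr, fnmr
    return None


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.75, 0.6]
    impostor = [0.1, 0.3, 0.5, 0.7]
    # A stricter FMR cap (more security) moves the threshold up and costs convenience.
    print(pick_threshold(genuine, impostor, max_fmr=0.25))
    print(pick_threshold(genuine, impostor, max_fmr=0.0))
```

Re-running `pick_threshold` on a sliding window of recent labeled scores with a cap that depends on the current alert level is one way a system could move its operating point along the ROC curve at run time.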
  18. Near Real Time on Big Data Platform
     One Approach - Streams Technology in Working
     Achieve scale:
     • By partitioning applications into software components
     • By distributing across stream-connected hardware hosts
     Infrastructure provides services for:
     • Scheduling analytics across hardware hosts
     • Establishing streaming connectivity
     Operators: Transform; Filter / Sample; Classify; Correlate; Annotate
     Where appropriate: elements can be fused together for lower communication latency
     Continuous ingestion; continuous analysis
     © 2013 IBM Corporation
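The operator chain on slide 18 (transform, filter, classify, annotate) can be mimicked with Python generators, which process each event as it arrives rather than in batches. This is only a conceptual sketch: it does not use any IBM Streams API, the event fields (`id`, `quality`, `score`, `site`) and thresholds are hypothetical, and the correlate operator is omitted for brevity.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

Event = Dict[str, Any]


def transform(stream: Iterable[Event]) -> Iterator[Event]:
    """Transform: parse the raw score string into a float."""
    for e in stream:
        yield {**e, "score": float(e["score"])}


def filter_quality(stream: Iterable[Event], min_quality: int) -> Iterator[Event]:
    """Filter: drop captures whose quality is below the minimum."""
    for e in stream:
        if e["quality"] >= min_quality:
            yield e


def classify(stream: Iterable[Event], threshold: float) -> Iterator[Event]:
    """Classify: label each event as a match or a non-match."""
    for e in stream:
        decision = "match" if e["score"] >= threshold else "no_match"
        yield {**e, "decision": decision}


def annotate(stream: Iterable[Event], site_names: Dict[str, str]) -> Iterator[Event]:
    """Annotate: attach a readable site name where one is known."""
    for e in stream:
        yield {**e, "site_name": site_names.get(e["site"], "unknown")}


def pipeline(source: Iterable[Event], min_quality: int = 50, threshold: float = 0.8,
             site_names: Optional[Dict[str, str]] = None) -> Iterator[Event]:
    """Compose the operators; events flow through one at a time."""
    stages = filter_quality(transform(source), min_quality)
    return annotate(classify(stages, threshold), site_names or {})


if __name__ == "__main__":
    events = [
        {"id": 1, "quality": 80, "score": "0.91", "site": "A"},
        {"id": 2, "quality": 20, "score": "0.95", "site": "B"},
        {"id": 3, "quality": 70, "score": "0.40", "site": "B"},
    ]
    for out in pipeline(events, site_names={"A": "Airport A"}):
        print(out)
```

Chaining the generators inside one process loosely corresponds to the slide's remark about fusing elements for lower communication latency; running each stage on its own host connected by a queue would correspond to the distributed layout.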
  19. Summary
     Re-focus on Identity
     • Biometrics as an enabling technology
     Re-thinking on
     • Open architecture
     • Vendor agnostic solution via biometrics middleware
     Big Impact by Big Data and Cloud Technologies
     • Biometrics as a Service to Leverage Cloud Computing
     Big Data Real Time Platform
     • Near real time analytics requirements
  20. Page 20, 6/18/2013
  21. A New Look - Identity and Biometrics Analytics
     Diagram labels:
     • Inputs: Travel Data; Banking Data; Spatial Data; Temporal Data; Real-time feeds; Biometrics Capture Data; Biographic Data; Unstructured data; Social Media; Info on Web; Behavioral data
     • Platform: Stream in Parallel; Big Data Platform; Big Data Solution Pipeline; Massively Parallel Processing; Real Time; High Volume
     • Services: Entity / Identity Resolution; Identification Services Including many Models
     • Outputs: Report – Descriptive Analytics; Predictive Models; Business Workflow; Resolution; Visualization Analytics; Content Analytics
