Identity and Biometrics in the Big Data & Analytics Context
Published in: Technology
Transcript

  • 1. Identity and Biometrics in the Big Data & Analytics Context. Dr. Charles Li, Analytics Solution Center, Washington, DC, Charles_Li@us.ibm.com. June 2013. © 2009 IBM Corporation, IBM Confidential. Leveraging Information for Smarter Organizational Outcomes.
  • 2. Topics
      • ID Management, Identity & Biometrics
      • Views on Biometrics Technology and System
      • The Concept of Big Data, Analytics and Challenges
      • Identity Establishment from All Sources
      • Identity and Biometrics in the Cloud
      • Identity and Biometrics Analytics in Near Real Time
      • Summary
  • 3. ID Management, Identity and Biometrics
      • Identity elements (diagram labels): Players, Entitlement(s), Actions, Identity, Trust (Rules), Status (Environment), Reputation (History)
      • Identity Management
  • 4. Views on Biometrics Technology and System: What is missing?
  • 5. Big Data Concept: Extract insight from a high volume, variety and velocity of data in a timely and cost-effective manner
      • Variety: Data in many forms – structured, unstructured, text and multimedia
      • Velocity: Data in motion – analysis of streaming data to enable decisions within fractions of a second
      • Volume: Data at scale – from terabytes to zettabytes
  • 6. Analytics Concept: Structured Data & Unstructured Content, made consumable and accessible to everyone
      • Analytics types: Descriptive Analytics, Predictive Analytics, Prescriptive Analytics
      • What is happening? What exactly is the problem? How many, how often, where? What actions are needed?
      • What could happen? (Simulation)
      • What if these trends continue? (Forecasting)
      • What will happen next if? (Predictive Modelling)
      • How can we achieve the best outcome? (Optimisation)
      • How can we achieve the best outcome and address variability? (Stochastic Optimisation)
      • Content Analytics: extracting insight, concepts and relationships
      • Visual Analytics: deep insights to improve visualization and marketing interactions
  • 7. Biometrics Data at Scale – Static & Single Instance
      • 1 billion arrivals worldwide in 2012; United States: 100-200 million international arrivals in 2012. About 1 exabyte of travel data
      • Unique Identification Authority of India (UIDAI) plans to enroll 1.2 billion citizens (UID Program; enroll million/day; half billion by 2014). About 3-4 exabytes of biometric & biographic data
      • Prolific usage of mobile phones: 6 billion mobile phones. About 6 exabytes of behavior data
      • ID cards / border crossings / benefits / multiple instances: 7,000,000,000 x (10 Print 0.5-1MB + Face 200KB + IRIS KB). About 7 exabytes
      • EU VIS Biometrics Matching System (BMS) at 70 million individuals and 100K daily enrollments: ~100 terabytes
      • US DoS has in the range of 100 million faces & others: at least 10-50 terabytes
      • DHS IDENT: over 150 million identities; 125,000 transactions daily: ~100-300 terabytes
      • FBI NGI: over 100 million fingerprints & more coming, plus faces/iris: ~100-200 terabytes
      • Units: 1 Gigabyte = 1000 MB; 1 Terabyte = 1000 GB; 1 Petabyte = 1000 TB; 1 Exabyte = 1000 PB; 1 Zettabyte = 1000 EB; 1 Yottabyte = 1000 ZB
      • Data in reality: many instances, history, transactions, logs…
  • 8. Big Data Sources: System Transaction, Log and Transition Data – Several Times More!
  • 9. Other Big Data Examples
      • 150 exabytes: global size of “Big Data” in healthcare, growing between 1.2 and 2.4 exabytes per year
      • For every session, the NY Stock Exchange captures 1 terabyte of trade information
      • AT&T transfers about 30 petabytes of data through its network daily
      • The Hadron Collider at CERN generates 40 terabytes of usable data per day
      • Facebook processes 500+ terabytes of data daily
      • Google processes > 24 petabytes of data in a single day
      • Twitter processes 12 terabytes of data daily
      • By 2016, annual Internet traffic will reach 1.3 zettabytes
      • We don’t have the most challenging problem!
  • 10. Biometric Performance at Giga Scale (courtesy of Bojan Cukic)
      • “Brute Force” De-Duplication
          • Cumulative de-duplication: total number of checks = N(N-1)/2, a “combination problem”
          • De-duplicating a 100 million population enrollment requires 4,999,999,950,000,000 checks
          • 15 years to complete at 10 million matches per second
      • Biometric Accuracy Challenge
          • FMR at 1 identification false match per million
          • 500 false matches with a 1 million enrollment population
          • 5 million false matches with a 100 million enrollment population
      • Prohibitive! We have some unique challenges!
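The arithmetic on slide 10 can be checked with a short sketch. The function names below are illustrative and not from the deck. The false-match calculation assumes that false matches scale with the number of pairwise comparisons. Under that assumption the slide's figures of 500 and 5 million correspond to a per-comparison rate of about one in a billion, and a per-comparison rate of one in a million would give values 1,000 times larger.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def pairwise_checks(n: int) -> int:
    """Number of comparisons needed to check every enrollee against every other."""
    return n * (n - 1) // 2

def years_to_complete(n: int, matches_per_second: float) -> float:
    """Wall-clock years for brute-force de-duplication at a fixed match rate."""
    return pairwise_checks(n) / matches_per_second / SECONDS_PER_YEAR

def expected_false_matches(n: int, fmr_per_comparison: float) -> float:
    """Expected false matches, assuming each comparison errs independently."""
    return pairwise_checks(n) * fmr_per_comparison

checks = pairwise_checks(100_000_000)
years = years_to_complete(100_000_000, 10_000_000)
print(f"{checks:,} checks, about {years:.1f} years at 10 million matches/s")

# Assumed per-comparison rate of 1e-9 (an interpretation, not stated on the slide).
for population in (1_000_000, 100_000_000):
    print(population, round(expected_false_matches(population, 1e-9)))
```

The quadratic growth is the point: a 100-fold larger population means roughly 10,000 times more checks and, at a fixed error rate, 10,000 times more false matches.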
  • 11. Face the Challenges
      • Identity Establishment with All Data Sources: leverage entity resolution technologies
      • Biometrics Services in the Cloud: leverage Big Data infrastructure, platforms and software services
      • Identity and Biometrics Analytics in Motion
  • 12. Identity Establishment with All Sources
      • Sources
          • Biometrics (physical and behavioral)
          • Biographic information
          • Behavior data (social media usage)
          • Travel data (API, PNR)
          • Banking information
          • Web or desktop usage behavior: emails, multimedia
          • Spatial and temporal information
      • Entity / Identity Resolution with All Sources: a complex process involving the application of sophisticated algorithms across multiple heterogeneous data sources to resolve multiple records into a single fused view of an individual
          • Reduces search space and computing resources
          • Complements low-quality images
          • Cost and benefit tradeoff
          • Systematic research necessary
          • Successful programs
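As a toy illustration of resolving multiple records into a single fused view, the sketch below links records from different sources whenever they share a normalized identifier value, and merges the links transitively with a union-find structure. The field names (`passport`, `email`), the sample records and the exact-match rule are all assumptions made for illustration. Production entity resolution relies on far more sophisticated, typically probabilistic, matching.

```python
from collections import defaultdict

def normalize(value) -> str:
    """Crude normalization so trivially different spellings compare equal."""
    return str(value).strip().lower()

def resolve(records, link_fields=("passport", "email")):
    """Group record indices that share any identifier value, transitively."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (field, value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        for field in link_fields:
            if rec.get(field) is None:
                continue
            key = (field, normalize(rec[field]))
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())

# Hypothetical records from three kinds of sources.
records = [
    {"source": "travel", "passport": "X123"},
    {"source": "banking", "passport": "x123 ", "email": "A@Example.com"},
    {"source": "social", "email": "a@example.com"},
    {"source": "travel", "passport": "Y999"},
]
print(resolve(records))
```

Because identifiers are indexed in a dictionary, each record is compared only against records sharing a value, which is one simple way to reduce the search space before any expensive biometric matching.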
  • 13. Biometrics Services in the Cloud – Leverage Big Data Infrastructure, Platform and Software Services
      • Enterprise+ Cloud Solutions (Software and Business Process as a Service): Smarter Commerce, Smarter Cities, Social Business, Business Analytics and Optimization
      • Cloud Services (Infrastructure and Platform as a Service)
          • Enterprise Application Services: Application Lifecycle, Application Resources, Application Environments, Application Management, Integration
          • Infrastructure Platform: Management and Administration, Availability and Performance, Security and Compliance, Usage and Accounting
      • Service layers: Infrastructure (IaaS), Platform (PaaS), Software (SaaS), Business Process (BPaaS)
      • Deployment: Private, Public and Hybrid Models
      • Biometric services (diagram labels): Standard Interface; Enrolment Service, 1:1, Identification Service, …; parallel Process/Data nodes; Biometric Data: Fingerprint, Iris, Face
      • Note: Cloud & Big Data are not the same
  • 14. Progress
      • A prototype: leveraging the cloud for Big Data biometrics
          • E. Kohlwey et al., “Leveraging the Cloud for Big Data Biometrics”, 2011
          • A prototype system for generalized searching of cloud-scale biometric data, as well as an application of this system to the task of matching a collection of synthetic human iris images
          • Implemented with Hadoop (Map/Reduce framework)
      • Successful deployment of identification algorithms for the India UID program
          • Non-traditional matching vendor technologies
      • Biometrics as a Service
          • Business process as a service
          • Software as a service
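The map/reduce pattern behind a partitioned biometric search can be sketched in a few lines of plain Python. This is not Hadoop code and not the implementation from Kohlwey et al. It assumes iris templates are fixed-length bit strings compared by Hamming distance, that the gallery is already split into non-empty partitions, and all identifiers and bit patterns are made up.

```python
from functools import reduce

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two templates stored as integers."""
    return bin(a ^ b).count("1")

def map_partition(probe: int, partition):
    """Map step: best (distance, identity) within one gallery partition."""
    return min((hamming(probe, code), ident) for ident, code in partition)

def reduce_best(a, b):
    """Reduce step: keep the closer of two candidate matches."""
    return a if a <= b else b

def identify(probe: int, partitions):
    """Run the map step per partition, then reduce to a single best match."""
    return reduce(reduce_best, (map_partition(probe, p) for p in partitions))

# Hypothetical 8-bit templates split across two partitions.
partitions = [
    [("id-1", 0b10110010), ("id-2", 0b01001101)],
    [("id-3", 0b10110011)],
]
print(identify(0b10110111, partitions))
```

Each map call touches only its own partition, so in a cluster the partitions can be searched on separate hosts and only the small per-partition winners need to travel to the reducer.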
  • 15. Challenges
      • Focus on parallelism and scalability
          • Excellent research and testing areas
          • Bring algorithms into the operational environment
      • Explore defining a biometrics-as-a-service program: a new way of thinking about acquisition
          • Business process as a service
          • Software as a service
      • Encourage partnership among Big Data & Analytics developers and traditional biometrics solution providers
          • Big Data and Analytics players
  • 16. Big Data Appliance Examples
      • IBM Netezza
      • Oracle Exadata
      • Teradata
      • EMC Greenplum
      • SAP HANA
      • Schooner Appliance MySQL
      • Example (CBP): 40 TB of data per appliance (a few hundred cores each), hosted on a little more than a dozen appliances, supporting 30–40% of DHS’s operations
  • 17. Identity and Biometrics Analytics in Near Real Time
      • ROC curve calibration along the security vs. convenience trade-off
          • Allow systems to dynamically change operating criteria based on the live situation
          • This is a real challenge due to the needed ground truth…
      • Quality feedback to the collection
          • Avoid collecting ‘bad’ data that degrades the system
      • Operating metrics monitoring
          • Rates of enrollment, rejection, etc.
          • Geo-location and temporal information
      • Fuse all data sources based on real-time feedback
          • Dynamically allocate fusion algorithms and configurations
      • Provide controlled parallelism
          • At the system and algorithm levels
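One way to picture moving the operating point along the ROC curve is the sketch below. It assumes labeled genuine and impostor similarity scores are available (the ground truth the slide calls a real challenge) and that a score at or above the threshold counts as a match. The function names and sample scores are illustrative only.

```python
def rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) when scores >= threshold are accepted as matches."""
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr

def pick_threshold(genuine, impostor, max_fmr):
    """Lowest observed score usable as a threshold with FMR <= max_fmr.

    Returns (threshold, fmr, fnmr), or None if no observed score qualifies.
    Picking the lowest qualifying threshold keeps FNMR as small as possible.
    """
    for t in sorted(set(genuine) | set(impostor)):
        fmr, fnmr = rates(genuine, impostor, t)
        if fmr <= max_fmr:
            return t, fmr, fnmr
    return None

# Hypothetical labeled scores.
genuine = [0.9, 0.8, 0.75, 0.6]
impostor = [0.1, 0.3, 0.4, 0.65]

print("convenience mode:", pick_threshold(genuine, impostor, max_fmr=0.25))
print("high-security mode:", pick_threshold(genuine, impostor, max_fmr=0.0))
```

Tightening the allowed false match rate raises the threshold and with it the false non-match rate, which is the security versus convenience trade-off in miniature.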
  • 18. Near Real Time on Big Data Platform: One Approach – Streams Technology in Working
      • Continuous ingestion, continuous analysis
      • Achieve scale
          • By partitioning applications into software components
          • By distributing across stream-connected hardware hosts
      • Infrastructure provides services for
          • Scheduling analytics across hardware hosts
          • Establishing streaming connectivity
      • Operators (diagram labels): Transform, Filter / Sample, Classify, Correlate, Annotate
      • Where appropriate, elements can be fused together for lower communication latency
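The operator chain on slide 18 can be imitated with Python generators, where each stage consumes events one at a time as they arrive. This is only a single-process sketch of the idea and does not use any IBM Streams API. The event fields, thresholds and host name are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]

def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Transform: rescale the raw matcher score to the range 0..1."""
    for e in events:
        yield {**e, "score": e["raw_score"] / 100.0}

def filter_quality(events: Iterable[Event], min_quality: int) -> Iterator[Event]:
    """Filter: drop captures whose quality is too low to trust."""
    for e in events:
        if e["quality"] >= min_quality:
            yield e

def classify(events: Iterable[Event], threshold: float) -> Iterator[Event]:
    """Classify: label each event as a match or a non-match."""
    for e in events:
        label = "match" if e["score"] >= threshold else "no-match"
        yield {**e, "decision": label}

def annotate(events: Iterable[Event], host: str) -> Iterator[Event]:
    """Annotate: record which (hypothetical) host processed the event."""
    for e in events:
        yield {**e, "processed_by": host}

def pipeline(source, min_quality=50, threshold=0.8, host="host-a"):
    """Chain the operators; nothing runs until events are pulled through."""
    staged = filter_quality(transform(source), min_quality)
    return annotate(classify(staged, threshold), host)

feed = [
    {"id": "t1", "raw_score": 91, "quality": 80},
    {"id": "t2", "raw_score": 95, "quality": 20},
    {"id": "t3", "raw_score": 40, "quality": 70},
]
results = list(pipeline(feed))
for r in results:
    print(r["id"], r["decision"], r["processed_by"])
```

Chaining the stages inside one process loosely mirrors fusing elements for lower communication latency. A distributed deployment would instead place stages on separate hosts connected by streams.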
  • 19. Summary
      • Re-focus on identity
          • Biometrics as an enabling technology
      • Re-thinking on
          • Open architecture
          • Vendor-agnostic solution via biometrics middleware
      • Big impact by Big Data and cloud technologies
          • Biometrics as a Service to leverage cloud computing
      • Big Data real-time platform
          • Near real time analytics requirements
  • 20. 6/18/2013
  • 21. A New Look – Identity and Biometrics Analytics
      • Data sources (diagram labels): Real-time feeds, Biometrics Capture Data, Biographic Data, Travel Data, Banking Data, Spatial Data, Temporal Data, Unstructured data, Social Media, Info on Web, Behavioral data
      • Platform (diagram labels): Stream in Parallel, Big Data Platform, Massively Parallel Processing, Real Time High Volume, Big Data Solution Pipeline, Entity / Identity Resolution, Identification Services Including many Models
      • Outputs (diagram labels): Report – Descriptive Analytics, Predictive Models, Business Workflow Resolution, Visualization Analytics, Content Analytics
