Li charles    biometrics analytics & big data 122013a for release
 


biometrics, big data, identity analytics



    Presentation Transcript

    • Biometrics, Identity and Big Data Analytics
      Dr. Charles Li, Analytics Solution Center, Charles_Li@us.ibm.com
      © 2013 IBM Corporation
    • Leveraging Information for Smarter Organizational Outcomes: Topics
      • Biometrics, Identity & ID Management
      • Views on Biometrics Technology and System
      • Big Data Analytics and Challenges
      • Identity Establishment from All Sources
      • Identity and Biometrics in the Cloud
      • Identity and Biometrics Analytics in Motion
      • Summary
      © 2009 IBM Corporation
    • Biometrics, Identity and ID Management
      Diagram labels: Entitlement(s); Actions; Identity; Reputation (History); Trust (Rules); Identity Establishment; Status (Environment); Identity Management; Players
    • Views on Biometrics Technology and System: What is missing?
    • Big Data Concept: extract insight from a high volume, variety and velocity of data in a timely and cost-effective manner
      • Variety (data in many forms): structured, unstructured, text and multimedia
      • Velocity (data in motion): analysis of streaming data to enable decisions within fractions of a second
      • Volume (data at scale): from terabytes to zettabytes
    • Analytics Concept: structured data and unstructured content, made consumable and accessible to everyone
      Categories shown: Descriptive Analytics, Predictive Analytics, Prescriptive Analytics
      • Biometrics Reports: What is happening? How many, how often, where? What exactly is the problem?
      • Biometrics Quality Monitoring: What actions are needed?
      • Simulation: What could happen?
      • Forecasting: What if these trends continue?
      • Predictive Modelling: What will happen next if?
      • Optimisation: How can we achieve the best outcome?
      • Stochastic Optimisation: How can we achieve the best outcome and address variability?
      • Content Analytics: extracting insight, concepts and relationships
      • Visual Analytics: deep insights to improve visualization and marketing interactions
    • Biometrics Data at Scale: Static & Single Instance; ID Cards/Border Crossings/Benefits/Multiple Instances
      Units: 1 Gigabyte = 1000 MB; 1 Terabyte = 1000 GB; 1 Petabyte = 1000 TB; 1 Exabyte = 1000 PB; 1 Zettabyte = 1000 EB; 1 Yottabyte = 1000 ZB
      • 7,000,000,000 x (10 Print 0.5-1 MB + Face 200 KB + Iris KB): 7 Exabytes
      • DHS IDENT: over 150 million identities; 125,000 transactions daily; ~100-300 Terabytes
      • FBI NGI: over 100 million fingerprints and more coming, plus faces/iris; ~100-200 Terabytes
      • US DoS: in the range of 100 million faces and others; at least 10-50 Terabytes
      • EU VIS Biometrics Matching System (BMS): 70 million individuals and 100K daily enrollment
      • Prolific usage of mobile phones: 6 billion mobile phones; 6 Exabytes of behavior data
      • 1 billion arrivals worldwide in 2012; United States: 100-200 million international arrivals in 2012; 1 Exabyte of traveling data
      • Unique Identification Authority of India (UIDAI) plans to enroll 1.2 billion citizens (UID Program; enroll million/day; half billion by 2014); ~100 Terabyte
      • Biometrics & biographic data: 3-4 Exabytes
      • Many instances, history, transaction, logs… data in reality
    • Big Data Sources: System Transaction, Log and Transition Data – Several Times More!
    • Other Big Data Examples
      • By 2016, annual Internet traffic will reach 1.3 Zettabytes
      • Google processes > 24 Petabytes of data in a single day
      • Facebook processes 500+ Terabytes of data daily
      • Twitter processes 12 Terabytes of data daily
      • AT&T transfers about 30 Petabytes of data through its network daily
      • 150 Exabytes: global size of “Big Data” in healthcare, growing between 1.2 and 2.4 Exabytes per year
      • Hadron Collider at CERN generates 40 Terabytes of usable data per day
      • For every session, the NY Stock Exchange captures 1 Terabyte of trade information
      We don’t have the most challenging problem!
    • Biometric Performance at Giga Scale*
      “Brute Force” De-Duplication
        • Cumulative de-duplication: total number of checks = N(N-1)/2 (the “combination problem”)
        • De-duplicating a 100 million enrollment population requires 4,999,999,950,000,000 checks
        • About 15 years to complete at 10 million matches per second
      Biometric Accuracy Challenge
        • FMR at 1 identification false match per million
        • 500 false matches with a 1 million enrollment population (de-duplication)
        • 5 million false matches with a 100 million enrollment population
        • Prohibitive!
      We have some unique challenges!
      * Courtesy of Bojan Cukic
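The de-duplication arithmetic on this slide can be reproduced with a short sketch. It assumes a 365-day year and treats every pairwise comparison as an independent trial with a fixed per-comparison false match rate; both are simplifying assumptions, and the function names are illustrative. Under this model the slide's false-match figures (500 and 5 million) correspond to a rate of one false match per billion comparisons; a rate of one per million comparisons would give roughly 500,000 and 5 billion.

```python
from fractions import Fraction

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def pairwise_checks(n: int) -> int:
    """Comparisons needed to de-duplicate n enrollments: N(N-1)/2."""
    return n * (n - 1) // 2


def years_to_complete(n: int, matches_per_second: int) -> float:
    """Wall-clock years for all pairwise checks at a fixed match rate."""
    return pairwise_checks(n) / matches_per_second / SECONDS_PER_YEAR


def expected_false_matches(n: int, per_comparison_fmr: Fraction) -> float:
    """Expected false matches, assuming independent comparisons."""
    return float(pairwise_checks(n) * per_comparison_fmr)


if __name__ == "__main__":
    population = 100_000_000
    print(f"{pairwise_checks(population):,} checks")
    print(f"{years_to_complete(population, 10_000_000):.1f} years")
    for rate in (Fraction(1, 10**6), Fraction(1, 10**9)):
        for n in (1_000_000, population):
            print(f"FMR {float(rate):.0e}, N={n:,}: "
                  f"{expected_false_matches(n, rate):,.0f} expected false matches")
```

Because the number of checks grows quadratically, a 100-fold increase in population multiplies both the runtime and the expected false matches by about 10,000.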
    • Face the Challenges
      Identity Establishment with All Data Sources
        • Leverage entity resolution technologies
        • Leverage ‘context accumulation’
      Biometrics Services in the Cloud
        • Leverage Big Data infrastructure and platforms
        • Leverage software services
      Biometrics and Identity Analytics in Motion
        • Monitor quality
        • Monitor performance
    • Identity Establishment with All Sources
      Data sources
        • Biometrics (physical and behavioral)
        • Biographic information
        • Behavior data (social media usage)
        • Travel data (API, PNR)
        • Credit card/banking information
        • Web or mobile app usage behavior
        • Emails
        • Multimedia
        • Spatial and temporal information
      Entity/Identity Resolution with All Sources
        • Reduce search space and computing resources
        • Complement low-quality images
        • Cost and benefits tradeoff
        • Systematic research necessary
        • Successful programs
      Entity/identity resolution is a complex process that applies sophisticated algorithms across multiple heterogeneous data sources to resolve multiple records into a single fused view of an individual.
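As a rough illustration of resolving multiple records into a single fused view, the sketch below links records from different sources whenever they share an exact identifier value, using a union-find structure. This is a deliberately simplified stand-in: production entity resolution relies on fuzzy and probabilistic matching, and the field names, sample records and the `resolve_entities` helper are all hypothetical.

```python
from collections import defaultdict


def resolve_entities(records, key_fields):
    """Fuse records that share any non-empty value in key_fields."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (field, value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        for field in key_fields:
            value = rec.get(field)
            if not value:
                continue
            key = (field, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)

    fused = []
    for members in groups.values():
        view = {"sources": [records[i].get("source") for i in members]}
        for i in members:
            for field, value in records[i].items():
                if field != "source" and value:
                    view.setdefault(field, value)  # first non-empty value wins
        fused.append(view)
    return fused


if __name__ == "__main__":
    sample = [
        {"source": "biographic", "passport": "X123", "name": "A. Example"},
        {"source": "travel", "passport": "X123", "email": "a@example.org"},
        {"source": "web", "email": "a@example.org", "device": "phone-1"},
        {"source": "banking", "passport": "Y999", "name": "B. Sample"},
    ]
    for view in resolve_entities(sample, ("passport", "email")):
        print(view)
```

The web record never mentions a passport, yet it lands in the same fused view because it shares an email with the travel record; this transitive linking is the part that accumulating context across sources makes possible.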
    • Biometrics Services in the Cloud: Leverage Big Data Infrastructure, Platform and Software Services
      Cloud Solutions: Software and Business Process as a Service
        • Business Process (BPaaS): Business Analytics and Optimization; Social Business; Smarter Commerce; Smarter Cities
        • Software (SaaS): Enrolment Service; 1:1; Identification Service; Process Data; ….; Standard Interface
      Cloud Services: Infrastructure and Platform as a Service
        • Platform (PaaS): Application Services; Application Lifecycle; Application Resources; Application Environments; Application Management; Integration
        • Infrastructure (IaaS): Infrastructure Platform; Management and Administration; Availability and Performance; Security and Compliance; Usage and Accounting; Enterprise+; Deployment
        • Enterprise biometric data: fingerprint, face, iris
      Note: Cloud and Big Data are not the same
      Private, public and hybrid models
    • Exemplary Progress
      A prototype: leveraging the cloud for Big Data biometrics
        • E. Kohlwey et al., “Leveraging the Cloud for Big Data Biometrics”, 2011
        • A prototype system for generalized searching of cloud-scale biometric data, as well as an application of this system to the task of matching a collection of synthetic human iris images
        • Implemented with Hadoop (Map/Reduce framework)
      Successful deployment of identification algorithms for the India UID program
        • Non-traditional matching vendor technologies
      Biometrics as a Service
        • Business process as a service
        • Software as a service
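The Map/Reduce pattern mentioned for the prototype can be sketched in plain Python without Hadoop. The map step compares every probe against one gallery partition; the reduce step keeps the closest gallery entry per probe. This is not the cited prototype's code: the fractional Hamming distance over bit codes, the 16-bit toy codes, the 0.32 decision threshold and all names are assumptions chosen for illustration.

```python
from itertools import chain

CODE_BITS = 16  # assumption: toy code length; real iris codes are far longer


def fractional_hamming(a: int, b: int, bits: int = CODE_BITS) -> float:
    """Fraction of differing bits between two codes stored as integers."""
    return bin(a ^ b).count("1") / bits


def map_partition(partition, probes):
    """Map step: emit (probe_id, (distance, gallery_id)) for one partition."""
    for gallery_id, gallery_code in partition:
        for probe_id, probe_code in probes.items():
            distance = fractional_hamming(probe_code, gallery_code)
            yield probe_id, (distance, gallery_id)


def reduce_best(pairs):
    """Reduce step: keep the smallest-distance candidate for each probe."""
    best = {}
    for probe_id, candidate in pairs:
        if probe_id not in best or candidate < best[probe_id]:
            best[probe_id] = candidate
    return best


def search(partitions, probes, threshold=0.32):
    """Run map over every partition, reduce, then apply the threshold."""
    mapped = chain.from_iterable(map_partition(p, probes) for p in partitions)
    best = reduce_best(mapped)
    return {
        probe_id: (gallery_id if distance <= threshold else None)
        for probe_id, (distance, gallery_id) in best.items()
    }


if __name__ == "__main__":
    gallery_partitions = [
        [("g1", 0b1111000011110000), ("g2", 0b0000111100001111)],
        [("g3", 0b1010101010101010)],
    ]
    probe_codes = {"p1": 0b1111000011110001, "p2": 0b0101010101010101}
    print(search(gallery_partitions, probe_codes))
```

Each call to `map_partition` depends only on its own partition, so in a real cluster the partitions could be processed on separate hosts and only the small per-probe candidates would need to be shuffled to the reducer.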
    • Challenges
      Focus on parallelism and scalability
        • Excellent research and testing areas
        • Bring algorithms into the operational environment
      Explore defining biometrics as a service program: a new way of thinking about acquisition
        • Business process as a service
        • Software as a service
      Encourage partnership among Big Data & Analytics developers and traditional biometrics solution providers
        • Big Data and Analytics players
    • Big Data Appliance Examples
      • IBM Netezza
      • Oracle Exadata
      • Teradata
      • EMC Greenplum
      • SAP HANA
      • Schooner Appliance (MySQL)
      Example (CBP): 40 TB of data per appliance (a few hundred cores each), hosted on a little more than a dozen appliances, supports 30-40% of DHS’s operations
    • Biometrics and Identity Analytics in Motion
      ROC curve calibration along the security vs. convenience trade-off
        • Allow systems to dynamically change operating criteria based on the live situation
        • This is a real challenge because of the ground truth it requires…
      Quality feedback to the collection
        • Avoid collecting ‘bad’ data that degrades the system
      Operating metrics monitoring
        • Rates of enrollment, rejection, etc.
        • Geo-location and temporal information
      Fuse all data sources based on real-time feedback
        • Dynamically allocate fusion algorithms and configurations
      Provide controlled parallelism
        • At the system and algorithm levels
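Moving an operating point along the ROC curve can be illustrated with a small threshold-calibration sketch. It assumes labeled comparison scores are available (the ground truth the slide calls out as the hard part), that higher scores mean greater similarity, and that the policy is expressed as a target false match rate; the function names and sample scores are hypothetical.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) when scores >= threshold are accepted as matches."""
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr


def calibrate(genuine, impostor, target_fmr):
    """Pick the lowest observed score whose FMR does not exceed target_fmr.

    FMR can only fall as the threshold rises, so the first qualifying
    threshold is the most convenient one (lowest FNMR) that still
    satisfies the security target.
    """
    for threshold in sorted(set(genuine) | set(impostor)):
        fmr, fnmr = error_rates(genuine, impostor, threshold)
        if fmr <= target_fmr:
            return threshold, fmr, fnmr
    # No observed score is strict enough: reject everything.
    return float("inf"), 0.0, 1.0


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
    impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.7]
    # A relaxed target favors convenience; a strict one favors security.
    print(calibrate(genuine_scores, impostor_scores, target_fmr=0.2))
    print(calibrate(genuine_scores, impostor_scores, target_fmr=0.0))
```

Re-running `calibrate` on a fresh window of labeled scores is one simple way a system could change its operating criteria as conditions shift, provided the labels can be trusted.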
    • One Approach: Streams Technology at Work
      • Continuous ingestion and continuous analysis: filter/sample, annotate, transform, correlate, classify
      • Near real time on a Big Data platform
      • Infrastructure provides services for scheduling analytics across hardware hosts and establishing streaming connectivity
      • Achieve scale by partitioning applications into software components and by distributing across stream-connected hardware hosts
      • Where appropriate, elements can be fused together for lower communication latency
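The operator chain on this slide (filter, annotate, classify over a continuous feed) can be mimicked with Python generators. This is only a single-process sketch of the dataflow idea and does not use any streams product API; the event fields, thresholds and function names are hypothetical.

```python
def quality_filter(samples, min_quality):
    """Filter operator: drop low-quality captures before further analysis."""
    for sample in samples:
        if sample["quality"] >= min_quality:
            yield sample


def annotate(samples, site_regions):
    """Annotate operator: attach a region looked up from the capture site."""
    for sample in samples:
        yield {**sample, "region": site_regions.get(sample["site"], "unknown")}


def classify(samples, match_threshold):
    """Classify operator: turn a comparison score into a decision."""
    for sample in samples:
        decision = "match" if sample["score"] >= match_threshold else "no-match"
        yield {**sample, "decision": decision}


def pipeline(source, site_regions, min_quality=50, match_threshold=0.8):
    """Compose the operators; each event flows through as soon as it arrives."""
    filtered = quality_filter(source, min_quality)
    annotated = annotate(filtered, site_regions)
    return classify(annotated, match_threshold)


if __name__ == "__main__":
    events = [
        {"id": 1, "site": "A", "quality": 80, "score": 0.91},
        {"id": 2, "site": "B", "quality": 30, "score": 0.95},
        {"id": 3, "site": "C", "quality": 65, "score": 0.40},
    ]
    for result in pipeline(iter(events), {"A": "north", "B": "south"}):
        print(result)
```

Chaining the generators inside one process loosely resembles the fused case on the slide, where elements share a host to cut communication latency; distributing operators across hosts would replace the direct generator hand-off with a network connection.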
    • Summary
      Re-focus on identity
        • Biometrics as an enabling technology
      Re-thinking on
        • Open architecture
        • Vendor-agnostic solution via biometrics middleware
      Big impact by Big Data and cloud technologies
        • Biometrics as a Service to leverage cloud computing
      Big Data real-time platform
        • Near real-time analytics requirements
    • A New Look: Identity and Biometrics Analytics
      Diagram labels: Real-time feeds; Business Workflow Resolution; Real Time Biometrics Capture Data; Biographic Data; Stream in Parallel Including Many Models; Entity/Identity Resolution Pipeline; Identification Services; Predictive Models; Unstructured data; Social Media; Info on Web; Behavioral data; High Volume; Content Analytics; Big Data Platform; Travel Data; Banking Data; Spatial Data; Temporal Data; Big Data Solution; Massively Parallel Processing; Visualization; Analytics Report (Descriptive Analytics)