Unlock Big Data's Potential in Financial Services with Hortonworks

Published in: Technology

  1. 1. Unlock Big Data's Potential in Financial Services
     • Kurt Lueck – Pactera – US ITS Director of BI & Analytics
     • Chris Hackett – Hortonworks – Enterprise Account Manager
     • Ajay Singh – Hortonworks – Director of Technical Channels
     Pactera: Consulting, Solutions, Outsourcing. Partner for a New Era.
  2. 2. Topics
     1. Pactera & Hortonworks Intro
     2. The Hortonworks Approach
     3. Smart Banking Requires a Polyglot Approach
     4. Catching the Christmas Grinch (Fraud Detection in 2013)
     5. 360 Degree View of a Customer
     6. Next Steps
     © Pactera. Confidential. All Rights Reserved.
  3. 3. Global Footprint and Flexible Delivery Capabilities: Pactera is a global company strategically headquartered in China, enabling 360 partnerships with global brands seeking to expand in one of the world’s largest and fastest-growing markets. Global FTE: 24,000
  4. 4. Hortonworks Approach to Enterprise Hadoop: Community Driven Enterprise Apache Hadoop
     • Identify and introduce enterprise requirements into the public domain
     • Work with the community to advance and incubate open source projects
     • Apply Enterprise Rigor to provide the most stable and reliable distribution
  5. 5. Hortonworks: The Value of “Open” for You
     • Connect With the Hadoop Community: We employ a large number of Apache project committers & innovators so that you are represented in the open source community
     • Avoid Vendor Lock-in: Hortonworks Data Platform remains as close to the open source trunk as possible and is developed 100% in the open so you are never locked in
     • The partners you rely on, rely on Hortonworks: We work with partners to deeply integrate Hadoop with data center technologies so you can leverage existing skills and investments
     • Certified for the Enterprise: We engineer, test and certify the Hortonworks Data Platform at scale to ensure the reliability and stability you require for enterprise use
     • Support from the experts: We provide the highest quality of support for deploying at scale. You are supported by hundreds of years of Hadoop experience
  6. 6. Our Mission: Enable your Modern Data Architecture by delivering One Enterprise Hadoop
     • Headquarters: Palo Alto, CA
     • Employees: 240+ and growing
     • Customers: 120+ and growing
     • Investors: Benchmark, Index, Yahoo, Dragoneer, Tenaya
     Our Commitment
     • Innovate in the Open: We employ the core architects and operators of Hadoop and drive innovation through open source Apache Foundation projects to avoid vendor lock-in
     • Certify for the Enterprise: We engineer, test and certify the Hortonworks Data Platform for enterprise usage and deliver the highest quality of support
     • Interoperate with the Ecosystem: We work with partners to deeply integrate Hadoop with key technologies so you can leverage existing skills and investments
     Trusted Partners with: (partner logos not captured in the transcript)
     © Hortonworks Inc. 2013 - Confidential
  7. 7. A Modern Data Architecture (diagram)
     • APPLICATIONS: Custom Applications, Business Analytics, Packaged Applications
     • DEV & DATA TOOLS: Build & Test
     • OPERATIONAL TOOLS: Manage & Monitor
     • DATA SYSTEM: Repositories (RDBMS, EDW, MPP)
     • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
  8. 8. Goal: Interoperable and Familiar (diagram)
     • APPLICATIONS: BusinessObjects BI
     • DEV & DATA TOOLS
     • OPERATIONAL TOOLS
     • DATA SYSTEM: RDBMS, HANA, EDW, MPP
     • INFRASTRUCTURE
     • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
  9. 9. Betting on Hortonworks…
     • HDInsight & HDP for Windows
       – Only Hadoop Distribution for Windows Azure & Windows Server
       – Native integration with SQL Server, Excel, and System Center
       – Extends Hadoop to .NET community
     • Teradata Portfolio for Hadoop
       – Seamless data access between Teradata and Hadoop (SQL-H)
       – Simple management & monitoring with Viewpoint integration
       – Flexible deployment options
     • Instant Access + Infinite Scale
       – SAP can assure their customers they are deploying an SAP HANA + Hadoop architecture fully supported by SAP
       – Enables analytics apps (BOBJ) to interact with Hadoop
     (Figure labels: Complete Portfolio for Hadoop, UDA Diagram, Appliances)
  10. 10. HDP: Enterprise Hadoop Platform
      Hortonworks Data Platform (HDP)
      • The ONLY 100% open source and complete platform
      • Integrates full range of enterprise-ready services
      • Certified and tested at scale
      • Engineered for deep ecosystem interoperability
      Platform diagram
      • OPERATIONAL SERVICES: AMBARI, FALCON*, OOZIE
      • DATA SERVICES: PIG, HIVE & HCATALOG, HBASE; LOAD & EXTRACT: FLUME, SQOOP, NFS, WebHDFS
      • HADOOP CORE: MAP REDUCE, TEZ, YARN, HDFS
      • PLATFORM SERVICES: KNOX*; Enterprise Readiness (High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots)
      • Deployment: OS/VM, Cloud, Appliance
  11. 11. Transferring Hadoop Expertise: The expert source for Apache Hadoop training & certification
      • World class training programs
        – Designed to help you learn fast
        – Role-based hands-on classes with 50% lab time
      • Hadoop Certification demonstrates expertise in Development & Administration
      • Expert consulting services
      • Programs designed to transfer knowledge
      • Industry leading Hadoop Sandbox
        – Free download
        – Fastest way to learn Apache Hadoop
        – Personal, portable Hadoop environment
  12. 12. BI in Financial Markets: A Polyglot Approach
  13. 13. Why Big Data: What Can You Not Do Today?
      Store More for Less (“Data Lake”)
      • Fraud Detection
      • 360 Degree View of Customer
      • Account Risk Analysis
      • Social Media Analysis
  14. 14. Many Aspects of Smart Banking
  15. 15. Polyglot approach (diagram)
      • Analytics / Massive Process: Indexing, Clustering; Interrupt processing; Time sharing processing; A new way of data processing, one technology of MPP (Massively Parallel Processing)
      • NoSQL: Key Value DB / Key Value Stores; Large Column DB; Document-oriented DB; Graph DB
      • Hadoop: Parallel data storage model; BASE
      • RDBMS: Traditional database for OLTP and OLAP; ACID; Scale up and scale out; New MPP support
      • In-Memory Computing: SAP HANA; Software AG Terracotta; Designed for real time analytics and transactions; Column-based compression; Computing near persistence
      • In-Database Computing: SAS
      • Streams: Tools for stream data; MQ/ESB Connectors
      • ETL: ELT (transform after loading); ETL (transform while loading); ETL Tools (DataStage, Informatica, Flume, Sqoop, etc.)
      • Other diagram labels: Transactional Applications, Real Time BI, Process, Persistence, Transform, No Transform, Source, HDFS/GPFS, ftp/ftps, CEP, Data Mining, SQL, Map Reduce, Memory, RAC, Cache, Large Memory, Disk Persistence, SQL for direct loading, WS Clients, JDBC/MDX, API/WS, Multi-channels Data Sources
  16. 16. Big Data is part of the Ecosystem (diagram)
      • SOURCE DATA: clickstream, social, server logs, geo-location, sensor, text
      • Load: Flume, Sqoop
      • BATCH: Map Reduce, HIVE ETL, PIG (data processing)
      • INTERACTIVE: HIVE/SQL
      • ONLINE: HBASE
      • STREAMING: STORM
      • HCATALOG (table metadata); YARN (compute & storage)
      • Other diagram labels: USE, DB, EDW, MPP
  17. 17. Fraud Detection in 2013: Catching the Christmas Grinch
  18. 18. Fraud Story Line: Old School
  19. 19. Fighting Fraud – Using Rules & Known Patterns
      • Charlotte, NC: -$500
      • Atlanta, GA: -$500
      • Dallas, TX: -$500
      • Hong Kong: -$500
      Balance = $2000
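The rule-based approach on slide 19 can be sketched in a few lines of Python. This is an illustrative sketch only, not the presenters' implementation: the rule names, the thresholds `max_cities` and `max_drain_ratio`, and the reading of $2000 as the balance before the four withdrawals are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    city: str
    amount: float  # dollars withdrawn, as a positive number


def flag_by_rules(withdrawals, starting_balance, max_cities=2, max_drain_ratio=0.8):
    """Return the names of the (hypothetical) rules that a batch of withdrawals triggers."""
    triggered = []
    # Rule 1: the same card is used in more distinct cities than allowed.
    distinct_cities = {w.city for w in withdrawals}
    if len(distinct_cities) > max_cities:
        triggered.append("too_many_locations")
    # Rule 2: the withdrawals consume most of the starting balance.
    total = sum(w.amount for w in withdrawals)
    if starting_balance > 0 and total / starting_balance >= max_drain_ratio:
        triggered.append("balance_drained")
    return triggered


# The four withdrawals listed on the slide; the balance is assumed to be $2000 beforehand.
slide_example = [
    Withdrawal("Charlotte, NC", 500.0),
    Withdrawal("Atlanta, GA", 500.0),
    Withdrawal("Dallas, TX", 500.0),
    Withdrawal("Hong Kong", 500.0),
]
rules_hit = flag_by_rules(slide_example, starting_balance=2000.0)
print(rules_hit)  # ['too_many_locations', 'balance_drained']
```

Fixed rules like these only catch patterns that someone has already thought to encode, which is the limitation the following slides address.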
  20. 20. Fighting Fraud - Anomaly Detection: We have a very simple data model. Each credit card transaction contains the following 4 attributes:
      1. Transaction ID
      2. Time of the day
      3. Money spent
      4. Vendor type
      Here are some examples. The last one is an outlier, injected into the data set.
      YX66AJ9U  1025  20.47    Drug store
      98ZCM6B1  1910  55.50    Restaurant
      XXXX7362  0100  1875.40  Jeweler store
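As a rough illustration of anomaly detection on the three sample transactions from slide 20, the sketch below scores the "money spent" attribute with a modified z-score based on the median absolute deviation. The deck does not say which detection method was used, so the technique, the function name `find_amount_outliers`, and the 3.5 threshold are assumptions chosen for the example. A real system would also use the time and vendor attributes and far more data.

```python
from statistics import median

# (transaction_id, time_of_day_hhmm, amount, vendor_type), copied from the slide.
transactions = [
    ("YX66AJ9U", "1025", 20.47, "Drug store"),
    ("98ZCM6B1", "1910", 55.50, "Restaurant"),
    ("XXXX7362", "0100", 1875.40, "Jeweler store"),
]


def find_amount_outliers(rows, threshold=3.5):
    """Return IDs of transactions whose amount has a modified z-score above threshold."""
    amounts = [row[2] for row in rows]
    center = median(amounts)
    # Median absolute deviation: a spread estimate that the outlier itself barely moves.
    mad = median([abs(a - center) for a in amounts])
    if mad == 0:
        return []  # all amounts (nearly) identical, so nothing stands out
    return [
        row[0]
        for row in rows
        if abs(0.6745 * (row[2] - center) / mad) > threshold
    ]


outlier_ids = find_amount_outliers(transactions)
print(outlier_ids)  # ['XXXX7362']
```

The median-based score is used here instead of a mean and standard deviation because, with so few points, a single large amount would inflate the standard deviation and hide itself.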
  21. 21. Fighting Fraud - Predictive Analytics (model types shown: Predictive, Descriptive, Decision). “Predictive analytics is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns.” (Source: Wikipedia)
  22. 22. Fighting Fraud - Social Network Analysis
  23. 23. Additional Use Cases of Big Data in Financial Services
  24. 24. 6 Key Hadoop DATA TYPES
      1. Sentiment: Understand how your customers feel about your brand and products – right now
      2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website
      3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
      4. Geographic: Analyze location-based data to manage operations where they occur
      5. Server Logs: Research logs to diagnose process failures and prevent security breaches
      6. Text: Understand patterns in text across millions of web pages, emails, and documents
  25. 25. Big Data in Financial Services
      • Insurance Underwriting
      • 360 Degree View of the Customer
      • Website optimization
      • Brand sentiment
      • New Account Risk Screening
      • Accelerate Loan Processing
  26. 26. Insurance Underwriting (Financial Services; Data: Geo, Text)
      Business Problem
      • Insurance companies hold massive amounts of unstructured, text-based claim data
      • Without analyzing both structured and unstructured data, insurance companies have an incomplete view of risk
      • Data scarcity leads to moral hazard – companies sell to risky customers, safer individuals stay out of the market
      Solution
      • HDP gives underwriters more statistical confidence
      • Store and use more data, from more sources, for longer
      • Sensor and geographic data at large scale give real underwriting info for car, home, crop and cargo insurance
      © Hortonworks Inc. 2012
  27. 27. Website Optimization (Financial Services; Data: Clickstream)
      Business Problem
      • Online bankers leave a long trail of clickstream data
      • Clickstream data can reveal which product pages customers visit and what interests them
      • The huge volume of unstructured weblogs is difficult to store, refine and analyze for insight
      • Storing log data in relational databases is too expensive
      Solution
      • HDP stores all web logs, for years, at a low cost
      • Banks use that to understand user paths, do basket analysis, run A/B tests and prioritize site updates
      • Improve customer service & reduce expense
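To make "understand user paths" on slide 27 concrete, here is a small sketch that counts page-to-page transitions within each session. The log rows, session IDs, and page names are hypothetical, and on a real cluster this aggregation would typically run as a Hive or Pig job over the stored web logs; plain Python is used here only to show the logic.

```python
from collections import Counter, defaultdict

# Hypothetical weblog rows: (session_id, timestamp_seconds, page).
clicks = [
    ("s1", 10, "/home"), ("s1", 25, "/mortgages"), ("s1", 70, "/apply"),
    ("s2", 5, "/home"), ("s2", 30, "/mortgages"),
    ("s3", 12, "/home"), ("s3", 40, "/cards"),
]


def count_transitions(rows):
    """Count how often visitors move from one page directly to another."""
    sessions = defaultdict(list)
    for session_id, ts, page in rows:
        sessions[session_id].append((ts, page))
    transitions = Counter()
    for events in sessions.values():
        # Order each session by timestamp, then pair every page with the next one.
        pages = [page for _, page in sorted(events)]
        transitions.update(zip(pages, pages[1:]))
    return transitions


top_paths = count_transitions(clicks).most_common(1)
print(top_paths)  # [(('/home', '/mortgages'), 2)]
```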
  28. 28. 360° View of the Customer (Financial Services; Data: Clickstream, Text)
      Business Problem
      • Banks interact with customers across multiple channels
      • Customer interaction and product subscription is often siloed
      • Few banks can correlate customer interactions with marketing campaigns and online browsing behavior
      • Merging data in relational databases is expensive
      Solution
      • HDP gives banks a 360° view of customer behavior
      • Store data longer & track phases of the customer lifecycle
      • Gain competitive advantage: increase sales, reduce service expense and retain the best customers
  29. 29. Next Steps
  30. 30. Pactera Big Data Capability
      • Big Data Solution Architecture
        – In-Memory Solutions
        – Scalable Distributed Platforms
      • Next Generation Analytics
        – Models, Algorithms, and Simulations
        – Visualization
      • Improving Operational Ability
        – Help companies drive more operational efficiencies from existing investments.
        – Moving from the realm of data scientists into everyday business transactions and encounters.
      • New Business Processes
        – Impact on both customer intelligence and operational efficiency by making everything immediately actionable.
        – Armed with immediate decision-making capability and intelligence, companies will be able to implement new business processes that will change how business is done.
      • We ask the Right Questions
  31. 31. How Pactera can help with Big Data
      Engagement stages (diagram): Workshop (4 Hours), POC (2-4 Weeks), Pilot Concept (2-4 Weeks), Projects
      Executive Workshop: Strategies, Planning, and Expectations
      • Big Data strategy on what tomorrow will look like
      • Using Big Data to establish market dominance
      • Big Data project takeaways
      • Roadblocks to implementing Big Data analytics
      • Defining an ROI for Big Data
      • Getting the right ROI on Big Data
      Technical Workshop: End-To-End Management
      • System tuning/auto-tuning and configuration management
      • Dealing with both structured and unstructured data
      • Monitoring, diagnosis, and automated behavior detection
      Technical Workshop: Solution Architecture
      • Processor, memory, and system architectures for data analysis
      • Benchmarks, metrics, and workload characterization for big data
      • Availability, fault tolerance and recovery issues
      • Data management and analytics for vast amounts of unstructured data
      Projects
      • Benchmark & Monitoring
      • Integrations & Migrations
      • Implementation & Architecture
      • Project Management
      • Analytics
      • Reporting
  32. 32. Thank You
      • Kurt Lueck, Managing Director of BI & Analytics: Kurt.lueck@pactera.com
      • Chris Hackett: chackett@hortonworks.com
      • Ajay Singh: ajaysingh@hortonworks.com
