Our Big Data Journey: A Consultant's Perspective

David Douglas, CrinLogic

The Big Data headlines are unrelenting, with each passing day seemingly bringing new discoveries, products, partnerships, venture funds, and you name it into the mix. If anything, it is all a bit confusing. Listening to all this, you might conclude that Big Data will solve most of your problems, place your company miles ahead of your competition, drive your Net Promoter Scores through the roof, and fall just short of solving world hunger (ok…maybe not that far).

And one can’t blame you if you think all one needs to do is install the Hadoop ecosystem of projects, conjure up some possible business use cases, throw some commodity hardware into the mix, attend some training, purchase some Big Data analytics software and VOILA, you have arrived and can enjoy the fruits of your Big Data efforts.

With tongue firmly planted in cheek, the reality is vastly different. This talk is partially a reality check on Big Data implementation strategies - starting with Big Data is easy, becoming proficient is hard, fully integrating into a broader enterprise data strategy is very hard – and partially an information sharing session on what we’re learning as we engage with customers in various industries on Big Data. Among other things we will explore: building the business case; software and hardware requirements analysis; selection process and implementation approaches; what tends to work well, not so well, and what to avoid; and how big data is likely to affect enterprise data architecture.

David Douglas is a member of Hadoop-DC User Group and is a co-founder of CrinLogic, a Big Data consultancy based in the greater DC area. He has devoted his 17 years of professional experience to helping clients maximize the value of their strategic IT initiatives. Prior to co-founding CrinLogic, David started two other companies. The first was an angel-backed Sales Force Automation software company he sold in 2002 and the second is a consulting services company that focuses on Agile and Lean software adoption and large-scale program implementation services. He helped start the Data Warehousing practice at American Management Systems and was one of the first consultants to join IBM’s Business Intelligence practice.

Presentation Transcript

  • Our Big Data Journey: A Consultant’s Perspective. April 26, 2012. 5/1/2012 1 CrinLogic
  • A little Big Data story to start us out ;)
  • About me
    David Douglas is a member of Hadoop-DC and co-founder of CrinLogic. He has over 17 years of IT consulting experience with a concentration in Business Intelligence, Agile and Lean software development, and large program implementations. He is a passionate believer in Big Data and the enormous possibilities it offers.
    CrinLogic is a Big Data consulting firm. Our passion for Big Data is surpassed only by our curiosity and love of learning. We offer full-service Big Data consulting and training. Visit us at www.CrinLogic.com. We are based in DC, Chicago, Austin, and Sarajevo.
    david@crinlogic.com, 443.413.4038
  • This talk is about…
    Some things I’d like you to walk away with:
    1. A big picture perspective on this market
    2. What customers are saying
    3. Thoughts on developing a business case
    4. Learnings
    What this talk is not about:
    1. A technical discussion of Big Data
  • Data for this talk came from
    1. Talking with 30 plus companies from all walks of life and all stages of maturity
    2. Talking with colleagues in the Big Data space (hardware and software vendors)
    3. Current customer engagements
    4. Research
    My biggest surprise is my awareness of how little I know. No one really understands how to build a Big Data solution. We are all learning as we go.
  • My Perspective on Big Data
    [Diagram labels: Enterprise Data Architecture; Iterative; Rising tide; Post adopter syndrome; Failures; Business problem focused; Systems Thinking]
  • Yes it is big and growing!
  • Except when compared to Bieber
  • And me of course ;) Though I’ve been trending down…
  • Need to expand my Network on LinkedIn
  • So what do the customers really think?
  • They are confused and rightfully so!
  • No generally accepted definition for “Big Data”
    “We don’t generate enough data for that”
    “Don’t you need at least 100TBs?”
    Or they simply think they are already using Big Data
    And lest we forget the 3Vs… Volume, Variety, Velocity (just a couple pointers on these)
  • So many products and choices promising so much
    [Diagram labels: Where; Database; Open Source; Hardware; Analytical Tools; Network]
  • So many software offerings
  • No generally accepted definition for “Data Scientist”
    Are they a critical success factor for Big Data solutions? [True or False]
    Were they a critical success factor for Business Intelligence solutions?
    OSEMN: Obtain, Scrub, Explore, Model, iNterpret (www.dataists.com, Hilary Mason & Chris Wiggins)
  • So tell me the why please?
  • McKinsey’s 5 Value Propositions
    1. Make information transparent and usable more readily
    2. Expose variability and enable performance improvement
    3. Better customer segmentations
    4. Advanced analytics for better decision making
    5. New products
  • Not seeing the Big Analytics Piece
    In fact, many of the companies employing Big Data that we’ve talked to or are working with are not doing big data analytics
  • Tactical versus Strategic
    Tactical solves an immediate pain point
    • Batch jobs taking too long
    • Reaching limit of scalability on current infrastructure
    • Budget was reduced recently but still have to deliver
    • New project ‘just so happens’ to need this newer technology
    Strategic implies Big Data as a strategic direction (much harder)
    • Seeking competitive differentiation
    • Creating actual solutions with value
  • Big Data Strategic Business Case Approach
  • Figure out the Business Case
    Congratulations! The CEO of a large Financial Services firm has asked you and your team to map out the company’s Big Data Strategy so he can present to the board. He is known for being thorough. Now get to work!!
    Of the below choices, which is the best first step?
    a. Scour the Internet for Big Data use case success stories for Financial Services and then go talk to VPs in that area
    b. Build a virtual cluster on your machine, open direct link to Twitter hose and show CEO what the community is saying about him real-time
    c. Phone a friend (or CrinLogic)
    d. Build relationships, interview all areas of the company, research market, and consolidate the results [but time-box it to a couple weeks]
    Goal is to identify the most appropriate areas to start…high reward…high visibility
  • This can be helpful…
    [Figure: business process map for a sample large financial institution. Core processes: Develop Business Strategy; Acquire Customers; Manage Customer Relationship; Service Customers; Manage Delinquencies, Recoveries & Fraud; Manage Finance & Accounting. Supporting processes: Manage Correspondence; Manage Human Resources; Manage Information Technology; Manage Regulatory Affairs & Compliance. Each process is broken down into sub-process boxes.]
  • Identify Big Data Impact Areas
    [Figure: the same business process map for the sample large financial institution, with each process box shaded by expected Big Data impact. Legend: No Impact; Low Impact; Moderate Impact; High Impact.]
  • Naturally! Fraud & Recoveries
    [Figure: drill-down of Manage Delinquencies, Recoveries & Fraud into its sub-processes: Develop Collections Strategies; Develop Recoveries Strategies; Collect on Delinquent Accounts; Recover Charged Off Accounts; Develop Fraud Strategies; Detect and Recover Fraud. Each sub-process lists its activities (for example Charge Off Bad Debt, Detect Fraud, Recover Fraud), shaded by impact. Legend: No Impact; Low Impact; Moderate Impact; High Impact.]
  • Be ready to answer these questions
    1. Do they currently have an analytics group?
    2. Do they make decisions based on data?
    3. Do they have data center management skills?
    4. Do they have stringent regulatory requirements?
    5. What are the current sources of data?
    6. What other sources of data are of interest?
    7. What are their KPIs?
    8. What is the maturity of their enterprise data architecture?
    9. What is the maturity of their business intelligence initiative(s)?
    10. Others?
  • Predictive Analytics Maturity Model
    [Figure: five maturity levels plotted against product features, spanning Data/Software to Decision Sciences.
    Level 1: Analytics Laggard. BI tools and reporting engines, MS Office tools, really basic analytics, internal attempts.
    Level 2: Localized Analytics (Some Sales Drilldown). Full BI suites with some data mining/analytics, mostly built-in (Oracle Suite, Microsoft Suite, SAS, IBM, Spotfire, […]).
    Level 3: Analytics Amateur (Some BI). Specialized/targeted analytics products (pricing and promotions, marketing mix, segmentation, assortment, churn/attrition, supply chain, […]). Current market is fragmented and overlapping, highly specialized, with many players often producing Excel-based tools.
    Level 4: Analytical Practitioner (In-house team). Forecasting/full simulation, threshold-based insights, preemptive suggestions, forward-looking DSS insights, automated insight decks.
    Level 5: Analytical Master (Institutionalized Analytics). Full picture/context optimization with integrated workflow (ERP …), simulators (supply chain, PnP, mix, media buy), actionable and implementable insights.]
    MAP 4.0. Model courtesy of Dhiraj Rajaram, CEO Mu Sigma, and Joseph de Castelnau, SVP Engineering Nielsen
  • Opportunity Areas
    [Bubble chart: Business Strategic Value (Low to High) against IT Strategic Value (Low to High). Size of bubble = Est. Effort.]
    Highest benefits are most likely realized when building these products or features
    Source: “Measuring the Business Value of Information Technology”, Intel Press
  • Opportunity Areas
    [Bubble chart: Business Strategic Alignment (Low to High) against IT Strategic Alignment (Low to High). Size of bubble = Est. Effort.]
    So why do these get built?
    Source: “Measuring the Business Value of Information Technology”, Intel Press
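A rough way to work with the opportunity chart is to score each candidate on the two axes and carry the effort estimate as the bubble size. The sketch below is only an illustration: the 0-to-1 scale, the 0.5 threshold, the `Opportunity` type, and all candidate names and scores are assumptions, not figures from the talk or from the Intel Press model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opportunity:
    name: str
    business_value: float  # vertical axis, 0.0 (low) to 1.0 (high)
    it_value: float        # horizontal axis, 0.0 (low) to 1.0 (high)
    effort: float          # bubble size: estimated effort, arbitrary units


def quadrant(opp: Opportunity, threshold: float = 0.5) -> str:
    """Name the chart quadrant an opportunity falls in."""
    business = "high" if opp.business_value >= threshold else "low"
    it = "high" if opp.it_value >= threshold else "low"
    return f"{business} business / {it} IT"


def shortlist(opps: list[Opportunity], threshold: float = 0.5) -> list[Opportunity]:
    """Keep the high/high quadrant and put the smallest bubbles first."""
    top_right = [
        o for o in opps
        if o.business_value >= threshold and o.it_value >= threshold
    ]
    return sorted(top_right, key=lambda o: o.effort)


# Hypothetical candidates; names and scores are made up for illustration.
CANDIDATES = [
    Opportunity("fraud detection", 0.9, 0.8, 8.0),
    Opportunity("batch job offload", 0.6, 0.7, 3.0),
    Opportunity("log archive", 0.2, 0.6, 2.0),
    Opportunity("pet project", 0.2, 0.1, 5.0),
]

for opp in shortlist(CANDIDATES):
    print(f"{opp.name}: {quadrant(opp)}, effort {opp.effort}")
```

Sorting the top-right quadrant by effort is one possible reading of "go small"; anything that lands in the low/low quadrant is the kind of item the second chart asks about.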
  • The Things We Learned
  • Implementation Approach
    • Big Data does not lend itself to a Big Bang approach (actually does anything really?)
    • Proofs of concept make perfect sense to gain traction (top-down push is preferable to federated)
    • As with any effort with such potential, it needs appropriate oversight by a combination of IT and business executives
    Other considerations
    • Form a central team with key skills in building Big Data solutions. This consulting team should help train, mentor, and provide consulting expertise to new initiatives…helps ensure consistency in approach.
    • There is a price of entry…each new participating area should bring resources to the table
    • Encourage building a community of analytic junkies and support them…community building…goal is information sharing…build a Big Data culture
    • Preference for consolidation
  • Components of Commodity Hardware
    # sockets, # cores, core memory, processor speed
    SATA, SSD, # disks
  • 2009/2010 H/W Recommendations
    • 4 x 1TB hard drives
    • 2 x quad-core CPUs, each 2.0-2.5GHz
    • 16GB RAM
    • Gigabit Ethernet
  • Commodity Hardware Today
    1U (approx $4K): 4-6 x 2TB SATA drives; 1 or 2 socket; 1 x 6 or 2 x 6 cores; 4GB core memory; 24+ GB RAM
    2U (approx $6K): 12 x 2TB SATA drives; 1 or 2 socket; 1 x 6 or 2 x 6 cores; 4-8GB core memory; 24+ GB RAM; 10GB ethernet? ($7K)
    4U (approx $12K): 20 or 36 x 2TB SATA drives; 80 x 3TB (???); 1 or 2 socket; 1 x 6 or 2 x 6 cores; 4-8GB core memory
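The node prices above can be turned into a back-of-the-envelope hardware price per TB. This is a sketch under stated assumptions: it uses the top of each drive-count range from the slide, assumes HDFS's default replication factor of 3 for the "after replication" figure, and ignores formatted capacity, scratch space, networking, and everything else that is not the box itself.

```python
HDFS_REPLICATION = 3  # assumption: HDFS default replication factor
DRIVE_TB = 2          # 2TB SATA drives, as on the slide

# (form factor, drive count, approximate price in USD).
# Drive counts take the top of each range quoted on the slide.
NODES = [("1U", 6, 4_000), ("2U", 12, 6_000), ("4U", 36, 12_000)]


def cost_per_tb(drives: int, price_usd: float, replication: int = 1) -> float:
    """Hardware price per TB: raw (replication=1) or per TB of replicated data."""
    raw_tb = drives * DRIVE_TB
    return price_usd * replication / raw_tb


for form_factor, drives, price in NODES:
    raw = cost_per_tb(drives, price)
    usable = cost_per_tb(drives, price, HDFS_REPLICATION)
    print(f"{form_factor}: ${raw:,.0f}/TB raw, ${usable:,.0f}/TB after replication")
```

Under these assumptions the denser chassis look cheaper per TB, but this is purchase price only and says nothing about total cost.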
  • Thoughts on Storage TCO
    • *Price != Cost and TCA is < 20% of TCO
    • $ per TB not an exact science
    *David Merrill, Hitachi Data Systems Chief Economist, “Storage Economics: Four Principles for Reducing Total Cost of Ownership”, July 2011
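The "TCA is < 20% of TCO" rule of thumb can be inverted to get a floor on total cost from a purchase price. A minimal sketch, reading TCA as the total cost of acquisition (an assumption about the acronym); the function name and the use of the $6K 2U node as the example are illustrative only.

```python
def implied_tco_floor(tca_usd: float, tca_share_pct: float = 20.0) -> float:
    """Smallest TCO consistent with acquisition cost being at most tca_share_pct of it."""
    if not 0 < tca_share_pct <= 100:
        raise ValueError("tca_share_pct must be in (0, 100]")
    return tca_usd * 100 / tca_share_pct


# Example: a node bought for roughly $6,000.
floor = implied_tco_floor(6_000)
print(f"TCO is more than ${floor:,.0f} if acquisition is under 20% of it")
```

Because the slide states a strict inequality, the returned figure is a lower bound that the real TCO exceeds, which is one way to see why price and cost are not the same thing.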
  • ‘New to Big Data’ learnings
    • A lot of this is not intuitive…e.g. MapReduce, columnar based DBs, and we live in an RDBMS world
    • Iterative process…if you go in claiming you ‘know’ the use case you want to solve you are in for a surprise
    • It takes a wealth of skills not resident in a single person…understand batch MapReduce framework, networks, grid computing, analytics, subject matter experts, and more
    • Open Source or ‘Free’ not a big selling point for the larger companies
  • Some key learnings from early adopters
    • Don’t forget operations…in 2010 Facebook had between 400-500 operations professionals…on par with entire engineering organization
      Source: http://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/
    • Be ready to embrace emergent solutions and emergent architecture…and emergent support base within company
    • Vendor support still needs to catch up…not the level of support companies are used to from established technology vendors
  • Thinking about workloads…latency
    Solutions are generally a mix of different paradigms
    [Spectrum: High Latency (1 hour plus) to Low Latency (real-time), with a “Start here!” marker]
  • Other Random Learnings
    • For many companies, there will likely be a cultural change required to become good in Big Data analytics
    • Customers have major concerns about security and cloud
    • Don’t tell a risk officer that Hadoop’s replication framework mitigates need for disaster recovery
    • How about you all…any Random Learnings you want to share?
  • Big Data Analytics Learnings
    “Analytics is the act of taking Big Data streams and human-sizing them for our small data brains.”
    Source: http://www.dataspora.com/2010/05/new-tools-for-big-data/#more-182
    There are no turnkey solutions in analytics space (efforts underway to make big data analytics accessible to the non-Data Scientist)
  • Final Thoughts
  • Musings
    • Chief Data Officer || Chief Data Scientist
    • Just because you can retain all this data does it mean you should?
    • Big Data and virtualization
  • Thinking about Starting a Big Data solution
    1. Big Data strategy assessment?
    2. Go small (success breeds success)
    3. Let RT and near RT come to you…don’t start there
    4. Ensure you have the right skills (or bring them in)
    5. If only R&D focus then upside may be limited…business needs to have a seat at the table
    Consider hiring a professional Big Data consulting firm to help in the transition!
  • Some good resources for you
    Blogs
    Databases and Data Infrastructure
    http://www.dbms2.com
    http://dbmsmusings.blogspot.com
    http://databeta.wordpress.com
    http://blogs.gartner.com/donald-feinberg
    http://itmarketstrategy.com/
    Big Data Analytics
    http://hunch.net/
    http://ml.typepad.com/
    www.dataists.com
    http://www.dataspora.com/blog/
    http://blog.data-miners.com/
    http://www.visualcomplexity.com/vc/blog/
    Videos
    http://www.youtube.com/watch?v=SS27F-hYWfU&feature=relmfu
    http://www.youtube.com/watch?v=2FpO7w6X41I
    http://www.youtube.com/watch?v=OmlX3IHb0JE
    http://www.youtube.com/watch?src_vid=UaGINWPK068&annotation_id=annotation_65559&v=XAuwAHWpzPc&feature=iv
    http://www.youtube.com/watch?v=eUcej07dGu4
    http://www.youtube.com/watch?v=viPRny0nq3o
  • Q&A