Online Tuesday #26 - Big Data - Jeroen Dijkxhoorn, SAS

This is the presentation that Jeroen Dijkxhoorn gave on Big Data at Online Tuesday #26 on Tuesday, 12 June 2012.

Published in: Business, Technology

  • A company has big data if it can’t do all the “what if you could” scenarios. It either doesn’t have enough storage to capture the data it wants to analyze, can’t run the analysis it wants because processing that much data would take too long, or both. Big Data is relative, and our high-performance technologies are specifically designed to help.
  • 17% of the world’s population used a social networking site in 2011. Twitter logs 100 million Tweets per day. Facebook counts 350 million unique visitors per day. 60 hours of video are uploaded to YouTube every 60 seconds. 80% of companies use social media for recruitment. Many competitors are talking about the first 3 Vs. We believe that Value is the key one. Speed alone isn’t enough; gaining value from the data is what matters.
  • http://www.sas.com/success/catalina.html
    Catalina Marketing says it has reduced its model-scoring times from 4.5 hours to around 60 seconds thanks to SAS Scoring Accelerator for Netezza. The company also said that, as a result, it is able to use more complex, varied models to obtain analytical results faster for more efficient, reliable decisions, improving brand performance on behalf of its food, drug, and mass advertising and marketing partners.
    As the largest customer behavior marketing company in the world, Catalina analyzes and predicts shoppers’ buying behaviors to generate customized point-of-sale color coupons, advertisements and informational messages for retail stores and pharmacies nationwide. “We want to not just be on the leading edge, but on the bleeding edge in the marketplace in terms of what we can offer our clients,” says Laurie Wachter, Senior Vice President of Analytics for Catalina. “SAS solutions have allowed us to actually predict what customers are likely to buy and that has revolutionized our ability to make our clients’ coupons and messages relevant to shoppers,” says Wachter.
    “I know from experience and discussions with colleagues in the industry how long it can take to build a predictive model,” Wachter says. “They’re taking more than a month to build one model. Using SAS, we’ve automated the execution of our models and scoring them against our entire 140 million consumer database for the implementation of marketing campaigns literally in days.”
    “Not only that, but our samples are 10 to 15 times larger than anything anyone else is doing today,” Williams says. “Nobody could do what SAS is enabling us to do – the capabilities just didn’t exist beforehand.”
    Ryan Carr, Catalina’s Vice President of Advanced Analytics, added: “By using the SAS Scoring Accelerator for Netezza, Catalina’s execution time has diminished from hours to minutes. We have experienced as much as a 90 percent reduction in execution time compared to our traditional use of complex models, previously performed outside the database.”
  • http://blogs.sas.com/content/sascom/2012/04/16/assess-risk-in-seconds-not-days-with-high-performance-analytics/
    The traditional analytical process is time-consuming and inefficient. It simply takes a long time (days) to finish the data preparation/exploration, model development and model deployment steps. More specifically, what if you:
      • Have problems you can’t solve because your data volumes are too big or beyond the capacity of your existing systems.
      • Have too many records, or too many attributes or variables, that need to be incorporated in the modeling process.
      • Need to apply predictive analytics to more granular data: for example, churn at the customer level, parts failure for an entire product line, or propensity to buy at the account level.
      • Face massive variable selection steps that require sorting through thousands of variables to determine which are the most predictive.
      • Do not want to compromise by using sub-optimal modeling techniques.
      • Cannot quickly test or experiment with different modeling techniques to find the best fit and improve accuracy.
      • Want to modify the model with new attributes and do not have time to wait.
    For example, a large bank in the US was building and deploying probability-of-default models on the home loans it was servicing. The goal was to detect high-risk accounts and to reduce bad-debt provisions and overall losses. Deploying the model in a standard SMP environment took 167 hours. The same model deployed using SAS HPA took 84 seconds to complete. So what? This speed:
      • Helps incorporate large volumes of data, with no limits on the number of observations and variables or attributes, for accurate determination of the likelihood of default and for loss forecasting.
      • Allows banks to adjust the historical transition probabilities based on changes in macroeconomic factors, and to hedge these risks effectively.
      • Quickly determines when and whether a borrower is migrating to a riskier pool.
      • Offers the flexibility to test multiple scenarios or new ideas, use the best modeling techniques, and perform model iterations more frequently, so the bank can accurately and quickly identify risks at the individual portfolio level and take targeted actions to stay ahead of the market.
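The runtime figures quoted in the notes (4.5 hours to 60 seconds for Catalina, 167 hours to 84 seconds for the bank) are easier to compare as speedup factors and percentage reductions. The following is a minimal arithmetic sketch, not anything from the presentation itself; the function name `speedup` and the variable names are chosen here purely for illustration.

```python
HOUR = 3600  # seconds per hour


def speedup(before_seconds, after_seconds):
    """Return (speedup factor, percent reduction) for a runtime change."""
    factor = before_seconds / after_seconds
    reduction_pct = (1 - after_seconds / before_seconds) * 100
    return factor, reduction_pct


# Figures as quoted in the notes; everything else is plain arithmetic.
catalina = speedup(4.5 * HOUR, 60)  # 16,200 s down to 60 s
bank = speedup(167 * HOUR, 84)      # 601,200 s down to 84 s

for name, (factor, pct) in {"Catalina": catalina, "Bank": bank}.items():
    print(f"{name}: {factor:,.0f}x faster, {pct:.2f}% less runtime")
```

On these numbers the Catalina scoring run is 270 times faster and the bank deployment roughly 7,157 times faster. Both correspond to reductions well above 99 percent, so the "as much as a 90 percent reduction" in the Ryan Carr quote presumably refers to a different measurement than the 4.5-hours-to-60-seconds headline.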

    1. SAS Big Data Cases: putting hype into practice. Jeroen Dijkxhoorn, Head of Strategic Initiatives, Global CoE Information Management & Analytics, SAS. Copyright © 2012, SAS Institute Inc. All rights reserved.
    2. Agenda
    3. SAS Introduction. SAS Business Analytics. Customers: 55,000. Employees: 12,958. Company: $2.725 billion.
    4. SAS Netherlands
    5. External Viewpoint. "We put nearly all of the data that is of real value to good use": 22%. "We probably leverage about half of our valuable data": 53%. "Vast quantities of useful data go untapped": 24%. Source: Economist Intelligence Unit 2011 report, sponsored by SAS. © The Economist Intelligence Unit Limited.
    6. Our perspective: Big Data is RELATIVE, not ABSOLUTE. Big Data: when the volume, velocity and variety of data exceed an organization’s capacity for accurate and timely decision-making.
    7. Thriving in the Big Data Era. (Chart; labels: Volume, Variety, Velocity; Data Size; Today, The Future.)
    8. Sources of Big Data: Consumer Generated, Enterprise Generated, Device Generated. Big Data: In-Database Analytics, Text Analytics, In-Memory Analytics.
    9. Global Pulse is an innovation initiative of the UN Secretary-General, harnessing today's new world of digital data and real-time analytics to gain a better understanding of changes in human well-being.
    11. Project Objectives, Approach and Goals. Project objective: enrich insights into unemployment shocks and resulting coping strategies by including real-time social listening data and applying analytic methods. Approach: investigated 500,000+ social listening sources with respect to unemployment shocks. Goals: Can we understand qualitative experiences and feelings around unemployment to complement official statistics? Can online conversations provide an early indicator of impending job losses? Can the data help policy makers enrich their understanding of how communities cope?
    13. Quantifying the Data with Mood States. Captures the author’s psychological and emotional state. Hierarchical dimensional construction: positive dimensions are paired with (opposite) negative dimensions, resulting in 6 mood scales, each ranging from positive to negative. Analytic scoring process: 2-stage model process; content categorization filtering; classifier weighting; negation / amplification / dampening. Mood scales: composed/anxious, confident/unsure, clearheaded/confused, energetic/tired, agreeable/hostile, elated/depressed.
    19. Customer case study: Catalina Marketing. SAS® In-Database Analytics. 4.5 hours to 60 seconds. Stages: data exploration, model development, model deployment. 2.5 petabytes of customer information; 23,000 stores and 14,000 retail pharmacies; 250 million transactions per week.
    20. Customer case study: Catalina Marketing. SAS® In-Database Analytics. “SAS solutions have allowed us to actually predict what customers are likely to buy and that has revolutionized our ability to make our clients’ coupons and messages relevant to shoppers.” “Using SAS, we've automated the execution of our models and scoring them against our entire 140 million consumer database for the implementation of marketing campaigns literally in days.” “Not only that, but our samples are 10 to 15 times larger than anything anyone else is doing today.” Laurie Wachter, Senior Vice President of Analytics.
    21. Customer case study. SAS® In-Memory Analytics. 167 hours to 84 seconds. Stages: data exploration, model development, model deployment. Bottom-line impact: tens of millions of dollars. "The unique method by which SAS is distributing complex workloads is allowing us to perform even the most complex analysis in a fraction of the time. We view this as a major breakthrough for quickly analyzing and modeling color/size intensive product data across hundreds of store locations concurrently." +14% in sales, +17% in retained value. Five-year expected ROI = $500 million.
    22. Thriving in the Big Data Era. (Chart; labels: Volume, Variety, Velocity, Value; Data Size; Today, The Future.)
    23. jeroen.dijkxhoorn@sas.com
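
Slide 13 lists a 2-stage scoring process with content categorization filtering, classifier weighting, and negation / amplification / dampening across six paired mood scales. The following toy sketch shows one way such a pipeline could be wired together. It is not SAS's implementation: the word lists, weights, two-token modifier window and function names are all hypothetical, and reading stage 1 as a topic filter and stage 2 as weighted scoring is an assumption about what the slide means.

```python
import re

# The six mood scales from slide 13, as (positive pole, negative pole).
MOOD_SCALES = [
    ("composed", "anxious"),
    ("confident", "unsure"),
    ("clearheaded", "confused"),
    ("energetic", "tired"),
    ("agreeable", "hostile"),
    ("elated", "depressed"),
]

# Hypothetical vocabularies and weights, for illustration only.
TOPIC_TERMS = {"job", "jobs", "unemployed", "unemployment", "layoff", "layoffs"}
NEGATORS = {"not", "never", "no"}
AMPLIFIERS = {"very": 1.5, "extremely": 2.0}
DAMPENERS = {"slightly": 0.5, "somewhat": 0.5}

# Map every pole word to (scale name, sign); scales are named by their positive pole.
POLARITY = {}
for pos, neg in MOOD_SCALES:
    POLARITY[pos] = (pos, 1.0)
    POLARITY[neg] = (pos, -1.0)


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def is_on_topic(tokens):
    """Stage 1 (assumed): keep only posts that mention the topic."""
    return any(tok in TOPIC_TERMS for tok in tokens)


def score_moods(text):
    """Stage 2 (assumed): weighted mood scores, or None if the post is off-topic."""
    tokens = tokenize(text)
    if not is_on_topic(tokens):
        return None
    scores = {pos: 0.0 for pos, _ in MOOD_SCALES}
    for i, tok in enumerate(tokens):
        if tok not in POLARITY:
            continue
        scale, weight = POLARITY[tok]
        # Apply modifiers found in the two tokens before the mood word.
        for prev in tokens[max(0, i - 2):i]:
            if prev in NEGATORS:
                weight = -weight
            weight *= AMPLIFIERS.get(prev, 1.0)
            weight *= DAMPENERS.get(prev, 1.0)
        scores[scale] += weight
    return scores


print(score_moods("I lost my job and I am very anxious"))
print(score_moods("Unemployed but not depressed"))
print(score_moods("Nice weather today"))
```

Each scale runs from negative (the second pole) to positive (the first pole), so an amplified "very anxious" pushes the composed/anxious scale below zero, while a negated "not depressed" pushes the elated/depressed scale above zero. A production system would replace the word lists with trained classifiers; the sketch only shows how the filtering and weighting steps fit together.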
