Predicting Customer Behavior With Big Data


Published in: Technology, Business
  • As customers continue to seek out individualized purchasing experiences, companies are beginning to realize that their Big Data initiatives must be optimized to provide the personalized level of service that customers demand.

    Spot real-time shifts in customer behavior
    Spotting shifts as they happen facilitates stronger connections between customers, products, pricing, promotions and sales. If a promotion does not yield the purchases the company was hoping for, what can it do to increase sales with a different promotion? To discover meaningful relationships and trends within their data, companies must learn how best to acquire, analyze and act on this new information. The data can assist an organization in decisions to bundle services and solutions and to determine pricing and packaging.

    Data categories
    Organizations typically divide data into two categories: transactional and sub-transactional data. I recommend focusing on sub-transactional data, or clickstream-based data, to identify patterns in user intentions. This data helps your organization make useful predictions about customers' behaviors.

    Pricing models
    Tie usage to pricing to better match pricing to customer behavior. Using Big Data solutions, companies can offer services at prices and in packages that align with customer actions, while in parallel offering the ability to upsell based on those actions.

    Behavior-based customer offers
    Track user clickstream behaviors and show real-time offers to customers based on specific behavior patterns. This architecture allows you to create offers and monetize them immediately to drive more revenue. For example, to increase revenue, a well-known travel site constantly adjusts its prices for a specific route in real time, based on sales of that route and on users' reactions.

    Preventing customer churn
    Customers lost to churn need to be replaced. That's expensive. Keeping customer acquisition costs below annual recurring revenue is important for today's products. Sub-transactional data allows your organization to identify when customers begin to use your solutions less frequently or when they are about to make specific mistakes using your products. Analyze this information and put it to good use by fine-tuning the user experience, which leads to more engaged customers.
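One way to picture the "customers begin to use your solutions less frequently" signal is a simple usage-drop check. The sketch below is illustrative only: the function names, the weekly session counts, the two-week window and the 0.5 threshold are all assumptions, not anything taken from the slides.

```python
from statistics import mean


def usage_drop_ratio(weekly_sessions, recent_weeks=2):
    """Return recent average usage divided by earlier average usage.

    A value well below 1.0 means the customer is using the product
    less often than before. (Hypothetical helper for illustration.)
    """
    if len(weekly_sessions) <= recent_weeks:
        raise ValueError("need more history than the recent window")
    earlier = weekly_sessions[:-recent_weeks]
    recent = weekly_sessions[-recent_weeks:]
    baseline = mean(earlier)
    if baseline == 0:
        return 1.0  # no earlier activity to compare against
    return mean(recent) / baseline


def flag_churn_risk(usage_by_customer, threshold=0.5, recent_weeks=2):
    """List customers whose recent usage fell below `threshold` of baseline."""
    return sorted(
        customer
        for customer, sessions in usage_by_customer.items()
        if usage_drop_ratio(sessions, recent_weeks) < threshold
    )


# Made-up weekly session counts derived from clickstream data.
usage = {
    "cust_a": [10, 12, 11, 9, 3, 2],   # sharp decline
    "cust_b": [5, 6, 5, 6, 5, 6],      # stable
    "cust_c": [8, 8, 7, 8, 4, 5],      # mild decline
}
at_risk = flag_churn_risk(usage)
print(at_risk)
```

In practice the threshold and window would be tuned against customers who actually churned; the point here is only that the signal can be computed directly from sub-transactional data.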
  • Reduce to a classification problem
    I'd try to reduce the problem to a classification problem and use existing machine learning tools to get an answer. A proper Big Data solution architecture can help in discovery.

    Extract defined and variable features
    Look for golden nuggets in unstructured data that you can extract.

    Define what you want to predict

    Create a training set
    Gather as many classified examples as you can get. Example: User x visited 5 pages and spent a total of 4 minutes.

    Execute any existing classification algorithms
    Try to predict what a non-classified user did, given her features alone. A short list of algorithms you can use for this:
    - SVM: not intuitive, but considered by many to be the best classification algorithm available.
    - K nearest neighbor: very intuitive and simple to program, and the training set can easily be increased iteratively, but it is usually a bad choice if the number of features is high.
    - Decision tree algorithms: allow very fast classification, and the resulting tree is intuitive and readable to humans.

    Evaluate your algorithm by running cross-validation on the training set and building a confusion matrix, which shows how often the algorithm was right, how often it was wrong, and in what way.
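The steps above can be sketched end to end with the simplest algorithm on the list, k nearest neighbor. This is a minimal illustration, not a production approach: the training rows are invented (features are pages visited and minutes on site, echoing the example in the notes), the "buy"/"leave" labels are assumptions, and leave-one-out is used as one particular form of cross-validation.

```python
import math
from collections import Counter


def knn_predict(train, features, k=3):
    """Classify `features` by majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_confusion(data, k=3):
    """Cross-validate by holding out one row at a time.

    Returns a Counter keyed by (actual_label, predicted_label),
    which is the confusion matrix in sparse form.
    """
    matrix = Counter()
    for i, (features, actual) in enumerate(data):
        train = data[:i] + data[i + 1:]
        predicted = knn_predict(train, features, k)
        matrix[(actual, predicted)] += 1
    return matrix


# Hypothetical training set: (pages visited, minutes on site) -> outcome.
data = [
    ((5, 4.0), "buy"), ((6, 5.5), "buy"), ((7, 6.0), "buy"), ((8, 7.5), "buy"),
    ((1, 0.5), "leave"), ((2, 1.0), "leave"), ((1, 1.5), "leave"), ((2, 0.5), "leave"),
]

confusion = leave_one_out_confusion(data)
correct = sum(n for (actual, predicted), n in confusion.items() if actual == predicted)
accuracy = correct / len(data)

# Predict for a user who has not been classified yet.
new_user_label = knn_predict(data, (6, 5.0))
print(dict(confusion), accuracy, new_user_label)
```

With real data the two clusters would overlap, the off-diagonal cells of the confusion matrix would be non-zero, and features on different scales would need normalizing before computing distances.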
  • Initial investigations
    1. Look at the data dictionary to see which data is available.
    2. What is the outcome? Is it yes/no? Is it continuous?
    3. Decide upon the model required (logistic for a yes/no outcome).

    Getting the data ready
    4. Run cross tabulations on categorical variables to understand the coding and volumes.
    5. Compute summary statistics to understand the distribution of the continuous variables.
    6. Ask questions about data quality:
       - remove these variables from any potential models? or,
       - think about imputation? or,
       - obtain accurate data?
    7. Convert continuous variables into categorical variables.

    Modeling
    8. Check for multicollinearity / correlation between variables (variance inflation factors or correlation tests).
    9. Check for interactions.
    10. Choose the type of logistic approach (e.g. forward, backward, stepwise).
    11. Choose the baseline attribute for each categorical variable.
    12. Create a random variable. It mustn't step into the model; something is wrong if it does.
    13. Split the dataset into two parts (ratio 80%/20%) using random selection without replacement. The larger sample is the build dataset; the smaller sample is the test dataset.
    14. Put all variables from the build dataset (including interactions and the random variable) into the model and run it.
       - Check the odds ratios: do they make sense?
       - Check the coefficients: do they make sense?

    Check the model
    15. Do diagnostic checks and plots of the fit (e.g. Somers' D, residuals, etc.).
    16. Put all variables from the test dataset (including interactions and the random variable) into a new model and run it.
       - Are the coefficients the same as in the model built on the build dataset?
       - Are the odds ratios the same as in the model built on the build dataset?

    Start again
    17. Go back to the start, fine-tune the grouping of the data, and put variables in or take variables out.
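Steps 12 and 13 are mechanical enough to sketch. The snippet below assumes each record is a plain dictionary; the field names (`customer_id`, `visits`, `churned`, `random_check`) and the fixed seed are hypothetical and exist only to make the example reproducible.

```python
import random


def add_random_variable(rows, rng):
    """Step 12: attach a pure-noise column to every row.

    If a later variable-selection procedure keeps this column,
    that is a warning sign about the modeling process.
    """
    return [dict(row, random_check=rng.random()) for row in rows]


def split_build_test(rows, build_fraction=0.8, rng=None):
    """Step 13: split rows into build and test sets without replacement."""
    rng = rng or random.Random()
    n_build = round(len(rows) * build_fraction)
    # sample() draws distinct indices, so no row lands in both sets.
    build_idx = set(rng.sample(range(len(rows)), n_build))
    build = [row for i, row in enumerate(rows) if i in build_idx]
    test = [row for i, row in enumerate(rows) if i not in build_idx]
    return build, test


rng = random.Random(42)  # fixed seed so the split can be reproduced
rows = [{"customer_id": i, "visits": i % 7, "churned": i % 2} for i in range(100)]
rows = add_random_variable(rows, rng)
build, test = split_build_test(rows, build_fraction=0.8, rng=rng)
print(len(build), len(test))
```

Fitting the logistic model itself (steps 10 and 14) would normally be done with a statistics package; an odds ratio is then the exponential of a fitted coefficient.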
  • Start with a small data sample. You don't need the full width and depth of your data to find interesting stuff. Start small. It saves you a lot of technology headaches at the start.

    Use BigML or any other simple-to-use tool to build an initial predictive model that you can understand and quickly integrate. Check this series of blog posts comparing some SaaS machine learning offerings. The important words here are simple, actionable and understandable. You don't want to waste too much time figuring out how to use a tool. Neither do you want to lose time translating and coding the outcomes. And you want to understand the outcomes so you can execute step three.

    Check if the model gives you any practical insights. Explore the model. Find its gold. Or don't, and discard it.

    Use the model to generate predictions and see if it can improve your company's performance. Put the model into action. Find a playground in your company to run a test and measure the changes in churn, conversion, risk or whatever you modeled.

    Check how more data can improve the model. You can add data in two ways: simply add more data points to the same dataset, or add more features to the dataset (new pieces of information) to enhance the model and find new relationships with possibly better performance. In spreadsheet terms: you can add more rows or more columns.

    Check if this more sophisticated model beats the previous model. Again: put it into action and see how it performs. Does it improve on the previous results? Iterate. The secret is to try multiple models to see which one gives the best results at this point in time. Continue to iterate to find the best fit.

    Now check the technology concept suitable for your situation. Now that you've seen some successful implementations of predictive models, you're much better equipped to evaluate various vendors' offerings. You have experienced how a cloud-based service saves you the annual license fees, investment in hardware, training, etc. You can compare that to the more traditional on-site implementations and pick the concept that best fits your needs and budget.

    Actionable analytics.
  • Many think that the explosion in Big Data will create demand for data scientists able to slice and dice data to guide more informed decision making within the organization. Others go a step further, arguing that a chronic data scientist shortage will hold back the full potential of Big Data.

    For years, the BI and data analytics conversation was framed around how to aggregate massive volumes of data and then unleash the data scientists to find the value. Today, despite the information deluge, enterprise decision makers are often unable to access the data in a useful way. The tools are designed for those who speak the language of algorithms and statistical analysis.

    Accelerate natural language capabilities
    In laying out several predictions for Big Data and data analytics in the coming years, research firm Gartner forecasts that by 2016, 70 percent of leading BI vendors will have incorporated natural language and spoken-word capabilities. Improving natural language queries is essential to the consumerization of Big Data, as it better enables everyday business users to ask questions of their analytics tool in an intuitive format (text, voice-enabled, etc.) and receive the most relevant and meaningful visualization results.

    Look smart (and ask very little of the user)
    Users now expect the answer to be presented via visualization, and they expect to be able to interact with the visualization. It's more than a picture telling a thousand words. It's the picture becoming the interaction.

    Create a social learning layer
    By building a social learning layer into data analytics tools, user behavior can be "learned" by the tool, so it can anticipate queries and be trained to the specific needs of each user. By making the tool as seamless as possible for business users, there is less chance the tool will waste time or resources, or deliver results different from what the user is looking for.

    Deflate Big Data
    An enterprise should start with a small set of users it believes will experience the most immediate value.

    Focus on the right kind of mobility
    Today's Big Data analytics platform must consumerize the user experience by removing spreadsheets and reports, and place the power of analytics in the hands of users of any level of analytics expertise. This consumerized data analytics experience must enable mobility on any smartphone or tablet device, with complete flexibility in how data can be visualized. It is this flexibility for visualizing data in real time on mobile devices, rather than just making massive volumes of data accessible to mobile users, that will help data analytics providers realize the most from their mobile investments.

    A consumerized Big Data experience is in sight for enterprises, as long as data analytics providers remain focused on the features and functionality that ultimately will matter most to business users. Big Data analytics is leading the charge in reinventing enterprise applications, bringing our consumer-level expectations to our everyday work experience.
  • Flight Cost Variant Determination
    Flight Cost is one of the algorithmic methods used to raise or lower prices, and thereby revenue, based on page views, consumer marketing, and the time a consumer spends on a particular one-way or round-trip flight. The goal is not only to provide alternatives, but also to increase or decrease the cost while other consumers are viewing the same flights. The adjustment is determined by sales from all related airlines and competitors during the flight's availability. This method can be extended to use other sources as well.

    Destinations: web applications, mobile applications, Hadoop, RDBMS
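The notes do not give the actual pricing algorithm, so the following is only a toy sketch of the general idea: nudge a fare up or down based on how many people are viewing it and how many seats remain, while staying close to market pricing. The function name, the equal weighting of demand and scarcity, the 15% cap and the viewer cap of 50 are all assumptions made for illustration.

```python
def adjust_price(base_price, viewers, seats_left, total_seats, market_price,
                 max_change=0.15, viewer_cap=50):
    """Return an adjusted fare (hypothetical rule, not the method in the slides).

    - demand:   share of the assumed viewer cap currently looking at the flight
    - scarcity: share of seats already sold
    The combined pressure ranges from -1 (no interest, empty plane)
    to +1 (heavy interest, sold out) and moves the price by at most
    `max_change` in either direction.
    """
    if total_seats <= 0:
        raise ValueError("total_seats must be positive")
    demand = min(viewers / viewer_cap, 1.0)
    scarcity = 1.0 - seats_left / total_seats
    pressure = demand + scarcity - 1.0
    proposed = base_price * (1.0 + max_change * pressure)
    # Keep the fare within max_change of the going market price.
    low = market_price * (1.0 - max_change)
    high = market_price * (1.0 + max_change)
    return round(min(max(proposed, low), high), 2)


busy = adjust_price(200.0, viewers=50, seats_left=0, total_seats=100, market_price=200.0)
quiet = adjust_price(200.0, viewers=0, seats_left=100, total_seats=100, market_price=200.0)
neutral = adjust_price(200.0, viewers=25, seats_left=50, total_seats=100, market_price=200.0)
print(busy, quiet, neutral)
```

A real system would learn these weights from sales data across airlines and competitors, as the notes describe, rather than hard-coding them.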
  • Predicting Customer Behavior With Big Data

    1. CONSULTING, SOLUTIONS, OUTSOURCING. Partner for a New Era. Big Data in Retail. Tom Kersnick - Director, Big Data Solutions, Pactera; Challen Bonar - Senior Director, Retail Practice, Pactera. Utilizing Big Data to Predict Customer Behavior. Tom Kersnick - Director, Big Data Solutions
    2. Pactera Snapshot: NASDAQ: Symbol PACT; Based in Charlotte, NC & Beijing, China; 35 Offices Globally / 24,000 Employees; Fortune 500 Clients (Retail, Financial Services, High Tech); Focus on Driving Innovation (Big Data, Analytics, Mobility, Cloud Solutions). © Pactera. Confidential. All Rights Reserved.
    3. Contact Presenter: Tom Kersnick, Director, Big Data Solutions. Email: Tom.Kersnick@pactera.com. 6100 Fairview Road, Suite 560, Charlotte, NC 28210. Visit our website: www.pactera.com. Experience: Solution Architecture; Scalability for the Next Generation; Algorithm development/enhancement for Fortune 500/1000 clients within all verticals; Real-Time Analytics; In-Memory Architectures; Internal and Consumer-facing API development; Technical Project & Business Process Management; Reporting Solution Framework; Apache Foundation Open Source development contributor
    4. Topics: (1) 5 Ways Big Data Can Provide Intimacy with Your Customers; (2) How to Create a Successful Predictive Analytics Strategy; (3) Making Big Data Accessible to Non-Data Scientists
    5. Providing Intimacy with Your Customers. Spot Real-Time Shifts in Customer Behavior: acquire, analyze, and act on new information; data can assist an organization in decisions to bundle services and solutions to determine pricing and packaging. Data Categories: Transactional and Sub-Transactional. Pricing Models: Consumptive Pricing Models. Behavior-Based Customer Offers: predictive behaviors to provide item suggestions and offers. Preventing Customer Churn: improve customer satisfaction by preventing dissatisfaction
    6. Creating a Successful Predictive Strategy. Try to Reduce to a Classification Problem: Big Data solution architecture for discovery. Extract Defined and Variable Features: looking for the golden nuggets of data. Define What You Want to Predict: classify to answer the 'what if' questions. Create a Training Set: gather as many classified examples as you can get. Execute Any Existing Classification Algorithms: try to predict a non-classified user (SVM, K Nearest Neighbor, Decision Trees)
    7. Predictive Model Framework. [Diagram labels: Data, Features, ML Algorithm, Prediction Output; What Data to Use? What Features? Which Algorithm? Recommendations; Investigate, Data Quality, Models, Checking]
    8. Predictive Big Data Flow: (1) Capturing and Storing Data; (2) Processing; (3) Deriving Insights; (4) Making Insights Actionable. Reject Bad Data | Wrong Insights
    9. Making Big Data Accessible to Non-Data Scientists. Accelerate Natural Language Capabilities: text- and voice-enabled visualization. Look Smart (ask very little from the user): visualization. Create a Social Learning Layer: learning within analytics tools. Deflate Big Data: start small. Focus on Mobility: mobility of visualization with real-time analytics
    10. Predictive Analysis Use Case for Online Travel Company
    11. Flight Cost by Variants Determination. Data feeds utilize real-time in-memory streaming to execute matching algorithms in order to determine views within a session of certain one-way and round-trip flights viewed by users. Predictive analytics algorithms determine how to increase/decrease prices based on views, market pricing, time, and availability. [Diagram labels: http, logs, partners, custom, incoming, outgoing, destinations, rdbms, hadoop, application, mobile, Real Time In-Memory Solution]
    12. How Pactera Can Help with Big Data. Executive Workshop (4 Hours), Strategies, Planning, and Expectations: Big Data strategy on what tomorrow will look like; using Big Data to establish market dominance; Big Data project takeaways; roadblocks to implementing Big Data analytics; defining an ROI for Big Data; getting the right ROI on Big Data. Technical Workshop, End-To-End Management: system tuning/auto-tuning and configuration management; dealing with both structured and unstructured data; monitoring, diagnosis, and automated behavior detection. Technical Workshop, Solution Architecture: processor, memory, and system architectures for data analysis; benchmarks, metrics, and workload characterization for big data; availability, fault tolerance and recovery issues; data management and analytics for vast amounts of unstructured data. Proof of Concept (2-4 Weeks). Projects: Benchmark & Monitoring; Integrations & Migrations; Implementation & Architecture; Project Management; Analytics; Reporting
    13. Thank you