
THE ARTISTRY BEHIND DATA SCIENCE

Some case studies of the work done by the N.0 team

Published in: Data & Analytics

  1. 1. Ndot0 Case Studies
  2. 2. Nutritional products company betters financial health with “Test and Learn” Program
     Challenge
     A leading manufacturer of nutritional products, sold both in-store and via health care providers, wanted an easy way to predict the effect of promotional campaigns on lift, margin and the bottom line before rolling them out nationwide. They also needed a simple solution that let managers plan and test future campaigns themselves.
     Solution
     • Develop a bespoke test & learn tool to measure the effectiveness of promotional, incentive and discount campaigns
     • Ensure accurate results by carefully matching test and control locations
     • Apply statistical tools to determine the significance of each test and the overall campaign impact
     • Develop a database of retail, chain and regional profiles for future test & learn efforts
     Highlights & Results
     • Conducted 30+ campaigns via Test & Learn analysis, measuring lifts as small as 0.1%
     • Delivered measurements of campaign impact at all product hierarchy levels: SKU, brand, category and portfolio
     • Developed demographic profiles of top-performing stores for future campaigns
     • The database lets teams search by brand or market to see the potential campaign lift for future efforts
     Key Components of the Solution
     1. Assigning test & control stores by campaign: for more accurate results, test and control locations are selected on nearly identical variables (demographics, sales mix, volume, seasonality trends, weather, etc.). Where suitable options did not exist, forecasting creates pseudo-control environments.
     2. Measuring impact: the tool records the lift between test and control locations. Insights include how the campaign would have performed had it been rolled out to the top 20% vs. the bottom 20% of locations.
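The test-and-control measurement on this slide can be illustrated with a minimal sketch. The deck does not say which statistical tools were applied, so the permutation test below is an assumption chosen because it needs only the Python standard library; the function names and the sales figures are hypothetical.

```python
import random
from statistics import mean


def lift(test_sales, control_sales):
    """Relative lift of test stores over their matched control stores."""
    return (mean(test_sales) - mean(control_sales)) / mean(control_sales)


def permutation_p_value(test_sales, control_sales, n_iter=5000, seed=0):
    """Two-sided permutation test on the difference in mean sales.

    Store labels are shuffled at random; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(test_sales) - mean(control_sales))
    pooled = list(test_sales) + list(control_sales)
    n_test = len(test_sales)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_test]) - mean(pooled[n_test:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_iter + 1)


# Hypothetical weekly sales for six test and six matched control stores.
test_stores = [105, 110, 108, 112, 107, 109]
control_stores = [100, 101, 99, 102, 98, 100]
campaign_lift = lift(test_stores, control_stores)
p_value = permutation_p_value(test_stores, control_stores)
```

A small p-value suggests the observed lift is unlikely to be an artefact of which stores happened to be assigned to the test group; it says nothing about whether the stores were well matched in the first place.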
  3. 3. Pharmaceuticals company improves financial returns with better call planning
     Challenge
     A leading pharmaceutical manufacturer wanted an optimal way to allocate calls from medical reps to doctors, maximizing overall revenue while building business logic and constraints into the algorithm.
     Solution
     • Build promotion response curves (S-curves) from the drugs’ past-year performance for various segments of doctors
     • Apply a ‘greedy’ algorithm along with business constraints for optimal allocation
     • Build handles and levers into the tool for easy input of business logic and decisions
     Highlights & Results
     • The optimization algorithm ran on the SAS platform in under an hour, as opposed to manual techniques that could take multiple days for the same process
     • Results were delivered for multiple scenarios within a very short time frame
     • The tool’s code base could easily be re-used for other optimization activities such as territory alignment
     Key Components of the Solution
     1. Build promotion response curves for each drug: saturation curves are built for each product using modeling techniques
     2. Run greedy algorithm along with business constraints: greedy optimization works by making the locally optimal decision at each step
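A minimal sketch of greedy call allocation over S-shaped response curves follows. The logistic curve form, its parameters, the per-segment cap used as the business constraint, and all names are assumptions for illustration; the original tool ran on SAS and its exact formulation is not given in the deck.

```python
import math


def s_curve(calls, cap, steepness, midpoint):
    """Logistic response curve: expected revenue after `calls` rep calls."""
    return cap / (1.0 + math.exp(-steepness * (calls - midpoint)))


def greedy_allocate(segments, total_calls, max_calls_per_segment):
    """Hand out calls one at a time to the segment with the best marginal gain.

    `segments` maps a segment name to its s_curve parameters. The cap on
    calls per segment stands in for a business constraint.
    """
    allocation = {name: 0 for name in segments}
    for _ in range(total_calls):
        best_name, best_gain = None, 0.0
        for name, params in segments.items():
            current = allocation[name]
            if current >= max_calls_per_segment:
                continue  # constraint reached, segment is closed
            gain = s_curve(current + 1, **params) - s_curve(current, **params)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # every segment is capped
        allocation[best_name] += 1
    return allocation


# Hypothetical doctor segments with different revenue ceilings.
segments = {
    "high_potential": {"cap": 100.0, "steepness": 1.0, "midpoint": 3.0},
    "low_potential": {"cap": 20.0, "steepness": 1.0, "midpoint": 3.0},
}
plan = greedy_allocate(segments, total_calls=8, max_calls_per_segment=6)
```

Because an S-curve is not concave along its lower half, a one-call-at-a-time greedy pass reaches a local optimum and is not guaranteed to be globally optimal, which is consistent with the slide's description of greedy decisions.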
  4. 4. E-commerce company improves customer experience by quantifying and acting on the underlying drivers
     Challenge
     A leading e-commerce company wanted to quantify the impact of various drivers on customer satisfaction scores and use the results to derive actionable insights. They also wanted to forecast how changes to those drivers would affect the overall customer satisfaction score (Net Promoter Score, NPS).
     Solution
     • Run exploratory data analysis (EDA) on potential drivers to understand their correlation with the outcome and with one another
     • Build a multinomial logistic regression model to identify the drivers and quantify their impact
     • Build waterfall charts and what-if scenarios to understand the future impact on NPS of changing driver values
     Highlights & Results
     • It was a first-of-its-kind analysis in the e-commerce industry, and the key drivers identified were consistent with intuition
     • The what-if scenarios made it easy to determine the key action areas for improving the overall customer satisfaction score
     Key Components of the Solution
     1. Multinomial logistic regression models were built after validating the underlying assumptions
     2. Model coefficients were translated into effective contributions to the change in NPS over time
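The what-if idea on this slide can be sketched as follows, assuming a fitted multinomial logistic model over the three standard NPS classes (detractor, passive, promoter). The coefficients, the driver name `on_time_delivery`, and the function names are hypothetical; only the definition NPS = % promoters minus % detractors is standard.

```python
import math


def class_probabilities(drivers, coefficients):
    """Softmax over per-class linear scores of a multinomial logistic model."""
    scores = {}
    for label, weights in coefficients.items():
        score = weights.get("intercept", 0.0)
        for name, value in drivers.items():
            score += weights.get(name, 0.0) * value
        scores[label] = score
    peak = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exp_scores.values())
    return {label: e / total for label, e in exp_scores.items()}


def expected_nps(drivers, coefficients):
    """NPS implied by the model: % promoters minus % detractors."""
    probs = class_probabilities(drivers, coefficients)
    return 100.0 * (probs["promoter"] - probs["detractor"])


# Hypothetical coefficients, with "passive" as the reference class.
coefficients = {
    "detractor": {"intercept": 0.0, "on_time_delivery": -1.0},
    "passive": {"intercept": 0.0, "on_time_delivery": 0.0},
    "promoter": {"intercept": 0.0, "on_time_delivery": 1.0},
}
baseline = expected_nps({"on_time_delivery": 0.0}, coefficients)
what_if = expected_nps({"on_time_delivery": 1.0}, coefficients)
```

Comparing `what_if` with `baseline` gives the modelled change in NPS for one driver; repeating this driver by driver yields the bars of a waterfall chart.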
  5. 5. Electricity Supplier improves demand side management by identifying Home Appliances
     Challenge
     A retail energy supplier wanted to improve demand side management by understanding and influencing customer behaviour, specifically when and which electrical/electronic appliances are used in a household, so that peak power demand could be minimised.
     Solution
     • The first level of classification used domain input to identify the switching of ‘resistive’, ‘capacitive’ and ‘inductive’ devices
     • Two key parameters, ‘Active Power’ and ‘Reactive Power’, were clustered by value and time of appearance to define power states
     • Power states were further clustered over time to identify a home appliance
     Highlights & Results
     • The algorithm helped identify common household appliances such as CFLs, ACs, heating devices, kettles, geysers, refrigerators and washing machines
     • Customer incentive programs were launched based on the findings to shift non-essential loads away from peak consumption hours
     Key Components of the Solution
     1. Cluster analysis of switching patterns of home appliances: time- and value-based cluster analysis of active and reactive power determines the power states
     2. Pattern-based classification of home appliances: power states are classified by pattern matching against a device signature bank
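The first-level classification described on this slide can be sketched with a simple threshold rule. This is not the clustering pipeline itself: it only labels switching events, assuming the usual sign convention in which inductive loads draw positive reactive power, capacitive loads negative, and resistive loads close to none. The thresholds, function name and readings are hypothetical.

```python
def classify_switch_events(readings, min_step_w=50.0, reactive_tol_var=20.0):
    """Label switching events in a series of (active_w, reactive_var) samples.

    A jump in active power larger than `min_step_w` is treated as a device
    switching on or off; the accompanying change in reactive power decides
    whether the device looks resistive, inductive or capacitive.
    """
    events = []
    for i in range(1, len(readings)):
        delta_p = readings[i][0] - readings[i - 1][0]
        delta_q = readings[i][1] - readings[i - 1][1]
        if abs(delta_p) < min_step_w:
            continue  # no switching event at this sample
        state = "on" if delta_p > 0 else "off"
        # Flip the sign for switch-off so the rule always sees a switch-on step.
        delta_q_on = delta_q if delta_p > 0 else -delta_q
        if abs(delta_q_on) <= reactive_tol_var:
            kind = "resistive"
        elif delta_q_on > 0:
            kind = "inductive"
        else:
            kind = "capacitive"
        events.append((i, state, kind))
    return events


# Hypothetical household trace: a kettle-like load, then a motor-like load.
trace = [(100, 10), (2100, 12), (2100, 12), (2250, 130), (250, 128), (100, 10)]
events = classify_switch_events(trace)
```

In the approach on the slide, events like these would then be grouped into power states and matched against a device signature bank.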
  6. 6. Equipment Manufacturer reduces turnaround time by predicting failures in advance
     Challenge
     A drilling equipment manufacturer wanted to minimise the turnaround time for damaged assets and optimize the entire servicing supply chain. The idea was to reduce overall downtime for customers through predictive maintenance cycles.
     Solution
     • EDA examined factors such as operating conditions, weather and geography to identify the factors most strongly correlated with equipment failures
     • Asset cluster analysis identified similar types of assets in the same age segment working under similar conditions
     • Remaining asset life and failure frequency (mean time to failure, MTTF) were predicted from the cluster analysis
     Highlights & Results
     • Asset failure frequency helped forecast the number of service requests for every item class, thereby reducing the time to repair them
     • Predicting the remaining life of an asset supported repair vs. replacement decisions
     Key Components of the Solution
     1. EDA explained the factors and items to be considered when building the model; outlier detection removed erroneous records beforehand
     2. Item clusters revealed the expected number of failures in the near future (chart x-axis: Age)
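How a per-cluster MTTF turns into a service-request forecast can be shown with a short sketch. It assumes a constant failure rate within each cluster, which is a simplification the deck does not state; the record layout, cluster names and figures are hypothetical.

```python
from collections import defaultdict


def mttf_by_cluster(records):
    """Mean time to failure per cluster: operating hours / failure count.

    `records` holds (cluster, operating_hours, n_failures) per asset.
    Clusters with no observed failures are left out.
    """
    hours = defaultdict(float)
    failures = defaultdict(int)
    for cluster, operating_hours, n_failures in records:
        hours[cluster] += operating_hours
        failures[cluster] += n_failures
    return {c: hours[c] / failures[c] for c in hours if failures[c] > 0}


def expected_failures(mttf_hours, n_assets, horizon_hours):
    """Expected failures over a horizon, assuming a constant failure rate."""
    return n_assets * horizon_hours / mttf_hours


# Hypothetical service history for two asset clusters.
history = [
    ("young_harsh", 1000.0, 2),
    ("young_harsh", 3000.0, 2),
    ("old_mild", 5000.0, 1),
]
mttf = mttf_by_cluster(history)
forecast = expected_failures(mttf["young_harsh"], n_assets=10, horizon_hours=200.0)
```

The forecast per cluster is what would feed spare-part stocking and service scheduling; ageing assets usually violate the constant-rate assumption, so a real model would let the rate vary with age.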
  7. 7. Insurance Company finds driving behaviour based telematics fingerprint
     Challenge
     An insurance company fitted GPS devices in cars and recorded their locations with good accuracy. The company wanted to understand which of the trips made in each car were driven by the policyholder.
     Solution
     • EDA examined factors such as average speed, deciles of velocity, deciles of acceleration, per capita jerks and sharp turns to identify the leading factors that might carry information
     • Visualizations were made to understand the paths
     • Various machine learning techniques, ensembles and model stacking were used to reach a 91% accurate model
     Highlights & Results
     • The model is experimental and, under current laws, cannot be used by the insurance company to detect fraud
     • However, the company continues to evaluate certain drivers for whom the devices are installed
     Key Components of the Solution
     1. Usage of variable selection: variable selection using various techniques
     2. Plotting the paths of drivers
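A minimal sketch of the kind of per-trip features listed on this slide (average speed, deciles of velocity and acceleration) is given below. It assumes each trip is a list of planar (x, y) positions in metres sampled at a fixed interval, which is an assumption about the data format; the function name is hypothetical, and `statistics.quantiles` requires Python 3.8 or later.

```python
import math
from statistics import quantiles


def trip_features(points, dt=1.0):
    """Summarise one trip as speed and acceleration statistics.

    `points` is a list of (x, y) positions in metres, `dt` the sampling
    interval in seconds. At least four points are needed so that both the
    speed and the acceleration series have two or more values.
    """
    speeds = [
        math.hypot(x2 - x1, y2 - y1) / dt
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
    accels = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "speed_deciles": quantiles(speeds, n=10),  # nine cut points
        "accel_deciles": quantiles(accels, n=10),
    }


# Hypothetical trip: steady straight-line driving at 5 m/s.
steady_trip = [(0, 0), (3, 4), (6, 8), (9, 12), (12, 16)]
features = trip_features(steady_trip)
```

Feature vectors like this, one per trip, are what a classifier would consume to decide whether a trip matches the policyholder's usual driving style.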
  8. 8. Mining company predicts component failure in advance
     Challenge 1
     The mining company had hundreds of terabytes of data from more than 20 sources. Most of it was sensor data; other, dirtier sources included maintenance data and alarms data. The company wanted to predict mining truck failures in advance to save money and smooth operations.
     Challenge 2
     The mining company had thousands of OEM-defined and self-defined alarms. These produced more false alarms than the alarm operators could handle. The company wanted to assign a priority number to every alarm.
     Challenge 3
     The mining company checked the oiling of its machinery parts empirically. However, some trucks were overworked while others were underworked. The company wanted to know whether oil impurities could be predicted from the work done by the trucks and the terrain they covered.
     Solution
     • A massive EDA phase helped us understand the data
     • A number of map-reduce jobs helped complete the data engineering quickly
     • Visualizations were made to reveal the patterns in the sensor data
     • Various clustering techniques were used for Challenge 2, while supervised learning techniques helped solve Challenges 1 and 3
     Highlights & Results
     • The models were extremely successful in predicting truck failures
     • The alarm prioritisation helped reduce alarm handling time by over 95%
     • The oil impurity models became a major indicator of how trucks should be maintained
