This document discusses several machine learning and big data topics:
1. Stream processing frameworks such as Spark Streaming and Storm, along with micro-batch processing.
2. Elasticsearch and HBase for storage and analytics, and how they differ in being schema-free, using in-memory indexes, and supporting SQL queries.
3. The results of machine learning models such as logistic regression, decision trees, and gradient boosted trees on a prediction problem, compared by precision, recall, and false positive rate.
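The three metrics named in this summary follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the summarized document.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, and false positive rate from confusion counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: share of actual negatives flagged as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "recall": recall, "false_positive_rate": fpr}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=20, tn=880, fn=20)
print(metrics)
```

Computing the same three numbers for each model on the same held-out data is what makes a side-by-side comparison meaningful.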
This document lists various data science resources including:
1) Lists of data mining blogs and analytic websites maintained by others.
2) A list of data science resources including books, companies, training, and vendors maintained on the DataShaping website.
3) Vincent's favorite books on topics like natural language processing, statistics, data mining, and machine learning.
4) Internal resources on the Analyticbridge website including news, jobs, training courses, conferences, and more.
How to Perform Churn Analysis for your Mobile Application? - Tatvic Analytics
For every marketer of a mobile application, acquiring new customers requires more time and money than keeping existing ones. A firm can instead focus on maintaining its existing customer base and getting the most out of it. In that case, predictive analysis is the right approach.
The primary goal of this webinar is to predict the segment of mobile application users who will:
* Uninstall the app
* Remain inactive for a long time and are therefore expected to churn (such users are also termed churners).
Churn analysis is the approach by which we predict the likelihood of this event occurring.
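The two churn conditions above can be turned into a training label. This is a minimal sketch only: the webinar abstract does not say how long "a long time" is, so the 30-day threshold, the function name, and the field names are all assumptions.

```python
from datetime import date


def is_churner(uninstalled, last_active, today, inactivity_days=30):
    """Label a user as a churner if they uninstalled or have been inactive too long.

    inactivity_days is a hypothetical threshold; the abstract gives no value.
    """
    if uninstalled:
        return True
    return (today - last_active).days >= inactivity_days


# Hypothetical users, for illustration only.
today = date(2024, 3, 1)
print(is_churner(False, date(2024, 1, 1), today))   # inactive for 60 days
print(is_churner(False, date(2024, 2, 20), today))  # inactive for 10 days
```

A label like this becomes the target variable for whichever churn model is fitted afterwards.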
Our webinar covers:
* How to extract data from Google Analytics using R
* How to build a churn model in R
* Identifying the customer/subscriber segments, classified from past data patterns, that are likely to churn (studying customer behavior patterns)
Watch Full Webinar - http://www.tatvic.com/webinar/churn-analysis-for-mobile-application/
Data Tactics Analytics Brown Bag (Aug 22, 2013) - Rich Heimann
This document provides an overview and agenda for a brown bag presentation on analytics services. The presentation includes introductions of the analytics team, discussions of why analytics are important both for business and practical reasons, and case studies of identifying smugglers and analyzing text data. The presentation emphasizes a philosophy of not being "data agnostic" and using modes of inquiry like induction and abduction rather than deduction.
This document summarizes Salford Systems' participation in an international competition to predict customer churn for a major mobile provider. Salford Systems used an ensemble of decision tree models called TreeNet to predict churn with significantly higher accuracy than other methods. TreeNet models achieved a top decile lift of 3.01 and Gini coefficient of 0.400 on future churn predictions, substantially better than the average and second place method. The document outlines the data and task, TreeNet methodology, results, and conclusions that TreeNet was key to winning due to its superior predictive performance.
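The two reported metrics can be computed from true labels and model scores. The sketch below assumes common definitions: top decile lift as the churn rate among the 10% highest-scored customers divided by the overall churn rate, and Gini as 2 * AUC - 1. The summary does not state the competition's exact formulas, and the data here is hypothetical.

```python
def top_decile_lift(y_true, scores):
    """Churn rate among the top 10% of scores divided by the overall churn rate."""
    n = len(y_true)
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("no churners in y_true; lift is undefined")
    # Rank customers from highest to lowest predicted churn score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    k = max(1, n // 10)  # size of the top decile
    top_rate = sum(y_true[i] for i in order[:k]) / k
    return top_rate / base_rate


def gini_coefficient(y_true, scores):
    """Gini computed as 2 * AUC - 1 (assumed definition), AUC by pairwise comparison."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one churner and one non-churner")
    # A pair counts 1 if the churner outscores the non-churner, 0.5 on ties.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1


# Hypothetical data: 20 customers, 3 churners, scores already in descending order.
y_true = [1, 1, 0, 1] + [0] * 16
scores = list(range(20, 0, -1))
print(top_decile_lift(y_true, scores), gini_coefficient(y_true, scores))
```

Under these definitions, a lift of 1 and a Gini of 0 correspond to random ranking.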
This document discusses churn management in mobile communications. It defines churn as customer attrition or loss and churn rate as the number of customers who discontinue service divided by the total number of customers. It identifies reasons for churn such as easy switching between providers and inadequate services. It discusses types of churn, data transformation for modeling, identifying customers' propensity to churn, and calculating customer profitability. Finally, it outlines strategies for reducing churn such as identifying valuable customers and developing win-back policies.
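The churn rate definition in this summary is a single ratio. A minimal sketch, with a hypothetical function name and hypothetical figures:

```python
def churn_rate(customers_lost, total_customers):
    """Customers who discontinued service divided by total customers."""
    if total_customers <= 0:
        raise ValueError("total_customers must be positive")
    return customers_lost / total_customers


# Hypothetical figures: 50 of 2,000 customers left during the period.
rate = churn_rate(50, 2000)
print(f"{rate:.1%}")
```

The ratio is only meaningful when both counts refer to the same period, for example one month.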
(Presented by Antonio Piccolboni at the Strata 2012 Conference, Feb 29 2012).
RHadoop is an open source project spearheaded by Revolution Analytics to grant data scientists access to Hadoop’s scalability from their favorite language, R. RHadoop consists of three packages:
- rhdfs provides file-level manipulation for HDFS, the Hadoop file system
- rhbase provides access to HBase, the Hadoop database
- rmr allows writing mapreduce programs in R
rmr allows R developers to program in the mapreduce framework, and gives all developers an alternative way to implement mapreduce programs that strikes a delicate compromise between power and usability. It allows writing general mapreduce programs, offering the full power and ecosystem of an existing, established programming language. It doesn’t force you to replace the R interpreter with a special run-time; it is just a library. You can write logistic regression in half a page and even understand it. It feels and behaves almost like the usual R iteration and aggregation primitives. It consists of a handful of functions with a modest number of arguments and sensible defaults that combine in many useful ways. But there is no way to prove that an API works: one can only show examples of what it allows you to do, and we will do that, covering a few from machine learning and statistics. Finally, we will discuss how to get involved.
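The abstract does not show rmr's own functions, so the following is not rmr code. It is a plain Python sketch of the mapreduce pattern the abstract refers to (map, group by key, reduce), using a word count as the example. All function names here are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    # Total the counts emitted for one word.
    return sum(values)


lines = ["R meets Hadoop", "Hadoop meets R"]
counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)
print(counts)
```

Only the mapper and the reducer are problem-specific; the other three functions stand in for what a mapreduce framework provides.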
The document discusses machine learning algorithms for predicting customer churn in a prepaid mobile network. It presents an overview of supervised and unsupervised learning techniques including support vector machines, k-nearest neighbors, neural networks, decision trees and naive Bayes. The document outlines features for a churn prediction model, describes a demo of the model using different algorithms, and evaluates the classification accuracy and churn rates.
This document discusses various use cases and analyses for a telecom company, including subscription activation and termination, CRM and billing, revenue segmentation by service and customer, customer churn analysis and reasons for churn, customer profiling, calculating average revenue per user (ARPU) and the shift to average revenue per account (ARPA), segmentation of postpaid and prepaid customers, analyzing tariff plan changes, and how customer segmentation can benefit operators by maximizing revenue and retention.
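ARPU and ARPA differ only in the denominator: revenue is divided by the number of users in one case and by the number of accounts in the other. A minimal sketch with hypothetical figures, assuming one account can hold several users:

```python
def average_revenue(total_revenue, count):
    """Average revenue per unit (per user for ARPU, per account for ARPA)."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total_revenue / count


# Hypothetical monthly figures: 6,000 users grouped under 4,000 accounts.
total_revenue = 120_000.0
arpu = average_revenue(total_revenue, 6_000)
arpa = average_revenue(total_revenue, 4_000)
print(arpu, arpa)
```

With multi-user accounts, ARPA is at least as large as ARPU for the same revenue, since there are no more accounts than users.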
Xuefeng Si earned two Microsoft certifications in 2010: Microsoft Certified Technology Specialist (Microsoft SQL Server 2005, Business Intelligence Development) on May 29, 2010, and Microsoft Certified IT Professional (Business Intelligence Developer) on June 5, 2010. He completed the required exams on the same dates: exam 445 on May 29, 2010 and exam 446 on June 5, 2010.