The document discusses how machine learning algorithms can be used in ecommerce to increase sales and conversions. It provides an overview of common algorithms such as K-means clustering, which can be used to segment customers into personas for targeted marketing. The K-nearest neighbors algorithm can be used to generate personalized product recommendations based on a user's purchase history and the preferences of similar customers. Examples show how these algorithms work, and practical tips are provided for implementing machine learning in ecommerce applications.
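The segmentation idea can be illustrated with a minimal K-means sketch. It assumes one-dimensional data (a made-up yearly spend per customer) and hand-picked starting centroids; the function name `kmeans_1d` and all values are hypothetical and do not come from the document.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical yearly spend per customer (illustrative values only).
spend = [12, 15, 18, 95, 100, 110, 400, 420]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 100.0, 500.0])
print(centroids)
print(clusters)
```

Each resulting cluster could then be treated as one customer segment; real data would normally have several features and need scaling first.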
The document provides an introduction to machine learning techniques for category representation, outlining topics like clustering, classification, dimensionality reduction, and density estimation. It discusses supervised, unsupervised, and semi-supervised learning approaches and how to evaluate models using techniques like cross-validation to avoid overfitting. The goal of the course is to introduce common machine learning algorithms used in object recognition systems.
Sentiment analysis using support vector machine, by Shital Andhale
SVM is a supervised machine learning algorithm that can be used for classification or regression. It works by finding the optimal hyperplane, the one that separates the classes with the largest margin, meaning the greatest distance to the nearest data points of each class. It can perform nonlinear classification using the kernel trick, which implicitly maps data into a higher-dimensional space. SVM is effective for high-dimensional data, uses only a subset of the training points (the support vectors), and works well when there is a clear margin of separation between classes, though it does not directly provide probability estimates. It has applications in text categorization, image classification, and other domains.
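To make the margin idea concrete, here is a minimal sketch that measures the margin of a given hyperplane w·x + b = 0. It does not train an SVM; the weights, bias, and labelled points are hypothetical values chosen for illustration.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical hyperplane and labelled points (labels are +1 / -1).
w, b = (1.0, 1.0), -3.0
points = [((0.0, 1.0), -1), ((1.0, 0.0), -1), ((2.0, 3.0), +1), ((3.0, 2.0), +1)]

# A point lies on the correct side when label * distance is positive;
# the margin of this hyperplane is the smallest such value.
margin = min(label * signed_distance(w, b, x) for x, label in points)
print(round(margin, 4))
```

An SVM solver searches over w and b for the hyperplane that makes this quantity as large as possible.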
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ... by Simplilearn
This presentation on Machine Learning will help you understand why Machine Learning came into the picture, what Machine Learning is, and the types of Machine Learning. It covers Machine Learning algorithms, with detailed explanations of linear regression, decision trees, and support vector machines, and ends with a use case implementation that classifies whether a recipe is for a cupcake or a muffin using the SVM algorithm. Machine learning is a core sub-area of artificial intelligence; it enables computers to learn without being explicitly programmed. When exposed to new data, these programs learn, change, and develop by themselves. Put simply, the iterative aspect of machine learning is the ability to adapt to new data independently. Now, let us get started with this Machine Learning presentation and understand what it is and why it matters.
The following topics are explained in this Machine Learning presentation:
1. Why Machine Learning?
2. What is Machine Learning?
3. Types of Machine Learning
4. Machine Learning Algorithms
- Linear Regression
- Decision Trees
- Support Vector Machine
5. Use case: Classify whether a recipe is for a cupcake or a muffin using SVM
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars. This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
Why learn Machine Learning?
Machine Learning is taking over the world, and with that comes a growing need among companies for professionals who know the ins and outs of Machine Learning.
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be data scientists or Machine Learning engineers
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at: https://www.simplilearn.com/
Exploring the Impact of Magnitude- and Direction-based Loss Function on the P... by Dr. Amarjeet Singh
Research on predicting prices (as time series) with deep learning models usually uses a magnitude-based error measurement. However, in trading, an error in the predicted direction can affect trading results much more than the magnitude error. Few works consider the impact of a wrongly predicted trading direction as part of the error measurement.
In this work, we first find parameter sets of LSTM and TCN models with low magnitude-based error measurement, and then calculate the profitability using program trading. Relationships between profitability and error measurements are analyzed.
We also propose a new loss function that considers both directional and magnitude error, and use it to re-evaluate the previous models. Three commodities are tested: gold, soybean, and crude oil (from GLOBEX). Our findings are: with the given parameter sets, if a commodity (gold and soybean) has a low averaged magnitude error, then its profitability is more stable, and the proposed loss function can further improve profitability. If it has a larger magnitude error (crude oil), then its profitability is unstable, and the proposed loss function can neither improve nor stabilize the profitability.
Furthermore, the relationship between profitability and error measurement for models of LSTM and TCN with or without customized loss function is not, as commonly believed, highly positively correlated (i.e., the more precise the predicted value, the more trading profit) since the correlation coefficients are rarely higher than 0.5 in all our experiments. However, the customized loss functions perform better in TCN than in LSTM.
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ... by Simplilearn
This presentation on Machine Learning will help you understand what clustering is, how K-Means clustering works (with a flowchart and a demo that clusters cars into brands), what logistic regression is, the logistic regression curve and sigmoid function, and a demo on how to classify a tumor as malignant or benign based on its features. Machine Learning algorithms can help computers play chess, perform surgeries, and get smarter and more personal. K-Means and logistic regression are two widely used Machine Learning algorithms, which we are going to discuss in this video. Logistic regression is used to estimate discrete values (usually binary values like 0/1) from a set of independent variables. It helps to predict the probability of an event by fitting data to a logit function. It is also called logit regression. K-means clustering is an unsupervised learning algorithm. In this case, unlike in supervised learning, you don't have labeled data. You have a set of data that you want to group into clusters, so that objects that are similar in nature and characteristics are put together. This is what K-means clustering is all about. Now, let us get started and understand K-Means clustering and logistic regression in detail.
The following topics are explained in this Machine Learning tutorial, part 2:
1. Clustering
- What is clustering?
- K-Means clustering
- Flowchart to understand K-Means clustering
- Demo - Clustering of cars based on brands
2. Logistic regression
- What is logistic regression?
- Logistic regression curve & Sigmoid function
- Demo - Classify a tumor as malignant or benign based on features
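The link between the sigmoid function and a 0/1 prediction can be sketched as follows. This is a minimal illustration and not the demo from the presentation: the coefficients, bias, feature values, and the 0.5 threshold are all assumptions.

```python
import math

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical, already-fitted coefficients for two features.
weights, bias = (0.8, -0.5), -1.0
p = predict_proba(weights, bias, (2.0, 1.0))

# Turn the probability into a discrete 0/1 label with a 0.5 threshold.
label = 1 if p >= 0.5 else 0
print(round(p, 3), label)
```

Fitting the coefficients from data (for example by gradient descent on the log loss) is the step this sketch leaves out.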
This document discusses a study on using different machine learning classifiers for classification and regression problems. It first provides a brief description of linear regression, logistic regression, neural networks, and support vector machines. It then discusses using these classifiers on classification and regression data. For classification data consisting of hand gesture images, logistic regression achieved 87.89% accuracy, while support vector machines achieved 90% accuracy with a linear kernel. Neural networks generally performed best by training more complex models without overfitting. Overall, the study evaluated the performance of different machine learning algorithms on sample datasets.
The document examines using a nearest neighbor algorithm to rate men's suits based on color combinations. It trained the algorithm on 135 outfits rated as good, mediocre, or bad. It then tested the algorithm on 30 outfits rated by a human. When trained on 135 outfits, the algorithm incorrectly rated 36.7% of test outfits. When trained on only 68 outfits, it incorrectly rated 50% of test outfits, showing larger training data improves accuracy. It also tested using HSL color representation instead of RGB with similar results.
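A nearest neighbor rating of this kind can be sketched as below. The summary does not say how outfits were encoded or which distance was used, so this assumes a 1-nearest-neighbor rule, Euclidean distance, and an outfit represented as jacket RGB followed by shirt RGB; all data values are made up.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training example closest to query (1-NN)."""
    _, label = min(train, key=lambda example: math.dist(example[0], query))
    return label

# Hypothetical training outfits: (jacket R, G, B, shirt R, G, B) -> rating.
train = [
    ((20, 20, 60, 255, 255, 255), "good"),
    ((20, 20, 60, 250, 120, 30), "bad"),
    ((90, 90, 90, 200, 220, 255), "mediocre"),
]

# Rate an unseen outfit by copying the rating of its closest neighbor.
print(nearest_neighbor(train, (25, 25, 70, 245, 245, 250)))
```

Switching to HSL, as the study also tried, would only change how each outfit is encoded before the distance is computed.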
Methodological study of opinion mining and sentiment analysis techniques, by ijsc
Decision making at both the individual and the organizational level is always accompanied by a search for
others' opinions. The tremendous growth of opinion-rich resources such as reviews, forum
discussions, blogs, micro-blogs, and Twitter provides a rich anthology of sentiments. This user-generated
content can benefit the market if its semantic orientation is analyzed. Opinion mining
and sentiment analysis are the formal disciplines for studying and interpreting opinions and sentiments. The
digital ecosystem has itself paved the way for using the huge volume of opinionated data recorded. This paper
attempts to review and evaluate the various techniques used for opinion and sentiment analysis.
Tweets Classification using Naive Bayes and SVM, by Trilok Sharma
This document summarizes a project to automatically classify tweets into predefined Wikipedia categories. It discusses using three algorithms - Naive Bayes, SVM, and rule-based - to classify tweets into 11 categories like business, sports, politics etc. It explains the concepts used like removing outliers, stemming, spell checking. Accuracy results using 10-fold cross validation show SVM and rule-based achieving over 80% accuracy on most categories. The project analyzed real-time tweet data using an API and achieved high performance speeds for classification.
Here are the key calculations:
1) Probability that persons p and q will be at the same hotel on a given day d is 1/100 × 1/100 × 10^-5 = 10^-9, since each person stays in a hotel on any given day with probability 1/100 and the chance that both pick the same hotel is 10^-5.
2) Probability that p and q will be at the same hotel on given days d1 and d2 is (10^-9) × (10^-9) = 10^-18, since the events are independent.
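The arithmetic in these two calculations can be checked with exact fractions. The sketch below only multiplies the factors as they are written above; the variable names are illustrative.

```python
from fractions import Fraction

# Calculation 1: the three factors, multiplied as written.
same_hotel_one_day = Fraction(1, 100) * Fraction(1, 100) * Fraction(1, 10**5)

# Calculation 2: the two days are treated as independent,
# so the single-day probabilities multiply.
same_hotel_two_days = same_hotel_one_day ** 2

print(same_hotel_one_day == Fraction(1, 10**9))
print(same_hotel_two_days == Fraction(1, 10**18))
```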
This is my summer internship project presentation. I worked on three projects in total, and brief details of each are provided in the presentation.
Thanks to Eckovation.
Automation of IT Ticket Automation using NLP and Deep Learning, by Pranov Mishra
Overview of problem solved: IT uses the Incident Management process to ensure that business operations are not impacted. In many IT organizations, assigning incidents to the appropriate IT groups is still a manual process. Manual assignment is time-consuming and requires human effort. Human error can cause mistakes, and misrouted incidents waste resources. Manual assignment also increases response and resolution times, which degrades user satisfaction and customer service.
Solution: Multiple deep learning sequential models with GloVe embeddings were tried and their results compared to arrive at the best model. The two best models are highlighted below through their results.
1. A bidirectional LSTM achieved an accuracy of 71% and a precision of 71% on the data set.
2. Accuracy and precision improved further, to 73% and 76% respectively, with an ensemble of 7 Bi-LSTMs.
I built an NLP-based deep learning model to solve the above problem. Link below:
https://github.com/Pranov1984/Application-of-NLP-in-Automated-Classification-of-ticket-routing?fbclid=IwAR3wgofJNMT1bIFxL3P3IoRC3BTuWmhw1SzAyRtHp8vvj9F2sKZdq67SjDA
Amirkabir University of Technology (Tehran Polytechnic)
Faculty of Computer Engineering and Information Technology
Advanced Database Course
Conference Presentation
Review on Data Mining and its techniques.
Supervisor: Dr. Alireza Bagheri
November 2016 (Azar 1395)
Written in English, presented in Persian
Three case studies deploying cluster analysis, by Greg Makowski
Three case studies that include cluster analysis as a component are discussed.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Clusters are used to find subsets of similar behavior, and prices are optimized within each cluster, with a neural net as the objective function.
3) Retail supply chain. Replenishment is planned from 52-week demand curves using thousands of seasonal "profiles" or clusters.
User Payment Prediction in Free-to-Play, by Ahmed Hassan
The document discusses a methodology for using machine learning to predict whether players in free-to-play games will make real-currency payments based on their data. Random forests, SVMs, and gradient boosting were tested with and without oversampling to address class imbalance. Random forests performed best with an AUC of 0.9607 after oversampling. While results were promising, future work is needed to improve true positive rates, predict number/value of payments, address class overlap, and integrate the framework.
This presentation discusses decision trees as a machine learning technique. It introduces the problem with several examples: cricket player selection, medical C-section diagnosis, and a mobile phone price predictor. It covers the ID3 algorithm and how the decision tree is induced, and it defines and applies concepts such as entropy and information gain.
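Entropy and information gain, the quantities ID3 uses to choose a split, can be sketched in a few lines. The toy player-selection rows, attribute names, and labels below are hypothetical and are not taken from the presentation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on rows[i][attribute]."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: should a player be selected?
rows = [{"form": "good", "fit": "yes"}, {"form": "good", "fit": "no"},
        {"form": "poor", "fit": "yes"}, {"form": "poor", "fit": "no"}]
labels = ["select", "select", "reject", "reject"]

gain_form = information_gain(rows, labels, "form")
gain_fit = information_gain(rows, labels, "fit")
print(gain_form, gain_fit)
```

ID3 would split on the attribute with the larger gain and then repeat the same computation inside each branch.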
This document summarizes the results of analyzing reviews of the Samsung Galaxy Mega 5.8 I9152 smartphone. Key analysis techniques included:
1. Creating a word cloud to identify prominent terms in reviews like "screen", "battery", and "camera".
2. Using latent semantic analysis and clustering to group reviews into 5 clusters related to features, experience, and company faith.
3. Finding that 73% of reviews gave positive ratings, indicating most users were satisfied.
4. Employing support vector machine classification to separate reviews into satisfied and dissatisfied categories based on word sentiment.
This document discusses algorithm-independent machine learning techniques. It introduces concepts like bias and variance, which can quantify how well a learning algorithm matches a problem without depending on a specific algorithm. Methods like cross-validation, bootstrapping, and resampling can be used with different algorithms. While no algorithm is inherently superior, such techniques provide guidance on algorithm use and help integrate multiple classifiers.
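Cross-validation is algorithm-independent because it only decides which samples are used for training and which for testing. A minimal sketch of a k-fold index split follows; it assumes contiguous, unshuffled folds, and the helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

Any learning algorithm can then be trained on each `train` index set and scored on the matching `test` set, with the scores averaged across folds.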
Machine Learning: Foundations Course Number 0368403401, by butest
This machine learning course will cover theoretical and practical machine learning concepts. It will include 4 homework assignments and programming in Matlab. Lectures will be supplemented by student-submitted class notes in LaTeX. Topics will include learning approaches like storage and retrieval, rule learning, and flexible model estimation, as well as applications in areas like control, medical diagnosis, and web search. A final exam format has not been determined yet.
The document describes the C4.5 algorithm for building decision trees. It begins with an overview of decision trees and the goals of minimizing tree levels and nodes. It then outlines the steps of the C4.5 algorithm: 1) Choose the attribute that best differentiates training instances, 2) Create a tree node for that attribute and child nodes for each value, 3) Recursively create subordinate nodes until reaching criteria or no remaining attributes. An example applies these steps to build a decision tree to predict customers' responses to a life insurance promotion using attributes like age, income and insurance status.
The document discusses machine learning and provides examples of its applications. It introduces concepts such as learning from experience to improve performance, constructing learning algorithms, and representing the target function. Examples discussed include using patient data to predict high-risk pregnancies, using financial data to analyze credit risk, and learning to play checkers by representing the value of board positions and updating weights. Key questions in machine learning design are also summarized.
Netflix uses a variety of techniques to provide personalized recommendations to users. Some key aspects include:
1. Netflix recommendations are generated using both offline and online techniques. Offline techniques allow for more complex computations but results may become stale, while online techniques can respond quickly but have stricter time constraints.
2. Recommendations are generated using a variety of data sources and machine learning models, including SVD, RBMs, gradient boosted trees, and other techniques. Both the data and models are important for generating high quality recommendations.
3. Netflix tests recommendations using both offline and online A/B testing techniques. Offline testing is used to evaluate new models and ideas before launching online tests involving real users.
This document provides an introduction to data mining. It discusses why organizations use data mining, such as for credit ratings, fraud detection, and customer relationship management. It describes the data mining process of problem formulation, data collection/preprocessing, mining methods, and result evaluation. Specific mining methods covered include classification, clustering, association rule mining, and neural networks. It also discusses applications of data mining across various industries and gives some examples of successful real-world data mining implementations.
Romer Rosales is the Director of Artificial Intelligence at LinkedIn. He discusses how AI is used throughout LinkedIn's products and services to optimize member experiences. Key applications of AI include the home page feed, notifications, connections, and search. The goal is to automatically deliver the right information to the right user through the right channel. Originally, AI focused on single objectives like engagement, but now takes a holistic approach considering multiple objectives and tradeoffs across the ecosystem. Examples provided demonstrate how AI is used to balance objectives like reducing notifications while maintaining engagement.
This document provides an overview of machine learning, from basic concepts to cutting-edge trends. It begins with an introduction to machine learning and provides examples of supervised, unsupervised, and reinforcement learning techniques. It then describes basic algorithms like linear regression, decision trees, and k-nearest neighbors. The document outlines important concepts like feature engineering and cross-validation. Finally, it discusses generative adversarial networks as an emerging trend in machine learning.
The document introduces machine learning concepts from the basics to cutting-edge trends. It begins with an overview of supervised learning, unsupervised learning, and reinforcement learning. Then it covers basic algorithms like linear regression, decision trees, and k-nearest neighbors. Next, it discusses intermediate concepts such as feature engineering and cross-validation. Finally, it explores generative adversarial networks as a cutting-edge trend in machine learning.
November 2016
In English Presented in Persian
دانشگاه صنعتی امیرکبیر (پلی تکنیک تهران)
دانشکده مهندسی کامپیوتر و فناوری اطلاعات
ارائه کنفرانس درس پایگاه داده پیشرفته
داده کاوی و تکنیک های آن
استاد: دکتر علیرضا باقری
آذرماه 1395
Three case studies deploying cluster analysisGreg Makowski
Three case studies are discussed, that include cluster analysis as a component.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
3) Retail supply chain, planning replenishment using 52 week demand curves using thousands of seasonal "profiles" or clusters.
User Payment Prediction in Free-to-PlayAhmed Hassan
The document discusses a methodology for using machine learning to predict whether players in free-to-play games will make real-currency payments based on their data. Random forests, SVMs, and gradient boosting were tested with and without oversampling to address class imbalance. Random forests performed best with an AUC of 0.9607 after oversampling. While results were promising, future work is needed to improve true positive rates, predict number/value of payments, address class overlap, and integrate the framework.
This presentation discusses decision trees as a machine learning technique. This introduces the problem with several examples: cricket player selection, medical C-Section diagnosis and Mobile Phone price predictor. It discusses the ID3 algorithm and discusses how the decision tree is induced. The definition and use of the concepts such as Entropy, Information Gain are discussed.
This document summarizes the results of analyzing reviews of the Samsung Galaxy Mega 5.8 I9152 smartphone. Key analysis techniques included:
1. Creating a word cloud to identify prominent terms in reviews like "screen", "battery", and "camera".
2. Using latent semantic analysis and clustering to group reviews into 5 clusters related to features, experience, and company faith.
3. Finding that 73% of reviews gave positive ratings, indicating most users were satisfied.
4. Employing support vector machine classification to separate reviews into satisfied and dissatisfied categories based on word sentiment.
This document discusses algorithm-independent machine learning techniques. It introduces concepts like bias and variance, which can quantify how well a learning algorithm matches a problem without depending on a specific algorithm. Methods like cross-validation, bootstrapping, and resampling can be used with different algorithms. While no algorithm is inherently superior, such techniques provide guidance on algorithm use and help integrate multiple classifiers.
Machine Learning: Foundations Course Number 0368403401butest
This machine learning course will cover theoretical and practical machine learning concepts. It will include 4 homework assignments and programming in Matlab. Lectures will be supplemented by student-submitted class notes in LaTeX. Topics will include learning approaches like storage and retrieval, rule learning, and flexible model estimation, as well as applications in areas like control, medical diagnosis, and web search. A final exam format has not been determined yet.
The document describes the C4.5 algorithm for building decision trees. It begins with an overview of decision trees and the goals of minimizing tree levels and nodes. It then outlines the steps of the C4.5 algorithm: 1) Choose the attribute that best differentiates training instances, 2) Create a tree node for that attribute and child nodes for each value, 3) Recursively create subordinate nodes until reaching criteria or no remaining attributes. An example applies these steps to build a decision tree to predict customers' responses to a life insurance promotion using attributes like age, income and insurance status.
The document discusses machine learning and provides examples of its applications. It introduces concepts such as learning from experience to improve performance, constructing learning algorithms, and representing the target function. Examples discussed include using patient data to predict high-risk pregnancies, using financial data to analyze credit risk, and learning to play checkers by representing the value of board positions and updating weights. Key questions in machine learning design are also summarized.
Netflix uses a variety of techniques to provide personalized recommendations to users. Some key aspects include:
1. Netflix recommendations are generated using both offline and online techniques. Offline techniques allow for more complex computations but results may become stale, while online techniques can respond quickly but have stricter time constraints.
2. Recommendations are generated using a variety of data sources and machine learning models, including SVD, RBMs, gradient boosted trees, and other techniques. Both the data and models are important for generating high quality recommendations.
3. Netflix tests recommendations using both offline and online A/B testing techniques. Offline testing is used to evaluate new models and ideas before launching online tests involving real users
This document provides an introduction to data mining. It discusses why organizations use data mining, such as for credit ratings, fraud detection, and customer relationship management. It describes the data mining process of problem formulation, data collection/preprocessing, mining methods, and result evaluation. Specific mining methods covered include classification, clustering, association rule mining, and neural networks. It also discusses applications of data mining across various industries and gives some examples of successful real-world data mining implementations.
11. Agenda
Who: Who are we?
Why: What value does machine learning add in eCommerce?
What: What algorithms are used in eCommerce?
Introduction to algorithms
Business Use Cases : User Personas, Product Recommendations
Which algorithm is used for this specific case?
How: How does this algorithm actually work?
High Level Description
Example Implementation
Q&A
12. About Reboot.ai
Matt O’Connor
BBA Finance
Previous: Lead Trader Algorithmic Desk - Macro Hedge Fund
Current: Full stack developer and professional Scrum Master (PSM I)
Avid futurist – social ramifications of AI & blockchain
Dhruv Sahi
BA Mathematics and Economics
Previous: Data Science Chief – Smart Cities Startup
Current: Business Intelligence Analyst – eCommerce - Grana
AI, IoT, and smart cities enthusiast
Reboot.ai
Hong Kong’s only dedicated machine learning and AI training provider
Part time evening courses for beginners and advanced
Curriculums developed in partnership with local data companies
Use ML & AI in our classrooms to improve teaching and personalize learning
Who?
13. Why Machine Learning?
1) Computers are much faster than humans
Even complex or infinite solution problems have practical ‘solutions’ and optimizations
Ex. Google maps vs human intuition
2) Logic is replicable and scalable
Consistency of results not humanly possible
Conducive to experimentation; A/B testing can limit the variables at play
3) Can incorporate elements of ‘learning’ from results
Can ‘teach itself’ and improve
Can identify insights that are not intuitive or sometimes invisible to humans
Why?
14. Headline Use Cases
Recommendation Engines: How Amazon and Netflix Are Winning the Personalization Battle and Optimizing Revenues
75% of all content on Netflix is viewed through their recommendation engine
35% of Amazon’s revenues are the product of their recommendation engine
Machine Learning Generates Clickbait Headlines That Will SHOCK You
Predict Sentiment From Movie Reviews Using Deep Learning
Can Chatbots Help Reduce Customer Service Costs by 30%?
Why?
18. Summary: Why?
Benefits
Can be faster and cheaper than human alternative
Can be employed in a wide variety of real world conditions even with limited/flawed data
Can improve, learn, and identify trends humans would have trouble identifying
Weaknesses
Very difficult to create intelligence good in multiple unrelated contexts
No instincts, ‘genetic knowledge’ or ‘intuition’
Mistrusted and misunderstood
Questions?
Why?
19. What is an Algorithm
An algorithm is a step by step process for completing a task.
Everyday examples: recipes, ‘habits’, traditions, traffic laws
Example in code
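def send_promotion(item):
    print("Sending promotion for:", item)  # stand-in for a real mail-out

def email_customer(gender):
    # Hard-coded rule: gender is the only input considered
    if gender == "male":
        send_promotion("shirt")
    else:
        send_promotion("dress")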
The algorithm knows what to suggest for each gender, but nothing about buying patterns, age, occasion, etc. Is it intelligent?
What?
20. Tic-Tac-Toe Algorithm
Let’s pseudo code an algorithm right now
If you were playing Tic-Tac-Toe, how would you decide to move?
Algorithm: a step by step process (game strategy) for completing a task (winning)
What?
21. Tic-Tac-Toe Algorithm
Check if we have 2 in a row next to an empty space, play and win
Check if opponent has 2 in a row next to an empty space, block it
Imagine playing in a space and how opponent would react… repeat
Try to play in spaces that maximize my connections while minimizing opponent’s
It’s just tic-tac-toe, so it doesn’t matter that much: when in doubt, choose randomly and remember what happens for next time (experiment)
What?
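The first two rules and the random fallback from the list above can be sketched in Python. This is a minimal illustration, not the presenters' code: the board layout (a list of 9 squares holding "X", "O" or None) and the function names are assumptions, and the look-ahead and "maximize connections" rules are left out for brevity. It also assumes the board has at least one empty square.

```python
import random

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def completing_move(board, player):
    """Return an empty square that gives `player` three in a row, or None."""
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(None) == 1:
            return line[marks.index(None)]
    return None

def choose_move(board, me, opponent, rng=random):
    """Pick a square using the first rule that applies."""
    move = completing_move(board, me)            # rule 1: win if possible
    if move is None:
        move = completing_move(board, opponent)  # rule 2: otherwise block
    if move is None:
        empty = [i for i, mark in enumerate(board) if mark is None]
        move = rng.choice(empty)                 # fallback: experiment at random
    return move

board = ["X", "X", None,
         "O", None, None,
         None, None, None]
print(choose_move(board, "O", "X"))  # O has no winning square, so it blocks X
```

Each rule is a fixed, hand-written check, which is what makes this an algorithm but not yet a learning one: nothing is remembered between games.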
22. Business Use Case #1:
Segmenting Customers
Customer Personas
‘a semi-fictional representation of your ideal customer based on market research and real data about your existing customers’
Allow for targeted marketing messages
Personalization = higher conversions
Previous method: manually identify, sort, and maintain separate lists
Problem: expensive (time and money), prone to human error, not standardized and therefore not improvable
What?
23. Segmenting Customer Personas
Challenge: find a more repeatable, scalable process for sorting customers into distinct user personas
Type of problem: clustering (grouping)
Algorithm: K-means clustering
Why:
Groups data into distinct clusters
Doesn’t need to know any labels or additional information (unsupervised)
Can be used to label data for future categorization
What?
24. K-Means Clustering: Details
Goal: Group bunches of points into ‘K’ distinct groups
Provided Inputs
Set of Data Points
Integer value of ‘K’, e.g. K = 3 means split the data points into 3 clusters
Outputs
K clusters that together contain all provided data points
Note: this is not the same as categorization, because the groups are found without labels (unsupervised)
How?
25. K-Means Clustering: Process
1) Initialize K cluster centers, called ‘centroids’, at random locations
2) For each point, calculate the distance to every centroid and assign the point to the closest centroid (smallest distance)
3) Update each centroid to the average position of all data points in its cluster
4) Repeat steps 2 and 3 until the clusters do not change from one run to the next
5) Evaluate the model: Silhouette Coefficient
How?
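Steps 1 to 4 above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not a production implementation: the function names are hypothetical, points are tuples of numbers, and the centroids start at K randomly chosen data points (one common way to realize "random locations").

```python
import math
import random

def mean_point(points):
    """Average position of a list of equal-length points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, labels) for `points`, a list of numeric tuples."""
    rng = random.Random(seed)
    # Step 1: start from k randomly chosen data points as centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return centroids, labels

# Hypothetical customers as (orders per year, average order value / 10).
customers = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels = k_means(customers, k=2)
print(centroids, labels)
```

Which cluster gets label 0 or 1 depends on the random start, so downstream code should rely on which points share a label, not on the label values themselves.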
26. K-Means Clustering: Process
How?
Example of how clusters change per iteration
Here the random initial centroid positions create a ‘green’ cluster that is imprecise, and a ‘blue’ cluster spread across two groups of points
As a result, the blue centroid is ‘pulled’ towards the top middle, taking points away from green and shifting the green centroid to the bottom left
28. Use Case #1- Clustering Personas
Summary
High Level
Separating user personas is a situation with a lot of unlabeled data
KMeans clustering can be used to group data points into K distinct groups
Advantage: it is relatively easy to implement
Deeper Dive
An iterative algorithm which runs many times
Moves each centroid to the average position of the points in its cluster
Questions?
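The process slide names the Silhouette Coefficient as the evaluation step. A minimal sketch of that measure follows, assuming Euclidean distance, at least two clusters, and the usual convention that a single-point cluster scores 0; the function name is hypothetical. For each point, a is the mean distance to the other members of its own cluster, b is the mean distance to the nearest other cluster, and the score is (b - a) / max(a, b).

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette_score(points, [0, 0, 1, 1]))  # tight, well-separated clusters
print(silhouette_score(points, [0, 1, 0, 1]))  # clusters that cut across the groups
```

Scores near 1 suggest compact, well-separated clusters, while scores near or below 0 suggest points sitting in the wrong cluster, which makes the measure one way to compare different values of K.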
29. Business Use Case #2:
Product Recommendations
Product Recommendations
Allow for personalized advertising, complementary buys, and upsells
Maximize each customer’s lifetime value
Previous method: one-size-fits-all recommendations
Problem: not personalized, can be operationally difficult
What?
30. Product Recommendations
Challenge: generate personalized recommendations for each individual user, not just broad categories of users
Type of problem: neighbor distance calculation
Algorithm: K-Nearest Neighbors (KNN)
Why:
Calculates nearest neighbors to any given data point
Relatively simple to implement with high output quality
Can incorporate various sources of data: product characteristics or characteristics of users who also bought, special logic (context)
What?
31. KNN: Details
Goal: Find the most similar items to a given data point by mapping out the entire universe of relevant points
Provided Inputs
Specific data point
Universe of data points
K – number of neighbors to return
Method to calculate similarity
Outputs
K neighbors closest (most similar) to provided input data point
How?
32. KNN Cosine Similarity: Side Note
Side note: Why cosine similarity?
We must first answer, what are vectors?
Distance between two points is a function of two elements:
Magnitude
Direction
Vectors are combinations of magnitude and direction, and multi-dimensional vectors can be broken down into smaller parts (i.e. x and y)
This allows us to create a single vector which expresses multiple different metrics, such as 1) user rating and 2) price
How?
33. KNN Cosine Similarity: Side Note
Side note: Why cosine similarity?
There are multiple ways of measuring similarity between two items
Pure distance between two things isn’t always the best measure
Consider the case of direction as positive or negative ratings
End distance between points is not as important as similarity in vectors
How?
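A small sketch can make the contrast between distance and direction concrete. The vectors below are made-up two-feature examples, and the function name is an assumption; the formula itself is the standard cosine similarity, the dot product divided by the product of the two vector lengths, which is undefined for an all-zero vector.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

a = (1, 2)
same_direction = (3, 6)   # far from a, but pointing the same way
nearby = (2, 1)           # close to a, but pointing a different way
opposite = (-1, -2)       # e.g. negative ratings where a has positive ones

for name, v in [("same_direction", same_direction),
                ("nearby", nearby),
                ("opposite", opposite)]:
    print(name, "distance:", math.dist(a, v),
          "cosine:", cosine_similarity(a, v))
```

Here the nearby vector wins on plain distance, while the same-direction vector wins on cosine similarity, and the opposite vector scores the lowest possible similarity, which is the behaviour the slide describes for positive versus negative ratings.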
34. KNN Cosine Similarity: Process
1) Clean, wrangle and normalize your data
2) Pick a point from the data set and calculate its similarity (cosine similarity) to the given point
3) Repeat for all points in the data set
4) Return the K choices with the highest similarities
How?
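Steps 2 to 4 above can be sketched as a brute-force search. This is a minimal illustration rather than the presenters' implementation: the product names and feature values are invented, the data is assumed to be already cleaned and normalized (step 1), and items are stored in a dictionary mapping a name to a non-zero feature vector.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(target, items, k):
    """Return the names of the k items most similar to `target`."""
    scored = [
        (cosine_similarity(items[target], vec), name)
        for name, vec in items.items()
        if name != target            # never recommend the item itself
    ]
    scored.sort(reverse=True)        # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical catalogue: (style score, price score), both already scaled.
catalogue = {
    "tee":    (0.9, 0.1),
    "polo":   (0.8, 0.2),
    "blazer": (0.2, 0.9),
    "coat":   (0.1, 1.0),
}
print(nearest_neighbors("tee", catalogue, k=2))
```

Every query compares the target against the whole catalogue, which is simple and fine for small data sets; larger catalogues usually need an index or an approximate search instead.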
35. KNN Cosine Similarity: Process
How?
1) Prepare inputs
Select columns: style_attributes & mrp
Clean data and convert into correct numerical types
Normalise data using the feature scaling and ordinal scaling techniques
Store inputs in the correct data structure, a dictionary in this case
2) Define a function to calculate distance between any two points
3) Write a function that iterates over the distances from the primary point to find its closest K neighbors
4) Return neighbors as suggestions
Let’s look at the code!
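The code shown in the talk is not part of this transcript. As a stand-in, here is a minimal sketch of step 1 only. The column names style_attributes and mrp come from the slide; everything else is an assumption made for illustration: the sample rows, the ordering of styles in STYLE_ORDER, treating mrp as a numeric price stored as text, and reading "feature scaling" as min-max scaling to the range 0 to 1.

```python
# Hypothetical raw rows; only the two column names come from the slides.
raw_products = {
    "sku-1": {"style_attributes": "Casual", "mrp": "500"},
    "sku-2": {"style_attributes": "formal", "mrp": "1500"},
    "sku-3": {"style_attributes": " smart-casual", "mrp": "1000"},
}

# Assumed ordering of styles, used to turn a category into a rank.
STYLE_ORDER = ["casual", "smart-casual", "formal"]

def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def prepare_inputs(raw):
    """Clean both columns and return {name: (style_score, price_score)}."""
    names = list(raw)
    # Clean: strip whitespace, normalize case, convert text to numbers.
    ranks = [STYLE_ORDER.index(raw[n]["style_attributes"].strip().lower())
             for n in names]
    prices = [float(raw[n]["mrp"]) for n in names]
    # Normalise both columns to the same 0..1 range.
    styles = min_max_scale(ranks)
    prices = min_max_scale(prices)
    return {n: (s, p) for n, s, p in zip(names, styles, prices)}

print(prepare_inputs(raw_products))
```

One caveat with this choice: the item with the lowest value in every column becomes the all-zero vector, for which cosine similarity is undefined, so a real pipeline would need to handle that case, for example by scaling into a range that excludes zero.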
36. Use Case #2 - Product Recommendations
High Level
Using datasets in different segments to make more personalized recommendations to customers
Increase basket size and average order value to drive sales and improve customer experience
Advantage: automate recommendations to customers on the website/eDM/ads
Deeper Dive
A non-parametric, lazy algorithm that returns the closest matches given a starting point and the number of desired recommendations
Uses some type of distance metric to compute distance, and returns closest neighbors
Questions?
37. Practical Tips and Tools For ML & AI in eCommerce
NLP – it’s complex under the hood, but easy to implement
Sentiment analysis for reviews: https://www.lexalytics.com/
Chatbot platform with lots of easy integrations: API.ai
Python – many powerful libraries to start analyzing your data today
Scikit-learn, SciPy, StatsModels, PySpark, NLTK and many others
Cloud services for running recommendation engines in real-time
Enterprise Cloud Solutions for Deployment (e.g. AWS EMR + Redshift + Elastic Beanstalk)