Senior Project Powerpoint


Robert Clark

Essentials of Business Analytics
• What is Business Analytics?
• Part 1 – Descriptive Data Mining Through Cluster Analysis
• Part 2 – Predictive Data Mining Through Classification
• Part 3 – Linear Optimization Models

What is Business Analytics?
• Business analytics is the analysis of data in order to drive business decisions.
http://www.thehansindia.com/posts/index/Young-Hans/2016-03-09/Business-Analytics-course-at-IIT-H-/212475

Descriptive Data Mining Through Cluster Analysis
Part 1
• Goal: segment observations into similar groups.
• Two common methods: Hierarchical and k-Means
https://www.quora.com/Does-measuring-clustering-efficiency-with-precision-and-recall-make-any-sense

Descriptive Data Mining Through Cluster Analysis
• Here is the data we are clustering:

Descriptive Data Mining Through Cluster Analysis
• Method 1 – Hierarchical Clustering

Descriptive Data Mining Through Cluster Analysis
• Method 2 – k-Means Clustering

Descriptive Data Mining Through Cluster Analysis
[Figure: k-means results for k = 2, k = 3, k = 4, and k = 5]

Predictive Data Mining Through Classification
Part 2
• Goal: classify a new observation based on current data
• Three common methods: Logistic Regression, k-NN, CART

Predictive Data Mining Through Classification
• Method 1 – Logistic Regression
[Figure label: Cutoff Value]

Predictive Data Mining Through Classification
• Method 2 – k-Nearest Neighbors

Predictive Data Mining Through Classification
• Method 3 – Classification and Regression Trees

Predictive Data Mining Through Classification
• Which method is best at classifying new observations?

Linear Optimization Models
Part 3
• Goal: Maximize or minimize the objective function
https://en.wikipedia.org/wiki/Linear_programming

Linear Optimization Models
• Par, Inc. wants to make standard and deluxe golf bags. They are constrained by a limited amount of time for each production step.
• They make $10 profit for standard bags, $9 profit for deluxe bags
• Objective function to maximize: 10S + 9D

Linear Optimization Models
[Figure: constraint functions and the feasible region]

Linear Optimization Models
[Figure: objective function, feasible region, and optimized solution]

Linear Optimization Models
• J.D. Williams Inc. Case Study:
• 3 funds
  • Growth stock fund (18% yield, .10 risk)
  • Income fund (12.5% yield, .07 risk)
  • Money market fund (7.5% yield, .01 risk)
• Client has $800,000 to invest. How should the client allocate their money at a controlled risk level while maximizing profit?

Linear Optimization Models
• Growth stock fund: $320,000
• Income fund: $240,000
• Money market fund: $240,000
• Yearly return: $105,600


Title slide: Essentials of Business Analytics. Robert Clark; Lawrence Chilton (Mentor). Brigham Young University – Idaho, Spring 2016.


Editor's Notes

Simple: The goal of business analytics is to analyze the data a business collects in order to drive decisions that lead to better performance, for example by reducing costs, improving marketing strategies, or predicting future events. The three methods covered here (clustering, classification, and optimization) are common methods that are not often taught in introductory business analytics courses. Detailed: What to do with missing data: leave the observation out, or fill the gap in with the average value. The right choice depends on why the data is missing: is it missing completely at random, or not? Other methods include linear regression, statistical inference, time series analysis and forecasting, and integer and nonlinear optimization models.
Simple: Clustering can be used on people for marketing strategies, buildings for city planning, geographical locations for earth observation studies, etc. Two common ways to do clustering are hierarchical clustering and k-means clustering. Here the clusters are very obvious, but most often the data is hard to visualize because the groups overlap and the data often has more than two dimensions. Detailed: Clustering is commonly used in marketing to divide consumers into different groups, a process known as market segmentation. Once consumers are divided into groups, a firm can tailor its marketing strategy to each group. Clustering can be used to group and compare many things: people, for marketing strategies or buying patterns, including for banks and insurance companies; buildings, for city planning; genes, to discover genes with similar functions; stars, to compare similar celestial bodies in order to discover planets that could support life; and geographical locations, for earth observation studies.
This data comes from ‘Know Thy Customer’ (KTC), a financial advising company that provides personalized financial advice to its clients. If KTC is able to cluster its clients, it could tailor specific financial advice to each cluster, making its work easier. Note that this data has 7 dimensions, with both quantitative and categorical variables.
Simple: Hierarchical clustering starts with each individual observation as its own cluster, groups the most similar ones, and continues to merge similar clusters until one cluster remains. The user decides when to stop the clustering. A horizontal line is drawn on this graph in order to visualize where the clustering could end. This would produce 3 clusters. Detailed: The most common measure of similarity is Euclidean distance. It doesn’t work well for categorical variables, though. If clustering only categorical variables, the matching coefficient is better: count the variables on which two observations have the same value (both 1 or both 0) and divide by the total number of variables. As more and more clusters are joined, the dissimilarity between observations in each cluster increases. The distance on the y-axis is the measure of dissimilarity. The number of vertical lines the horizontal line crosses is the number of clusters, with each cluster consisting of the observations below its vertical line. In this case, there would be four clusters. Hierarchical clustering can be sensitive to outliers and to added or removed observations. It works better with smaller data sets (fewer than 500 observations). Its advantage is that you can see the solutions for increasing numbers of clusters.
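The merging procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the slides: it assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members, a rule the notes do not specify), and the points are made up. The function name `agglomerate` is hypothetical.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Single linkage is an assumption; the notes do not name a linkage rule.
    Returns the clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]  # start: one cluster per observation
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into cluster a
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Made-up observations: two tight pairs and one isolated point.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 5.5), (9.0, 9.0)]
print(agglomerate(points, 3))
```

Stopping at a chosen number of clusters plays the role of the horizontal line on the dendrogram: a lower line corresponds to a larger `n_clusters`.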
Simple: The user first specifies how many clusters the computer should make. The computer then assigns each observation to a random cluster, and the centroid of each cluster is calculated. Observations are then reassigned to the cluster with the closest centroid. This step is repeated until no observation changes cluster, or until the set iteration limit is reached. K-means works well on quantitative variables. Detailed: It is good practice to convert each variable to z-scores, so that the similarity is not dominated by one variable. Without that conversion, the similarity between observations here would be dominated by the income variable, because its values are so much bigger than those of age. A limitation of k-means is that it is not very good with categorical variables. It is good with large data sets. K-means is more visual. We can see that the three clusters are 1. younger with lower income, 2. older with lower income, and 3. older with higher income. The big dots are the centroids of each cluster. A centroid is the average of the observations in its cluster.
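The loop described above (standardize, compute centroids, reassign, repeat) might look like the following sketch. It is an illustration under stated assumptions: the age and income values are made up, the helper names are hypothetical, and a round-robin starting assignment stands in for the random one so that the run is reproducible. It also assumes there are at least k observations and that every variable actually varies.

```python
from math import dist
from statistics import mean, pstdev


def z_scores(rows):
    """Standardize each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds)) for row in rows]


def centroid(rows):
    """Average of a group of observations, one value per variable."""
    return tuple(mean(c) for c in zip(*rows))


def k_means(rows, k, max_iter=100):
    # Assumption: round-robin starting clusters replace the random
    # assignment described in the notes, to keep the result reproducible.
    labels = [i % k for i in range(len(rows))]
    centers = [None] * k
    for _ in range(max_iter):
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties out
                centers[c] = centroid(members)
        new_labels = [min(range(k), key=lambda c: dist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:  # no observation moved: converged
            break
        labels = new_labels
    return labels


# Made-up (age, income) observations.
customers = [(25, 30000), (30, 35000), (28, 32000),
             (55, 90000), (60, 95000), (58, 100000)]
labels = k_means(z_scores(customers), k=2)
print(labels)
```

Skipping `z_scores` and clustering the raw values would let income dominate every distance, which is the problem the note warns about.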
This slide shows how the algorithm continues to make new clusters as we increase the number of clusters the computer should make.
Simple: Classification is a form of predictive data mining, which means we are building models that try to predict an observation’s outcome. There are three common methods for classification: logistic regression, k-nearest neighbors, and classification and regression trees. Detailed: Examples include a bank deciding whether to give a customer a loan by predicting whether the customer would default, or a firm predicting the expenditures of potential customers. The success of the model is determined by the classification error. Class 1 error is the percent of true class 1 observations that were predicted to be class 0. Class 0 error is the percent of true class 0 observations that were predicted to be class 1. The overall error rate is the percent of all observations that were incorrectly classified. When classifying, a data set is divided into a training set and a testing set. The training set is used to build the model, and the testing set is used to test the model’s effectiveness.
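The three error rates defined above can be computed directly from the actual and predicted classes of a testing set. The sketch below uses made-up labels, and the function name `error_rates` is hypothetical.

```python
def error_rates(actual, predicted):
    """Return (class 1 error, class 0 error, overall error) as fractions."""
    pairs = list(zip(actual, predicted))
    ones = [p for a, p in pairs if a == 1]   # predictions for true class 1
    zeros = [p for a, p in pairs if a == 0]  # predictions for true class 0
    class1_error = sum(p == 0 for p in ones) / len(ones)
    class0_error = sum(p == 1 for p in zeros) / len(zeros)
    overall = sum(a != p for a, p in pairs) / len(pairs)
    return class1_error, class0_error, overall


# Made-up labels for ten testing-set observations.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
class1_error, class0_error, overall = error_rates(actual, predicted)
print(class1_error, class0_error, overall)
```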
Simple: Logistic regression attempts to classify a categorical outcome as a function of explanatory variables. In this example, it classifies a movie as winning the Oscar for best picture or not, based on its number of Oscar nominations. Whether a movie is classified as a winner depends on the cutoff value, which we define. Detailed: The value the logistic regression gives is the probability of winning the best picture Oscar. If the calculated probability is over the set cutoff value (50%), the movie is classified as class 1. If not, class 0. In this example, you can see that with 11 nominations and above, a movie would be classified as winning best picture. The model works with the odds of something happening, p/(1-p): the natural log of the odds is set equal to a linear combination of the explanatory variables, and that equation is then solved for p. The final equation is p-hat = 1 / (1 + exp(-(linear equation))).
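The final equation and the cutoff rule can be tried out with a short sketch. The coefficients `b0` and `b1` below are hypothetical, not the fitted model behind the slide; they were picked only so that the 50% boundary falls between 10 and 11 nominations, matching the behavior described above.

```python
from math import exp


def win_probability(nominations, b0=-5.25, b1=0.5):
    """p-hat = 1 / (1 + exp(-(b0 + b1 * x))).

    b0 and b1 are hypothetical coefficients chosen for illustration.
    """
    return 1.0 / (1.0 + exp(-(b0 + b1 * nominations)))


def classify(nominations, cutoff=0.5):
    """Class 1 (wins best picture) if the probability exceeds the cutoff."""
    return 1 if win_probability(nominations) > cutoff else 0


for n in (8, 10, 11, 13):
    print(n, round(win_probability(n), 3), classify(n))
```

Raising `cutoff` makes the model more reluctant to predict a win, which lowers class 0 error at the expense of class 1 error.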
Simple: With k-nearest neighbors, a new observation is compared to its nearest neighbors. The number of neighbors it is compared to is specified by the user. The observation is assigned to the category that most of its neighbors belong to. Detailed: If the percent of neighbors that are class 1 is over the cutoff value, then the new observation is classified as class 1. Otherwise, it will be class 0. The optimal value of k can be found by building models over a typical range of k values (1, ..., 20) and choosing the model with the lowest classification error. This method can be used on categorical and continuous outcomes. When estimating continuous outcomes with k-NN, a new observation’s outcome is the average of the outcomes of its nearest neighbors.
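Both uses of k-NN described above fit in a few lines. This is a sketch with made-up training data and hypothetical function names; it assumes Euclidean distance, which the note does not state explicitly.

```python
from math import dist


def nearest_neighbors(train, new_point, k):
    """The k training rows closest to new_point (Euclidean distance assumed)."""
    return sorted(train, key=lambda row: dist(row[0], new_point))[:k]


def knn_classify(train, new_point, k=3, cutoff=0.5):
    """train holds (features, label) pairs with labels 0 or 1."""
    nearest = nearest_neighbors(train, new_point, k)
    share_class1 = sum(label for _, label in nearest) / k
    return 1 if share_class1 > cutoff else 0


def knn_estimate(train, new_point, k=3):
    """Continuous outcome: average the outcomes of the k nearest neighbors."""
    nearest = nearest_neighbors(train, new_point, k)
    return sum(value for _, value in nearest) / k


# Made-up training observations.
labeled = [((1, 1), 0), ((1, 2), 0), ((2, 1), 0),
           ((6, 6), 1), ((7, 6), 1), ((6, 7), 1)]
valued = [((1, 1), 10.0), ((1, 2), 20.0), ((2, 1), 30.0), ((9, 9), 500.0)]

print(knn_classify(labeled, (6.5, 6.5)))  # neighbors are all class 1
print(knn_classify(labeled, (1.5, 1.5)))  # neighbors are all class 0
print(knn_estimate(valued, (1.2, 1.2)))   # mean of the three nearby outcomes
```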
Simple: CART works by partitioning the data into increasingly smaller and more homogeneous groups. The user specifies the minimum number of observations a group must have before the method considers dividing it. Detailed: The impurity is the measure of heterogeneity in a group of observations’ outcome classes or outcome values. With classification trees, the impurity is based on the proportion of incorrectly classified observations. If all observations in a group are in the same class, there is zero impurity. The impurity in a regression tree is based on the variance of the outcome values of the observations in the group. Once the tree is constructed, the estimated outcome value of a new observation is the mean outcome value of the partition to which it belongs. Like clustering, the different methods here can be combined to build more sophisticated models. The optimal value of the minimum group size can again be identified by building multiple models with different values and choosing the one with the lowest error rate.
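The two impurity measures, and the way a tree picks a single split, can be sketched as follows. This is a simplified illustration with made-up data and hypothetical function names: it searches one variable for one split and leaves out the minimum-group-size rule and the recursive partitioning that a full CART implementation performs.

```python
from statistics import pvariance


def class_impurity(labels):
    """Share of observations outside the group's majority class."""
    if not labels:
        return 0.0
    majority = max(labels.count(0), labels.count(1))
    return 1 - majority / len(labels)


def value_impurity(values):
    """Regression-tree impurity: variance of the outcome values."""
    return pvariance(values)


def best_split(xs, labels):
    """Try each midpoint between sorted x values; return (threshold, impurity)."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, labels) if x <= t]
        right = [y for x, y in zip(xs, labels) if x > t]
        # Weight each side's impurity by its share of the observations.
        score = (len(left) * class_impurity(left)
                 + len(right) * class_impurity(right)) / len(xs)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Made-up data: one explanatory variable and a 0/1 outcome.
xs = [0, 1, 1, 2, 4, 5, 6, 7]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_split(xs, labels))
```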
Simple: We will compare the effectiveness of the 3 methods. This data set contains cell phone usage information for customers and whether or not they cancelled their service. We will try to predict whether a current customer will cancel their service (Class 1) or not (Class 0). Detailed: This data is from a cell phone company, which tracked for each customer the number of weeks they have had their account, whether they had recently renewed a contract, whether they have a data plan, how much monthly data they use, how many customer service calls they’ve made, the average minutes they use per month, the average calls they make in a month, their monthly bill, their largest overage fee in the last 12 months, and their monthly average of roaming minutes. The company wants to predict whether a customer is likely to cancel their service (Class 1) or not (Class 0). I used all three classification methods and compared the classification errors.
Simple: A model’s effectiveness is measured by its error rates. Class 0 error is the percent of customers who did not cancel but whom we predicted would cancel. Class 1 error is the percent of customers who did cancel but whom we predicted would not. CART is the clear winner. Detailed: CART is the clear winner. If we predict that someone will cancel, let’s say we spend $100 advertising to them. If they then don’t cancel, that sets us back $100. If someone cancels and we don’t predict it, we could be losing $1,000 or more per customer. So CART’s much lower class 1 error rate could save the company thousands or millions of dollars.
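The cost argument above is simple arithmetic, sketched here for two models. The dollar amounts come from the note ($100 per unnecessary retention offer, at least $1,000 per missed cancellation); the error counts are hypothetical, since the actual error rates appear only on the slide image.

```python
def misclassification_cost(false_alarms, missed_cancellations,
                           retention_spend=100, lost_customer=1000):
    """Total cost using the per-customer dollar figures from the note."""
    return false_alarms * retention_spend + missed_cancellations * lost_customer


# Hypothetical error counts for two models scored on the same testing set.
cost_a = misclassification_cost(false_alarms=40, missed_cancellations=10)
cost_b = misclassification_cost(false_alarms=10, missed_cancellations=40)
print(cost_a, cost_b)
```

Both hypothetical models make 50 mistakes, but the one with fewer missed cancellations (lower class 1 error) is far cheaper.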
Simple: Optimization is a very widely used method with many applications: telecommunications, manufacturing and transportation, flight crew scheduling, portfolio investments, and marketing techniques. Optimization seeks to maximize or minimize some objective function, like maximizing profits or minimizing costs. Detailed: Optimization can be used for telecommunications, to optimize call routing; manufacturing and transportation, to distribute goods from producers to distributors; crew scheduling, to assign crews to flights; portfolios, to distribute investments among investment funds; marketing, to distribute funds among various marketing techniques; and the traveling salesman problem, to find the shortest route that gets you to every city you need to visit. The main goal of optimization is to maximize or minimize an objective function. The objective function is a business problem represented as a mathematical equation. This objective function is restricted by constraints: for example, a factory can only produce so many units, or it costs a certain amount of money to travel from one location to another. This is linear optimization because each equation is a linear equation.
Simple: This is an optimization problem for Par, Inc., which wants to make golf bags. There are 4 steps in the production process, and each one has a limited total number of hours that can be allotted to it.
Simple: When maximizing an objective function, like a function that calculates profit, there are always constraints that the company is bound by. You can’t have an infinite amount of material or time. These constraints are shown here on the graph as lines. The feasible solutions to the optimization problem are the set of points that satisfy every constraint, known as the feasible region. Detailed: The black circles on the constraint lines are the corners of the feasible region, called extreme points; when an optimal solution exists, one can always be found at an extreme point. When the constraints are put on the graph and there is no feasible region, there is no possible solution to the problem. If this were to happen, you would need to go to management and explain that more resources are needed. If the feasible region is unbounded so that profit could grow without limit, that means you didn’t set up the problem correctly. If a business could maximize its profits infinitely, then that would be a business I would want to work for!
Simple: To maximize the function, the objective function line is slid downward until it first touches a point of the feasible region. That point is the optimized solution. Detailed: It is possible that the objective function is parallel to a side of the feasible region, so that it touches many points at once. This creates multiple optimal solutions, which is useful for the company because it can choose among them.
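Because an optimal solution can be found at an extreme point, a small two-variable problem can be solved by listing the feasible corners and evaluating the objective at each. The sketch below does this for the objective 10S + 9D. The four production-step constraints are illustrative assumptions, since the slides do not list Par, Inc.'s actual hour coefficients or limits; exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a, b, c), meaning a*S + b*D <= c.
# The hour coefficients and limits are illustrative assumptions; the
# slides do not list Par, Inc.'s actual constraints.
constraints = [
    (F(7, 10), F(1), F(630)),     # production step 1
    (F(1, 2), F(5, 6), F(600)),   # production step 2
    (F(1), F(2, 3), F(708)),      # production step 3
    (F(1, 10), F(1, 4), F(135)),  # production step 4
    (F(-1), F(0), F(0)),          # S >= 0
    (F(0), F(-1), F(0)),          # D >= 0
]


def profit(s, d):
    return 10 * s + 9 * d  # objective function from the slide


def extreme_points(cons):
    """Intersect every pair of constraint lines; keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:  # parallel lines never meet
            continue
        s = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        d = (a1 * c2 - a2 * c1) / det
        if all(a * s + b * d <= c for a, b, c in cons):
            points.add((s, d))
    return points


corners = extreme_points(constraints)
best = max(corners, key=lambda p: profit(*p))
print(best, profit(*best))
```

Enumerating corners is only practical for tiny problems; it is shown here because it mirrors the graphical method, not because a solver would work this way.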
The risk is a measurement of the potential to lose money on the investment. The higher the value, the riskier it is to invest in that fund. The customer further specified that 20% - 40% must go to the growth stock fund, 20% - 50% must go to the income fund, and at least 30% must go to the money market fund.
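The allocation bounds above, together with the yields on the J.D. Williams slide, can be turned into a small calculation. This is a sketch under stated assumptions: the notes do not give the risk limit, so the risk constraint is left out entirely, and the money market fund is assumed to have no upper bound. With only a budget and per-fund bounds, filling the highest-yielding funds first is enough, so no LP solver is used. The names `funds` and `allocate` are hypothetical.

```python
from fractions import Fraction as F

BUDGET = 800_000
# name: (yield, minimum share, maximum share).
# Assumption: no upper bound is stated for the money market fund, so 100% is used.
funds = {
    "growth": (F("0.18"), F("0.20"), F("0.40")),
    "income": (F("0.125"), F("0.20"), F("0.50")),
    "money_market": (F("0.075"), F("0.30"), F("1.00")),
}


def allocate(funds):
    """Give every fund its minimum share, then hand the remainder to the
    highest-yielding funds until each reaches its maximum share."""
    shares = {name: lo for name, (_, lo, _) in funds.items()}
    remaining = 1 - sum(shares.values())
    for name in sorted(funds, key=lambda n: funds[n][0], reverse=True):
        extra = min(remaining, funds[name][2] - shares[name])
        shares[name] += extra
        remaining -= extra
    return shares


shares = allocate(funds)
dollars = {name: share * BUDGET for name, share in shares.items()}
yearly_return = sum(funds[name][0] * dollars[name] for name in funds)
print(dollars, yearly_return)
```

Under these assumptions the sketch arrives at the same dollar amounts and yearly return as the solution slide; a version with an explicit risk limit would need a proper linear programming solver.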
The solution is hard to visualize because the problem has more than two variables. And that is the end of my presentation on business analytics. Are there any questions?
