1. The document analyzes marketing data from The Bee Corp to identify top performing markets and products.
2. It finds that California and New York are the most successful markets based on sales and profit. Many states are losing money due to large discounts of 20% or more.
3. Technology and office supplies categories contribute most to profit, while furniture contributes least. Specific profitable products include copiers, phones, and binders.
Using the same Excel file, open the tab labeled "Problem 2". Copy and .pdf
fashiodofashion
Using the same Excel file, open the tab labeled Problem 2. Copy and paste your table from problem [cross-reference missing in source] into the top left corner of the Problem 2 sheet.

Create a matrix of Leontief multipliers in cells U4:X10. Instructions can be found in the appendix to Taylor and Lybbert chapter 3, and in the Zoom video found on the Canvas course page for this assignment. Note that you need to use the MINVERSE command in Excel to create the matrix of multipliers, as described by Taylor and Lybbert and the help video.

You are conducting a policy analysis for the government of the country whose economy is represented by the Leontief multiplier matrix you just calculated. The government would like you to answer the following questions:

Which will have the biggest impact on GDP? An extra dollar of final demand in agriculture, industry, or services?
Which will have the biggest impact on earnings by labor? An extra dollar of final demand in agriculture, industry, or services?
Which sector has the strongest linkages? For example, if you boost final demand for agriculture by a dollar, what is the combined effect on output by industry and services? How does this linkage effect compare to the linkage effects that would be generated by boosting final demand for either of the other two sectors?

Sectors Use Intermediate Inputs and Factors

                          Agriculture   Industry   Services   Final demand   TOTALS
Income to activities
Production sectors
  Agriculture                     375         85         12            811     1283
  Industry                        310        200        115            710     1335
  Services                         85        150         35            299      569
Income to households
Factors                                                                N/A
  Labor                           122        400        300                     822
  Capital (Profits)               165        280         99                     544
Imported inputs                   226        220          8                     454
TOTALS                           1283       1335        569           1820

Calculate GDP using factor cost in cell H9.
Calculate GDP using the value of final goods in cell G12.

Blank blocks on the sheet, each with columns Agriculture, Industry, Services and rows for the production sectors (Agriculture, Industry, Services) and factors (Labor, Capital):
Leontief coefficient matrix (A)
Identity matrix (I)
I-A
Leontief multiplier matrix (I-A)^-1
Other labels on the sheet: Linkages; GDP (sum of changes in factor payments).
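The multiplier calculation described above can be sketched outside Excel as a cross-check. The following is a minimal Python sketch, not the assignment's required method: it assumes only the three production sectors are treated as endogenous (the U4:X10 layout in the assignment may include factor rows as well), and the helper name `invert` and all variable names are illustrative. The inverse is computed with plain Gauss-Jordan elimination in place of Excel's MINVERSE.

```python
def invert(matrix):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I]
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


sectors = ["Agriculture", "Industry", "Services"]
# Rows = supplying sector, columns = using sector (values from the table above).
flows = [[375, 85, 12],
         [310, 200, 115],
         [85, 150, 35]]
final_demand = [811, 710, 299]
labor = [122, 400, 300]
capital = [165, 280, 99]
imports = [226, 220, 8]
output = [1283, 1335, 569]  # column totals = total output per sector
n = len(sectors)

# Leontief coefficients: each column divided by that sector's total output.
A = [[flows[i][j] / output[j] for j in range(n)] for i in range(n)]
I_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
multipliers = invert(I_minus_A)  # (I - A)^-1

# GDP two ways: factor payments, and final demand net of imported inputs.
gdp_factor_cost = sum(labor) + sum(capital)
gdp_final_goods = sum(final_demand) - sum(imports)

# Effect of one extra dollar of final demand in sector j.
labor_effect = [sum(labor[i] / output[i] * multipliers[i][j] for i in range(n))
                for j in range(n)]
gdp_effect = [sum((labor[i] + capital[i]) / output[i] * multipliers[i][j]
                  for i in range(n)) for j in range(n)]
# Linkage: combined output change in the *other* sectors.
linkage = [sum(multipliers[i][j] for i in range(n) if i != j) for j in range(n)]

for j, name in enumerate(sectors):
    print(f"{name:<12} GDP {gdp_effect[j]:.3f}  labor {labor_effect[j]:.3f}  "
          f"linkage {linkage[j]:.3f}")
```

Under these assumptions, comparing the three printed rows is one way to approach the three policy questions; both GDP measures should agree with each other for the table as given.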
Business insights Evaluation of a Telecom client dataset using R AbdulMajedRaja R S
A Telecom Operator has provided us their customer data to analyse and find meaningful insights in business context that can help the company to improve their process and services to their customers. This report summarises all of the statistical findings from the analysis of the Telecom operator’s dataset.
Uncovering the Bangor Region's Competitive Economic Sectors
Stephen Bolduc
An original and important investigation of the economic sectors producing rapid growth in the Bangor region. The presentation proceeds from broad economic sectors to the niche categories at the 6 digit NAICS level.
This PowerPoint contains mock visualizations and data about a company's product portfolio. The insights are presented based on the assumption that the stakeholders have high business acumen and product knowledge.
The two-party systemCheck out this list Thats the .docxrhetttrevannion
The two-party system
Check out this list:
That's the official list (see this list by clicking on "official list") of political parties receiving votes in the 2020 presidential election.
So don't try to tell me the U.S. has a "two-party system." 😀
Nevertheless, we know that lots of people are frustrated with politics in America today, and one common complaint is that the two-party system is at fault. While we have a lot of parties, it would be disingenuous to dismiss complaints about the two-party system, because realistically it's mostly either Republicans or Democrats who have a realistic chance to be elected. That is, of course, not always true. There are currently two U.S. senators who are neither Republican nor Democratic, Angus King, from Maine, and Bernie Sanders, from Vermont. Both of these senators caucus with the Democrats, though, and are generally reliable Democratic votes. There have been numerous third party (or "minor party") candidates who have won elections, including Jesse Ventura winning the race to be governor of Minnesota in 1998. The late Texas billionaire Ross Perot received 19% of the national vote for President of the United States in 1992.
Still, people complaining about the two-party system worry that because of the virtual lock on politics of the two main parties, political views not represented by the two parties – Republicans and Democrats – are excluded from political discourse. Or that having only two parties limits the choices available on election day, and if neither candidate is desirable, there’s no one left to choose.
But there is another view, as you’ve read. This other view is that blaming the two-party system for today’s problems is misguided. This argument says, among other things, that citizens of other countries with multi-party systems are no more satisfied with the state of their politics than Americans are with ours. We blame the two-party system, they say, because we think “the grass is always greener on the other side,” when really it’s not. It is also argued that people unhappy with the two parties think they want a "centrist" party, but when they see the policy platforms of those centrist parties they don't like them as much as they thought they would, so they end up voting for a major party candidate.
There's also the issue that our method of elections is the reason for the two-party system. There's a general principle of political science known as Duverger's Law. It says that in a system with plurality (or first-past-the-post) voting (and that's a pretty good video, by the way), coupled with single member districts, a two-party system is virtually inevitable. Simply declaring "I want a third party!" is almost certainly not going to produce one that can win. To have successful third parties will require a change in the way we vote.
This video advocates "approval voting." "Ranked-choice voting" has alre.
Dynamics gp insights to distribution - sales orders
Steve Chapman
Dynamics GP includes integrated distribution functionality that makes it easy to control inventory and efficiently process purchases and customer orders. This document includes tips and tricks that you might not ordinarily find or know about.
INTRODUCTION TO CaseWare IDEAProvided by Audimation Services, .docxnormanibarber20063
INTRODUCTION TO CaseWare IDEA
Provided by Audimation Services, Inc. & the IDEA Academic Partnership Program
What Is IDEA?
CaseWare IDEA is a CAAT (Computer Assisted Audit Tool) designed by auditors for auditors (and other data analysts). IDEA allows auditors to analyze 100% of the data, as opposed to the traditional 10%. IDEA is a user-friendly tool that makes data mining and data analysis easy and efficient.
History of IDEA
IDEA is a data analysis tool that was originally created in Canada by the Canadian Institute of Chartered Accountants (CICA) in 1987 and is now developed by CaseWare IDEA. IDEA is available in 16 languages and distributed in over 90 countries. Originally created by auditors for auditors, IDEA is user-friendly with an intuitive user interface. IDEA has been distributed in the U.S. since 1992 by Audimation Services, Inc., which is located in Houston, Texas.
Who Uses IDEA?
Big 4
More than 80% of Top 100 CPA Firms in U.S.
Fortune 500 Companies
Government Agencies - Federal, State & Local (including universities)
More than 150,000 Companies Globally
The IDEA Process
Let’s Get Started!
Stages of Using IDEA
Consider Audit Objectives
Determine How IDEA is Appropriate for the Audit
Specify the Data Required
Arrange Download of the Data
Utilize IDEA
Review and Housekeeping
Create a Project Folder
IDEA facilitates organization of your work through Managed Projects.
Before creating a project, be sure to copy the data files from the CD (or network) into the Source Files folder in the Library for this project.
Example –
The following data files have been provided with our Version Nine Workbook:
ACCPAY2012.TXT – Accounts Payable History File
SUPPLIER.XLS – Authorized Suppliers Excel worksheet
Copy and paste, or drag and drop into the Source Files folder for importing efficiency.
Create a Project Folder (cont.)
After starting IDEA, you are able to create your project with the following procedure:
On the Home tab, in the Projects group, click Create.
When the Create Project dialog box appears, type the name of your project (for this instance: Accounts Payables) in the Managed project section next to Project name and click OK.
The newly created project will remain active until the Project Folder is changed.
Importing the Data Files
To import the files for testing, access the Import Assistant by clicking the Desktop button in the Import section of the menu ribbon.
Once loaded, the Import Assistant guides you through the process of importing the data.
For efficiency, add the needed data files to the Source Files folder in the Library prior to importing. To achieve this, right-click on the Source Fields.
Text Example –
To import ACCPAY2012.TXT (an ASCII Delimited file), select Text and click the Browse button to navigate to and select the file from the Project Folder.
Click Open on the Select File dialog box.
Click Next.
Once the data file has been selected, the Import Assistant wi.
Chapter 2 Graphical Descriptions of Data 25 Chapter 2.docxcravennichole326
Chapter 2: Graphical Descriptions of Data
In chapter 1, you were introduced to the concept of a population, which again is a
collection of all the measurements from the individuals of interest. Remember, in most
cases you can’t collect the entire population, so you have to take a sample. Thus, you
collect data either through a sample or a census. Now you have a large number of data
values. What can you do with them? No one likes to look at just a set of numbers. One
thing is to organize the data into a table or graph. Ultimately though, you want to be able
to use that graph to interpret the data, to describe the distribution of the data set, and to
explore different characteristics of the data. The characteristics that will be discussed in
this chapter and the next chapter are:
1. Center: middle of the data set, also known as the average.
2. Variation: how much the data varies.
3. Distribution: shape of the data (symmetric, uniform, or skewed).
4. Qualitative data: analysis of the data
5. Outliers: data values that are far from the majority of the data.
6. Time: changing characteristics of the data over time.
This chapter will focus mostly on using the graphs to understand aspects of the data, and
not as much on how to create the graphs. There is technology that will create most of the
graphs, though it is important for you to understand the basics of how to create them.
Section 2.1: Qualitative Data
Remember, qualitative data are words describing a characteristic of the individual. There
are several different graphs that are used for qualitative data. These graphs include bar
graphs, Pareto charts, and pie charts.
Pie charts and bar graphs are the most common ways of displaying qualitative data. A
spreadsheet program like Excel can make both of them. The first step for either graph is
to make a frequency or relative frequency table. A frequency table is a summary of
the data with counts of how often a data value (or category) occurs.
Example #2.1.1: Creating a Frequency Table
Suppose you have the following data for which type of car students at a college
drive.
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
A listing of data is too hard to look at and analyze, so you need to summarize it.
First you need to decide the categories. In this case it is relatively easy; just use
the car type. However, there are several car types that appear only once in the list. In
that case it is easier to make a category called other for the ones with low values.
Now ...
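The frequency table described in Example #2.1.1 can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original chapter: it assumes that "low values" means car types that appear exactly once, and the variable names are hypothetical.

```python
from collections import Counter

# Car data from Example #2.1.1.
raw = (
    "Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota, "
    "Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy, "
    "Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda, "
    "Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan, "
    "Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan, "
    "Honda, Toyota, Toyota, Nissan"
)
cars = [name.strip() for name in raw.split(",")]

counts = Counter(cars)

# Assumption: a car type seen only once is folded into the "Other" category.
table = {}
for car, count in counts.items():
    category = car if count > 1 else "Other"
    table[category] = table.get(category, 0) + count

total = sum(table.values())

# Print frequency and relative frequency for each category.
print(f"{'Category':<10}{'Frequency':>10}{'Relative':>10}")
for category, frequency in table.items():
    print(f"{category:<10}{frequency:>10}{frequency / total:>10.2f}")
```

The relative frequency column is each count divided by the total number of data values, which is the input a pie chart or bar graph would use.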
You can use a calculator to do numerical calculations. No graphing.docxjeffevans62972
You can use a calculator to do numerical calculations. No graphing calculator is allowed. Please DO NOT USE ANY COMPUTER SOFTWARE to solve the problems.
1. (a) What is an assignment problem? Briefly discuss the decision variables, the objective function and constraint requirements in an assignment problem. Give a real world example of the assignment problem.
(b) What is a diet problem? Briefly discuss the objective function and constraint requirements in a diet problem. Give a real world example of a diet problem.
(c) What are the differences between QM for Windows and Excel when solving a linear programming problem? Which one do you like better? Why?
(d) What are the dual prices? In what range are they valid? Why are they useful in making recommendations to the decision maker? Give a real world example.
Answer Questions 2 and 3 based on the following LP problem.
Let P1 = number of Product 1 to be produced
P2 = number of Product 2 to be produced
P3 = number of Product 3 to be produced
P4 = number of Product 4 to be produced
Maximize 80P1 + 100P2 + 120P3 + 70P4 Total profit
Subject to
10P1 + 12P2 + 10P3 + 8P4 ≤ 3200 Production budget constraint
4P1 + 3P2 + 2P3 + 3P4 ≤ 1000 Labor hours constraint
5P1 + 4P2 + 3P3 + 3P4 ≤ 1200 Material constraint
P1 ≥ 100 Minimum quantity needed for Product 1 constraint
And P1, P2, P3, P4 ≥ 0 Non-negativity constraints
The QM for Windows output for this problem is given below.
Linear Programming Results:
Variable             Status       Value
P1                   Basic          100
P2                   NONBasic         0
P3                   Basic          220
P4                   NONBasic         0
slack 1              NONBasic         0
slack 2              Basic          160
slack 3              Basic           40
surplus 4            NONBasic         0
Optimal Value (Z)                 34400
Original problem w/answers:
                 P1     P2     P3     P4            RHS    Dual
Maximize         80    100    120     70
Constraint 1     10     12     10      8    <=     3200      12
Constraint 2      4      3      2      3    <=     1000       0
Constraint 3      5      4      3      3    <=     1200       0
Constraint 4      1      0      0      0    >=      100     -40
Solution ->     100      0    220      0    Optimal Z -> 34400
Ranging Results:
Variable        Value   Reduced Cost   Original Val   Lower Bound   Upper Bound
P1                100              0             80     -Infinity           120
P2                  0             44            100     -Infinity           144
P3                220              0            120          87.5      Infinity
P4                  0             26             70     -Infinity            96

Constraint      Dual Value   Slack/Surplus   Original Val   Lower Bound   Upper Bound
Constraint 1            12               0           3200          1000      3333.333
Constraint 2             0             160           1000           840      Infinity
Constraint 3             0              40           1200          1160      Infinity
Constraint 4           -40               0            100             0           120
2. (a) Determine the optimal solution and optimal value and interpret their meanings.
(b) Determine the slack (or surplus) value for each constraint and interpret its meaning.
3. (a) What are the ranges of optimali.
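As a reading aid for the QM for Windows output above, the reported solution can be checked by hand arithmetic; the short Python sketch below only mirrors that arithmetic and does not solve the LP. The short constraint names and the helper `dot` are illustrative, not part of the original problem.

```python
# Objective coefficients and constraints from the LP formulation above.
profit = [80, 100, 120, 70]
constraints = [
    ("Production budget", [10, 12, 10, 8], "<=", 3200),
    ("Labor hours",       [4, 3, 2, 3],    "<=", 1000),
    ("Material",          [5, 4, 3, 3],    "<=", 1200),
    ("Minimum Product 1", [1, 0, 0, 0],    ">=", 100),
]

# Solution reported in the output: P1 = 100, P2 = 0, P3 = 220, P4 = 0.
solution = [100, 0, 220, 0]


def dot(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


objective = dot(profit, solution)

# Slack for a <= constraint is RHS minus usage; surplus for >= is usage minus RHS.
slack = {}
for name, coefficients, sense, rhs in constraints:
    used = dot(coefficients, solution)
    slack[name] = rhs - used if sense == "<=" else used - rhs

feasible = all(value >= 0 for value in slack.values())

print("Objective value:", objective)
for name, value in slack.items():
    print(f"{name:<20} slack/surplus = {value}")
print("Feasible:", feasible)
```

A slack of zero marks a binding constraint, which is also where the output reports a nonzero dual value.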
Project Management CaseYou are working for a large, apparel desi.docxbriancrawford30935
Project Management Case
You are working for a large apparel design and manufacturing company, Trillo Apparel Company (TAC), headquartered in Albuquerque, New Mexico. TAC employs around 3000 people and has remained profitable through tough economic times. The operations are divided into 4 districts: District 1 – North, District 2 – South, District 3 – West and District 4 – East. The company sets strategic goals at the beginning of each year and operates with priorities to reach those goals.
Trillo Apparel Company Current Year Priorities
Increase Sales and Distribution in the East
Improve Product Quality
Improve Production in District 4
Increase Brand Recognition
Increase Revenues
Company Details
Company Name: Trillo Apparel Company (TAC)
Company Type: Apparel design and production
Company Size: 3000 employees
Position                     # Employees
Owner/CEO                              1
Vice President                         4
Chief Operating Officer                1
Chief Financial Officer                1
Chief Information Officer              1
IT Department                         38
District Manager                       4
Sales Team                            30
Accountant                            12
Administrative Assistant               7
Order Fulfillment                     45
Customer Service                      57
Designer                              24
Project Manager                       10
Maintenance                           25
Operations                          2500
Shipping Department                  240
Total Employees                     3000
Products: Various Apparel
Corporate Location: Albuquerque, New Mexico
TAC Organization Chart
District 4 Production Warehouse Move Project Details
The business has expanded considerably over the past few years and District 4 in the East has outgrown its current production facility. Because of this growth the executives want to expand the current facility, moving the whole facility 10 miles away. The location selected has enough room for the production and the shipping department. However, the current warehouse needs some renovation to accommodate the district’s operational needs.
The VP of Operations estimates that the production and shipping warehouse move for District 4 will provide the room required to generate an additional $1 million/year in product revenues and, through the expanded production capacity, meet the current demand. Daily production generates $50,000 in revenue, so a week of downtime will cost $250,000 in lost revenues.
The move must be completed in 4 months.
Mileage between the old and new facilities is 10 miles.
Bids have been received from contractors to build out the new office space and production floor, and contracts have been signed for the work as follows:
Activity                                       Company Providing Services     Total Contract   Supplies   Time Needed
Pack, move and unpack production equipment     City Equipment Movers          $150,000         n/a        5 Days
Move non-production equipment and materials    Express Moving Company         $125,000         n/a        5 Days
Framing                                        East Side Framing & Drywall    $121,000         $125,000   15 Days
Electrical                                     Sparks Electrical              $18,000          $12,000    10 Days
Plumbing                                       Waterworks Plumbing            $15,000          $13,000    10 Days
Drywall                                        East Side Framing & Drywall    $121,000         $18,000    15 Days
Finish Work                                    Woodcraft Carpentry            $115,000         $15,000    15 Days
Build work benches for production floor        Student Workers Carpentry      $112,000         $110,000   15 Days
Product.
Case Study on Data Analytics with given Dataset (Biswadeep Ghosh Hazra) - [Ha...Biswadeep Ghosh Hazra
I had to analyze and visualize the given data and come up with answers to the questions asked in the case study competition. The only tools allowed were Excel, Tableau, and Power BI, and I used the first two for coming up with the answers
Our Stock Pitch For Mailing Shipping Services PowerPoint Presentation PPT Slide Template is the perfect way to pitch your stock. We have researched thousands of stock pitches and designed the most impactful way to convince your investors to invest in your equity. https://bit.ly/3eZUTut
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
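The idea in the abstract can be illustrated with a small Python sketch. This is a simplified, single-threaded illustration under stated assumptions, not the report's implementation: it processes strongly connected components one at a time in topological order (rather than grouping them into levels), uses Tarjan's algorithm for the components, and rejects graphs with dead ends, matching the stated precondition. All function names are hypothetical.

```python
def tarjan_scc(graph):
    """Return strongly connected components in reverse topological order."""
    index, low, stack, on_stack, comps = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            comps.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return comps


def _setup(graph):
    """Check the no-dead-end precondition and build in-edge lists and initial ranks."""
    if any(len(outs) == 0 for outs in graph.values()):
        raise ValueError("graph has dead ends; this sketch assumes none")
    in_edges = {v: [] for v in graph}
    for u, outs in graph.items():
        for v in outs:
            in_edges[v].append(u)
    return in_edges, {v: 1.0 / len(graph) for v in graph}


def _iterate(graph, in_edges, rank, vertices, damping, tol, max_iter):
    """Update ranks of `vertices` only, until their total change drops below tol."""
    n = len(graph)
    for _ in range(max_iter):
        new = {v: (1 - damping) / n
                  + damping * sum(rank[u] / len(graph[u]) for u in in_edges[v])
               for v in vertices}
        delta = sum(abs(new[v] - rank[v]) for v in vertices)
        rank.update(new)
        if delta < tol:
            break


def monolithic_pagerank(graph, damping=0.85, tol=1e-12, max_iter=1000):
    """Standard approach: every vertex is processed in every iteration."""
    in_edges, rank = _setup(graph)
    _iterate(graph, in_edges, rank, list(graph), damping, tol, max_iter)
    return rank


def levelwise_pagerank(graph, damping=0.85, tol=1e-12, max_iter=1000):
    """Converge one component at a time, upstream components first."""
    in_edges, rank = _setup(graph)
    for comp in reversed(tarjan_scc(graph)):  # reversed -> topological order
        _iterate(graph, in_edges, rank, comp, damping, tol, max_iter)
    return rank


# Hypothetical example graph with two components: {0, 1} feeds {2, 3}.
example = {0: [1], 1: [0, 2], 2: [3], 3: [2]}
level_ranks = levelwise_pagerank(example)
mono_ranks = monolithic_pagerank(example)
```

Because each component depends only on components earlier in the topological order, its ranks can be finalized before later components are touched, which is what removes the need for per-iteration communication in a distributed setting.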
Business insights Evaluation of a Telecom client dataset using R AbdulMajedRaja R S
A Telecom Operator has provided us their customer data to analyse and find meaningful insights in business context that can help the company to improve their process and services to their customers. This report summarises all of the statistical findings from the analysis of the Telecom operator’s dataset.
Uncovering the Bangor Region's Competitive Economic SectorsStephen Bolduc
An original and important investigation of the economic sectors producing rapid growth in the Bangor region. The presentation proceeds from broad economic sectors to the niche categories at the 6 digit NAICS level.
This PowerPoint contains mock visualizations and data about a company's product portfolio. The insights are presented based on the assumption that the stakeholders have high business acumen and product knowledge.
The two-party systemCheck out this list Thats the .docxrhetttrevannion
The two-party system
Check out this list:
That's the
official list (see this list by clicking on
official list ) of political parties receiving votes in the 2020 presidential election.
So don't try to tell me the U.S. has a "two-party system." 😀
Nevertheless, we know that lots of people are frustrated with politics in America today, and one common complaint is that the two-party system is at fault. While we have a lot of parties, it would be disingenuous to dismiss complaints about the two-party system, because realistically it's mostly either Republicans or Democrats who have a realistic chance to be elected. That is, of course, not always true. There are currently two U.S. senators who are neither Republican nor Democratic, Angus King, from Maine, and Bernie Sanders, from Vermont. Both of these senators caucus with the Democrats, though, and are generally reliable Democratic votes. There have been numerous third party (or "minor party") candidates who have won elections, including Jesse Ventura winning the race to be governor of Minnesota in 1998. The late Texas billionaire Ross Perot received 19% of the national vote for President of the United States in 1992.
Still, people complaining about the two-party system worry that because of the virtual lock on politics of the two main parties, political views not represented by the two parties – Republicans and Democrats – are excluded from political discourse. Or that having only two parties limits the choices available on election day, and if neither candidate is desirable, there’s no one left to choose.
But there is another view, as you’ve read. This other view is that blaming the two-party system for today’s problems is misguided. This argument says, among other things, that citizens of other countries with multi-party systems are no more satisfied with the state of their politics than Americans are with ours. We blame the two-party system, they say, because we think “the grass is always greener on the other side,” when really it’s not. It is also argued that people unhappy with the two parties think they want a "centrist" party, but when they see the policy platforms of those centrist parties they don't like them as much as they thought they would, so they end up voting for a major party candidate.
There's also the issue that our method of elections is the reason for the two-party system. There's a general principle of political science known as
Duverger's Law. It says that in a system with
plurality (or first-past-the-post) voting (and that's a pretty good video, by the way), coupled with single member districts, a two-party system is virtually inevitable. Simply declaring "I want a third party!" is almost certainly not going to get produce one that can win. To have successful third parties will require a change in the way we vote.
This video advocates "approval voting." "Ranked-choice voting" has alre.
Dynamics gp insights to distribution - sales ordersSteve Chapman
Dynamics GP includes integrated distribution functionality that makes it easy to control inventory and efficiently process purchases and customer orders. This document includes tips and tricks that you might not ordinarily find or know about.
INTRODUCTION TO CaseWare IDEAProvided by Audimation Services, .docxnormanibarber20063
INTRODUCTION TO CaseWare IDEA
Provided by Audimation Services, Inc. & the IDEA Academic Partnership Program
1
What Is IDEA?
CaseWare IDEA is a CAAT(Computer Assisted Audit Tool) designed by auditors for auditors (and other data analysts). IDEA allows auditors to analyze 100% of the data, as opposed to the traditional 10%. IDEA is a user-friendly tool that makes data mining and data analysis easy and efficient.
History of IDEA
IDEA is a data analysis tool that was originally created in Canada by the Canadian Institute of Chartered Accountants (CICA) in 1987 and is now developed by CaseWare IDEA. IDEA is available in 16 languages and distributed in over 90 countries. Originally created by auditors for auditors, IDEA is user-friendly with an intuitive user interface. IDEA has been distributed in the U.S. by Audimation Services, Inc since 1992 and is located in Houston, Texas.
3
Who Uses IDEA?
Big 4
More than 80% of Top 100 CPA Firms in U.S.
Fortune 500 Companies
Government Agencies - Federal, State & Local (including universities)
More than 150,000 Companies Globally
The IDEA Process
Let’s Get Started!
Stages of Using IDEA
Consider Audit Objectives
Determine How IDEA is Appropriate for the Audit
Specify the Data Required
Arrange Download of the Data
Utilize IDEA
Review and Housekeeping
Create a Project Folder
IDEA facilitates organization of your work through Managed Projects.
Before doing so, be sure to copy the data files from CD (or network) into the Source Files folder in the Library for this project.
Example –
The following data files have been provided with our Version Nine Workbook:
ACCPAY2012.TXT – Accounts Payable History File
SUPPLIER.XLS – Authorized Suppliers Excel worksheet
Copy and paste, or drag and drop into the Source Files folder for importing efficiency.
Create a Project Folder (cont.)
After starting IDEA, you are able to create your project with the following procedure:
On the Home tab, in the Projects group, click Create.
When the Create Project dialog box appears, type the name of your project (for this instance: Accounts Payables) in the Managed project section next to Project name and click OK.
The newly created project will remain active until the Project Folder is changed.
Importing the Data Files
To import the files for testing, access the Import Assistant by clicking Desktop button in the Import section of the menu ribbon.
Once loaded, the Import Assistant guides you through the process of importing the data.
For efficiency, add the needed data files to the Source Files folder in the Library prior to importing. To achieve this, right-click on the Source Fields.
Text Example –
To import ACCPAY2012.TXT (an ASCII Delimited file), select Text and click Browse button to navigate to and select the file from the Project Folder.
Click Open on the Select File dialog box.
Click Next.
Once the data file has been selected, the Import Assistant wi.
Chapter 2 Graphical Descriptions of Data 25 Chapter 2.docxcravennichole326
Chapter 2: Graphical Descriptions of Data
25
Chapter 2: Graphical Descriptions of Data
Marketing Analysis for The Bee Corp
Qingyang (Kevin) Liu
Email: tug14939@temple.edu
June 22, 2017
1 Introduction of the dataset
The original file, Quant Round.xlsx, contains three sheets. Base R cannot read the xlsx format without
additional packages, so I converted Quant Round.xlsx to Quant Round.csv and kept only the first sheet: the csv
format is far more widely supported, and the first sheet of Quant Round.xlsx contains all the information
we need.
The import step, using R's read.csv command, is shown below:
> df1 <- read.csv(file =
+ "/home/kevin/Desktop/The Bee Corp/Quant Round.csv",
+ header = T)
> dim(df1)
[1] 9994 22
Quant Round.csv has been imported into R as the data frame df1, which contains 9994 rows and 22 variables.
The important variables are summarized below.
Row.ID: The primary key of this dataset. It is unique for each row.
Order.ID: The order identifier. It does not have to be unique: one order may span multiple rows
(one order may contain different products). There are 5009 distinct orders in df1.
Order.Date: The date on which the order was created or submitted. Order.Date was stored in numeric format. I
converted it to yyyy-mm-dd format, assuming the origin date is "1900-01-01".
Ship.Date: The date on which the order was shipped. Also stored in numeric format. I converted it
to yyyy-mm-dd format, assuming the origin date is "1900-01-01".
Ship.Mode: There are four ship modes: Same Day, First Class, Standard Class and Second Class.
Customer.ID: The customer identifier. Each customer has one unique ID.
Segment: There are three segments in this dataset: Customer, Corporate and Home Office.
(Corporate␣ has been corrected to Corporate)
Country: All orders were shipped within the United States.
City: There are 531 different cities in this dataset.
State: The dataset covers the 48 contiguous U.S. states and the District of Columbia.
(CAL␣ has been corrected to California. IND␣ has been corrected to Indiana)
Region: There are five regions in this dataset: Central, East, North, South and West. The original dataset
contains a few mistakes. For example, 37 Florida records are categorized as North
region.
Product.ID: The product identifier. Each product has one unique ID.
Category: Every product belongs to one of three categories: Furniture, Office Supplies and Technology.
Sub.Category: The relationship between Sub.Category and Category is shown in Table 1.1.
Each Sub.Category belongs to exactly one Category.
Table 1.1: Sub.Category (in rows) and Category (in columns)
             Furniture  Office Supplies  Technology
Accessories          0                0         775
Appliances           0              466           0
Art                  0              796           0
Binders              0             1523           0
Bookcases          228                0           0
Chairs             617                0           0
Copiers              0                0          68
Envelopes            0              254           0
Fasteners            0              217           0
Furnishings        957                0           0
Labels               0              364           0
Machines             0                0         115
Paper                0             1370           0
Phones               0                0         889
Storage              0              846           0
Supplies             0              190           0
Tables             319                0           0
SalesTotal: SalesTotal = Item.Price × Quantity, where Item.Price is the price after discount.
Profit: A positive number is a profit. A negative number is a deficit.
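The numeric-to-date conversion described above for Order.Date and Ship.Date can be sketched as follows. The serial numbers are made-up values, not taken from Quant Round.csv. The first call uses the origin assumed in this report. The second uses "1899-12-30", the origin that R's as.Date documentation gives for Excel on Windows, so the two results can be checked against a few known order dates before settling on one.

```r
# Hypothetical serial numbers, standing in for the numeric Order.Date column.
serial <- c(41000, 41001, 42005)

# Conversion with the origin assumed in this report.
order_date_report <- as.Date(serial, origin = "1900-01-01")

# Conversion with the origin R's documentation suggests for Windows Excel.
order_date_excel <- as.Date(serial, origin = "1899-12-30")

# Both results are Date objects; format() gives the yyyy-mm-dd strings.
print(format(order_date_report, "%Y-%m-%d"))
print(format(order_date_excel, "%Y-%m-%d"))

# The two origins differ by a constant number of days.
print(as.numeric(order_date_report - order_date_excel))
```

Whichever origin is chosen, the same one should be applied to both Order.Date and Ship.Date so that shipping delays stay correct.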
2 Sales/Profit by Region
Figure 2.1: Maps of Sales and Profit in State Level
[Two state-level maps: "Total Sales in State Level" (color scale from 0 to 450000) and "Profit in State Level" (color scale from -20000 to 80000).]
Table 2.1: Inconsistent definition of regions (part)
> head(table(df1$State,df1$Region),10)
                       Central East North South West
  Alabama                    0    0     1    60    0
  Arizona                    0    0     0     0  224
  Arkansas                   0    0     1    59    0
  California                 0    0     0     0 2001
  Colorado                   0    0     0     0  182
  Connecticut                0   82     0     0    0
  Delaware                   0   96     0     0    0
  District of Columbia       0   10     0     0    0
  Florida                    0    0    37   346    0
  Georgia                    0    0     6   178    0
Table 2.2: Top 4 States by Sales
> head(df2[order(-df2$sales.ratio),],4)
        state    sales    profit sales.ratio
4  california 457687.6  76381.39  0.19923710
31   new york 310876.3  74038.55  0.13532829
42      texas 170188.0 -25729.36  0.07408497
46 washington 138641.3  33402.65  0.06035226
Table 2.3: Top 4 States by Profit
> head(df2[order(-df2$profit),],4)
        state     sales   profit sales.ratio
4  california 457687.63 76381.39  0.19923710
31   new york 310876.27 74038.55  0.13532829
46 washington 138641.27 33402.65  0.06035226
21   michigan  76269.61 24463.19  0.03320111
Table 2.4: Discount in Texas
> table(df1[df1$State == "Texas","Discount"])
 0.2  0.3 0.32  0.4  0.6  0.8
 570   94   27   13   81  200
The assignment of states to regions in this dataset is inconsistent. According to Table 2.1, 37 Florida records
and 1 Alabama record are assigned to the North region. In total, more than 100 records are assigned to the wrong
region. In practice, we would need to confirm the definition of each region with a supervisor. For this report,
quantitative marketing analysis based on regions is skipped.
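The consistency check behind Table 2.1 can be sketched as below. The data frame toy is a hypothetical stand-in for the State and Region columns of df1. Treating each state's most frequent region as the intended one is an assumption of this sketch, not a rule stated in the dataset.

```r
# Toy data (hypothetical), mimicking the State and Region columns of df1.
toy <- data.frame(
  State  = c("Florida", "Florida", "Florida", "Alabama", "Alabama", "Arizona"),
  Region = c("South", "South", "North", "South", "North", "West"),
  stringsAsFactors = FALSE
)

# Cross-tabulate states against regions, as in Table 2.1.
tab <- table(toy$State, toy$Region)

# Number of distinct regions in which each state appears.
regions_per_state <- rowSums(tab > 0)

# States that are assigned to more than one region.
inconsistent <- names(regions_per_state[regions_per_state > 1])

# Assumption: the most frequent region of a state is the intended one.
# Records outside that region are counted as misfiled.
misfiled <- rowSums(tab) - apply(tab, 1, max)

print(inconsistent)
print(sum(misfiled))
```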
According to Table 2.2 and Table 2.3, California is the company's largest market and New York State is the
second largest, whether ranked by sales or by profit. The performance of the Texas market is mixed.
By sales, Texas is the company's third largest market. However, the company lost
$25,729.36 there. Table 2.4 shows that the company discounts heavily in Texas:
every record sold in Texas carries a discount of at least 20%, and 200 records carry an 80% discount.
Furthermore, Table 2.5 and Table 2.6 show that every sale in the deficit markets carries a discount of at least 20%.
Discounting is an important reason for the deficits in those markets. We need to discuss the reason for the heavy
discounting with the business manager. It could be a market penetration strategy, or those products may be too difficult to sell.
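The link between deficit markets and discounts can be checked with a few lines of base R. This is a minimal sketch on hypothetical data: the data frame toy and its values are invented for illustration and only mimic the State, Discount and Profit columns of df1.

```r
# Toy data (hypothetical), mimicking State, Discount and Profit in df1.
toy <- data.frame(
  State    = c("Texas", "Texas", "Texas", "Ohio", "Ohio", "Michigan", "Michigan"),
  Discount = c(0.2, 0.8, 0.3, 0.2, 0.7, 0, 0.1),
  Profit   = c(10, -120, -15, 5, -60, 40, 25),
  stringsAsFactors = FALSE
)

# Total profit per state.
profit_by_state <- aggregate(Profit ~ State, data = toy, FUN = sum)

# States whose total profit is negative (the deficit markets).
deficit_states <- as.character(profit_by_state$State[profit_by_state$Profit < 0])

# Smallest discount applied in each deficit market.
min_discount <- tapply(toy$Discount, toy$State, min)[deficit_states]

print(profit_by_state)
print(min_discount)
```

If every entry of min_discount is 0.2 or higher, no record in a deficit market was sold with less than a 20% discount, which is the pattern Table 2.6 shows for the real data.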
Table 2.5: Deficits Markets in States Level
> df3[df3$Profit < 0,]
            State     Profit SalesTotal Profit.Sales.Ratio
40         Oregon  -1190.470   17431.15        -0.06829558
41        Florida  -3399.302   89473.71        -0.03799219
42        Arizona  -3427.925   35282.00        -0.09715789
43      Tennessee  -5341.694   30661.87        -0.17421289
44       Colorado  -6527.858   32108.12        -0.20330864
45 North Carolina  -7490.912   55603.16        -0.13472097
46       Illinois -12607.887   80166.10        -0.15727205
47   Pennsylvania -15559.960  116511.91        -0.13354823
48           Ohio -16971.377   78258.14        -0.21686405
49          Texas -25729.356  170188.05        -0.15118192
Table 2.6: Discount in Deficits Markets
> tab1 <- table(df1[,c("State","Discount")])
> tab1[as.character(df3[df3$Profit < 0,"State"]),]
                Discount
State               0  0.1 0.15  0.2  0.3 0.32  0.4 0.45  0.5  0.6  0.7  0.8
  Oregon            0    0    0  100    0    0    0    0    5    0   19    0
  Florida           0    0    0  299    0    0    0   11    6    0   67    0
  Arizona           0    0    0  174    0    0    0    0    9    0   41    0
  Tennessee         0    0    0  144    0    0    8    0    2    0   29    0
  Colorado          0    0    0  138    0    0    0    0    4    0   40    0
  North Carolina    0    0    0  201    0    0    8    0    4    0   36    0
  Illinois          0    0    0  264   53    0    0    0   18   57    0  100
  Pennsylvania      0    0    0  354   36    0   82    0   10    0  105    0
  Ohio              0    0    0  290   23    0   67    0    8    0   81    0
  Texas             0    0    0  570   94   27   13    0    0   81    0  200
Conclusion:
1. California and New York State are the two most successful markets, by either profit or sales.
2. The company is losing money in Texas, Ohio and several other states because of large discounts.
3. The regions are ill-defined, so no conclusion is drawn from them.
3 Profit by Category/Subcategory/Specific Product
According to Table 3.1, Technology and Office Supplies account for 50.79% and 42.77% of the company's total
profit, respectively. Furniture products contribute only 6.44% of total
profit.
Table 3.1: Profit by Category
> library(plyr)  # provides ddply, colwise and arrange
> df4 <- ddply(df1[,c("Category","Profit")],.(Category),colwise(sum))
> df4 <- arrange(df4,-df4$Profit)
> df4$percent <- round(df4$Profit/sum(df4$Profit)*100,2)
> df4
         Category    Profit percent
1      Technology 145454.95   50.79
2 Office Supplies 122490.80   42.77
3       Furniture  18451.27    6.44
Figure 3.1: Profit by Category/Subcategory
[Horizontal bar chart titled "Profit by Category/Subcategory": one bar per Sub.Category, grouped by Category (Furniture, Office Supplies, Technology); horizontal axis Profit/Deficit from -20000 to 60000; legend: Profit, Deficit.]
Table 3.2: Most Profitable Products
Category         Sub.Category  Product.Name                                                                 Total.Quantity  Total_Profit  Average.Term.Price  Max.Item.Price  Min.Item.Price
Technology       Copiers       Canon imageCLASS 2200 Advanced Copier                                                    20      25199.93            1259.996         3499.99        2099.994
Office Supplies  Binders       Fellowes PB500 Electric Punch Plastic Comb Binding Machine with Manual Bind              31      7753.039             250.098         1270.99         254.198
Technology       Copiers       Hewlett Packard LaserJet 3310 Copier                                                     38      6983.884            183.7864          599.99         359.994
Technology       Copiers       Canon PC1060 Personal Laser Copier                                                       19      4570.935            240.5755        10559.99         559.992
Technology       Machines      HP Designjet T520 Inkjet Large Format Printer - 24" Color                                12      4094.977            341.2481         1749.99         874.995
Technology       Machines      Ativa V4110MDD Micro-Cut Shredder                                                        11      3772.946            342.9951          699.99          699.99
Figure 3.1 shows that all four Technology sub-categories are profitable, and that Copiers,
Phones and Accessories each earn more than $40,000 for the company. All Office Supplies sub-categories
except Supplies are profitable. Among the Furniture sub-categories, Chairs and Furnishings are
profitable, while Bookcases and Tables run a deficit.
According to Table 3.2, the most profitable product is the Canon imageCLASS 2200 Advanced Copier, a Technology
product in the Copiers sub-category. However, the Item.Price of the Canon PC1060 Personal Laser Copier is doubtful.
Its Max.Item.Price is $10,559.99 while its Min.Item.Price is $559.992. The difference is too
large for a copier, and I suspect it was caused by a typo. I discuss such large differences between maximum
item price and minimum item price in Section 5.
Table 3.3: Details of Canon PC1060 Personal Laser Copier's transactions
Product_Name                        Item_Price  Quantity  Discount
Canon PC1060 Personal Laser Copier     559.992         2       0.2
Canon PC1060 Personal Laser Copier   10559.992         5       0.2
Canon PC1060 Personal Laser Copier     559.992         5       0.2
Canon PC1060 Personal Laser Copier      699.99         7         0
Conclusion:
1. Technology sub-categories such as Copiers, Phones and Accessories are highly profitable.
2. Furniture products generally perform poorly: they either make little profit or
lose money for the company.
3. The most profitable product is the Canon imageCLASS 2200 Advanced Copier.
4. Some Item.Price values are doubtful (in Table 3.3, the same copier was sold at both $10,559.992 and $559.992).
4 Cluster Analysis (DEMO)
Cluster analysis is a powerful tool for marketing analysis, and it is especially handy when there are many
continuous variables. Although this dataset has few continuous variables, the method still
yields some interesting findings.
We create a new dataset by aggregating on State. The first 6 rows of the new dataset are shown in Table 4.1.
The cluster analysis is based on SalesTotal, Profit, Quantity and Avg.iterm.price (the average item price).
Table 4.1: Dataset for clustering analysis
> df7 <- ddply(df1[,c("State","SalesTotal","Profit","Quantity")],
+ .(State),colwise(sum))
> df7$Avg.iterm.price <- df7$SalesTotal/df7$Quantity
> rownames(df7) <- as.character(df7$State)
> df7 <- df7[,2:5]
> head(df7)
            SalesTotal    Profit Quantity Avg.iterm.price
Alabama       19510.64  5786.825      256        76.21344
Arizona       35282.00 -3427.925      862        40.93040
Arkansas      11678.13  4008.687      240        48.65887
California   457687.63 76381.387     7667        59.69579
Colorado      32108.12 -6527.858      693        46.33206
Connecticut   13384.36  3511.492      281        47.63116
After standardizing each variable with the scale function, we calculate the Euclidean distance between each pair of
states. We then use the "average" (average linkage) method for the cluster analysis. The initial result is shown in Figure
4.1.
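The three steps in this paragraph (standardize, compute distances, cluster) map onto three base R calls. The sketch below runs them on a small hypothetical data frame: the row names A to F and all values are invented and only mimic the shape of df7.

```r
# Toy stand-in for df7 (hypothetical values), one row per state.
toy <- data.frame(
  SalesTotal = c(457000, 310000, 20000, 18000, 35000, 1600),
  Profit     = c(76000, 74000, 5800, 2100, -3400, 100),
  Quantity   = c(7600, 4200, 256, 240, 860, 4),
  row.names  = c("A", "B", "C", "D", "E", "F")
)
toy$Avg.item.price <- toy$SalesTotal / toy$Quantity

# Step 1: center and scale every column so that no variable dominates.
scaled <- scale(toy)

# Step 2: Euclidean distance between every pair of rows (states).
d <- dist(scaled, method = "euclidean")

# Step 3: hierarchical clustering with average linkage.
fit <- hclust(d, method = "average")

# plot(fit) would draw the dendrogram; fit$merge records the merge order.
print(fit$merge)
```

Standardizing first matters here because SalesTotal is several orders of magnitude larger than the average item price and would otherwise dominate the distance.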
Figure 4.1: Initial Results - Cluster Analysis
[Dendrogram titled "Average Linkage Clustering" (hclust (*, "average") applied to d), with one leaf per state; vertical axis: Height, from 0 to 6.]
There are many criteria we can use to determine the number of clusters. In my experience, the
NbClust::NbClust function is very helpful.
Figure 4.2: Determine the number of clusters
[Bar chart titled "Number of Clusters Chosen by 26 Criteria": horizontal axis Number of Clusters (0, 2, 3, 5, 9, 10); vertical axis Number of Criteria (0 to 8).]
NbClust::NbClust uses 26 different criteria to determine the number of clusters. Based on its result
in Figure 4.2, I set the number of clusters to 3.
Figure 4.3: Final Results - Cluster Analysis
[Dendrogram titled "Average Linkage Clustering", subtitled "3 Cluster Solution" (hclust (*, "average") applied to d), with one leaf per state; vertical axis: Height, from 0 to 6.]
The final result is shown in Figure 4.3. New York and California form cluster 2. Wyoming
forms cluster 3. The remaining states form cluster 1.
Description of Clusters
> aggregate(df7, by = list(clusters), median)
  Group.1 SalesTotal    Profit Quantity Avg.iterm.price
1       1  20944.270  2116.598    268.5        57.87003
2       2 384281.951 75209.968   5945.5        66.64670
3       3   1603.136   100.196      4.0       400.78400
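The clusters vector used in the call above is not defined in the listing. One common way to obtain such a vector is cutree, which cuts the dendrogram into a chosen number of groups. The sketch below assumes that approach and uses hypothetical data with invented row names S1 to S6 in place of df7.

```r
# Toy stand-in for df7 (hypothetical values), one row per state.
states <- data.frame(
  SalesTotal = c(457000, 310000, 20000, 18000, 35000, 1600),
  Profit     = c(76000, 74000, 5800, 2100, -3400, 100),
  Quantity   = c(7600, 4200, 256, 240, 860, 4),
  row.names  = paste0("S", 1:6)
)

# Average linkage clustering on standardized variables.
tree <- hclust(dist(scale(states)), method = "average")

# Cut the tree into 3 groups; the result is a named integer vector
# giving the cluster label of every row.
clusters <- cutree(tree, k = 3)

# Cluster sizes, then the median of every variable within each cluster.
print(table(clusters))
cluster_medians <- aggregate(states, by = list(cluster = clusters), FUN = median)
print(cluster_medians)
```

Medians are less sensitive than means to a single extreme state inside a cluster, which suits a summary table like the one above.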
The average price of an item sold to Wyoming is as high as about $400, which makes Wyoming an outlier
compared to the other states. New York State and California are grouped together because of their outstanding
profit. The other states are grouped together because the algorithm "thinks" they are similar to one another. However,
I must point out that this section is only a demo to illustrate my ability in data mining and machine learning. Much
more work needs to be done before drawing serious conclusions.
5 Doubtful Item.Price
Table 5.1: Doubtful Item.Price
Category         Sub_Category  Product_Name                                                                 Total_Profit  Max_Item_Price  Min_Item_Price      Range
Furniture        Furnishings   Deflect-o DuraMat Antistatic Studded Beveled Mat for Medium Pile Carpeting       244.3888        10105.34          42.136    10063.2
Technology       Accessories   Logitech P710e Mobile Speakerphone                                               1645.361        10257.49         205.992    10051.5
Furniture        Chairs        DMI Arturo Collection Mission-style Design Wood Chair                            486.1556        10105.69         105.686      10000
Technology       Copiers       Canon PC1060 Personal Laser Copier                                               4570.935        10559.99         559.992      10000
Technology       Phones        BlackBerry Q10                                                                   548.0565        10100.79         100.792      10000
Technology       Phones        RCA ViSYS 25825 Wireless digital phone                                             90.993        10103.99         103.992      10000
Office Supplies  Binders       Ibico EPK-21 Electric Binding System                                             3345.282         1889.99         377.998   1511.992
Technology       Machines      Cubify CubeX 3D Printer Double Head Print                                        -8879.97        2399.992         899.997   1499.995
Technology       Copiers       Canon imageCLASS 2200 Advanced Copier                                            25199.93         3499.99        2099.994   1399.996
Office Supplies  Binders       GBC DocuBind P400 Electric Binding System                                        -1878.17         1360.99         272.198   1088.792
Technology       Machines      Lexmark MX611dhe Monochrome Laser Printer                                        -4589.97        1529.991         509.997   1019.994
Office Supplies  Binders       Fellowes PB500 Electric Punch Plastic Comb Binding Machine with Manual Bind      7753.039         1270.99         254.198   1016.792
Office Supplies  Binders       Fellowes PB200 Plastic Comb Binding Machine                                      693.5592        1050.997          50.997       1000
Office Supplies  Envelopes     Tyvek Top-Opening Peel & Seel Envelopes, Plain White                             225.0504        1021.744          21.744       1000
As mentioned at the end of Section 3, the difference between the maximum and minimum item price is too
large for some products. Table 5.1 lists all products that have a doubtful Item.Price. The Range variable equals
the difference between Max_Item_Price and Min_Item_Price. It is implausible that the BlackBerry Q10 was sold
at $10,100.79 in one transaction and at $100.79 in another.
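A table like Table 5.1 can be built by computing the price range of each product and keeping the products whose range exceeds a cutoff. The sketch below uses hypothetical transactions. The product names, the prices and the cutoff of 1000 are all assumptions for illustration, since the report does not state the rule used to select the doubtful products.

```r
# Toy transactions (hypothetical product names and prices).
toy <- data.frame(
  Product.Name = c("Copier X", "Copier X", "Copier X", "Phone Y", "Phone Y"),
  Item.Price   = c(559.992, 10559.992, 699.99, 100.792, 125.99),
  stringsAsFactors = FALSE
)

# Range = maximum item price minus minimum item price, per product.
price_range <- aggregate(Item.Price ~ Product.Name, data = toy,
                         FUN = function(p) max(p) - min(p))
names(price_range)[2] <- "Range"

# Assumed cutoff: flag products whose price range is at least 1000.
threshold <- 1000
doubtful <- price_range[price_range$Range >= threshold, ]

# Largest range first, as in Table 5.1.
doubtful <- doubtful[order(-doubtful$Range), ]
print(doubtful)
```

A relative measure, such as the ratio of maximum to minimum price, would be an alternative design: it is not fooled by expensive products whose normal discounts already span a wide absolute range.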