This document provides an overview of the key topics and activities covered in Week 11 of the FINM4100 course. It discusses evaluating ethical considerations regarding fintech and analytics in finance. Case studies and potential solutions to ethical issues are investigated. Applications of analytics in areas like risk management, fraud detection, and personalized services are reviewed. The document also explores governance and accountability, bias in AI systems, privacy concerns, and how to make technologies like open banking more ethical.
FINM4100 Analytics in Accounting, Finance and Economics
Ethical considerations and more
applications of business analytics and
technology in accounting, finance and
economics
Week 11
Lesson Learning Outcomes
1 Evaluate ethical considerations regarding FinTech and the use of analytics in accounting, finance and economics
2 Investigate case studies
3 Find potential solutions to ethical, privacy and legal issues related to the finance sector and its use of data
4 More applications of analytics in finance
Glossary 1: Data Ethics
• Data Ethics relates to
- Responsible use of data
- The value placed on data by competing parties
- The purpose and interests of data processing
• It is about the right to keep your personal data protected
• It is about transparency & accountability
https://dataethics.eu/data-ethics-principles/
One implication is that individuals should have control of their data.
Where we are at…
• More and more accounting and finance organisations are
adopting AI and analytics
• There’s already an 80 - 90% reduction in time taken to do
usual tasks
• The roles of professionals in this area are changing as
repetitive tasks are automated
• Technology is changing the way we deal with compliance
• Ethical questions are arising daily
https://bernardmarr.com/artificial-intelligence-in-accounting-and-finance/
Where we are heading…
• Near real-time insights
• Algorithms will transform ideas around compliance, reduce fraud costs and lead to:
• More flexible work arrangements and different roles
• Possible need to hire an ethics expert
• The redefining of ethical conduct in business
Case Study: Google a bank?
• It hasn’t been easy for all financial institutions to keep up with new technology and demand for convenient services
• Consequently, Amazon, Apple and Google have started to offer services normally offered by big banks
• Example: Google Pay
• The issue: Google is an advertising company, with ads representing 71% of its revenue in 2019
• Google also has a history of collecting terabytes of data on your location, emails, shopping and song preferences
• Q: Do we really trust Google as a bank?
• An exchange-traded fund (ETF) is a kind of “pooled investment security” (or basket of securities) which can be traded like a single stock
• ETFs track a particular index, sector or commodity, e.g. SPY tracks the S&P 500 index
https://www.investopedia.com/terms/e/etf.asp
Glossary 3: What is Superannuation?
• “Superannuation (or ‘super’) is money set aside while you’re working to support your financial needs in retirement. Your super is invested in a range of assets to help grow your balance so you can have the best possible retirement outcome.”
Glossary 4: What is a robo-advisor?
• “Robo-advisors are digital platforms that provide automated, algorithm-driven financial planning services with little to no human supervision. A typical robo-advisor asks questions about your financial situation and future goals through an online survey; it then uses the data to offer advice and automatically invest for you.”
https://www.investopedia.com/terms/r/roboadvisor-roboadviser.asp
Case study: Ethics and investing of your money by others
• Are ETFs, superannuation funds and robo-advisors ethical?
• ETFs and superannuation are ever-changing “holding structures”.
• For example, you might buy an Australian shares ETF or choose Australian Shares as an option in a super fund. In both cases you are buying shares, but not directly.
• Robo-advisors also invest for you, so you are not buying the items yourself; the purchases are based on non-human advice.
• How do you know that they are ethical?
• Ways to find out if they are ethical or not:
– Understand how your money is invested
– Ask them for their environmental, social and governance policy
Activity 1: Is technology neutral?
• Form small groups
• Watch the video at https://www.youtube.com/watch?v=q_AwceyM68k
• Discuss the following:
Q1. You can’t see ethical value in technology by just looking at it, so where do we have to look to find it, and how can we apply moral judgement regarding a particular technology?
Q2. What is so special about technology?
Solutions to potential ethical, privacy and legal issues
Holding financial institutions and staff accountable
• The Australian Securities and Investments Commission (ASIC) is Australia’s corporate, markets and financial services regulator
• The Australian Prudential Regulation Authority (APRA) establishes frameworks of standards and practices in the financial sector
• The Australian Competition and Consumer Commission (ACCC) helps protect consumers https://www.accc.gov.au/
• The Office of the Australian Information Commissioner (OAIC) has a great deal of documentation on Australia’s data privacy laws.
Examining ethical conduct at an individual level
• Given that the ethical implications of AI are such a large concern, who will examine the ethical dilemmas at an individual level?
• If you don’t have an ethics officer, it may be your organisation’s management accountant or compliance officer.
• They will
– Practise ethical standards
– Create an ethical culture
– Use diagnostic analytics in cases where AI caused ethical issues
https://sfmagazine.com/post-entry/january-2021-ethics-maps-for-ai-analytics/
Ethics Mapping
• What is an ethics map?
• An ethics map is a map of the range of concerns you might have in the context of the type of service your staff/AI is supposed to provide
• It gives an overview of certain behaviours, e.g. what may be considered as acting in the accounting or finance space with
– no ethics
– indifference (or a relative view of ethics)
– value-based ethics
– (see examples next slide)
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.196.7022&rep=rep1&type=pdf#:~:text=Page%201,Ethics%20and%20the%20Public%20Service'
Example: coding loan approvals
– Relative values or indifference: … bias or does not see a problem
– Value-based ethics: Codes with fairness and non-discrimination in mind regarding the approval of loans
Example: handling customer data
– No values: Collects data and sells it on without permission
– Relative values or indifference: Collects data they may not need and does not see an issue with sharing it
– Value-based ethics: Collects data for a specific purpose and does not share it
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.196.7022&rep=rep1&type=pdf#:~:text=Page%201,Ethics%20and%20the%20Public%20Service'
Activity 2: Ethics Mapping
• Form small groups
• Suppose that you work in either accounting, finance or economics
• Create an ethics mapping table with four rows
• Each row should provide an example as in the previous slide
Example | No values | Relative values or indifference | Value-based ethics
Example 1 | … | … | …
Example 2 | … | … | …
Example 3 | … | … | …
Example 4 | … | … | …
Reviewing the banking code of practice
• “Australia's banks may face rules on ethical use of tech, data”
• The banking code of practice was reviewed in 2021
• The code consists of a “set of enforceable standards” that customers and small businesses can expect from Australian banks, i.e. a set of rules setting out the rights of customers.
• The safe and secure handling of bank customers’ data was questioned in the review,
– especially in the context of financial and elder abuse, as well as domestic violence
• What issues do you think they are talking about?
https://www.itnews.com.au/news/australias-banks-may-face-rules-on-ethical-use-of-tech-data-566937
Case Study: Digital banking and privacy
• “Banks say they should not be treated like Big Tech by online privacy bill”
• This runs along similar lines to the review of the banking code of practice
• A new online privacy bill aimed at tech companies may affect banks, insurers, superannuation funds, etc., because it is so broad in its definition of “online platforms”
• Examples of obligations:
– 26KC(4)(a) “respond to a request to not use…personal information within a reasonable period.”
– 26KC(2) “Notify an individual…of the purposes for which the organisation collects, uses and discloses personal information.”
https://www.itnews.com.au/news/banks-say-they-should-not-be-treated-like-big-tech-by-online-privacy-bill-575701
Activity 3: Why not to use a robo-advisor
• Watch the video about robo-advisors at https://www.youtube.com/watch?v=wjB3hp1RUKQ
Q1. What reasons are given for not using robo-advisors?
Q2. Given that the presenter is advertising his own business, what do you think?
Reduce bias in AI-based financial services
• Bias is often present in input data in finance
• Ways around this are:
– avoid gender, racial or ideological biases
– use complete and representative data
– have diversity in development teams
– monitor situations where AI systems self-improve, acquire new behaviours and have unintended results
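One simple way to start monitoring for bias is to compare outcome rates across groups. The sketch below is illustrative only: the loan decisions, the group labels "A" and "B", and the function names are hypothetical, and a gap in approval rates is a prompt for investigation rather than proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Largest difference in approval rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical loan decisions: (applicant group, approved?)
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = approval_rates(sample)
gap = parity_gap(rates)
print(rates, gap)
```

A large gap would suggest reviewing the input data and model for the kinds of bias listed above.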
Consider ethical, privacy and legal issues on a case by case basis, e.g.
• Many platform owners currently have the option to use customer data for commercial/other purposes (in the fine print)
• Suggestions to make things more ethical:
– Allow customers to disable use of some personal data
– Inform users of how exactly their data is being used
– Allow customers to choose how they want to share their data, what type of data and for what purpose
https://fintechweekly.com/magazine/articles/what-about-the-ethics-of-fintech#:~:text=Ethical%20considerations%20for%20FinTech,-First%2C%20many%20online&text=Customers%20should%20be%20able%20to,vis%2D%C3%A0%2Dvis%20customers.
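The suggestion that customers choose what type of data they share and for what purpose can be sketched as a consent check applied before any data leaves the platform. This is a minimal illustration; the `Consent` class, the field names and the purposes are all hypothetical and do not describe any real platform's API.

```python
from dataclasses import dataclass, field


@dataclass
class Consent:
    """What one customer has agreed to share, and for which purposes."""
    data_types: set = field(default_factory=set)
    purposes: set = field(default_factory=set)


def shareable(record, consent, purpose):
    """Return only the fields the customer consented to share for this purpose."""
    if purpose not in consent.purposes:
        return {}
    return {key: value for key, value in record.items() if key in consent.data_types}


# Hypothetical customer record and consent settings
record = {"name": "Alex", "balance": 1200.50, "transactions": [-40.0, -15.5]}
consent = Consent(data_types={"balance"}, purposes={"budgeting"})

for_budgeting = shareable(record, consent, "budgeting")
for_marketing = shareable(record, consent, "marketing")
print(for_budgeting, for_marketing)
```

Here the balance is shared for budgeting only, and nothing is shared for a purpose the customer has not approved.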
Glossary 5: What is Open Banking?
• Open banking is about sharing your banking data with third parties.
• In Australia, the third parties must be accredited by the ACCC
https://www.ausbanking.org.au/priorities/open-banking/
Data that can be shared
• Personal information
• Account balances
• Bank product information
• Transaction amounts
Activity 4: Think-group-share
• Form small groups and brainstorm
1. Potential ethical and privacy issues in relation to open banking
2. Ways in which to make open banking ethical, safe and private where necessary
Applications of analytics in finance
In preparation for next week, we will start revising some methods and applications in finance.
Broad application areas are
• 1. Risk Analytics
• 2. Real-Time Analytics
• 3. Consumer Analytics
• 4. Customer Data Management
• 5. Personalized Services
• 6. Financial Fraud Detection
• 7. Algorithmic Trading
https://www.upgrad.com/blog/data-science-use-cases-finance-industry/
… recommendations for savings, investments and loans
• Targeted offers based on spending patterns
• AI-based money management programs
https://personetics.com/
Last minute questions?
Victim Advocate Worksheet
Job Profile
Directions: Research the position of victim advocate and answer
the following questions.
What responsibilities does a victim advocate have in a case?
When does a victim advocate become involved in a criminal
case? When does the involvement end?
What skills would be important for a victim advocate to
possess? Why?
Based on what you have learned about this position, would you
be interested in becoming a victim advocate? Why or why not?
Walden University - MSCRJS CRJS6203
Type a caption for your photo
The highest rates of victims in Washington, D.C. include:
Include 5-10 types of victims and statistics for each type
Crime Victims' Bill of Rights
Insert information
Phone: [Telephone]
Email: [Email address]
Web: [Web address]
Victims’ Rights and Services
Above the title, insert an appropriate and engaging graphic. In
this text box, Insert a few important statistics.
Crime Victims’ Compensation Program
Contact Us
Insert information
Types of Victims
Note:
This brochure is designed to be printed. You should test print
on regular paper to ensure proper positioning before printing on
card stock.
You may need to uncheck Scale to Fit Paper in the Print dialog
(in the Full Page Slides dropdown).
Check your printer instructions to print double-sided pages.
To change images on this slide, select a picture and delete it.
Then click the Insert Picture icon
in the placeholder to insert your own image.
To change the logo to your own, right-click the picture
“replace with LOGO” and choose Change Picture.
Header
Community Resources
This spot would be perfect for a mission statement. You might
use the right side of the page to summarize how you stand out
from the crowd and use the center for a brief success story.
(And be sure to pick photos that show off what your company
does best. Pictures should always dress to impress.)
Think a document that looks this good has to be difficult to
format?
Think again! The placeholders in this brochure are formatted for
you. Enter your own text with just a click.
“insert powerful quote about rights and/or services.”
Get the exact results you want
To easily customize the look of this brochure, on the Design tab
of the ribbon, check out the Themes, Colors, and Fonts
galleries.
Have company-branded colors or fonts?
No problem! The Themes, Colors, and Fonts galleries give you
the option to add your own.
Use a photo depicting victim resources
Don’t forget to include some specifics about what you offer,
and how you differ from the competition.
Want to help us create change? Volunteer with us!
Insert volunteer information
Use a photo depicting volunteers
Economic Applications of Big Data & Predictive Analytics
FINM4100
Analytics in Accounting, Finance and Economics
Week 9
Lesson Learning Outcomes
1 Define and review ideas around micro- and macroeconomics
2 Review the concept of correlation
3 Analyse macroeconomic data
Why Build Models?
“Just because you have more data doesn’t mean that you’re going to make better decisions.”
Models encapsulate patterns that exist in data, helping us make sense of them.
Christina Zhu
Assistant Professor of Accounting
Wharton School of the University of Pennsylvania
SELTS
• Student feedback is usually done in week 9
• You may be asked to fill in a survey
Software for today
1. Google Colab
• Either
A. watch the teacher demonstrate analytics and accounting in Python
OR
B. run the Python scripts yourself in Google Colab
• If you want to run the code provided, make sure you have access (signed in) to Google Colab https://colab.research.google.com
2. Exploratory
A. watch the teacher demonstrate analytics and accounting in Exploratory
OR
B. run each step yourself online (access is explained on the next slide)
Dataset
• Data: countries of the world.csv (1970 to 2017)
• Business Problem: How do we determine factors affecting a country's GDP per capita and make a model using the data of many countries?
• We have data from 227 countries and variables (factors) such as GDP, population, literacy, crops (%), birthrate, and others.
• We will explore correlations between each factor and GDP across various countries in Python
• Make charts (try multiple linear regression in Exploratory)
What is Economics?
• Economics is the study of how society allocates scarce resources to satisfy unlimited wants
• It is concerned with production, distribution and consumption, and with scarcity, choice and decision making
• We can consider two branches of economics:
▪ Microeconomics is the study of how single economic units of society make economic decisions
▪ Macroeconomics is the study of how an aggregated economy makes economic decisions
Microeconomics
Focus:
• How individual consumers and companies make decisions
• How they respond to changes in price
• Why different goods have different prices
• How humans may trade in an optimal way
Typical topics in this area are:
• Demand and supply
• Costs of producing goods (production, revenue and costs)
• Market structure, e.g. perfect competition
Macroeconomics
Focus:
The overall economy of a region, e.g. a country, using aggregated data
Typical topics in this area are:
• Economic cycles
• Economic growth
• Fiscal and monetary policy
• Unemployment rates
• Gross Domestic Product (GDP), which is a broad measure of a country’s economic performance
We will be analysing GDP data today
Why is Economic Growth important?
• It is an indicator of a healthy economy
• One theory says increasing GDP leads to more employment in some sectors
• It leads to a better standard of living
• Key components of economic growth are thought to be
– Natural resources
– Infrastructure
– Population/labour
– Human capital
– Technology
– Law
GDP per capita 2021
How are we doing?
Activity 1: Think – pair – share Economics
• Watch the video below, which compares micro- and macroeconomics
• https://www.youtube.com/watch?v=nJbWj_kHCJQ
• Form pairs
• Person 1 will explain macroeconomics to person 2, then person 2 will explain microeconomics to person 1
• Report back to class with comments and questions
Review of concepts
• Before analysing today’s data, we need to review the ideas of
– Covariance and correlation
– Correlation heatmaps
Two Measures of Association
▪ Covariance (is there any pattern to the way two variables
move together?)
a. Only concerned with the direction of the relationship
b. No causal effect is implied
c. Is affected by units of measurement
▪ Correlation coefficient which incorporates part of the
covariance formula (how strong is the linear relationship
between two variables?)
Correlation coefficient
Also called standardised covariance; it lies between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
Visualising correlation coefficient
• Method 1: Correlation heatmap
• Method 2: Plots of pairs of variables
[Figure: scatter plots of Y against X for r = -1.0, r = 0 and r = +0.3]
Formulae for Covariance and Correlation
Measures the relative strength of the linear relationship between two variables
Sample covariance
$$\mathrm{COV}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
and correlation coefficient
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where
$\bar{x}$ is the mean of the x’s
$\bar{y}$ is the mean of the y’s
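The two formulae can be checked with a few lines of Python. This is a minimal sketch using made-up literacy and GDP figures (not taken from the course data set); it also shows that rescaling a variable changes the covariance but not the correlation coefficient.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Correlation coefficient r, which always lies between -1 and 1."""
    x_bar, y_bar = mean(x), mean(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sqrt(
        sum((xi - x_bar) ** 2 for xi in x) * sum((yi - y_bar) ** 2 for yi in y)
    )
    return numerator / denominator


# Hypothetical values for five countries (illustrative only)
literacy = [50.0, 60.0, 70.0, 80.0, 90.0]            # percent
gdp = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]       # dollars per capita
gdp_thousands = [g / 1000 for g in gdp]              # same data, different units

cov_dollars = sample_covariance(literacy, gdp)
cov_thousands = sample_covariance(literacy, gdp_thousands)
r_dollars = correlation(literacy, gdp)
r_thousands = correlation(literacy, gdp_thousands)
print(cov_dollars, cov_thousands, r_dollars, r_thousands)
```

Because these made-up points lie exactly on a straight line, r equals 1 in both cases, while the covariance depends on the units chosen for GDP.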
countries of the world.csv data
• In today’s data some of the variables are obvious while others are not
• It also uses commas instead of dots as decimal separators (which we will deal with later)
• Variables
– Agriculture
– Industry
– Service
• These three represent labour force by sector, so if Agriculture for Liberia is 0,769, it is really 0.769 and means that 76.9% of the workforce in Liberia work in the agricultural sector. Similarly for Industry and Service.
• The Climate measure is a classification between 1 (drier) and 4 (milder)
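In Python, decimal commas can be converted while the file is read. The sketch below uses only the standard library and a small in-memory extract; apart from Liberia's Agriculture value of 0,769 quoted above, the rows and values are illustrative assumptions rather than the real file contents.

```python
import csv
import io


def to_float(text):
    """Convert a number written with a decimal comma, e.g. '0,769', to a float."""
    return float(text.replace(",", "."))


# Hypothetical extract in the style of the data file (values are illustrative)
raw = io.StringIO(
    'Country,Agriculture,Industry,Service\n'
    'Liberia,"0,769","0,054","0,177"\n'
    'Exampleland,"0,038","0,262","0,7"\n'
)

sector_columns = ["Agriculture", "Industry", "Service"]
rows = []
for row in csv.DictReader(raw):
    cleaned = {"Country": row["Country"]}
    for column in sector_columns:
        cleaned[column] = to_float(row[column])
    rows.append(cleaned)

liberia_share = f"{rows[0]['Agriculture']:.1%}"
print(rows[0]["Country"], liberia_share)
```

If pandas is available, passing `decimal=","` to `pandas.read_csv` is an alternative way to handle the same issue.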
Activity: Open the script and run or watch the demo
• Download the data countries of the world.csv to a directory of your choice
• Open the script below
https://colab.research.google.com/drive/15LsR6QoH858T4e2U4LHFtlzWSLEJrWMG?usp=sharing
• You will be prompted in the second block of code to choose the data file
• Click in the box and find your countries of the world.csv to be uploaded
• Run the rest of the script and analyse the output as it is generated, e.g. correlation heatmap, countries with the highest GDP, etc.
Sample Output
Data Modification
• Make a copy of the data file in your folder
• Open the data in Microsoft Excel
• We would normally use a dot as the decimal separator; however, a comma has been used here
• Highlight the data columns with commas
• Go to the “Editing” menu
• Click on Find & Select and scroll down to “Replace”
• Replace commas (,) with dots (.) and click on Replace All
• Save your file
Data Modification
• Create a new column heading in column U called “GDP Low_High”
• Type =IF(I2<3000, 0,1) in cell U2 and press Enter
• Click on the corner of that cell (you should see a cross), hold and drag it down the column to repeat the formula in rows down to cell U228
• You should see a zero if GDP < $3000 per capita and a one otherwise
• Save your file
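For those working in Python rather than Excel, the same low/high flag can be produced with a short function. This is a sketch mirroring the spreadsheet formula; the GDP values are made up for illustration.

```python
def gdp_low_high(gdp_per_capita, threshold=3000):
    """Mirror the spreadsheet formula =IF(I2<3000, 0, 1)."""
    return 0 if gdp_per_capita < threshold else 1


# Hypothetical GDP per capita values in dollars
gdp_values = [500, 2999, 3000, 37800]
flags = [gdp_low_high(value) for value in gdp_values]
print(flags)
```

Note that a GDP of exactly $3000 is flagged as 1, matching the "less than" test in the formula.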
Exploratory
• Access Exploratory
• Start a new project called GDP analysis
• Use Data Frames + to find and import the modified data file
• Change variable GDP Low_High from numeric to logical before clicking on save
• Select Analytics
• We are going to go through a simple guided Decision Tree model, then you can experiment and try to interpret your own
• Instructions for the model type and variables are on the next slide
Exploratory analytics model
• Select Decision Tree as the type
• GDP Low_High as the Target variable
• Phones, birthrate and Agriculture as the predictor variables
• Leave sample size as is and run
• You will see a tree which is to be read from the top
• We will start to interpret this (first see next slide)
Simple Decision Tree
• The model makes its own thresholds if you don’t make all variables binary
• Positive of each condition is to the right and negative to the left
• If you add the percentages from the bottom of the tree, they sum at each level, e.g.
• 7% + 4% make up the 11%,
• 11% + 25% make up the 36%
Simple Decision Tree
• Rule 1: “< 75 phones per 1000 persons”
• In the case “no” = “>=75 phones per 1000 persons”
• 64% of the countries have >=75 phones per 1000 persons (dark blue)
• This gives them a 0.92 (92%) chance of having a GDP >=$3000 per capita
• Of the countries with < 75 phones per 1000 persons (36%), only 0.15 (15%) have a GDP >=$3000 per capita
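The two numbers shown at each node of the tree (the share of countries in the branch, and the chance of a high GDP within it) can be reproduced by hand for a single rule. The sketch below uses ten made-up countries, so its percentages will not match the 64%/36% and 92%/15% figures from the real data; the function and field names are hypothetical.

```python
def describe_split(countries, phones_threshold=75, gdp_threshold=3000):
    """Summarise both branches of the rule 'phones < phones_threshold'."""
    branches = {
        "yes": [c for c in countries if c["phones"] < phones_threshold],
        "no": [c for c in countries if c["phones"] >= phones_threshold],
    }
    summary = {}
    for label, branch in branches.items():
        high_gdp = [c for c in branch if c["gdp"] >= gdp_threshold]
        summary[label] = {
            # Share of all countries that fall into this branch
            "share_of_countries": len(branch) / len(countries),
            # Proportion within the branch with GDP at or above the threshold
            "chance_high_gdp": len(high_gdp) / len(branch) if branch else None,
        }
    return summary


# Hypothetical data: phones per 1000 persons and GDP per capita in dollars
countries = [
    {"phones": 10, "gdp": 800}, {"phones": 40, "gdp": 1500},
    {"phones": 60, "gdp": 3500}, {"phones": 70, "gdp": 2000},
    {"phones": 150, "gdp": 9000}, {"phones": 300, "gdp": 20000},
    {"phones": 450, "gdp": 2500}, {"phones": 600, "gdp": 30000},
    {"phones": 90, "gdp": 4000}, {"phones": 500, "gdp": 28000},
]

summary = describe_split(countries)
print(summary)
```

A decision tree repeats this kind of split within each branch, choosing the variable and threshold itself.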
Simple Decision Tree
• Rule 2: “Agricultural workforce >=20%”
• We split the group with >=75 phones per 1000 persons further into those with an Agricultural workforce >=20% and those without
• We find that 59% of countries have >=75 phones per 1000 persons and an Agricultural workforce >=20%
• This raises the chance of the country having a GDP >=$3000 per capita to 0.96, i.e. 96%, given both conditions
Simple Decision Tree
• Rule 3: “Birthrate >=29” (thought to be roughly 29 births per 1000 persons)
• 11% of countries have <75 phones per 1000 persons and a birth rate < 29 per 1000 persons
• These conditions give a country a 43% chance of having a GDP >=$3000 per capita
• 4% of the countries have <75 phones per 1000 persons, a birth rate < 29 per 1000 persons and an Agricultural workforce < 16%. 62% of the countries in this category have a GDP >=$3000 per capita
If you look at the “Importance” menu (green), the order of importance is phones, birth rate, agriculture
Decision Tree Exploration
• Try some different combinations of predictor variables
and attempt to interpret the results
• You will find that the thresholds change a lot
• Report back to class as needed
Visualising poverty with satellite data
• If time permits (or in your own time), look at the report at https://www.kaggle.com/reubencpereira/visualizing-poverty-w-satellite-data/report
• and interact with the maps on Kaggle
• You may have to sign in
Finance applications of big data and predictive analytics: risk & return
FINM4100
Analytics in Accounting, Finance and Economics
Week 10
Lesson Learning Outcomes
1 Define risk and return
2 Explore different ways of measuring risk and return
3 Investigate factors influencing risk and return
4 Perform portfolio analytics and optimisation
Software for today
1. Google Colab
• Either
A. watch the teacher demonstrate analytics and accounting in Python
OR
B. run the Python scripts yourself in Google Colab
• If you want to run the code provided, make sure you have access (signed in) to Google Colab https://colab.research.google.com
2. Exploratory
A. watch the teacher demonstrate analytics and accounting in Exploratory
OR
B. run each step yourself online (access is explained on the next slide)
The Risk Return Relationship
The risk return relationship is one of the most fundamental relationships in all of finance
• Return is a measure of the amount earned by owning an asset
• Risk is a measure of the variability of that return
To earn more return, an asset owner must be prepared to accept more risk
Risk & Return
All investments carry risk, some more than others.
• Cash is generally low risk. It is suitable for investors who have a short-term investment outlook or low tolerance for risk.
• Shares are the most volatile asset class, but historically over long periods of time have achieved, on average, the highest returns.
Risk and return in Australia
Risk and Return for Australian Shares & Bonds from 1974 to 2009
Category | Average return | Std
High return, high risk | 14.34% | 21.89%
Medium return, medium risk | 10.14% | 7.66%
Low return, low risk | 9.73% | 4.33%
How do we measure risk and return?
Return is a measure of the earnings made on an asset
Risk is a measure of the variability in earnings made on an asset
Dollar terms ($)
Percentage terms (%)
Standard deviation
Coefficient of variation
Beta
Dollar terms ($)
Percentage terms (%)
• Let’s review the measures of standard deviation and coefficient of variation
• We saw Beta in week 8
Glossary 1: Variance and standard deviation as measures of variability
• Variance: measures the average squared difference of the values in a data set from its mean.
• Standard deviation: measures the spread of a data set relative to its mean.
Recall from STAM4000 that standard deviation measures variability; hence, it is used as a measure of financial risk
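Formulas for the variance & standard deviation
N = population size
n = sample size
$\mu$ = population mean (average)
$\bar{x}$ = sample mean (average)
Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$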
Example of STDEV of returns for the S&P 500
Returns for S&P 500, May 2021-October 2021

Month          | Return
October 2021   | 6.9%
September 2021 | -4.8%
August 2021    | 2.9%
July 2021      | 2.3%
June 2021      | 2.2%
May 2021       | 0.6%

Use $s^2$ and $s$, respectively, as we have a sample.
First, we need the sample mean:
$$\bar{x} = \frac{\sum x}{n} = \frac{6.9 - 4.8 + 2.9 + 2.3 + 2.2 + 0.6}{6} = 1.68\%$$
Then, from $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$, we have
$$s^2 = \frac{(6.9-1.68)^2 + (-4.8-1.68)^2 + (2.9-1.68)^2 + (2.3-1.68)^2 + (2.2-1.68)^2 + (0.6-1.68)^2}{6-1} = 14.5$$
Standard deviation, $s = \sqrt{14.5} = 3.8\%$
https://www.businessinsider.com.au/what-is-standard-deviation
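The S&P 500 calculation can be checked in a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and it is not the course's Colab script.

```python
import math
import statistics

# Monthly S&P 500 returns (%) for May 2021 to October 2021, as listed in the example.
returns = [0.6, 2.2, 2.3, 2.9, -4.8, 6.9]

mean_return = statistics.mean(returns)          # sample mean
sample_variance = statistics.variance(returns)  # sample variance, divides by n - 1
sample_std = math.sqrt(sample_variance)         # equivalent to statistics.stdev(returns)

print(f"mean = {mean_return:.2f}%")
print(f"s^2  = {sample_variance:.1f}")
print(f"s    = {sample_std:.1f}%")
```

Using `statistics.pvariance` instead would divide by N, which is the population formula rather than the sample formula.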
Quantifying uncertainty and risk
Standard deviation measures the variability of possible outcomes and therefore quantifies uncertainty and risk.
Which investment is riskier – Melbourne or Sydney?
(Chart comparing the Melbourne investment and the Sydney investment.)
Glossary 2: Coefficient of variation
• To measure the relationship between average return and risk (volatility) simultaneously, we use the Coefficient of Variation (CV):
$$CV = \frac{\sigma}{\mu} = \frac{\text{Standard Deviation}}{\text{Annualised Return}}$$
• Thus, CV can be used as a measure of asset quality.
• Note that single measures rarely provide the entire picture, but this is a start.
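As a rough illustration, the CV can be computed for the three risk categories in the Australian 1974 to 2009 figures quoted earlier. The function name and dictionary layout are assumptions made for this sketch.

```python
def coefficient_of_variation(std_dev, mean_return):
    """Risk per unit of return: standard deviation divided by mean return."""
    if mean_return == 0:
        raise ValueError("CV is undefined when the mean return is zero")
    return std_dev / mean_return

# (average return %, standard deviation %) for each category
assets = {
    "High return, high risk": (14.34, 21.89),
    "Medium return, medium risk": (10.14, 7.66),
    "Low return, low risk": (9.73, 4.33),
}

cvs = {name: coefficient_of_variation(sd, ret) for name, (ret, sd) in assets.items()}
for name, cv in cvs.items():
    print(f"{name}: CV = {cv:.2f}")
```

A lower CV means less risk is taken per unit of return, so on these figures the low-risk category carries the least risk per unit of return.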
Activity 1: Can you identify the least/most risky assets?
(Chart: Investment Risk & Return, plotting risk against return.)
Other risk factors and return: Interest, Dividend, Capital Gains, Housing
• From the video and previous slides, answer the following:
Q1. Return and risk are measures of what?
Q2. What is standard deviation used to measure?
Q3. Are bonds riskier than shares or vice versa?
Q4. What measure maximises return for the same risk?
https://www.youtube.com/watch?v=4KGvoy_Ke9Y
What is a Portfolio?
• A portfolio is a collection of materials, e.g.
career related materials, investments, art
work
• In assessment 3 you will create a portfolio of
analytics methods
• In a risk return context, a portfolio contains
financial investments
https://clarke.edu/academics/careers-internships/student-checklist/resume-writing-and-portfolios/what-is-a-portfolio/
Risk and diversification for an investment portfolio
In the same way that particular measures apply to single stocks, they can also be applied to a portfolio:
• Standard deviation captures uncertainty
• Coefficient of variation standardises risk
• Beta measures systematic risk
Diversification means combining investments whose returns are not perfectly correlated, which reduces the portfolio's standard deviation. Hence we seek to have some uncorrelated (or imperfectly correlated) investments.
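The effect of correlation on portfolio risk can be sketched with the standard two-asset formula, $\sigma_p = \sqrt{w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 \rho \sigma_1 \sigma_2}$. The two standard deviations below are taken from the Australian figures quoted earlier; the 50/50 weights and the correlation values are hypothetical, chosen only to show the pattern.

```python
import math

def portfolio_std(w1, sd1, sd2, rho):
    """Standard deviation of a two-asset portfolio with weight w1 in asset 1."""
    w2 = 1.0 - w1
    variance = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * rho * sd1 * sd2
    return math.sqrt(variance)

sd_a, sd_b = 21.89, 7.66  # standard deviations in %

for rho in (1.0, 0.5, 0.0):  # hypothetical correlations
    print(f"correlation {rho:.1f}: portfolio std = {portfolio_std(0.5, sd_a, sd_b, rho):.2f}%")
```

With perfect correlation the portfolio risk is just the weighted average of the two risks; any lower correlation gives a smaller figure, which is the diversification benefit.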
Glossary 3: Sharpe Ratio
• A ratio of 3.0 or higher is considered excellent
• A ratio under 1.0 is considered sub-optimal
• The Sharpe ratio can be compared with the coefficient of variation to make an assessment of asset quality and performance.
Activity 3: Quick Quiz
Q1. What mathematical methods are commonly used to measure risk?
Q2. Consider Investment A and Investment B:

                   | Investment A | Investment B
Portfolio return   | 20%          | 30%
Risk free rate     | 10%          | 10%
Standard deviation | 5            | 40

If the Sharpe ratios are (A) 2.0 and (B) 0.5, confirm this from the formula and interpret these outcomes.
Q3. Is diversification useful in a portfolio or do you just need more investments?
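For Q2, a short sketch can evaluate the Sharpe ratio from the quiz inputs as printed. It assumes the standard definition, (portfolio return − risk-free rate) / standard deviation, with all three inputs in the same units; the function name is illustrative.

```python
def sharpe_ratio(portfolio_return, risk_free_rate, std_dev):
    """Excess return earned per unit of risk (standard definition, assumed here)."""
    return (portfolio_return - risk_free_rate) / std_dev

sharpe_a = sharpe_ratio(20, 10, 5)   # Investment A
sharpe_b = sharpe_ratio(30, 10, 40)  # Investment B

print(f"Investment A: {sharpe_a:.1f}")
print(f"Investment B: {sharpe_b:.1f}")
```

Investment B has the higher return, but it earns far less excess return per unit of risk than Investment A.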
Glossary 4: Skewness and Kurtosis
• Skewness and kurtosis, which you may have encountered in STAM4000, are also measures of risk for investments
“Skewness is a measure of symmetry, or the lack of it. Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.”
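Both measures can be computed from central moments. The sketch below uses the simple population (moment-based) definitions, which is an assumption: statistical packages often apply small-sample corrections, so their output may differ slightly. The data are the six monthly S&P 500 returns from the earlier example.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean)^k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: zero for perfectly symmetric data."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores zero."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

returns = [0.6, 2.2, 2.3, 2.9, -4.8, 6.9]  # monthly S&P 500 returns (%)
print(f"skewness        = {skewness(returns):.2f}")
print(f"excess kurtosis = {excess_kurtosis(returns):.2f}")
```

Negative skewness indicates a longer left tail, meaning large losses are more extreme than large gains; positive excess kurtosis indicates heavier tails than a normal distribution.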
Activity 4: Portfolio calculations
• Make sure you are signed up with Google Colab or watch the demo
• We start with a portfolio of four stocks (Google, Amazon, McDonald's, The Walt Disney Company) and then start adding Australian stocks to see how the measures of risk change.
• Expected return, volatility, Sharpe ratio, skewness and kurtosis are calculated each time.
• The script is here: https://colab.research.google.com/drive/1T7sS1KLo_WcwyLKnKBmZtaso6cBsSzSQ?usp=sharing
• All you need to do is run each block of code and attempt to interpret the results with your teacher
Glossary 5: Annualised return
• The annualised return is the constant yearly rate that, compounded over the investment period, would produce the same total return.
• It is the geometric average of an investment’s earnings per year
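A minimal sketch of the geometric-average idea: compound the period returns into a total growth factor, then convert that into a per-year rate. The function name and the choice to pass returns as decimals are assumptions made for illustration.

```python
def annualised_return(period_returns, periods_per_year):
    """Geometric average yearly return from a list of per-period returns (as decimals)."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r  # compound each period's return
    years = len(period_returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0

# Six monthly S&P 500 returns (May to October 2021) expressed as decimals.
monthly = [0.006, 0.022, 0.023, 0.029, -0.048, 0.069]
print(f"annualised return = {annualised_return(monthly, 12):.2%}")
```

Annualising six months of data assumes the same growth continues for the rest of the year, so the figure is an extrapolation rather than a realised return.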
Portfolio optimisation
• There are various analytics methods for portfolio optimisation
• In broad terms, we seek the minimum variance (minimum volatility) portfolio for a given selection of investments, i.e. we perform mean-variance optimisation.
• Requirements and conditions for mean-variance optimisation:
– Minimise portfolio covariance
– Define acceptable portfolio return
– Fully allocate budgeted capital
– Set capital allocation constraints
For example, consider a four-security portfolio:
• BHP Billiton, QBE Insurance, Telstra and Westpac Banking Corporation
Question: In what proportions should these investments be held such that the risk (volatility), measured using standard deviation, is minimised for a given level of return?
That is, how do we make a minimum variance portfolio?
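The four-security problem needs a numerical optimiser, but the idea can be sketched with two assets, where the minimum variance weight has a closed form: $w_1 = \frac{\sigma_2^2 - \rho\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}$. The volatilities and correlation below are hypothetical and are not estimates for any of the securities named above.

```python
def portfolio_variance(w1, sd1, sd2, rho):
    """Variance of a two-asset portfolio with weight w1 in asset 1 (fully invested)."""
    w2 = 1.0 - w1
    return (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * rho * sd1 * sd2

def min_variance_weight(sd1, sd2, rho):
    """Weight in asset 1 that minimises two-asset portfolio variance."""
    cov = rho * sd1 * sd2
    denom = sd1 ** 2 + sd2 ** 2 - 2 * cov
    if denom == 0:
        raise ValueError("assets are identical in risk and perfectly correlated")
    return (sd2 ** 2 - cov) / denom

# Hypothetical inputs: annual volatilities of 20% and 10%, correlation 0.2.
sd1, sd2, rho = 0.20, 0.10, 0.2
w_star = min_variance_weight(sd1, sd2, rho)
print(f"weight in asset 1 = {w_star:.3f}, weight in asset 2 = {1 - w_star:.3f}")
print(f"portfolio std     = {portfolio_variance(w_star, sd1, sd2, rho) ** 0.5:.4f}")
```

This version is unconstrained, so for highly correlated assets the weight can fall outside 0 to 1 (a short position); a capital allocation constraint would rule that out.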
Portfolio Optimisation contd…
Attempts to create a min var portfolio
Portfolio 1: Equal allocation…
Mean = 23.96% | Standard Deviation = 16.24%
Portfolio 2: Financials heavy…
Mean = 12.49% | Standard Deviation = 21.76%
Portfolio 3: Me heavy…
Mean = 11.73% | Standard Deviation = 19.67%
Portfolio Efficient Frontier
• Efficient Frontier method: an optimisation method which takes into account volatility and Sharpe ratio
• The idea of an efficient frontier comes from Modern Portfolio Theory
• The frontier is a curve representing the set of portfolios which provide the greatest return for each level of risk
Portfolio Efficient Frontier
• Using the Efficient Frontier, the portfolio can be optimised for
– minimum volatility
– maximum Sharpe ratio
– minimum volatility for a given target return
– maximum Sharpe ratio for a given target volatility
• We have found a Python script which uses the Efficient Frontier method
• This allows us to compute and visualise optimised portfolios
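To show what the first two optimisation criteria mean, the sketch below sweeps the weight of a two-asset portfolio and picks out the minimum volatility and maximum Sharpe ratio portfolios. It is a simplified stand-in, not the course script: the expected returns, volatilities, correlation and risk-free rate are all hypothetical.

```python
import math

# Hypothetical inputs: expected returns, volatilities, correlation, risk-free rate.
ret1, ret2 = 0.12, 0.06
sd1, sd2, rho = 0.20, 0.10, 0.2
risk_free = 0.02

portfolios = []
for i in range(101):
    w1 = i / 100  # weight in asset 1; the remainder goes to asset 2
    w2 = 1.0 - w1
    expected_return = w1 * ret1 + w2 * ret2
    volatility = math.sqrt((w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * rho * sd1 * sd2)
    sharpe = (expected_return - risk_free) / volatility
    portfolios.append({"w1": w1, "return": expected_return, "vol": volatility, "sharpe": sharpe})

min_vol = min(portfolios, key=lambda p: p["vol"])
max_sharpe = max(portfolios, key=lambda p: p["sharpe"])

for label, p in (("Minimum volatility", min_vol), ("Maximum Sharpe ratio", max_sharpe)):
    print(f"{label}: w1 = {p['w1']:.2f}, return = {p['return']:.2%}, "
          f"vol = {p['vol']:.2%}, Sharpe = {p['sharpe']:.2f}")
```

Plotting `vol` against `return` for every entry in `portfolios` would trace the curve whose upper edge is the efficient frontier.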
Activity 5: Efficient Frontier
• Make sure you are signed up with Google Colab or watch the demo
• The script is here: https://colab.research.google.com/drive/1FiwNZKvvVLLWEpHRX1plnLjSzam7kwmb?usp=sharing
• Discuss the results of the different optimisation criteria with your teacher
• Example output next page
https://finquant.readthedocs.io/en/latest/examples.html
Risk & Return
Of the portfolios that comprise the efficient frontier, there is one portfolio that has the lowest level of risk: the Minimum Variance Portfolio.
Efficient Frontier Output
FINM4100
Analytics in Accounting, Finance and Economics
Week 8
Data analytics techniques and applications in accounting, finance and economics
Lesson Learning Outcomes
1 Explore and apply some of the widely used data analytics techniques which are used to extract insights in accounting, finance and economics, e.g.
• Association rule learning
• Classification tree analysis
• Genetic algorithms
• Machine learning
• Regression analysis
Software for today
1. Google Colab
• Either
A. watch the teacher demonstrate analytics and accounting in Python
OR
B. run the Python scripts yourself in Google Colab
• If you want to run the code provided, make sure you have access (signed in) to Google Colab https://colab.research.google.com
2. Exploratory
A. watch the teacher demonstrate analytics and accounting in Exploratory OR
B. run each step yourself
Data for today
1. GroceryStoreDataSet.csv
2. Churn_Modelling.csv
3. Salary_Data.csv
A Vital Commodity
“It is a capital mistake to
theorize before one has
data.”
Sir Arthur Conan Doyle
Author
Sherlock Holmes
The Big Data Environment
• 216,000TB: Amount of new information generated per person per year
• 90%: Proportion of the world’s total big data created in the past 3 years
• $65 million: Boost in net income for every Fortune 1000 company (if data access is boosted 10%)
• 83%: Proportion of surveyed businesses (Accenture) investing in Big Data initiatives
Inevitable Transition
Force multiplier - Big data analytics and analytics infrastructure are the means by which institutions apply force to achieve geo-economic advantage.
Commercial activities will increasingly rely on sophisticated network-based logistics, communications systems and a big data ecology to recommend products, retain customers and mitigate churn.
The goal is to turn data into information, and information into insight.
Techniques
There are a number of widely used analysis techniques to
extract valuable insights from data.
• Association rule learning
• Classification tree analysis
• Genetic algorithms
• Machine learning
• Regression analysis
Association Rule Learning
Association rule learning is a method for discovering interesting correlations between variables in large databases. It was first used by major supermarket chains to discover interesting relations between products, using data from supermarket point-of-sale (POS) systems.
“Are people who purchase tea more or less likely to purchase carbonated drinks?”
Association Rule Learning
Association rule learning is used to:
• place (correlated) products in better proximity to each other in order to increase sales
• determine data quality in accounting
• help in investment planning
• monitor system logs to detect intruders and malicious activity
• provide insight in revenue analysis
Association coding concepts
“The Apriori Algorithm, used for the first phase of the Association Rules, is the most popular and classical algorithm in the frequent old parts. These algorithm properties and data are evaluated with Boolean Association Rules. In this algorithm, there are product clusters that pass frequently, and then strong relationships between these products and other products are sought.”
The three main parameters used to identify the strength of a rule are support, confidence and lift.
Activity 2: Python in Colab
• Make sure you have access (signed in) to Colab https://colab.research.google.com
• Click on the ‘File’ menu and select ‘New notebook’
We have grocery store data for you to analyse
• The code is given below. All you have to do is click on the arrows and run the code
• NOTE: you don’t need to run the interpretation text at the end; it is just to help you interpret the results
• https://colab.research.google.com/drive/1Qg0qokW_oDUI6xU8gvmZeV6AiMo6bhxu?usp=sharing
• We start by getting you to upload the GroceryStoreDataSet.csv file from MyKBS (you will be prompted to choose the data file from where it is stored on your device)
Activity 2: Output
Interpretation
# The probability of seeing sugar sales is 30%.
# The probability of seeing bread sales is 65%.
# We can say that the support of both of them together is 20%.
# 67% of those who buy sugar buy bread as well.
# Users who buy sugar will likely consume 3% more bread than users who don't buy sugar.
# Their correlation with each other is seen as 1.05.
# As a result, if items X and Y are bought together more frequently, then several steps can be taken to increase the profit.
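The figures in the interpretation can be reproduced from the three support values using the usual textbook definitions of the rule metrics. This is a sketch under that assumption, not the notebook's own code; under these definitions, the 1.05 figure appears to match the metric usually called conviction.

```python
# Support values quoted in the interpretation for the rule "sugar -> bread".
support_sugar = 0.30  # share of baskets containing sugar
support_bread = 0.65  # share of baskets containing bread
support_both = 0.20   # share of baskets containing both

# Confidence: of the baskets with sugar, what share also contain bread?
confidence = support_both / support_sugar

# Lift: how much more likely bread is when sugar is present than it is overall.
lift = confidence / support_bread

# Conviction: how much more often the rule would fail if the items were independent.
conviction = (1 - support_bread) / (1 - confidence)

print(f"confidence = {confidence:.2f}")
print(f"lift       = {lift:.2f}")
print(f"conviction = {conviction:.2f}")
```

A lift of about 1.03 means bread is roughly 3% more likely to appear in a basket that contains sugar than in a basket chosen at random, which is a weak association.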
Glossary 1: What are bonds and mortgage-backed securities (MBS)?
• Securitisation is about pooling debt (such as mortgages) and selling their cash flows, as securities, to third party investors
• A bond is a fixed income security that provides a return in the form of fixed interest payments made at regular intervals over time
• A mortgage-backed security (MBS) is an investment similar to a bond. An MBS consists of a bundle of loans sold to investors.
• The bundles are rated between AAA (best, debts most likely to be paid back) through to “not rated” (worst)
• The bank effectively becomes an intermediary between a person with a mortgage and investors. See next slide
Risk Ratings
Can machine learning help classify items for investment?
Classification Tree Analysis
YES! Classification, a machine learning method, can be used to classify debt
• Statistical classification is a method of identifying the category that a new observation belongs to. It requires a training set of correctly identified observations – historical data, in other words.
• Classifying customers correctly will maximise sales and minimise expenses (cost of acquisition, discounts, bad debt etc).
“Are these mortgages investment grade or sub-prime?”
AAA BBB D
Classification Tree Analysis
Statistical classification is also being used to:
• automatically assign financial documents to categories;
• categorize customers into groupings (e.g. insurance);
• classify transactions
Activity 3: Decision Trees
• Decision trees that classify items into categories are called “classification trees”
• Decision trees that predict numerical values are called “regression trees”
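A classification tree is built by repeatedly choosing the split that best separates the categories. The sketch below shows a single split chosen by Gini impurity, one common splitting criterion; the loan-to-value ratios and default flags are made-up data for illustration only.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 0 means the group is pure."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # share of 1s in the group
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split 'value <= threshold'."""
    best = None
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if not left or not right:
            continue  # a split must put observations on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical training data: loan-to-value ratio and whether the loan defaulted (1 = yes).
loan_to_value = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
defaulted = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(loan_to_value, defaulted)
print(f"split at loan-to-value <= {threshold} (weighted Gini impurity {impurity:.2f})")
```

A full tree applies the same search again inside each branch; stopping after one or a few levels is a simple way to limit overfitting.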
Watch the video at https://www.youtube.com/watch?v=zs6yHVtxyv8
Form groups,
• Suppose that you are an analyst at the tax office. You wish to identify which of your clients are most likely to avoid lodging a tax return form and thus avoid paying tax (or even recouping funds after paying too much tax)
1. Discuss the idea of using a classification tree for this purpose
2. How would you limit so-called “overfitting”?
3. What kind of data would you collect for the classification tree?
Genetic Algorithms
Genetic algorithms are inspired by the way evolution works – that is, through mechanisms such as inheritance, mutation and natural selection.
These mechanisms are used to “evolve” useful solutions to problems that require optimization.
“Which TV programs should we offer viewers, and in what time slot, to maximize viewership?”
Genetic Algorithms
• A biology-inspired algorithm which reflects natural selection (the fittest individuals survive)
• Technically an optimisation method
• It has three main rules, applied in a repeating cycle of selection, crossover, mutation and evaluation:
1. “Selection rules select the individuals, called parents, that contribute to the population at the next generation.”
2. Crossover rules represent reproduction, i.e. combining two parents to form children.
3. Mutation rules apply random changes to individual parents to create genetic diversity in children.
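The three rules can be seen working together in a toy example. The sketch below evolves a string of 0s and 1s towards all 1s; the fitness function, tournament selection, single-point crossover and parameter values are all illustrative choices rather than part of the lecture material, and mutation is applied to each child after crossover.

```python
import random

def fitness(individual):
    """Toy objective: the number of 1s in the bit string."""
    return sum(individual)

def run_ga(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Evaluation: rank the current population by fitness.
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best individual forward
        while len(next_gen) < pop_size:
            # Selection: each parent is the fittest of three random individuals.
            parent1 = max(rng.sample(population, 3), key=fitness)
            parent2 = max(rng.sample(population, 3), key=fitness)
            # Crossover: join the head of one parent to the tail of the other.
            cut = rng.randint(1, n_bits - 1)
            child = parent1[:cut] + parent2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), best_history

best, history = run_ga()
print(f"best fitness by generation: {history}")
print(f"final best individual: {best}")
```

In a finance setting the bit string might instead encode which assets to hold or which trading rules to switch on, with fitness measured by back-tested performance.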
Genetic Algorithms
Genetic algorithms are being used in:
• Finance:
– Algorithmic trading
– Financial statement fraud
• Accounting:
– Distribution problems assigning sources to destinations
– Bankruptcy predictions
• Economics:
– The cobweb model, which explains why prices may fluctuate in certain markets
http://brainz.org/15-real-world-applications-genetic-algorithms/
Activity 4: Genetic Algorithms
• Here is a video with real-world examples of genetic algorithms.
Watch the video at https://www.youtube.com/watch?v=ziMHaGQJuSI
Form groups and answer the following:
Q1. What issues do genetic algorithms appear to have at the start?
Q2. What are the three rules used here?
Q3. What applications are shown here?
Q4. How could this be used in accounting and finance?
Machine Learning
Machine learning includes software that can ‘learn’ from data and generate adaptive solutions. It gives computers the ability to compute solutions without being explicitly programmed along a strict instruction set.
Applications are primarily focused on making predictions based on known properties learned from sets of ‘training data’.
“What other products would this customer likely purchase, based on their transaction history?”
Workflow: Extract, Transform, Test, Validate
Machine Learning
Machine learning is being used to:
• distinguish between spam and non-spam email messages;
• learn invoice coding behaviours for allocation purposes
• determine the best content for engaging prospective customers;
• run AI chatbots for customer enquiries
Activity 5: Customer churn example
Source: https://www.kaggle.com/kmalit/bank-customer-churn-prediction
• Watch the demo by your teacher or run the code for analysis of customer churn at https://colab.research.google.com/drive/1Sgro8G9o2UtErsiEMG-UOe7yS-JQMqUU?usp=sharing
• Data for this script is Churn_Modelling.csv
• NOTE: This is a part of a project on Kaggle, so we took a small section of it to give you an appreciation of this technique
• Interpret your findings. For example, regarding churn, is there any difference depending on the country of origin of customers, gender, ownership of a credit card or whether or not a member is active?
Regression Analysis
• Regression analysis involves examining how one or more independent variables (e.g. number of customers) influence a dependent variable (e.g. weekly sales).
• The dependent variable is also called a target variable
• The independent variable is also called a predictor variable
“How would social, biological, demographic and lifestyle factors affect health insurance premiums?”
Simple linear regression equation for estimating values
$$\hat{Y}_i = b_0 + b_1 X_i$$
where $\hat{Y}_i$ is the estimated (or predicted) Y value for observation i and $X_i$ is the value of X for observation i
• Example: $\widehat{\text{Weekly sales}} = 98.248 + 0.110 \times \text{Number of customers}$
• Weekly sales is the target variable
• Number of customers is a predictor variable
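The intercept and slope of a simple linear regression come from the least-squares formulas $b_1 = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}$ and $b_0 = \bar{y} - b_1\bar{x}$. The sketch below implements them and then applies the weekly sales equation from the example; the input of 1,000 customers and the small data set used to check the fit are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def predict(b0, b1, x):
    """Estimated Y value for a given X."""
    return b0 + b1 * x

# Check the fit on a tiny hypothetical data set that lies exactly on y = 1 + 2x.
b0, b1 = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(f"fitted line: y = {b0:.1f} + {b1:.1f} x")

# Apply the weekly sales equation to a hypothetical 1,000 customers.
estimated_sales = predict(98.248, 0.110, 1000)
print(f"estimated weekly sales for 1000 customers = {estimated_sales:.3f}")
```

The slope of 0.110 means that each additional customer is associated with an estimated increase of 0.110 in weekly sales, in whatever units the sales data use.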
• In economics:
– Demand curves
– Predicting economic growth rate
• In finance:
– Forecasting, e.g. revenues from ads
– Bank performance given multiple variables
– How levels of customer satisfaction affect customer loyalty
• In accounting:
– Estimating fixed and variable costs
– Cost versus hours worked
The fields Indiv Stock and Market appear highly correlated.
Other types of regression
• Polynomial regression
• 3-D regression movie
https://devopedia.org/types-of-regression
Activity 6: Salary regression model
• We will look at a simple model of how salary is related to years of work experience.
• Data for this activity in Exploratory is Salary_Data.csv
• Open Exploratory and create a new project called Salary analysis
• Use the Data Frames menu to load the Salary_Data.csv file and save it
• The Summary in Exploratory shows the distribution of the two variables
• Click on the Analytics menu (in green)
• Go to the model ‘Type’ menu
• Choose ‘Linear regression’ as the type of model you want
• Choose ‘Salary’ as the target variable
• Choose ‘YearsExperience’ as the predictor variable and run
• Interpret the output in a general sense
• Click on ‘Coef. Table’ to see the values of the coefficients for the regression equation
• The equation is
$\widehat{\text{Salary}} = 25{,}792 + 9{,}449 \times \text{YearsExperience}$
• You can make estimates from this by substituting numbers for years of experience, e.g. 5 years of experience gives you an estimate of
$\widehat{\text{Salary}} = 25{,}792 + 9{,}449 \times 5 = \$73{,}037$
• You will learn more detail on this in week 9 of STAM4000
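The substitution step can also be done in Python. This short sketch hard-codes the two coefficients from the salary equation above; the function name is an illustrative choice.

```python
def estimate_salary(years_experience, intercept=25_792, slope=9_449):
    """Estimated salary ($) from the fitted equation: intercept + slope * years."""
    return intercept + slope * years_experience

for years in (0, 2, 5, 10):
    print(f"{years:>2} years of experience: estimated salary = ${estimate_salary(years):,}")
```

Estimates are most reliable for values of experience within the range covered by the data; extrapolating far beyond that range assumes the straight-line relationship continues to hold.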
Create a slide deck which represents a portfolio of analytics methods used in accounting, economics or finance. This task is to be done as an individual. 16 slides, total 30 marks.
Assessment Description
You will discuss the five analytics methods below and a financial, accounting or economics application for each one.
· Association rule learning
· Classification tree analysis
· Genetic algorithms
· Machine learning
· Regression analysis
• Out of the five methods, investigate one in more detail.
• Reflect on the limitations of the methods and possible ethical, legal or privacy issues.
Please refer to the assessment marking guide to assist you in completing all the assessment criteria.
Slide format should be as follows:
• Title, student name and ID [1 slide]
• Discuss any 4 analytics methods from above. Create one slide for each analytics method and one for its application in accounting or finance or economics. [8 slides, 16 marks]
• Discuss the remaining analytics method in detail and create three slides for the analytics method and one slide for its application in accounting or economics or finance [4 slides, 8 marks]
• Reflect and list the limitations of the 5 analytics methods [1 slide, 2 marks]
• Discuss in short sentences possible ethical, legal and privacy issues. Please refer to the week 11 lecture slides. [2 slides, 4 marks]