2. Areas for Discussion
1.) Data Driving Trends - Big Data & Machine Engineering
2.) Building Data Centric Business
3.) Future of Professions
4.) Talent Scarcity
5.) Democratisation of Data Science
5. 2020 Global Data Forecast (Bytes)
2020 estimates suggest four times more digital data than all the grains of sand on Earth
Source: Pg. 4, Building a Digital Analytics Organization: Create Value by Integrating Analytical Processes,
Technology, and People into Business Operations by Judah Phillips, FT Press, 30 Jul 2013
6. Data Origination Trends
1) Mobile = multiple sensors comprising camera, microphone, GPS, accelerometer
2) Sensors everywhere including "eyes in the sky" via drones, satellites and roads
3) Online customer interactions generate IP addresses, time, geocode, page visits
4) Large scale data curation e.g. Airbnb, Google (Art, Street View, Ngram, GDELT**),
Guardian, Million Songs, OpenCalifornia, Openflights, OpenStreetMap,
Planethunters, Pandora, Shazam, Wikipedia
5) Data fusion e.g. LA Times Homicide Blog using coroner reports
6) Reviews e.g. Tripadvisor, Amazon, Yelp
7) Open Data e.g. data.gov and OData protocol (http://www.odata.org/)
**The GDELT Project pushes the boundaries of "big data," weighing in at over a quarter-billion rows with
59 fields for each record, spanning the geography of the entire planet, and covering a time horizon of
more than 35 years. GDELT is the largest open-access database on human society in existence. Its
archives contain nearly 400M latitude/longitude geographic coordinates spanning over 12,900 days,
making it one of the largest open-access spatio-temporal datasets as well.
8. Black Box Insurance
⢠Big data transforms actuarial insurance from using probability methods to estimate premiums into dynamic risk management
using real data generating individually tailored premiums
⢠Estimate 20 km work or home journey, data point acquired every min and journey captures 12 points per km. Assume 1000 km
per month driving or generating 12,000 points per month resulting in 144,000 points per car/annum. Hence, 1,000 cars leads to
144 million points per annum.
⢠Telematics technology (black box) monitor helps assess the driving behavior and prices policy based on true driver centric
premiums by capturing:
â Number of journeys
â Distances travelled
â Types of roads
â Speed
â Time of travel
â Acceleration and braking
â Any accidents
â Location ?
⢠Benefits low mileage, smooth and safe drivers
⢠Privacy vs. Saving monies on insurance (Canada ; http://bit.ly/Black_box)
9. ANZ TRUCKOMETER
The ANZ Heavy Traffic Index comprises flows of vehicles weighing more than 3.5 tonnes (primarily trucks) on 11 selected roads around NZ. It is contemporaneous with GDP growth.
The ANZ Light Traffic Index is made up of light or total traffic flows (primarily cars and vans) on 10 selected roads around the country. It gives a six month lead on GDP growth in normal circumstances (but cannot predict sudden adverse events such as the Global Financial Crisis).
http://www.anz.co.nz/about-us/economic-markets-research/truckometer/
10. http://tacocopter.com/
New Sources of Information (Big data): Social Media + Internet of Things
[Figure: "Innovations" and "Gridded Data Sources" panels; unlabelled values 7,919, 40,204, 2,003,254,102 and 51]
11. Variety of Data Types & Big Data Challenge
1. Astronomical
2. Documents
3. Earthquake
4. Email
5. Environmental sensors
6. Fingerprints
7. Health (personal) Images
8. Graph data (social network)
9. Location
10. Marine
11. Particle accelerator
12. Satellite
13. Scanned survey data
14. Sound
15. Text
16. Transactions
17. Video
Big Data consists of extensive datasets, primarily in the characteristics of
volume, variety, velocity, and/or variability, that require a scalable
architecture for efficient storage, manipulation, and analysis.
Computational portability is the movement of the computation to the location of the data.
12. Hadoop Configurations (Single and Multi-Rack)
Adapted from: http://stackiq.com/
Cluster manager e.g. Apache Ambari, Apache Mesos, or Rocks
With 3 TB drives and 18 data nodes, the configuration represents 648 TB
of raw storage. At the HDFS standard replication factor of 3, this gives
216 TB of usable storage.
Name/secondary/data nodes: 6 core, 96 GB
Management node: 4 core, 16 GB
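The storage figures on this slide follow from simple multiplication and division. The sketch below reproduces them; note that the slide does not state the number of drives per node, so the value of 12 is an assumption inferred from 648 TB / (18 nodes x 3 TB). The function name is illustrative, and the calculation ignores space reserved for the operating system or temporary data.

```python
def hdfs_capacity_tb(data_nodes, drives_per_node, drive_tb, replication=3):
    """Return (raw, usable) storage in TB for a simple HDFS cluster.

    Usable storage is raw storage divided by the replication factor,
    because HDFS keeps `replication` copies of every block.
    """
    raw = data_nodes * drives_per_node * drive_tb
    usable = raw / replication
    return raw, usable


# drives_per_node=12 is inferred from the slide's totals, not stated on it.
raw_tb, usable_tb = hdfs_capacity_tb(data_nodes=18, drives_per_node=12, drive_tb=3)

print(f"Raw storage:    {raw_tb} TB")
print(f"Usable storage: {usable_tb:.0f} TB")
```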
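13. Traditional Computing Paradigm & Machine Engineering
[Diagram: in traditional computing, Data and Program go into the Computer, which produces Output; in machine learning, Data and Output go into the Computer, which produces Algorithms]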
Machine learning is a scientific discipline that deals with the construction
and study of algorithms that can learn from data. Such algorithms operate
by building a model based on inputs and using that to make predictions or
decisions, rather than following only explicitly programmed instructions.
http://en.wikipedia.org/wiki/Machine_learning
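The contrast between explicitly programmed instructions and a model built from inputs can be shown with a toy example. In the sketch below, `programmed_rule` is a rule written by hand, while `fit_line` recovers the same rule from example data using ordinary least squares. The data, the rule y = 2x + 1 and all names are invented for illustration; real machine learning work would normally use a library rather than hand-written fitting code.

```python
def programmed_rule(x):
    """Traditional computing: the programmer writes the rule explicitly."""
    return 2 * x + 1


def fit_line(xs, ys):
    """Machine learning: estimate slope and intercept from examples
    using ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Example inputs and observed outputs (illustrative data only).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)


def learned_model(x):
    """Prediction from the fitted model rather than a hand-written rule."""
    return slope * x + intercept


print(f"Learned rule: y = {slope:.1f} * x + {intercept:.1f}")
print(f"Prediction for x = 10: {learned_model(10):.1f}")
```

The learned model makes predictions for inputs it has never seen, which is the sense in which such algorithms "learn from data".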
14. 8 Steps Towards Building the Data Centric Business
1. Put digital service (Vargo & Lusch) at the centre of the business, blurring the distinction with
physical products via sensors and apps
2. Identify data and monetisation opportunities using the business model canvas
3. Select unique sources of data to help drive innovation
4. Use data to drive interactions and customer experiences
5. Understand the data lifecycle from creation to storage
6. Extract value from data (economic or social)
7. Review patterns of big data businesses
8. Get on top of big data technology trends and analytics software
15. Netflix - A Picture of a Data Driven Company
• ~75 million users
• 8.5 million events per second
• Zero loss?
• 550 billion events per day
• Hundreds of event types
• 1.3 PB/day
• 21 GB/sec (peak)
• 37% of peak US internet bandwidth
• Operates on Amazon Web Services
Source: http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html
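The daily totals on this slide can be turned into average rates for comparison with the per-second figures. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and an evenly spread 24-hour day; both are simplifying assumptions, and the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the slide; decimal units are an assumption.
events_per_day = 550e9
bytes_per_day = 1.3e15          # 1.3 PB/day
quoted_events_per_sec = 8.5e6   # slide does not say whether this is average or peak
quoted_peak_gb_per_sec = 21

avg_events_per_sec = events_per_day / SECONDS_PER_DAY
avg_gb_per_sec = bytes_per_day / SECONDS_PER_DAY / 1e9
avg_bytes_per_event = bytes_per_day / events_per_day

print(f"Average events per second: {avg_events_per_sec:,.0f}")
print(f"Average throughput:        {avg_gb_per_sec:.1f} GB/sec")
print(f"Average event size:        {avg_bytes_per_event:,.0f} bytes")
```

On these assumptions the daily average is roughly 6.4 million events per second and about 15 GB/sec, both below the per-second figures quoted on the slide, which is consistent with those being peak-hour values.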
16. Square Kilometre Array
• Next generation radio telescope
• 100 x more sensitive & 1,000,000 x faster
• 5 square km of dish over 3,000 km
• Two sites: Western Australia & Karoo Desert, RSA
• World's most ambitious IT project
• First real exascale ready application
• Largest global big-data challenge
• SKA SDP exascale systems:
• 100,000 nodes
• 800 cabinets
• Consume 20 MW
• Expected failure rates of 300 nodes per week
http://www.ska.gov.au/
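The SKA SDP figures above imply some per-node numbers that help convey the scale. The sketch below derives them from the slide's totals only; it assumes nodes are spread evenly across cabinets and that the 20 MW is shared evenly across nodes, neither of which the slide states.

```python
# Totals from the slide; the even-distribution assumptions are ours.
nodes = 100_000
cabinets = 800
power_watts = 20e6            # 20 MW
failures_per_week = 300

nodes_per_cabinet = nodes / cabinets
watts_per_node = power_watts / nodes
weekly_failure_fraction = failures_per_week / nodes
failures_per_day = failures_per_week / 7

print(f"Nodes per cabinet:        {nodes_per_cabinet:.0f}")
print(f"Power per node:           {watts_per_node:.0f} W")
print(f"Nodes failing per week:   {weekly_failure_fraction:.1%}")
print(f"Node failures per day:    {failures_per_day:.0f}")
```

Even a weekly failure rate well under one percent means dozens of node failures every day, so the software has to treat hardware failure as routine.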
17. The Future of the Professions
(Susskind & Susskind 2015)
- Tax and audit work replaced by computer assisted techniques
- Technology automating and innovating
- Accounting work reconfiguring
- New business models
- Move from bespoke to "off the peg"
- Mastery of data with new tools and techniques - Big Data
- Diversification
- Shift to proactivity from reactivity
- Professionals replaced by less expert people and high performing systems
- Post-professional society expertise available online
18. The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard Susskind and Daniel Susskind
(2015)
19. Adapted From:
The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard Susskind and Daniel Susskind
(2015)
A HEALTH KNOWLEDGE COMMONS
New knowledge used differently: people managing their own health information, personalising their care and creating new kinds of health knowledge
https://www.nesta.org.uk/sites/default/files/the-nhs-in-2030.pdf
20. The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard Susskind and Daniel
Susskind (2015)
21. We're sitting on a big data time bomb
Catastrophic loss of transparency. Few IT professionals
have experience managing big data platforms at scale,
a situation that has created a massive skills
shortage in the industry. By 2018, U.S. companies will
be short 1.5 million managers able to make data-
based decisions. A recent McKinsey Quarterly report
estimates that, in order to close this gap, companies
would need to spend 50 percent of their data and
analytics budget on training frontline managers; it also
notes that few companies realize this need.
Source: CAMERON SIM, CREWSPARK, OCTOBER 24, 2015
http://venturebeat.com/2015/10/24/were-sitting-on-a-big-data-time-bomb/
22. Australia/NZ needs "30,000 data savvy managers by 2018"
• This statement derives from the McKinsey (2011) study, which identified "a shortage of talent necessary for organizations
to take advantage of big data. By 2018, the United States requires a talent pool of 140,000-190,000 people with deep
analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big
data to make effective decisions".
• Taking 2% of the US economy as a rule of thumb, in 2018 Australia will require another 30,000 managers
or analysts. However, the shortage commences well before 2018. These numbers do not accommodate
the training of managers or analysts for overseas destinations.
• Another 2011 study from EMC Corporation interviewed nearly 500 data science and business intelligence
professionals globally. Two-thirds of the informants believe demand will outpace supply and 30% from
disciplines outside of computer science. Additionally, the study found the biggest obstacle to data science
to be education and training.
• A 2012 study, "Data Equity: Unlocking the value of big data", commissioned by SAS UK and conducted by
the Centre for Economics and Business Research, an independent business research consultancy, found that
unlocking big data would add another 58,000 jobs to the UK economy (2012-2017).
• Gartner (2012) estimates that by 2015, 4.4 million IT jobs will be created globally to support big data, including 1.9
million jobs in the United States alone.
• Closer to home, the 2013 Hudson study "Tackling the Big Data Challenge" found that 78% of the Australian
research informants "believe organisations do not have the skills and competencies to successfully
undertake a big data project".
• Building on the McKinsey (2011), Gartner (2012) and Hudson (2013) estimations, Australia and the world
require 3 distinct but related skills. Most specifically, the demand is very strong for data savvy managers
conversant with big data practice.
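The 30,000 figure in the slide title comes from applying the 2% rule of thumb to the McKinsey estimate for the United States. The sketch below reproduces that step. Applying the same 2% share to the deep analytical skills range is our own illustration and is not a figure given on the slide; all names are hypothetical.

```python
# McKinsey (2011) US estimates as quoted on the slide.
US_MANAGERS_AND_ANALYSTS = 1_500_000
US_DEEP_ANALYTICAL_RANGE = (140_000, 190_000)

# Rule of thumb from the slide: Australia is about 2% of the US economy.
AUSTRALIA_SHARE = 0.02


def scale_to_australia(us_figure, share=AUSTRALIA_SHARE):
    """Scale a US headcount estimate by Australia's assumed economic share."""
    return round(us_figure * share)


managers_needed = scale_to_australia(US_MANAGERS_AND_ANALYSTS)

# Illustration only: the slide does not quote these two numbers.
deep_low, deep_high = (scale_to_australia(n) for n in US_DEEP_ANALYTICAL_RANGE)

print(f"Data savvy managers/analysts: {managers_needed:,}")
print(f"Deep analytical skills:       {deep_low:,} to {deep_high:,}")
```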
24. India's high demand for big data workers contrasts
with scarcity of skilled talent
The talent deficit is on two fronts, said Velamakanni: data
scientists who can perform analytics, and analytics consultants
who can understand and use the data. The first, big data
engineers and scientists, are extremely scarce. "In the second
category, we need better quality, and India is going to be short
of a million data consultants soon," he said.
Source: India's high demand for big data workers contrasts with scarcity of
skilled talent, Saritha Rai, June 2014,
http://www.techrepublic.com/article/indias-high-demand-for-big-data-workers-contrasts-with-scarcity-of-skilled-talent/
25. Google Trends Worldwide, Australia and New Zealand - Accounting + Analytics
January 2004 - September 2015
[Charts: Worldwide; Australia; New Zealand]
26. "The Predictive Accountant" and Data Centric Practice
1. Data savvy
3. Focus shifts from being reactive to proactive and predictive
4. Leverages accounting data and predictive analytics software to find patterns in data and
insights
5. Uses the tools and dashboards to predict client scenarios ahead of time: maximising
opportunity, limiting risks and proactively advising.
6. Accountants benefit from analytics by adding value when connecting client
challenges and opportunities to identified customer patterns. Sharing these insights
delivers more value in the accounting conversations and helps tackle the real business
problems facing clients.
27. The Predictive Accountant Portal: Democratisation of Data Science
The Predictive Accountant Data Sources
Predictive Analytics: Excel style dashboard
Connected Practice: Digital Marketing / eNewsletters / Integrated business tools software
Apps Marketplace: Accounting Analytic Apps
Education: Analytic Training
28. "I had come to an entirely erroneous
conclusion which shows, my dear
Watson, how dangerous it is to
reason from insufficient data."
The Adventure of the Speckled Band