UWS Guest Lecture
9 August 2014
“Wake me up tomorrow has already happened!”
Big Data and Foresight
1. Foresight and Transdisciplinarity
2. Social media and predictive capabilities
3. New and innovative information sources
4. Internet of Things
5. Big Data Scenarios
6. How to use foresight as part of daily business operations
Concept of Transdisciplinarity
Tress, B., Tress, G., & Fry, G. (2006). Defining concepts and the process of knowledge production in integrative research. In B. Tress, G. Tress, G. Fry,
& P. Opdam (Eds.), From landscape research to landscape planning: Aspects of integration, education and application (pp. 13-26). Dordrecht:
Datafication 2: First National Study of Twitter Usage in Australia
Australians send an average of 234 million tweets per month and 5,000 tweets per minute, a new Twitter
study by advertising agency The Works has found. Aussie females are more likely to retweet than males
and most retweets occur on Mondays, according to the agency's 'datafication' research project. Douglas
Nicol, creative partner and director at The Works, said the study was designed to help marketers talk to
consumers more effectively. “There’s a lot of hype around social media. Using research from datafication,
we are able to equip Australian marketers with no-nonsense practical advice,” Nicol said. “This in turn will
help marketers appeal directly to an audience. We believe that in turn, this will boost the way people view
and talk about a brand or product online.”
Lovers, carers and jesters were identified as the top three archetypical personalities on Twitter.
According to the study, marketers can talk most effectively to lovers by being passionate, carers by being
gentle and jesters by being mischievous. “If you understand what drives the motivations behind Australians
you will be in a better position to connect with them,” Nicol said. Almost 11% of the Australian population
is on Twitter and of those users 46% are male and 54% are female.
The study also found that Sydney hosted the largest population of Twitter users while Hobart is
responsible for the most tweets per capita.
'Datafication', which was supported by the University of Technology Sydney (UTS), analysed the most
popular words used in Twitter over an eight week period to rank motivations and behaviours on the
Software created by Dr Suresh Sood, a social media expert at UTS, then analysed the data to produce
the insights into what individuals are doing on Twitter.
'Datafication' is set to launch as a real-time service for the agency’s clients early next year.
Datafication 3: First Australian Instagram Study Conducted
Analytic Insights from Millions of Images
• Sunday at 5pm is the peak usage for Instagram in Australia while on
weekdays 8pm is the most popular posting time
• The average Aussie Instagram user posts 2.3 times a week with around 10
posts being made a month
• Sydney, Brisbane and the Gold Coast are the ‘selfie’ capitals of Australia,
with more pictures of people taking photos of themselves posted than any
• In Melbourne images of food are the most popular Instagram subject, while
in Perth it’s portrait piccies and in Adelaide it’s more artistic shots.
• Brand recognition on Instagram is low. The most popular hashtag is
#instagood with more than 1.6 million references, however brands such as
McDonald’s, Nike and Holden have been hashtagged fewer than 15,000 times.
Instagram Deception (Suspects outside of -20 & +20)
Vine Deception (Suspects outside of -5 and +5)
The Newman Model of Deception (Pennebaker et al)
Key word categories for deception mapping:
1. Self words e.g. “I” and “me” – decrease when someone distances
themselves from content
2. Exclusive words e.g. “but” and “or” – decrease with fabricated
content owing to complexity of maintaining deception
3. Negative emotion words e.g. “hate” – increase in word usage owing
to shame or guilty feeling
4. Motion verbs e.g. “go” or “move” – increase as exclusive words go
down to keep the story on track
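The four key word categories above can be turned into simple per-text rates. The following is a minimal sketch only: the word lists are tiny illustrative stand-ins (an assumption, not the dictionaries used in the model or in the Datafication software), and `category_rates` is a hypothetical helper name. It reports the share of words in each category; it does not produce the deception scores shown in the Instagram and Vine charts.

```python
import re

# Illustrative word lists only (assumption): real analyses use much
# larger, validated dictionaries for each category.
CATEGORIES = {
    "self": {"i", "me", "my", "mine", "myself"},
    "exclusive": {"but", "or", "except", "without"},
    "negative_emotion": {"hate", "angry", "sad", "guilty"},
    "motion": {"go", "move", "walk", "run"},
}


def category_rates(text):
    """Return the fraction of words in `text` that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in CATEGORIES}
    return {
        name: sum(word in vocab for word in words) / total
        for name, vocab in CATEGORIES.items()
    }


# Example: compare the rates of two posts by the same account.
first_post = category_rates("I hate to go but I go")
second_post = category_rates("We move on and go without looking back")
```

Under the model, a drop in the "self" and "exclusive" rates together with a rise in the "negative_emotion" and "motion" rates would be the pattern of interest.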
Ideas concerning thinking between Australia and Chinese citizens (1st
• Sina weibo (132,555,895)
• QQ weibo (3,721,300)
• Taisha BBS (1,228,967)
In total: 137,506,162
• Tigtag (21,755,909)
• Oursteps (14,568,879)
• FreeOZ (4,718,210)
In total: 47,678,151
The numbers are the Australia-related posts found in each site.
Twitter and Marketing Predictions
• Tweets are “found data” without asking questions
• More meaning than typical search engine query
• Large numbers of passive participants in natural settings
• Twitter can predict the stock market (Lisa Grossman, Wired, Oct 19 2010)
• Predict movie success in first few weekends of release
– “…it also raises an interesting new question for advertisers and marketing
executives. Can they change the demand for their film, product or service by
directly influencing the rate at which people tweet about it? In other words,
can they change the future that tweeters predict?”
Tech Review, http://www.technologyreview.com/blog/arxiv/25000/
Detecting flu trends using search engine query data (intentionality)
• Environmental sensors
• Health (personal)
• Images
• Particle accelerator
• Satellite
• Scanned survey data
• Social media
• The data collected by SKA in a single day would take nearly two million years to play back on an MP3 player.
• The SKA central computer has processing power of about one hundred million PCs.
• The SKA will use enough optical fiber linking up all the radio telescopes to wrap twice around the Earth.
• The dishes of the SKA when fully operational will produce 10 times the global internet traffic as of 2013.
• The aperture arrays in the SKA could produce more than 100 times the global internet traffic as of 2013.
• The SKA will generate enough raw data to fill 15 million 64 GB MP3 players every day.
• The SKA supercomputer will perform 10^18 operations per second - equivalent to the number of stars in
three million Milky Way galaxies - in order to process all the data that the SKA will produce.
• The SKA will be so sensitive that it will be able to detect an airport radar on a planet 50 light years away.
• The SKA will contain thousands of antennas with a combined collecting area of about one square
kilometer (that's 1,000,000 square meters).
• Previous mapping of Centaurus A galaxy took a team 12,000 hours of observations and several years.
SKA ETA 5 minutes!
To the scientists involved, however, the SKA is no testbed, it’s a transformative instrument
which, according to Luijten, will lead to “fundamental discoveries of how life and planets and
matter all came into existence. As a scientist, this is a once in a lifetime opportunity.”
Sources: http://bit.ly/amazin-facts & http://bit.ly/astro-ska
Image credit: Ilana Feain, Tim Cornwell & Ron Ekers (CSIRO/ATNF). ATCA northern middle lobe pointing
courtesy R. Morganti (ASTRON), Parkes data courtesy N. Junkes (MPIfR).
The image has been created by Dr Ilana Feain and
her team using CSIRO’s Australia Telescope
Compact Array telescope near Narrabri in New
South Wales to observe the galaxy over several
With ASKAP these same observations will take just
five minutes. In its first six hours of operation, ASKAP
will generate more information than all previous
radio telescopes in the world combined.
Number of journeys made
Types of roads used
Time of travel
Levels of acceleration and braking
Any accidents which may occur
New Sources of Information (Big Data): Social Media + Internet of Things
Data Driven Innovations
The ANZ Heavy Traffic Index comprises flows
of vehicles weighing more than 3.5 tonnes
(primarily trucks) on 11 selected roads around
NZ. It is contemporaneous with GDP growth.
The ANZ Light Traffic Index is made up of light
or total traffic flows (primarily cars and
vans) on 10 selected roads around the
country. It gives a six month lead on GDP
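One way to check a lead of this kind is to correlate the index with GDP growth shifted forward in time and see which shift fits best. This is a minimal sketch, assuming quarterly observations (so a six month lead is two periods). The two series are synthetic placeholders, not ANZ or GDP data, and `pearson`, `lead_correlation` and `best_lead` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lead_correlation(indicator, target, lead):
    """Correlate indicator[t] with target[t + lead] for a non-negative lead."""
    if lead < 0 or lead >= len(indicator):
        raise ValueError("lead must be between 0 and len(indicator) - 1")
    if lead == 0:
        return pearson(indicator, target)
    return pearson(indicator[:-lead], target[lead:])


def best_lead(indicator, target, max_lead):
    """Return the lead (in periods) with the highest correlation."""
    return max(range(max_lead + 1),
               key=lambda k: lead_correlation(indicator, target, k))


# Synthetic placeholder series (assumption): the target simply repeats
# the indicator two periods later, so the best lead should be 2.
traffic_index = [1, 3, 2, 5, 4, 6, 5, 8, 7, 9, 8, 10]
gdp_growth = [0, 0] + traffic_index[:-2]
lead_in_quarters = best_lead(traffic_index, gdp_growth, max_lead=4)
```

With real data the peak correlation is rarely this clean, and a high correlation at some lead does not by itself establish that the indicator causes the later movement.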
Smartphone, Google Glass or Apple Watch will
Know What you Want before you do
“…from 2014 your phone [glasses or watch] will
anticipate your needs, do the research, tell you
what you want to know – sometimes
before the question even occurs to you…”
Chapman, Jake (2013), The Wired World in 2014
Useful References Informing our Thinking
on Mobility and Movement
Silva et al. (2013), A comparison of Foursquare and Instagram to the study of city
dynamics and urban social behavior, Proceedings of the 2nd ACM SIGKDD
International Workshop on Urban Computing
Instagram and Foursquare datasets might be compatible in finding popular regions of
Chaoming Song, et al. (2010), Limits of Predictability in Human Mobility, Science
There is a potential 93% average predictability in user mobility, an exceptionally high
value rooted in the inherent regularity of human behavior. Yet it is not the 93%
predictability that we find the most surprising. Rather, it is the lack of variability in
predictability across the population.
Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for
Pervasive Systems. Proceedings of the 9th International Conference on Pervasive
Daily and weekly routines => Few significant places every day => Regularity in human
activities => Regularity leads to predictability
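The chain from routine to predictability can be illustrated with a very simple frequency baseline: predict the place a person has visited most often in the same weekday and hour slot. This is a sketch under stated assumptions, not the NextPlace algorithm or any method from the papers cited here; the `(weekday, hour, place)` record layout and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_routine_model(visits):
    """Count visits per (weekday, hour) slot and overall.

    `visits` is an iterable of (weekday, hour, place) tuples, with
    weekday 0-6 and hour 0-23 (an assumed record layout).
    """
    by_slot = defaultdict(Counter)
    overall = Counter()
    for weekday, hour, place in visits:
        by_slot[(weekday, hour)][place] += 1
        overall[place] += 1
    return by_slot, overall


def predict_place(model, weekday, hour):
    """Return the most frequent place for the slot, falling back to the
    most frequent place overall, or None if there is no history."""
    by_slot, overall = model
    counts = by_slot.get((weekday, hour)) or overall
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Example history: a few significant places visited on a regular schedule.
history = [
    (0, 9, "office"), (0, 9, "office"), (0, 9, "cafe"),
    (0, 20, "home"), (1, 9, "office"),
]
model = build_routine_model(history)
monday_morning = predict_place(model, 0, 9)
```

The more regular the routine, the more often a baseline like this is right, which is the intuition behind the high predictability figures reported above.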
De Domenico, M., Lima, A. & Musolesi, M. (2012) Interdependence and Predictability of Human
Mobility and Social Interactions. Proceedings of the Nokia Mobile Data Challenge
we have shown that it is possible to exploit the correlation between movement data and
social interactions in order to improve the accuracy of forecasting of the future geographic
position of a user. In particular, mobility correlation, measured by means of mutual
information, and the presence of social ties can be used to improve movement forecasting
by exploiting mobility data of friends. Moreover, this correlation can be used as indicator of
potential existence of physical or distant social interactions and vice versa.
Sadilek, A and Krumm, J. (2012) Far Out: Predicting Long-Term Human Mobility
Where are you going to be 285 days from now at 2pm? …we show that it is possible to
predict location of a wide variety of hundreds of subjects even years into the future and
with high accuracy.
Roadmap to Recommender Tool
Find Preference and Behavior
pattern(including Trajectory pattern)
Recommend right product (or
service) to right person ( or
group) at right time and place
Manual Automatic Recommendation
MongoDB Mahout Recommender
• Points of Interest
• User profiles
• Image details
• Recommender engine
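The recommender engine step can be sketched with a small item co-occurrence recommender over points of interest. This is an illustration in plain Python under stated assumptions: it is not the Mahout API and not the system on the roadmap, and the user names, place names and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(user_items):
    """Count how often each pair of items appears in the same user's history."""
    co = {}
    for items in user_items.values():
        for a, b in combinations(sorted(set(items)), 2):
            co.setdefault(a, Counter())[b] += 1
            co.setdefault(b, Counter())[a] += 1
    return co


def recommend(user, user_items, co, top_n=3):
    """Rank unseen items by their co-occurrence with the user's own items."""
    seen = set(user_items.get(user, ()))
    scores = Counter()
    for item in seen:
        for other, count in co.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Sort by score (highest first), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical user profiles: points of interest each user has visited.
profiles = {
    "ann": ["opera_house", "bondi"],
    "bob": ["opera_house", "bondi", "taronga"],
    "cat": ["opera_house", "taronga"],
    "dan": ["opera_house"],
}
cooccurrence = build_cooccurrence(profiles)
suggestions_for_ann = recommend("ann", profiles, cooccurrence)
```

Adding time and place to the score, so that the right item reaches the right person at the right moment, would be a further step beyond this sketch.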
Internet of Things
Shelf Shelf Shelf
Supermarket control room
Smart Social Card System
Reader/Wifi Gateway and Active Card
Smart Sandbag System
Smart Sleeping System
Big Data Scenarios
• Brands - Luxury goods
Big Data and Broadcasting in Australia 2014
As an advertising data analyst working with broadcasters and content providers, I recognise that online
advertising alone is worth over $4 billion in 2014 (http://www.iabaustralia.com.au/). To help clients and my
own employer benefit from the change occurring, as well as to maintain increases in salary, I undertook the
MDSI. This degree gave me a brush-up of core analytical skills (mathematics and
statistics), technical skills (programming in the context of big data), the soft skill of storytelling to help fellow team
members understand actions stemming from the data analysis, and the innovation skills to explore new data sources
and find new solutions.
Looking at TV advertising alone, a major paradigm shift is under way. Since the 1950s, TV advertising has been
hardwired into the show. Hence, the entire audience of one million watching “Home and Away” got
to see the same advertisement. This is no longer the case: TV is becoming available everywhere. The
television has broken free from the box in the living room, with many Australians watching programs on
mobiles and tablets. TV programs are even available on demand at the whim of a viewer. This second-screen
or multi-platform phenomenon creates new opportunities, including an entirely new revenue stream from the
viewings happening outside of the TV. For example, sports fans in Australia receive real-time statistics
and scores for the Australian Open via mobile, generating more data for analysis (IBM 2014).
Predictions are possible, to the point of identifying a hit even before filming (House of Cards; Carr 2013). Other
data sources come from the digital players on the second screen, which record users and the times of
pausing, forwarding, rewinding, replaying or stopping the TV show. Complementing this viewer data with
the comments on TV shows and films in social media provides an understanding of viewer behavior during
and outside of the program schedule.
Using the variety of data, I am expected to help provide reporting and dashboards to support:
An increase in revenue through monetizing new views across devices
The development of new programs or movies
Web or mobile products around popular shows including recommender systems
Big Data and Insurance in Australia 2014
Since the launch of insurance box (Collett 2013; insurancebox.com.au) in Australia, a rethink of our existing
approach to insuring drivers is driving changes to our existing business structure and processes. Previously,
the actuaries provided the necessary quantitative skills. Today, the transforming business requires data
science for insurance capabilities across our entire enterprise. Some of this capability includes our existing
actuaries with additional training in machine learning. The necessity to move to a data-driven organization is
fostered by our data science graduates. The organization now relies on data and analysis rather than the previous gut
feel and intuition in all facets of business, including the growth of our new revenues via the website and
mobile from younger drivers adopting the “black box” recorder. Our data science team focuses on data
collection from the black boxes, analysis and reporting as well as ongoing deployment of new sources of
information enriching the business.
Totally new capability, not hitherto seen, is now available within the business. This includes A/B testing of the
online environment, trend analysis, segmentation, text mining as well as the important black box dashboard
visualisations of driving scores (http://insurancebox.com.au/tour). The business is now able to acquire data in close
to real time from our drivers, make predictions and start to look forward to creating new products entirely
tailored to the needs of our individual customers with a “pay as you go” business model. True driving
behavior as well as the actual overnight location of the car determines the insurance price, not the previous
statistical base of drivers and demographics.
Big Data and Education 2014
As a project manager for a local educational institution, I am driving our student dashboard
project. I attended the MDSI course to understand the project management of data science
intensive projects while mindful of privacy and the ability to innovate using new data
sources. The idea behind the project is to predict student challenges with regard to
academic performance and any behavioral challenges. The dashboard notifies lecturers and
tutors with a recommended course of action. Students receive similar recommendations.
The key aim of the dashboard is ensuring the student is able to perform at their best while
maximizing student retention through providing assistance to help with grades. The system
generates a predictive model from past performance, library usage, wi-fi hot spots around
campus and frequency of access of the learning management system.
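A predictive model of this kind could take many forms. As a minimal sketch, assuming a logistic regression (the scenario does not name a model type), the four inputs named above can be scaled to the range 0 to 1 and mapped to a probability that a student needs assistance. All data values are synthetic placeholders and the function names are hypothetical.

```python
from math import exp


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def train_logistic(rows, labels, epochs=2000, rate=0.1):
    """Fit logistic regression weights by stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            bias -= rate * err
            weights = [w - rate * err * v for w, v in zip(weights, x)]
    return weights, bias


def risk_score(model, x):
    """Probability that a student with features `x` needs assistance."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Synthetic placeholder rows (assumption), each scaled to 0..1:
# [past performance, library usage, wi-fi activity, LMS access frequency]
rows = [
    [0.2, 0.1, 0.1, 0.2], [0.3, 0.2, 0.1, 0.1], [0.1, 0.3, 0.2, 0.2],
    [0.8, 0.7, 0.9, 0.8], [0.9, 0.8, 0.7, 0.9], [0.7, 0.9, 0.8, 0.7],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = student later needed assistance
model = train_logistic(rows, labels)
new_student_risk = risk_score(model, [0.25, 0.2, 0.15, 0.2])
```

A dashboard could then notify lecturers and tutors when the score crosses a chosen threshold; the threshold and the recommended action are policy decisions outside this sketch.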
As project manager, I am continually on the lookout to integrate new information sources,
including social media providing a representation of the student behavior, as well as
ensuring the system output provides students with directly actionable information. The
system undergoes regular penetration security testing at each new release, ensuring the
privacy of student information.
Big Data and Luxury Goods 2014
Forecasting the sale of our luxury goods has often been an art rather than a science. As a marketing data
analyst, I support the sales and marketing team by honing forecasts to bring just enough stock from Italy to
satisfy demand in Australasia. Furthermore, the provision of accurate forecasting is complicated by
customers not just buying from bricks and mortar stores but online directly from our own Website and other
online properties. Online shopping revenue for 2013/14, according to IBISWorld, is AUD12.4 billion.
In 2016, I attended the MDSI course and enhanced my profile as a marketing data analyst/scientist. The
course was instrumental in providing me with a mix of mathematical and statistical skills as well as coding.
Innovation aspects of the course let me feel comfortable with introducing new approaches in the business.
The course is timely, as the company had been thinking about offshoring our data analysis capabilities work
to India and the company MuSigma. Instead, the expertise gained from MDSI allows me to develop and lead
a balanced team of marketing specialists and general data talent. The communication skills learnt from the
course form the basis of the shared language of data analysis amongst team members and executives.
Among the innovations I have put in place is the big data forecasting system. The MDSI course reinforced
the notion of finding new, innovative sources for making marketing discoveries. The idea is simple: use a
variety of information sources not previously considered to help improve our luxury goods forecasting. With
the team, we hit upon using indicators derived from Tasmanian oyster farms (Sense-T) and Twitter luxury goods
bloggers, and finding correlations with historical sales. This helps us forecast 6 months in advance with far
greater accuracy than previously available. The data is multi-structured and required using Hadoop to
acquire the large volumes of data. The approach learnt from MDSI worked to a tee by starting simple with
the forecasting problem, building the team to include mathematical/statistical specialists, training on not
just sandboxes but the actual cluster of 4 nodes for processing the data and documenting the results.
Big Data and Retailing in Australia 2014
Retailing is not only one of the world’s top ten industries, and in some countries (e.g. India) the largest, but
above all presents great opportunity for data science innovation as “the retail store experience is set to
change more in the next five years than it has over the past century” (McKinsey 2013). In Australia,
Woolworths and Wesfarmers are in the top 20 worldwide retailers (Deloitte 2013). Until recently, the
understanding of shopper behaviour in the Australian and global marketplaces was built atop volumes of
scanner data under the control of the major retailers, e.g. Wesfarmers and Woolworths, working in
conjunction with a handful of large research organisations (Nielsen and Ipsos). Hence, the research insights
on shopper behaviour are often not transparent and not readily available to members of the global retail
ecosystem, including brand owners.
As head of the retail analytics team with a major retailer, I see extracting value from the big
data available from the supply chain, point of sale, online web site and shoppers as paramount. The data
sources include store cameras to obtain shopper demographics, tracking behavior and heat mapping
around key store positions.
Today, as the head of retail analytics, we can utilise data mining techniques in conjunction with
multi-structured data inclusive of shopper CCTV videos, traffic counters and online click stream data of retailers using
social media to make pre-purchase buying decisions. The opportunity exists to generate predictions built on
highly transparent big data sets combining rich data from a variety of information sources. However, these
rich information sources are real-time signals providing very big multi-structured data streams not
traditionally manageable by retailers or research organisations. The retailer interest includes the availability
of predictions to not only help with minimising stock-outs but even help ensure correct staffing levels. The
McKinsey Global Institute Big Data report (2011) supports this thinking by recognising “…the use of large
datasets will continue to transform the face of retail…”.
Roadmap – Evolution from Existing Operations to Predictive
Making Foresight Relevant to Daily Operations!
Rigid Flexible Connected
What if conversations continue?
(Adapted from Solis, 2012 and Davenport 2007)
Freely share info and
Knowledge on internal basis
acting social with customers
2 –way communications
Connected internal and
External. Listening and
Learning. Internal and
Shared via hub and
Connected directly to
Agile, integrate customer
Experiences and feedback
Loops. Listening and
Learning now become
analyse and insights
Makes sense of data
And transforms into
Respond in Real time
Shift from reactive to
Proactive and predictive
Business uses social
media heavily and is
adaptive and predictive
in terms of customer
needs and new
scenarios before they
opportunity and limit risk
How can we lead conversations?
What conversations are next?
Why are these conversations occurring?
What actions are required?
What is the sentiment of conversations?
When and where are conversations taking place?
What conversations are taking place?
The future is impossible to predict. However, one
thing is certain:
The company that can excite its customers’
dreams is out ahead in the race to business
Selling Dreams, Gian Luigi Longinotti