Bigdataforesight
Guest lecture UWS Big Data & Foresight

© All Rights Reserved

Bigdataforesight Presentation Transcript

  • UWS Guest Lecture, suresh.sood@uts.edu.au or linkedin.com/in/sureshsood, 9 August 2014. “Wake me up tomorrow has already happened!” Big Data and Foresight
  • Topic Areas 1. Foresight and Transdisciplinarity 2. Social media and predictive capabilities 3. New and innovative information sources 4. Internet of Things 5. Big Data Scenarios 6. How to use foresight as part of daily business operations
  • Concept of Transdisciplinarity: Tress, B., Tress, G. & Fry, G. (2006). Defining concepts and the process of knowledge production in integrative research. In B. Tress, G. Tress, G. Fry & P. Opdam (Eds.), From landscape research to landscape planning: Aspects of integration, education and application (pp. 13-26). Dordrecht: Springer. https://library.wur.nl/ojs/index.php/frontis/article/view/1096/667
  • 100 Years of Foresight Writing
  • PESTEL Analysis • Narrative & Story Scenarios • Trend Spotting/Ethnography • Forecasting/Econometrics • Historical Analysis • Patent Analysis & Tracking
  • Future of Higher Education in Australia, S. Welsman, Australian, 11 Jan, 2006 • Collapse of Higher Education 2008: Regulatory Change, Technology, Student needs, Global Competition, Demographics • 2008 - 2016 Variety of: Providers, Alliances, Course options, Content, Delivery modes
  • Global Strategic Trends out to 2045 UK Ministry of Defence, 15 July 2014 https://www.gov.uk/government/publications/global-strategic-trends-out-to-2045
  • Social CRM integrates social data
  • Datafication 2: First National Study of Twitter Usage in Australia. Australians send an average of 234 million tweets per month and 5,000 tweets per minute, a new Twitter study by advertising agency The Works has found. Aussie females are more likely to retweet than males and most retweets occur on Mondays, according to the agency's 'datafication' research project. Douglas Nicol, creative partner and director at The Works, said the study was designed to help marketers talk to consumers more effectively. “There’s a lot of hype around social media. Using research from datafication, we are able to equip Australian marketers with no nonsense practical advice,” Nicol said. “This in turn will help marketers appeal directly to an audience. We believe that in turn, this will boost the way people view and talk about a brand or product online.” Lovers, carers and jesters were identified as the top three archetypical personalities on Twitter. According to the study, marketers can talk most effectively to lovers by being passionate, carers by being gentle and jesters by being mischievous. “If you understand what drives the motivations behind Australians you will be in a better position to connect with them,” Nicol said. Almost 11% of the Australian population is on Twitter, and of those users 46% are male and 54% are female. The study also found that Sydney hosted the largest population of Twitter users while Hobart is responsible for the most tweets per capita. 'Datafication', which was supported by the University of Technology Sydney (UTS), analysed the most popular words used on Twitter over an eight-week period to rank motivations and behaviours on the social site. Software created by Dr Suresh Sood, a social media expert at UTS, then analysed the data to produce the insights into what individuals are doing on Twitter. 'Datafication' is set to launch as a real-time service for the agency’s clients early next year.
  • Datafication 3: First Australian Instagram Study Conducted. www.datafication.com.au
  • Analytic Insights from Millions of Images • Sunday at 5pm is the peak usage time for Instagram in Australia, while on weekdays 8pm is the most popular posting time • The average Aussie Instagram user posts 2.3 times a week, with around 10 posts being made a month • Sydney, Brisbane and the Gold Coast are the ‘selfie’ capitals of Australia, with more pictures of people taking photos of themselves posted than any other category • In Melbourne images of food are the most popular Instagram subject, while in Perth it’s portrait piccies and in Adelaide it’s more artistic shots • Brand recognition on Instagram is low. The most popular hashtag is #instagood with more than 1.6 million references; however, brands such as McDonald’s, Nike and Holden have been hashtagged fewer than 15,000 times.
  • Instagram Deception (Suspects outside of -20 & +20) Vine Deception (Suspects outside of -5 and +5)
  • The Newman Model of Deception (Pennebaker et al) Key word categories for deception mapping: 1. Self words, e.g. “I” and “me”: decrease when someone distances themselves from content 2. Exclusive words, e.g. “but” and “or”: decrease with fabricated content owing to the complexity of maintaining deception 3. Negative emotion words, e.g. “hate”: increase in usage owing to shame or guilty feelings 4. Motion verbs, e.g. “go” or “move”: increase as exclusive words go down, to keep the story on track
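The four word categories above can be turned into a rough text score. The following is only a minimal sketch: the word lists are tiny illustrative samples (not the dictionaries used in the original research), the equal weights are an assumption, and the names `CATEGORIES`, `category_rates` and `deception_score` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative vocabularies (assumption: real studies use far larger dictionaries).
CATEGORIES = {
    "self": {"i", "me", "my", "mine", "myself"},
    "exclusive": {"but", "or", "except", "without"},
    "negative_emotion": {"hate", "angry", "sad", "worthless"},
    "motion": {"go", "move", "walk", "run"},
}


def category_rates(text):
    """Return the share of words in the text that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty text
    counts = Counter()
    for word in words:
        for name, vocab in CATEGORIES.items():
            if word in vocab:
                counts[name] += 1
    return {name: counts[name] / total for name in CATEGORIES}


def deception_score(text):
    """Higher values mean more deception cues (equal weights are an assumption)."""
    r = category_rates(text)
    # Negative emotion and motion words point toward deception;
    # self and exclusive words point away from it.
    return 100 * (r["negative_emotion"] + r["motion"] - r["self"] - r["exclusive"])


if __name__ == "__main__":
    print(deception_score("I hate it but I go"))
```

A score like this only ranks texts relative to each other; any cut-off for flagging a post as suspect would have to be calibrated on real data.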
  • Ideas concerning thinking between Australian and Chinese citizens (1st May 2014) • Chinese social networks: Sina weibo (132,555,895), QQ weibo (3,721,300), Taisha BBS (1,228,967); in total: 137,506,162 • Australian social networks: Tigtag (21,755,909), Oursteps (14,568,879), Yeeyi (6,635,153), FreeOZ (4,718,210); in total: 47,678,151 • The numbers are the Australia-related posts found in each site.
  • Twitter and Marketing Predictions • Tweets are “found data”, gathered without asking questions • More meaning than a typical search engine query • Large numbers of passive participants in natural settings • Twitter can predict the stock market (Lisa Grossman, Wired, Oct 19 2010) • Predict movie success in the first few weekends of release: “…it also raises an interesting new question for advertisers and marketing executives. Can they change the demand for their film, product or service by directly influencing the rate at which people tweet about it? In other words, can they change the future that tweeters predict?” Tech Review, http://www.technologyreview.com/blog/arxiv/25000/
  • Detecting flu trends using search engine query data (intentionality)
  • Data Types • Astronomical • Documents • Earthquake • Email • Environmental sensors • Fingerprints • Health (personal) • Images • Location • Marine • Particle accelerator • Satellite • Scanned survey data • Social media • Sound • Text • Transactions • Video
  • Square Kilometer Array (SKA) • The data collected by the SKA in a single day would take nearly two million years to play back on an MP3 player. • The SKA central computer will have the processing power of about one hundred million PCs. • The SKA will use enough optical fiber linking up all the radio telescopes to wrap twice around the Earth. • The dishes of the SKA when fully operational will produce 10 times the global internet traffic as of 2013. • The aperture arrays in the SKA could produce more than 100 times the global internet traffic as of 2013. • The SKA will generate enough raw data to fill 15 million 64 GB MP3 players every day. • The SKA supercomputer will perform 10^18 operations per second, equivalent to the number of stars in three million Milky Way galaxies, in order to process all the data that the SKA will produce. • The SKA will be so sensitive that it will be able to detect an airport radar on a planet 50 light years away. • The SKA will contain thousands of antennas with a combined collecting area of about one square kilometer (that's 1,000,000 square meters). • Previous mapping of the Centaurus A galaxy took a team 12,000 hours of observations and several years. SKA ETA: 5 minutes! To the scientists involved, however, the SKA is no testbed; it’s a transformative instrument which, according to Luijten, will lead to “fundamental discoveries of how life and planets and matter all came into existence. As a scientist, this is a once in a lifetime opportunity.” Sources: http://bit.ly/amazin-facts & http://bit.ly/astro-ska Galileo
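Two of the SKA figures above can be sanity-checked with simple arithmetic. This is a sketch only: it assumes decimal units (1 GB = 10^9 bytes), which the slide does not specify, and the variable names are illustrative.

```python
# Assumption: decimal units, 1 GB = 10**9 bytes and 1 PB = 10**15 bytes.
GB = 10**9
PB = 10**15

# "15 million 64 GB MP3 players every day"
players_per_day = 15_000_000
bytes_per_day = players_per_day * 64 * GB
petabytes_per_day = bytes_per_day / PB

# "10^18 operations per second, the number of stars in three million Milky Way galaxies"
ops_per_second = 10**18
implied_stars_per_galaxy = ops_per_second / 3_000_000

if __name__ == "__main__":
    print(f"Raw data per day: {petabytes_per_day:.0f} PB")
    print(f"Implied stars per galaxy: {implied_stars_per_galaxy:.2e}")
```

Under these assumptions the daily raw data volume is a little under one exabyte, and the comparison implies a few hundred billion stars per galaxy.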
  • Centaurus A. Image credit: Ilana Feain, Tim Cornwell & Ron Ekers (CSIRO/ATNF). ATCA northern middle lobe pointing courtesy R. Morganti (ASTRON), Parkes data courtesy N. Junkes (MPIfR). The image was created by Dr Ilana Feain and her team using CSIRO’s Australia Telescope Compact Array telescope near Narrabri in New South Wales to observe the galaxy over several years. With ASKAP these same observations will take just five minutes. In its first six hours of operation, ASKAP will generate more information than all previous radio telescopes in the world combined.
  • New Sources of Information (Big Data): Social Media + Internet of Things, Data Driven Innovations • Black box data (http://bit.ly/Black_box): number of journeys made, distances travelled, types of roads used, speed, time of travel, levels of acceleration and braking, any accidents which may occur • http://tacocopter.com/
  • The ANZ Heavy Traffic Index comprises flows of vehicles weighing more than 3.5 tonnes (primarily trucks) on 11 selected roads around NZ. It is contemporaneous with GDP growth. The ANZ Light Traffic Index is made up of light or total traffic flows (primarily cars and vans) on 10 selected roads around the country. It gives a six-month lead on GDP growth. http://www.anz.co.nz/commercial-institutional/economic-markets-research/truckometer/
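One way to check whether an indicator is contemporaneous with a target series or leads it is to correlate the two at different time shifts. The sketch below uses synthetic data built so that the target lags the indicator by six periods; it does not use ANZ or GDP data, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def best_lead(indicator, target, max_lead):
    """Return (lead, r): the shift at which indicator[t] best matches target[t + lead]."""
    results = {}
    for lead in range(max_lead + 1):
        x = indicator[: len(indicator) - lead]
        y = target[lead:]
        results[lead] = pearson(x, y)
    lead = max(results, key=results.get)
    return lead, results[lead]


# Synthetic monthly series (assumption): the target repeats the indicator six months later.
indicator = [math.sin(t / 3.0) + 0.1 * math.cos(t * 1.7) for t in range(48)]
target = [0.0] * 6 + indicator[:-6]

if __name__ == "__main__":
    lead, r = best_lead(indicator, target, max_lead=8)
    print(f"best lead: {lead} months, correlation {r:.3f}")
```

A lead of zero would correspond to a contemporaneous indicator, while a positive lead corresponds to an indicator that moves first.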
  • Smartphone, Google Glass or Apple Watch Will Know What You Want Before You Do “…from 2014 your phone [glasses or watch] will anticipate your needs, do the research, tell you what you want to know – sometimes before the question even occurs to you…” Chapman, Jake (2013), The Wired World in 2014
  • Useful References Informing our Thinking on Mobility and Movement • Silva et al. (2013), A comparison of Foursquare and Instagram to the study of city dynamics and urban social behavior, Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing: Instagram and Foursquare datasets might be compatible in finding popular regions of a city. • Chaoming Song et al. (2010), Limits of Predictability in Human Mobility, Science: There is a potential 93% average predictability in user mobility, an exceptionally high value rooted in the inherent regularity of human behavior. Yet it is not the 93% predictability that we find the most surprising. Rather, it is the lack of variability in predictability across the population. • Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for Pervasive Systems, Proceedings of the 9th International Conference on Pervasive Computing (Pervasive'11): Daily and weekly routines => Few significant places every day => Regularity in human activities => Regularity leads to predictability
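The link between regularity and predictability can be illustrated by turning the entropy of a visit history into an upper bound on prediction accuracy through Fano's inequality. This is a simplified sketch: it uses the plain Shannon entropy of visit frequencies, ignoring the order of visits, whereas Song et al. use an entropy estimator that accounts for temporal order. The function names and the sample history are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(visits):
    """Entropy (bits) of the visit-frequency distribution; visit order is ignored."""
    n = len(visits)
    return -sum(c / n * math.log2(c / n) for c in Counter(visits).values())


def max_predictability(entropy, n_locations, tol=1e-9):
    """Solve entropy = H(p) + (1 - p) * log2(n_locations - 1) for p by bisection."""
    if n_locations < 2 or entropy <= 0:
        return 1.0  # a single location or zero entropy is fully predictable

    def fano_gap(p):
        h = 0.0
        for q in (p, 1 - p):
            if q > 0:
                h -= q * math.log2(q)
        return h + (1 - p) * math.log2(n_locations - 1) - entropy

    # fano_gap is non-negative at p = 1/N and decreases to -entropy at p = 1.
    lo, hi = 1.0 / n_locations, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical visit history dominated by a few significant places.
    history = ["home"] * 6 + ["work"] * 3 + ["gym"]
    s = shannon_entropy(history)
    print(f"entropy {s:.3f} bits, predictability bound {max_predictability(s, 3):.3f}")
```

The more a history concentrates on a few places, the lower its entropy and the higher the bound.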
  • Useful References Informing our Thinking on Mobility and Movement • De Domenico, M., Lima, A. & Musolesi, M. (2012), Interdependence and Predictability of Human Mobility and Social Interactions, Proceedings of the Nokia Mobile Data Challenge Workshop: …we have shown that it is possible to exploit the correlation between movement data and social interactions in order to improve the accuracy of forecasting of the future geographic position of a user. In particular, mobility correlation, measured by means of mutual information, and the presence of social ties can be used to improve movement forecasting by exploiting mobility data of friends. Moreover, this correlation can be used as indicator of potential existence of physical or distant social interactions and vice versa. • Sadilek, A. and Krumm, J. (2012), Far Out: Predicting Long-Term Human Mobility: Where are you going to be 285 days from now at 2pm? …we show that it is possible to predict location of a wide variety of hundreds of subjects even years into the future and with high accuracy.
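Mutual information between two users' location sequences, the correlation measure mentioned above, can be estimated from co-occurrence counts. This is a minimal sketch on made-up location labels, assuming the two sequences are sampled at the same time steps; it is not the estimator used in the cited paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (bits) between two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    # Hypothetical hourly locations of two friends who move together.
    alice = ["home", "cafe", "home", "cafe"]
    bob = ["home", "cafe", "home", "cafe"]
    print(mutual_information(alice, bob))
```

Sequences that move together score high, while sequences that vary independently score close to zero.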
  • Roadmap to Recommender Tool • Data collection • Individual (Group) Analysis: find preference and behavior patterns (including trajectory patterns) • Recommendation: recommend the right product (or service) to the right person (or group) at the right time and place • Manual • Automatic Recommendation
  • MongoDB Mahout Recommender Recommended Trajectories • Trajectories • Points of Interest • User profiles • Image details • Recommender engine (Mahout) Algorithms MongoDB Connector for Hadoop Version 1.2.0
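The idea behind a recommender engine of this kind can be sketched with user-based collaborative filtering. The example below is plain Python rather than Mahout, and the user names, points of interest and ratings are invented for illustration; it only shows the principle of recommending places liked by similar users.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings of other users."""
    scores = defaultdict(float)
    for other, prefs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], prefs)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, value in prefs.items():
            if item not in ratings[user]:
                scores[item] += sim * value
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical user profiles: ratings of points of interest.
ratings = {
    "ana": {"Bondi Beach": 5, "Opera House": 4},
    "ben": {"Bondi Beach": 5, "Opera House": 5, "Taronga Zoo": 4},
    "cy": {"Blue Mountains": 5, "Taronga Zoo": 2},
}

if __name__ == "__main__":
    print(recommend(ratings, "ana"))
```

In a setup like the one on the slide, the profiles would come from the stored trajectories, points of interest and image details rather than a hand-written dictionary.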
  • Internet of Things “trillion sensors” Source: www.tsensorssummit.org
  • Supermarket control room • Beacon • Active Card • Shelf • Shelf • Shelf • Gateway • Server • Monitor • Internet
  • Smart Social Card System Reader/Wifi Gateway and Active Card
  • Smart Sandbag System & Smart Sleeping System smart-dove.com
  • Hype Cycle for Big Data, Gartner (2014)
  • Big Data Scenarios • Broadcasting • Education • Insurance • Brands - Luxury goods • Retailing
  • Big Data and Broadcasting in Australia 2014 As an advertising data analyst working with broadcasters and content providers, I recognise that online advertising alone is worth over $4 billion in 2014 (http://www.iabaustralia.com.au/). To help clients and my own employer benefit from the change occurring, as well as to maintain increases in salary, I undertook the MDSI. This degree helped me brush up on core analytical skills (mathematics and statistics), technical skills (programming in the context of big data), the soft skill of storytelling to help fellow team members understand actions stemming from the data analysis, and innovation to explore new data sources to help find new solutions. Just looking at TV advertising alone, a major paradigm shift is underway. Since the 1950s, TV advertising has been literally hardwired into the show. Hence, the entire audience of one million watching “Home and Away” got to see the same advertisement. This is no longer the case: TV is becoming available everywhere. The television has broken free from the box in the living room, with many Australians watching programs on mobiles and tablets. TV programs are even available on demand at the whim of a viewer. This second screen or multi-platform phenomenon creates new opportunities, including an entirely new revenue stream from the viewings happening outside of the TV. For example, sports fans in Australia receive real time statistics and scores for the Australian Open via mobile, creating more data for analysis (IBM 2014). Predictions are possible, including determining a hit even before filming (House of Cards; Carr 2013). Other data sources available are the actual digital players on the second screen, providing users and times of pausing, forwarding, rewinding, replaying or stopping the TV show. Complementing this viewer data with the comments on TV shows and films in social media provides an understanding of viewer behavior during and outside of the program schedule.
Using the variety of data, I am expected to help provide reporting and dashboards to support: • An increase in revenue through monetizing new views across devices • The development of new programs or movies • Web or mobile products around popular shows, including recommender systems
  • Big Data and Insurance in Australia 2014 Since the launch of insurance box (Collett 2013; insurancebox.com.au) in Australia, a rethink of our existing approach to insuring drivers is driving changes to our existing business structure and processes. Previously, the actuaries provided the necessary quantitative skills. Today, the transforming business requires data science for insurance capabilities across our entire enterprise. Some of this capability includes our existing actuaries with additional training in machine learning. The necessity to move to a data driven organization is fostered by our data science graduates. The organization now uses data and analysis over the previous gut feel and intuition in all facets of business, including the growth of our new revenues via the website and mobile from younger drivers adopting the “black box” recorder. Our data science team focuses on data collection from the black boxes, analysis and reporting, as well as ongoing deployment of new sources of information enriching the business. Totally new capability, not hitherto seen, is now available within the business. This includes A/B testing of the online environment, trend analysis, segmentation, text mining, as well as the important black box dashboard visualisations of driving scores (http://insurancebox.com.au/tour). The business is now able to acquire data from our drivers in close to real time, make predictions and start to look forward to creating new products entirely tailored to the needs of our individual customers with a “pay as you go” business model. True driving behavior, as well as the actual overnight location of the car, determines the insurance price, not the previous statistical base of drivers and demographics.
  • Big Data and Education 2014 As a project manager for a local educational institution, I am driving our student dashboard project. I attended the MDSI course to understand the project management of data science intensive projects while staying mindful of privacy and the ability to innovate using new data sources. The idea behind the project is to predict student challenges with regard to academic performance, as well as any behavioral challenges. The dashboard notifies lecturers and tutors with a recommended course of action. Students receive similar recommendations directly. The key aim of the dashboard is ensuring students are able to perform at their best, while maximizing student retention by providing assistance to help with grades. The system generates a predictive model from past performance, library usage, use of wi-fi hot spots around campus and frequency of access to the learning management system. As project manager, I am continually on the lookout to integrate new information sources, including social media, that provide a representation of student behavior, as well as ensuring the system output provides students with directly actionable information. The system undergoes regular penetration security testing at each new release, ensuring the privacy of student information.
  • Big Data and Luxury Goods 2014 Forecasting the sale of our luxury goods has often been an art rather than a science. As a marketing data analyst, I support the sales and marketing team by honing forecasts to bring just enough stock from Italy to satisfy demand in Australasia. Furthermore, accurate forecasting is complicated by customers buying not just from bricks and mortar stores but also online, directly from our own website and other online properties. Online shopping revenue for 2013/14 according to IBISWorld is AUD12.4 billion. In 2016, I attended the MDSI course and enhanced my profile as a marketing data analyst/scientist. The course was instrumental in providing me with a mix of mathematical and statistical skills as well as coding. Innovation aspects of the course let me feel comfortable with introducing new approaches in the business. The course was timely, as the company had been thinking about offshoring our data analysis work to India and the company MuSigma. Instead, the expertise gained from MDSI allows me to develop and lead a balanced team of marketing specialists and general data talent. The communication skills learnt from the course form the basis of the shared language of data analysis amongst team members and executives. Amongst the innovations I have put in place is the big data forecasting system. The MDSI course reinforced the notion of finding new innovative sources for making marketing discoveries. The idea is simple: use a variety of information sources not previously considered to help improve our luxury goods forecasting. With the team, we hit upon using indicators derived from Tasmanian oyster farms (Sense-T) and Twitter luxury goods bloggers, and finding correlations with historical sales. This helps us forecast 6 months in advance with far greater accuracy than previously available. The data is multi-structured and required using Hadoop to acquire the large volumes of data.
The approach learnt from MDSI worked to a tee: starting simple with the forecasting problem, building the team to include mathematical/statistical specialists, training not just on sandboxes but on the actual cluster of 4 nodes for processing the data, and documenting the results.
  • Big Data and Retailing in Australia 2014 Retailing is not only one of the world’s top ten industries, and in some countries (e.g. India) the largest, but above all presents great opportunity for data science innovation as “the retail store experience is set to change more in the next five years than it has over the past century” (McKinsey 2013). In Australia, Woolworths and Wesfarmers are in the top 20 worldwide retailers (Deloitte 2013). Until recently, the understanding of shopper behaviour in the Australian and global marketplaces was built atop volumes of scanner data under the control of the major retailers, e.g. Wesfarmers and Woolworths, working in conjunction with a handful of large research organisations (Nielsen and Ipsos). Hence, the research insights on shopper behaviour are often not transparent and not readily available to members of the global retail ecosystem, including brand owners. As head of the retail analytics team with a major retailer, I see extracting value from the big data available from the supply chain, point of sale, online web site and shoppers as paramount. The data sources include store cameras to obtain shopper demographics, tracking behavior and heat mapping around key store positions. Today, as the head of retail analytics, I can utilise data mining techniques in conjunction with multi-structured data inclusive of shopper CCTV videos, traffic counters and online click stream data of retailers using social media to make pre-purchase buying decisions. The opportunity exists to generate predictions built on highly transparent big data sets combining rich data from a variety of information sources. However, these rich information sources are real time signals providing very big multi-structured data streams not traditionally manageable by retailers or research organisations. The retailer interest includes the availability of predictions to not only help with minimising stock-outs but even help ensure correct staffing levels.
The McKinsey Global Institute Big Data report (2011) supports this thinking by recognising “…the use of large datasets will continue to transform the face of retail…”.
  • Roadmap: Evolution from Existing Operations to Predictive. Making Foresight Relevant to Daily Operations! (Adapted from Solis, 2012 and Davenport 2007) Themes: • Rigid: Silo, rigid. Hoarding info vs. collaboration. • Flexible: Freely share info and knowledge on an internal basis, acting social with customers. 2-way communications. • Connected: Connected internal and external. Listening and learning. Internal and external engagement shared via hub and spoke. Employees connected directly to customers. • Adaptive: Agile, integrate customer experiences and feedback loops. Listening and learning now become analysis and insights. Makes sense of data and transforms it into intelligence. Respond in real time. • Predictive: Shift from reactive to proactive and predictive. Business uses social media heavily and is flexible, connected, adaptive and predictive in terms of customer experiences, needs and new opportunities. Predict scenarios before they occur, maximise opportunity and limit risk. Business Intelligence questions: What conversations are taking place? When and where are conversations taking place? What is the sentiment of conversations? What actions are required? Why are these conversations occurring? What if conversations continue? What conversations are next? How can we lead conversations? (predictive recommendation)
  • The future is impossible to predict. However, one thing is certain: the company that can excite its customers’ dreams is out ahead in the race to business success. Selling Dreams, Gian Luigi Longinotti