Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition
by Patrick Hadley, Australian Bureau of Statistics at the Australian CIO Summit 2014
1. Australian CIO Summit 2014
28 – 30 July 2014
Bigger and Better: Employing a Holistic Strategy for
Big Data toward a Strong Value-Adding Proposition
Patrick Hadley
Chief Information Officer
Australian Bureau of Statistics
2. Not Another ‘Big Data’ Presentation
(‘V’ is not the only letter in the alphabet!)
4. The promise
“Big data is at the foundation of all the megatrends that are happening today, from social to mobile to the cloud to gaming.” - Chris Lynch, ex Vertica CEO
“Big Data is a tidal wave, which in the next decade will create consumer – and producer – value in almost every major sector of the economy” - Philip Evans
“... a tremendous wave of innovation, productivity and growth ... all driven by big data” - McKinsey
“Big Data: A Revolution that Will Transform how We Live, Work, and Think” - Viktor Mayer-Schönberger and Kenneth Cukier, 2013
5. Or, the reality...
“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...” - Dan Ariely, 2013
“In God we trust; all others must bring data.” - W.E. Deming
6. Agenda
• What is Big Data (3/4/5/6 v’s)
• Sources of Data
• Data as an asset
• Open Data
• Opportunities…..applications…..benefits
• Data Management
• Data Analytics; technologies
• Security
• Privacy
• Skills and capabilities
• …… and on
7. Agenda
• What is Big Data (3/4/5/6 v’s)
• Sources of Data
• Data as an asset
• Open Data
• Opportunities…..applications…..benefits
• Data Management
• Data Analytics; technologies
• Security
• Privacy
• Skills and capabilities
• …… and on
8. Today ………
• The use of Big Data in official statistics
• ABS initiatives, experiences and capabilities
• Learnings: Towards a strong value-adding proposition
9. Big Data in Official Statistics
The vision:
A richer, more dynamic statistical picture of Australia
Opportunity: reduce costs; improve quality
10. Sources of Data
• digital descriptions of the physical environment
• sensors and other devices
• communications networks
• individual behaviour and information
• digitisation of commerce and supply chains
11. High potential data sources
• Telecom
• Utilities
• Retailers
• Financial sector
• Satellite
• Other
12. Example: Telecom data applications
• small area population estimates
• service populations
• travel patterns
• seasonal population movements
• event populations
• internet use……
How do we:
o identify characteristics of handset owners?
o turn handset counts into people?
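One simple way to think about turning handset counts into people is to scale a single carrier's count up to all handsets and then divide by the average number of handsets each person carries. The sketch below assumes exactly that; the function name and both adjustment factors (`market_share`, `handsets_per_person`) are hypothetical and would need to be estimated from carrier and survey data. It is not the ABS method.

```python
def estimate_population(handset_count, market_share, handsets_per_person):
    """Scale one carrier's handset count to an estimated number of people.

    Assumptions (all hypothetical, for illustration only):
      market_share        - fraction of all handsets in the area on this carrier
      handsets_per_person - average number of active handsets per person
    """
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    if handsets_per_person <= 0:
        raise ValueError("handsets_per_person must be positive")
    # Step 1: scale up from one carrier to all handsets in the area.
    all_handsets = handset_count / market_share
    # Step 2: convert handsets to people.
    return all_handsets / handsets_per_person


# 1,000 handsets seen by a carrier with a 50% share, 1.25 handsets per person.
print(estimate_population(1000, 0.5, 1.25))
```

Both factors vary by area and by demographic group, which is why identifying the characteristics of handset owners matters for the estimate.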
13. Initiate exploratory R&D
• Targeted streams of investigation
o Use of satellite imagery to determine land utilisation
o Use of integrated demographic data for small area modelling of unemployment
o Use of mobile device messaging records for real-time estimation of service populations
• Progress the methodological framework and trial new technology approaches
o Machine learning
o Multidimensional data visualisation
o Distributed computing
o Open linked data
14. Big Data challenges
• Data quality
• Data volatility and stability
• Data representativeness
• Data dimensionality
• Statistical modelling and inference
15. Data quality
Big Data sets/streams are generally noisy and often
unstructured – they need to undergo a non-trivial filtering and
cleaning process before they can be used
Balancing the complexity of the cleaning process against the
information value of the results obtained is a significant issue
What methods can be used for noise reduction?
How do we deal with missing data?
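As one hedged illustration of the two questions above, the sketch below fills gaps by linear interpolation and then damps spikes with a rolling median. Both are generic techniques chosen for illustration, not methods the slides attribute to the ABS; the function names and the sample readings are hypothetical.

```python
from statistics import median
from typing import List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation; edge gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        # Nearest observed neighbours on each side of the gap (if any).
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(values[right]))
        elif right is None:
            filled.append(float(values[left]))
        else:
            weight = (i - left) / (right - left)
            filled.append(values[left] + weight * (values[right] - values[left]))
    return filled


def rolling_median(values: List[float], window: int = 3) -> List[float]:
    """Replace each point with the median of a centred window (shrunk at the edges)."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical sensor readings: two gaps and one obvious spike (95.0).
readings = [10.0, None, 12.0, 95.0, 13.0, None, None, 16.0]
cleaned = rolling_median(fill_missing(readings))
```

A wider window removes more noise but also flattens genuine short-lived changes, which is the trade-off between cleaning complexity and information value noted on this slide.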
16. Data volatility and stability
Streaming data may fluctuate over short time frames
Data sources themselves may change or disappear
What becomes of time series in a world where data streams
and sources are transient?
17. Data representativeness
How representative are the data from emerging Big Data
sources of the phenomena we are trying to measure?
How do we determine whether there are hidden biases?
What methods can be used to reduce the volume of data while
retaining the information value of the data and statistical
validity of the analysis?
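One generic answer to the volume question is reservoir sampling, which keeps a fixed-size uniform random sample from a stream whose length is not known in advance. The sketch below is a minimal illustration under that assumption; it is not presented as the ABS approach, and the function name, seed and stream are hypothetical.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item n replaces a random slot with probability k / n.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of 100,000 record identifiers reduced to 1,000.
sample = reservoir_sample(range(100_000), k=1000)
```

A uniform sample of the stream only reduces volume. It inherits any bias in the stream itself, so it does not answer the earlier question of whether the source represents the population of interest.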
18. Data dimensionality
Dimensionality is a significant and challenging aspect of
“bigness”
Dimension has an impact on
Storage of data
Processing and analysis of data
Existing storage and computational paradigms fail badly
19. Statistical modelling and inference
How can population characteristics be determined?
What is the population? In many cases this is not known (e.g.
Twitter)
Can we draw a sample and calculate descriptive statistics?
How do we avoid apophenia?
Seeing meaningful patterns and connections where none exist
The number of spurious correlations grows with the number of
variables
“To understand is to perceive patterns.” – Isaiah Berlin
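A rough way to see why chance correlations multiply: p variables give p(p-1)/2 distinct pairs, and if every pair is tested at significance level alpha while no pair is truly related, about alpha times that many will look significant by chance. The sketch below only does this arithmetic; the function name and the default alpha of 0.05 are illustrative assumptions.

```python
def expected_spurious_pairs(num_variables: int, alpha: float = 0.05) -> float:
    """Expected number of chance 'significant' pairwise correlations.

    Assumes no pair is truly correlated and each pair is tested at level alpha.
    """
    pairs = num_variables * (num_variables - 1) // 2  # distinct variable pairs
    return alpha * pairs


for p in (10, 100, 1000):
    print(p, expected_spurious_pairs(p))
```

Because the number of pairs grows quadratically, a tenfold increase in variables gives roughly a hundredfold increase in expected chance findings.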
20. From ‘V’ (what) to ‘C’ (how)
‘What’ has changed about data?
Vs: Volume, Velocity, Variety, Veracity,
Volatility
‘How’ will we change?
Cs: Creating, Computing,
Comprehending, Competing,
Collaborating
21. Big Data ‘C’s and the ABS - CREATING
The world is CREATING data like never before, and every
individual, household and business we interact with will change
how they create data:
• The Internet of Things (M2M) becomes the ‘Internet of
Everything’
• Sometimes called the 4 internets: people, things, information,
places are all network addressable, most have data
producing/collecting/transmitting capability
22. Big Data ‘C’s and the ABS - COMPUTING
COMPUTING data like never before. Some examples:
• emerged from Web-scale problems such as search engines, with
new solutions such as key-value databases (Hadoop, NoSQL DBs)
• advanced computation algorithms and approaches become
‘popularised’ e.g. machine learning approaches, automated
visualisation and explanations systems, data mining/discovery,
semantic (knowledge) representation and reasoning systems
requiring ‘search’
• statistical analysis-as-a-service e.g. auto-coding, confidentiality,
time series analysis, etc
• distributed/parallel computation for low-cost multi-core,
multi-socket, multi-computers, in-memory computation technologies
• embedded processors, sensors/RFIDs/GPS/SIM
• the ‘logical data warehouse’
23. Big Data ‘C’s and the ABS - COMPREHENDING
COMPREHENDING/CONSUMING data requiring new tools in the ABS kit bag:
• tables – static, or defined dynamically by the data consumer (ABS.stat, REEM Table
Builder), in standard XML formats like SDMX
• visualisation – for internal ABS insight, for our ‘retail’ dissemination, ‘smart’ insight
where software suggests the best way to see data: ‘telling the story’
• narrative – table to text production (auto produce media release & part of main
features)
• voice – text to speech to read narrative & data for accessibility; speech to text for
NIRS analysis
• semantic data outputs in OWL/RDF
• hybrid of above – to add value to information, for ABS data consumers to enhance
comprehension
• data streams – data-as-a-service for M2M (the ABS public Web services library),
could be called ‘the embedded ABS’
and all this with adaptive/responsive design for multiple end-point device types!
24. Big Data ‘C’s and the ABS - COMPETING
COMPETING with data, to obtain it and use it for competitive
advantage
• In some subject-matter areas there is more competition. Who
can make a statistical index? Anyone with a spreadsheet.
• Who else wants to be influential in and/or monetise statistics?
• Everyone else starts to understand INFONOMICS
• More ‘agent’ data sources for ABS, as we may not have the
capability to collect (full) unit record ‘big data’?
25. Big Data ‘C’s and the ABS - COLLABORATING
In ABS
In Government
In Academia
Across the international statistical community
26. ABS Capabilities, expertise
• collect and process large quantities of data
• data ‘cleansing’
• data standards and framework
• data integration
• methodological techniques
• strong analytical capability
• sophisticated web based dissemination system
• data quality framework
27. ABS Big Data Challenges
Business Benefit
Validity of Statistical Inference
Privacy and Public Trust
Data Integrity
Data Ownership and Access
Computational Efficacy
Technology Infrastructure
(Source: “Big data and the ABS – from ideas to action”, ABS MM paper, Oct 2013)