The document discusses big data, data-driven decision making, and data-informed policy making. It defines big data as large and complex data that requires new tools and techniques to analyze. It emphasizes that decision making should be based on data analysis rather than intuition alone. For policy making, data are crucial for monitoring progress, but statistics and data science are often underappreciated. Developing countries in particular lack reliable data for policy decisions.
This session describes the roles and skill sets required when building a Data Science team, and starting a data science initiative, including how to develop Data Science capabilities, select suitable organizational models for Data Science teams, and understand the role of executive engagement for enhancing analytical maturity at an organization.
After this session you will be able to:
Objective 1: Understand the knowledge and skills needed for a Data Science team and how to acquire them.
Objective 2: Learn about the different organizational models for forming a Data Science team and how to choose the best one for your organization.
Objective 3: Understand the importance of executive support for Data Science initiatives and the role it plays in their successful deployment.
INDIAN STATISTICAL INSTITUTE
Documentation Research & Training Centre
8th Mile, Mysore Road, RVCE Post
Bangalore-560 059
DRTC Seminar- 5
2014
Data Literacy
ABSTRACT
In our increasingly data-driven society, data literacy is an important civic skill that we should be developing. Data are slowly but steadily making their way into every part of society. Data literacy may seem less technical than Computer Science or other fields, yet it still calls for a wide variety of tools for accessing, converting and manipulating data. Using them requires an understanding of relational databases (like MS Access), data manipulation techniques, statistical software tools (like Minitab, SPSS, STATA and MS Excel) and data representation software tools (like MS PowerPoint and MS Excel). This seminar includes an introduction to data literacy and its inter-relationship with information literacy and statistical literacy. It also covers the various steps for working with data, followed by a short demonstration of data analysis techniques using STATA 11.
Speaker: Jayanta Kr. Nayek
Date: 29.10.2014. Time: 2 p.m.
Venue: DRTC, ISI Bangalore.
All are cordially invited.
Seminar Coordinator
Biswanath Dutta
Data science is different from Data Analytics, Data Engineering, and Big Data.
Presentation about Data Science.
What is Data Science: its process, future, and scope.
Data Science Presentation By Amit Singh.
"Sexiest job of 21st century"
Data science is having a growing effect on our lives, from the content we see on social media feeds to the decisions businesses are making. Along with successes, data science has inspired much hype about what it is and what it can do. So I plan to demystify data science and have a discussion about what it really is. What does a day-in-the-life look like? What tools and skills are needed? How is data science successfully applied in the real world? In this talk, I’ll be providing insight into these questions and also speculating on the future of data science and its place in business and technology.
Presented at OpenWest 2018
North Raleigh Rotarian Katie Turnbull gave a great presentation at our Friday morning extension meeting about data visualization. Katie is a consultant at research and advisory firm, Gartner, Inc.
Data science is an interdisciplinary field that uses algorithms, procedures, and processes to examine large amounts of data in order to uncover hidden patterns, generate insights, and direct decision making.
A Seminar Presentation on Big Data for Students.
Big data refers to a process that is used when traditional data mining and handling techniques cannot uncover the insights and meaning of the underlying data. Data that is unstructured or time sensitive or simply very large cannot be processed by relational database engines. This type of data requires a different processing approach called big data, which uses massive parallelism on readily-available hardware.
An introduction to data science, from the very beginning of the idea to the latest designs, changing trends, and the technologies that have turned them into applications already in real-world use today.
Key Considerations While Rolling Out Denodo Platform | Denodo
Watch full webinar here: https://bit.ly/3zaPGLO
Our approach for data virtualization advisory takes the following 3 dimensions/areas into consideration:
- Technology / Architecture
- Business User Groups (your clients)
- IT Organization
To deliver quick results, Q-PERIOR uses a multitude of accelerators in predefined topics within these three dimensions. In our presentation we will use client examples to explain why such an exercise makes sense before rolling out Denodo and what kinds of risks you can avoid by doing so.
A Swiss Statistician's 'Big Tent' View on Big Data and Data Science (Version 10) | Prof. Dr. Diego Kuonen
Keynote talk given by Dr. Diego Kuonen, CStat PStat CSci, on October 21, 2015, at the `Austrian Statistics Days 2015' in Vienna, Austria.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional Swiss statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 9) | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on October 1, 2015, at the `Joint SCITAS and Statistics Seminar' of the EPFL in Lausanne, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's `Big Tent' View on Big Data and Data Science in Health Scien... | Prof. Dr. Diego Kuonen
Presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on April 18, 2016, at the `Nestlé Institute of Health Sciences' in Lausanne, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors, as well as health sciences. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional Swiss statistician's 'big tent' view on these terms in health sciences, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph... | Prof. Dr. Diego Kuonen
'President's Invited Speaker' keynote talk given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on August 22, 2016, at the '37th Annual Conference of the International Society for Clinical Biostatistics (ISCB)' in Birmingham, United Kingdom.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors, as well as pharmaceutical development. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional Swiss statistician's 'big tent' overview of these terms in pharmaceutical development, illustrates the connection between data science and statistics - the terms surrounding the 'sexiest job of the 21st century' - and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 8) | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on July 2, 2015, at 'Swiss Re' in Adliswil, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on May 13, 2014, at the `SMi Big Data in Pharma' conference in London, United Kingdom.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms `big data' and `data science'. This presentation gives a professional statistician's view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's Introductory View on Big Data and Data Science (Version 7) | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on May 12, 2015, at the 'SAS Forum Switzerland' in Zurich, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's introductory view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on August 26, 2014, at the `Zurich Machine Learning and Data Science' meetup in Zurich, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms `big data' and `data science'. This presentation gives a professional statistician's view on these terms and illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm... | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on October 13, 2014, at `F. Hoffmann-La Roche' in Basel, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors, as well as the pharmaceutical industry. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms `big data' and `data science'. This presentation gives a professional statistician's view on these terms in pharmaceutical development, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5) | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on November 21, 2014, at the 'Research Seminar in Statistics' of the University of Geneva in Geneva, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Demystifying Big Data, Data Science and Statistics, along with Machine Intell... | Prof. Dr. Diego Kuonen
Presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on November 25, 2016, at the `Statistics at Nestlé in Switzerland' event of `Nestlé' in Vevey, Switzerland.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 6) | Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on April 23, 2015, at the 'ZüKoSt: Seminar on Applied Statistics' of the ETH Zurich in Zurich, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
The Power of Data Insights - Big Data as the Fuel and Analytics as the Engine... | Prof. Dr. Diego Kuonen
Keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on February 1, 2017, at the `Microsoft Vision Days - Intelligent Cloud' event of Microsoft Switzerland in Wallisellen, Switzerland.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on November 20, 2013, at the "IBM Developer Days 2013" in Zurich, Switzerland.
ABSTRACT
There is no question that big data has hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms big data and data science. This presentation gives a professional statistician's view on these terms and illustrates the connection between data science and statistics.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Measuring Success introduces nonprofit professionals to proven techniques on how to move from anecdotal to data-driven decision making and steer your organization to success. Gain insights on how to focus your limited organizational time and energies on the issues that are supported by data instead of anecdotes. Learn techniques for using data to track and measure progress over time, report impact to stakeholders, and manage toward success.
Overview of Big Data, Data Science and Statistics, along with Digitalisation,... | Prof. Dr. Diego Kuonen
Presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on November 29, 2016, at the `University of Applied Sciences of Western Switzerland' (`Haute Ecole d'Ingénierie et de Gestion du Canton de Vaud', HEIG-VD) in Yverdon-les-Bains, Switzerland.
Big Data, Data Science, Machine Intelligence and Learning: Demystification, T... | Prof. Dr. Diego Kuonen
Keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on March 14, 2017 at Eurostat's international conference `New Techniques and Technologies for Statistics (NTTS) 2017' in Brussels, Belgium.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Data Driven Decision Making for Nonprofits | 4Good.org
“Someone told us we need to do a survey,” the process often begins. A survey is only one piece of a data-driven strategic process, which really begins with articulation of the core issue, and ends with an assessment of how the strategy worked. In this session we will learn the 12 stages of a data-driven process, and show a full illustration of a project. Participants will also learn how to put together a simple one-page project planning brief.
MARTINEZ - Enhancing Public Policy Decision Making using Large-scale Cell Pho... | UN Global Pulse
Vanessa Frias Martinez, a Scientific Researcher in the Data Mining and User Modeling Group at Telefonica Research in Madrid, Spain, focuses on technologies for emerging markets. She took participants through her work on inferring specific human behaviors from cell phone data in order to evaluate the effectiveness of policy decisions. To measure the impact of the Mexican government’s H1N1 response in 2009, Vanessa analyzed call records to determine changes in people’s mobility patterns in Mexico City. The results indicated that the government’s warnings to stay away from public spaces were in fact heeded by citizens and were thus effective in limiting exposure to the virus. Vanessa’s presentation also introduced cell phone data as a cost-effective method for conducting demographic research in emerging economies.
Paper: "Measuring the Impact of Epidemic Alerts on Human Mobility using Cell-Phone Network Data"
Big Data as the Fuel and Analytics as the Engine of the Digital Transformation | Prof. Dr. Diego Kuonen
Keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on June 13, 2017, at the `Information Builders Think Tank Lunch' in Zurich, Switzerland.
(Big) Data as the Fuel and Analytics as the Engine of the Digital Transformation | Prof. Dr. Diego Kuonen
Webinar presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on March 15, 2018, within the TIBCO webinar entitled "Demystifying the Hype: [Big] Data as Fuel and Analytics as the Engine of Digital Transformation"; see
https://www.tibco.com/events/demystifying-hype-big-data-fuel-and-analytics-engine-digital-transformation
---------------
ABSTRACT
---------------
The digital revolution is truly underway: terms such as big data, cloud, internet of things, internet of everything, the fourth industrial revolution, smart cities and data economy are no longer just concepts - they are changing our lives in new and exciting ways.
Digital Transformation started with a first wave of digitalisation, which resulted in the (big) data revolution. But now a second wave of digitalisation is needed to enable learning from (big) data and to generate increased value for both business and society as a whole.
This presentation discusses how analytics, the science of "learning from data" or of "making sense out of data", becomes the engine of a new wave of Digital Transformation, and illustrates that the biggest challenge therein is the veracity of the "data pedigree", i.e. the trustworthiness of the data, including the reliability, capability, validity, and related quality of the data.
This presentation looks at demystifying concepts and terms surrounding Digital Transformation and big data. Along with machine intelligence and learning, the connection between data science and statistics is illustrated, and trends, challenges, opportunities, and the related digital skills and principles needed to succeed at Digital Transformation are highlighted.
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ... | Prof. Dr. Diego Kuonen
Public keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on July 7, 2017, in the context of the `CAS Data Visualization' of the `Bern University of the Arts' (HKB) in Berne, Switzerland.
See https://www.hkb.bfh.ch/de/weiterbildung/design/cas-data-visualization and http://bka.ch/worte/rubriken/worte/kein-datensalat
Data as the Fuel and Analytics as the Engine of the Digital Transformation: D... | Prof. Dr. Diego Kuonen
Keynote speech given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on June 26, 2018 at TIBCO's "Data Innovation Event" in Zurich, Switzerland; see https://www.tibco.com/events/tibco-data-innovation-event
Glocalised Smart Statistics and Analytics of Things: Core Challenges and Key ... | Prof. Dr. Diego Kuonen
Invited presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on July 18, 2017 within Eurostat's special topic session `STS021: From Big Data to Smart Statistics' at the `61st ISI World Statistics Congress' (ISI2017) in Marrakech, Morocco.
Production Processes of Official Statistics & Data Innovation Processes Augme... | Prof. Dr. Diego Kuonen
Keynote speech given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on May 15, 2018 at the conference "Big Data for European Statistics (BDES)" in Sofia, Bulgaria; see
https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata/index.php/BDES_2018
Data as the Fuel and Analytics as the Engine of the Digital Transformation: D... | Prof. Dr. Diego Kuonen
Keynote speech given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on June 7, 2018 at the "5th Swiss Conference on Data Science (SDS|2018)" in Berne, Switzerland; see https://sds2018.ch/.
"Data as the Fuel and Analytics as the Engine of the Digital Transformation -...Prof. Dr. Diego Kuonen
Webinar presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on May 14, 2019, within the StatSoft webinar entitled "Data as the Fuel and Analytics as the Engine of the Digital Transformation - Demystification, Challenges, Opportunities and Principles for Success"; see https://www.statsoft.de/en/dates/webinars/
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C... | Prof. Dr. Diego Kuonen
Keynote speech given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on February 28, 2019 at the "Swiss Cyber Security Days 2019" on February 27-28, 2019 in Fribourg, Switzerland; see https://swisscybersecuritydays.ch/.
Data as Fuel and Analytics as Engine of the Digital Transformation: Demystic... | Prof. Dr. Diego Kuonen
Invited presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on September 19, 2017, at the "ITU-Academia Partnership Meeting: Developing Skills for the Digital Era" in Budapest, Hungary.
See https://www.itu.int/en/ITU-D/Capacity-Building/Pages/events/academia2017.aspx
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong... | IT Network marcus evans
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition
by Patrick Hadley, Australian Bureau of Statistics at the Australian CIO Summit 2014
Einstein published his ideas and became a pivotal element in shifting the way we think about physics - from the Newtonian model to the Quantum - in turn this changed the way we think about the world and allowed us to develop new ways of engaging with the world.
We are at a similar juncture. The development of computational technologies allows us to think about astronomical volumes of data and to make meaning of that data.
The mindshift that occurs is that “the machine is our friend”. The computer, like all machines, extends our capabilities. As a consequence the types of thinking now required in industry are those that get away from thinking like a computer and shift towards creative engagement with possibilities. Logical thinking is still necessary but it starts to be driven by imagination.
Computational thinking and data science change the way we think about defining and solving problems.
The age of creativity - which increasingly extends its impact from arts applications to business, scientific, technological, entrepreneurship, political, and other contexts.
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Policy Making
1. Big Data, Data-Driven Decision
Making and Statistics
Towards Data-Informed Policy Making
Dr. Diego Kuonen, CStat PStat CSci
Statoo Consulting
Morgenstrasse 129, 3018 Berne, Switzerland
@DiegoKuonen + kuonen@statoo.com + www.statoo.info
‘World Statistics Day 2015’
#StatsDay15
Olten, Switzerland — October 20, 2015
2. About myself (about.me/DiegoKuonen)
• PhD in Statistics, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland.
• MSc in Mathematics, EPFL, Lausanne, Switzerland.
• CStat (‘Chartered Statistician’), Royal Statistical Society, United Kingdom.
• PStat (‘Accredited Professional Statistician’), American Statistical Association, United
States of America.
• CSci (‘Chartered Scientist’), Science Council, United Kingdom.
• Elected Member, International Statistical Institute, Netherlands.
• Senior Member, American Society for Quality, United States of America.
• CEO & CAO, Statoo Consulting, Switzerland.
• Senior Lecturer in Business Analytics and Statistics, Geneva School of Economics and
Management (GSEM), University of Geneva, Switzerland.
• President of the Swiss Statistical Society (2009-2015).
Copyright © 2001–2015, Statoo Consulting, Switzerland. All rights reserved.
4. About Statoo Consulting (www.statoo.info)
• Founded Statoo Consulting in 2001.
• Statoo Consulting is a software-vendor independent Swiss consulting firm
specialised in statistical consulting and training, data analysis, data mining and big
data analytics services.
• Statoo Consulting offers consulting and training in statistical thinking, statistics,
data mining and big data analytics in English, French and German.
Are you drowning in uncertainty and starving for knowledge?
Have you ever been Statooed?
6. Contents
Contents 6
1. Demystifying the ‘big data’ hype 8
2. Data-driven decision making 17
3. Data-informed policy making 26
4. Conclusion and opportunities 34
7. ‘Data is arguably the most important natural resource
of this century. ... Big data is big news just about
everywhere you go these days. Here in Texas, everything
is big, so we just call it data.’
Michael Dell, 2014
8. 1. Demystifying the ‘big data’ hype
• ‘Big data’ have hit the business, government and scientific sectors.
The term ‘big data’ — coined in 1997 by two researchers at NASA — has
acquired the trappings of religion.
• But, what exactly are ‘big data’?
The term ‘big data’ applies to an accumulation of data that cannot be
processed or handled using traditional data management processes or tools.
Big data are a data management infrastructure which should ensure that the
underlying hardware, software and architecture have the ability to enable ‘learning
from data’, i.e. ‘analytics’.
9. • The following characteristics — ‘the four Vs’ — provide a definition:
– ‘Volume’: ‘data at rest’, i.e. the amount of data (‘data explosion problem’),
with respect to the number of observations (‘size’ of the data), but also with
respect to the number of variables (‘dimensionality’ of the data);
– ‘Variety’ : ‘data in many forms’, ‘mixed data’ or ‘broad data’, i.e. different
types of data (e.g. structured, semi-structured and unstructured, e.g. log files,
text, web or multimedia data such as images, videos, audio), data sources (e.g.
internal, external, open, public), data resolutions (e.g. measurement scales and
aggregation levels) and data granularities;
– ‘Velocity’ : ‘data in motion’ or ‘fast data’, i.e. the speed by which data are
generated and need to be handled (e.g. streaming data from machines, sensors
and social data);
– ‘Veracity’ : ‘data in doubt’, i.e. the varying levels of noise and processing errors,
including the reliability (‘quality over time’), capability and validity of the data.
10. • ‘Volume’ is often the least important issue: it is definitely not a requirement to
have a minimum of a petabyte of data, say.
Bigger challenges are ‘variety’ and ‘velocity’, and most important is ‘veracity’ and
the related quality of the data.
Indeed, big data come with the data quality challenges of ‘small’ data along with
new challenges of their own!
• The above definition of big data is vulnerable to the criticism of sceptics that these
four Vs have always been there.
Nevertheless, the definition provides a clear and concise framework to
communicate about how to solve different data processing challenges.
But, what is new?
11. ‘Scientists have long known that data could create new
knowledge but now the rest of the world, including
government and management in particular, has realised
that data can create value.’
Sean Patrick Murphy, 2013
Source: interview with Sean Patrick Murphy, a former senior scientist at Johns Hopkins University
Applied Physics Laboratory, in the Big Data Innovation Magazine, September 2013.
12. ‘The data revolution is giving the world powerful tools
that can help usher in a more sustainable future.’
Ban Ki-moon, United Nations Secretary-General, August 29, 2014
13. ‘To get the full business value from big data,
companies need to focus less on the three Vs of big data
(volume, variety, velocity) and more on the four Ms of
big data: ‘Make Me More Money’!’
Bill Schmarzo, March 2, 2015
14. ‘Do not focus on the ‘bigness’ of the data, but on the
value creation from the data.’
Stephen Brobst, August 7, 2015
The 5th V of big data: ‘Value’.
15. ‘Data are an infrastructural resource — a form of
capital that cannot be depleted. ... Data can be used and
re-used to open up significant growth opportunities, or
to generate benefits across society in ways that could
not be foreseen when the data were created.’
OECD’s Directorate for Science, Technology and Innovation (STI),
OECD STI Policy Note, October 2015
16. ‘Data are not taken for museum purposes; they are
taken as a basis for doing something. If nothing is to
be done with the data, then there is no use in collecting
any. The ultimate purpose of taking data is to
provide a basis for action or a recommendation for action.’
W. Edwards Deming, 1942
17. 2. Data-driven decision making
• Data-driven decision making refers to the practice of basing decisions on the
analysis of data (i.e. ‘learning from data’), rather than purely on gut feeling and intuition:
18. ‘One by one, the various crises which the world faces
become more obvious and the need for hard facts [facts
by analyzing data] on which to take sensible action
becomes inescapable.’
George E. P. Box, 1976
19. The two approaches of ‘learning from data’
Data science, statistics and their connection
• The demand for ‘data scientists’ — the ‘magicians of the big data era’ — is
unprecedented in sectors where value, competitiveness and efficiency are data-driven.
The term ‘data science’ was originally coined in 1998 by a statistician.
Data science — a rebranding of ‘data mining’ — is the non-trivial
process of identifying valid (that is, the patterns hold in general, i.e. being
valid on new data in the face of uncertainty), novel, potentially useful and
ultimately understandable patterns or structures or models or trends or
relationships in data to enable data-driven decision making.
20. • Is data science ‘statistical déjà vu’?
But, what is ‘statistics’?
Statistics is the science of ‘learning from data’ (or of making sense out
of data), and of measuring, controlling and communicating uncertainty.
It is a process that includes everything from planning for the collection of data and
subsequent data management to end-of-the-line activities such as drawing conclusions
from data and presenting results.
Uncertainty is measured in units of probability, which is the currency of statistics.
Statistics is concerned with the study of data-driven decision making in the face
of uncertainty.
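As an illustration of 'measuring and communicating uncertainty', here is a minimal Python sketch that attaches an interval to a sample mean. The percentile bootstrap is only one possible technique and is chosen here for brevity; the function name `bootstrap_ci` and the measurements in `data` are hypothetical and do not come from the slides.

```python
import random
import statistics


def bootstrap_ci(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the result can be recomputed
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical measurements, for illustration only.
data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 4.9, 5.2, 4.7]
low, high = bootstrap_ci(data)
print(f"mean = {statistics.fmean(data):.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

Reporting the interval alongside the point estimate is the step that turns a number into a statement about uncertainty.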
21. What distinguishes data science from statistics?
• Statistics traditionally is concerned with analysing primary (e.g. experimental) data
that have been collected to explain and check the validity of specific existing ideas
(hypotheses).
Primary data analysis or top-down (explanatory and confirmatory) analysis.
‘Idea (hypothesis) evaluation or testing’.
• Data science (or data mining), on the other hand, typically is concerned with
analysing secondary (e.g. observational or ‘found’) data that have been collected
for other reasons (and not ‘under control’ of the investigator) to create new ideas
(hypotheses).
Secondary data analysis or bottom-up (exploratory and predictive) analysis.
‘Idea (hypothesis) generation’.
22. • The two approaches of ‘learning from data’ are complementary and should proceed
side by side — in order to enable proper data-driven decision making.
Example. When historical data are available the idea to be generated from a bottom-
up analysis (e.g. using a mixture of so-called ‘ensemble techniques’) could be
‘which are the most important (from a predictive point of view) factors
(among a ‘large’ list of candidate factors) that impact a given process
output (or a given indicator)?’.
Mixed with subject-matter knowledge this idea could result in a list of a ‘small’
number of factors (i.e. ‘the critical ones’).
The confirmatory tools of top-down analysis (statistical ‘Design Of Experiments’,
DOE, in most of the cases) could then be used to confirm and evaluate this idea.
By doing this, new data will be collected (about ‘all’ factors) and a bottom-up
analysis could be applied again — letting the data suggest new ideas to test.
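The loop described in this example can be sketched in Python under loud assumptions. The screening step below ranks candidate factors by their bootstrap-averaged absolute correlation with the output, which is a simple stand-in for the 'ensemble techniques' mentioned above and not the method used in the talk. The confirmatory step only enumerates a two-level full factorial design for the short-listed factors. The data are synthetic, and all names (`screen_factors`, `full_factorial`, `x0` to `x7`) are hypothetical.

```python
import itertools
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def screen_factors(columns, output, n_boot=200, seed=0):
    """Bottom-up step: rank factors by bootstrap-averaged |correlation|."""
    rng = random.Random(seed)
    n = len(output)
    scores = {name: 0.0 for name in columns}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # one bootstrap resample
        y = [output[i] for i in idx]
        for name, col in columns.items():
            scores[name] += abs(pearson([col[i] for i in idx], y)) / n_boot
    return sorted(scores, key=scores.get, reverse=True)


def full_factorial(factors):
    """Top-down step: all low/high (-1, +1) runs for the chosen factors."""
    return [
        dict(zip(factors, levels))
        for levels in itertools.product((-1, 1), repeat=len(factors))
    ]


# Synthetic 'historical' data: only x2 and x5 truly drive the output.
rng = random.Random(1)
n = 300
columns = {f"x{j}": [rng.gauss(0, 1) for _ in range(n)] for j in range(8)}
output = [
    3 * columns["x2"][i] - 2 * columns["x5"][i] + rng.gauss(0, 1)
    for i in range(n)
]

ranking = screen_factors(columns, output)  # idea generation
critical = ranking[:2]                     # cut-off set with subject-matter input
design = full_factorial(critical)          # runs for a confirmatory experiment
print(critical, len(design))
```

In practice the cut-off for the 'critical' factors would come from subject-matter knowledge rather than a fixed count, and the new data collected from the designed runs would feed the next bottom-up pass.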
23. ‘Neither exploratory nor confirmatory is sufficient alone.
To try to replace either by the other is madness. We
need them both.’
John W. Tukey, 1980
24. Data-driven decision making and scientific investigation (Box, 1976)
Source: Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791–799.
25. ‘Experiments may be conducted sequentially so that
each set may be designed using the knowledge gained
from the previous sets.’
George E. P. Box and K. B. Wilson, 1951
Scientific investigation is a sequential learning process!
Statistical methods allow investigators to accumulate knowledge!
26. 3. Data-informed policy making
• In September 2015, at the United Nations General Assembly, heads of state and
government came together to launch a new and ambitious agenda for world
development from 2016 to 2030.
The ‘Sustainable Development Goals’ (SDGs) set out 17 goals with 169 targets
and more than 300 indicators to monitor progress.
27. The 17 SDGs:
28. Links among SDGs through their targets, based on scientific assessment:
Source: ‘Global Sustainable Development Report 2015’ (sustainabledevelopment.un.org/globalsdreport/2015).
29. Source: United Nations General Assembly, The 2030 Agenda for Sustainable Development, September 18, 2015
(sustainabledevelopment.un.org/post2015/transformingourworld).
30. ‘While figures, like the number of people living in
poverty, are regularly quoted, they are little more than
guesswork, a report from the ‘Overseas Development
Institute’ (ODI) found earlier this year. According to the
report there could be 350 million more people living
in poverty than we realise. We also do not know how
many girls are married before the age of 18; the
percentage of the world’s poor who are women; the
number of street children worldwide; and how many people
in the world are hungry.’
Sarah Shearman, 2015
Source: Sarah Shearman’s article ‘Data ’crucial’ to eradicating poverty’ in the
Guardian, September 28, 2015 (goo.gl/DBTwza).
31. ‘Without data we are flying blind, and we can not do
evidence-based policy decisions — or any decision at
all.’
Johannes Jütting, 2015
Source: Johannes Jütting, Manager of the Partnership in Statistics for Development in the 21st Century
(Paris21), quoted in Sarah Shearman’s article ‘Data ’crucial’ to eradicating poverty’ in the
Guardian, September 28, 2015 (goo.gl/DBTwza).
32. • Policy makers want the top of the iceberg, but they need to remember the stuff
beneath the sea (adapted from @HetanShah):
33. ‘Over time statistics has not been seen as one of the
sexiest topics. ... For policy makers with short-term
mandates, it is sometimes more sexy to achieve
something in a field that is more visible.’
Pieter Everaers, 2015
Source: Pieter Everaers, Director Cooperation in the European Statistical System, International Cooperation,
Resources at Eurostat, quoted in Sarah Shearman’s article ‘Data ’crucial’ to eradicating poverty’ in the
Guardian, September 28, 2015 (goo.gl/DBTwza).
34. 4. Conclusion and opportunities
• Decision making that was once based on hunches and intuition should be driven by
data (data-driven decision making).
• The key elements for a successful (big) data analytics and data science future are
statistical principles and rigour of humans!
• Statistics, (big) data analytics and data science are aids to thinking and not
replacements for it!
35. • Data are key for policy making and for accountability in all countries of the world
(data-informed policy making).
• Big data, e.g. using open data and new data sources, could contribute to the
monitoring of progress of the SDGs and the effectiveness of policies, programmes
and activities.
They should be envisaged as complements to (official) statistics, not as
replacements for them!
A long-term vision for the use of big data needs to be developed, e.g. by national
statistical offices.
36. Video Big Data for Official Statistics from the United Nations Department of
Economic and Social Affairs at youtu.be/5G5hGu0lnqI , October 13, 2015.
37. • Do not neglect the following four principles that ensure successful outcomes:
– use of sequential approaches to problem solving and improvement, as studies
are rarely completed with a single data set but typically require the sequential
analysis of several data sets over time;
– having a strategy for the project and for the conduct of the analysis of data
(‘strategic thinking’);
– carefully considering data quality and how data will be analysed (‘data pedigree’);
and
– applying sound subject matter knowledge (‘domain knowledge’), which should
be used to help define the problem, to assess the data pedigree, to guide analysis
and to interpret the results.
38. • Some challenges from a statistical perspective include
– the ethics of using and linking (big) data, particularly in relation to personal data,
i.e. ethical issues related to privacy (‘information rules’ need to be defined),
confidentiality (of shared private information), transparency (e.g. of data uses
and data users) and identity (i.e. data should not compromise identity);
– the provenance of the data, e.g. the quality of the data — including issues like
omissions, data linkage errors, measurement errors, censoring, missing
observations, atypical observations, missing variables (‘omitted variable bias’), the
characteristics and heterogeneity of the sample — big data being ‘only’ a sample
(at a particular time) of a population of interest (‘sampling/selection bias’,
i.e. is the sample representative of the population it was designed for?);
39. ‘Data are the lifeblood of decision-making and the
raw material for accountability. Without high-quality
data providing the right information on the right
things at the right time; designing, monitoring and
evaluating effective policies becomes almost
impossible.’
IEAG, 2014
Source: United Nations Secretary-General’s ‘Independent Expert Advisory Group on a Data Revolution for
Sustainable Development’ (IEAG), A World That Counts: Mobilising The Data Revolution for
Sustainable Development, November 6, 2014 (www.undatarevolution.org/report/).
40. – spurious (false) associations (‘coincidence’ increases, i.e. it becomes more
likely, as sample size increases, and as such ‘there are always patterns’) versus
valid causal relationships (‘confirmation bias’);
– the validity of generalisation (avoid ‘overfitting’, i.e. interpreting an
exploratory analysis as predictive);
– the replicability of findings, i.e. that an independent experiment targeting the
same question(s) will produce consistent results, and the reproducibility of
findings, i.e. the ability to recompute results given observed data and knowledge of
the data analysis pipeline;
– the nature of uncertainty (both random and systematic).
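The point that 'there are always patterns' can be made concrete with a small simulation. The Python sketch below is an illustration under stated assumptions, not material from the talk: it reads 'size' in the sense of the number of candidate variables examined (the 'dimensionality' side of volume), generates pure noise, and reports the strongest correlation found with an equally random target. The function name `max_spurious_correlation` and all settings are hypothetical.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_spurious_correlation(n_obs, n_vars, seed=0):
    """Largest |correlation| between a random target and n_vars noise variables."""
    rng = random.Random(seed)  # fixed seed keeps the experiment reproducible
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_vars):
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]  # unrelated by construction
        best = max(best, abs(pearson(noise, target)))
    return best


# Search more candidate variables, find a 'stronger' pattern in pure noise.
results = {k: max_spurious_correlation(50, k) for k in (10, 100, 1000)}
for n_vars, r in results.items():
    print(f"{n_vars:5d} candidate variables: max |r| = {r:.2f}")
```

None of the variables is related to the target, so any association reported here is coincidence. A confirmatory analysis on fresh data is what separates such findings from valid relationships.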
41. ‘The SDGs will prove pivotal in the data for
development debate. But because improving data and
statistics requires a long-term investment, will the 15-year
time frame be long enough?’
Sarah Shearman, 2015
Source: Sarah Shearman’s article ‘Data ’crucial’ to eradicating poverty’ in the
Guardian, September 28, 2015 (goo.gl/DBTwza).
43. Have you been Statooed?
Dr. Diego Kuonen, CStat PStat CSci
Statoo Consulting
Morgenstrasse 129
3018 Berne
Switzerland
email kuonen@statoo.com
@DiegoKuonen
web www.statoo.info
/Statoo.Consulting
44. Copyright © 2001–2015 by Statoo Consulting, Switzerland. All rights reserved.
No part of this presentation may be reprinted, reproduced, stored in, or introduced
into a retrieval system or transmitted, in any form or by any means (electronic,
mechanical, photocopying, recording, scanning or otherwise), without the prior written
permission of Statoo Consulting, Switzerland.
Warranty: none.
Trademarks: Statoo is a registered trademark of Statoo Consulting, Switzerland.
Other product names, company names, marks, logos and symbols referenced herein
may be trademarks or registered trademarks of their respective owners.
Presentation code: ‘WSD.20.10.2015’.
Typesetting: LaTeX, version 2ε. PDF producer: pdfTeX, version 3.141592-1.40.3-2.2 (Web2C 7.5.6).
Compilation date: 19.10.2015.