This is my presentation on the topic "Data Science - An Emerging Stream of Science with its Spreading Reach & Impact". I have compiled statistics and data from a range of sources. It may be useful for students and for anyone else interested in this field of study.
Presentation at Data ScienceTech Institute campuses, Paris and Nice, May 2016, including Intro, Data Science History and Terms; 10 Real-World Data Science Lessons; Data Science Now: Polls & Trends; Data Science Roles; Data Science Job Trends; and Data Science Future
Everybody has heard of Big Data, and its promise as the next great frontier for innovation. However, Big Data is neither new nor easily defined. What are the key drivers that make Big Data so critically important today? What is the single idea behind Big Data that promises such game changing outcomes for capable organizations? Who are the skilled talent that deliver Big Data results?
This presentation briefly reviews the opportunities, motivation and trends that are driving Big Data disruption. Data science is introduced as the enabling engine for Big Data transformation via the creation of new Data Products. The data scientist is defined and his tools, workflow and challenges are reviewed. Finally, practical tips are presented for approaching data product development.
Key takeaways include:
- Big Data disruption is driven by four megatrends
- Data is the essential raw material for creating valuable Data Products
- Data scientists are heterogeneous by role & skill set, but share common tools, workflows and challenges
- Data science talent is more important than raw data for Big Data success
These slides are modified from an invited presentation for the Gwinnett Chamber of Commerce on March 18, 2014. An excerpt was presented at the Georgia Pacific Social Media Working Session on March 19, 2014.
A brief introduction to data science and machine learning, with an emphasis on application scenarios from the traditional to the more innovative. The overview covers the basic definition of data science, an overview of machine learning, and examples drawn from traditional scenarios, recommender systems and social network analysis, IoT and deep learning.
Intro to Data Science for Non-Data Scientists - Sri Ambati
Erin LeDell and Chen Huang's presentations from the Intro to Data Science for Non-Data Scientists Meetup at H2O HQ on 08.20.15
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Keynote - An overview on Big Data & Data Science - Dr Gregory Piatetsky-Shapiro - Data ScienceTech Institute
Data Science Tech Institute - Big Data and Data Science Conference around Dr Gregory Piatetsky-Shapiro.
Keynote - An overview on Big Data & Data Science Dr Gregory Piatetsky-Shapiro - KDnuggets.com Founder & Editor.
Paris May 23rd & Nice May 26th 2016 @ Data ScienceTech Institute (https://www.datasciencetech.institute/)
An introduction to data science: from the origins of the idea to the latest designs, changing trends and enabling technologies, through to the applications already in real-world use today.
Big Data and Data Science: The Technologies Shaping Our Lives - Rukshan Batuwita
Big Data and Data Science have become increasingly imperative areas in both industry and academia to the extent that every company wants to hire a Data Scientist and every university wants to start dedicated degree programs and centres of excellence in Data Science. Big Data and Data Science have led to technologies that have already shaped different aspects of our lives such as learning, working, travelling, purchasing, social relationships, entertainments, physical activities, medical treatments, etc. This talk will attempt to cover the landscape of some of the important topics in these exponentially growing areas of Data Science and Big Data including the state-of-the-art processes, commercial and open-source platforms, data processing and analytics algorithms (specially large scale Machine Learning), application areas in academia and industry, the best industry practices, business challenges and what it takes to become a Data Scientist.
Big Data [sorry] & Data Science: What Does a Data Scientist Do? - Data Science London
What 'kind of things' does a data scientist do? What are the foundations and principles of data science? What is a Data Product? What does the data science process look like? Learning from data: Data Modeling or Algorithmic Modeling? - talk by Carlos Somohano @ds_ldn at The Cloud and Big Data: HDInsight on Azure London 25/01/13
A look back at how the practice of data science has evolved over the years, modern trends, and where it might be headed in the future. Starting from before anyone had the title "data scientist" on their resume, to the dawn of the cloud and big data, and the new tools and companies trying to push the state of the art forward. Finally, some wild speculation on where data science might be headed.
Presentation given to Seattle Data Science Meetup on Friday July 24th 2015.
Pistoia Alliance Webinar Demystifying AI: Centre of Excellence for AI Webina... - Pistoia Alliance
Pistoia Alliance launched its Centre of Excellence for Artificial Intelligence (AI) in Life Sciences where we hope to bring together best practice, adoption strategy and hackathons covering a range of challenges.
Over the coming months we will be hosting a series of topics and speakers giving their perspectives on the role of Artificial & Augmented Intelligence in Life Sciences and Healthcare.
The topics will cover some of the current challenges, user stories & value in using AI in life sciences. If you want to get involved in this series as a speaker or suggest topics, please get in touch.
Webinar 1 will focus on the following:
A Brief History
Big Data/ML/DL/AI - fundamentals and concepts
Data Fidelity importance
Some best practices
Data science is different from Data Analytics, Data Engineering and Big Data.
Presentation about Data Science.
What is Data Science: its process, future and scope.
Data Science Presentation By Amit Singh.
"Sexiest job of 21st century"
My presentation on Data Mining, Lessons from Competitions, and Public Data looks at the Data Mining/Data Science/Big Data evolution, reviews lessons from KDD Cup 1997, Netflix Prize, and Kaggle, presents a big list of Public and Government data APIs, Marketplaces, Portals, and Platforms, and examines Big Data Hype. This talk was given at BPDM-2013, (Broadening Participation in Data Mining), Aug 10, 2013 held at KDD-2013, Chicago.
Creating a Data Science Ecosystem for Scientific, Societal and Educational Im... - Ilkay Altintas, Ph.D.
The new era of data science is here. Our lives and society are continuously transformed by our ability to collect data in a systematic fashion and turn that into value. The opportunities created by this change also come with challenges that push for new and innovative data management and analytical methods, as well as for translating these new methods into applications in many areas that impact science, society, and education. Collaboration, and the ability of multi-disciplinary teams to work together and communicate to bring together the best of their knowledge in business, data and computing, is vital for impactful solutions. This talk will discuss a reference ecosystem and question-driven methodology, called PPODS, for building impactful data science applications in many fields, with specific examples in hazards, smart cities and biomedical research.
A Practical-ish Introduction to Data Science - Mark West
In this talk I will share insights and knowledge that I have gained from building up a Data Science department from scratch. This talk will be split into three sections:
1. I'll begin by defining what Data Science is, how it is related to Machine Learning and share some tips for introducing Data Science to your organisation.
2. Next up we'll run through some Machine Learning algorithms commonly used by Data Scientists, along with example use cases where these algorithms can be applied.
3. The final third of the talk will be a demonstration of how you can quickly get started with Data Science and Machine Learning using Python and the Open Source scikit-learn Library.
Data Scientist has been regarded as the sexiest job of the twenty-first century. As data in every industry keeps growing, the need to organize, explore, analyze, predict and summarize is insatiable. Data Science is creating new paradigms in data-driven business decisions. As the field emerges from its infancy, a wide range of skill sets is becoming an integral part of being a Data Scientist. In this talk I will discuss the different data-driven roles and the expertise required to be successful in them. I will highlight some of the unique challenges and rewards of working in a young and dynamic field.
About
Evolution of Data, Data Science, Business Analytics, Applications, AI, ML, DL, Data science – Relationship, Tools for Data Science, Life cycle of data science with case study, Algorithms for Data Science, Data Science Research Areas, Future of Data Science.
A huge amount of data is being collected everywhere: when we browse the web, go to the doctor's clinic, visit the supermarket, tweet or watch a movie. This plethora of data is dealt with under a new realm called Data Science. Data Science is now recognized as a highly critical, growing area with impact across many sectors including science, government, finance, health care, social networks, manufacturing, advertising, retail, and others. This colloquium will try to provide an overview of this emerging field and to clarify some of its finer points.
5. There's certainly a lot of it!
Data, data everywhere…
[Figure: data produced each year, plotted on a logarithmic scale running from 1 Petabyte through 1 Exabyte to 1 Zettabyte; marked values include 14 PB, 60 PB and 120 PB]
Data produced each year: 5 EB (2002), 161 EB (2006), 800 EB (2009), 1.8 ZB (2011), 8.0 ZB (2015)
Human brain's capacity: 14 PB
100 years of HD video + audio: 60 PB
1 Petabyte = 1000 TB; 1 TB = 1000 GB
References
(brain) 14 PB: http://www.quora.com/Neuroscience-1/How-much-data-can-the-human-brain-store
(2002) 5 EB: http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm
(2006) 161 EB: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
(2009) 800 EB: http://www.emc.com/collateral/analyst-reports/idc-digital-universe-are-you-ready.pdf
(2011) 1.8 ZB: http://www.emc.com/leadership/programs/digital-universe.htm
(2015) 8 ZB: http://www.emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf
(life in video) 60 PB: in 4320p resolution, extrapolated from 16 MB for 1:21 of 640x480 video (w/sound); almost certainly a gross overestimate, as sleep can be compressed significantly!
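The figures on this slide mix petabytes, exabytes and zettabytes, so putting them in one unit makes them easier to compare. The following is a minimal Python sketch, assuming decimal (SI) units as the slide does (1 Petabyte = 1000 TB) and using only the yearly estimates quoted above. The helper names `to_bytes` and `annual_growth` are illustrative and not part of the original deck.

```python
# Decimal (SI) byte units, matching the slide's "1 Petabyte == 1000 TB".
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value, unit):
    """Convert a quantity such as (1.8, "ZB") to bytes."""
    return value * UNITS[unit]


# Yearly data-production estimates quoted on the slide.
data_per_year = {
    2002: (5, "EB"),
    2006: (161, "EB"),
    2009: (800, "EB"),
    2011: (1.8, "ZB"),
    2015: (8.0, "ZB"),
}


def annual_growth(start_year, end_year):
    """Average compound growth rate per year between two of the estimates."""
    start = to_bytes(*data_per_year[start_year])
    end = to_bytes(*data_per_year[end_year])
    return (end / start) ** (1 / (end_year - start_year)) - 1


# How many "human brain capacities" (14 PB, per the slide) fit in the 2015 figure.
ratio_2015_to_brain = to_bytes(8.0, "ZB") / to_bytes(14, "PB")

print(f"2002-2015 growth per year: {annual_growth(2002, 2015):.0%}")
print(f"2015 output vs. the 14 PB brain estimate: {ratio_2015_to_brain:,.0f}x")
```

The growth rate here is a simple compound average between two quoted estimates. It is not a forecast, and it inherits whatever uncertainty the underlying reports carry.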
6. Data has become a resource that needs to be carefully stored, processed, analyzed, visualized and securely presented wherever it is required.
7. Growing Need for Analytics
DATA HARNESSING
Companies store each piece of information generated during the business operations and customer interactions.
Data is generated. Data is analyzed. Learning from the data is used in the decision making and process optimization.
DATA VOLUMES
Volumes in Trillion GB: 1.2 (2010), 2.4 (2012), 7.9 (2015)
DID YOU KNOW?
Generation of Large Amount of Data from Business Transactions
Number of transactions every year: 4 Billion
Number of Stores: 900
Number of SKUs: 10000 - 1 lakh
10. Fourth Paradigm of Science
Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science:
• Thousands of years
• Empirical
• Few hundreds of years
• Theoretical
• Last fifty years
• Computational
• "Query the world"
• Last twenty years
• eScience (Data Science)
• "Download the world"
11.
12. What is Data Science
• Data Science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.
• Data Science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.
• The availability of high-capacity networks, low-cost computers and storage devices, as well as the widespread adoption of hardware virtualization, service-oriented architecture, and autonomic and utility computing, have led to growth in cloud computing.
14. Data Science : A Definition
Data Science is the science which uses computer science, statistics and
machine learning, visualization and human-computer interactions to:
1. Collect
2. Clean
3. Integrate
4. Analyze
5. Visualize
6. Interact
with data to create data products.
The objective of Data Science is to "Turn Data into Data Products".
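The steps listed on this slide can be read as stages of a small pipeline. Below is a minimal Python sketch of the first five steps (the "Interact" step is left out), using only the standard library. The CSV contents, the customer-to-region mapping and all function names are hypothetical, invented purely for illustration, and a real project would more likely rely on dedicated data libraries.

```python
import csv
import io
from collections import Counter

# Hypothetical raw inputs standing in for two collected sources.
SALES_CSV = """customer_id,amount
1,120.50
2,
3,80.00
1,40.25
2,60.00
"""
CUSTOMERS = {1: "North", 2: "South", 3: "North"}


def collect(text):
    """Collect: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Clean: drop rows with a missing amount and convert field types."""
    return [
        {"customer_id": int(r["customer_id"]), "amount": float(r["amount"])}
        for r in rows
        if r["amount"].strip()
    ]


def integrate(rows, customers):
    """Integrate: attach each customer's region from a second source."""
    return [{**r, "region": customers[r["customer_id"]]} for r in rows]


def analyze(rows):
    """Analyze: total sales per region."""
    totals = Counter()
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def visualize(totals, width=20):
    """Visualize: a plain-text bar chart, scaled to the largest total."""
    peak = max(totals.values())
    return "\n".join(
        f"{region:<6} {'#' * round(width * value / peak)} {value:.2f}"
        for region, value in sorted(totals.items())
    )


totals = analyze(integrate(clean(collect(SALES_CSV)), CUSTOMERS))
print(visualize(totals))
```

The per-region totals and the printed chart are the "data product" in this toy case. Each stage is a plain function, so any one of them can be replaced without touching the others.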
15. Traditionally, the data we had was mostly structured and small in size, and could be analyzed using simple BI tools. Today, by contrast, most data is unstructured or semi-structured. The data trends shown in the slide's image indicate that by 2020, more than 80% of data will be unstructured.
22. What is Analytics?
Data on its own is useless unless you can make sense of it!
The scientific process of transforming data into insight for making
better decisions, offering new opportunities for a competitive
advantage
23.
24. Types of Analytics
• Descriptive analytics: mining data to provide business insights. What has happened?
• Predictive analytics: predicting the future based on historical patterns. What could happen?
• Prescriptive analytics: enabling smart decisions based on data. What should we do?
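The three questions on this slide (what has happened, what could happen, what should we do) can be illustrated on a toy series. The Python sketch below is a minimal illustration under stated assumptions: the monthly sales figures are hypothetical, the forecast is a plain least-squares trend line, and the "prescription" is a simple reorder rule. None of this comes from the original deck.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold), for illustration only.
sales = [100, 110, 120, 130, 140, 150]


def descriptive(history):
    """What has happened? Summarise the past."""
    return {"mean": mean(history), "min": min(history), "max": max(history)}


def predictive(history):
    """What could happen? Extend a least-squares linear trend one step ahead."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def prescriptive(forecast, stock_on_hand):
    """What should we do? Turn the forecast into a recommended action."""
    shortfall = forecast - stock_on_hand
    return f"order {shortfall:.0f} units" if shortfall > 0 else "no order needed"


summary = descriptive(sales)
forecast = predictive(sales)
print(summary)
print(f"forecast for next month: {forecast:.0f}")
print(prescriptive(forecast, stock_on_hand=120))
```

Each function answers one of the three questions, and each builds on the previous one's output. That layering is the relationship the slide describes.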
25. Types of Analytics
• Prescriptive Analytics: advice on possible outcomes
• Predictive Analytics: understanding the future
• Descriptive Analytics: insight into the past
Example questions:
• Why do airline prices change every hour?
• How do grocery cashiers know to hand you coupons you might actually use?
• How does Netflix frequently recommend just the right movie?
26. Business Intelligence (BI) vs. Data Science
Features | Business Intelligence (BI) | Data Science
Data Sources | Structured (usually SQL, often Data Warehouse) | Both structured and unstructured (logs, cloud data, SQL, NoSQL, text)
Approach | Statistics and Visualization | Statistics, Machine Learning, Graph Analysis, Natural Language Processing (NLP)
Focus | Past and Present | Present and Future
Tools | Pentaho, Microsoft BI, QlikView, R | RapidMiner, BigML, Weka, R
28. Interest in the term "Data Science" since December 2013 (source: Google Trends)
Hype bag-of-words. Let's not focus on buzzwords, but on what the underlying technologies can actually solve.
30. Contrast: Databases
Aspect | Databases | Data Science
Data Value | “Precious” | “Cheap”
Data Volume | Modest | Massive
Examples | Bank records, personnel records, census, medical records | Online clicks, GPS logs, tweets, building sensor readings
Priorities | Consistency, error recovery, auditability | Speed, availability, query richness
Structured | Strongly (schema) | Weakly or none (text)
Properties | Transactions, ACID* | CAP* theorem (2/3), eventual consistency
Realizations | SQL | NoSQL: MongoDB, CouchDB, HBase, Cassandra, Riak, Memcached, Apache River, …
* ACID = Atomicity, Consistency, Isolation and Durability
* CAP = Consistency, Availability, Partition Tolerance
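To make the "Transactions, ACID" entry concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the balances are invented for illustration and do not come from the slides. The transfer credits one account and then debits the other inside one transaction; if the debit violates the balance constraint, the whole transaction is rolled back, which is the atomicity and consistency part of ACID.

```python
import sqlite3

# In-memory database with a hypothetical accounts table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the earlier credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "alice", "bob", 500) fails and leaves both balances untouched, even though the credit to bob had already been executed before the constraint was violated.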
31. Contrast: Machine Learning
Data Science
• Explore many models, build and tune hybrids
• Understand empirical properties of models
• Develop/use tools that can handle massive datasets
• Take action!
Machine Learning
• Develop new (individual) models
• Prove mathematical properties of models
• Improve/validate on a few, relatively clean, small datasets
• Publish a paper
33. The first war: Terminology
• Analyzing data has a long history!
• There have been many terms that have been used to describe such
endeavors:
• Statistics
• Artificial Intelligence
• Machine learning
• Data analytics
• Since I happen to work in a “Data Science” program perhaps I may be
allowed the indulgence of using that terminology…
34. The Case for Business Analytics
BUSINESS NEED
• The business environment today is more complex than ever before.
• Businesses are expected to be diligently responsive to the increasing
demands of customers, various stakeholders and even regulators.
SOLUTION
• Organizations have been turning to the use of analytics.
• More than 83% of global CIOs surveyed by IBM in 2010 singled out
Business Intelligence and Analytics as one of their visionary plans for
enhancing competitiveness.
GOAL
In most cases the primary objective of an organization that turns to
analytics is:
• Revenue/profit growth
• Expenditure optimization
35. Data Analysis Has Been Around for a While…
• R.A. Fisher
• Howard Dresner
• Peter Luhn
• W.E. Deming
36. Experiments, observations, and numerical simulations in many
areas of science and business are currently generating terabytes of
data, and in some cases are on the verge of generating petabytes
and beyond. Analyses of the information contained in these data
sets have already led to major breakthroughs in fields ranging from
genomics to astronomy and high-energy physics and to the
development of new information-based industries.
- Frontiers in Massive Data Analysis, National Research Council of the National Academies
Given a large mass of data, we can by judicious selection
construct perfectly plausible unassailable theories—all of
which, some of which, or none of which may be right.
- Paul Arnold Srere
37. The ability to take data—to be able to understand it, to process it, to
extract value from it, to visualize it, to communicate it—that’s going
to be a hugely important skill in the next decades, not only at the
professional level but even at the educational level for elementary
school kids, for high school kids, for college kids. Because now we
really do have essentially free and ubiquitous data. So the
complementary scarce factor is the ability to understand that data
and extract value from it.
-Hal Varian, Google's Chief Economist, http://www.mckinsey.com/insights/innovation/hal_varian_on_how_the_web_challenges_managers
My personal goal: Getting students to be able to
think critically about data.
38. What is Big Data?
There are many examples of "data", but what makes some of it “big”? The classic
definition revolves around the three V’s: volume, velocity, and variety.
Volume: There is just a lot of it being generated all the time. Things get
interesting, and “big”, when you can’t fit it all on one computer anymore.
Why? Because then you need ideas such as MapReduce, Hadoop, etc., which all
revolve around being able to process data that grows from terabytes to
petabytes to exabytes.
Velocity: Data is being generated very quickly. Can you even store it all? If
not, then what do you get rid of and what do you keep?
Variety: The data types you mention all take different shapes. What does it
mean to store them so that you can play with or compare them?
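The MapReduce idea mentioned under Volume can be sketched in a few lines. This is a single-machine toy written for illustration, not Hadoop's API: the function names are assumptions, and a real framework would run the map and reduce steps in parallel across many machines, which is what makes the approach useful once data no longer fits on one computer.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


counts = word_count(["big data", "Big ideas about data"])
```

Because each document is mapped independently and each key is reduced independently, both phases can be distributed without changing the result.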
39. Big Data Everywhere!
BIG DATA: Data that is too large and too complex for conventional data
tools to capture, store and analyze.
• Shares traded on US stock markets each day: 7 billion
• Data generated in one flight from NY to London: 10 terabytes
• Number of tweets per day on Twitter: 400 million
• Number of ‘Likes’ each day on Facebook: 3 billion
The 3 V’s of Big Data: Volume, Variety, Velocity
90% of the world’s data was generated in the last two years.
www.imarticus.org
42. Is Big Data the same as Data Science?
Are Big Data and Data Science the same thing?
I wouldn't say so...
Data Science can be done on small data sets.
And not everything done using Big Data would necessarily be called Data
Science.
But there certainly is a substantial overlap!
(Diagram: two overlapping circles labeled “Big Data” and “Data Science”)
43. Perspective Of Big Data's Growth
• Worldwide Big Data market revenues for software and services are projected to
increase from $42B in 2018 to $103B in 2027, attaining a Compound Annual
Growth Rate (CAGR) of 10.48% according to Wikibon.
• According to an Accenture study, 79% of enterprise executives agree that
companies that do not embrace Big Data will lose their competitive position and
could face extinction. Even more, 83%, have pursued Big Data projects to seize a
competitive edge.
• Forrester predicts the global Big Data software market will be worth $31B this
year, growing 14% from the previous year. The entire global software market is
forecast to be worth $628B in revenue, with $302B from applications.
• 59% of executives say Big Data at their company would be improved through the
use of AI according to PwC.
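The Wikibon growth figure above can be checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The short sketch below is an illustration added here; the function name is an assumption, and the inputs are the $42B (2018) and $103B (2027) values quoted on the slide.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted on the slide: $42B in 2018 growing to $103B in 2027.
rate = cagr(42, 103, 2027 - 2018)
print(f"{rate:.2%}")  # prints 10.48%
```

The result matches the 10.48% quoted from Wikibon, which confirms that the figure is computed over nine yearly periods.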
51. Future Trends
Technologies and industries to watch in the near future:
• Progressive Web Apps (PWAs): a mixture of mobile and web apps.
• Blockchain & Fintech: meta-model building, reliable trading and credit scoring.
• Healthcare: diagnosis by medical imaging (computer vision and ML).
• AR/VR: sports analysis, business cards (image tracking), real-life gaming
(Hado).
• AI speech assistants and smarter chatbot integrations.
• Smart Supply Chain: digital twins (IoT sensors).
• 5G: big data, mobile cloud computing, scalable IoT and network function
virtualisation (NFV).
• 3D Printing: prefabrication efficiency, defect detection, predictive ML
maintenance.
• Dark Data: information that is yet to become available in digital format.
• Quantum Computing: cutting data processing times to a fraction.
53. Thank You!
Dr. Sunil Kr Pandey
Professor & Director (IT & UG)
Institute of Technology & Science
Mohan Nagar, Ghaziabad
Email: sunilpandey@its.edu.in