Introduction to Data Science
- Big Data & Data Analytics -
Yasas Senarath
Graduate Assistant Researcher at DataSEARCH
University of Moratuwa
Outline
● Introduction to Big Data and Data Science
● Data Driven Decision Making / D3M
● Importance of Big Data in Telehealth Services
● Data to Knowledge Process
● Techniques and Tools
Data is the new science.
Big data holds the answers.
-Pat Gelsinger, CEO, VMware
What is Big Data?
“Big data is a term used to refer to data sets that are too large or complex for traditional data-processing application software to adequately deal with.”
--Wikipedia
How to identify Big Data?
● Attributes that define big data (the 4 V’s)
○ Volume
○ Velocity
○ Variety
○ Veracity
Where does Big Data come from?
● Mobile Devices
● Internet of Things (IoT)
● Social Media
● Satellite Imagery
Data Science
● Emerging Discipline
● No exact definition (different definitions exist from different perspectives)
“Data science is the empirical synthesis of actionable knowledge from raw data through the data lifecycle process”
-NIST†
†National Institute of Standards and Technology
Why Data Science?
● Extract new value, insights, and hypotheses
● Derive new knowledge from existing data
● Understand customers’ behaviour
● Connect market demand with suppliers
● Build Recommender systems
● Build predictive systems.
Data-driven decision making (DDDM/D3M)
Data-driven decision making (DDDM) involves making decisions that are backed up by hard data rather than making decisions that are intuitive or based on observation alone.
MIT Sloan School of Management professors Andrew McAfee and Erik Brynjolfsson explain in a Wall Street Journal article that companies that were mostly data-driven had 4% higher productivity and 6% higher profits than the average.
4 Stages of Data Analytics Maturity
Big Data in Telehealth Services
● Predict Admission Rates
○ Big data is helping hospitals anticipate patient admissions, at least at a few hospitals in Paris
○ A Forbes article† details how four hospitals that are part of the Assistance Publique-Hôpitaux de Paris have been using data from a variety of sources to come up with daily and hourly predictions of how many patients are expected to be at each hospital
● Electronic Health Records (EHRs)
○ Trigger warnings and reminders when a patient should get a new lab test, or track prescriptions to see whether a patient has been following doctors’ orders
○ Hospitals adopting EHR?
†https://bit.ly/2FSzTZk
Big Data in Telehealth Services
● Real-Time Alerting
○ Wearables collect data from patients and send it to the cloud
○ React whenever the results are alarming
Diagram labels: Send data periodically; Alert the Doctor; Administer measures; Analyze; Better Treatment Plans
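The alerting idea above can be sketched in a few lines of Python. This is only a minimal illustration, not the system the slide describes: the `Reading` type, the heart-rate signal, and the `LOW_BPM` / `HIGH_BPM` limits are all assumptions made for the example and carry no clinical meaning.

```python
from dataclasses import dataclass

# Hypothetical heart-rate limits (beats per minute); illustrative only,
# not clinical guidance.
LOW_BPM = 40
HIGH_BPM = 120


@dataclass
class Reading:
    patient_id: str
    heart_rate: int  # beats per minute, as reported by the wearable


def needs_alert(reading: Reading) -> bool:
    """Return True when a reading falls outside the assumed safe range."""
    return reading.heart_rate < LOW_BPM or reading.heart_rate > HIGH_BPM


def alerts_for(readings: list[Reading]) -> list[str]:
    """Build one alert message per out-of-range reading."""
    return [
        f"ALERT patient={r.patient_id} heart_rate={r.heart_rate}"
        for r in readings
        if needs_alert(r)
    ]


# A batch of readings as it might arrive from the cloud (made-up values).
batch = [Reading("p1", 72), Reading("p2", 135), Reading("p3", 38)]
for message in alerts_for(batch):
    print(message)
```

A real deployment would probably look at trends over time rather than single readings, but a fixed threshold keeps the sketch short.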
Big Data in Telehealth Services
● Patient Satisfaction Monitoring
○ Collect data on patients’ sentiment toward the doctor / hospital
○ For example,
■ Whether the doctor explained the treatment understandably
■ Whether the patient had confidence and trust in the treating physician
○ Analyze it and use it to improve the quality of health services
● Minimizing Waiting Time
○ Predict the time at which the patient should be available to see the doctor
Big Data in Business
● Sentiment / Opinion Analysis
○ Analyze social media posts and forums
○ Learn how customers feel about your products
○ Give attention where required
● Understanding, Targeting and Serving Customers
○ Analyze usage patterns and understand the customer base (e.g., demographics)
○ Targeted advertising
○ Improved service
Data to Knowledge Process
Data to Knowledge Process [contd...]
Stages: Data Manipulation, Analytics, Communication & Visualization
Data Manipulation steps: Data Acquisition, Data Storage, Data Cleaning
Data Acquisition
● Electronic Medical Records (EMRs)
● User-generated data (Fitbit, iWatch)
● Doctor Channelling Records
● System Logs
● Patient Details
...
● Data acquisition and data formats
● Privacy and ethical issues
Data to Knowledge Process [contd...]
Data Storage
● Big Data
● CSV, TSV, XL
● Databases (MySQL, NoSQL)
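As a small illustration of the flat-file formats listed above, the following sketch reads the same records from CSV and TSV text using Python's standard `csv` module. The sample data and the `read_rows` helper are made up for the example; real data would normally come from files on disk.

```python
import csv
import io

# Hypothetical sample data; in practice these would be files on disk.
CSV_TEXT = "patient_id,age\np1,34\np2,61\n"
TSV_TEXT = "patient_id\tage\np1\t34\np2\t61\n"


def read_rows(text: str, delimiter: str) -> list[dict[str, str]]:
    """Parse delimited text into a list of row dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_rows(CSV_TEXT, ",")
tsv_rows = read_rows(TSV_TEXT, "\t")

# Both formats carry the same records; only the delimiter differs.
print(csv_rows == tsv_rows)
```

Note that every value is read as a string, so numeric columns such as `age` still need converting before analysis.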
Data to Knowledge Process [contd...]
Data Cleaning
● Missing Values
● Outliers
● Human Error
● Machine Error
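Two of the cleaning problems above, missing values and outliers, can be sketched with the standard library alone. The readings below are invented, and median imputation and the 1.5 × IQR rule are just one common choice each, assumed here for illustration rather than taken from the slides.

```python
import statistics

# Hypothetical body-temperature readings in °C; None marks a missing value
# and 98.6 looks like a Fahrenheit value entered by mistake (human error).
raw = [36.6, 36.8, None, 37.1, 98.6, 36.9, 36.7]


def fill_missing(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


filled = fill_missing(raw)
print(iqr_outliers(filled))
```

The median is used for imputation because, unlike the mean, it is barely affected by the erroneous 98.6 entry.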
Data to Knowledge Process [contd...]
Stages: Data Manipulation, Analytics, Communication & Visualization
Analytics steps: Exploratory Data Analysis, Dependency and Relationship, Machine Learning
Exploratory Data Analysis
● Descriptive Statistics
● Clustering
● Looking for patterns
● Hypothesis testing
● Data tendency
● Groups, subgroups
● Looking for abnormality
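Descriptive statistics are usually the first step of exploratory analysis. A minimal sketch using Python's `statistics` module follows; the `daily_visits` numbers are invented for the example.

```python
import statistics

# Hypothetical daily patient counts for one clinic over a week.
daily_visits = [42, 38, 51, 45, 39, 47, 44]

# Central tendency, spread, and range in one summary.
summary = {
    "count": len(daily_visits),
    "mean": statistics.mean(daily_visits),
    "median": statistics.median(daily_visits),
    "stdev": round(statistics.stdev(daily_visits), 2),
    "min": min(daily_visits),
    "max": max(daily_visits),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```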
Data to Knowledge Process [contd...]
Dependency and Relationship
● Association: Do changes in X (seem to) coincide with changes in Y?
● Correlation: How can the association between X and Y be quantified?
● Agreement: Do X and Y agree?
● Causation: Do changes in X cause changes in Y?
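To make the correlation question concrete, here is a small sketch of Pearson's and Spearman's coefficients written with the standard library only. The helper names and the sample data are assumptions for illustration; in practice a statistics library would normally be used instead.

```python
import math


def pearson(xs, ys):
    """Pearson's r: co-variation of X and Y scaled by their individual spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: Y grows monotonically, but not linearly, with X.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Pearson's coefficient measures linear association, while Spearman's only asks whether the ordering is preserved. Neither one says anything about causation.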
Data to Knowledge Process [contd...]
Machine Learning
Techniques and Tools
● Plotting (Scatter Plot, Bar Chart)
● Correlation
○ Pearson’s correlation
○ Spearman’s rank correlation
● Agreement
○ Cohen’s Kappa Coefficient
● Regression
○ Linear Regression
○ Logistic Regression
● Classification
○ SVM
○ Decision Trees
● Java / Scala (Production)
○ Apache Hadoop (Distributed Computing)
○ Apache Spark (Unified Analytics Engine)
● Python (Research / Production)
○ Scikit-Learn (Machine Learning)
○ Keras (Deep Learning)
● Weka Package (Beginner)
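As one worked example from the list above, Cohen's kappa can be computed by hand in a few lines. This is a sketch with made-up ratings and a hypothetical `cohen_kappa` helper; Scikit-Learn offers an equivalent metric, but the version below avoids any dependency.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sentiment labels from two reviewers of the same six comments.
rater_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
rater_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(round(cohen_kappa(rater_1, rater_2), 3))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which makes it stricter than the raw percentage of matching labels.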
Q & A
“Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world.”
--Atul Butte, Stanford
ysenarath wayasas wayasas ypsenarath
Challenges
● Privacy and Security
● Data collection and management
○ Complex Data
○ Noisy Data
○ Distributed Data
○ Data Integration
● Performance
● Background Knowledge

Data science / Big Data

  • 1. Introduction to Data Science - Big Data & Data Analytics - Yasas Senarath Graduate Assistant Researcher at DataSEARCH University of Moratuwa
  • 2. Outline ● Introduction to Big Data and Data Science ● Data Driven Decision Making / D3M ● Importance of Big Data in Telehealth Services ● Data to Knowledge Process ● Techniques and Tools
  • 3. Data is the new science. Big data holds the answers. -Pat Gelsinger, CEO, VMware
  • 4. What is Big Data? Big data is a term used to refer to data sets that are too large or complex for traditional data-processing application software to adequately deal with. --Wikipedia “
  • 5. ● Attributes that define big data (the 4 V’s) How to identify Big Data? Volume Velocity Variety Veracity
  • 6. ● Mobile Devices ● Internet of Things (IOT) ● Social Media ● Satellite Imagery Where does Big Data come from?
  • 7. <iframe width="640" height="480" src="https://ytcropper.com/embed/W_5c4d6496b5141/loop/noautoplay/" frameborder="0" allowfullscreen></iframe><a href="/" target="_blank">via ytCropper</a>
  • 8. ● Emerging Discipline ● No exact definition (Different definitions exist from different perspectives) Data Science †National Institute of Standards and Technology Data science is the empirical synthesis of actionable knowledge from raw data through the data lifecycle process -NIST† “
  • 9. Why Data Science? ● Exact new values, Insights and Hypothesis ● Derive new knowledge from existing data ● Understand customers’ behaviour ● Facilitate the demand market to suppliers ● Build Recommender systems ● Build predictive systems.
  • 10. Data-driven decision making (DDDM) involves making decisions that are backed up by hard data rather than making decisions that are intuitive or based on observation alone. MIT Sloan School of Management professors Andrew McAfee and Erik Brynjolfsson explain in a Wall Street Journal article that companies that were mostly data-driven had 4% higher productivity and 6% higher profits than the average. Data-driven decision making (DDDM/D3M)
  • 11. 4 Stages of Data Analytics Maturity
  • 12. Big Data in Telehealth Services ● Predict Admission Rates ○ Big data is helping to solve this problem, at least at a few hospitals in Paris ○ A Forbes article† details how four hospitals which are part of the Assistance Publique-Hôpitaux de Paris have been using data from a variety of sources to come up with daily and hourly predictions of how many patients are expected to be at each hospital ● Electronic Health Records (EHRs) ○ Trigger warnings and reminders when a patient should get a new lab test or track prescriptions to see if a patient has been following doctors’ orders ○ Hospitals adopting EHR? †https://bit.ly/2FSzTZk
  • 13. Big Data in Telehealth Services ● Real-Time Alerting ○ Wearables will collect data from patients and send this data to the cloud ○ React every time the results will be disturbing Send data periodically Alert the Doctor Administer measures Analize Better Treatment Plans
  • 14. Big Data in Telehealth Services ● Patient Satisfaction Monitoring ○ Collect data on sentiment of the patient on Doctor / Hospital ○ For example, ■ Whether the doctor explained the treatment understandably ■ Whether the patient had confidence and trust in the treating physician ○ Analyze and use it to improve the quality of health services ● Minimizing Waiting Time ○ Predict the time patient should be available to the doctor
  • 15. Big Data in Business ● Sentiment / Opinion Analysis ○ Analyze Social Media Posts and forums ○ Learn how customers feel about your products ○ Give attention where required ● Understanding, Targeting And Serving Customers ○ Analize usage patterns and understand the customer base (Eg: demographic) ○ Targeted Advertising ○ Improved service
  • 16. Data to Knowledge Process
  • 17. Data to Knowledge Process [contd...] Stages: Data Manipulation (Data Acquisition, Data Storage, Data Cleaning) | Analytics | Communication & Visualization ● Electronic Medical Records (EMRs) ● User-generated data (Fitbit, iWatch) ● Doctor Channelling Records ● System Logs ● Patient Details ... ● Data acquisition and data formats ● Privacy and ethical issues
  • 18. Data to Knowledge Process [contd...] Stages: Data Manipulation (Data Acquisition, Data Storage, Data Cleaning) | Analytics | Communication & Visualization ● Big Data ● CSV, TSV, XL ● Databases (MySQL, NoSQL)
  • 19. Data to Knowledge Process [contd...] Stages: Data Manipulation (Data Acquisition, Data Storage, Data Cleaning) | Analytics | Communication & Visualization ● Missing Values ● Outliers ● Human Error ● Machine Error
  • 20. Data to Knowledge Process [contd...] Stages: Data Manipulation | Analytics (Exploratory Data Analysis, Dependency and Relationship, Machine Learning) | Communication & Visualization ● Descriptive Statistics ● Clustering ● Looking for patterns ● Hypothesis testing ● Data tendency ● Groups, subgroups ● Looking for abnormality
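Two of the cleaning problems listed on this slide, missing values and outliers, can be sketched as below. Median imputation and the 1.5 × IQR rule are common conventions chosen here for illustration; the slides do not prescribe a method, and the temperature readings are made up.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical body-temperature readings with one gap and one suspicious value.
raw = [36.6, 36.8, None, 37.0, 36.7, 42.5, 36.9]
clean = impute_missing(raw)
print(clean)
print(iqr_outliers(clean))
```

Whether a flagged value is a human error, a machine error, or a genuine extreme still has to be decided by someone who knows the data.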
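As a small illustration of descriptive statistics over groups and subgroups, the sketch below reports the overall sample size plus the size and mean age of each group. The records and group names are hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (treatment group, age in years).
records = [("control", 54), ("control", 61), ("treated", 47),
           ("treated", 58), ("treated", 66)]


def summarize_by_group(rows):
    """Return {group: (sample size, mean age)} for each subgroup."""
    ages = defaultdict(list)
    for group, age in rows:
        ages[group].append(age)
    return {g: (len(a), statistics.mean(a)) for g, a in ages.items()}


summary = summarize_by_group(records)
print("overall n =", len(records))
for group, (n, mean_age) in summary.items():
    print(group, n, mean_age)
```

This only summarizes the sample at hand; it makes no claim about the wider population.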
  • 21. Data to Knowledge Process [contd...] Stages: Data Manipulation | Analytics (Exploratory Data Analysis, Dependency and Relationship, Machine Learning) | Communication & Visualization ● Association - Do changes in X (seem to) coincide with changes in Y? ● Correlation - How to quantify the association between X and Y? ● Agreement - Do X and Y agree? ● Causation - Do changes in X cause changes in Y?
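To make "quantifying the association" concrete, here is a standard-library sketch of Pearson's and Spearman's coefficients. The data is invented so that Y rises monotonically but not linearly with X, which is the case where the two measures differ.

```python
import math


def pearson(x, y):
    """Pearson's r: co-variation of x and y scaled by both spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Spearman's coefficient reaches 1 here because the ordering is perfectly preserved, while Pearson's stays below 1 because the relationship is not a straight line. Neither number says anything about causation.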
  • 22. Data to Knowledge Process [contd...] Stages: Data Manipulation | Analytics (Exploratory Data Analysis, Dependency and Relationship, Machine Learning) | Communication & Visualization
  • 23. Techniques and Tools ● Plotting (Scatter Plot, Bar Chart) ● Correlation ○ Pearson’s correlation ○ Spearman’s rank correlation ● Agreement ○ Cohen’s Kappa Coefficient ● Regression ○ Linear Regression ○ Logistic Regression ● Classification ○ SVM ○ Decision Trees ● Java / Scala (Production) ○ Apache Hadoop (Distributed Computing) ○ Apache Spark (Unified Analytics Engine) ● Python (Research / Production) ○ Scikit-Learn (Machine Learning) ○ Keras (Deep Learning) ● Weka Package (Beginner)
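Cohen's kappa, listed above under Agreement, corrects the raw agreement rate between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration with made-up labels; in practice scikit-learn's `sklearn.metrics.cohen_kappa_score` computes the same statistic.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical diagnoses from two doctors for the same ten patients.
a = ["sick", "sick", "well", "well", "sick", "well", "well", "well", "sick", "well"]
b = ["sick", "well", "well", "well", "sick", "well", "sick", "well", "sick", "well"]
print(round(cohens_kappa(a, b), 3))
```

Here the doctors agree on 8 of 10 patients, but chance alone would produce agreement on about half of them, so kappa is noticeably lower than the raw 0.8.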
  • 24. Q & A “Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world.” --Atul Butte, Stanford ysenarath wayasas wayasas ypsenarath
  • 25. Challenges ● Privacy and Security ● Data collection and management ○ Complex Data ○ Noisy Data ○ Distributed Data ○ Data Integration ● Performance ● Background Knowledge

Editor's Notes

  1. Data veracity: uncertain or imprecise data. Data veracity is the degree to which data is accurate, precise and trusted. Data is often viewed as certain and reliable. The reality of problem spaces, data sets and operational environments is that data is often uncertain, imprecise and difficult to trust.
  2. National Institute of Standards and Technology
  3. https://www.wsj.com/articles/SB10001424052748704547804576260781324726782#articleTabs%3Darticle
  4. https://www.gartner.com/doc/3818364/itscore-data-analytics Hindsight: understanding of a situation or event only after it has happened or developed.
  5. https://www.datapine.com/blog/big-data-examples-in-healthcare/
  6. https://www.datapine.com/blog/big-data-examples-in-healthcare/ Identify asthma trends both at the individual level and across larger populations.
  7. DS: Descriptive statistics aims to summarize a sample, rather than use the data to learn about the population that the sample of data is thought to represent. For example, in papers reporting on human subjects, typically a table is included giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, the proportion of subjects with related comorbidities, etc.
  8. Scatter plots; Pearson’s correlation coefficient for two MC data types (assumed normal); Spearman’s rank correlation coefficient when either or both variables are ordinal (normality not assumed); Cohen’s kappa coefficient; linear regression; structural equation modelling.
  9. Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
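The shape of a logistic regression model can be sketched in a few lines: a weighted sum of the independent variables is passed through the logistic (sigmoid) function to give a probability for the binary outcome. The coefficients and features below are hand-picked and hypothetical, not fitted to any data; fitting them would normally be left to a library such as scikit-learn.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for (age in years, smoker flag); not fitted to real data.
weights = [0.04, 0.8]
bias = -3.0

p_smoker = predict_probability([50, 1], weights, bias)
p_non_smoker = predict_probability([50, 0], weights, bias)
print(round(p_smoker, 3), round(p_non_smoker, 3))
```

With a positive weight on the smoker flag, the predicted probability for the smoker is higher than for a non-smoker of the same age.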