Data Science
Shankar Radhakrishnan
Cognizant
History…
• Questions first, data later	

• Data model first, data processing later	

• Size first, project second, react over time

• Focus on accuracy, assume little	

• Importance given to completeness and comprehensiveness

• Expose raw data to decision makers	

• Provide insights, but not actionable ones

• Bound by constraints (Procurement, Process, Build Insights, Interaction)
What’s Changed?
• Medium to participate is vast	

• Mode to reach expanded	

• Data types are vast and voluminous	

• Noise is huge, yet accepted	

• Urgency precedes accuracy	

• Guidance is better than completeness	

• Cost to store and process has fallen (and is still falling)

• More ways and means to process data at scale
Speaking of Data
• Volume - Data at rest	

• Variety - Data in many forms	

• Velocity - Data in motion	

• Veracity - Data in doubt
Data Science
“ Data Science is the art of turning data into actions ”
This is accomplished through the creation of data products that provide actionable information without exposing the underlying data or analytics.
“ Scientific study of the creation, validation and transformation of data to create meaning ”
http://www.datascienceassn.org/code-of-conduct.html
While we are on definitions…
Data Mining
“ Non-trivial process of identifying valid, novel, potentially
useful and understandable structures or patterns or models or
relationships in data to enable data driven decision making ”
Statistics
“ Science of learning from data or of making sense out of data ”
Science of Data Science
• Analyze and understand data that’s available	

• Find and acquire what more is needed	

• Discover what’s not known from data	

• Predict and build “actionable insights” from data	

• Build data products that have “immediate” business impact

• Make it easy for business to “use”	

• Help decision making to drive “business value”
Data Science Toolkit
• Languages: Python, R, Java, TextWrangler, SQL, C, C++

• Libraries: Mahout, NLTK, OpenNLP, GPText, SciPy, Pandas, scikit-learn

• Database: Hadoop, Hive, HAWQ, PL/Python, PL/R, PL/Java, Proprietary

• Visualization: D3.js, Gephi, Graphviz, R, Tableau, Proprietary
Approach, Techniques
• Step: Classification, Filtering, Structure, Clustering, Disambiguation, De-duplication, Normalization, Correlation, Prediction

• Process: Discover, Reason, Model, Deploy, Visualize, Recommend, Predict, Explore

• Technology: Machine Learning, Decision Trees, Bayesian Networks, Logistic Regression, Monte Carlo Methods, Component Analysis, Fuzzy Modeling, Neural Networks, Genetic Algorithms
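Three of the steps named above (normalization, de-duplication, correlation) can be sketched in a few lines of standard-library Python. This is only an illustration, not something taken from the deck: the function names, the record layout and the sample records are all assumptions made up for the example.

```python
import math


def normalize(name: str) -> str:
    """Normalization: lower-case, trim, and collapse internal whitespace."""
    return " ".join(name.lower().split())


def deduplicate(records):
    """De-duplication: keep the first record seen for each normalized name."""
    seen = set()
    unique = []
    for name, value in records:
        key = normalize(name)
        if key not in seen:
            seen.add(key)
            unique.append((key, value))
    return unique


def pearson(xs, ys):
    """Correlation: Pearson coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical records (name, visits); invented purely for illustration.
raw = [("  Alice Smith", 3), ("alice   smith", 3), ("Bob Jones", 5)]
clean = deduplicate(raw)
```

Here `clean` holds one entry per distinct normalized name, and `pearson` can then be applied to any two numeric columns drawn from the cleaned records. A real pipeline would more likely use Pandas or SciPy from the toolkit slide.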
Data Science In Action
• Improving User Experience	

• Multi-device event stream analysis	

• Intrusion detection, avoidance	

• Collocation analysis from cell-phone towers

• Text Mining, Bandwidth Throttling
• Network Performance & Optimization

• Mobile User Location Analytics	

• Customer Churn Prevention	

• Social Media and Sentiment Analysis

• Location Based Initiatives
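As one possible illustration of how a use case such as customer churn prevention might pair with a listed technique, the sketch below fits a logistic regression with plain batch gradient descent. The deck does not describe any implementation; the features (support calls per month, tenure in years), the six toy customers, the learning rate and the epoch count are all assumptions invented for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that a customer with features x churns."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data, invented for illustration: [support calls per month, tenure in years].
X = [[5.0, 0.2], [4.0, 0.5], [6.0, 0.1], [0.0, 3.0], [1.0, 2.5], [0.0, 4.0]]
y = [1, 1, 1, 0, 0, 0]  # 1 = churned, 0 = stayed

weights, bias = train(X, y)
at_risk = [x for x in X if predict_proba(weights, bias, x) >= 0.5]
```

Customers in `at_risk` are the ones the fitted model scores at 0.5 or above, which is where a retention offer might be targeted. In practice a library such as scikit-learn would replace the hand-written training loop.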
Thanks!
