SlideShare a Scribd company logo
1 of 24
Download to read offline
Previously known as
Think Big. Move Fast.
Template designed by
brought to you by
SolidQ
• Born in 2002 in USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Books Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• www.solidq.com
Davide Mauri
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance
Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
Data Science
Reinassance 2.0
“Companies are collecting
mountains of information about
you, to predict how
likely you are to buy a product,
and using that knowledge to
craft a marketing message
precisely calibrated to get you to
do so”
Data Science
• Extraction of knowledge from data
• So, what’s new?
• Nothing. Except that it’s now economic and fast.
• It’s now applicable to everything. And we have a lot of data produced everyday
that can be used to extract knowledge
Data Science
DecisionsKnowledgeInformationData
Data Science
• A Sum Of
• Statistics
• Mathematics
• Machine Learning
• Data Mining
• Computer Programming
• Data Engineering
• Visualization
• Data Warehousing
• High Performance Computing
• To support (Informed) Decision Making
• Data-Driven Decisions
Data Scientist
• IBM
• A data scientist represents an evolution from the business or data analyst role.
• The formal training is similar, with a solid foundation typically in computer science and
applications, modeling, statistics, analytics and math.
• What sets the data scientist apart is strong business acumen, coupled with the ability to
communicate findings to both business and IT leaders in a way that can influence how
an organization approaches a business challenge.
• It's almost like a Renaissance individual who really wants to learn and bring change to
an organization.
• Algorithms are the new gatekeepers
• They decided
• What we find
• What we see
• What we buy
Modern Data Environment
Master
Data
EDW
Data Mart
Big Data
Unstructured
Data
BI Environment
Analytics Environment
Structured
Data
Big Data
The 3 V
No, the 4 V!!!
No, no, the 5 V!!!!!
http://www.ibmbigdatahub.com/infographic/four-vs-big-data
Big Data
• Volume, Velocity, Variety, Veracity….V<your-v-here>
• Data sets with sizes beyond the ability of commonly used software tools
to capture, curate, manage, and process the data within a tolerable elapsed
time
• Grid Computing, Parallel Computing needed
• keep processing time reasonable
• provide scalability
Big Data Data
• Paradigm: “Store Now, Figure Out Later”
• Data is the new resource. Never throw it away!
• Unstructured Data
• Text Files
• Images
• Sounds
• Structured/Semi Structured Data
• Sensors
• Transactions
• Logs
Data Storage
• RDBMS
• SQL Server
• Hadoop
• HDInsight
• Hortonworks Data Platform
• Distributed File (Eco)System
• CSV
• JSON
• *.*
Data Storage
• Hadoop Ecosystem
http://hortonworks.com/hadoop-modern-data-architecture/
Data Science & Big Data
• Data Science != Big Data
• Data Science Not Only on Big Data
• Data Science can be applied to Big Data
• Data Science starts from Small Data
• 1) find the algorithm that extract knowledge
• 2) measure algorithm results and in terms of probability
Machine Learning
• Machine learning, a branch of artificial intelligence, concerns the construction
and study of systems that can learn from data. (Wikipedia)
• For example, a machine learning system could be trained on email messages to learn to
distinguish between spam and non-spam messages. After learning, it can then be used
to classify new email messages into spam and non-spam folders.
• Flavors
• Supervised
• Unsupervised
Data Analysis
• Common Data Scientists Tools
• R
• Weka
• Octave
• Scikit-Learn
• Common Data Scientists Languages
• Python
• Scala
• F#
Resources
• https://www.coursera.org/
• Data Scientist Specialization
• https://www.khanacademy.org/
• Math
• http://www.osservatori.net/business_intelligence
• Italian Big Data Market Analysis Resources
• http://www.solidq.com/consulting/
• Data Science Services
• Big Data / Business Intelligence / Data Warehousing
Previously known as
Think Big. Move Fast.

More Related Content

What's hot

Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data Technologies
DATAVERSITY
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
Thinkful
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-Koenig
Manish Chopra
 

What's hot (20)

Data Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen LopezData Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen Lopez
 
So you want to be a Data Scientist?
So you want to be a Data Scientist?So you want to be a Data Scientist?
So you want to be a Data Scientist?
 
Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data Technologies
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureML
 
The importance of data
The importance of dataThe importance of data
The importance of data
 
Next Big Thing In IT Space
Next Big Thing In IT SpaceNext Big Thing In IT Space
Next Big Thing In IT Space
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop Way
 
democratization of data sql-konferenz
democratization of data sql-konferenzdemocratization of data sql-konferenz
democratization of data sql-konferenz
 
Paving The Way To Data Driven
Paving The Way To Data DrivenPaving The Way To Data Driven
Paving The Way To Data Driven
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Big data
Big dataBig data
Big data
 
Data Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsData Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact Solutions
 
Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)
 
Big Data Presentation
Big Data PresentationBig Data Presentation
Big Data Presentation
 
Big Data & the importance of Data Science
Big Data & the importance of Data ScienceBig Data & the importance of Data Science
Big Data & the importance of Data Science
 
Leveraging open source for big data stack
Leveraging open source for big data stackLeveraging open source for big data stack
Leveraging open source for big data stack
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-Koenig
 
Big Data Overview 2013-2014
Big Data Overview 2013-2014Big Data Overview 2013-2014
Big Data Overview 2013-2014
 
Big data analysis using map/reduce
Big data analysis using map/reduceBig data analysis using map/reduce
Big data analysis using map/reduce
 

Viewers also liked

Gam04 introduzione a-netduino_final
Gam04   introduzione a-netduino_finalGam04   introduzione a-netduino_final
Gam04 introduzione a-netduino_final
DotNetCampus
 
Javascript task automation
Javascript task automationJavascript task automation
Javascript task automation
DotNetCampus
 
AZURE WEBSITE DEEPDIVE
AZURE WEBSITE DEEPDIVEAZURE WEBSITE DEEPDIVE
AZURE WEBSITE DEEPDIVE
DotNetCampus
 
Signal r to the-max
Signal r to the-maxSignal r to the-max
Signal r to the-max
DotNetCampus
 
Win03 design windows store apps
Win03   design windows store appsWin03   design windows store apps
Win03 design windows store apps
DotNetCampus
 
Be04 introduction to ef 6.0
Be04   introduction to ef 6.0Be04   introduction to ef 6.0
Be04 introduction to ef 6.0
DotNetCampus
 
Ds03 data analysis
Ds03   data analysisDs03   data analysis
Ds03 data analysis
DotNetCampus
 
Be05 introduction to sql azure
Be05   introduction to sql azureBe05   introduction to sql azure
Be05 introduction to sql azure
DotNetCampus
 
Ag04 gestire gruppi di lavoro, team multipli e progetti con visual studio alm
Ag04   gestire gruppi di lavoro, team multipli e progetti con visual studio almAg04   gestire gruppi di lavoro, team multipli e progetti con visual studio alm
Ag04 gestire gruppi di lavoro, team multipli e progetti con visual studio alm
DotNetCampus
 
Windows iot barone
Windows iot baroneWindows iot barone
Windows iot barone
DotNetCampus
 
Dnc2015 azure-microservizi-vforusso
Dnc2015 azure-microservizi-vforussoDnc2015 azure-microservizi-vforusso
Dnc2015 azure-microservizi-vforusso
DotNetCampus
 

Viewers also liked (18)

Gam04 introduzione a-netduino_final
Gam04   introduzione a-netduino_finalGam04   introduzione a-netduino_final
Gam04 introduzione a-netduino_final
 
Javascript task automation
Javascript task automationJavascript task automation
Javascript task automation
 
DSTORIE DALLA TRINCEA: TEAM FOUNDATION SERVER IN CASI LIMITE E NON SOLO...
DSTORIE DALLA TRINCEA: TEAM FOUNDATION SERVER IN CASI LIMITE E NON SOLO...DSTORIE DALLA TRINCEA: TEAM FOUNDATION SERVER IN CASI LIMITE E NON SOLO...
DSTORIE DALLA TRINCEA: TEAM FOUNDATION SERVER IN CASI LIMITE E NON SOLO...
 
AZURE WEBSITE DEEPDIVE
AZURE WEBSITE DEEPDIVEAZURE WEBSITE DEEPDIVE
AZURE WEBSITE DEEPDIVE
 
Signal r to the-max
Signal r to the-maxSignal r to the-max
Signal r to the-max
 
Win03 design windows store apps
Win03   design windows store appsWin03   design windows store apps
Win03 design windows store apps
 
CONTINUOUS INTEGRATION CON SQL SERVER
CONTINUOUS INTEGRATION CON SQL SERVERCONTINUOUS INTEGRATION CON SQL SERVER
CONTINUOUS INTEGRATION CON SQL SERVER
 
Formulacion inorganica
Formulacion inorganicaFormulacion inorganica
Formulacion inorganica
 
Be04 introduction to ef 6.0
Be04   introduction to ef 6.0Be04   introduction to ef 6.0
Be04 introduction to ef 6.0
 
Ds03 data analysis
Ds03   data analysisDs03   data analysis
Ds03 data analysis
 
Be05 introduction to sql azure
Be05   introduction to sql azureBe05   introduction to sql azure
Be05 introduction to sql azure
 
Newsletter
NewsletterNewsletter
Newsletter
 
Ag04 gestire gruppi di lavoro, team multipli e progetti con visual studio alm
Ag04   gestire gruppi di lavoro, team multipli e progetti con visual studio almAg04   gestire gruppi di lavoro, team multipli e progetti con visual studio alm
Ag04 gestire gruppi di lavoro, team multipli e progetti con visual studio alm
 
Pigmento pb y cd
Pigmento pb y cdPigmento pb y cd
Pigmento pb y cd
 
Windows iot barone
Windows iot baroneWindows iot barone
Windows iot barone
 
70-483: PROGRAMMING IN C#
70-483: PROGRAMMING IN C#70-483: PROGRAMMING IN C#
70-483: PROGRAMMING IN C#
 
Dnc2015 azure-microservizi-vforusso
Dnc2015 azure-microservizi-vforussoDnc2015 azure-microservizi-vforusso
Dnc2015 azure-microservizi-vforusso
 
70-534: ARCHITECTING MICROSOFT AZURE SOLUTIONS
70-534: ARCHITECTING MICROSOFT AZURE SOLUTIONS70-534: ARCHITECTING MICROSOFT AZURE SOLUTIONS
70-534: ARCHITECTING MICROSOFT AZURE SOLUTIONS
 

Similar to Ds01 data science

Content1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docxContent1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docx
dickonsondorris
 

Similar to Ds01 data science (20)

DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,...
 
Intro big data analytics
Intro big data analyticsIntro big data analytics
Intro big data analytics
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - Introduction
 
SKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSISSKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSIS
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
 
Big Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementBig Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data Management
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
 
Content1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docxContent1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docx
 
Data analytics & its Trends
Data analytics & its TrendsData analytics & its Trends
Data analytics & its Trends
 
big-data-notes1.ppt
big-data-notes1.pptbig-data-notes1.ppt
big-data-notes1.ppt
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Big data
Big dataBig data
Big data
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 

More from DotNetCampus

More from DotNetCampus (20)

ARCHITETTURA DI UN'APPLICAZIONE SCALABILE
ARCHITETTURA DI UN'APPLICAZIONE SCALABILEARCHITETTURA DI UN'APPLICAZIONE SCALABILE
ARCHITETTURA DI UN'APPLICAZIONE SCALABILE
 
MICROSOFT E IL MONDO IOT
MICROSOFT E IL MONDO IOTMICROSOFT E IL MONDO IOT
MICROSOFT E IL MONDO IOT
 
70-485: ADVANCED OF DEVELOPING WINDOWS STORE APPS USING C#
70-485: ADVANCED OF DEVELOPING WINDOWS STORE APPS USING C#70-485: ADVANCED OF DEVELOPING WINDOWS STORE APPS USING C#
70-485: ADVANCED OF DEVELOPING WINDOWS STORE APPS USING C#
 
TUTTO SU VISUAL STUDIO ALM 2015
TUTTO SU VISUAL STUDIO ALM 2015TUTTO SU VISUAL STUDIO ALM 2015
TUTTO SU VISUAL STUDIO ALM 2015
 
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATAPREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
 
DESKTOP AND CLIENT VIRTUALIZATION: NEW WORKSTYLES WITH MICROSOFT VDI
DESKTOP AND CLIENT VIRTUALIZATION: NEW WORKSTYLES WITH MICROSOFT VDIDESKTOP AND CLIENT VIRTUALIZATION: NEW WORKSTYLES WITH MICROSOFT VDI
DESKTOP AND CLIENT VIRTUALIZATION: NEW WORKSTYLES WITH MICROSOFT VDI
 
FROM ON-PREMISE TO THE HYBRID CLOUD WITH MICROSOFT AZURE
FROM ON-PREMISE TO THE HYBRID CLOUD WITH MICROSOFT AZUREFROM ON-PREMISE TO THE HYBRID CLOUD WITH MICROSOFT AZURE
FROM ON-PREMISE TO THE HYBRID CLOUD WITH MICROSOFT AZURE
 
SHAREPOINT 2016 - WHAT'S NEW
SHAREPOINT 2016 - WHAT'S NEWSHAREPOINT 2016 - WHAT'S NEW
SHAREPOINT 2016 - WHAT'S NEW
 
COSTRUISCI IL TUO DEVICE
COSTRUISCI IL TUO DEVICECOSTRUISCI IL TUO DEVICE
COSTRUISCI IL TUO DEVICE
 
SVILUPPARE PER MICROSOFT BAND
SVILUPPARE PER MICROSOFT BANDSVILUPPARE PER MICROSOFT BAND
SVILUPPARE PER MICROSOFT BAND
 
INTERFACCE GRAFICHE CON UNITY3D 4.6: IL GIOCO NON BASTA!
INTERFACCE GRAFICHE CON UNITY3D 4.6: IL GIOCO NON BASTA!INTERFACCE GRAFICHE CON UNITY3D 4.6: IL GIOCO NON BASTA!
INTERFACCE GRAFICHE CON UNITY3D 4.6: IL GIOCO NON BASTA!
 
WINDOWS PHONE APPS IN C++
WINDOWS PHONE APPS IN C++WINDOWS PHONE APPS IN C++
WINDOWS PHONE APPS IN C++
 
AZURE NOTIFICATION HUB
AZURE NOTIFICATION HUBAZURE NOTIFICATION HUB
AZURE NOTIFICATION HUB
 
SFRUTTARE I MICROSOFT AZURE MOBILE SERVICES CON XAMARIN.FORMS
SFRUTTARE I MICROSOFT AZURE MOBILE SERVICES CON XAMARIN.FORMSSFRUTTARE I MICROSOFT AZURE MOBILE SERVICES CON XAMARIN.FORMS
SFRUTTARE I MICROSOFT AZURE MOBILE SERVICES CON XAMARIN.FORMS
 
INTRO TO XAMARIN
INTRO TO XAMARININTRO TO XAMARIN
INTRO TO XAMARIN
 
UNIVERSAL APP IN TUTTE LE SALSE: PHONE, TABLET, PC, XBOX E IOT
UNIVERSAL APP IN TUTTE LE SALSE: PHONE, TABLET, PC, XBOX E IOTUNIVERSAL APP IN TUTTE LE SALSE: PHONE, TABLET, PC, XBOX E IOT
UNIVERSAL APP IN TUTTE LE SALSE: PHONE, TABLET, PC, XBOX E IOT
 
SFRUTTARE CORTANA E LE SPEECH API NELLE NOSTRE APP
SFRUTTARE CORTANA E LE SPEECH API NELLE NOSTRE APPSFRUTTARE CORTANA E LE SPEECH API NELLE NOSTRE APP
SFRUTTARE CORTANA E LE SPEECH API NELLE NOSTRE APP
 
APPSTUDIO: DA ZERO ALLO STORE IN 50 MINUTI!
APPSTUDIO: DA ZERO ALLO STORE IN 50 MINUTI!APPSTUDIO: DA ZERO ALLO STORE IN 50 MINUTI!
APPSTUDIO: DA ZERO ALLO STORE IN 50 MINUTI!
 
SIGNALR TO-THE-MAX: VERSO IL WEB ED OLTRE!
SIGNALR TO-THE-MAX: VERSO IL WEB ED OLTRE!SIGNALR TO-THE-MAX: VERSO IL WEB ED OLTRE!
SIGNALR TO-THE-MAX: VERSO IL WEB ED OLTRE!
 
SVILUPPARE E GESTIRE ARCHITETTURE A MICROSERVIZI SU AZURE
SVILUPPARE E GESTIRE ARCHITETTURE A MICROSERVIZI SU AZURESVILUPPARE E GESTIRE ARCHITETTURE A MICROSERVIZI SU AZURE
SVILUPPARE E GESTIRE ARCHITETTURE A MICROSERVIZI SU AZURE
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Ds01 data science

  • 1. Previously known as Think Big. Move Fast.
  • 3. SolidQ • Born in 2002 in USA and Spain • Established in 2007 in Italy • More than 1000 customers and more than 200 consultants worldwide • Dedicated to Data Management on the Microsoft Platform • Books Authors, Conference Speakers, SQL Server MVPs and Regional Directors • www.solidq.com
  • 4. Davide Mauri • 18 Years of experience on the SQL Server Platform • Specialized in Data Solution Architecture, Database Design, Performance Tuning, Business Intelligence • Microsoft SQL Server MVP • President of UGISS (Italian SQL Server UG) • Mentor @ SolidQ • Video, Book & Article Author • Regular Speaker @ SQL Server events • Projects, Consulting, Mentoring & Training
  • 6. “Companies are collecting mountains of information about you, to predict how likely you are to buy a product, and using that knowledge to craft a marketing message precisely calibrated to get you to do so”
  • 7. Data Science • Extraction of knowledge from data • So, what’s new? • Nothing. Except that it’s now economic and fast. • It’s now applicable to everything. And we have a lot of data produced everyday that can be used to extract knowledge
  • 9. Data Science • A Sum Of • Statistics • Mathematics • Machine Learning • Data Mining • Computer Programming • Data Engineering • Visualization • Data Warehousing • High Performance Computing • To support (Informed) Decision Making • Data-Driven Decisions
  • 10. Data Scientist • IBM • A data scientist represents an evolution from the business or data analyst role. • The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math. • What sets the data scientist apart is strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. • It's almost like a Renaissance individual who really wants to learn and bring change to an organization.
  • 11. • Algorithms are the new gatekeepers • They decided • What we find • What we see • What we buy
  • 12. Modern Data Environment Master Data EDW Data Mart Big Data Unstructured Data BI Environment Analytics Environment Structured Data
  • 13. Big Data The 3 V No, the 4 V!!! No, no, the 5 V!!!!!
  • 15. Big Data • Volume, Velocity, Variety, Veracity….V<your-v-here> • Data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time • Grid Computing, Parallel Computing needed • keep processing time reasonable • provide scalability
  • 16. Big Data Data • Paradigm: “Store Now, Figure Out Later” • Data is the new resource. Never throw it away! • Unstructured Data • Text Files • Images • Sounds • Structured/Semi Structured Data • Sensors • Transactions • Logs
  • 17. Data Storage • RDBMS • SQL Server • Hadoop • HDInsight • Hortonworks Data Platform • Distributed File (Eco)System • CSV • JSON • *.*
  • 18. Data Storage • Hadoop Ecosystem http://hortonworks.com/hadoop-modern-data-architecture/
  • 19. Data Science & Big Data • Data Science != Big Data • Data Science Not Only on Big Data • Data Science can be applied to Big Data • Data Science starts from Small Data • 1) find the algorithm that extract knowledge • 2) measure algorithm results and in terms of probability
  • 20. Machine Learning • Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. (Wikipedia) • For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders. • Flavors • Supervised • Unsupervised
  • 21. Data Analysis • Common Data Scientists Tools • R • Weka • Octave • Scikit-Learn • Common Data Scientists Languages • Python • Scala • F#
  • 22.
  • 23. Resources • https://www.coursera.org/ • Data Scientist Specialization • https://www.khanacademy.org/ • Math • http://www.osservatori.net/business_intelligence • Italian Big Data Market Analysis Resources • http://www.solidq.com/consulting/ • Data Science Services • Big Data / Business Intelligence / Data Warehousing
  • 24. Previously known as Think Big. Move Fast.