The buzzword of the past year is “Data Science”. But what does it really mean? What does a “Data Scientist” do? What tools does Microsoft provide? And what other tools are there beyond Microsoft's?
• Born in 2002 in USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Book Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance
Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
“Companies are collecting
mountains of information about
you, to predict how
likely you are to buy a product,
and using that knowledge to
craft a marketing message
precisely calibrated to get you to
• Extraction of knowledge from data
• So, what’s new?
• Nothing. Except that it’s now cheap and fast.
• It’s now applicable to everything. And we have a lot of data produced every day
that can be used to extract knowledge
• A Sum Of
• Machine Learning
• Data Mining
• Computer Programming
• Data Engineering
• Data Warehousing
• High Performance Computing
• To support (Informed) Decision Making
• Data-Driven Decisions
• A data scientist represents an evolution from the business or data analyst role.
• The formal training is similar, with a solid foundation typically in computer science and
applications, modeling, statistics, analytics and math.
• What sets the data scientist apart is strong business acumen, coupled with the ability to
communicate findings to both business and IT leaders in a way that can influence how
an organization approaches a business challenge.
• It's almost like a Renaissance individual who really wants to learn and bring change to
• Algorithms are the new gatekeepers
• There is simply too much data for a human to analyze!
• They decide
• What we find
• What we see
• What we buy
• Data is the foundation upon which algorithms work
• Better Data leads to Better Results
• Data-Driven Decisions will be a MUST in the coming years!
• Data Scientists will help companies to leverage their most valuable asset: Data
Modern Data Environment
The 3 V
No, the 4 V!!!
No, no, the 5 V!!!!!
• Volume, Velocity, Variety, Veracity….V<your-v-here>
• Data sets with sizes beyond the ability of commonly used software tools
to capture, curate, manage, and process the data within a tolerable elapsed time
• Grid Computing, Parallel Computing needed
• keep processing time reasonable
• provide scalability
Big Data
• Paradigm: “Store Now, Figure Out Later”
• Data is the new resource. Never throw it away!
• Unstructured Data
• Text Files
• Structured/Semi Structured Data
• SQL Server
• Hortonworks Data Platform
• Distributed File (Eco)System
• Hadoop Ecosystem
Data Science & Big Data
• Data Science != Big Data
• Data Science is not only for Big Data
• Data Science can be applied to Big Data
• Data Science starts from Small Data
• 1) find the algorithm that extracts knowledge
• 2) measure the algorithm's results in terms of probability
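One way to read "measure results in terms of probability" is to score the probabilities a model outputs against known outcomes on a small held-out sample. The sketch below is only an illustration under that assumption: the labels and predicted probabilities are made up, and the three metrics (accuracy, Brier score, log loss) are common choices, not ones named in the slides.

```python
import math

def accuracy(y_true, p_pred, threshold=0.5):
    """Share of cases where the thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

def brier_score(y_true, p_pred):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def log_loss(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood; punishes confident wrong predictions."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0 and 1
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical small-data sample: 1 = event happened, 0 = it did not.
y_true = [1, 0, 1, 1, 0]
# Hypothetical probabilities produced by some candidate algorithm.
p_pred = [0.9, 0.2, 0.6, 0.4, 0.1]

print("accuracy:", accuracy(y_true, p_pred))
print("brier   :", brier_score(y_true, p_pred))
print("log loss:", log_loss(y_true, p_pred))
```

Comparing two candidate algorithms on the same small sample with metrics like these is cheap, which is part of why starting from small data is practical.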
• Machine learning, a branch of artificial intelligence, concerns the construction
and study of systems that can learn from data. (Wikipedia)
• For example, a machine learning system could be trained on email messages to learn to
distinguish between spam and non-spam messages. After learning, it can then be used
to classify new email messages into spam and non-spam folders.
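The spam example above can be sketched in a few lines. This is a minimal illustration, assuming a naive Bayes model over whitespace-separated words with Laplace smoothing; the class name `NaiveBayesSpamFilter`, the six training messages, and the choice of algorithm are all hypothetical, not something the slides prescribe.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}
        self.vocabulary = set()

    def train(self, text, label):
        """Learn from one labeled message; label is 'spam' or 'ham'."""
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.message_counts[label] += 1
        self.vocabulary.update(words)

    def spam_probability(self, text):
        """Estimated P(spam | text). Needs at least one message per class."""
        total_messages = sum(self.message_counts.values())
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        log_scores = {}
        for label in ("spam", "ham"):
            score = math.log(self.message_counts[label] / total_messages)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for word in words:
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((self.word_counts[label][word] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical training set: three spam and three non-spam ("ham") messages.
spam_filter = NaiveBayesSpamFilter()
for msg in ["win money now", "cheap pills win prize", "claim your free prize now"]:
    spam_filter.train(msg, "spam")
for msg in ["meeting agenda for monday", "lunch at noon tomorrow",
            "project report attached for review"]:
    spam_filter.train(msg, "ham")

# After learning, classify new messages by thresholding the probability.
for msg in ["win a free prize", "monday project meeting"]:
    p = spam_filter.spam_probability(msg)
    folder = "spam" if p >= 0.5 else "inbox"
    print(f"{msg!r}: P(spam)={p:.3f} -> {folder}")
```

The system is never given rules such as "prize means spam"; it infers them from labeled examples, which is what "learning from data" means here.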
• Common Data Scientist Tools
• Common Data Scientist Languages
• Data Scientist Specialization
• Italian Big Data Market Analysis Resources
• Data Science Services
• Big Data / Business Intelligence / Data Warehousing