THE BEST THING IN DATA SCIENCE?
COLLABORATION
Martina Pugliese
Data Science Lead
Mallzee, Edinburgh
FAKE IT TILL
YOU MAKE IT?
Part one: immaturity
“The key word in ‘Data Science’ is not
Data, it is Science.”
-Jeff Leek
ONE CHALLENGE WITH THIS FIELD
• Data Science isn’t new as a field per se, at least in academia
• The availability of enormous sets of data is “new”
• The industry has since been gradually trying to catch up
• It is fundamentally a science/rigour-driven field
• The amount of hype is … still large
• The hype/buzz is demonstrated by the increase in use,
often erroneous, of terms related to the area:
data science
data analytics
analytics
machine learning
artificial intelligence
data engineering
…?
SOME DATA FROM GOOGLE NGRAMS VIEWER
WHO’S GOT
DOMAIN
EXPERTISE?
Part two: nebulous goals
• “Data analysis” and “Machine Learning” are powerful
• But only when used by someone who knows what they are doing
TALK TO THOSE WHO KNOW
Oftentimes, speaking to people who know about the issue at hand is
of great help.
An approach detached from its application can’t work.
A focus on the algorithms first can’t work either.
COLLABORATION IS THE BEST THING IN DATA SCIENCE
Data Science is already an interdisciplinary field.
But when used in the industry, this needs to go further.
To work, it requires collaboration between:
• the data scientist (“I’ll do the maths and check all is correct”)
• the people who understand what business problem we’re solving (“I’ll teach you what
people actually want”)
The data scientist will learn applied thinking - with good communication and experience
… and repeated failures
THIS WILL ALSO HELP INCLUSIVITY
Data Science isn’t currently a champion of inclusivity.
On top of the existing diversity issues which affect all of tech, the fact
that it is a field branched out of science (and, in many cases, academia)
makes it a fundamentally competency-based one.
The quality bar needs to stay high.
But that doesn’t mean mutual alienation has to be accepted as
normal.
START FROM
THE DATA …
Part three: really, it’s mostly
good ol’ methods
LOOK AT YOUR DATA
Great algorithms alone won’t solve a problem.
One side effect of the hype is
the mistaken perception that having some data and people who know “data
science” is enough to make “predictions”.
If your data doesn’t contain information, no algorithm can work
miracles.
Solid statistics work is always necessary. In fact, old-fashioned Statistics is the main
component of Data Science.
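As a rough illustration of the point that data without information cannot be rescued by an algorithm, here is a minimal Python sketch. It is not from the talk: the synthetic data, the variable names and the choice of a one-feature least-squares fit are all assumptions made for the example. It fits the same simple model to a feature that carries signal and to one that is pure noise, then compares how well each predicts held-out data.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys, a, b):
    """Share of the variance in ys explained by the line a + b * x."""
    my = statistics.fmean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def holdout_r2(xs, ys, n_train):
    """Fit on the first n_train points, score on the remaining ones."""
    a, b = fit_line(xs[:n_train], ys[:n_train])
    return r_squared(xs[n_train:], ys[n_train:], a, b)


# Synthetic data (an assumption for this sketch, not real measurements).
rng = random.Random(0)
n = 2000
target = [rng.gauss(0, 1) for _ in range(n)]
informative = [t + rng.gauss(0, 1) for t in target]  # related to the target
noise = [rng.gauss(0, 1) for _ in range(n)]          # unrelated to the target

r2_informative = holdout_r2(informative, target, n_train=1000)
r2_noise = holdout_r2(noise, target, n_train=1000)

print(f"held-out R^2, informative feature: {r2_informative:.2f}")
print(f"held-out R^2, noise feature:       {r2_noise:.2f}")
```

The model and the code path are identical in both cases; only the information content of the feature differs. That is why a quick statistical look at the data is worth doing before reaching for anything more elaborate.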
AND KEEP THE
BALANCE
Part four: do it quick
DEMAND FOR RIGOUR - NEED FOR SPEED
Solid work means results are rigorous.
But the business needs them quick.
?!
The balance between these two tendencies
is another critical skill to acquire
THANKS!
