Bridging the Gap Between Data Science & Engineering: Building High-Performance Teams

Data scientists, data engineers, and data businesspeople are critical to leveraging data in any organization. A common complaint from data science managers is that data scientists invest time prototyping algorithms and throw them over a proverbial fence to engineers to implement, only to find that the algorithms must be rebuilt from scratch to scale. This is a symptom of a broader ailment: data teams are often designed as functional silos without proper communication and planning.

This talk outlines a framework to build and organize a data team that produces better results, minimizes wasted effort among team members, and ships great data products.

Published in: Data & Analytics, Software

  1. Bridging the Gap Between Data Science & Engineering: Building High-Performing Teams
  2. How do I hire a data scientist?
  3. Continuum of Skills: Software Engineer, Data Engineer, Data Scientist, Applied Scientist, Research Scientist
  4. Continuum of Skills: Software Engineer, Data Engineer, Data Scientist, Applied Scientist, Research Scientist
  5. Data Science (diagram labels): Math & Stats, Computer Science, Domain Expertise, Machine Learning, Software Engineering, Research, Unicorn
  6. Many companies try to find all of these skills in a single person.
  7. Which leads to job requirements like this…
     • MSc/PhD in Computer Science, Electrical Engineering, Math or Statistics
     • At least 5 years of experience in solving real-world practical problems using Machine Learning
     • At least 5 years of experience on mining and modeling large-scale data (hundreds of terabytes)
     • Extensive in-depth knowledge of Data Mining, Machine Learning, Algorithms
     • Knowledge of at least one high-level programming language (C++, Java)
     • Knowledge of at least one scripting language (Perl, Python, Ruby)
     • Knowledge of SQL and experience with large relational databases
     • Knowledge of at least one ML toolset (R, Weka, KNIME, Octave, Mahout, scikit-learn)
     • Strong ability to formalize and provide practical solutions to research problems
     • Strong communication skills and ability to work independently to get an idea from inception to implementation.
     • Knowledge of the state of the art in at least one of Bayesian Optimization, Recommendation Systems, Social Network Analysis, Information Retrieval
     • At least 5 years of experience with storing, sampling, querying large-scale data (hundreds of terabytes) and experimentation frameworks
     • At least 5 years of experience with Hadoop, Spark, Mahout or Giraph
  8. Data Science Unicorn
  9. These people do exist, but they are often already well-compensated, and only want to work on interesting problems.
  10. What can you do? Build a team instead.
  11. Look for T-shaped people: broad-range generalist, deep expertise
  12. Look for T-shaped people: Machine Learning, Statistics, Domain Knowledge; Software Engineering; Business Acumen; Distributed Computing; Communication
  13. • Compose teams of individuals who have overlapping skill-sets and deep expertise in one area (machine learning, statistics, engineering, business, etc.) • The overlap allows them to speak the same language and work collaboratively on solving problems
  14. How do I structure my data science team within my organization?
  15. Data Science Team Structures: Centralized, Embedded, Hub & Spoke
  16. Centralized • Data scientists sit on a team that acts as highly powered internal consultants, fielding and answering questions from multiple teams within the organization and defining tools for the organization.
  17. Embedded • Data Scientists are almost wholly embedded within one particular team and focus on solving problems for that team. • Teams are assigned to one particular product or function within the company and define and answer questions for that product or function.
  18. Hub & Spoke • The data science team sits together physically and works collaboratively to solve problems. • However, each data scientist (or a combination of them) gets deployed to work on problems within the organization. • Tends to apply to companies that have a lot of users.
  19. Data Science Team Structure: Centralized, Embedded, Hub & Spoke > >
  20. How do I get my data scientists to work with engineering?
  21. Data Science (Python, R): modeling & prototyping. Software Engineering (Java/C++, RoR/JavaScript): production.
  22. Data Science (Python, R): modeling & prototyping. Software Engineering (Java/C++, RoR/JavaScript): production.
  23. • Data scientists learn to write prototypes in production languages • Engineers learn the basics of data science so they can understand how the models work • Goal is to have both teams speak the same language and engender trust through communication
  24. Data Science, Data Engineering: Common Core; Data Science Curriculum; Data Engineering Curriculum; Data Science, Data Engineering: Projects
  25. [Diagram: Data Science and Engineering across phases, from Initial Planning to Production]
  26. Summary • Don’t look for unicorns; build collaborative teams of T-shaped people • Pay attention to how your data science team is structured within your organization • Get your data science and engineering teams to speak the same language, allowing them to build trust and work collaboratively
  27. We believe an opportunity belongs to anyone with aptitude and ambition.
  28. Galvanize 2015 NODES ON THE NETWORK
     • COLORADO (BOULDER, DENVER, FORT COLLINS) - Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
     • SEATTLE, WA - Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
     • SAN FRANCISCO, CA - Programs: Full Stack Immersive, Data Science Immersive, Data Engineering Immersive, Master of Science in Data Science, Entrepreneurship
     • AUSTIN, TX (OPENING Q1 2016) - Programs: Full Stack Immersive, Data Science Immersive, Entrepreneurship
  29. PLACEMENT STATS
     • FULL STACK IMMERSIVE: $43K Pre-program Salary, $77K Average Starting Salary, 97% Placement Rate*
     • DATA SCIENCE IMMERSIVE: $72K Pre-program Salary, $114K Average Starting Salary, 94% Placement Rate*
     *Galvanize is a founding member of NESTA (New Economy Skills Training Association), a trade organization founded to regulate the new “bootcamp” market. This placement rate is more rigorous than that requested by state licensure agencies. The placement rate is calculated 6 months after graduation.
  30. 5 PROGRAMS • Full Stack Immersive • Data Science Immersive • Data Engineering Immersive • Master of Science in Data Science (University of New Haven) • Startup Membership. Projected over 500 Student Member Graduates in 2015. Currently over 1500 Members.
  31. FULL STACK IMMERSIVE • 97% Placement Rate within 6 months • $77K Average Starting Salary • 6 Month Program
  32. FULL STACK IMMERSIVE
  33. DATA SCIENCE IMMERSIVE • 94% Placement Rate within 6 months • $114K Average Starting Salary • 3 Month Program
  34. DATA SCIENCE IMMERSIVE
     • Week 1 - Exploratory Data Analysis and Software Engineering Best Practices
     • Week 2 - Statistical Inference, Bayesian Methods, A/B Testing, Multi-Armed Bandit
     • Week 3 - Regression, Regularization, Gradient Descent
     • Week 4 - Supervised Machine Learning: Classification, Validation, Ensemble Methods
     • Week 5 - Clustering, Topic Modeling (NMF, LDA), NLP
     • Week 6 - Network Analysis, Matrix Factorization, and Time Series
     • Week 7 - Hadoop, Hive, and MapReduce
     • Week 8 - Data Visualization with D3.js, Data Products, and Fraud Detection Case Study
     • Weeks 9-10 - Capstone Projects
     • Week 12 - Onsite Interviews
  35. DATA SCIENCE IMMERSIVE
  36. DATA ENGINEERING IMMERSIVE • Launched Oct. 2015 • Built in partnership with Nvent and Concurrent • 3 Month Program
  37. THANK YOU • RYAN ORBAN | EVP OF PRODUCT & STRATEGY • ryan.orban@galvanize.com • @ryanorban • www.galvanize.com
