CIOReview | July 2016
CIOREVIEW.COM
The Navigator for Enterprise Solutions
JULY 19, 2016
BIG DATA SPECIAL
Eric Gillespie,
Founder & CEO
Massimo Ruffolo,
Founder & CEO, Altilia
Lal Vaghji,
CEO, Ana-Data Consulting
Harry Carr,
CEO & President, Bay Microsystems
CEO OF
THE MONTH
ENTREPRENEUR OF
THE MONTH
COMPANY OF
THE MONTH
govini:
The Big Data
Proponents
Nowadays, Big Data Science and
Analytics are some of the hottest
areas in the fields of science,
technology, and business. Most
business leaders and executives
have realized by now that they
need some “Data Scientists” or “Big Data
Analysts” to do something special, new,
and beyond traditional techniques and tools.
Business leaders have started to understand
that they need to pay more attention to data
capital, which should be treated on par with
financial and human capital. Data is becoming
a valuable asset for those companies that can
take advantage of it. To do so, one
needs the right people and needs
them now.
Where is one to find all those
folks who know Hadoop and Spark,
SAS and R, and have expertise with
random forests and support vector
machines? Specifically, how does
one find them quickly when everyone
else is also looking for them? How
does one hire them without breaking
the bank? Also, how does one avoid
hiring the wrong people—those who
cleverly use big data jargon to mask
an absence of true depth?
Finding the right data scientist is
not easy to do because most of these
folks are either hired already or are
too expensive. Thus, hiring great data
scientists is becoming quite difficult
as the supply and demand balance is
heavily skewed towards demand.
With this in mind, I would
like to share some tips from my
personal experience in building such
organizations over the last 3-4 years.
Judging by our results at Seagate,
these rules have led to several
successful organizations. However,
this process might not be suitable
for everyone.
Rule number one: we only hire PhDs.
The justification for this is simple—
we want to have people who have
undergone the rigorous “training”
that a PhD program provides. Such
individuals have not only obtained
advanced knowledge and skills in
some relevant field but have also
learned to work independently,
analyze literature and all the known
facts, summarize their findings,
identify main current problems,
propose possible solutions and defend
them with data, collaborate well,
and execute under the pressure of tight
deadlines. Also, they were trained to
write papers and reports and to give
clear and to-the-point presentations,
including public speeches (at
conferences, etc.). From what I’ve
seen, the difference in compensation
between new hires with a PhD and those
with an MS is no more than 10-20 percent,
which is a small price to pay for the
extra 3-4 years of training.
Clearly, the above rule does
not apply in cases when we need a
software developer or someone else
with other specific skills.
Rule number two: all candidates
(with some rare exceptions) need
to have some math/numerical
simulation/modeling experience.
This will almost universally require
strong programming skills, although
not necessarily with the languages
that are the most relevant to Big Data
Analytics, such as R and Python.
They can learn those languages
later and quickly (in our experience,
in less than 3 months). Don’t be
concerned when you hear “Fortran”
or “C++”. I haven’t seen a single
case when a candidate hasn’t learned the “new language of
analytics” quickly.

CXO INSIGHTS
On Building a Successful
Big Data Analytics Organization
By Andrei Khurshudov, Chief Technologist, Seagate [NASDAQ:STX]

Rule number three (and this is the most important rule): a
demonstrated ability to learn quickly is critical. Data science
is a rapidly evolving field, so experience with a particular
technique is not as important as a demonstrated ability to
learn. Consider outstanding individuals not only from pure data
science but also from other science and engineering fields
with strong analytical requirements. Over time, candidates meeting
these requirements have proven to be the “superstars” we
hoped they would be. In other words, once a star, always a
star. Remember: talented people can always learn new skills,
which is especially important in such a dynamic and fast-
changing field as modern data science.
Rule number four: you still need a few very experienced
folks in the group, such as experts in machine learning or data
mining. However, assuming that rules 1-3 are followed
closely, the organization becomes so strong and capable that
you need no more than 10-20 percent of your folks to be
those experts early on; they can guide and mentor the rest of
the team. Finding a few data science experts and many more
superstars is much simpler and more realistic than finding
a full team of superstar experts.
Rule number five, a universal rule I personally use for
hiring: always ask a candidate to start his/her live interview
with an hour-long presentation (this time should include a
Q&A session). The presentation may be given on any relevant
subject chosen by the candidate. Tell the candidate upfront
that this is his/her chance to show off, to demonstrate his/her
abilities, and to present knowledge and skills in the best
possible light. Believe it or not, most of our hiring decisions
are made by the end of this hour. We still follow it up with
5-6 thirty-minute 1:1 interviews but, historically, they
simply confirm the initial conclusion we reached at the end of
the presentation.
Rule number six: be patient with your new hires. Expect them
to spend 2-4 months adapting and learning new things before
they start delivering at 100 percent (this would include all the
new programming languages and tools from rule number two).
Most attempts to hire a perfect candidate ASAP, the one who
will “hit the ground running”, will likely result in major delays
and might force you to make your final decisions in a hurry
when you run out of time.
All in all, in my opinion, it is not terribly difficult to build
a world-class Big Data Analytics organization. Just follow the
six simple rules listed above.