August 6, 2018

How To Build a Data Science Team Now

Alex Woodie

Business execs who are leading their companies down the data science track may be dismayed by the difficulty and expense of hiring data scientists, the so-called “unicorns” who command quarter-million-dollar salaries. But fear not: While companies can benefit from having a full-fledged data scientist on staff, it is by no means a requirement for actually doing data science.

The team approach to data science started soon after Harvard Business Review named data scientist the “sexiest job of the 21st century” back in 2012, spurring a run on data scientists, applied mathematicians, and other quantitative types that still hasn’t let up. Thanks to the continued rapid evolution of technology – not to mention workplace workarounds put into place due to the aforementioned unicorn shortage – the team approach has grown in popularity.

One business leader with real-world experience putting together data science teams (with and without actual data scientists) is Amy O’Connor, who built Nokia’s first data lake and is currently Cloudera’s Chief Data and Information Officer.

O’Connor has a fascinating job that spans multiple roles. She is responsible not only for Cloudera’s internal IT systems, many of which are based on SaaS applications from vendors like Salesforce.com that are also Cloudera customers, but also for the company’s own big data activities, which include collecting and crunching a range of data to maximize customer satisfaction and revenue and to minimize risk. Those activities, naturally, run on Cloudera’s own Hadoop-based data management and data science products.

While O’Connor has actual Cloudera staff data scientists available to her – not to mention the smart folks inside its Fast Forward Labs subsidiary – she doesn’t believe that data scientists are actually necessary in all circumstances. In fact, judging from the scarcity of data scientists, O’Connor says another path is often preferable for many of the companies she deals with.

“One of the things I’ve found in my own experience at Nokia and Cloudera and with most of our customers is it helps to break down the role of a data scientist into smaller roles,” O’Connor tells Datanami in a recent interview. “I’ve found that that’s a way to take a set of individual people with skill sets that complement each other and put them together to create what I call a hybrid data scientist.”

Instead of focusing on finding the magical unicorn data scientist who’s proficient across the triad of required skills – uber math and stats expertise, top-notch business acumen, and killer computer skills – businesses can actually get better results through the team approach. To pull it off, companies still need to find engineers who are good at programming distributed systems. They still need to find somebody with statistical skills, preferably at the Ph.D. level. And they still need somebody who’s a subject matter expert, preferably from the operations side of the house.

But provided that there’s enough overlap among them – that is, the statistician knows a bit of programming, and the engineer is passingly familiar with statistics – then the end result can be greater than the sum of its parts.

“That allows you almost to take that mythical unicorn of the data scientist, break it down into those three different roles, then strengthen the components of the roles that are needed to create a really good data science team,” O’Connor says. “We do this inside Cloudera. I’m finding that most of our customers are starting to do this as well.”

Read the rest of the story at Datanami.