Data science is a team sport.
At least, that’s the consensus of a survey conducted by Bob Hayes of Business Over Broadway (B.O.B.) and Analytics Week asking Data Scientists about their skills and team make-up. Seventy-six percent of the respondents said they’ve worked with one or more people on their projects involving analytics.
As the volume of data companies capture and the number of sources they draw on increase, a number of data professional roles have emerged to satisfy the demand for highly sophisticated technical skills. These roles call for similar backgrounds and skillsets, which makes it hard to grasp the difference between them. The distinction, however, lies in each role’s end objective and how it works with data to reach it.
To help clear up the confusion, we compared some of the top data professional roles available for your team:
A Data Architect is the go-to person for data management, especially when dealing with any number of disparate data sources. According to an article by Martijn Theuwissen, the co-founder of DataCamp, this important role “creates the blueprints” for data to be effectively captured, integrated, organized, centralized and maintained.
With extensive knowledge of how databases work, as well as how the acquired data relates to the business’s operations, the Data Architect is ideally able to anticipate how changes will affect the company’s data use, then adjust the data architecture to accommodate them.
The Data Engineer role is closely related to that of the Data Architect. The Data Engineer also works on the management side of data, which leads some people to think the titles are interchangeable. However, a Data Engineer, who usually has a strong background in software engineering, is the one who builds, tests and maintains the data architecture.
As Udacity states in 3 Data Careers Decoded and What It Means for You,
a data engineer builds a robust, fault-tolerant data pipeline that cleans, transforms and aggregates unorganized and messy data into databases or datasources.
This means that Data Engineers have the same knowledge about the inner workings of databases as Data Architects, but they use it to develop and maintain the data architecture that makes the data accessible and ready for analysis.
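As a rough illustration of the clean, transform and aggregate pattern the Udacity quote describes, here is a minimal Python sketch. The record fields (`region`, `amount`), the sample rows and the function names are all hypothetical, chosen only for the example; a real pipeline would read from actual sources and write to a database.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from disparate sources.
RAW_ROWS = [
    {"region": " north ", "amount": "120.50"},
    {"region": "South", "amount": "80"},
    {"region": "NORTH", "amount": "n/a"},   # unparseable amount
    {"region": "south", "amount": "19.50"},
    {"region": "", "amount": "5"},          # missing region
]


def clean(rows):
    """Drop rows with a missing region or an unparseable amount."""
    for row in rows:
        region = row.get("region", "").strip()
        try:
            amount = float(row.get("amount", ""))
        except ValueError:
            continue
        if region:
            yield {"region": region, "amount": amount}


def transform(rows):
    """Normalize region names to a consistent lowercase form."""
    for row in rows:
        yield {"region": row["region"].lower(), "amount": row["amount"]}


def aggregate(rows):
    """Sum amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def run_pipeline(rows):
    """Chain the three stages: clean, then transform, then aggregate."""
    return aggregate(transform(clean(rows)))


totals = run_pipeline(RAW_ROWS)
print(totals)
```

Keeping each stage as its own small function is one common way to make a pipeline easier to test and to swap out pieces as sources change.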
As the name implies, the Data Analyst works to interpret data to get actionable insights for the company.
With a strong background in statistics and the ability to convert data from a raw form to a different format (data munging), the Data Analyst collects, processes and applies statistical algorithms to structured data.
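To make “data munging” a little more concrete, here is a small sketch using only Python’s standard library. It converts raw CSV text into numeric values and then applies basic summary statistics. The column names and figures are made up for illustration, and a working analyst would more likely reach for a dedicated data analysis library.

```python
import csv
import io
import statistics

# Hypothetical raw export; the third order has no recorded value.
RAW_CSV = """order_id,order_value
1001,25.00
1002,40.00
1003,
1004,55.00
"""


def munge(raw_text):
    """Parse raw CSV text and return the order values as floats,
    skipping rows where the value is missing."""
    reader = csv.DictReader(io.StringIO(raw_text))
    values = []
    for row in reader:
        if row["order_value"]:
            values.append(float(row["order_value"]))
    return values


order_values = munge(RAW_CSV)

# Simple descriptive statistics on the structured values.
summary = {
    "count": len(order_values),
    "mean": statistics.mean(order_values),
    "median": statistics.median(order_values),
}
print(summary)
```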
Possessing a “figure-it-out” attitude, as Theuwissen describes it, a Data Analyst runs queries guided by questions from the company’s decision-makers. The ideal person for this role should also have a solid grasp of programming, machine learning and data visualization in order to share insights clearly.
A Data Scientist’s mission is similar to that of a Data Analyst: find actionable insights that are key to a company’s growth and decision-making.
However, a Data Scientist is needed when a company’s data volume and velocity exceed the level that an analyst’s toolkit can handle and more robust skills are required to sort through the data.
We talked to Filtered’s head of content and science, Dr. Chris Littlewood, and he said,
The way I’ve heard it described usefully is that if you need to look for a needle in a bucket of hay, you sort through the hay. If you need to find a needle in a field of haystacks, you’re best off using a very powerful magnet to pull out the needle.
So instead of finding key information by analyzing structured data, a Data Scientist wades through a rolling sea of unstructured data (big data) to identify questions and pull out critical information. The person then cleanses the data for proper analysis and creates new algorithms to run queries that relate data from disparate sources.
On top of these skills, a Data Scientist also needs strong storytelling and visualization skills to share insights with peers across the company.
Although each role has different objectives and processes for working with data, the skillsets required for these positions often overlap. One example is the set of programming languages each role should be familiar with.
When it comes down to it, a data science initiative is more than just using someone with highly technical skills to spot a pattern in a data set. There are many moving components to manage to effectively collect the data, make it accessible and gain key insights for the company.
Keep in mind that not every organization needs to have every type of data professional on a team. But it’s good to know the difference so you can choose the right person.