In my next life I want to be a data scientist… Apart from being labelled the sexiest job of this decade, it encompasses a very interesting mix of job functions. The typical data scientist performs a combination of technical IT tasks, various forms of data analysis, and high-end consulting to the business. However, while these scarce resources are being “bred”, organisations may have to resort to small teams of specialists to get the job done.
The term “data scientist” has emerged as the new term for a data analysis specialist “plus”. It has become the label for the whole package consisting of a data analyst, data specialist, business intelligence analyst and data engineer, all in one person. You can imagine what value such a person can contribute to the decision-making processes in the organisation.
This role has become very relevant because the enormous growth of data, together with the technology now available to process it, has left more questions unanswered and many more facts still to be discovered. It is an opportunity to bring science and scientific methods back into the BI space. According to a recent EMC survey, the typical data scientist is more likely to be involved in the whole data life cycle, and to perform experiments on data as well. Organisations are fast realising that in order to tap the potential hidden in their data, they need more skilled data analysts, and they need to give them the freedom to explore and experiment with the data. Churning out production reports by the hour just isn’t enough to gain a competitive edge any more.
So what would the “job spec” of a data scientist be?
You will perform data extraction and data preparation:
- Extract data from various data warehouses, databases, systems, and
other sources such as Excel and social networks.
- Cleanse and deduplicate the data to make it ready and useful for analysis.
- Restructure the data in efficient formats for analysis.
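
To make the preparation step a little more concrete, here is a minimal sketch of cleansing and deduplicating a handful of records in plain Python. The function names (`normalise`, `deduplicate`), the field names and the sample customers are all made up for illustration; a real project would more likely use a dedicated data preparation tool or library, and "duplicate" would need a business definition.

```python
def normalise(record):
    """Trim whitespace and lower-case every string value in a record."""
    return {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


def deduplicate(records, key_fields):
    """Keep the first record seen for each combination of key_fields."""
    seen = set()
    unique = []
    for record in map(normalise, records):
        key = tuple(record.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample data: the first two rows are the same customer,
# differing only in capitalisation and stray spaces.
customers = [
    {"name": " Alice Smith ", "email": "Alice@Example.com"},
    {"name": "alice smith", "email": "alice@example.com "},
    {"name": "Bob Jones", "email": "bob@example.com"},
]

clean_customers = deduplicate(customers, key_fields=("email",))
```

Here the e-mail address is assumed to identify a customer, so the two Alice rows collapse into one once the values have been normalised.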
You will perform in-depth data analysis:
- Explore the data to detect and represent meaningful insights.
- Apply statistical methods to determine trends and patterns in the
data that are not immediately obvious.
- Postulate, test and investigate hypotheses about
the business, as represented by the data.
- Apply advanced and predictive analytics to segment and cluster the
data, predict trends, and support other applications such as basket analysis and cross-selling.
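
As a rough illustration of what basket analysis involves at its simplest, the sketch below counts how often pairs of products turn up in the same basket. The `pair_support` function and the sample baskets are invented for this example; production tools use more sophisticated association-rule algorithms, but the underlying idea is the same.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the share of baskets in which each pair of items appears together."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes repeated items and gives each pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: count / len(baskets) for pair, count in pair_counts.items()}


# Hypothetical shopping baskets
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "milk"],
]

support = pair_support(baskets)
```

Pairs with a high share, such as bread and butter in this made-up data, are the natural candidates for cross-selling.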
You will present the data and interpret the findings:
- Perform interactive visual analysis.
- Draw up visual narratives and infographics.
- Prepare and deliver presentations to small and large groups.
- Document findings and write articles and position papers.
Above all, though, you need to produce actionable insight. Your analyses, findings and outcomes must be applicable and relevant to the business – the business must be able to interact with them, and act on them.
But why would you need to perform all of these tasks together? Surely there are already groups or teams that do these things, or parts of them, anyway? To reach results and new insights in the timeframes now required to make a business impact, and for the organisation to act competitively, we can’t wait for those separate groups to get their projects, budgets and dependencies sorted out before giving us some information to work with. We have to work much faster across all these functions, and be far more agile, to get new and valuable insights within timeframes that can make a significant difference.
Real data scientists are very scarce – they are still being bred, apart from the few individuals out there who already have the right skills and character traits. Many organisations have realised that for the time being they may need data science teams, consisting of data analysts, programmers, statisticians, data visualisers and a strong leader, to get the same job done.
In my next post on data science I will elaborate on what it takes to make it work – the skills required as well as the necessary changes in organisational culture.