This month, I am wrapping up my three-part series on evolving data-related roles by focusing on one of the fastest-changing and most popular roles under discussion in the industry today – the data scientist.
In his Data Science Central article ‘The Rise of the Dual Data Scientist / Machine Learning Engineer’, Vincent Granville writes that companies need professionals who are skilled in data science, machine learning, data analysis, research, and statistics, and who also know a particular business industry. He describes these individuals as ‘unicorns’ who are very difficult to find, especially considering that traditional education and HR functions have a very narrow mindset of funnelling people into specialised roles.
When it comes to education, this approach implies that you need multiple qualifications, often at the Master’s level, or that you need to supplement your qualifications with several certifications to cover the full extent of the field. Yes, it is good to have siloed roles in a large data science team, but not many companies know how to sustain a team of that size. Often, they need these skills combined in a single multi-faceted ‘unicorn’, especially when the time comes to improve their maturity in the application of analytics, AI, and machine learning.
Expanding skillset
In her article on Data Science Central, ‘Data science as a lucrative career option for the youth’, Erika Balla writes that data science combines multiple disciplines. These include statistics, computer science, and mathematics. Balla notes that data science roles often include other functions, such as those of a data analyst and a data engineer.
According to Balla, it has become important for data scientists to venture into Machine Learning as well. “In the field of Data Science, Machine Learning is a pivotal skill. As a subset of AI, Machine Learning allows systems to learn and improve from experience automatically. The algorithms of Machine Learning have diverse applications, such as prognostic analytics, processing of natural language, and identification of images, among others.”
Beyond that, I do not think I need to highlight the importance of the software engineering and DevOps aspects of data science and Machine Learning. I have witnessed first-hand how incredible analytical models were developed that could have delivered significant business gains and cost savings. And yet, they failed because they were not properly productionised – in other words, they were never integrated with the operational systems of the organisation, and their results were never made available to decision-makers in a format, or within a timeframe, that empowered them to act.
Understanding the soft skills
Beyond the technical aspects of the data science role, one cannot neglect the soft skills related to business communication. Listening properly is crucial. The data scientist must align their work with business strategies and priorities, and must provide the company with the insights it needs to act on.
Of course, there are times when these insights will be difficult to convey. Data storytelling, data visualisation, good writing, and presentation skills are therefore essential. Sadly, these are the skills that are often neglected in educational and training programmes.
Granville’s article is a recommended read, as he elaborates on how he trains software engineers to add data science and machine learning to their skill sets. He also touches on how to add software engineering skills to data science and analytics. He puts his own work in context: “But because I know both research and engineering, I am able to develop better solutions faster. The integration of engineering and research takes place in one brain (mine), not two. Intricacies in each are resolved jointly via fast human intelligence: all connections and neurons are in a single brain optimised for this dual interaction. Thus 1 is bigger than 1 + 1 here. And less expensive.”