Last month, I discussed the evolving roles of the Chief Data Officer (CDO), Chief Analytics Officer (CAO), and the Data Engineer. In my blog post this month, I will delve deeper into several of the other roles related to data engineering, as well as the evolution of the role of the Business Analyst.
Back in the day, we had ETL analysts and developers. Together with Database Administrators (DBAs), the ETL specialists did everything related to ‘moving’ data from source systems into the data warehouse. Of course, data warehouse architectures were a lot simpler back then. Often, the data warehouse was a single, large, on-premises database. It may have had a separate schema for the staging area, with everything happening behind the firewall. Many companies could not really afford a development or test environment for their data warehouse, so a lot of the work had to happen in production.
Expanding Data Engineer skill set
Today, there are hybrid architectures where some source systems are hosted on-premises, some are hosted in the cloud, and others are hosted by vendors on their own cloud infrastructures. Most organisations now also have development, test, and production instances of their data warehouse and Business Intelligence (BI) environments. This means a Data Engineer must have a much wider skill set. In the Technative article that I referenced in my previous post, Andy Palmer noted that “some businesses are encouraging software engineers to branch out into data engineering, while others are trying to retrain their database administrators.” The reason for this is that modern data engineering tasks include work traditionally done by software engineers and DBAs.
In the ‘Roles and Responsibilities’ section of Microsoft’s cloud adoption framework, these roles are listed as essential for a cloud-scale analytics platform:
- Platform Ops: Also called a Cloud Engineer, Infrastructure Engineer, or Systems Engineer, this person plays a key role in cloud technology design, implementing and enabling cloud services and capabilities, and managing cloud resources.
- DataOps Engineer: This role, sometimes also referred to as Data Platform Ops, is responsible for orchestrating the data and analytic pipelines, promoting features to production, and automating quality management processes.
In a recent Forbes article, ‘How To Build An Effective Data And Analytics Team For Business Success’, Lokesh Anand refers to these roles as the DataOps team, noting that they take care of data stores, databases and transformation pipelines, ensuring that the data quality is intact.
A security focus
Next up, we have the Security Architect and Security Engineer. These roles are responsible for designing security policies, standards, and related tools, ensuring that they are properly implemented, and assisting with security assessments and audits. With the increase in the amount of sensitive data collected and stored, and the ever-increasing threat of data and privacy breaches, these roles are crucial.
Database administration
What is interesting to me is that neither the Microsoft framework nor the Forbes article explicitly mentions the Database Administrator role or the tasks it performs. I think this role is as relevant now as when it started in the 1990s, if not more so, and it is just as necessary and crucial in the cloud analytics space.
As Kevin Kelly mentions in the AWS blog post ‘The evolution of the database administrator role’, AWS offers 15 different purpose-built databases, designed to support diverse data models. Microsoft Azure has a similarly interesting and diverse set of data storage platforms. Granted, today’s DBAs are finding that their jobs are more software-based and less about provisioning and managing hardware. But for all these databases, we still require capacity planning, backup and recovery, performance optimisation, and workload management and balancing.
In fact, choosing the optimal data storage solution and integrating data sets across the various technologies is a new challenge in itself. Robert Half, in the post ‘The Evolving Role of Database Administrators’, phrases it aptly: “Data storage has dramatically changed over time, from mainframes to databases to the cloud. But as long as there’s data, people will want information from that data, and it will need to be managed. The DBA role is not going to lose steam any time soon.”
Analysing the business
Up next, we have the Business Analyst. In the past, Business Analysts in the BI space used to ask the businesspeople what KPIs they wanted on their dashboards. They then ensured the technical teams delivered on spec while doing a bit of testing and data mapping. Of course, this is a complete over-simplification, but the point is that there are also new roles related to business analysis in the modern BI and analytics segments.
As Michael Amori writes in the Forbes article ‘How AI Is Revolutionizing The Role Of The Business Analyst’, Business Analysts have to look at data and insights more holistically. It is interesting that he mentions staffing analytics in his examples. I am still under the impression that most businesses place people- and support-related insights too low on their priority lists, when their staff are in fact their biggest asset. Amori also highlights that, as analytics matures and AI provides more of the insights, analysts should be freed up to present their findings at a higher level and include institutional knowledge, industry expertise, and more strategic line-of-business needs.
In the Microsoft framework, the term Data Steward (also called Data Trustee) refers to a role inside the team that is responsible for data meaning, data quality, data compliance, fitness of data assets, and knowledge of data products and their use. Palmer gives the rationale for this role in these terms: “Business Analysts are often frustrated by the lack of organisation and quality of a company’s data, not to mention the inability to find the information they need. To ensure that it’s consolidated and organised as appropriate for their needs, some take matters into their own hands and assume responsibility for the collection of data across the organisation.”
An interesting term used in the Microsoft framework is that of Data Owner. As domain subject matter experts who are also responsible for business relationships, Data Owners need to manage access approvals, data quality rules, business term definitions, usage rule definitions, and other aspects related to the specification and utilisation of data. In layman’s terms, the role is responsible for implementing the operational aspects of data governance from within the business intelligence and analytics team.
In my current team, we have replaced the traditional Business Analyst role with that of an Insights Analyst. This reflects the increased focus on delivering more mature and higher-level insights to the business. We have also started adopting the term Data Owner as an SME-related role.
Many hats
In a large team, these roles may be filled by different people. However, in smaller teams, an individual may be required to fulfil several of these roles concurrently or at different stages of the data-related lifecycles. Even though I wrote about Data Scientists a while ago, I would like to cover the evolution of the Data Scientist role in a future post, as it is vast enough to warrant a write-up of its own.