Understanding the analytical trends to watch for in 2021


This is usually the time of year we reflect on what was and think about what is to come. I think we can all agree that 2020 has been quite the experience, with terms like social distancing, lockdown, and pandemic becoming part of popular discourse. As organisations embrace remote working and the associated digital technologies as part of the ‘new normal’, data will become more important than ever.


Data science is a team sport


The concept of teamwork is fundamental to the success of any organisation. Sport fanatics understand all too well how this can help shape the success (or failure) of their favourite sporting teams. But when it comes to data science, many still think it is a solitary pursuit with people working in isolation from the rest of the organisation. The reality could not be further from the truth.

I recently came across an insightful article that explores what constitutes a strong data science team structure, providing a number of very relevant insights from various industry experts. While the piece does not use the sporting analogy, any team requires individuals who fulfil certain roles and responsibilities – much like the positions in a rugby, football, or cricket team. With this in mind, the piece points out that companies should not simply appoint several data scientists and call them a data team. Rather, the business must understand the different data science roles, and the data science team must understand the business challenges that need to be overcome, for the effort to succeed.

Different roles

Much like sport, there are many different roles in any data science team. How effectively these individuals work together will greatly impact the team’s ability to extract value and meaningful strategic insights from the data at hand. Examining what constitutes a data team, the article says that while data scientists will comprise the bulk of it, there are different types to consider – machine learning experts, statisticians, and developers, to name a few – all with varying knowledge and expertise.

In fact, the field is so vast that you can have an individual who has a PhD in data science but does not necessarily have much knowledge or experience in the applications needed for the business. Similarly, the individual might not understand the insights the company requires to grow in a competitive environment.

Beyond appointing the right mix of data scientists, any data team must include data engineers, who are responsible for setting up and managing the data pipeline. So, while the scientists build analytical models, the engineers handle the nuts and bolts of setting up the processes.

Strategic integration

Tying all this together is a data strategist, who provides an invaluable link between the business and the data science team. It is a case of combining the productionising of analytical models with the data architecture, and guiding both in a way that meets the core business objectives. This function is often neglected, as much of the focus falls on model development. Unfortunately, many organisations do not have the skills on board to put those models to use within the ecosystem of their production systems.

But it remains critical to combine the different skills of data scientists, data analysts, data engineers, and a data strategist. Of course, not all companies will be able to afford specialists in each of these roles. Instead, they should have the team share responsibilities to cover each of these touch points. The key is to have the roles fulfilled in a unified manner, delivering the data value the organisation requires. Given that data will only become more important as businesses embrace digital transformation initiatives, the data science team becomes one of the most vital in ensuring the success of the organisation into the future.

Understanding the megatrends impacting on AI


Given the rapid push towards the digitalisation of business in 2020, Artificial Intelligence (AI) has gained attention as a critical enabler in this regard. In April this year, while delivering the company’s quarterly earnings report, Microsoft CEO Satya Nadella said that “we’ve seen two years’ worth of digital transformation in two months.” It is this sentiment of fast-paced innovation that is echoed in a recent Gartner article I came across, focused on AI megatrends.


Vital capabilities for advanced analytics success


Last month I looked at what organisations can do to overcome some of the barriers to tailoring data management for advanced analytics. That post built on the first blog in this series, where I discussed how important data management is for successful advanced analytics. In this final piece of the series, based on a recent TDWI report, I will explore the critical data management capabilities required to ensure advanced analytics success despite the challenging market conditions.

According to the results of the TDWI survey, the three most important data management capabilities are data integration and data warehousing (tied for first), and data quality. Of course, this should not come as a surprise considering that these form the core of what many consider to be modern data management, from both reporting and analytics perspectives. Fundamentally, all three work in such unison that it becomes difficult to distinguish where the integration stops and the warehouse begins.

A modern approach

Driving advanced analytics in this environment requires the modernisation of existing data warehouses. These need to be able to support the likes of data lakes that can be deployed across any environment, whether on-premises, in the cloud, or a hybrid of both. But if the quality of the data is poor, it does not matter how integrated the data warehouse is or whether it supports modern innovations. Organisations will still not be able to effectively analyse the information they have at their disposal.
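The point about data quality can be made concrete with a small profiling check. This is a minimal sketch using only Python's standard library; the function name, fields, and records are invented for illustration, and real quality checks would also cover types, ranges, and referential integrity.

```python
from collections import Counter

def profile_quality(rows, required_fields):
    """Count missing values and exact duplicate rows in a list of dicts."""
    missing = Counter()
    seen, duplicates = set(), 0
    for row in rows:
        # Flag empty or absent required fields.
        for field in required_fields:
            if row.get(field) in (None, ""):
                missing[field] += 1
        # Detect exact duplicates by comparing the full record.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        duplicates += key in seen
        seen.add(key)
    return {"missing": dict(missing), "duplicates": duplicates}

rows = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": None},   # missing value
    {"id": 1, "amount": 100.0},  # exact duplicate
]
print(profile_quality(rows, ["id", "amount"]))
# → {'missing': {'amount': 1}, 'duplicates': 1}
```

A profile like this, run before any analysis, is exactly the kind of early warning that stops poor-quality data from quietly undermining an otherwise well-integrated warehouse.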

For its part, data integration extends beyond the warehouse to encompass aspects such as ETL, replication, synchronisation, virtualisation, orchestration, and workflow management. Furthermore, the report highlights the importance of data semantics, a broad term that incorporates all forms of metadata management. Without these semantics in place, self-service analytics cannot be successful. And if this does not take place, true advanced analytical programmes are also significantly limited in their potential for the organisation.

After all, self-service analytics depends on tools that prepare data for simpler integration.
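The ETL part of that integration work can be sketched in a few lines of Python. This is an illustrative toy only: the records and table name are invented, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Extract: in practice this would read from source systems or APIs;
# here a hard-coded list of records stands in for the source.
source = [
    {"customer": "acme", "revenue": "1200"},
    {"customer": "globex", "revenue": "950"},
]

# Transform: normalise names and convert types before loading.
cleaned = [(r["customer"].upper(), float(r["revenue"])) for r in source]

# Load: an in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO revenue VALUES (?, ?)", cleaned)

total = conn.execute("SELECT SUM(amount) FROM revenue").fetchone()[0]
print(total)  # → 2150.0
```

Real pipelines add orchestration, scheduling, and error handling around these same three steps, which is precisely the nuts-and-bolts work that falls to data engineers.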

Remember the interface

According to TDWI, data management for advanced analytics should integrate data from numerous sources at multiple latencies. And as the types of resources and targets expand via new interfaces, companies need to keep in mind the importance of interfacing effectively with these environments.

To this end, interface and API management are becoming increasingly important aspects of data integration and advanced analytics. As with other technological processes, the human element should not be overlooked.

Human factor

People-driven data practices are also important for successful analytics initiatives. By supporting data sharing functions, stewardship, and curation features, business users can more effectively manage and control analytics in the organisation and extract insights from the data. Data is changing how businesses operate and engage with external and internal stakeholders. By embracing the variety of data management capabilities available, decision-makers can help ensure the success of advanced analytics irrespective of what is happening in the market.

Navigating data management obstacles for advanced analytics growth


In my previous post, I discussed how important data management is for successful advanced analytics, based on a recent TDWI report I found of interest. As a follow-on, this month I examine what organisations can do to overcome some of the main barriers to tailoring data management for advanced analytics and really reap the benefits.

Constant governance pressure

According to a TDWI report, the leading concern remains data governance – more specifically, how best to modernise governance practices to cover the inevitable additional data platforms and use cases that come with advanced analytics programmes.

Data legislation is something that needs to be taken seriously. The General Data Protection Regulation (GDPR) in the EU and the Protection of Personal Information Act (PoPI) in South Africa are prime examples of this. The financial and reputational damage that can result from non-compliance can be significant and so it’s no surprise that data governance remains a top priority to get right before any benefits can be reaped.

Irrespective of the legislation, ensuring data privacy and using it within compliance parameters will underpin any successful analytics initiative. Of course, given the right focus and resources, an organisation can maintain its compliance no matter how many data platforms and advanced analytics programmes it runs.

Managing data complexities

Another barrier which TDWI highlights is the complexity of hybrid data architectures. It states that most data management teams deploy multiple data platforms, each optimised for a particular [analytics] use case or structure. Inevitably, this creates a distributed environment that can result in significant data sprawl.

This complexity exists for good reason: each platform provides the best outcome for its specific use case. However, it creates difficulties in architecting environments that can continuously repurpose data for a variety of uses. Maintaining a single version of the truth can be problematic given how data is spread across environments within any business. If data is not successfully integrated across these architectures, a business can lose sight of the most relevant aspects of it. The saying about not seeing the forest for the trees becomes all too real in this regard.

Leveraging expertise

Furthermore, TDWI has found that companies that are new to advanced analytics are often held back, at least initially, by a lack of internal skills. It goes so far as to say that even those companies with established analytics programmes struggle to keep pace with the skills required to unlock the potential of the data.

And the skills gap that exists will certainly add to organisational pressure. More must be done to upskill and reskill data management teams in this data-driven world. Moreover, the training provided to tertiary students must better reflect the needs of the organisation, especially when it comes to advanced analytics.

Fortunately, none of these obstacles are insurmountable.

However, a concerted effort must be made if companies are to address these concerns and drive positive change in the organisation, so that they can reap the benefits data management for advanced analytics offers. Join me next month for the final article in this series, where I explore the critical data management capabilities required to ensure advanced analytics success despite the challenging market conditions.

Data management essentials for better analytics


Thanks to the availability of innovative technologies such as artificial intelligence, machine learning, and robotic process automation, modern companies have access to a wealth of tools to expand their analytical functions and extract additional value from the data at their disposal. Key to getting this right, however, is managing data effectively and understanding the very important relationship between advanced analytics and data management.

In a recently published report I came across, TDWI examines the best practices required for using data management in an advanced analytics environment to really reap the benefits. The points outlined in the report are so valuable that I will be summarising them and sharing my views over a series of articles linked to this theme.

Fundamentally, and as many in the data space will know, it comes down to the premise that successful forms of advanced analytics require adequate data management. The report sums this up perfectly: if a company puts ‘garbage’ in, it stands to reason that it will only get ‘garbage’ out, irrespective of the technologies used.

Data management therefore requires the right data in the right format on the right platform to deliver value through advanced analytics.

Matching needs

This means matching a combination of data management platforms and tools to each individual use case for advanced analytics. Much like any business challenge, the organisation must adopt a unique focus for each analytical function and manage the data processes and requirements around it. There is simply no one-size-fits-all approach in this regard.

For example, as the report highlights, the analytics required when cross-examining massive data volumes (think of statistics and data mining) typically see users deploying Hadoop or a cloud-based management system. However, analytical solutions can be on-premises or in the cloud. As such, it comes down to the requirements and which tools are best suited to the job. That said, as more organisations continue their digital transformation journeys, the expectation is that there will be a natural increase in cloud-based solutions due to the high-performance capabilities that these can leverage.

Concepts unpacked

To understand this practice or approach, we must decipher the concepts to see their relationship more clearly. Advanced analytics is a collection of multiple user practices and tool types supporting techniques for data mining, statistics, predictive analytics, data visualisation, and others. Analytics should therefore be seen as a grouping of several practices with each having its own focus, abilities, value proposition and performance characteristics.

For its part, data management focuses on a variety of product types, technologies, and user practices that contribute to the successful handling of data. According to TDWI, these can be divided into data integration and data platforms. The former is about capturing and repurposing data for applications while the latter is the location where data is stored and managed to be provisioned for applications.

Both these elements need to be adapted to meet the extensive requirements of advanced analytics. Each method needed for advanced analytics can require a combination of data integration and data platforms in order to deliver value effectively.

Seeing the relationship between this myriad of components emphasises the importance of data management for successful advanced analytics. Dubbed an emerging practice, it aims to raise the accuracy of analytics outcomes for the organisation by adapting data management practices to the unique needs of each analytics technique and solution. This entails developing a targeted solution that is capable of truly analysing all aspects of the data without leaving needs unaddressed.

So, join me next month as I explore how to overcome the barriers of data management for advanced analytics to harness the benefits on offer.

Spot the difference – data scientist vs data analyst


Revenues in the global big data market are expected to more than double from 2018 figures, reaching $103 billion by 2027. Figures like this emphasise the need for organisations to ensure they have the right skills in place to capitalise on this obvious growth opportunity. And a key element is to leverage the expertise of data scientists and data analysts. Yet, in my experience, there is still some confusion about the roles and responsibilities these positions fulfil.

I came across this interesting article that examines the main differences (and similarities) between the two roles. And this discussion has become even more relevant today given the uncertainty of the market due to the COVID-19 pandemic. In fact, it is a question I get asked by many students in my Master of Business Analytics courses – and one that inevitably arises from clients not certain about the nuances between these positions.

In this industry piece, the author’s description of a data analyst is spot on. He writes that ‘the focus of data analytics is to describe and visualise the current landscape of data – to report and explain it to non-technical users.’ He goes on to cite skills in SQL, Excel, and Tableau (or other visualisation tools) as being especially important.

When it comes to data science, he describes it as ‘a field of automated statistics in the forms of models that aide in classifying and predicting outcomes.’ He writes that the top skills required by a data scientist centre around Python, SQL, Jupyter Notebook and algorithms.
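The split the author describes – analysts describing the current data, scientists building models that classify and predict – can be sketched with Python's standard library alone. The numbers and labels below are invented, and a nearest-class-mean rule stands in for a real model; it is a toy, not a recommendation.

```python
import statistics

# Labelled historical data: (feature value, class); values are invented.
history = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.5, "high")]

# The analyst's task: describe the current landscape of the data.
values = [v for v, _ in history]
print(statistics.mean(values), statistics.median(values))  # → 5.0 4.75

# The scientist's task: a model that predicts. A nearest-class-mean
# classifier is about the simplest such model.
centroids = {
    label: statistics.mean(v for v, l in history if l == label)
    for label in {"low", "high"}
}

def predict(x):
    """Assign x to the class whose mean it sits closest to."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

print(predict(2.0))  # → low
print(predict(7.0))  # → high
```

Even at this scale the division of labour is visible: the first half reports what the data is, the second half generalises from it to new cases.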

While I do expect analytical modelling skills in a data scientist, as he mentioned, I also think a more rounded skill set is required from a ‘full-blown’ data scientist, including an understanding of business processes, business strategy as well as all the data analyst skills mentioned above.

Having said that, these fully rounded data scientists are a scarce commodity. Realistically, very few people have all these capabilities, and even fewer have the inclination to get skilled in so many fields, especially when it comes to the business side of things. However, for a small team within an organisation, or even a start-up, these are the skills needed in the ‘toolbox’ of the data scientist.

When it comes to larger teams, there are more specialised roles available. However, the team lead must be multi-skilled across all the disciplines even if they do not practice it from a hands-on perspective. In these teams, the data scientist role can be one that is more in the form of an analytical/machine learning/automation ‘guru’ who leaves the data wrangling, reporting, and business interaction to their teammates, which can include data analysts.

In many respects, this is what used to be called an analytical modeller. Of course, given the slightly wider application beyond just analytical models (such as machine learning and automation), the term data scientist is probably more apt within the current landscape. So, while data drives both these positions in the organisation, there are subtle differences, especially when it comes to the business expectations of each. Irrespective, the competitive organisation of the future will require an element of both to remain relevant in an increasingly dynamic market.

Let your data tell a meaningful story


The old adage of “a picture is worth a thousand words” could not be more apt, or more evident, than when it comes to data visualisation. And with research showing that the human brain can process images the eye sees for as little as 13 milliseconds, data storytellers are playing an increasingly critical role in bringing across key insights, visually, as effectively as possible.

Data storytelling is a methodology for communicating information, tailored to a specific audience, with a compelling narrative. It merges data science, visualisations, and the concept of a narrative to provide one of the most effective ways of sharing business information and driving outcomes.

When it comes to data, the reality is that data preparation on its own holds little value beyond those working closely with it. To unlock its full potential, a company requires a data storyteller who understands not only the data itself (science) but can also pull it together visually (visualisation) to tell an important story (narrative) and help guide organisational strategy and decisions.

Data evolution

Anyone who follows my blog will know that the data visualisation topic is one I am rather passionate about. And so, when I came across this insightful article on the evolution of a data analyst to a data storyteller in three steps, I immediately wanted to share the insights and my take.

The author of this industry piece explains how to improve data visualisations using a simple procedure in Matplotlib. And while you can read the mechanics and try it for yourself, there are a few fascinating insights to be drawn from the process involved.

The critical point is that visualisations are the key mechanisms to translate complex data outcomes into understandable business stories. For its part, visualisation becomes the fundamental tool required to enable data-related storytelling. According to the article, if a data analyst or scientist cannot visualise the results, then they do not know the results. The author writes that it takes a passionate and skilled data scientist to transform basic visualisations (which just about anyone can create) into a story that managers and customers will understand and get excited about.

This all comes down to three things noted in the article – adding information, reducing information, and emphasising information. Much of this revolves around improving the signal-to-noise ratio, which describes the amount of valuable information compared to the amount of unnecessary information. The article goes on to explain how to highlight all the important information and remove everything that does not add any real value.

At its core, this is what storytelling does. Nobody wants to read a novel that is poorly written or has weak characters and plot holes. Similarly, data storytellers must create compelling narratives where the ‘readers’ become passionate about what they are seeing.
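The three moves the article names – adding, reducing, and emphasising information – can be sketched in a few lines of Matplotlib. The data, series names, and title below are invented purely to illustrate the pattern, not taken from the article.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

months = range(1, 7)
ours = [3, 4, 6, 7, 9, 12]
others = [[3, 3, 4, 4, 5, 5], [2, 3, 3, 4, 4, 5]]

fig, ax = plt.subplots()

# Reduce: strip non-data ink that competes with the message.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Emphasise: one bold series carries the story; the rest fade back.
for series in others:
    ax.plot(months, series, color="lightgrey", linewidth=1)
ax.plot(months, ours, color="black", linewidth=2.5)

# Add: a direct label replaces a legend the eye must cross-reference,
# and the title states the conclusion rather than describing the axes.
ax.annotate("our product", xy=(6, 12), xytext=(5.2, 12.5))
ax.set_title("Sales are accelerating")
fig.savefig("story.png")
```

Nothing here changes the underlying data; every line either removes noise or strengthens the signal, which is the whole discipline in miniature.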

Insights beyond

In his 1983 book, ‘The Visual Display of Quantitative Information’, Edward Tufte introduced the concept of the data-ink ratio. Tufte refers to data-ink as the non-erasable ink used for the presentation of data: if the data-ink were removed, the graphic would lose its content. Non-data-ink, accordingly, is ink that does not carry information but is used for scales, labels, and edges. The data-ink ratio is the proportion of ink used to present actual data compared to the total amount of ink (or pixels) used in the entire display. In the book, he explains how to get more data (story) onto the graph with fewer graphic distractions.

Furthermore, Stephen Few has also written extensively on visual business intelligence, or rather, data visualisation as we know and love it today. I have in the past attended one of his excellent courses and have worked my way through some of his material – which is fantastic. In his most recent blog, he addresses the data storytelling attempts by some to compare the effects of the COVID-19 pandemic around the world. While it provides a fascinating read on how to make the data tell a more insightful story, the examples he creates highlight that it is not always about making things more complicated; adopting a simpler approach often provides more valid and useful ways to represent data.

Data visualisation and storytelling are fast becoming critical components for any business that wants a better understanding of the data at its disposal. In fact, I believe this will be a vital capability for any organisation looking to differentiate itself in this digitally driven world.

Data this and data that – tips to avoid data chaos


At the core of almost every business conversation centred around growth and competitive gain today lies the topic of data. It is the single most valuable asset for many businesses – no matter the industry or size. Data has taken the world by storm and continues to shape how businesses operate and transform to ensure relevancy today and in the years to come.

Fuelled by technological development, data is coming into organisations from every angle. Naturally, business users within the company want to be able to leverage this data to help them fulfil their roles and support overall business growth and transformation. While the business shouldn’t necessarily complain about this, if not managed properly, this ‘data this, data that’ focus can result in organisational data chaos.

In an environment where data is uncontrolled and dispersed across the organisation, business users are burdened with having to distinguish between distributed data assets and try to make sense of them – for example, determining which is the real master copy of the data, or the real version of the truth – before the data can be effective. The bottom line is that this is not the role of the business user, and having to spend time determining whether data is up to date, validated, or complete is a waste of a scarce resource.

Instead, an organisation that is steering a tight data ship knows that the correct version of the data must be available to the business user for them to carry out their actual job function – interpreting the data and using it in business-oriented decision-making.

While achieving this is no easy feat, it is most definitely a manageable one, when the right processes and technical solutions that support a data driven approach are in place. 

A dedicated data team

For a business to reap the rewards of a data-focused approach, the organisation needs a team of data stewards. The sole responsibility of this team is to serve the business users by providing them with the accurate data they need, when they need it, and in the format and level of quality they require, so they can carry out their business functions and responsibilities. Setting business users up with the wrong data, or data of poor quality, is only setting them up to fail.

A strategy for data management with the right tools

To produce quality data at the level business users need, the data team must work off data running through a dedicated, managed data system. Technology-based data management solutions play a critical role in guiding an effective data-driven business forward. Capabilities like data inventories, data catalogues, and data dictionaries are central to this. Without this focus, the job of the data team is near impossible, with a knock-on effect that impacts the entire organisation.
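At its simplest, a data dictionary is a structure describing each field and its rules, against which incoming records can be checked. This sketch uses invented field names and rules purely to illustrate the idea; real catalogues are far richer, covering ownership, lineage, and definitions.

```python
# A minimal data dictionary: field names, types, and rules are
# invented for illustration, not drawn from any real catalogue.
DATA_DICTIONARY = {
    "customer_id": {"type": int, "required": True},
    "email":       {"type": str, "required": True},
    "opt_in":      {"type": bool, "required": False},
}

def validate(record):
    """Return a list of violations of the data dictionary."""
    errors = []
    for field, rule in DATA_DICTIONARY.items():
        if field not in record:
            if rule["required"]:
                errors.append(f"missing required field: {field}")
        elif not isinstance(record[field], rule["type"]):
            errors.append(f"wrong type for {field}")
    return errors

print(validate({"customer_id": 7, "email": "a@example.com"}))  # → []
print(validate({"customer_id": "7"}))
# → ['wrong type for customer_id', 'missing required field: email']
```

A check like this is what lets data stewards hand business users data that is already known to be complete and correctly typed, rather than leaving that detective work to the users themselves.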

Data governance: the golden thread

Of course, none of this can succeed unless data governance sits at the centre of everything data-related that an organisation carries out. Policies, procedures, and ways of working linked to data must be defined around data governance and compliance. And this must be driven from the C-suite, top down, for any data strategy to succeed.

While every business wants to be data-driven to benefit from this asset that is only growing in relevance, it is critically important not to let data cause chaos in an organisational structure. To avoid this, I believe a strong focus is needed on the points identified above – looking at the various technical solutions required to provide high-quality, correct, and suitable data to the business user, while being enterprise-led and with governance top of mind. It is only when such points are visibly actioned within an organisation that it can really reap the benefits data has to offer.

