Data Quality is essential for Real-Time BI efficiency
In my previous two blogs I focussed on the actions an organisation should take to implement and improve real-time BI. In this post I continue with the theme, focussing on the importance of data quality in the real-time BI environment and its processes. Data quality directly influences both the efficiency of those processes and the value of the resulting information to the organisation.
In the white paper that I used as the departure point for this mini-series on real-time BI, the Aberdeen Group [1] states that organisations striving to implement real-time BI should investigate technologies that can improve data quality. I want to elaborate on this statement, because there is no doubt that the quality of the data that flows through a real-time BI solution directly determines how useful that data is to the organisation's decision-making processes.
Therefore, we need to look at the processes and technologies that should be put in place to clean, enhance and fine-tune the data, so that it adds timely value when and where the corresponding information is required. Information is utilised from a BI solution in a variety of ways, and these affect the usability of the data, including its timeliness, relevance and ease of access, among others. Above all, however, the underlying data must be clean: free from errors, corruption, duplication and missing elements.
One of the most logical uses of data quality technology is for source data quality assurance. The so-called ‘best-in-class’ businesses utilise tools like data profiling, data cleansing, hygiene technology and data enrichment to improve data quality before the data is even extracted from the transactional or other source systems.
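To make this concrete, here is a minimal data-profiling sketch in Python with pandas. The file and column names (customers.csv, customer_id, email) are hypothetical, and real profiling tools measure far more, but the principle is the same: quantify completeness, uniqueness and validity before the data ever leaves the source system.

```python
# A minimal, illustrative profiling pass over an assumed source extract.
import pandas as pd

df = pd.read_csv("customers.csv")  # hypothetical extract of a source table

profile = {
    "rows": len(df),
    # completeness: share of missing values per column
    "null_ratio": df.isna().mean().round(3).to_dict(),
    # uniqueness: duplicated business keys signal dedup/merge work ahead
    "duplicate_ids": int(df["customer_id"].duplicated().sum()),
    # validity: a simple format check, here on email addresses
    "invalid_emails": int(
        (~df["email"].astype(str).str.contains(r"^[^@\s]+@[^@\s]+$", regex=True)).sum()
    ),
}
print(profile)
```

Numbers like these give a baseline to cleanse against, and to track as the quality of the source data improves.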
In a real-time BI environment, clean and correct source data is even more important. In a high-volume real-time pipeline from the transactional system(s) to the storage area used for BI, you don't want processes stalled by extensive exception handling or clogged by bad-quality data. In a batch ETL process, error handling is far less disruptive than in a real-time process. Sure, we can implement the necessary checks and filters in the real-time process (a sketch of such a filter follows below), but the simpler and faster it is, the more efficient it will be. Simpler and faster are achieved by ensuring that only correct, high-quality data gets pushed through the pipe.
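The sketch below shows one common shape for that check-and-filter step, assuming a simple record layout and two caller-supplied sink functions (both are my illustrative assumptions, not a prescribed design): validate each record cheaply on the hot path and divert failures to a dead-letter store for offline repair, so bad data never stalls the main flow.

```python
# Hedged sketch: a validation filter with a dead-letter path for a
# real-time pipeline. Record fields and sink functions are assumptions.
from typing import Iterable

REQUIRED_FIELDS = ("order_id", "amount", "timestamp")

def is_valid(record: dict) -> bool:
    # Keep the hot path simple: presence checks plus one range check.
    if any(record.get(f) is None for f in REQUIRED_FIELDS):
        return False
    return record["amount"] >= 0

def process(stream: Iterable[dict], load, dead_letter) -> None:
    # load() pushes clean records to the BI store; dead_letter() parks
    # rejects for later repair instead of blocking the pipeline.
    for record in stream:
        if is_valid(record):
            load(record)
        else:
            dead_letter(record)

# Example usage with in-memory sinks:
clean, rejects = [], []
process(
    [{"order_id": 1, "amount": 9.5, "timestamp": "2011-02-01T10:00:00"},
     {"order_id": 2, "amount": -3.0, "timestamp": None}],
    clean.append, rejects.append,
)
print(len(clean), len(rejects))  # 1 1
```

The design choice worth noting is that the filter never raises: every record takes one of two cheap paths, which is exactly what keeps a high-volume pipeline from being stalled by exception processing.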
Data forms the core of any BI solution. Having technologies such as those described above in place ensures that, on the way to achieving the real-time value of BI, time and effort are not wasted. Good-quality data therefore reduces the time it takes to process the data and increases trust in it, allowing more efficient analysis as the end result. In turn, this ensures that the real-time value of BI is realised.
Keep a lookout for my last post in this real-time mini-series next month.
[1] Business answers at your fingertips: The Real-Time Value of BI, Aberdeen Group, February 2011.