Tag Archive: ETL

Coming to grips with data wrangling

Data wrangling is often used to describe what data engineers do. However, a universally accepted definition of the concept has proven difficult to find. It was therefore fortuitous that I came across this piece on Simplilearn titled “What Is Data Wrangling? Overview, Importance, Benefits, and Future.” It shares some interesting and insightful points and so …

Continue reading »

Data Lake vs Data Warehouse

In a previous post I gave a high-level description and overview of the data lake. Informally speaking, from a BI point of view, a data lake is a large, scaled-out, all-encompassing, free-for-all unstructured data staging area. In this post I take the discussion further to investigate whether it replaces or interacts with the data …

Continue reading »

BI tools

As this is my last blog post for the year, I thought it would be fitting to explore how the Business Intelligence (BI) landscape has changed over the years, especially regarding toolsets. Having worked in the BI sector for numerous years, I have seen that in the last five years the BI toolset landscape has not only expanded, …

Continue reading »

Data Scientist – Job Spec

In my next life I want to be a data scientist… Apart from being labelled the sexiest job for this decade, it encompasses a very interesting mix of job functions. The typical data scientist performs a combination of technical IT tasks, various forms of data analysis, and high-end consulting to the business. However, while these …

Continue reading »

Data Quality vital for sound BI decisions

The success of every decision is closely related to the quality of the information that was used to make that decision. For this reason, Data Quality is very closely related to Business Intelligence. Data quality checks and active data quality controls should be embedded into the loading and reporting processes. This can ensure the quality …

Continue reading »

Near-Time Data Warehouse Synchronisation

An ongoing challenge of data warehousing is the long turnaround time between when a business event occurs and when the fact representing it is ready for consumption in the data warehouse. Delayed delivery makes it difficult to make timely decisions. Long data latencies also impede the organisation’s ability to quickly assess the implementation of decisions. …

Continue reading »

Combining Agile Prototyping and Data Warehousing

Agile BI is all about interactively prototyping the information requirements and information usage with the business users. You can deliver data as quickly as technically possible, but if you don’t workshop the information exploitation aspects interactively with the users, then you are not being agile. 

Continue reading »

Agile Data Warehousing

“Agile data warehousing” is a contradiction in terms. Very few activities are less agile than populating a 10 terabyte data warehouse… However, if you want to deliver information frequently and rapidly to the business, you have to speed up the ETL implementation.

Continue reading »

Report Analytics

Report Analytics scrapes data from existing reports into integrated reports for decision-makers. It is positioned as self-service business intelligence that eliminates the delays and costs of traditional data warehousing. However, I have a concern with the trustworthiness of the existing report data.

Continue reading »
