Combining Agile Prototyping and Data Warehousing


Combining agile prototyping and data warehousingAgile BI is all about interactively prototyping the information requirements and information usage with the business users. You can deliver data as quickly as technically possible, but if you don’t workshop the information exploitation aspects interactively with the users, then you are not being agile. 

Linking back to the analogy in my post on Dimensional Modeling for Agile BI, I found the most practical approach to  building Sodor railways with my children is to construct small parts of the track and keep them involved throughout. I get them to steer their engines and carriages through the segments under construction. They don’t have the patience to wait for a completed track – they chop and change it so often that we hardly ever reach the utopia of completion. And while the tracks are being built and modified, they compete to assemble the longest trains to drive through the

In my previous post I discussed how to speed up loading the data warehouse using smaller real-time loads through a  well-architected ETL framework. Now I focus on those very important frequent interactive sessions with the business users, called scrums, which form the crux of an agile approach. You have to run live prototyping sessions with them on what information they need to run the business, what can be provided and which visual formats would be most suitable.

The easiest prototyping is obviously when the data is already available in a data warehouse or data mart, in a query- and analysis efficient model. Then, along with good communication and facilitation skills, all you need is a good understanding of the business, a tool with rapid development capabilities, excellent visual displays and a skilled “tool jockey” who can churn out visual reports and graphs while in discussion with the users.

The scrums become a lot more challenging when the data is not so readily available. Then the only way you can do agile BI prototyping is by using a very interactive tool, which can directly connect to and integrate data from multiple transactional systems. Fortunately there are some good tools like that readily available, such as Qlikview, Tableau and YellowFinBI. The key requirement is that the tool must be able to join data from two (or more) sources “on the fly”. Whereas the business may at first be satisfied with information coming from a single system, they will soon require cross-system analyses. KPIs such as Revenue per Full-Time Employee and Revenue per Base station/Store/Agency are pretty simple KPIs that are often required, but usually require data from multiple systems (e.g. Financials and HR for the former). After a few sessions with the business, more and more of these will appear.

So, the business has seen what they want, and you have provided it to them “on the fly”. Now, how do you bridge the gap between prototyping and production reporting? I am still strong of the opinion that production reporting should be done from the data warehouse – which should be the single true source of properly integrated, validated and quality-assured data, in a reporting and analysis efficient format, with detailed history. So while the business is happily using the prototype reports and dashboards, your agile ETL team gets the “now” data into the data warehouse as quickly as possible, while your hard-core back-end team starts working away at getting the historical bulk loads in place. As soon as sufficient data is available in the data warehouse, the reports and dashboards are pointed to it as their true source. And so in this fashion the prototyping based sprints continue…

In the following posts in this series on agile BI, I will look at the approach to prototyping, some of the BI toolsets that support prototyping, as well as paradigms to visually present data.

Leave a Reply