
Big O and Big Data


Last week I attended the Oracle Big Data and Extreme Analytics event in Melbourne. The packed audience was a good indication of how relevant this topic is. Oracle illustrated and demonstrated the flow from unstructured data on Hadoop, through the Oracle NoSQL platform, integrated with structured data in their BI stack running on Exadata, to end-user tools like Essbase and Endeca running on an Exalytics platform. Endeca impressed as an advanced analysis and exploration tool for structured and unstructured data combined.

The event was very well attended. Bankers, insurers, retailers, airlines, government agencies, utilities, healthcare, consultants, service providers, and even Oracle’s competitors – they were all there. That is a sure indication of just how important this field has become – those who are not embarking on the journey yet are watching it very, very closely. It is very simple – if you are not going to analyse your customers’ sentiments and behaviours, then your opposition is going to do it, and change their product offerings and their campaigns to exploit the customers’ positive behaviours and to proactively manage their negative sentiments.

There is no doubt that everyone knows by now that Big Data with its 4 Vs (Volume, Velocity, Variety and Variability) has become important due to the incorporation of additional data sources such as measurement data, social media and other unstructured data. This additional data provides us with that crucial information about our clients’ sentiments and behaviours. However, what I really liked is that Oracle added another V, as in Value. You really need to distinguish the valuable subsets of this data very, very quickly. One attendee asked the pertinent question: do you necessarily need to analyse 2 weeks of Twitter feeds when a random sample may give you a very similar outcome? That may be a valid approach to determine a general market sentiment, but of course, when it comes to proactive or corrective action, you will want to know each and every one of the individuals concerned.
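To make the attendee's sampling question concrete, here is a minimal Python sketch. It is not something Oracle showed at the event. The sentiment scores are synthetic and the function names are my own, so treat it purely as an illustration that a simple random sample tends to land close to the full-feed average.

```python
import random

def mean_sentiment(scores):
    """Average sentiment of a collection of scores in [-1, 1]."""
    return sum(scores) / len(scores)

def sampled_sentiment(scores, sample_size, seed=0):
    """Estimate the average sentiment from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(scores, sample_size)
    return mean_sentiment(sample)

# Hypothetical data: 200,000 synthetic tweet sentiment scores,
# standing in for two weeks of a real feed.
_rng = random.Random(42)
all_scores = [max(-1.0, min(1.0, _rng.gauss(0.1, 0.5))) for _ in range(200_000)]

full = mean_sentiment(all_scores)
estimate = sampled_sentiment(all_scores, sample_size=5_000)
print(f"full feed: {full:.3f}   5,000-tweet sample: {estimate:.3f}")
```

The sample answers the "general market sentiment" question cheaply, but it cannot tell you which individual customers to contact, and for that you still need the full data.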

Enter elephant on stage left… I have always been skeptical about the Hadoop / MapReduce hype, but from the live demos Oracle ran, I am now convinced of the power of MapReduce. (My word, it has a super-primitive interface, I thought I was back at Uni hacking lex, yacc, and a few other awkward 3-letter acronyms to try and write a compiler. Surely elephants can be trained using human language?) One of my biggest concerns has always been the volume of data to be communicated between the Big Data residing on the HDFS in the cloud and the traditional BI systems in the organisation where the business users do analysis and reporting. But that is where the “map” and “reduce” come in. A crucial step in the data pipeline is to code/decode the data, and filter and condense it to a transferable, analysable subset.
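For readers who have not seen the pattern, the filter-and-condense idea can be sketched in a few lines of plain Python. This is not the Hadoop API and not Oracle's demo. The product names, word lists and record layout (user, tab, text) are all made up for illustration. The "map" step decodes each raw record and drops irrelevant ones, and the "reduce" step condenses what is left into one small summary per product.

```python
from collections import defaultdict

# Hypothetical vocabulary, for illustration only.
PRODUCTS = {"cardx", "loany"}
POSITIVE = {"love", "great"}
NEGATIVE = {"hate", "slow"}

def map_record(line):
    """Map: decode one raw 'user<TAB>text' record and emit (product, score)
    pairs. Records that mention no product emit nothing (the filter)."""
    _user, _, text = line.partition("\t")
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    for w in words:
        if w in PRODUCTS:
            yield w, score

def reduce_scores(product, scores):
    """Reduce: condense all scores for one product into a small summary."""
    return product, {"mentions": len(scores), "net_sentiment": sum(scores)}

def run(lines):
    """Group mapped pairs by key (the 'shuffle'), then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return dict(reduce_scores(k, v) for k, v in groups.items())

raw = [
    "ann\tI love cardx it is great",
    "bob\tcardx support is slow",
    "cat\tlunch was great",
    "dan\tI hate loany fees",
]
summary = run(raw)
print(summary)
```

Only the per-product summary has to travel from the cluster to the BI system, not the raw feed, which is what eased my concern about data volumes.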

What really impressed me was the power and ease of use of Endeca. For me Endeca falls into the category of data exploration and data analysis tools, up there with Tableau and QlikView, but instead of focussing on visualisation, it focusses on search and association – not saying that it doesn’t have powerful visualisations too. (In fact, it even supports that most dreadful of all dreaded visualisations – 3D graphs. When will vendors learn that the only use for a 3D graph is on a book or magazine cover, or maybe on a PowerPoint title slide?) Endeca has a very powerful Google-like search capability and it can be used to analyse a combination of structured and unstructured data with equal ease and power, so together with its powerful indexing facilities, it’s very, very useful in the Big Data environment.

Through this session Oracle demonstrated three things: Big Data has become an important business resource, it is very practically implementable, and they (Oracle) are a serious player in the field. The demonstrations and real-life case studies were very business focussed, and you could clearly see the business value that can be obtained by acquiring and properly analysing Big Data. Equally important, though, is that you need to get your infrastructures spot-on, as well as the tools used on each, and very crucially, the data flows between them.

Big Data implementations are maturing very fast. Although Big Data may not be relevant for each and every organisation right now, it is becoming more and more relevant as a strategic information-related resource. In my opinion, organisations will have to regularly evaluate whether it can give them a business benefit that outweighs the costs, efforts and implications involved in its implementation. With such a fast-developing field, these strategic evaluations have to be made quite frequently.
