Earlier this week my eye was caught by an email I received entitled “Eight New Tech Job Titles”. One of the most unusual job titles I have come across in my career is “Chief Monster”, which was used for a while by Jeff Taylor, the founder of monster.com. Although I believe that title was unique, I do tend to keep an eye on what role titles are emerging in the corporate world, and so of course I clicked on the link. There was nothing particularly startling in the eight titles listed, but the “Chief Analytics / Data / Science Officer” entry did catch my eye, not least because I had recently, for the first time, encountered someone who carried the title “Chief Data Scientist”. The chap worked for a large US retailer and was presenting on a webcast about “big data”, on which he had some very interesting views. I would share the link to the webcast, but it is behind a subscription service paywall.
However, first it is perhaps worth defining the term “big data”. In general it seems accepted to refer to the exponential growth and availability of structured and unstructured data, a key dynamic of the digital age. People typically refine that broad definition by referencing some concepts I believe were first articulated by Doug Laney in 2001. He defined “big data” in terms of three characteristics: volume, velocity and variety. I recently came across an excellent IBM infographic entitled “The Four Vs of Big Data”, in which they had added a fourth, veracity. The infographic is an excellent summary and it would be foolish of me to try to restate it here. I did register that it contains an arresting statement: IBM believes that by 2015 there will be 4.4m new IT jobs created in the big data field. Note to self: can this old dog learn some new tricks to reinvent himself?
So you can imagine that on coming across my first Chief Data Scientist I had a number of questions to pose to him. I’m sure you will all have seen the various statistics about the exponential generation of data: that 30 billion pieces of content are shared on Facebook every month, that there are 6 billion mobile phones in use, that by 2016 there are forecast to be 18.9 billion network connections, or that an estimated 2.3 trillion gigabytes of data are created each day. Sadly I didn’t get to pose any of my (in my view!) insightful questions; however, others admirably stepped into the breach.
The first question put to him came from a CIO musing on the number of “challenging” data warehouse projects in her past (snap!), and it focused on his approach to handling the complexity implicit in the big data arena. The data arrives at speed from multiple sources, both structured and unstructured, and to be of value the data sets must be processed (linked, matched and cleansed at a minimum) before you can start to meaningfully connect and correlate relationships, turning data into information and information into insight. I liked his initial answer: “Frankly if it was at all easy no one would be interested in paying me to hold a role with such a fancy job title!” I liked his second point even more, which was: “It is easy to get overexcited about the neat new analytic tools, how much processing power you need or whether you can leverage cloud-based analytic engines. What is absolutely critical is domain knowledge, you have to understand the business context in which the data is created and in which it is being interpreted to create business insight and ultimately competitive advantage.” However, driven by the questions being posed, he did then proceed to talk at length about technology tools, at which point I will confess to losing interest quite quickly.
Of course, what was extremely familiar was the message about needing to use the power of the technology to create a context for the data, whilst taking due note of the implications of the “four Vs” so well articulated by the IBM infographic. This is the core message that CIOs and their teams hear all the time. It is those who internalise and act upon it who typically become the success stories, with their technology capability seen as an innovation engine that gives the business competitive advantage. To get a sense of the size of the prize around big data, along with some case studies of success stories, I recommend reading the report “Big Data In Big Companies” by Thomas Davenport and Jill Dyche. To reflect on how long it takes for technology trends to become deployed innovations in the business world, I suggest reading the 2011 McKinsey report “Big Data The Next Frontier For Innovation, Competition & Productivity”. I remember reading this report in late 2011 when preparing a presentation on the Internet of Things and pondering whether big data would be more than hype by 2015; I think we can now declare that it is.