Big Data vs Monte Carlo: enabling better decision-making

What's best for enabling better project decisions: big data predictive analytics or Monte-Carlo simulation? 

Monte-Carlo simulation is an algorithmic approach to estimating the probable behaviour of a system. It relies on repeated random sampling and produces numerical results.
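As a minimal illustration of the idea (not taken from the talk), the sketch below estimates a total project duration by repeatedly sampling each task from a three-point (optimistic/most likely/pessimistic) estimate and summing the results. The task names and figures are purely illustrative assumptions.

```python
import random

# Illustrative three-point estimates per task, in days:
# (optimistic, most likely, pessimistic)
tasks = {
    "design": (10, 15, 25),
    "build":  (20, 30, 50),
    "test":   (5, 10, 20),
}

def simulate(n=10_000):
    """Run n Monte-Carlo trials of total project duration."""
    totals = []
    for _ in range(n):
        # Sample each task from a triangular distribution and sum the draws
        total = sum(random.triangular(lo, hi, mode)
                    for lo, mode, hi in tasks.values())
        totals.append(total)
    return totals

totals = sorted(simulate())
mean = sum(totals) / len(totals)
p80 = totals[int(0.8 * len(totals))]  # 80th-percentile ("P80") duration
print(f"Mean duration: {mean:.1f} days, P80: {p80:.1f} days")
```

Rather than a single deterministic estimate, the output is a distribution of outcomes, from which confidence levels such as a P80 completion date can be read off.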

I attended an APM event on this topic, presented by Dr Nira Chamberlain, Principal Consultant for Data Science and Mathematical Modelling, Babcock Analytic Solutions. The presenter concluded that both of these tools are useful, provided that best practice is adhered to when they are used. I left with some unanswered questions, particularly as the concept of ‘Big Data’ seemed new to much of the audience.

I was particularly interested as I happen to be about half-way through a book called ‘Big Data’ by Viktor Mayer-Schönberger and Kenneth Cukier. I have so far found it a really interesting read, and felt that the APM presentation could have benefitted from some of the book’s examples of what ‘Big Data’ actually is and how it can be used. For context, some of these examples include:

  • When the H1N1 pandemic struck in 2009, the U.S. Centers for Disease Control and Prevention (CDC) relied upon doctors reporting new flu cases as a means of understanding where the disease had spread. The picture this data gave was always a week or two behind. Around the same time, Google had developed a model which enabled them to look at c.50 million Google search terms. They identified correlations between the frequency of certain search queries and the spread of flu over time and space (using CDC data). They eventually found a combination of 45 search terms that, when used together in a mathematical model, could predict in near-real-time where the flu had spread.
  • Credit card companies process all transaction data and look for anomalies in the data to help identify fraud.
  • Amazon use customer purchase data to identify associations between specific products as a means to make informed recommendations. Netflix do something similar. In both instances, they can tease out valuable correlations without necessarily understanding the underlying causes, the view being that knowing WHAT, not WHY, is good enough.
  • In 2004 Walmart looked back at old transactions and tried to find correlations between specific products, the time of day purchases took place, the weather at the time and so on. Through this they identified that, prior to a hurricane, sales of both torches and "Pop Tarts" increased. They had these products placed at the front of the shops and sales were significantly boosted. This is just one example, but it hopefully highlights the power of the data.

Whilst all of this is quite interesting (at least I think so), the question that I come back to is…

How does this fit within the context of Project Management/Project Controls, and specifically in relation to Monte-Carlo simulation, which the event was all about?

I conclude that Big Data could potentially be used as another source of information to help inform estimates as an INPUT to Monte-Carlo. One of the questions from the audience was also quite thought-provoking: it suggested that perhaps big data could be used to look at the wider environment, political landscape and so on, to identify whether there are any interesting correlations that could have a bearing upon project success or failure.
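One hedged sketch of what "Big Data as an input" might mean in practice: rather than asking an estimator to guess a distribution, historical durations mined from past project records could serve as an empirical distribution, resampled directly (a bootstrap) inside the simulation. The data below is entirely illustrative.

```python
import random

# Illustrative historical task durations (days), imagined as mined from
# past project records - the "big data" input to the simulation.
historical_durations = [12, 14, 15, 15, 18, 22, 25, 31, 40]

def empirical_sample(data, n=10_000):
    """Bootstrap: draw n durations with replacement from observed data."""
    return [random.choice(data) for _ in range(n)]

samples = empirical_sample(historical_durations)
mean = sum(samples) / len(samples)
print(f"Simulated mean duration: {mean:.1f} days")
```

The appeal of this approach is that the input distribution reflects what actually happened on comparable work, skew and outliers included, rather than an assumed shape.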

I'm always keen to hear the views of others on the application of big data in the environments we operate within. Let me know your thoughts.

Credit: Content contribution - Tom Olden, Senior Consultant, BMT