Here's another deep, thought-provoking post from Andrew Walker. I recommend taking a few minutes to read this article, because he explains how social graph data can help businesses make better decisions.


Predicting the future is hard!  Why?  Because we’re living on a ball of rocks, liquids and gasses, spinning at (depending on your latitude) between 700 and 1,000 miles per hour and moving around the sun at around 67,000 miles an hour in a vast cosmos of other large things spinning and travelling very fast.

It’s a world of overlapping ecosystems where instinct, politics, economics and technology bring added predictive complexity into the organic systems all around us. All this is occurring at a macro level whilst, sub-atomically, everything fluxes and decays in (almost) impossible-to-measure increments, where time flows erratically and concepts like up and down simply don’t apply.  Predictions are therefore, as journalists say, “a mug’s game”.

That’s all perfectly natural.  So is weighing slightly less at the equator than at the North Pole.   How these systems affect our future is a data capture problem because there are so many variables applying to everything that it’s very difficult to get enough information to predict events reliably.

Consider the economic planning for a company based in Milan, supplying designer plastic furniture to exclusive outlets in London, New York and Paris.  Their key variables include low-price competition from unlicensed Chinese competitors, oil prices, exchange rates, EU/US import legislation, the US election, the Greek debt crisis, Hurricane Sandy, the building of trendy new loft apartments and the disposable incomes of the target market.

That’s hard to predict, but economic data helps fix it.   Our trendy Italian furniture company also needs to grasp the complexities of local fashion trends.  That’s even harder to predict.  Could social data fix that?

The answer, ironically, is probably, but I need more data to be certain.  But here’s one useful insight.  Back in 2009 researchers at the University of Tokyo studied how frequently parliamentary electoral candidates were mentioned on Facebook and found the more mentions they received, the higher the likelihood they would be elected.

In 2010 at Tweetminster we conducted a similar study using Twitter mentions as the data source.  We predicted the (highly unusual) hung parliament outcome weeks before anyone else, with a marginally higher level of accuracy than the polling company YouGov and a marginally lower level of accuracy than ComRes and Ipsos MORI.

The difference was that we predicted it without asking anyone a question.  We merely examined the ‘social graph’, a phrase popularised by Mark Zuckerberg to describe the analysis of socially generated data on sites like Facebook, i.e. ‘Likes’ and ‘I’m having soup for lunch’ user-generated posts.

Our study showed two things.  Firstly, we noticed that small samples, i.e. seats where rival candidates only got a few hundred mentions, were hard to predict and achieved about a 64% accuracy rating, which isn’t much better than making a guess.  However, when we aggregated all the scores together and compared the parties on a data sample of over 120,000 mentions, we were 92% accurate, which is getting into crystal ball territory.  This places a value on large sample data analysis or, to put it another way, the god of maths doesn’t care about details; he/she is a big-picture person.
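To make the sample-size point concrete, here is a minimal Python sketch. It is not the method used in the Tweetminster study, and it does not reproduce the 64% or 92% figures. It only applies the textbook margin-of-error formula for an estimated share, under the (strong, and for social data probably unrealistic) assumption that mentions behave like independent random draws. The function name `margin_of_error` and the sample size of 300 are illustrative choices.

```python
import math

def margin_of_error(n, share=0.5, z=1.96):
    """Approximate 95% margin of error for a share estimated from n mentions.

    Assumes independent random draws (a simplification for social data).
    share=0.5 is the worst case; z=1.96 is the usual 95% normal quantile.
    """
    return z * math.sqrt(share * (1 - share) / n)

# A seat with a few hundred mentions versus the aggregated national sample.
small_seat = margin_of_error(300)
national = margin_of_error(120_000)

print(f"300 mentions:     +/- {small_seat:.1%}")
print(f"120,000 mentions: +/- {national:.1%}")
```

Under these assumptions, the uncertainty on a single seat is wide enough to swallow a close race, while the aggregated sample is roughly twenty times tighter.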

Get a big enough sample and data analysis renders the old school of theoretical modelling obsolete, i.e. you don’t need to understand how theoretical people will behave if you have enough data to show you what actual people do. This doesn’t just apply to people.  Large data analysis predicted the extent of the flooding from Hurricane Sandy.  It has also predicted the presence of undiscovered creatures in ocean food chains before a live specimen has been physically discovered.  It works.

The second thing we learned was that data tells us something about the world that we can only learn from data.  Before the US election a few weeks back, a civil service friend asked me who was going to win.  He was uncertain because the press coverage showed a close-run contest that was “impossible to call”.  I told him the social graph mentions predicted Obama.  He asked me if the mentions were positive or negative and I told him it didn’t matter.  In an election the most popular candidate is normally also the most unpopular (with his or her opponent’s supporters, obviously), so they get lots of positive and negative mentions.

The emotion of the mention is irrelevant; a higher sum of mentions predicts a win.  It’s that simple.  That kind of meta-perception, understanding that it doesn’t matter whether comments are good or bad and that it’s the total number of comments that counts, can only come from large sample data analysis.
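The counting rule described above can be sketched in a few lines of Python. This is an illustration only, not Tweetminster's actual pipeline: the function name `predict_winner`, the `(candidate, sentiment)` data shape and the sample mentions are all made up for the example.

```python
from collections import Counter

def predict_winner(mentions):
    """Return the most-mentioned candidate and the per-candidate counts.

    `mentions` is an iterable of (candidate, sentiment) pairs. The
    sentiment label is deliberately ignored: only volume is counted.
    """
    counts = Counter(candidate for candidate, _sentiment in mentions)
    # most_common(1) returns a one-item list: [(candidate, count)]
    winner, _count = counts.most_common(1)[0]
    return winner, counts

# Hypothetical data: candidate "A" attracts more praise AND more abuse.
sample_mentions = [
    ("A", "positive"),
    ("A", "negative"),
    ("B", "positive"),
    ("A", "negative"),
]

winner, counts = predict_winner(sample_mentions)
print(winner, dict(counts))
```

Note that candidate "A" comes out ahead even though most of those mentions are negative, which is exactly the point: volume, not sentiment, drives the prediction.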

Right now I’m in Brazil for Fund Forum Latin America and keenly aware that the Financial Services industry is expert at data analysis and predictions.  But here’s the problem – there’s no social graph data.  The data that describes economic forecasts, stock trades, currency movement, custody operations and the impact of regulatory activity is inherently mechanical and transactional, but the people who affect it are organic and social.

There are social data gaps in the big financial data picture.  If the current data sets work okay, imagine how much better they would work with millions of relevant people’s opinions attached to them as a predictive data set. Where and how to invest based on more insight… that’s a no-brainer, right?

Here’s an analogy that makes sense to me.  Yesterday I was at the Brazilian F1 Grand Prix.  Jenson Button won the race and Sebastian Vettel became world champion. The mechanical data predicted they should get the biggest cheers from the fans, based on the mechanics of points scored and positions won (the way financial services use data).  Who got the biggest cheers at the end?  Massa.  Why?  The social graph data explains it… he, like the majority of the fans, is Brazilian.



About the author

Andrew Walker is an entrepreneur, inbound marketing consultant and tech wizard with a long history of creating websites, apps, data & strategy. He is also a regular speaker on the social media, finance and technology conference circuit. Over the last 14 years he has worked at director level, specializing in tackling complex briefs, designing campaigns and deploying new technologies for commercial brands, NGOs and government agencies.

In 2008 he co-founded Tweetminster, a successful social media technology start-up that creates outstanding news content for websites and apps through data mining. 

Current role(s): Co-founder & CIO Tweetminster