"Tell me, what would you do if you had 1,000 times more data?"
I still remember, like it was yesterday, reading this profound question on December 24, 2007, in BusinessWeek's cover article on Google and cloud computing. It was an interview question, posed to job applicants by then senior Google engineer Christophe Bisciglia (later founder of red-hot cloud software company Cloudera and, more recently, WibiData).
For the first time, I began to think through what the advent of the information explosion (now commonly referred to as “Big Data”) really meant for both individuals and corporations. How would this onslaught of information affect decision-making across a range of industries? What were the practical implications for businesses and executives? And what were the resulting investment opportunities?
That spark nearly five years ago led to an investment thesis here at Flybridge, where we have looked to invest behind Big Data applications and uses across a number of vertical industries. At the start of my career as an enterprise software product manager, I was trained to think about the impact of disruptive, horizontal technologies and innovation on vertical industries. Hence, when Christina Cacioppo of Union Square Ventures wrote her excellent wrap-up of last year’s Techstars class ("What Comes Next"), my first thought was to wonder why more entrepreneurs aren’t going vertical – that is, why they aren’t more focused on solving business problems for large vertical industries?
Fortunately, we have been able to find some extraordinarily talented entrepreneurs who have thought deeply about this issue of Big Data’s impact on vertical industries. For example, in the world of consumer lending, Douglas Merrill and Shawn Budde at ZestCash are using the vast array of signals across online and offline data sources to determine whether an individual consumer should receive a loan. The young company was recently in the news for the release of its latest underwriting algorithm. Since when does a company get big-time TechCrunch coverage for releasing a new algorithm?
Another entrepreneurial team pursuing applications of Big Data is Mike Baker and Bill Simmons of DataXu. The company uses a Big Data approach to determine which advertisement to show a particular user at a particular moment across all digital channels – display, mobile, video and social. The company processes 20,000 third-party data segments and evaluates billions of impressions each month.
Most recently, we announced a new investment in a Big Data company out of Israel called tracx. The company has been inhaling all of the social media data exhaust out of Twitter, Facebook and other sources to give brands insight into consumer sentiment and to provide a platform for targeted engagement and campaign management.
Finally, we have made an unannounced seed investment in a healthcare company that applies Big Data techniques to help payors reduce healthcare costs and more accurately account for revenue.
One of the common threads and underlying skills required across these Big Data investments is Machine Learning. This is a cool artificial intelligence-based technique for developing computer systems that learn and evolve based on experience. Each of these companies has based its intellectual property on sophisticated machine learning techniques developed by dozens of PhDs. It's as if the machines have been in training all their lives to adapt and make use of the Big Data now being thrown at them: the combination of Moore's Law, the cloud and Machine Learning finally makes it all possible. The data available to each of us has probably grown 1,000x since Bisciglia's question, and another 1,000x is coming for us all in the next few years.
So while McKinsey raises the alarm (in a very nice report) that one implication of Big Data will be a massive shortage of trained data scientists (they estimate the US alone will need 1.5 million more by 2018), I would argue that for you young graduates who want to build a future in the Big Data Era, I have one word for you (OK, two): "Machine Learning."