The Data Driven Web (or Why What People Say and What People Do Are Different Things)

A few weeks ago, I had the opportunity to attend the always awesome SF Music Tech Conference.

More about the music industry (and why it’s really in trouble) in a future post, but for today I wanted to riff on something Steve Jang of Soundtracking (previously Imeem) said during his panel on Location Check-ins.

For background, Imeem was one of the early players in the music discovery space. Users would sign up, passively track the music they were listening to, and create playlists to share with other users, with the end goal of discovering new music to listen to.

As part of the panel, Steve explained the real issue with explicitly selected interests:

“Users would sign-up and create these amazing aspirational playlists – lots of independent music and off the beaten track bands.  But then we would watch what these users were playing in the background and it was usually just Madonna or the new Snow Patrol album on repeat.”

The problem:

  • Interests change over time (both in degree and type)
  • Consumers are actually really bad at selecting what they like
  • There’s no context for that interest (neither its degree nor its specific type)
  • Users have a view of themselves that may not be entirely true
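To make the gap concrete, here is a minimal sketch of scoring interest from passive listening data instead of stated preferences. Everything in it is an assumption for illustration: the `implicit_interest` function, the 30-day half-life, and the sample data are all hypothetical, and this is not how Imeem actually did it.

```python
from collections import defaultdict


def implicit_interest(plays, now, half_life_days=30.0):
    """Score each artist from passive play events, weighting recent plays more.

    plays: iterable of (artist, day_played) pairs; day numbers are arbitrary.
    A play loses half its weight every `half_life_days` (an assumed value).
    """
    scores = defaultdict(float)
    for artist, day in plays:
        age_in_days = now - day
        scores[artist] += 0.5 ** (age_in_days / half_life_days)
    return dict(scores)


# Hypothetical user: what they say they like vs. what they actually play.
explicit_likes = {"Obscure Indie Band"}
plays = [
    ("Madonna", 100),
    ("Madonna", 99),
    ("Snow Patrol", 98),
    ("Obscure Indie Band", 10),  # played once, three months ago
]

scores = implicit_interest(plays, now=100)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The decay handles "interests change over time," and the play counts handle degree. The explicit like still shows up, just with the small weight the behavior suggests it deserves.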

Let’s take my friend Noelle as an example:

These are Noelle’s “Likes on Facebook.”

As a business, simply leveraging her “Likes” as her interest graph would be almost 100% useless. The data is incomplete (I know she likes a few things, but is that everything?), it has no context (what does she like about Hoodie Allen – his music, or is he just a friend?), and some of it is way out of date (she actually now hates Judy Blume).

Combine this with other explicit “likes” such as check-ins and links shared on Twitter and businesses have enough “interest” data about a user to ensure that they can market the exact wrong product or message to them.

Taking a step back: in the real world, a person is characterized by how they dress and behave, who they spend their time with, and their actions. They may explicitly “like” an activity such as surfing by telling people, but that claim is kept in check by their actual activities. (I may “like” surfing, but if I don’t surf and don’t want to hang out with people who surf, it will slowly become a downplayed part of my personality, and others will understand the context of my interest in surfing.)

In the online world, nothing anchors explicit “likes” to reality. Take Noelle from the example above: her “liking” of Taylor Swift is more of an inside joke than anything, something her real-world friends understand and have context for. Yet brands and companies online take it as a signal of where her interests fall.

The next generation of the web will be focused on bridging the gap between the online world and reality and, in the process, creating the platform for the first generation of truly social experiences online.

Why now?

  • Large scale compute power is exceptionally cheap
  • The rise of graph databases and next-generation big data analytics tools (Next Gen SAS, SPSS)
  • Growth of large scale sensor networks (driven by mass adoption of smartphones globally)
  • Consumer desire for (and trust in) passive data tracking when it adds value
  • A truly personal computer in the mobile phone, with other “dumb machines” connected to it via the cloud

The Quantified Self movement has signaled the start of what is possible as users want to track their information online. Products and services such as Fitbit, Audioscrobbler and Zeo enable users to easily and passively track their activities and store that data online for later consumption.

This trend will only continue as mobile phones become more prevalent and more open, and as other records, such as receipts and call records, become open and digital. (Just think about the value in understanding how often, and in what pattern, I communicate with my friends and family – bringing context to a social relationship.)

On top of that – technology has finally reached a point where this type of large scale data processing is not only possible, but reaching mainstream acceptance.  Check out Google’s Pregel and think about the possibility for understanding both relationships and interests at scale with billions of pieces of passively collected data.
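For a feel of the Pregel model, here is a toy, single-machine sketch of its "think like a vertex" style: in each round (a superstep), every vertex reads its incoming messages, possibly updates its own value, and messages its neighbors. This example finds groups of connected people by having each vertex adopt the smallest label it hears. The function name and the sample graph are hypothetical, and this is not Google's API, only an illustration of the idea.

```python
def pregel_min_label(edges):
    """Toy vertex-centric loop: each vertex adopts the smallest label it hears.

    When no messages are left, vertices sharing a label are connected.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Every vertex starts out labeled with its own name.
    label = {v: v for v in neighbors}
    # Superstep 0: every vertex sends its label to all of its neighbors.
    inbox = {v: [label[n] for n in neighbors[v]] for v in neighbors}

    # Keep running supersteps while any vertex still has mail.
    while any(inbox.values()):
        outbox = {v: [] for v in neighbors}
        for v, messages in inbox.items():
            if messages and min(messages) < label[v]:
                label[v] = min(messages)
                for n in neighbors[v]:
                    outbox[n].append(label[v])
            # Otherwise the vertex stays silent (it "votes to halt").
        inbox = outbox
    return label


# Hypothetical "who talks to whom" graph.
edges = [("adam", "noelle"), ("noelle", "steve"), ("x", "y")]
groups = pregel_min_label(edges)
print(groups)
```

The appeal of the model is that each vertex only ever looks at its own messages, which is what lets a real system spread billions of vertices across many machines.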

Are we there today? No – we still have a long way to go before passive data tracking is mainstream.

However, think about a world where developers have this collection of data to tap into for recommendations and for a true social graph – a world with true social applications, built on top of a trusted personal data platform.

That’s the world I’m really excited for.

Published by

adam

I work for True Ventures, an early-stage venture capital fund with offices in San Francisco and Palo Alto. We partner with promising entrepreneurs at the earliest stages in the technology market providing hands-on management support to guide our portfolio companies through the challenges of early growth.