How sites find out all about you
When you browse the web, how do networks find your personal information and bring you custom adverts in milliseconds? Last month, I attended the San Jose conference NoSQL Now! to find out.
The 1980s saw explosive growth in relational databases that use tables and SQL (Structured Query Language) to store and retrieve data. For example, you could store names and addresses in a table called PEOPLE and then list the people in Mountain View's 94043 ZIP code with a SQL query like "SELECT NAME FROM PEOPLE WHERE ZIP=94043".
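To make the idea concrete, here is a minimal sketch of that kind of lookup using Python's built-in sqlite3 module. The table name, column names and sample rows are invented for illustration; they do not come from any real database.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, address TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Alice", "1 Example St", "94043"),
        ("Bob", "2 Sample Ave", "94040"),
        ("Carol", "3 Demo Rd", "94043"),
    ],
)


def names_in_zip(zip_code):
    """Return the names of everyone in the given ZIP code, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM people WHERE zip = ? ORDER BY name", (zip_code,)
    )
    return [row[0] for row in rows]


print(names_in_zip("94043"))
```

Every row has the same fixed columns, which is what makes this kind of query simple to write and what makes less uniform data harder to fit.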
Today, websites like Facebook, with nearly a billion active users, store all kinds of data: videos, schedules, messages, favorites and friends. Relational databases can be slow at handling such varied data on this scale. To serve millions of users quickly, three main types of NoSQL (No SQL or Not only SQL) databases are rapidly gaining acceptance. The first is a key-value database that stores each piece of data under a unique key. It is like having a filing cabinet where folders can contain anything and you find items by looking at the folder tag. The second type is a document-oriented database. This stores, manipulates and retrieves information like web pages, word-processor files and publications. The third type is a graph database that stores information like networks, relationships and road maps.
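The three models can be loosely imitated with ordinary Python data structures. This is only a sketch of the shapes of the data, not how any real NoSQL product works; every key, document and name below is made up.

```python
# 1. Key-value: the store does not care what the value is; you fetch it by key,
#    like reading the tag on a folder in a filing cabinet.
kv_store = {"session:42": "any blob of data at all"}

# 2. Document: each record is a self-describing document whose fields can be searched.
documents = {
    "post-1": {"title": "Hello", "tags": ["intro"], "body": "First post"},
    "post-2": {"title": "NoSQL", "tags": ["databases", "intro"], "body": "Second post"},
}


def titles_with_tag(tag):
    """Return the titles of all documents carrying the given tag, sorted."""
    return sorted(doc["title"] for doc in documents.values() if tag in doc["tags"])


# 3. Graph: records are nodes, and the links between them are the interesting part.
friends = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob"},
}


def friends_of_friends(person):
    """Return people two links away who are not already direct friends."""
    direct = friends[person]
    reachable = set()
    for friend in direct:
        reachable |= friends[friend]
    return reachable - direct - {person}


print(kv_store["session:42"])
print(titles_with_tag("intro"))
print(friends_of_friends("ann"))
```

The key-value lookup never inspects the value, the document query looks inside each record, and the graph query follows links from one record to another.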
In February 2011, a key-value database company, Membase, merged with a document database company, CouchOne, to form Couchbase, a Mountain View company. "Over 50 percent of our customers run at least some of their applications using Couchbase on Amazon's Web Services," says Bob Wiederhold, CEO of Couchbase. Others install a free version of Couchbase on their own servers, paying for larger systems and support.
More than 50 social gaming companies, including Zynga, Electronic Arts and Disney, use Couchbase. Cloud-based enterprise software companies and e-commerce companies also use Couchbase. Advertising networks typically store user profiles in Couchbase. User profiles used to be stored in a small cookie file on your computer, but now cookies link to larger files stored in the cloud. When you browse the web, an ad network may look up your profile containing age, gender, likes, education level, spending patterns, occupation and more. The network matches your profile to an advertiser and shows you a custom advertisement in less than 40 milliseconds. According to Wikipedia, the average time to blink your eye is 100-400 milliseconds. Ads are displayed more quickly than you can blink.
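A toy version of that lookup-and-match step might look like the sketch below. It assumes the cookie holds only an ID that acts as the key to a profile stored elsewhere, and it picks the advertiser whose targets overlap most with the profile's likes. The cookie IDs, profile fields, advertiser names and matching rule are all hypothetical; real ad networks are far more elaborate, and this is not Couchbase's API.

```python
import time

# Hypothetical in-memory stand-ins for a profile store and an ad inventory.
# All IDs, fields and advertiser names are invented for illustration.
profiles = {
    "cookie-abc": {"age": 34, "likes": {"games", "travel", "hotels"}},
    "cookie-xyz": {"age": 52, "likes": {"gardening"}},
}

campaigns = [
    {"advertiser": "GameCo", "targets": {"games"}},
    {"advertiser": "TravelCo", "targets": {"travel", "hotels"}},
]

BUDGET_MS = 40  # the time budget quoted in the article


def pick_ad(cookie_id):
    """Return the advertiser whose targets overlap most with the profile's likes, or None."""
    profile = profiles.get(cookie_id)  # key-value lookup by cookie ID
    if profile is None:
        return None
    best = max(campaigns, key=lambda c: len(c["targets"] & profile["likes"]))
    if not best["targets"] & profile["likes"]:
        return None  # no campaign matches this profile at all
    return best["advertiser"]


def pick_ad_timed(cookie_id):
    """Run pick_ad and also report the elapsed time in milliseconds."""
    start = time.perf_counter()
    ad = pick_ad(cookie_id)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return ad, elapsed_ms


ad, elapsed_ms = pick_ad_timed("cookie-abc")
status = "within budget" if elapsed_ms < BUDGET_MS else "over budget"
print(ad, f"{elapsed_ms:.3f} ms", status)
```

The point of the key-value lookup is that finding one profile by its key takes roughly the same time however many profiles are stored, which is what makes a fixed budget of a few tens of milliseconds plausible.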
Oracle, a company with $35.6 billion in revenue in 2011, grew by adding applications and infrastructure to its relational database foundation. At this stage, Couchbase has enough to do without developing its own applications. Still, there's a huge growth opportunity. Couchbase has raised about $30 million from Accel, Ignition, Mayfield, North Bridge and Redpoint. Couchbase is racing against other open source NoSQL databases, like MongoDB, to build market share by courting developers. The CouchConf developer conference took place on Sept. 21 in San Francisco.
According to database pioneer Michael Stonebraker, who presented at a NewSQL Meetup held recently at Intuit, Oracle has 4 million lines of database code and an expensive direct sales model. By contrast, open source databases, where a customer can try out a free version and users support each other, have much lower sales costs. Stonebraker explained how he is developing a NewSQL database, VoltDB, which he says is better suited to transaction processing applications than NoSQL databases are. He claimed that, in one test, VoltDB ran 70 times faster than Oracle.
Relational databases gave us a world in which information was collected on paper and online forms. Now NoSQL and NewSQL databases enable organizations to engage us online, whether we are playing games, sharing with friends or buying goods.
Angela Hey advises technology companies on marketing and business development. She can be reached at email@example.com.