by Angela Guess
David Gewirtz recently asked in ZDNet, “What is big data? The answer, like most in tech, depends on your perspective. Here’s a good way to think of it. Big data is data that’s too big for traditional data management to handle. Big, of course, is also subjective. That’s why we’ll describe it according to three vectors: volume, velocity, and variety — the three Vs. Volume is the V most associated with big data because, well, volume can be big. What we’re talking about here is quantities of data that reach almost incomprehensible proportions. Facebook, for example, stores photographs. That statement doesn’t begin to boggle the mind until you start to realize that Facebook has more users than China has people. Each of those users has stored a whole lot of photographs. Facebook is storing roughly 250 billion images.”
He continues, “250 billion images may seem like a lot. But if you want your mind blown, consider this: Facebook users upload more than 900 million photos a day. A day. So that 250 billion number from last year will seem like a drop in the bucket in a few months. Velocity is the measure of how fast the data is coming in. Facebook has to handle a tsunami of photographs every day. It has to ingest it all, process it, file it, and somehow, later, be able to retrieve it. Here’s another example. Let’s say you’re running a presidential campaign and you want to know how the folks ‘out there’ are feeling about your candidate right now. How would you do it? One way would be to license some Twitter data from Gnip (recently acquired by Twitter) to grab a constant stream of tweets, and subject them to sentiment analysis.”
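The scale claim above is easy to sanity-check with back-of-envelope arithmetic. A quick sketch, using only the two figures quoted in the article (900 million uploads a day, 250 billion stored):

```python
# Back-of-envelope check on the growth claim from the article.
DAILY_UPLOADS = 900_000_000      # photos uploaded per day (quoted figure)
STORED_TOTAL = 250_000_000_000   # images stored so far (quoted figure)

uploads_per_year = DAILY_UPLOADS * 365
days_to_match_total = STORED_TOTAL / DAILY_UPLOADS

print(f"Uploads per year: {uploads_per_year:,}")                   # 328,500,000,000
print(f"Days to add another 250 billion: {days_to_match_total:.0f}")  # 278
```

In other words, at that velocity Facebook takes in more images in a single year than its entire historical archive, which is exactly why the 250 billion figure quickly becomes "a drop in the bucket."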
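The campaign scenario Gewirtz describes — stream tweets, score each one for sentiment — can be sketched in miniature. This is not the Gnip/Twitter API (that requires a commercial license and is not shown in the article); the tweets and word lists below are invented purely to illustrate the simplest lexicon-based form of sentiment analysis.

```python
# Toy lexicon-based sentiment scoring over a (hypothetical) tweet stream.
# Word lists and sample tweets are made up for illustration.
POSITIVE = {"great", "love", "strong", "win"}
NEGATIVE = {"bad", "hate", "weak", "lose"}

def sentiment(tweet: str) -> int:
    """Score a tweet: +1 for each positive word, -1 for each negative word."""
    words = tweet.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# In a real pipeline these would arrive as a continuous stream.
tweets = [
    "I love this candidate, great debate performance",
    "weak answers tonight, bad night for the campaign",
]
print([sentiment(t) for t in tweets])  # [2, -2]
```

Production systems replace the word lists with trained models, but the shape of the computation — ingest a high-velocity stream, score each item, aggregate — is the same.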
Photo credit: Flickr/ Leo Reynolds