Wednesday, May 29, 2013

Big Data: the Big Picture

In May 2013, I was invited to be part of a panel at DAM NY 2013 talking about "Big Data". I shared some of what we're doing with Big Data at the Associated Press.

In particular, I've worked on an effort to create a digital archive - all of the text, photo, video, graphics and audio content that AP has ever published - which adds up to hundreds of millions of items. We combine that with our rich taxonomy of people, places, companies, organizations and subjects, available to you as the AP Metadata Services. I explained a bit about how we use that archive of content to provide insight and drive further enrichment, all in support of better products and services.

I feel that the term "Big Data" has become a bit of a buzzword that gets attached to a lot of different efforts, and there's some healthy skepticism about the true value of some of what is touted under that heading. However, we have found that there's a lot to be learned from bringing together significant data sets, and that working with large data sets really does require some different techniques.

The World of Big Data in a Single Infographic

The other day, this Wikibon Infographic on Big Data caught my eye. It seems to define the term pretty broadly, but it manages to convey some of the key technologies that underpin the field (not just Hadoop) and why there seems to be such an upsurge in interest (a combination of low scaling costs, maturing tools and larger enterprises with a thirst for data-driven strategies).

Worth a glance.