
Top 10 big data stories of 2013

By Danny Palmer / Published: 16 Dec 2013


Big data. Anyone sick of it yet? Not the technology: the pulling together and cross-matching of disparate data on a massive scale is arguably producing more social, commercial and political change than any other innovation over the last couple of decades. No, we're talking about the term: "big data".

Once a handy collective noun for a bunch of technologies such as Hadoop, NoSQL and distributed computing that evolved in tandem to handle the extreme end of the storage and analytics spectrum, big data has been adopted by an enormous volume and variety of marketing departments seeking to rebadge their products for a big data age, to such an extent that as a descriptive term "big data" is now virtually meaningless.

It has been noticeable that, in contrast to previous years, many of the key industry players we've interviewed during 2013 have displayed a marked reluctance to use the term 'big data' when explaining what they do.

The result of attempts by every man and his elephant to crash the big data party has, inevitably, been a growing despondency and cynicism around the term. This is unfortunate, as many interesting and varied use cases are emerging around these nascent technologies, some of which are featured below, with doubtless many more to come as projects currently in the pilot phase (as, let's face it, most are) come to fruition.

So, perhaps this time next year we won't be featuring the top 10 big data stories at all, instead highlighting advances in real-time or streaming analytics. But for now here they are: the top ten Computing big data stories of 2013.

10. Big data hype manufactured by analysts and media, says SAS CEO Jim Goodnight

One person who was keen to differentiate his organisation from what he views as "big data hype" is SAS CEO Jim Goodnight.

Goodnight argued in October that "big data" is just another buzzword following on from other recent trends in the IT industry, even suggesting that analysts promote them in order to generate business.

"The term big data is being used today because computer analysts and journalists got tired of writing about cloud computing. Before cloud computing it was data warehousing or 'software as a service'. There's a new buzzword every two years and the computer analysts come out with these things so that they will have something to consult about," he said.

9. 'Lump in the learning curve' for big data adoption - Met Office

James Tomkins, the Met Office's portfolio technical lead, has no doubts about the potential of big data technologies to reveal new patterns by analysing large datasets. However, he cautioned in August that moving away from the traditional relational data model to the schemaless NoSQL architecture can be quite a challenge.

"In terms of non-relational data structure, I think there's definitely a lump in the learning curve for people to take their first steps away from the more traditional, more well-known relational data-model handling," Tomkins told Computing.
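The shift Tomkins describes can be sketched with a toy example: the same kind of weather observation held as a fixed relational row versus a schemaless document. The station codes and field names below are invented for illustration, not the Met Office's actual schema.

```python
# Hypothetical illustration of relational vs. schemaless modelling;
# station codes and field names are invented, not a real Met Office schema.

# Relational thinking: every record fits one fixed set of columns.
relational_row = ("EGLL", "2013-08-01T12:00Z", 22.4, 1013.2)  # station, time, temp_c, pressure_hpa

# Document thinking: each record carries its own structure, so a new
# sensor type can add fields without altering every existing record.
doc_basic = {
    "station": "EGLL",
    "time": "2013-08-01T12:00Z",
    "temp_c": 22.4,
    "pressure_hpa": 1013.2,
}
doc_extended = {
    "station": "EGPF",
    "time": "2013-08-01T12:00Z",
    "temp_c": 17.1,
    "wind": {"speed_kt": 12, "direction_deg": 240},  # nested extra data, no schema change
}

# The "lump in the learning curve": code must now tolerate fields that
# may or may not be present, rather than relying on a fixed schema.
for doc in (doc_basic, doc_extended):
    wind = doc.get("wind", {}).get("speed_kt", "n/a")
    print(doc["station"], doc["temp_c"], wind)
```

The flexibility is the appeal, but as the quote suggests, it moves the burden of handling structure from the database into application code.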

However, he went on, despite the difficulties it is a nettle that organisations like his definitely need to grasp.

"I think there will always be a place for relational data, but I think that as we look more and more around the organisation now, we're identifying requirements where, perhaps, we're not modelling and storing and using our data as best we can, because we've had to always stick with the one representation of our data," he said.

8. Research: first steps with big data

Learning to use NoSQL and other technologies is not the only challenge for those interested in big data. In May, research by Computing found that the term 'big data', while overfamiliar to many in IT, was unknown to non-technical board members in 41 per cent of organisations polled.

This lack of understanding presents a serious challenge to those seeking to raise funds for pilot projects, especially since many finance directors will have long memories of previous "game changing" IT interventions that turned out to be anything but. In the minds of these funders, big data is at best a big-ticket endeavour, suitable for only the largest enterprises, and at worst the latest flash in the pan - best ignored or deferred, and certainly not worthy of significant support until it has proved itself over a broader range of scenarios.


7. Big data raises questions over privacy, warn experts


US retail chain Target asked its in-house statistician Andrew Pole to identify life-changing events that it could use to better target customers with offers. In particular, Pole was asked to find out whether customers who had become pregnant could be identified from their shopping patterns.

They could.

Later, an angry father went into a Target store to demand an explanation from the manager about the coupons for baby clothes that his daughter had started to receive from the store. Why, he wanted to know, was the company trying to encourage his daughter to become pregnant?

The manager promised an explanation within the week, but when he called he found the father much more subdued. "I had a talk with my daughter," the father reportedly said. "It turns out there's been some activities in my house I haven't been completely aware of. She's due in August. I owe you an apology."

6. IBM IOD 2013: 20 per cent more big data skills will be required within next five years says IBM

Data is the new crude oil, said general manager of IBM analytics Les Rechan in November, but there are not enough people who know what to do with it.

"In the next five years, you and I will need 20 per cent more skills in this area. In the US alone, that's 150,000. Only one out of every ten organisations has the skills they need to succeed here," he claimed...


So, problems remain in sorting the truth from the snake oil, hiring and retaining skilled practitioners, ensuring privacy is not trampled and setting up meaningful pilot studies. Nevertheless, some firms have been working with big data technologies for years and have reaped the benefits. Computing chatted to a few of them in the last 12 months.

5. Real-time big data analytics is the next big thing, says Walmart security director

Analysing retail transactions in real time can produce many benefits for stores and payment card companies alike, Juan Luis Carselle told Computing in June.

"If you recognised something that means that customers could walk away without merchandise, then it makes a difference [to sales] if you know of the problem straight away," Carselle said.

Another way that transactions can be used to the benefit of supermarkets such as Walmart is to identify how many checkouts are being used in a store at any given time.

"So if a store manager thinks he or she has 25 checkouts working, maybe they only have five - this could be related to staff issues or technical issues, but the main point is that you are using existing data," he said, emphasising that the biggest paradigm shift is that the organisation does not need to store information in the same, structured way that it has in the past.

4. Conservation International uses HP big data analytics to measure animal population decline

Non-profit environmental organisation Conservation International has teamed with HP for a new scheme that will see big data solutions deployed in an effort to protect nature.

"Tropical forests are a vital part of the planet's life-support system - we need them for the air we breathe and to support a diverse and healthy ecosystem for agriculture, medicine and recreation," said CEO Peter Seligmann in December.

"We know that we can't protect what we don't measure, which is why CI is extremely focused on accelerating our research and having the most accurate and current data to ensure that we are doing the very best to safeguard our natural resources," he added.

The project, dubbed "Earth Insights", has been established in order to help Conservation International crunch the large amounts of data collected from environmental analysis of animal populations in the tropical rainforests.

Ultimately the aim is to exploit data collection in order to provide an early warning system about conservation efforts. Data is gathered in real-time from camera traps and climate sensors in 16 countries.

3. A big data case study: Ooyala's real-time video analytics

Ooyala, a video analytics firm, has been doing big data since 2007. An early adopter of the NoSQL database Apache Cassandra and data crunching platform Hadoop, the firm's set-up now includes hot new technologies such as Storm, Spark, Shark and Splunk.

Ooyala provides media firms such as Bloomberg, ESPN, Telegraph Media Group and Yahoo! Japan with "actionable analytics", enabling them to analyse in great detail the way that their video content is being consumed and to optimise its delivery to maximise revenues. The firm's analytics engine processes over two billion analytics events each day, derived from nearly 200 million viewers worldwide who watch video on an Ooyala-powered player.
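For a sense of scale, the figures quoted here work out to a sustained average in the tens of thousands of events per second, though real traffic will peak far higher. This is a back-of-the-envelope calculation from the article's numbers, not a figure Ooyala quotes:

```python
# Back-of-the-envelope scale check using the figures quoted in the article.
events_per_day = 2_000_000_000      # "over two billion analytics events each day"
viewers = 200_000_000               # "nearly 200 million viewers worldwide"
seconds_per_day = 24 * 60 * 60      # 86,400

avg_events_per_second = events_per_day / seconds_per_day
events_per_viewer_per_day = events_per_day / viewers

print(f"sustained average: ~{avg_events_per_second:,.0f} events/sec")  # ~23,148 events/sec
print(f"per viewer: ~{events_per_viewer_per_day:.0f} events/day")      # ~10 events/day
```

Roughly 23,000 events per second, around the clock, is the kind of write load that pushed early adopters like Ooyala towards distributed stores such as Cassandra.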

"Data is core to everything we do," Ooyala's director of platform engineering and operations Peter Bakas told Computing in September, insisting that the company stores all the data it collects, throwing nothing away.

"We store six years' worth of data in our Cassandra cluster, more than any company in the world."

2. A positive reaction: big data technology at the Royal Society of Chemistry

In 1841, when the RSC was founded, its clients and contributors consisted of a small group of gentleman scientists, mostly based in London. Today its materials are used by millions all around the world - researchers, universities, institutions, R&D departments - just last week the organisation added its millionth global user of the educational platform for secondary school students and teachers Learn Chemistry.

David Leeming, strategic innovation group solutions manager, told Computing in November how, like all publishers, the RSC has had to widen its horizons to a huge extent as the digital revolution has taken hold: slicing and dicing its information and making everything searchable, creating products for a new and varied global audience, and speeding up its production processes.

"We're changing the way a publisher works," Leeming said.

1. Database shakedown: Five reasons why there's a revolution going on

Times they are a-changin' in database land, and in June Max Schireson, CEO of NoSQL firm MongoDB (formerly 10gen), popped around to explain why the likes of Oracle and Microsoft may not find that everything goes their way in the coming decade in the way it did over the past two or three. We like a tech story that puts new developments into their historical context, which is why it's number 1.

"The relational database was invented in 1970. You have a technology that was invented over 40 years ago to support accounting-type applications with very regular tabular data, with schema that got updated every couple of years, that were deployed to dozens of users, that ran on a single computer. Now people are trying to use this technology to manage everything from social media to customer contacts and contracts, often being deployed to millions of end users and clusters of machines - and they want to update them every day. If you started off with that goal you never would have designed that [relational] technology, but that's what people are using," Schireson said.

The five reasons why the database scene is changing, as explained by Schireson, were as follows:

Reason 1: The advent of big data
The document-oriented, distributed model is better able to cope with large volumes of disparate data that is changing very rapidly.
Reason 2: Priorities have changed
All technology design involves trade-offs. When storage was expensive data models were designed to minimise the footprint on disk. However, development time is now a much more valuable commodity than storage.
Reason 3: Developers have more power
Schireson believes that the last 10 years have seen a sea change in the way that decisions are made within the IT department.
Reason 4: New applications are better suited to the document model
For new projects, application development is more interactive, pulling in product information, customer information, social media information and more. This sort of application often fits better with the document-oriented model.
Reason 5: People want alternatives
"The database sector is a boring market because it has been so dominated by the biggest players for so long, and they are some of the biggest tech companies out there," said Schireson.
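Schireson's contrast between 1970s tabular design and today's applications is easiest to see side by side: the same customer split across normalised tables versus held as one document. The names and fields below are invented for illustration, not MongoDB's or any real production schema.

```python
# Invented example contrasting relational normalisation with the
# document model Schireson describes; not a real production schema.

# Relational style: one customer spread across three tables, linked by keys.
customers = {1: {"name": "Ada Lovelace"}}
contacts = [{"customer_id": 1, "type": "email", "value": "ada@example.com"}]
contracts = [{"customer_id": 1, "product": "Analytics", "renewed": "2013-06-01"}]

# Document style: the whole customer, contacts and contracts included,
# lives in one record - no joins, and new fields need no schema migration.
customer_doc = {
    "_id": 1,
    "name": "Ada Lovelace",
    "contacts": [{"type": "email", "value": "ada@example.com"}],
    "contracts": [{"product": "Analytics", "renewed": "2013-06-01"}],
    "social": {"twitter": "@ada"},  # added later without touching other records
}

# Reassembling the relational version takes explicit joins:
cid = 1
assembled = {
    "name": customers[cid]["name"],
    "contacts": [c for c in contacts if c["customer_id"] == cid],
    "contracts": [c for c in contracts if c["customer_id"] == cid],
}
print(assembled["name"], len(assembled["contacts"]), len(assembled["contracts"]))
```

Neither model is free: the document version duplicates data if a contract is shared between customers, which is exactly the trade-off Schireson's "priorities have changed" argument says now tends to be worth making.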