The emergence of “big data” could transform your global business, provided you know how to exploit it in practice and have the right people to find the patterns of opportunity in all that information. That was the message that John Fritz, director of Strategic Software Alliances at Advanced Micro Devices, delivered at the EuroFinance conference in Miami recently.
What is big data? Fritz relies on the definition in Wikipedia: “A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing.” It has resulted from the falling cost of electronic storage, he said. “In the past, large companies might have kept data for only six months. Every six months it downloaded its data onto tapes and warehoused them. Now, with new technology, data can be kept much longer.” He said companies can now analyze all the relevant “hot” data they have ever gathered.
The problem is that finding useful insights in all that data can seem impossible. “Big data is a universe of a lot of things going on,” Fritz said. The phenomenon was first tackled in 2004, when Google created the MapReduce programming model for processing large data sets with a parallel, distributed algorithm. That led to Hadoop, open-source software that supports data-intensive distributed applications. Armed with these new tools, data scientists could begin to manipulate mountains of data, testing different data sets against one another to find new insights. (“A data scientist,” Fritz said, “is a statistician who makes a lot more money.”)
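To make the MapReduce idea concrete, here is a minimal single-process sketch in Python. It is not Google's or Hadoop's implementation: real systems spread the map and reduce steps across many machines, whereas this only shows the three logical stages (map, group by key, reduce). The transaction records and the function names (`map_reduce`, `count_mapper`, `sum_reducer`) are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical toy data: (merchant_id, amount)
transactions = [("m1", 20.0), ("m2", 5.0), ("m1", 7.5), ("m3", 12.0), ("m2", 1.0)]


def count_mapper(record):
    """Emit one (merchant, 1) pair per transaction."""
    merchant, _amount = record
    yield merchant, 1


def sum_reducer(_key, values):
    """Add up the values collected for one key."""
    return sum(values)


counts = map_reduce(transactions, count_mapper, sum_reducer)
print(counts)  # {'m1': 2, 'm2': 2, 'm3': 1}
```

Because each record is mapped independently and each key is reduced independently, both stages can in principle be split across many machines, which is what lets the model scale to very large data sets.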
Suddenly, private and public organizations could tackle big issues by overlaying, filtering and combining massive data sets. Here are three examples:
- Visa Inc. put big data to work to better understand credit card fraud, which costs the organization billions of dollars a year. Visa employed Hadoop to build analytical tools to get a better understanding of fraudsters and their fast-evolving crime. Six cents of every $100 is lost to fraud, so it’s an important pursuit. In the past, an organization like Visa could analyze only 2% of its data. It used to base security assumptions on average fraud rates for individual business sectors. Today, it can analyze the market as a whole, with detail right down to individual merchants’ terminals. By looking at more attributes more effectively, it has uncovered patterns in transactions and their location, average authorization volumes and the frequency of fraudulent purchases. Today, Visa’s data engine can look at as many as 500 aspects of a transaction at once. This analytical muscle has saved Visa US$2 billion a year in potential incremental fraud losses by addressing vulnerabilities before they’re attacked.
- Stanford University wanted to use big data to go after emerging diseases faster. Its researchers consulted the “Gene Expression Omnibus,” a public genomics data repository. They looked at disease genomes, then examined 167 drugs in the Drug Connectivity Map, a directory that links patterns associated with diseases to corresponding patterns produced by drugs and genetic manipulations. They were looking for pattern matches not normally linked to individual diseases. What emerged was an anti-seizure drug that might be of use against inflammatory bowel disease.
- And Big Oil wanted to improve its risk trade-offs, so companies looked at scores of data sets in their portfolios, comprising risk models and geological and engineering data, to determine optimal portfolios. The number of options available in so much data is 10^50, and yet the analysis can be done in minutes.
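The shift described in the Visa example, from sector-wide averages to detail at the level of individual terminals, can be illustrated with a small Python sketch. This is not Visa's method; the records, field layout and the `fraud_rate_by` helper are hypothetical. For scale, six cents lost per $100 corresponds to a fraud rate of 0.06%.

```python
from collections import defaultdict

# Hypothetical toy records: (sector, terminal_id, amount_usd, is_fraud)
transactions = [
    ("grocery", "t1", 100.0, False),
    ("grocery", "t1", 100.0, False),
    ("grocery", "t2", 100.0, True),
    ("grocery", "t2", 100.0, False),
    ("fuel", "t3", 200.0, False),
    ("fuel", "t3", 200.0, False),
]


def fraud_rate_by(records, key_index):
    """Return fraud dollars divided by total dollars for each distinct key."""
    total = defaultdict(float)
    fraud = defaultdict(float)
    for record in records:
        key = record[key_index]
        amount, is_fraud = record[2], record[3]
        total[key] += amount
        if is_fraud:
            fraud[key] += amount
    return {key: fraud[key] / total[key] for key in total}


by_sector = fraud_rate_by(transactions, 0)    # coarse view: one rate per sector
by_terminal = fraud_rate_by(transactions, 1)  # fine view: one rate per terminal

print(by_sector)    # {'grocery': 0.25, 'fuel': 0.0}
print(by_terminal)  # {'t1': 0.0, 't2': 0.5, 't3': 0.0}
```

In this toy data the grocery sector's average hides the fact that all of its fraud comes from a single terminal, which is the kind of detail a sector-level average cannot show.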
How can you put big data to work in your business? The miracle of big data, Fritz said, is that there is no barrier to entry for any organization that wants to slice and recombine its own data, or data pertaining to it and its industry that can be found in the public domain on the Web. Many of your big data opportunities lie in material you already have in-house or can get off the Web for free.
Fritz suggested hiring data analysts who can mine your information for new “data products.” LinkedIn is producing new services based on in-house information from its online members. Walmart is trying to improve its customer reach by looking at data. What is it looking at? The nature of its millions of transactions. “Transactions in the real world are all about behaviour,” he said. “Look for patterns in the transactions.”
Looking at big data in different ways will help you to better understand big problems and how to tackle them faster. Indeed, Fritz said that the cutting edge of information analysis is no longer big data but “fast data”: getting answers from analytics that happen in real time. The next stage will be about predicting behaviour and preparing for it. U.S. President Barack Obama’s second presidential campaign used big data to better understand the candidate’s followers and to predict how they were going to vote.