Genome is a service designed to let companies deliver highly targeted online advertising and marketing campaigns. It will let advertisers quickly sift through and analyse terabytes of real-time web data collected from Yahoo’s own networks and from those of partners such as AOL.
The service, scheduled to become available in July, will let advertisers mash up their own data with Yahoo’s data and run analytics on the combined data set.
Such instant analysis of real-time big data sets is an emerging trend and something that many companies are moving towards. “It is illustrative of the desire by companies of all sizes to capture, synthesise, analyse and share timely information about user behavior,” said Jeffrey Kaplan, managing director of ThinkStrategies. The goal is to drive better decision-making and create new business opportunities, he said.
Genome is based on technology from interclick, a company that Yahoo acquired last December. At its core is a 20-terabyte in-memory database that pulls in and analyses real-time behavioral and advertising-related data from Yahoo’s multi-petabyte scale Hadoop clusters.
The company is using a blend of proprietary technology and best-of-breed commercial products from vendors such as Netezza and MicroStrategy to do the data analytics on the real-time data, said Michael Katz, CEO of interclick.
“Looking at it through the lens of the business, big data is not just about storing the data,” Katz said. “It’s about capturing data, putting it into the platform, updating it and propagating it out to the server to be able to do targeting against it in real-time. It’s not a trivial task.”
Genome is one of a growing number of services that offer companies a way to do sophisticated analytics with their big data without having to invest in a data analytics infrastructure of their own or to find scarce data scientists to support that infrastructure.
In Yahoo’s case, the service is targeted specifically at online ad targeting. Others are broader in nature.
One example is Google’s BigQuery, launched a few weeks ago, which aims to let enterprises upload their data to Google’s infrastructure and run sophisticated analytics against it. Another is ClearStory, a startup that came out of stealth mode earlier this year. It offers a service that lets companies mash up heterogeneous data from corporate databases, Hadoop environments and public web sources, and then run it through an analytics application.
Other companies with similar offerings include Metamarkets, a venture-funded startup that offers a software-as-a-service (SaaS) solution for big data analytics. The company helps firms analyse clickstream data and other online data and provides visualisation and predictive modeling capabilities for customers.
Another example is WibiData, a startup that offers a service that lets businesses store and analyse vast amounts of customer data from websites, mobile devices and other sources.
“The problem of getting interactive access to ever more challenging data sets has been top-of-mind for several years now,” said Curt Monash, principal at Monash Research. Companies such as Metamarkets and WibiData are addressing that need with their hosted data analytics models, he said.