September 9, 2014 by Bill Johnson
There are curious kids, and then there are curiously curious kids.
At an early age, Gil Elbaz developed a seemingly inexplicable fascination with data, and this being the early 1980s, he couldn’t feed his immense curiosity without some good old-fashioned hard work. He would home in on a question like “What’s the coldest place in the world?” and his parents would pull out a massive reference book where he would dig for the answer. “One of the annoying things about a book is you can’t sort,” Elbaz says. “So you want to find the coldest place, but the only way to do that is flip through all the temperatures in a 100-page book.”
As a result, he acquired a healthy appreciation for spreadsheets, reveling in their neatly organized and easily sorted rows and columns. He still remembers sitting at his early Apple computer, collecting and curating information on any and all subjects. “I just thought: ‘Wow, data is a way to answer really interesting questions,’” Elbaz says.
He cultivated this skill for decades and eventually parlayed a childhood obsession into Applied Semantics, a business he sold to Google for $102 million. The company’s main product was AdSense, a contextual advertising tool that now accounts for some $13.6 billion in revenue for the tech giant.
Now, after serving as a Google engineering director for four years, Elbaz is working on technology he hopes will have similarly far-reaching effects on the way businesses use data. Founded in 2008, his new company, Factual, aims to build the largest set of location-related data in the world. That hasn’t exactly made the business a household name, but while keeping a low profile among consumers, Factual has accrued a long list of big-name customers, including Yelp, Bing, and Samsung, which use Factual’s location data to make their own products more robust and intelligent for users.
In creating this hub of data, Elbaz hopes to breed a new generation of apps that can adapt and react to where a user is in the world. “With location, you can understand people’s patterns, figure out what they like, where they go, what they do, what they’re going to do,” he says. “Every industry is going to have to play this new game, or ultimately disappoint users.”
Factual’s mission is both risky and technologically complex, pitting the company against many other businesses, from Foursquare to that great purveyor of data itself, Google. Both Google and Foursquare offer APIs that give developers access to location data. And yet, developers who use those APIs also run the risk of competing with Google and Foursquare for users and ad dollars. Services like Bing and Yelp, for instance, aren’t likely to get data from Google, because, as Elbaz points out, “Google sees them as a competitor.”
Factual avoids that tension, because it has no consumer product. Instead, Elbaz says, “we want to become the neutral data network that everyone can work with and trust.”
In a way, Factual represents the next stage in the evolution of how tech companies use data. Over the years, companies like Google have developed sophisticated ways of mining data, and smaller developers have been all too happy to use whatever information they were given. And yet, Factual argues that developers shouldn’t need Google to filter and spoon-feed data to them. They should have direct access to it on their own. And so, Factual wants to turn the data itself into the product.
“Everyone’s been talking about coming up with tools to mine the big data,” says Danny Rimer, a partner at Index Ventures, which contributed to Factual’s $25 million funding round, “but we believe another big opportunity will be in datasets themselves.”
A Familiar Opponent
This is not the first time Elbaz has competed head-on with Google. Before launching AdSense, the team at Applied Semantics was working on a meaning-based search engine that searched not only for certain words on a webpage, but also for related words, something Google hadn’t yet mastered in its early days.
The Applied Semantics system would understand, for instance, that if a user searched for vegetarian restaurants, he might also be interested in vegan restaurants, because the two terms are closely related. The technology worked well, but when it came to beating Google, Elbaz admits, “we lost that badly.”
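The idea of matching on related words as well as exact words can be illustrated with a small sketch. This is not Applied Semantics’ actual system, whose internals the article does not describe; the `RELATED_TERMS` map and the `expand_query` and `matches` functions are hypothetical names invented here, and the map is hand-written where a real system would presumably derive relationships from a far larger vocabulary.

```python
# Hypothetical hand-built map of related terms, for illustration only.
RELATED_TERMS = {
    "vegetarian": {"vegan"},
    "vegan": {"vegetarian"},
}


def expand_query(query):
    """Return the query's words plus any related words found in the map."""
    words = set(query.lower().split())
    expanded = set(words)
    for word in words:
        expanded |= RELATED_TERMS.get(word, set())
    return expanded


def matches(query, page_text):
    """True if, for every query word, the page contains it or a related word."""
    page_words = set(page_text.lower().split())
    for word in query.lower().split():
        candidates = {word} | RELATED_TERMS.get(word, set())
        if not candidates & page_words:
            return False
    return True
```

Under these assumptions, a search for “vegetarian restaurants” would match a page that mentions only “vegan restaurants,” which a purely literal keyword match would miss.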
And yet, this vast trove of related words became a fundamental part of building AdSense, which matches ads to the context of a webpage. Now, Elbaz sees Factual as the real-world analog of that tool. Thanks to the proliferation of mobile technology, it doesn’t just matter what people are looking at online; it matters where they are in the world when they’re doing it, too.
“If you want to personalize an app, and we think all apps should be personalized, you have to know your users, and location is the best way to know your users,” Elbaz says.
Today, Factual has data on 75 million locations, which include businesses, public parks, and other points of interest, in 50 countries. And while the information Factual collects on these locations is rather simple—things like phone numbers, addresses, and hours of operation—the process of amassing all that information is anything but. Elbaz and his team spent two years prior to launch building the database—and building the technology that builds the database.
Factual analyzes billions of data points every day, working with hundreds of businesses around the world that willingly share their data. Yext, for instance, is a company that helps small business marketers manage their company listings online. It shares accurate data on hundreds of thousands of its small business clients.
The system also crawls the web to find publicly available data, but according to Elbaz, gleaning accurate information from all that is one of the toughest parts of the job. Phone numbers can be incorrect, addresses incomplete, and in many foreign countries, the listings themselves are nonexistent. “The single most difficult thing of all is, how do you build an algorithm that can predict truth?” Elbaz says.
In many countries where data is unreliable, Factual works with people on the ground to build what Elbaz calls “gold standard databases.” These people will manually build a database of, say, 100 restaurants in Japan. Factual then tests its algorithms against those databases. “If our algorithm can automatically come up with the same answers that humans can, that means they’re working,” he says. For Factual, this type of hardcore vetting is an absolute must, says Rimer. “In order to do this effectively, you have to provide a comprehensive service,” he says. “It’s not good enough to serve up information and have it not be right.”
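Testing an algorithm against a hand-built database comes down to comparing its answers with the human ones, field by field. The sketch below is one simple way to score that comparison, not Factual’s actual method; the function name `accuracy_against_gold`, the record layout, and the sample phone numbers and addresses are all made up for illustration.

```python
def accuracy_against_gold(predicted, gold, fields):
    """Fraction of gold-standard field values that the algorithm reproduced."""
    total = 0
    correct = 0
    for place_id, gold_record in gold.items():
        predicted_record = predicted.get(place_id, {})
        for field in fields:
            if field not in gold_record:
                continue  # only score fields the human checkers filled in
            total += 1
            if predicted_record.get(field) == gold_record[field]:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical hand-verified records and algorithm output.
gold = {
    "r1": {"phone": "03-1234-5678", "address": "1-1 Example St"},
    "r2": {"phone": "03-8765-4321", "address": "2-2 Sample Ave"},
}
predicted = {
    "r1": {"phone": "03-1234-5678", "address": "1-1 Example St"},
    "r2": {"phone": "03-0000-0000", "address": "2-2 Sample Ave"},
}
score = accuracy_against_gold(predicted, gold, ("phone", "address"))
```

Here the algorithm gets three of the four hand-checked values right, so `score` is 0.75. A real evaluation would likely also need to normalize formats, such as phone number punctuation, before comparing.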
Maintaining this level of quality assurance may prove to be a challenge as the company scales—and scale it most certainly will. According to Elbaz, Factual presents a bigger opportunity than even AdSense, a product, mind you, that now rakes in nearly a quarter of the revenue for one of the biggest businesses in the world.
“The fact that we can examine the kinds of people who enter businesses and make determinations about what’s going on there, and do that for any location on earth, means there’s unlimited information to be stitched together,” Elbaz says.
For now, companies like Yelp and Bing are using Factual’s dataset to expand internationally. Meanwhile, startups are using it to make their apps more intelligent. Shopular, for one, serves users coupons based on which stores are nearby, a service that co-founder Tommy Tsai says might not have been possible without Factual. “Early on, we were thinking about building our own location database, and it became clear to us after we looked at Factual’s data that we wouldn’t be able to achieve nearly the same quality as Factual,” he says.
Factual also offers technology that not only pushes location data to companies, but takes data from those companies and turns it into demographic information on their users. So if, for instance, a user is frequently at a location Factual recognizes as a driving range, Factual might label that user a golfer.
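The driving-range example can be sketched as a simple counting rule. This is only an illustration of the general idea, not a description of Factual’s technology; the `CATEGORY_LABELS` map, the `label_user` function, and the three-visit threshold are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical mapping from a place category to a user label.
CATEGORY_LABELS = {"driving_range": "golfer", "climbing_gym": "climber"}


def label_user(visited_categories, min_visits=3):
    """Return labels for categories the user visited at least min_visits times."""
    counts = Counter(visited_categories)
    return sorted(
        CATEGORY_LABELS[category]
        for category, visits in counts.items()
        if visits >= min_visits and category in CATEGORY_LABELS
    )
```

With these assumptions, a user seen four times at a driving range would be labeled a golfer, while a single visit would not be enough to earn the label.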