Technology

Search engines: Image conscious

Will Leila Boujnane's search engine be the next Google?

If you had met Larry Page and Sergey Brin in 1997, when they were still PhD candidates at Stanford University, would you have thought their BackRub search engine was going to change the Internet? Most people didn’t. Many of the dominant online businesses, including Yahoo! Inc., were offered the chance to buy the technology, famously renamed Google, and took a pass. Of course, just 10 years later, Page and Brin’s algorithms underpin a business that last year pulled in US$16.6 billion in revenue, US$4.2 billion of it profit. Google (Nasdaq: GOOG) now looms over the present and future of all online media.

For many web entrepreneurs, Google’s story corroborates a belief that all one needs is an extraordinary algorithm or two and an ingenious way to make them pay. But only in their wildest fantasies do most would-be online oligarchs dare whisper their company’s name in the same breath as the Silicon Valley eminence. Leila Boujnane, however, is especially bold. The 40-year-old Toronto entrepreneur brazenly drops the G-bomb to describe her company’s business: “Idée does for images what Google does for text.”

It is as much a statement of the company’s aspirations as it is fact. Idée Inc. makes a search engine for digital images. But rather than typing in descriptive words to find a picture associated, or “tagged,” with those terms, as one does with Google’s Image Search service or a host of other search engines, Idée’s visual search is conducted using another image, comparing its unique “fingerprint” of visual data to a database of other images’ fingerprints. The technical scope and complexity of this problem, particularly designing rigorous algorithms that can efficiently process and index hundreds of millions, or even billions, of images, have stared down academic researchers for decades. “People have looked at image-matching speed and complexity quite a bit, but I haven’t seen anything quite like what I see people at Idée doing,” says Paul Fieguth, a professor at the University of Waterloo’s department of Systems Design Engineering who has consulted with the company since 2002. “They’ve come up with algorithms that a lot of professors would be proud to present at conferences and publish. A lot of work at universities seems to me to be a bit naïve compared to what they’re doing, because they actually have to sell their product.”
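To make the fingerprint idea concrete, here is a minimal sketch in Python. It is not Idée's algorithm, which the article does not describe; it swaps in a simple, widely known technique called an average hash. Every name in it (`shrink`, `fingerprint`, `hamming`, `search`, the sample images) is an assumption made for illustration. Each image is reduced to 64 bits, and two images count as a match when only a few of those bits differ.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale pixels, 0-255, stored row by row


def shrink(img: Image, size: int = 8) -> Image:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def fingerprint(img: Image, size: int = 8) -> int:
    """Average hash: one bit per cell, set when the cell is brighter than the mean."""
    flat = [p for row in shrink(img, size) for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def search(query: Image, index: Dict[str, int],
           max_distance: int = 5) -> List[Tuple[str, int]]:
    """Return (name, distance) for indexed fingerprints close to the query's."""
    q = fingerprint(query)
    hits = [(name, hamming(q, fp)) for name, fp in index.items()]
    return sorted((h for h in hits if h[1] <= max_distance), key=lambda h: h[1])


# Hypothetical 32 x 32 test images: a gradient, a brightened copy, a checkerboard.
gradient = [[x * 7 for x in range(32)] for _ in range(32)]
brighter = [[min(255, p + 10) for p in row] for row in gradient]
checker = [[255 if (x // 4 + y // 4) % 2 else 0 for x in range(32)]
           for y in range(32)]

index = {"gradient": fingerprint(gradient), "checker": fingerprint(checker)}
print(search(brighter, index))
```

The brightened copy still lands on the gradient's fingerprint, while the checkerboard is far away. A linear scan like this would not cope with billions of images; indexing fingerprints so they can be searched efficiently at that scale is the hard part the researchers quoted above are talking about.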

Indeed, this is no vapourware. Since founding Idée in 1999, CEO Boujnane has flown the 22-person firm under the radar, quietly licensing software to Adobe Systems Inc. for use in its Photoshop Elements product, and building a small stable of A-list clients like the Associated Press, Agence France-Presse and stock photo companies Getty Images Inc. and Masterfile Corp., which use Idée to track the use of their images in print and online. In December, online content-sharing site Digg.com began using Idée’s software to screen for duplicate image submissions.

Customer wins like that have garnered Boujnane and founding chief technology officer Paul Bloore admiration within Toronto’s close-knit community of web entrepreneurs, in no small part because Idée has bootstrapped its organic growth with no venture capital.

Now, after seven years of refining the technology, Boujnane and Bloore think Idée is ready for the big time. “The next level for us is having a true, Google-type image search,” says Boujnane with breathless energy. “So you can go online, upload an image, and our search engine will tell you all the places in the web where that image has appeared. And it will do that on the fly, in real time.”

TinEye, the search engine’s working name, is now available in invitation-only beta and makes its first public demonstrations the week of March 3 at the high-profile ETech emerging-technology conference in San Diego. Boujnane and Bloore expect TinEye — or whatever it is renamed — to be ready for a full public web launch by fall. “The goal here is to be working with billions of [visual] assets, images as well as video,” says Boujnane. “It’s something the world has been talking about for quite some time as the next generation of search.”

Boujnane and Bloore, an unmarried couple for 18 years, believe their visual search technology is a major breakthrough. “It’s a hundred-million to half-billion-dollar revenue potential,” says Boujnane. “This is the type of firm that we would like to grow.” But succeeding will require a careful plunge into the world of venture capital, waters they have begun testing for the first time. And they’ll have to find a troublesome little thing known as a revenue model.

Born in Morocco, where her French parents worked as bureaucrats, Boujnane announced at age five she was going to be a doctor, and held that dream after the family moved during her teen years to Bordeaux, France. But midway through studies at that city’s medical university, she realized that a career in medicine was a bad fit. So in 1989, at age 21 — and to the dismay of her parents — Boujnane decided to leave Europe for a year to figure out her future.

She landed in Toronto on a student visa, eager to improve her high-school English (she already spoke French, Spanish, Arabic and basic German). She soon hooked up with Ron Dembo, a University of Toronto professor who had just founded Algorithmics Inc. to develop financial risk-management software. Boujnane joined as employee No. 7, despite having no experience in software or finance. “There are two things I do really well,” she recalls telling Dembo. “I never take no as an answer, and I work like a dog. If I don’t work out, toss me out.” For months, she threw herself into whatever tasks needed completing, and soon discovered a love for project and product management.

She also met Paul Bloore, Algorithmics’ employee No. 6, a quietly intense U of T computer science dropout whom she started dating. In 1992, Bloore and a colleague founded a competing financial risk-management software firm, which they sold seven years later to industry heavyweight SunGard Data Systems Inc. By that time, Boujnane had returned from a one-year stint in Silicon Valley at yet another financial risk-management software company, where she, too, caught the entrepreneurial bug.

Idée began in 1999 developing new media software on spec for clients. Boujnane quickly hired Bloore for his technical expertise. In 2001, the two decided to phase out their thriving services business and build a software firm instead.

Specifically, Bloore wanted to develop software to tackle burgeoning online copyright infringement. “I realized that trying to use an invisible [digital] watermark in an image to try and track its use is just never going to happen,” says Bloore. “Crop it 40%, rotate it five degrees, and the watermark is gone. The watermark is fragile no matter what, and it will be removed just from typical use like putting it on a website.” He also felt there was something fundamentally wrong about identifying an image only by adding something to it. The way to track images, he thought, is the way humans do: by comparing the patterns in one image to another, and seeing if they’re the same.
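Bloore's point can be illustrated with a toy example in Python. The sketch below assumes the crudest kind of invisible watermark, one hidden in the least significant bit of each pixel, and a crude stand-in for lossy re-saving that rounds every pixel down to a multiple of four. Commercial watermarking schemes are more robust than this, and none of these function names come from Idée; the aim is only to show why a mark added to an image can vanish while the image itself stays recognizable.

```python
from typing import List

Pixels = List[int]  # flat list of grayscale values, 0-255


def embed_lsb(pixels: Pixels, mark: List[int]) -> Pixels:
    """Hide the watermark bits, repeated, in each pixel's least significant bit."""
    return [(p & ~1) | mark[i % len(mark)] for i, p in enumerate(pixels)]


def read_lsb(pixels: Pixels, length: int) -> List[int]:
    """Read back the first `length` least significant bits."""
    return [p & 1 for p in pixels[:length]]


def resave(pixels: Pixels, step: int = 4) -> Pixels:
    """Stand-in for lossy re-encoding: round each value down to a multiple of step."""
    return [step * (p // step) for p in pixels]


def mean_abs_diff(a: Pixels, b: Pixels) -> float:
    """Average per-pixel difference; small values mean the content barely changed."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical 64-pixel image and an 8-bit watermark.
original = [(i * 37) % 256 for i in range(64)]
mark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(original, mark)
published = resave(marked)

print("watermark before re-save:", read_lsb(marked, len(mark)))
print("watermark after re-save: ", read_lsb(published, len(mark)))
print("average pixel change:    ", mean_abs_diff(original, published))
```

After the simulated re-save every hidden bit reads as zero, yet no pixel has moved by more than three brightness levels out of 256. A comparison based on the picture's own content still has nearly everything to work with.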

Sounds obvious, but getting computers to see like humans is an unsolved problem. University studies have tended to dwell on getting software to analyze objects in images and assign semantic labels — so it recognizes a ball as a ball, for instance, or the difference between a dog and a cat. While that is a potentially powerful technology, and some slow progress has been made, it’s a very hard nut to crack.

So Boujnane and Bloore remained pragmatic, and focused on finding potential applications for image recognition technology. From the start, they figured tracking copyrighted images would be a natural fit. But when they showed their prototype to potential customers like stock photo firms, they were told otherwise: in 2001, the real problem was simply finding images inside their continuously expanding digital archives. Boujnane and Bloore refocused on visual search technology able to find similar images, based on colours, textures, shapes, foreground and background. Adobe licensed the result in 2003 for the Find by Visual Similarity function in one of its programs, and the same technology is behind Idée’s Piximilar product, which is used, for example, on Masterfile’s website to refine searches of its image database.
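Searching by visual similarity is a different task from spotting copies, and a rough sketch shows why. The Python below covers only one of the cues listed above, colour, using a coarse colour histogram and a standard measure called histogram intersection. It is an assumption-laden illustration, not a description of Piximilar, and the tiny "images" and function names are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]
Histogram = Dict[RGB, float]


def colour_histogram(pixels: List[RGB], levels: int = 4) -> Histogram:
    """Share of pixels falling into each of levels**3 coarse colour bins."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {colour_bin: n / total for colour_bin, n in counts.items()}


def similarity(a: Histogram, b: Histogram) -> float:
    """Histogram intersection: 1.0 for identical colour mixes, 0.0 for disjoint ones."""
    return sum(min(share, b.get(colour_bin, 0.0)) for colour_bin, share in a.items())


def rank(query: List[RGB], library: Dict[str, List[RGB]]) -> List[Tuple[str, float]]:
    """Order library images from most to least similar in colour to the query."""
    q = colour_histogram(query)
    scores = [(name, similarity(q, colour_histogram(px)))
              for name, px in library.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical 100-pixel "images", described only by their colours.
library = {
    "sunset": [(250, 120, 30)] * 60 + [(40, 20, 80)] * 40,
    "beach": [(240, 110, 20)] * 50 + [(30, 160, 220)] * 50,
    "forest": [(20, 120, 30)] * 100,
}
query = [(245, 100, 10)] * 70 + [(50, 30, 90)] * 30

for name, score in rank(query, library):
    print(f"{name}: {score:.2f}")
```

The orange-and-purple query ranks the sunset first, the beach second and the forest last. A production system would have to weigh texture, shape and layout as well, which is where the engineering effort goes.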

That early market research forced Idée to think about building software that would be future-proofed for massive and continuously expanding image databases. “Human beings are better processors of visual information than software will ever be,” says Bloore. “The only thing that we can do with software is process way, way more data than a human can.”

Says Sven Dickinson, a U of T computer vision expert who sits on Idée’s board: “The key for Idée was not trying to bite off that open-ended [academic] problem, but rather finding applications and markets where people need to find specific images.”

By 2005, when Idée zeroed back in on the market for tracking images, customer prospects were ready to listen. “The technology really impressed me,” says Michel Scotto, the Paris-based director for photo business development at Agence France-Presse. But Scotto still wasn’t interested in Idée’s plan of tracking unlicensed online use of AFP images. He wanted a billing solution. Today, Idée offers AFP an automated, web-based reporting service that compares all the images appearing in about 70 French magazines with the roughly three million images in AFP’s ever-expanding database. Idée’s 400-node server farm processes five to six trillion comparisons per month for 21 other clients.
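For a sense of scale, the server-farm figures can be turned into a per-machine rate with some back-of-envelope Python. The calculation assumes a 30-day month, an even spread of work across all 400 nodes, and that the five-to-six-trillion figure is the farm's total monthly load; none of those assumptions is confirmed by the company.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month
NODES = 400                            # server-farm size quoted above


def per_node_rate(comparisons_per_month: float) -> float:
    """Average comparisons per second handled by each node."""
    return comparisons_per_month / (NODES * SECONDS_PER_MONTH)


low = per_node_rate(5e12)
high = per_node_rate(6e12)
print(f"{low:,.0f} to {high:,.0f} comparisons per second per node")
```

Under those assumptions, each node averages somewhere around five thousand image comparisons every second, around the clock.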

It’s a nice, profitable business, but Boujnane and Bloore think they’re just warming up. “The entire world is moving into visual digital formats,” says Boujnane. “As more images are produced, more video is produced, it is time to start tracking that.” Indeed, Idée started a pilot project with NBC Universal last year to test PixID, its image-tracking software, with online video. The results are encouraging, suggesting that copyrighted material posted to video sharing sites could be identified automatically.

But if Idée wants to be the Google of visual search, it will need to find a way to make money. Says Boujnane: “Every single time we show what we are doing to someone new, what they say is, ‘I wouldn’t use it that way, this is what I would do.’ We are putting something in the hands of people, and they will really decide what it is that can be built upon it, how they will actually use it, and what it’s going to look like.”

Idée needs to bulk up, both in servers to crawl and index the Web’s unknown billions of images, and in employees to beat bushes for new business partners — and that means getting venture capital. Boujnane needs a VC willing to invest the $10 million or so she thinks the company has to raise, and to provide the kind of specialized expertise that will help Idée build the best business.

The coming year will test Boujnane, Bloore and Idée on a whole new level. But Boujnane displays exceptional entrepreneurial optimism: “Do what you love and the revenues will follow.” Hey, it worked for Google.