Udi Manber speaks to us: about search engines, of course
Search is a very competitive market. But it is also a very important service, one that is shaping the very meaning of the Internet. The service that search engines provide seems very simple. It is not.
To understand more, I asked Udi Manber for his help, not only as the person who leads A9 but also as a professor with an independent view of the phenomenon.
Udi Manber - I believe that search is a critical and yet unsolved problem. I have been working in the search field for 15 years and I'm amazed at how far we have come, and equally amazed at how far away we are from truly solving the challenge. As more and more information is accessible via the Internet, high-quality search becomes even more critical. Search is a competitive market and this competition is spurring terrific innovation.
Me - As I see it, engines share core technologies for searching the Web in general, and each also has some specialized tools of its own. I would like to understand how the engines differ from one another, and then what the consequences of those differences are.
Udi Manber - I would prefer not to discuss the technology of other companies. However, I can say that at A9.com, we are interested in giving users tools to manage their searches more effectively, and in creating a search engine with a memory. Remembering where you have been on the Web is a very powerful experience. Being able to make notes (through A9's diary feature) about a site or page makes searching more efficient. Seeing a number of data sources side-by-side (as on A9) makes searching more comprehensive. All of these features and more that are found on A9.com help achieve a truly great user experience.
Me - Can we first take a look at core technologies? Algorithms are rules. And rules should be clear to understand. What are the main kinds of core rules that differentiate the various engines on the market? Google is known to assess the relevance of pages by using the number of links from other sites: is that a correct idea? And in what ways are other engines different, or even better?
Udi Manber - Search companies, even before the web, have always guarded their core ranking algorithms as highly proprietary secret sauce. These algorithms are highly tuned over a long period of time, usually with a large number of special features.
Me - What do those rules mean from the cultural point of view? Is it true that those rules help big sites become bigger and push minority sites further to the margins? How do different languages affect the situation?
Udi Manber - Like everywhere else, experience counts. But innovation counts more. We are only at the beginning, and I expect search engines to be much better in the years to come. Most of the important rules are language independent, although there is nontrivial work involved in adapting them to different languages.
Me - What happens when an engine is made by someone who also owns content? Then there is a new rule: relevance is assessed in one way in general, but for certain topics the rule is different and the content of the engine's owner becomes more relevant. Is this correct? Or should results be presented in a way that makes the different set of rules clear?
Udi Manber - A large part of the search experience is contingent upon a level of trust between the engine and the user, in that the user believes that the results delivered are as accurate and informative as possible. Our number one rule is that users come first. Relevancy is measured by how satisfied users are with the results. At A9.com, we provide users data from four different sources - Google (Web results and images), Amazon.com (Search Inside the Book), IMDB.com, and Gurunet.com. All four sources provide results simultaneously, and the user has the power to control how they are displayed on the screen. Users are the final arbiter of what they deem relevant, not the search engines.
Me - And what happens with paid search results? What policy should engines follow to be credible in their results?
Udi Manber - The same rules apply: be good to your users. In this case, it is important to tell users what's paid and what's not, in addition to being relevant and useful to users no matter if it's paid or not.
Me - Personalization is supposed to be the next step in engines. Does this mean that engines are going to be built so that people find only what they already know they are interested in, without any more surprises? This could be more efficient and less interesting at the same time. And are engines going to become tools that other people develop in their own way? Is this going to fragment the cultural impact of the Internet?
Udi Manber - This is a very good question. Although personalization will no doubt improve search, it can cause harm too, in exactly the way you mentioned - it might limit what people can find and how they can find it. The key will be to empower people rather than giving them black boxes that they don't understand. The reason you have not seen many personalized search engines yet is that it is very hard to do well. At A9.com we started with personalized data - we allow you to search your history, your diary, and your bookmarks. The content is clear - it is your data - and the whole process is clear. At the same time you can also search the web and several other sources of data. We will tie those results to your history - for example, we will tell you whether you clicked on a particular result, and when - but we do not change the order of web results based on who did the search.
Me - There are at least three metaphors for content on the Internet: the biggest library in the world, a new medium for news and information on current matters, and a tool for a worldwide conversation. Should engines be different for the three metaphors? One engine for finding documents that have a lasting impact, one engine for news, and a third engine for blogs? Or should a single engine be able to put all of this together in an intelligent way? Which one?
Udi Manber - I can't predict the future and I won't try. But whether there will be three engines or one, there will be a need for specialized algorithms for the different metaphors. And there will be more than 3 metaphors.
Me - Are the problems different in different cultures and languages? And if so, how are they different?
Udi Manber - Dealing with different cultures and languages is just one of the factors that makes solving the challenges around search difficult. Fortunately, many of the breakthroughs and the new search features are culture independent, so everyone benefits.
Me - Engines can be manipulative. How can we help users avoid being manipulated by them?
Udi Manber - The key is to empower users, to give them the right tools, and to make the results as clear as possible, so they can make up their own minds. At A9.com, we empower users by providing many data sources and giving them control over how the data is presented to them. The tools we provide give users an extension of their memory, which gives them even more control over not just the search they are conducting now, but the search after that, and the search after that.
Me - Algorithms are secret. Should the rules be secret too? Would it be worthwhile to have some sort of independent users' authority to maintain the credibility of engines?
Udi Manber - While the algorithms are secret, the results are being judged by everyone every day. The credibility of the engines is therefore open to more scrutiny than almost anything else.
Me - What next? Engines for video and music, engines that go beyond the Net to look at television and other media? Engines that go into the desktop and out again?
Udi Manber - I can't speculate on the future, but I'm sure that over time we will see many innovative new approaches to search. This is just the beginning.
Me - Is the semantic Web happening? And is it interesting?
Udi Manber - It is too early to say. There are many interesting ideas and many interesting projects.