Web Searching Magnified
You might be surprised to learn that when you conduct a search on the Web, you're missing out on about 95 percent of what's really out there. Weiyi Meng has found a way to reach further into the Web to retrieve more information when conducting a search.
Google has been so pervasive among Internet surfers for so long that the Oxford English Dictionary added "google" as a verb years ago. And while few doubt Google's supremacy as a search engine, Binghamton University Computer Science Professor Weiyi Meng says there is a better way to search the full content of the Web.
A basic problem facing the big search engines like Google, Yahoo and MSN is that the World Wide Web is split into two general categories. The most visible is the “Surface Web,” which is the five percent of the Web that is public and indexable by the big search engines. The remaining category, the vast majority of the Web, is the “Deep Web,” and even though it is roughly 20 times the size of the Surface Web, it goes mainly unnoticed by the big engines. But there are many smaller, independent engines that individually search their own segments of the Deep Web. Harness the power of these Deep Web engines, and you harness the total power of the Deep Web’s nearly a trillion pages.
Meng is doing just that. In 2002, he and three colleagues, one a former Binghamton PhD candidate, started the company Webscalers to demonstrate that “large-scale metasearch technology” was feasible. They are building customized “metasearch engines,” which use the Deep Web engines and databases for information retrieval. In fact, they built the largest metasearch engine in the world when they connected to 1,800 news engines in 200 countries to create http://www.AllInOneNews.com.
Metasearch technology can also be used to link existing autonomous systems without having to modify each one. The State University of New York (SUNY) structure provides a good example as it is made up of 65 independent systems, each with its own small search engine. A metasearch engine can be placed on top of these smaller engines, thereby creating a single, larger search system for the entire SUNY system.
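For readers curious about what "placing a metasearch engine on top of smaller engines" might look like, the Python sketch below illustrates the general idea only. It is not Webscalers' implementation. The toy component engines, the example URLs, and the names make_engine and metasearch are all hypothetical, and the merge rule (scale each engine's scores by its own top score, then keep the best score per URL) is one simple assumption among many possible ones.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, float]  # (url, engine-local relevance score)
Engine = Callable[[str], List[Result]]


def make_engine(index: Dict[str, str]) -> Engine:
    """Build a toy component engine over a {url: text} index (hypothetical stand-in)."""
    def search(query: str) -> List[Result]:
        terms = query.lower().split()
        hits = []
        for url, text in index.items():
            words = text.lower().split()
            # Score a page by how often the query terms occur in it.
            score = float(sum(words.count(t) for t in terms))
            if score > 0:
                hits.append((url, score))
        return hits
    return search


def metasearch(query: str, engines: List[Engine], limit: int = 10) -> List[Result]:
    """Send the query to every engine, normalize scores per engine, and merge."""
    # The component engines are independent, so they can be queried concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged: Dict[str, float] = {}
    for results in result_lists:
        if not results:
            continue
        # Scores from different engines are not comparable, so scale each
        # engine's scores into (0, 1] using that engine's own best score.
        top = max(score for _, score in results)
        for url, score in results:
            normalized = score / top
            # If several engines return the same URL, keep its best score.
            merged[url] = max(merged.get(url, 0.0), normalized)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))[:limit]


# Two made-up "campus" engines, each searching only its own pages.
campus_a = make_engine({
    "a.example/admissions": "admissions deadlines and financial aid",
    "a.example/library": "library hours",
})
campus_b = make_engine({
    "b.example/aid": "financial aid forms aid office",
    "b.example/sports": "sports schedule",
})

results = metasearch("financial aid", [campus_a, campus_b])
```

Neither campus engine is modified here: the metasearch layer only calls each engine's existing search function and combines what comes back, which is the property the SUNY example depends on.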
Meng’s research on Web-based information retrieval is a natural evolution from the area of research that brought him, in 1985, from China to the University of Illinois at Chicago. He earned his master’s and PhD there, studying multidatabase technology and trying to find ways to access many autonomous systems in a uniform way. “Now what I’m doing is accessing multiple Web sites in a uniform way,” he said. “So there was some connection with the general way of thinking, but of course the problems are totally different now.” Over the past two decades, Meng has published more than 100 papers and a book and served on scores of international conference program committees, and he is currently writing a book on advanced metasearch engine technology.