The Google Goal Of Indexing 100 Billion Web Pages
Google's Goal of Quality Search
In their paper 'The Anatomy of a Large-Scale Hypertextual Web Search Engine', Sergey Brin and Lawrence Page make it evident that Google's goal has always been to be one of the best search engines in terms of the quality of its results. Brin and Page knew, however, that to achieve this Google needed to store information efficiently and cost-effectively and to have excellent crawling, indexing, and sorting techniques. Google aimed not only to give quality results but also to produce them as fast as possible. Google started as a high-quality search engine and continues to be the best search engine today. It has stayed true to its original intent: to be a search engine that not only crawls and indexes the web efficiently but also produces more satisfying results than other existing search engines.
To stay true to its goal of providing the best search results, Google knew right from the start that the search engine had to be designed to keep up with the web's growth. According to Brin and Page, "In designing Google we have considered both the rate of growth of the Web and technological changes. Google is designed to scale well to extremely large data sets. It makes efficient use of storage space to store the index." They knew that they would need a great deal of space to store an ever-growing index.
Google's index, which started out at 24 million web pages, was large for its time and has since grown to around 25 billion web pages, keeping Google ahead of its competitors. However, Google is a company that doesn't settle for just beating the competition. It aims to give its users the best service there is, and for a search engine that means giving users access to all, or at least most, of the quality information available on the web.
Google's New System for Indexing More Pages
As mentioned earlier, Google aims to give access to even more information and has been devoting much time and effort to realizing this goal. The patent entitled 'Multiple Index Based Information Retrieval System', filed by Google employee Anna Patterson, might be the answer to the problem. Filed in January 2005 and published in May 2006, the patent suggests that Google might be aiming to expand its index to as many as 100 billion web pages or even more.
According to the patent, conventional information retrieval systems, more commonly known as search engines, are able to index only a small part of the documents available on the Internet. Estimates put the number of web pages on the Internet as of last year at around 200 billion; however, Patterson claimed that even the best search engine (that is, Google) was able to index only 6 to 8 billion web pages, or roughly 3 to 4 percent of the total. The disparity between the number of indexed pages and the number of existing pages clearly signaled the need for a new breed of information retrieval system. Conventional systems simply weren't capable of indexing enough web pages to give users access to a large enough share of the information available on the web.
The Multiple Index Based Information Retrieval System, however, is up to the challenge and is Google's answer to the problem. Two characteristics make the new system stand out from conventional systems. One is its "capability to index an extremely large number of documents, on the order of a hundred billion or more". The other is its capability to "index multiple versions or instances of documents for archiving... enabling a user to search for documents within a specific range of dates, and allowing date or version related relevance information to be used in evaluating documents in response to a search query and in organizing search results." With the new system developed by Patterson, Google can expand its index far beyond its current size and also improve document analysis and processing, document annotation, and even ranking based on contained phrases and anchor phrases.
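To make the second capability more concrete, here is a minimal sketch of an index that keeps several dated versions of each page and answers a query restricted to a date range. It is only an illustration of the general idea described in the quote, not the method from the patent; the class name `VersionedIndex`, its methods, and the example URLs are all hypothetical.

```python
from collections import defaultdict
from datetime import date


class VersionedIndex:
    """Toy inverted index that keeps every dated version of a document.

    Hypothetical illustration only; not the system described in the patent.
    """

    def __init__(self):
        # term -> set of (url, version_date) pairs whose text contains the term
        self.postings = defaultdict(set)

    def add_version(self, url, version_date, text):
        """Index one dated snapshot of a page."""
        for term in set(text.lower().split()):
            self.postings[term].add((url, version_date))

    def search(self, term, start, end):
        """Return (url, date) hits for term dated within [start, end], newest first."""
        hits = [
            (url, d)
            for url, d in self.postings.get(term.lower(), ())
            if start <= d <= end
        ]
        return sorted(hits, key=lambda hit: (hit[1], hit[0]), reverse=True)


# Made-up example data: two versions of one page and one version of another.
index = VersionedIndex()
index.add_version("example.com/news", date(2004, 3, 1), "google index grows")
index.add_version("example.com/news", date(2005, 9, 26), "google index larger")
index.add_version("example.com/about", date(2005, 1, 10), "about the index")

# Only versions dated in 2005 are returned; the 2004 snapshot is filtered out.
results = index.search("index", date(2005, 1, 1), date(2005, 12, 31))
for url, version_date in results:
    print(version_date.isoformat(), url)
```

Sorting the hits by date is one simple way that version dates could feed into how results are organized; a real system would combine this with many other relevance signals.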
History of Google's Index Size
Google started out with an index of around 24 million web pages in 1996. By August of 2000, Google had managed to grow its index roughly fortyfold, to approximately one billion web pages. In September of 2003, Google's front page boasted an index of 3.3 billion web pages. Microdoc, however, revealed that the actual number of web pages Google had indexed at that time was already more than five billion. In their article 'Google Understates the Size of Its Database', they emphasized that Google specialized not only in simplicity but also in understating its power and complexity. Google was still managing to stay ahead of its competitors and continued to surprise everyone with what it had up its sleeve.
As Google's index continued to grow, the number on its front page grew impressively large as well before it plateaued at eight billion web pages. This was around the time that Patterson filed the new patent. Then in 2005, with controversies over index size growing, Google decided to stop counting in public and simply claimed that its index was three times larger than its nearest competitor's. Google also maintained that what mattered was not just the number of indexed pages but how relevant the returned results were. Then in September of 2005, as part of Google's 7th anniversary, Anna Patterson, the same software engineer who filed the patent on the Multiple Index Based Information Retrieval System, posted an entry on Google's official blog claiming that the index was now 1,000 times larger than the original index. This pegged the index size at around 24 billion web pages, about a fourth of Google's goal of indexing 100 billion web pages. It seems, then, that Google must have started using the new system in mid-2005. With the new system in place, we can only wait and see how fast Google will reach the goal of 100 billion web pages in its index. It is most likely, though, that when Google reaches that goal it will set an even higher one in order to keep providing quality service.
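The figures in the last paragraph can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article (the 24 million page starting index, the "1,000 times larger" claim, and the 100 billion page goal); the variable names are chosen here for illustration.

```python
ORIGINAL_INDEX = 24_000_000        # pages in the 1996 index, as stated above
GROWTH_FACTOR = 1_000              # "1,000 times larger", per the 2005 blog entry
TARGET_INDEX = 100_000_000_000     # the "hundred billion" goal from the patent

# Index size implied by the blog entry, and its share of the 100 billion goal.
current_index = ORIGINAL_INDEX * GROWTH_FACTOR
share_of_target = current_index / TARGET_INDEX

print(f"{current_index:,} pages")                            # 24,000,000,000 pages
print(f"{share_of_target:.0%} of the 100 billion page goal")  # 24% of the goal
```

The result, 24 billion pages or 24 percent of the goal, matches the "about a fourth" stated above.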
http://www.theinternetone.net