This page last updated 15 April 2007
Anglicans Online last updated 29 August 2010
About AO Search

There are two kinds of directories to the Internet: indexes and search engines. The distinction is blurred because most of the big popular index systems, like Yahoo, call themselves search engines. And topical search systems, like Anglicans Online's AO Search, further complicate the distinction.

An index is something compiled by hand. Often judgment is involved. Anglicans Online itself is an index. We, its staff, put every listing into it by hand. There are 10,000 links in Anglicans Online, in 600 separate web pages, and we put them all there by hand, after looking at them to make sure they warranted a listing. We rely on you, our readers, to tell us about pages that need to be listed, but we make the final decision.

What is AO Search?

Anglicans Online, begun in December 1994, grew to be large enough that it wasn't easy to find out whether something was listed in it. In September 1999 we installed search software for the purpose of searching our site, and after doing this we realized that we could extend it to search not just our site, but also the sites that are listed.

Simple AO Search looks through the entire Anglicans Online web site to find pages that have all of the words that you enter. AO Global Search lets you choose whether to search just Anglicans Online, or also the pages that are linked from it.

AO Global Search is quite remarkable. Anglicans Online lists links to 10,000 web sites that have some Anglican content or relevance. Collectively, those web sites have 93,000 pages. AO Global Search looks in all 93,000 of those pages for you, but no farther. We can illustrate its value by searching for the words "durham cathedral". A general-purpose search engine like AltaVista finds 1,630 web pages. AO Global Search finds 118, and Simple AO Search finds 8, the first of which is the "England" page, on which Durham Cathedral is listed.

How does AO Search work?

All search engines work in two phases. The first phase builds an index in one place, and the second phase answers search requests by looking in that index. When you request a search, it is not actually hunting around the Internet at that time. It has already done the hunting, and stored the results. Your search requests are being looked up in an index that resulted from a systematic scan of web pages.
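
The two phases can be sketched in a few lines of Python. This is only an illustration, not the software AO Search actually runs: the function names and the two sample pages are made up, and the index here is a plain in-memory dictionary. It also assumes the "all of the words" matching rule described above for the simple search.

```python
from collections import defaultdict

def build_index(pages):
    """Phase one: scan every page once and record which pages contain each word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Phase two: answer a query from the stored index, without refetching anything."""
    words = query.lower().split()
    if not words:
        return set()
    # A page matches only if it contains all of the query words.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical sample pages, for illustration only.
pages = {
    "england.html": "Durham Cathedral is listed on the England page",
    "wales.html": "St Davids Cathedral is listed on the Wales page",
}
index = build_index(pages)
print(sorted(search(index, "durham cathedral")))
```

Note that `search` never touches `pages`: once the index exists, a request is answered entirely from stored results.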

AO Search has one index for each kind of search; each index is built separately and stored separately. The index of Anglicans Online itself is rebuilt each week just after Anglicans Online is published, so it is always completely current. It takes our computers about 5 minutes to scan the files and build this index, and it takes about 10 megabytes of disk space to hold the complete index.

Each week when we rebuild the Anglicans Online index, we produce a list of all of the web sites referenced by AO. We remove from this list all sites that are referenced only in the News Centre, and we remove from this list a small collection of sites that we know not to be Anglican sites, such as amazon.com. The resulting list of sites to be indexed has about 5000 URLs in it.
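
A rough sketch of that weekly filtering step, in Python, might look like the following. The page names, the example URLs, and the helper names `site_of` and `build_seed_list` are all hypothetical; the only details taken from the description above are the two rules (drop sites referenced only in the News Centre, drop a short list of known non-Anglican sites such as amazon.com).

```python
from urllib.parse import urlparse

def site_of(url):
    """Return the host name of a URL, ignoring a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def build_seed_list(links_by_page, news_pages, excluded_sites):
    """Collect URLs referenced by at least one non-news page, minus excluded sites."""
    seeds = set()
    for page, urls in links_by_page.items():
        if page in news_pages:
            continue  # a site linked only from the News Centre is never added
        for url in urls:
            if site_of(url) not in excluded_sites:
                seeds.add(url)
    return sorted(seeds)

# Hypothetical data, for illustration only.
links_by_page = {
    "england.html": ["http://cathedral.example.org/"],
    "news.html": ["http://newspaper.example.com/story", "http://cathedral.example.org/"],
    "books.html": ["http://www.amazon.com/", "http://bookshop.example.net/"],
}
seeds = build_seed_list(links_by_page, {"news.html"}, {"amazon.com"})
print(seeds)
```

A site that appears both in the News Centre and on an ordinary listing page survives, because it is picked up from the listing page.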

From time to time we run our Global Index program. It connects, if it can, to each of the 5000 URLs and reads each one to form a list of all of the other pages that they reference. That list is typically about 200,000 web pages. We remove from the 200,000 pages anything that is not hosted on one of the Anglican servers identified by the original seed list of 5000, which produces a work list of typically 70,000 web pages. Our search software then spends about two full 24-hour days indexing those 70,000 pages.
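
The reduction from the discovered pages to the work list amounts to a filter on server names. A minimal Python sketch, assuming that "contained on one of the Anglican servers" means "has the same host name as some seed URL" (the function name and example URLs are hypothetical):

```python
from urllib.parse import urlparse

def restrict_to_seed_hosts(discovered_urls, seed_urls):
    """Keep only discovered pages whose host also appears in the seed list."""
    seed_hosts = {urlparse(u).hostname for u in seed_urls}
    return [u for u in discovered_urls if urlparse(u).hostname in seed_hosts]

# Hypothetical data, for illustration only.
seed_urls = ["http://parish.example.org/", "http://diocese.example.net/"]
discovered = [
    "http://parish.example.org/services.html",
    "http://diocese.example.net/clergy/",
    "http://coffee.example.com/makers.html",
]
work_list = restrict_to_seed_hosts(discovered, seed_urls)
print(work_list)
```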

The secret to AO Global Search is the means by which we restrict the scope of our search to just Anglican web pages. That is an inexact process, and we are still working on refining and improving it. The problem is that so much of the valuable Anglican web content is stored in free public web services, and we don't want to index all of their contents; only the Anglican portion. For example, we don't want to index all of GeoCities or all of America Online; we just want to index the Anglican portions of them.
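
One common way to index only part of a shared hosting service is to match on a URL prefix rather than on the host name alone. The Python sketch below shows that idea; it is an assumption for illustration, not a description of how AO Global Search actually draws the line, and the host and directory names are invented.

```python
def in_scope(url, scope_prefixes):
    """True if the URL falls under one of the allowed prefixes."""
    return any(url.startswith(prefix) for prefix in scope_prefixes)

# Hypothetical prefixes: one parish's directory on a shared free host.
# The trailing slash keeps "/stmarys-cafe/" from matching "/stmarys/".
scope_prefixes = ["http://freehost.example.com/stmarys/"]

print(in_scope("http://freehost.example.com/stmarys/services.html", scope_prefixes))
print(in_scope("http://freehost.example.com/stmarys-cafe/menu.html", scope_prefixes))
print(in_scope("http://freehost.example.com/garage-band/", scope_prefixes))
```

Only the first URL is in scope; the rest of the shared host is left alone.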

I think a page is missing from AO Global Search. What should I do?

Well, if you can't find what you are looking for, you should use one of the big commercial search engines like Google. The essence of AO Global Search is that it is bigger than AO and smaller than the entire Internet. To create its search database we have to draw a boundary between the portion of the Internet that is Anglican and the portion of the Internet that is not Anglican. Give or take technical problems, if your site is not listed it is because our search robot has decided that it is not a fundamentally Anglican site. Since nobody knows what "fundamentally Anglican" means, there is no rigorous definition of a "fundamentally Anglican site," and it is really hard to write a decision procedure that will decide, on the basis of the name of a site, whether or not it should be included. We expect to get better at this as time goes by, and if there is a site missing that you think really ought to be included, please let us know. In fact, if you have a comment about AO Search we'd like to hear it.

A meditation on the Search indexer

To enable AO Global Search, an indexing computer must spend several days probing around the Internet, retrieving pages and adding them to its database. As each page is retrieved, the indexer looks over that page to find links to other pages, and then it follows those links to new pages, retrieving them. There is a very rhythmic cycle: fetch a page, study it, learn about new pages. Big commercial search engines use a similar process, albeit bigger and faster, but they keep retrieving pages until they stop finding new ones. In order to keep AO Global Search small enough to be useful, we have to stop it at the edge of the Anglican Communion. We can't index pages about coffee makers or dry cleaners or telephone repair, even though our churches use all of those. Our indexing computer has a collection of about 2 million web pages to which it has found links; only about 300,000 of them will be used.
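
The fetch, study, learn cycle, together with the rule for stopping at the boundary, can be sketched as a small breadth-first crawl. This Python example is only a sketch: it walks a made-up in-memory link graph (`WEB`) instead of fetching real pages, and the scope test is a stand-in for whatever rule decides that a page is Anglican.

```python
from collections import deque

def crawl(seeds, fetch_links, in_scope):
    """Follow links outward from the seeds, but only through in-scope pages."""
    queue = deque(seeds)
    seen = set(seeds)      # every page we have found a link to
    indexed = []           # the pages we actually retrieve and index
    while queue:
        url = queue.popleft()
        indexed.append(url)               # fetch the page and study it
        for link in fetch_links(url):     # learn about new pages
            if link in seen:
                continue
            seen.add(link)
            if in_scope(link):            # stop at the edge: never follow out-of-scope links
                queue.append(link)
    return indexed, seen

# Hypothetical link graph, for illustration only.
WEB = {
    "parish.example.org/": ["parish.example.org/youth", "coffee.example.com/"],
    "parish.example.org/youth": ["parish.example.org/", "diocese.example.org/"],
    "diocese.example.org/": [],
    "coffee.example.com/": ["coffee.example.com/makers"],
}
indexed, seen = crawl(
    ["parish.example.org/"],
    fetch_links=lambda url: WEB.get(url, []),
    in_scope=lambda url: not url.startswith("coffee."),
)
print(indexed)
print(len(seen), "links found,", len(indexed), "pages indexed")
```

As in the real indexer, more links are found than pages are used: the coffee site is seen but never retrieved, so nothing it links to is ever discovered.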

The computer is an ordinary PC in a "minitower" box, but it has been augmented to have a great deal of RAM and a great deal of hard disk space. Anglicans Online is published every Sunday evening, but the indexing for the Global Search takes so long to run that we cannot run it every week. Sometimes, as now, it is the middle of a Sunday evening and we are working towards the publication deadline, closing out pages and updating things, and we turn to ponder the search computer and think about what it is doing.

This indexing process has a great Taizé feel to it. The computer sings the same stanza thousands or millions of times, each time the same yet each time different. We listen to the rattle of the disk as a page comes in, and wonder, "is this a parish in Wales? A book shop in Sydney? A youth group in California? A press release in London?" The whole of the Anglican Communion, in some modern symbolic form, is gathered here into this computer and treated as one. That page is for a church I've knelt in, and the page after it is for a monastery in Asia that I know I will never see. Unlike with St Peter, there is no judgment of pages here: if a page is not admitted to our search index, it is not condemned to anything. It just won't be searchable here. This global gathering of the web pages for the Anglican Communion is, for us, a symbol of unity, but it is almost completely invisible. There is a great joy in the togetherness of all of these web pages, but there is a great sadness that we will probably never meet or see or touch the people who made them. The Internet is wonderful, but for some things, bicycles are better.

Tonight it looks as though the index gathering is not going to finish before the publication and its editors are put to bed. Maybe it will finish tomorrow, but then we'll just start it again. Ubi caritas et amor; Ubi caritas, Deus ibi est.


This web site is independent. It is not official in any way. Our editorial staff is private and unaffiliated. Please contact editor@anglicansonline.org about information on this page. ©2010 Society of Archbishop Justus