Current Cites (Digital Library SunSITE)

Volume 10, no. 7, July 1999

Edited by Teri Andrews Rinne

The Library, University of California, Berkeley, 94720
ISSN: 1060-2356

Contributors: Terry Huwe, Margaret Phillips, Jim Ronningen, Roy Tennant, Lisa Yesson

Basch, Reva. "High AJeevers: Valet-Added Searching from Ask Jeeves" Database 22(3) (June/July 1999): 28-34. - As libraries continue to struggle with the most effective web interface to Internet and library resources, the single, simple search box as online reference desk is a tempting model. But who envisioned it would be staffed by a butler? P.G. Wodehouse's caricature of a proper British butler is the host of Ask Jeeves, a second-generation search engine where users are encouraged to submit a natural-language query in a simple search box. In this behind-the-scenes look at Ask Jeeves, Basch describes how Jeeves accepts a natural-language query and attempts to match it against a list of known questions - about seven million as of early 1999 - in its knowledge base. Jeeves uses a proprietary parsing technology called QPE (Question Processing Engine) that is based on both semantic processing (understanding the meaning of words) and syntactic processing (understanding parts of speech and how words are used in context). Of course, the key element in this process continues to be the humans on the six newspaper-style content desks who build the knowledge base. Yet even with seven million "answers," Jeeves may still send you down a few dark hallways. But it will be interesting to see if one day this knowledgeable butler will give new meaning to the phrase "silver platter." - LY

Sheehan, Mark. "Faster, Faster! Broadband Access to the Internet" Online 23(4) (July/August 1999): 18-26. Tilley, Scott. "The Need for Speed" Communications of the ACM 42(7) (July 1999): 23-26. - Ah, working from home. Whether that means more flexibility or just more work, if you're doing it you'll probably want the fastest affordable Internet connection. These two articles neatly summarize the currently feasible options for increasing your flow: 56Kbps modems, cable modems, Integrated Services Digital Network (ISDN), and Digital Subscriber Lines (DSL). Tilley's article in the CACM describes his experiences with fast modems, DSL, and cable. It's helpful to learn what he had to do himself to get things to work, such as running a test on his phone line to see if it could handle higher speeds. Sheehan's article in Online is a more systematic overview, and includes tables and sidebars which list costs, availability, predicted vs. observed downstream and upstream speeds, etc. The sidebar titled "Promises, promises" was certainly a cold shower: it details mundane problems which can drastically cut speed, e.g. phone wiring too close to a dimmer switch, or the distance of your home from the telco's central office, or who happens to be using your particular branch of the cable system. (Here in the San Francisco Bay Area, there've been stories in the local media about disgruntled cable Internet subscribers who discovered they were sharing pipelines with bandwidth-hog digital video developers.) Sheehan also touches upon broadband wireless and satellite possibilities. - JR

Coffman, Steve. "The Response to 'Building Earth's Largest Library'" Searcher 7(7) (July/August 1999): 28-32. - In this interesting follow-up to his explosive article in the March 1999 issue of Searcher (cited in the April 1999 issue of Current Cites), Coffman addresses some of the 250-odd responses he received. The response to the idea put forward in his original article has been so dramatic that Information Today is devoting a day-long track to it at the November 1999 Internet Librarian Conference, which it sponsors. For this follow-up piece to make any sense to you, you should first be sure you've read the original article. Whether you agree with him or not (and I do), all librarians need to sit up and take notice. - RT

Gorman, Michael. "Metadata or Cataloging? A False Choice" Journal of Internet Cataloging 2(1) (1999): 5-22. - In this thoughtful piece Gorman considers the appropriate roles of MARC, AACR2, the Dublin Core, and web search engines in making electronic resources more easily discoverable. He ends with the assertion that we are not faced with a dichotomy, but with an opportunity, and he proposes a four-pronged approach to resource discovery: 1) full MARC cataloging, 2) enriched Dublin Core records (what is also called the "structuralist" approach), 3) minimal Dublin Core records (the "minimalist" approach), and 4) full-text keyword searching via web search engines. Those resources deemed the most valuable would get the full MARC/AACR2 treatment, while others would get progressively less attention until reaching the mass of unselected resources available through web search engines. - RT

Green, Ann, JoAnn Dionne, and Martin Dennis. Preserving the Whole: A Two-Track Approach to Rescuing Social Science Data and Metadata, Washington: Digital Library Federation, June 1999. - This second publication from the Digital Library Federation focuses on how to rescue statistical data from outdated formats and/or systems. This process, called migration by those knowledgeable about digital preservation matters, is complicated and not often attempted (yet). Thus this early report from the front lines of preservation is all the more important. The two-track approach is necessary because not only the data must be rescued, but also the metadata or descriptions of the data, and each requires a different process. - RT

Phillips, Margaret E. "Ensuring Long-Term Access to Online Publications" JEP: The Journal of Electronic Publishing 4(4) (June 1999). - The problem of retaining access to digital material that may exist in only one location -- and in the hands of a commercial enterprise that may go bankrupt any day -- is enough to keep just about any librarian awake at night. But if the National Library of Australia has its way, librarians in Australia may soon be sleeping a bit more soundly. From the NLA viewpoint, there are two distinct processes: archiving (collecting the material to be preserved), and preservation (keeping the material accessible as technology changes). Since it is still anyone's guess how best to handle the latter problem, this article mainly describes how the NLA is dealing with the former issue. Phillips discusses the collecting process (including identification of material and comprehensive vs. selective collecting), metadata management, quality control, access, permanent naming, and costs. - RT

"The ROADS Project Exit Strategy - Ensuring the Future of ROADS for its Users" ROADS Development Newsletter Issue 9 (July 1999) ( - Bringing a project to a close is never an easy task, but in this case at least, it appears that it doesn't have to mean the end of the ROADS. The British Electronic Libraries (eLib) Programme is an ambitious collection of projects that have sought to advance library technology and technique into new areas of digital collections and services. One of the more successful projects is the Resource Organization and Discovery in Subject-based Services (ROADS) effort to create a set of tools for building interoperable subject-based indexes to Internet resources. Their software now serves a number of subject indexes well, and provides a method by which to query these indexes simultaneously. Therefore, the ROADS team is committing to some level of continuing support despite the end of eLib funding. To do this, they are using the Open Source model that has served so many other software development projects well (can you say Linux?). - RT

"Web Search Engines: Precision, Power, and Performance" Online 23(3) (May/June 1999): 20- ( - This special section on web search engines covers many different aspects of these tools, and provides some handy charts detailing their various features. Included are programs you can install on your own server as well as the huge indexes that attempt to comprehensively index the web. Specific topics include results ranking, natural language processing, meta search engines, features and commands, and the future of search engine technology. Some of the articles are available online at the Online web site. - RT

Current Cites 10(7) (July 1999) ISSN: 1060-2356 Copyright © 1999 by the Library, University of California, Berkeley. All rights reserved.

Copying is permitted for noncommercial use by computerized bulletin board/conference systems, individual scholars, and libraries. Libraries are authorized to add the journal to their collections at no cost. This message must appear on copied material. All commercial use requires permission from the editor.

All product names are trademarks or registered trademarks of their respective holders. Mention of a product in this publication does not necessarily imply endorsement of the product.

To subscribe to the Current Cites distribution list, send the message "sub cites [your name]" to, replacing "[your name]" with your name. To unsubscribe, send the message "unsub cites" to the same address.

Editor: Teri Andrews Rinne, (510) 642-8173

Copyright © 1999 UC Regents. All rights reserved.
Document maintained by the SunSITE Manager.
Last update August 9, 1999.