Current Cites

Volume 14, no. 12, December 2003

Edited by Roy Tennant

The Library, University of California, Berkeley, 94720
ISSN: 1060-2356

Contributors: Charles W. Bailey, Jr., Shirl Kennedy, Leo Robert Klein, Roy Tennant

Digital Preservation Management: Implementing Short-Term Strategies for Long-Term Problems   Ithaca, NY: Cornell University, September 2003. - From the same folks who brought us "Moving Theory Into Practice: Digital Imaging Tutorial" comes yet another informative, engaging, and slick presentation of essential information on an important topic. Built to support a workshop of the same name, this online tutorial is well worth the time of anyone interested in digital preservation. Anne Kenney and company clearly know their stuff, and they have applied their award-winning style in presenting a complex mixture of organizational and technical information to great effect. Be sure to check out their "Chamber of Horrors: Obsolete and Endangered Media" and "Timeline: Digital Technology and Preservation", both very useful in their own right. - RT

Digital Library Federation Fall Forum 2003   Washington, DC: Digital Library Federation, November 2003. - A tremendous amount of innovation is going on in libraries these days, the world over. For those of us in the United States, however, one of the best sources for finding out about cutting-edge developments is the twice-yearly DLF Forums. Although only members and invited guests can attend, the rest of us can virtually attend by reviewing the many interesting presentations that are available online shortly after the end of the meeting. I won't attempt to list the topic areas of the presentations, which vary widely, but will leave you with the assertion that if you are interested in digital library issues of any stripe, there is likely something of interest here for you. - RT

It's About Time: Research Challenges in Digital Archiving and Long-Term Preservation   Washington, DC: The National Science Foundation and the Library of Congress, August 2003. - I'm old enough to remember that for a while the preservation of print materials was all the rage. The issue of books crumbling into dust was at the forefront of everyone's awareness within the profession, and at least to some degree, without. Therefore government money to fund print preservation activities was relatively easy to obtain -- particularly for large research libraries. Now, although the print preservation problem has not suddenly disappeared, it is the preservation of digital materials that is all the rage. So it certainly isn't surprising to see this report, which comes out of a workshop co-sponsored by the National Science Foundation and the Library of Congress. If you're involved with digital library research or -- god help you -- in digital preservation itself, this report is essential reading. The rest of us can probably skip it. - RT

"Keeping Found Things Found: Web Tools Don't Always Mesh With How People Work"  Ascribe: The Public Interest Newswire   (17 December 2003) - "People have devised many tricks - such as sending e-mails to themselves or jotting on sticky notes - for keeping track of Web pages, but William Jones and Harry Bruce at the University of Washington's Information School and Susan Dumais of Microsoft Research have found that often people don't use any of them when it comes time to revisit a Web page. Instead, they rely on their ability to find the Web page all over again." Keeping Found Things Found is a National Science Foundation-funded research project ongoing at the University of Washington's Information School that seeks to learn how people actually work with the information they find on the Web. Eventually -- according to this press release, which describes the project -- the researchers hope to develop information-seeking and management tools that are actually useful to end users. A collection of Keeping Found Things Found presentations and papers is available online. - SK

"XML and E-Journals"  OCLC Systems & Services   19(4) (November 2003) - This special issue focuses on the use of XML in electronic journals. Included are articles that review the history of article metadata standards, the history of XML, using XML for journal archiving, and using XML for scientific publishing. I'm not yet convinced that it is feasible to mark up journal articles in XML, at least without the ability of common authoring tools such as Microsoft Word to output an article in a useful XML encoding. From this set of articles, it appears that I'm not the only doubting Thomas, as the editor (Judith Wusteman) of this collection remarks in the introduction that "The granularity with which e-journals should be marked up is debateable and there is more than one approach presented in this special issue". But as Wusteman herself puts it, "The papers in this special issue cover a breadth of opinion but there is a common theme, namely that XML and its related technologies can help to fulfill the promise of e-journals." - RT

Ayati, M. B., and Susan Carol Curzon.  "How to Spot a CIO in Trouble"  EDUCAUSE Quarterly   26(4) (2003):  18-23. - A catalog of "warning signs" that, if left unresolved, mean the head of IT will get the axe. Many of the points will be familiar to anyone who has felt themselves under the tyrannical yoke of an unresponsive Systems operation. Warning signs include "everything is always a crisis with them" and "we can count on them to fail", or my personal favorite, "I have students who are more up-to-date." - LRK

Barton, Mary R., and Julie Harford Walker.  "Building a Business Plan for DSpace, MIT Libraries' Digital Institutional Repository"  Journal of Digital Information   4(2) (2003) - Currently, there is a great deal of interest in institutional repositories, but little is known about their costs. This article outlines MIT's business plan for its well-known DSpace repository. Not considering software development and system implementation costs, the authors conservatively estimate a budget of $285,000 for FY 2003. The bulk of the costs are for staff ($225,000), with smaller allocations for operating expenses ($25,000) and system hardware expansion ($35,000). MIT's DSpace service offerings have two components: core services (basic repository functions) and premium services (e.g., digitization and e-format conversion, metadata support, expanded user storage space, and user alerts and reports). While core services are free, MIT reserves the right to potentially charge for premium services. For further information see: MIT Libraries' DSpace Business Plan Project--Final Report to the Andrew W. Mellon Foundation, which indicates that system development costs "included $1.8 million for development as well as 3 FTE HP staff and approximately $400,000 in system equipment." - CB

Brown, Cecelia, Teri J. Murphy, and Mark Nanny.  "Turning Techno-Savvy into Info-Savvy: Authentically Integrating Information Literacy into the College Curriculum"  Journal of Academic Librarianship   29(6) (November 2003):  386-398. - Information literacy is most successful when it directly relates to the individual information needs of each student. That's the conclusion of a case study presented here looking at the information-seeking behavior of college students majoring in education. Among a number of great points made throughout the article is this gem: "It is no longer effective to provide a laundry list of information resources that librarians believe to be 'good for' students, but rather, instruction must focus on the learning styles and preferences of the target population. Others have also suggested that to successfully foster and promote information literacy librarians must first understand how people learn." - LRK

Kugel, Robert D.  "Unstructured Information Management"  IntelligentBPM   (December 2003) - This white paper, from Ventana Research, offers a lucid explanation of what "unstructured information" actually means, and why it will consume a significant amount of IT resources in the coming years. Structured data is the easily classified stuff -- names, addresses, zip codes, SKU numbers, etc. Unstructured data "does not readily fit into structured databases except as binary large objects (BLOBs)." Examples given include e-mails, multimedia files, document files.... Although these objects may have some structure -- e.g., an e-mail address -- they are not easily classified for storage in a structured format that makes a typical database happy. As the amount of this unstructured data increases exponentially, solutions are being sought; XML is a big help because of its flexible tagging system. If this data cannot be efficiently stored and retrieved, it has little or no utility. The white paper identifies six potential components of a viable storage system: document management, Web content management, records management, digital rights management, collaboration, and image capture. All of these elements are emerging as critical, especially in light of today's more stringent regulatory environment (e.g., Sarbanes-Oxley), which dictates compliance standards for information retention. - SK
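To make the point about XML's flexible tagging concrete, here is a minimal sketch in Python of wrapping a loosely structured e-mail in tags so that individual parts can be retrieved later. The element names ("message", "from", "subject", "body"), the helper name email_to_xml, and the sample message are all illustrative assumptions and do not come from the white paper.

```python
import xml.etree.ElementTree as ET


def email_to_xml(sender: str, subject: str, body: str) -> ET.Element:
    """Wrap the parts of an e-mail in XML elements (hypothetical tag names)."""
    msg = ET.Element("message")
    ET.SubElement(msg, "from").text = sender
    ET.SubElement(msg, "subject").text = subject
    # The body stays free text; only the parts with obvious structure get tags.
    ET.SubElement(msg, "body").text = body
    return msg


# Made-up sample data, for illustration only.
doc = email_to_xml(
    "alice@example.org",
    "Q4 retention policy",
    "Please keep all audit files for seven years.",
)

# Serialize to a string, as one might before storing it.
xml_text = ET.tostring(doc, encoding="unicode")

# Parse it back and pull out a single field by tag name.
parsed = ET.fromstring(xml_text)
print(parsed.findtext("subject"))
```

The body remains unstructured text, but the tagged sender and subject can now be queried without parsing the whole message by hand.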

LeFurgy, William G.  "PDF/A: Developing a File Format for Long-Term Preservation"  RLG DigiNews   7(6) (15 December 2003) - The number of files in Adobe Acrobat format (also known as PDF for Portable Document Format) is astounding. This file format has been embraced by the U.S. Government, journal and book publishers, and indeed just about anyone who wishes to have more control over how something displays on screen than can be attained by HTML. And although PDF is a somewhat open format (with the specification openly published), it nonetheless remains in the control of a commercial company, and therein lies the preservation rub. "Adobe controls its development and is under no obligation to continue publishing the specification for future versions. The format includes some features that are incompatible with preservation purposes," states the author. Therefore, there is a move afoot, which this piece outlines, to specify a stable subset of the PDF format upon which librarians, archivists, and others can rely as a method to preserve digital information over the long haul. Given the number of PDFs that were created while you were reading this, such a development can only be good news. - RT

Margulius, David L.  "Trouble on the Net"  InfoWorld   (24 November 2003) - "The founders of the Internet sought to minimize intelligence at its core and insure end-to-end connectivity. Today, a host of challengers, including commercial interests and security concerns threatens that vision. What can be done?" Some interesting tidbits from this article: 1) The number of "average daily queries" to the Net's DNS services is "up fivefold since 2000." The number doubles every 18 months; 2) "Internet traffic is growing at a faster rate than Moore's Law predicts...."; 3) IPv6, the so-called "next generation Internet," has gotten off to a slow start in the U.S. Says Symantec CTO Rob Clyde, "That whole product upgrade cycle is likely to be very complex. Everything has to be changed. It will probably take the government driving IPv6."; 4) VeriSign has invested more than $100 million in the DNS system and provided "100% availability for six years." Note: Large PDF file -- 5.63MB - SK
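The two DNS figures in the first tidbit ("up fivefold since 2000" and doubling every 18 months) can be checked against each other with a short calculation. The sketch below, in Python, assumes steady exponential growth and an elapsed time of roughly 3.5 years (early 2000 to late 2003); that interval is an assumption made for illustration, not a number from the article.

```python
import math


def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Multiplicative growth after `years` if a quantity doubles every period."""
    return 2 ** (years / doubling_period_years)


def years_to_reach(factor: float, doubling_period_years: float = 1.5) -> float:
    """Years needed to grow by `factor` at the given doubling period."""
    return doubling_period_years * math.log2(factor)


# Assumed elapsed time: about 3.5 years (not stated in the article).
print(round(growth_factor(3.5), 2))   # 5.04
print(round(years_to_reach(5.0), 2))  # 3.48
```

Under that assumption the two claims agree: doubling every 18 months yields about a fivefold increase in roughly three and a half years.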

Orlowski, Andrew.  "A Quantum Theory of Internet Value"  The Register   (18 December 2003) - Google "sucks," according to this IT columnist. This in spite of its impending (as of this writing) rollout of Google Print, which is more or less like Amazon's Search Inside the Book tool. It's not Google's fault that it sucks, the writer says, because Google's "aggressive, but essentially dumb robots" simply cannot "see" most of the Web. The initial promise of the Internet -- that everyman would be easily connected to the entire world of information -- has not been fulfilled. Why? "Information costs money." What a concept! "Taxonomies also have been proved to have value...." Another concept! And, says this columnist, librarians and archivists know this better than anyone. He wonders why no one has seriously looked into "how come our 'Internet' went AWOL, while we weren't looking?" Has it been totally overpowered by garbage and hucksterism? And why haven't such "fads" as portals and blogging been enough to save it? Or maybe the Internet as we perceived it back in 1994 never actually existed. What is important, the author says, are the "information archives" we have now. And if you doubt this, he suggests, "ask a librarian, while you can still find one." - SK

Current Cites - ISSN: 1060-2356
Copyright (c) 2003 by the Regents of the University of California. All rights reserved.

Copying is permitted for noncommercial use by computerized bulletin board/conference systems, individual scholars, and libraries. Libraries are authorized to add the journal to their collections at no cost. This message must appear on copied material. All commercial use requires permission from the editor. All product names are trademarks or registered trademarks of their respective holders. Mention of a product in this publication does not necessarily imply endorsement of the product. To subscribe to the Current Cites distribution list, send the message "sub cites [your name]" to, replacing "[your name]" with your name. To unsubscribe, send the message "unsub cites" to the same address.

Document maintained at by Roy Tennant.
Last update December 19, 2003. SunSITE Manager: