Current Cites (Digital Library SunSITE)

Volume 10, no. 11, November 1999

Edited by Teri Andrews Rinne

The Library, University of California, Berkeley, 94720
ISSN: 1060-2356 - http://sunsite.berkeley.edu/CurrentCites/1999/cc99.10.10.html

Contributors: Terry Huwe, Michael Levy, Leslie Myrick, Margaret Phillips, Jim Ronningen, Lisa Rowlison, Roy Tennant, Lisa Yesson

Carnevale, Dan. "Web Services Help Professors Detect Plagiarism" The Chronicle of Higher Education (http://www.chronicle.com/free/v46/i12/12a04901.htm) - The Web has brought a double-edged sword into conventional and distance-education classrooms alike: easy access to digital information can mean increased access to plagiarizable information, whether in the form of online encyclopedia articles or from the growing online term-paper market. Moreover, "copying" bits of somebody else's work is now only as arduous as cutting and pasting text. Ironically, the same nexus of search engines that students use to find articles online can be tapped by instructors to sniff out those "hauntingly familiar" or "overly ornate" passages. But while entering the offending phrases into a text-rich search engine is infinitely easier than a trip to a bookstore or library to pore through Cliff's Notes or the Encyclopedia Britannica, most instructors don't have the time to surf for purloined bits. Enter web entrepreneurship in the shape of companies such as Plagiarism.org or IntegriGuard.com, which maintain databases of papers culled from various sources; the former also offers to send papers through a gauntlet of multiple search engines. Plagiarism.org's resulting originality report highlights suspect passages of eight words or more and provides a link to the web text each one matches. In the manner of a badly concealed speed trap, prevention may lie at least partially in the fact that professors openly register their students, and in some cases students upload their own papers for scrutiny. Astonishingly, however, despite fair warning, in one early case study in a class held at UC Berkeley, some 45 papers out of a total of 320 were found to contain "suspicious passages". - LM
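
The eight-word matching described above can be illustrated with a minimal sketch. This is not Plagiarism.org's actual method, which the article does not detail; the function names `words` and `shared_runs` are hypothetical, and the sketch simply assumes that a "suspect passage" is any run of eight consecutive words appearing in both a student paper and a known source.

```python
import re

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shared_runs(paper, source, n=8):
    """Return each n-word run from `paper` that also occurs in `source`.

    Assumption: an exact match on n consecutive words (n=8, after the
    eight-word threshold mentioned in the review) marks a passage as suspect.
    """
    src = words(source)
    source_runs = {tuple(src[i:i + n]) for i in range(len(src) - n + 1)}

    paper_words = words(paper)
    seen = set()
    hits = []
    for i in range(len(paper_words) - n + 1):
        run = tuple(paper_words[i:i + n])
        if run in source_runs and run not in seen:
            seen.add(run)
            hits.append(" ".join(run))
    return hits
```

A real service would also need to cope with paraphrase, reordered words, and a very large database of sources, none of which this sketch attempts.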

Coombs, Norman. "Enabling Technologies: New Patrons: New Challenges" Library Hi Tech 17(2) (1999): 207-210. - In his regular column on enabling technologies for the "print disabled" - those who are dyslexic and those who cannot hold and manipulate books - Coombs aims to highlight the hardware and software tools that libraries need to utilize in order to make electronic resources accessible to the widest possible range of users. His aim is to "persuade librarians that taking on this new task will be a challenge and opportunity rather than another burden." As a blind professor, Coombs traces his own experience from his initial work with a speech synthesizer to access an online catalog through to the capability to read web documents. In particular he discusses IBM's most recent web browser for special needs patrons, called Home Page Reader Version 2.0. Using the numeric keypad and a combination of other individual keys, the user can send commands to the program. What makes HPR more useful than simple screen readers is that it allows for comprehensive HTML handling and navigation, so that it will deal with frames, tables, forms, lists and menus. Unlike regular screen readers it actually examines the HTML code itself, but unfortunately it does not handle Java. In this informative review Coombs has effectively highlighted some issues that should be of concern to all librarians. - ML

Ganesan, Ravi. "The Messyware Advantage" Communications of the ACM 42(11) (November 1999) - Librarians and other information organizers, take heart - we're messyware and we're indispensable. Playing devil's advocate, the author starts by describing the Internet commerce scenario which so many digital pundits espoused not long ago: a direct link between producer and consumer, with the hated middleman eliminated. In questioning why the opposite seems to be happening when we place a high value on a new kind of dot-com middleman such as Amazon or Yahoo, he introduces his concept of messyware, which he describes as "the sum of the institutional subject area knowledge, experienced human capital, core business practices, service, quality focus and IT assets required to run any business." Why the term "messyware"? While a software solution may be all you need when systems are running perfectly, real life tends to get messy. (The photographs accompanying the text get this point across admirably. They depict people on a rainy streetcorner buying cheap umbrellas from a roving umbrella salesman. Thanks to this middleman, they are getting exactly what they need, when and where they need it, and would certainly not benefit by cutting out the middleman and going directly to the source.) Ganesan, bless him, uses libraries as an example of the value of expert intermediation which can deal with the infomess. His primary focus is on business, but there is plenty to ponder here for all information professionals, including strategic pointers for leveraging the messyware advantage. This article is just one of many fascinating pieces on information discovery, the issue's special theme. - JR

Jones, Michael L. W., Geri K. Gay, and Robert H. Rieger. "Project Soup: Comparing Evaluations of Digital Collection Efforts" D-Lib Magazine (November 1999) (http://www.dlib.org/dlib/november99/11jones.html). - The Human Computer Interaction Group at Cornell University has been evaluating particular digital library and museum projects since 1995. In this article they discuss their findings related to five projects (three museum and two library). Their conclusions include: Effective digital collections are complex sociotechnical systems; Involve stakeholders early; Backstage, content and usability issues are highly interdependent; Background issues should be "translucent" vs. transparent; Determine collection organization, copyright, and quantity goals around social, not technical or political, criteria; Design around moderate but increasing levels of hardware and user expertise; "Market" the collection to intended and potential user groups; and, Look elsewhere for new directions. - RT

Lewis, Peter H. "Picking the Right Data Superhighway" New York Times (http://www.nytimes.com/library/tech/99/11/circuits/articles/11band.html) - For surfers seeking that tubular high-bandwidth download, there is now more than one wave to catch (depending, of course, on availability), each with its own advantages and pitfalls. This article examines three modes of high-bandwidth Internet service: cable modem, DSL and satellite data services. Lewis was in the lucky position (Austin, TX; expense account) to test all three, using as his criteria speed, performance, price, security, and choice of ISP services. His assessment (your results may vary): while any of the three is preferable to an analog modem insofar as the connection is always on, satellite data services can be easily factored out for all but the most remotely situated users due to huge financial outlays, from hardware to installation to monthly fees and possible phone charges to distant ISPs. Speed is also an issue, at a "measly" 400 kbps. Cable modems, while they offer theoretically the speediest of connections (30 Mbps possible), suffer from "Jekyll-and-Hyde"-like swings in performance, since cable is a shared resource. The more neighbors to whom you gloat over your wealth of bandwidth, the worse it will become. A more likely figure is 1 Mbps. You may also find you have security concerns. DSL, on the other hand, has a dedicated line, so there are no security problems. But it is hands down the costlier alternative. Moreover, outside of a radius of 17,500 feet from the phone company's central office (or a little over 3 miles), performance suffers significantly, unless you are willing to pay extravagant sums. Data is loaded at somewhat slower speeds than cable's best numbers: download can run from 384 kbps to 1.5 Mbps, with upload consistently logy at 128 kbps. All these considerations aside, Lewis goes with DSL. The deciding factor is often in the details: having to deal with the telephone company vs. the cable company, the choice of ISPs (in the case of cable modems, practically nonexistent), and so on. - LM
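
To put the quoted speeds in perspective, here is a rough back-of-the-envelope sketch. The rates are the figures cited in the review; the 10-megabyte file size, the decimal units (1 MB = 8,000,000 bits, 1 kbps = 1,000 bits per second), and the assumption of zero protocol overhead are illustrative choices, not anything from the article.

```python
# Rates in kbps, as quoted in the review above.
RATES_KBPS = {
    "satellite": 400,
    "cable (theoretical best)": 30_000,
    "cable (more likely)": 1_000,
    "DSL download (low end)": 384,
    "DSL download (high end)": 1_500,
    "DSL upload": 128,
}

def transfer_seconds(size_megabytes, rate_kbps):
    """Idealized transfer time in seconds, ignoring overhead and congestion."""
    bits = size_megabytes * 8_000_000       # assumed: decimal megabytes
    return bits / (rate_kbps * 1_000)       # assumed: 1 kbps = 1,000 bits/s

if __name__ == "__main__":
    size_mb = 10  # hypothetical file size chosen for illustration
    for service, rate in RATES_KBPS.items():
        print(f"{service:26s} {transfer_seconds(size_mb, rate):8.1f} s")
```

Under these assumptions the same file takes 200 seconds by satellite, 80 seconds on a cable modem running at the "more likely" 1 Mbps, and 625 seconds to upload over DSL.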

Malik, Om. "How Google is That?" Forbes Magazine (http://www.forbes.com/tool/html/99/oct/1004/feat.htm); Walker, Leslie. ".COM-LIVE" (The Washington Post Interview with Sergey Brin, founder and CEO of Google) (http://www.washingtonpost.com/wp-srv/liveonline/business/walker/walker110499.htm) - For those users of the recently launched search engine Google (http://www.google.com/) who have consistently found its searching and ranking facilities spot on, and wondered, "How do they DO that?", two recent articles offer some answers; but the algorithm remains a mystery. With the backing of the two biggest venture capital firms in Silicon Valley, and a PC farm of 2000 computers, another boy-wonder team out of Stanford has revolutionized indexing and searching the Web. The results have been so satisfying that Google processes some 4 million queries a day. Google, whose name is based on a whimsical variant of googol, i.e. a 1 followed by 100 zeroes, claims to be one of the few search engines poised to handle the googolous volume of the Web, estimated to be increasing by 1.5 million new pages daily. It uses a patented search algorithm (PageRank technology) based not on keywords, but on hypertext and link analysis. Critics describe the ranking system as "a popularity contest"; the Google help page prefers to characterize it in terms of democratic "vote-casting" by one page for another (well, some votes "count more" than others ...). Basically, sites are ranked according to the number and importance of the pages that link to them. In a typical crawl, according to Brin, Google reads 200 million webpages and factors in 3 billion links. Google is decidedly NOT a portal: when it came out of beta in late September, the only substantive change made to the fast-loading white page inscribed with the company name and a single query textbox was a polished new logo. A helpful newish feature is Googlescout, which offers links to information related to any given search result. There are also specialized databases of US government and Linux resources. It appears that the refreshing lack of advertising on its search page will not last forever: in the works is a text-based (rather than banner-based) "context-sensitive" advertising scheme, generated dynamically from any given query. - LM
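
The "vote-casting" idea, in which a link from an important page counts for more than a link from an obscure one, can be sketched with a simplified textbook-style iteration. This is only an illustration and not Google's actual implementation, which the articles say remains a mystery; the damping value of 0.85, the iteration count, and the even redistribution of rank from pages with no outgoing links are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified link-based ranking.

    `links` maps each page to the list of pages it links to.
    Every page splits its current score evenly among the pages it
    links to, so a "vote" from a high-scoring page counts for more.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumed handling: a page with no outgoing links
                # spreads its score evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a toy web where pages a, b and d all link to c, and c links only to a, page c ranks highest; page a then outranks b and d despite having a single inbound link, because that link comes from the most important page.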

Miller, Robert. "Cite-Seeing in New Jersey" American Libraries 30(10) (November 1999): 54-57. - Tracking down fragmentary citations or hard-to-locate material is a classic library service. But in this piece Miller highlights how the tools for performing this service have changed. Classic citation-tracking resources are still used, but now the Web can be used as well. A few interesting anecdotes illustrate how a little imagination, experience, and perseverance can make the Internet cough up the answer when the usual resources fail. Miller illustrates how the best librarians are those who can absorb new tools into their workflow as they become available, and therefore become more effective at their job. - RT

netConnect. Supplement to Library Journal October 15, 1999. - This very slim but incredibly pithy supplement to LJ is modestly subtitled "The Librarian's Link to the Internet". I doubt anyone needs this publication to get online, but the point is taken. It is aimed at bringing focused information regarding the Internet to LJ's audience. And if this first issue is any indication, they will be successful doing it. Contributions to this issue include Clifford Lynch on e-books (an absolute must-read for anyone interested in this technology), a couple of pieces by Sara Weissman, co-moderator of the PubLib discussion, an article on net laws from an attorney at the Missouri Attorney General's Office, a practical article on creating low-bandwidth web images without sacrificing quantity and quality, and an article on Web-based multimedia from Pat Ensor, among others. This is a solid publication that I cannot wait to see again. Disclosure statement: I am a Library Journal columnist. - RT

Pitti, Daniel. "Encoded Archival Description: An Introduction and Overview" D-Lib Magazine (November 1999) (http://www.dlib.org/dlib/november99/11pitti.html). - Encoded Archival Description (EAD) is a draft standard SGML/XML Document Type Definition (DTD) for online archival finding aids. In this overview article, the father of EAD explains what it is, why it exists, and what future developments may lie in store. - RT

Planning Digital Projects for Historical Collections in New York State. New York: New York State Library, New York Public Library, 1999 (http://digital.nypl.org/brochure/). - This brochure serves as a useful high-level introduction to digitizing historical collections. Following a brief history of New York Public Library's digitization projects, it dives into the heart of the matter -- planning a digitization project. Main sections include: What does a digital project involve?; Why undertake a digital project?; How to plan for digital projects; How to select collections and materials for a digital project; How to organize information; and, How to deliver materials effectively. A brief list of resources is also included. Before getting started in such a project you will need to do much more reading than this, but it nonetheless is a useful place to start -- in either its print or web format. - RT

Seadle, Michael. "Copyright in the Networked World: Email Attachments" Library Hi Tech 17(2) (1999): 217-221. - Seadle takes two commonplace uses of copying and evaluates whether they are legally acceptable in a digital environment. He gives a brief overview of the four-factor test for determining "fair use" before discussing the specific cases. The first case is that of a faculty member distributing via email an article from the online interactive edition of the Wall Street Journal to his entire class. He had previously done similar things with the print version of the Journal and felt that this new use was still fair use. Unfortunately it would appear that the ability to make a full and perfect reproduction of a digital document destroys any barriers to further copying by students and would invalidate a fair use justification of this practice. In the second scenario a reference librarian emails a patron a list of citations and full-text articles from the FirstSearch database. The librarian decided that if she deleted her copy of the downloaded documents, the end user would be complying with specific language in the database allowing for the downloading and storing of documents for no more than 90 days. The differences are that the librarian is sending the information to one person and not to a class, and that the patron could have found the articles himself. So in essence the library was making an allowable copy for the user. Seadle admits that his arguments are not conclusive or exhaustive, but he clearly outlines two interesting yet commonplace copyright situations facing librarians and faculty. - ML

"Tomorrow's Internet" The Economist 353 (8145) (November 13, 1999): 23 (http://www.economist.com/editorial/freeforall/19991113/index_sa0324.html). - The cover story of this issue of The Economist focuses on the aftermath of the now-notorious "findings of fact" in the Microsoft antitrust case. This related article describes in detail the emerging, network-intensive style of computing that may reduce or eliminate the need for costly operating systems like Windows. Look no further for a balanced treatment of the forces behind "open system" computing, "thin clients", netcomputers and the like. As with all their technology reporting, the editors rely on plain English and disdain technobabble. - TH


Current Cites 10(11) (November 1999) ISSN: 1060-2356
Copyright © 1999 by the Library, University of California, Berkeley. All rights reserved.

Copying is permitted for noncommercial use by computerized bulletin board/conference systems, individual scholars, and libraries. Libraries are authorized to add the journal to their collections at no cost. This message must appear on copied material. All commercial use requires permission from the editor. All product names are trademarks or registered trademarks of their respective holders. Mention of a product in this publication does not necessarily imply endorsement of the product. To subscribe to the Current Cites distribution list, send the message "sub cites [your name]" to listserv@library.berkeley.edu, replacing "[your name]" with your name. To unsubscribe, send the message "unsub cites" to the same address. Editor: Teri Andrews Rinne, trinne@library.berkeley.edu.

Copyright © 1999 UC Regents. All rights reserved.
Document maintained at http://sunsite.berkeley.edu/CurrentCites/1999/cc99.10.11.html by the SunSITE Manager.
Last update December 1, 1999. SunSITE Manager: manager@sunsite.berkeley.edu