TECH NEWS--Building User-Friendly Web Sites--Part 1: Site Structure
NEW TECH NEWS: Spam and the E-Mail Transport Process
In early July, federal librarians and information center staff members attended FLICC's first Acquisitions Institute. The four-day course addressed the rules of government contracting, evaluation of vendor services and products, an overview of ordering sources, and ways federal librarians can develop collections to support the missions of their agencies.
Each day focused on one aspect of the acquisition process, including serials, electronic information products, and hard-to-find materials such as gray literature and foreign publications.
Exercises and quizzes helped participants learn the finer details of ordering and contracting. Breakout and Q&A sessions covered acquisitions questions with contracting specialists, vendor representatives, and acquisitions librarians from the public, academic, and federal sectors. A manual, Managing Acquisitions and Vendor Relations, by Heather S. Miller, was included in the course resource materials.
Audrey Eaglen, Acquisitions/Order Department Manager, Cuyahoga County Public Library, Ohio, and author of Buying Books, opened the institute by describing her long career acquiring books for libraries and the methodology of monographic procurement. She pointed out that there really is a partnership among librarians, book vendors, and acquisitions staff.
"If you are going to be any kind of a library at all, your library is only as good as your acquisitions person," Eaglen said. "Without a collection, you can't have much of a library."
Meg Williams, FEDLINK Network Program Specialist, then addressed acquisitions processes in the context of government contracts. She outlined the time and budget constraints that shape the federal procurement process, as well as its various payment processes. Williams also sketched out the roles and responsibilities of the parties involved in acquiring library materials: the librarian, management, counsel, and financial and contracting officers. Together, she suggested, they are responsible for ensuring that the agency has the materials it needs for its mission, taxpayer funds are used responsibly, and companies have the chance to compete for government business.
The first day's panel was titled "Getting What You Pay For: Monitoring Your Book Prices And Vendor Service." Panelists included David Pachter, FEDLINK Network Program Specialist; Janet Scheitle, Director, US Army TRALINET; and Cicely Marks, Director of Government Sales, American Overseas Book Company.
Scheitle stressed the relationship between the acquisitions librarian and the book vendor. "The librarian needs to keep excellent records, verify materials ordered, and be involved with all transactions," said Scheitle, "while the vendor must respond quickly to requests and problems from customers and maintain steady pricing and delivery."
Marks discussed factors that librarians use to evaluate book vendors. "Participants should use both quantitative and qualitative factors when selecting a jobber--location, familiarity, reputation, fulfillment rate, delivery times, discounts, staff experience, customer service, online accessibility, return policies, special services, and special stock all play a part," said Marks.
Marcia Tuttle, Head of the Serials Department of the University of North Carolina at Chapel Hill, described the serials industry and strategies for serials control. Williams then outlined the serials competition process.
"The only constant in working with serials is change. Titles, scope, formats, sizes, and costs are all variable, and librarians must constantly double-check serials specs," Tuttle said. "Implications also arise from the fact that serials are continuous. Libraries must commit budget, space, preservation and binding services, and must find a reliable subscription agent to prevent lapses in service," said Tuttle. She advised attendees not to be surprised by anything that happens. "Working with serials is rarely boring," she said.
Lynn McDonald, FEDLINK OCLC Program Coordinator, moderated the afternoon panel, "Issues with Issues--Working With Your Serials Agent," which featured Susan Dyer, Federal Government General Manager, EBSCO, and Carolyn L. Helmetsie, Serials Librarian, NASA-Langley Research Center (LaRC).
Dyer offered tips for preparing a good solicitation from the vendor's point of view. Librarians should be careful to specify their technical requirements, which pricing option they prefer, when subscriptions should begin, and which format they prefer. She advised attendees to allow 30 days for the vendor to properly prepare a response, and warned them that asking vendors to submit their bid on a particular form slows down the bidding process. Dyer also reminded them to factor vendor response time into their ordering schedule; ideally, an award for subscriptions needs to be made at least six months in advance.
Helmetsie described her experience with serials acquisitions for LaRC. Langley has 800 current subscriptions, and offers full-text electronic journals. She outlined her "lessons learned":
Helmetsie also recommended that librarians carefully define the services of vendors, learn how to read and understand serials contracts, research vendor business practices, and perform self-evaluations in conjunction with vendor evaluations.
The next day, Tuttle outlined the universe of electronic publications (e-pubs), and discussed ways that librarians might evaluate and select them. Williams discussed the management of e-pubs in the context of a library's collection.
Williams underscored that storage and maintenance are central questions in the development of an e-pub collection. "Contracts are also a major concern, as different electronic publishers want to negotiate different terms for use and reproduction of their products," said Williams. She also suggested agency employees must be notified of new e-pub acquisitions. "Librarians must also decide how to train end-users to use these products, and to evaluate whether electronic resources are appropriate for their research," said Williams.
A brainstorming session revealed that various libraries had developed different methods for managing e-pubs. Solo libraries depended on support from their Information Technology (IT) office to post and maintain e-pubs, while medium-sized libraries had an IT contractor on staff or on call. Although there is an emphasis on making e-pubs available from agency employees' desktops, patrons still seem to come to the library to use the print materials as a break from the office.
Promotional strategies included introducing new employees to electronic collections, hosting special orientations and informal briefings, including materials in the library's Online Public Access Catalog (OPAC), setting up "agency information resources" classes, advertising publications on the agency's Web site and in its publications, creating a library newsletter, and sending out printed or e-mail alerts. Participants also suggested assigning "outreach librarians" to work with research teams and other groups of users.
Suggested training strategies included arranging private training for senior management, offering face-to-face training with walk-in patrons, enlisting vendors to train patrons or librarians, deputizing knowledgeable end-users, contracting with a training company, developing online instructional materials, or attending FEDLINK classes.
Williams discussed licensing issues at greater length. She recommended that librarians anticipate the vendor's requirements and possible situations that might arise before agreeing to contract terms. Enlisting help from the agency's IT unit, general counsel, and contracting office in negotiating contracts will help ward off licensing disputes. Many e-pub usage contracts are based on software licenses; this may not be appropriate for the federal library context.
Librarians should ask themselves:
They should also ask the IT unit:
Additional guidelines for evaluating licenses are available at the Web site of the Office of General Counsel, University of Texas System (http://www.utsystem.edu/ogc/intellectualproperty/justsign.htm).
The institute's final day addressed sources for hard-to-find material.
Ned Kraft, Librarian, Acquisitions Services, Smithsonian Institution Libraries, offered this advice for obtaining "gray" literature:
Beth Root, Librarian, US International Trade Commission, suggested vendors and strategies for obtaining foreign materials. She reminded attendees that shipments from abroad must go through customs, which may mean making trips to the airport. She urged librarians to ask staff members who are going abroad to purchase hard-to-find items. Root also advised that some pleading might be necessary to convince overseas publishers to ship materials.
Caroline Early, Head, Acquisitions and Serials Branch, National Agricultural Library (NAL), spoke about exchange and gift procedures. "To make the best use of exchange and gift opportunities," Early said, "librarians must know their collection's strengths and weaknesses." Once librarians have identified areas which need more titles, they may contact special interest groups to request gift copies of publications. Early warned, however, not to depend on gift publications from only one source to build particular collections.
Sheila McGarr, Chief, Depository Services, Library Program Services, Government Printing Office (GPO), provided an overview of GPO's offerings. She also covered ordering procedures, and explained how librarians might use the GPO Access Web site (http://www.access.gpo.gov/su_docs/index.html). Carol Ramkey, Branch Chief, Reference Services, Defense Technical Information Center (DTIC), also described the center's information products and explained how to order them. For more information, see the DTIC Web site (http://www.dtic.mil/).
Pachter then covered Web-based tools for acquisitions research. (For a selection of his bookmarks see pages 12 and 13. A complete list will be available at the FLICC Web site this fall: http://www.loc.gov/flicc.)
Final sessions addressed working with users, and managing cooperative projects.
Ellen Swain, Manager of Research Services, US General Accounting Office (GAO) library, recommended using online tools to identify and evaluate new information resources in a variety of formats, to elicit feedback from users, and to generate lending and access statistics which help to refine the collection. She advised that librarians be both "high tech" and "high touch." For example, the GAO Intranet includes a form which allows staff members to recommend library purchases, but library staff members also act as liaisons to different GAO divisions in order to develop one-on-one relationships with staff members.
Suzanne Grefsheim, Chief, Library Branch NCRR, National Institutes of Health, spoke about a survey distributed by the NIH library to measure levels of interest in particular journals. The survey listed each journal title, and then asked users to indicate
Because medical and health journals are expensive and highly specialized, Grefsheim explained, it is important to weed out little-used titles.
Sarah Mikel, Library Director, National Defense University, and Doria Grimes, Library Contract Manager, NOAA, both spoke briefly about cooperative projects. Mikel described the Military Education Coordinating Committee (MECC) electronic library, which allows for resource sharing among intermediate and senior level service schools. Grimes encouraged attendees to use the library's physical facilities to host meetings, briefings, and public awareness events about topics of concern to agency staff and members of the public.
Videotapes of the Acquisitions Institute will be available through the National Library of Education in late autumn.
A moderated listserv for acquisitions librarians and others interested in acquisitions work.
Against The Grain ELECTRONIC
Acquisitions information linking publishers, book jobbers, subscription agents, and librarians.
American Library Association (ALA)
Statement on collection development from ALA's Association for Library Collections and Technical Services (ALCTS).
Book information on the WWW--their motto is "The first place to look for book information on the WEB."
Directories of Electronic Journals
Directory of electronic journals, journal collections, e-text collections, and other electronic sources of full text documents.
Federal Depository Library Program Administration
News, information, and communication for and about the Federal Depository Library Program.
Mega-information site for U.S. Federal Government information.
Search engine for .gov materials (discontinued).
Government Information Locator Service (GILS)
Identifies and describes US government information resources.
GreyNet Grey Literature Network Service
For authors, researchers, and intermediaries looking for grey literature.
International Resources--Government Publications
International governmental information from the Web.
Internet Library for Librarians
A comprehensive Web database for librarians.
U.S. Corporate, Industrial, and Economic Information
U.S. corporate, industrial, and economic information from the University of North Carolina at Charlotte.
For copies of Etext editions of materials from the public domain.
THOMAS--U.S. Congress on the Internet
Database of current Congressional information, keyword searchable.
U.S. Government Printing Office
Complete information center for all programs of the Government Printing Office.
WWW Resources For Law Librarians
Web acquisitions and collection development resources for law librarians.
TABLE OF CONTENTS
By Jessica Clark
The Web's initial growth spurt has ended, and with it, the heyday of the simple "Web presence." Users have begun to expect sophisticated, easy-to-use Web sites which quickly meet their research needs.
Like libraries, Web sites which attract, hold, and serve users well require planning, careful design, usability testing, and regular maintenance. Librarians are trained to organize information--so be wary of surrendering the construction of your Web site to your department's HTML guru. Instead, determine how a Web site can best enhance the library's collections and services and then marshal the technical knowledge or staff to build it.
Start by reviewing what you already know about your library or information center--your customers and materials. Think about the kinds of electronic resources, such as CD-ROMs and databases, that your library or information center could make available via the Web. Consider developing a list of Web sites that address your agency's areas of interest.
Write a statement of purpose for your Web site and outline the information you will include. Avoid structuring your Web site around your organizational chart. Instead, structure it around users' needs. It may help to sketch out user scenarios--hypothetical searches that you, agency staff, or members of the public might perform. Poll your staff and regular users to determine the questions that frequently crop up.
Once you have identified the information and services you would like to present online, arrange them into a hierarchy of importance and generality. Then, separate the list into discrete categories. For example, a Web site might include information about the library's location and hours of operation, a link to your online catalog, a list of electronic resources, and a description of your special collections.
Sketch a flow chart of the information you have decided to include. Each major section of your site should be accessible from your home page. The chart will help you visualize how different categories might be interrelated.
You might begin by linking particular online resources to related special collections, the list of special collections to your online catalog, and so forth. Over time, real-life scenarios and user responses will help you fully develop your site's internal links.
The most common Web site structure is the sequence. Many sites list and link items in a logical series--for example, the NASA Center Libraries home page (http://library.gsfc.nasa.gov/GSFCHome.htm) lists different research centers alphabetically. The planning process outlined above will produce a sequence of topics arranged in order of importance to library patrons.
Sequential structures are simple to use and understand, and work well for training purposes. They may not, however, provide the most direct route to crucial resources. If your site is in sequential or menu format, consider shortcuts to key sections, or encourage users to bookmark them.
A table scheme organizes information about similar items. For example, the Library of Congress American Memory site (http://www.loc.gov/ammem/) uses tables to present lists of materials related by exhibition, keyword, subject, or medium. This structure is best for users who want to compare or locate particular pieces of information.
Webs, or interconnected pages, work best for audiences seeking knowledge rather than instruction or specific information. For example, a Web site about preservation might contain many interlinked pages covering different topics--water damage, fire damage, mildew, bindings, archival paper, etc. A user could start by reading about one topic, and then move to the next according to his or her interests.
All home pages should contain information about the library's mission and users, contact and location information, and links to all of the site's different areas. A home page should be like a resume--neat, professional, well-organized, and easy to scan for pertinent information.
Design for both novice and experienced users by including more than one navigational path--for example, provide a map or diagram of the site, or a search engine, as well as menus. The National Library of Medicine's site (http://www.nlm.nih.gov/) provides all three. Compaq's site (http://www.compaq.com/siteguide/index.html) offers an interesting way to display a site map at a glance.
When it comes to the number of pages, sites should neither be too deep nor too shallow. Studies show that users prefer menus which contain at least five to seven links. Minimize the number of steps through a hierarchy, but beware of dead-end pages. All pages should link back to the home page.
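The depth and dead-end checks described above can be sketched in a few lines of Python. This is only an illustration: the `SITE` dictionary, its page names, and both function names are hypothetical, and a real site map would be drawn from your own flow chart.

```python
from collections import deque

# Hypothetical site map: each page is listed with the pages it links to.
SITE = {
    "home": ["hours", "catalog", "resources", "collections"],
    "hours": ["home"],
    "catalog": ["home"],
    "resources": ["home", "databases"],
    "collections": ["home"],
    "databases": [],  # a dead end: no links at all
}

def page_depths(site, home="home"):
    """Breadth-first search giving the number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def pages_missing_home_link(site, home="home"):
    """Pages other than the home page that do not link back to it."""
    return sorted(p for p, links in site.items() if p != home and home not in links)

depths = page_depths(SITE)
print(depths)
print(pages_missing_home_link(SITE))
```

In this made-up example the "databases" page sits two clicks from the home page and is reported as lacking a link back to it.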
Make menu and other selections consistent throughout your site, so users can navigate through different sections without becoming disoriented. If your material is sequential, provide users with "Back one page" and "Forward one page" links, so that they will be able to follow along if they start in the middle. Alternatively, provide links to a higher menu which will give them more context.
Avoid using the "FRAME" tags to create scrolling menus or content windows. Frames are difficult to navigate (users can't use the Back button to return to a previous frame, which is frustrating), impossible to bookmark, confusing to print, and may not work on some browsers. Questions have also been raised about the copyright implications of "framing" other sites. Simulate the appearance of frames by creating a colored side or top panel for each page with the "TABLE" tags.
Alert users to new material by inserting a "New" graphic next to updated items.
The first four inches of any page are crucial. Jakob Nielsen, an engineer at Sun Microsystems and author of several books on interface usability (http://www.sun.com/columns/jakob), cautions that only 10 percent of users scroll beyond the information visible at the top of a screen. Do not make pages too wide, either--a 640 by 480 pixel screen area is standard.
Users find scrolling disorienting. Nielsen recommends that site creators write no more than 50 percent of the text that would appear in a hard copy publication. In general, keep Web pages under three printed pages in depth. Longer pages download slowly and are difficult to read online.
Long documents may be broken into multi-page presentations, with a linked table of contents and interlinked pages. Many readers, however, print documents out and read them offline. It is a good idea to provide "printing" versions of such documents. Let users know how many pages the document will run so they can decide whether to print it out.
Web pages should be free-standing in case a user happens upon them independently. Each page should contain the organization name, the topic of the page, the site's main URL and the date the page was last updated. Create a standard header and footer which contain this information; this will help users to recognize pages from your site.
Also assign each page a clear title, as this will become the text that appears in a user's bookmark listing. It pays to be specific--the title "catalog" wouldn't be very helpful, but "Agency X Library's Online Catalog" would.
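One way to keep titles, headers, and footers consistent is to generate every page from a single template. The Python sketch below is a minimal illustration under stated assumptions: the `build_page` function, the example URL, and the layout of the footer line are all hypothetical, not a prescribed format.

```python
from datetime import date
from html import escape

def build_page(org, topic, site_url, updated, body_html):
    """Wrap body_html in a standard title, header, and footer."""
    title = escape(f"{org}: {topic}")
    footer = f"{escape(org)} | {escape(site_url)} | Last updated {updated.isoformat()}"
    return "\n".join([
        "<HTML>",
        f"<HEAD><TITLE>{title}</TITLE></HEAD>",
        "<BODY>",
        f"<H1>{title}</H1>",   # header: organization name and page topic
        body_html,
        "<HR>",
        f"<P>{footer}</P>",    # footer: organization, main URL, update date
        "</BODY>",
        "</HTML>",
    ])

# Hypothetical values for illustration only.
page = build_page(
    "Agency X Library",
    "Online Catalog",
    "http://www.example.gov/library/",
    date(1997, 9, 1),
    "<P>Search the catalog.</P>",
)
print(page)
```

Because every page passes through the same function, a change to the header or footer needs to be made only once.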
Long lines of text are difficult to read. You may want to bracket text in "BLOCKQUOTE" tags to create margins on your page, or use "TABLE" tags to set up columns. Repeating the same column structure from page to page will also help you to establish continuity between different sections of your site and create a consistent "look and feel."
Include an "ALT" description in the "IMG" tag for each graphic which contains text or needs to be described. This helps patrons who use text-based browsers, who have turned their graphics off, or who are visually impaired and using screen reading software. If you use an imagemap, include text links as well. You can use brackets to set off each link and create a button effect.
Do not include "ALT" tags for bullets, lines, or other design elements which don't communicate information. To test your "ALT" tags, look at the site with the graphics turned off in your browser.
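Besides viewing the site with graphics turned off, a short script can list the images that lack an "ALT" description. The sketch below uses Python's standard `html.parser` module; the `AltChecker` class, the `images_missing_alt` function, the `decorative` parameter, and the sample file names are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser

class AltChecker(HTMLParser):
    """Collect the SRC of every IMG tag that has no ALT attribute."""

    def __init__(self, decorative=()):
        super().__init__()
        self.decorative = set(decorative)  # bullets, lines, and similar images
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes and src not in self.decorative:
            self.missing.append(src)

def images_missing_alt(html_text, decorative=()):
    checker = AltChecker(decorative)
    checker.feed(html_text)
    return checker.missing

SAMPLE = """
<IMG SRC="logo.gif" ALT="Agency X Library">
<IMG SRC="hours.gif">
<IMG SRC="bullet.gif">
"""
print(images_missing_alt(SAMPLE, decorative=["bullet.gif"]))
```

Images named in `decorative` are skipped, in keeping with the advice above not to describe purely decorative design elements.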
Many of the same principles used in publications design are appropriate for Web design. Editorial standards should be applied--if your site contains grammatical errors or erratic formatting, users are less likely to take it seriously.
Ask for user feedback during the initial design phase. Nielsen describes the process of designing and testing the Sun Microsystems site at http://www.sun.com/sun-on-net/performance/book2ref.html.
Most usability testing is performed in a lab, where designers record how users navigate through a site. Members of the Sun design team, however, also performed less elaborate tests. They solicited feedback on printouts of their prototypes, asking users to predict the results of clicking on different page elements. They also wrote different aspects of the site's contents on index cards and asked users to do concept mapping by sorting the cards into piles by similarity and then naming the categories they had created. Such testing is inexpensive, and allows you to double-check the logic of your site structure.
Before uploading your pages to the server, try selecting different browser font sizes and printing them again. Text and images may realign if you have used formatting tricks to place them on the page. If possible, try opening your pages in several browsers, or from several different platforms. You may be surprised by the very different results that are produced by the same HTML code.
Once the site is up and running, provide a "comments" section to encourage continued user feedback. Learn how to run site statistics, and compare usage reports for different sections. If certain parts of your site are not visited, you may need to rethink or eliminate them.
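Comparing usage by section can be as simple as counting requests per top-level directory. The Python sketch below assumes you have already extracted a list of requested paths from your server's usage report; the section names, sample paths, and function names are hypothetical.

```python
from collections import Counter

def section_of(path):
    """Return the top-level directory of a path, or 'home' for root-level pages."""
    trimmed = path.lstrip("/")
    if "/" not in trimmed:
        return "home"
    return trimmed.split("/", 1)[0]

def usage_by_section(sections, requested_paths):
    """Count requests per section and list the sections with no requests."""
    hits = Counter(section_of(p) for p in requested_paths)
    unvisited = sorted(s for s in sections if hits[s] == 0)
    return hits, unvisited

# Hypothetical sections and requests for illustration only.
SECTIONS = ["catalog", "resources", "collections"]
REQUESTS = [
    "/index.html",
    "/catalog/search.html",
    "/catalog/",
    "/resources/cdroms.html",
]

hits, unvisited = usage_by_section(SECTIONS, REQUESTS)
print(dict(hits))
print(unvisited)
```

Sections that turn up in the unvisited list over a reasonable period are candidates for rethinking or removal.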
For additional testing tips, see Nielsen's columns on usability issues at http://www.useit.com/alertbox. An in-depth description of the University of Maryland Human-Computer Interaction Laboratory's design and testing of the LC National Digital Library is available at http://www.cs.umd.edu/hcil/ndl/.
Garlock, Kristen L., and Sherry Piontek. 1996. Building The Service-Based Library Web Site: A Step-by-Step Guide to Design and Options. Chicago: American Library Association.
Karp, Tony. "Art and the Zen of Web Sites." April 19, 1997. http://www.tlc-systems.com/webtips.shtml (11 Aug. 1997).
Lynch, Patrick, and Sarah Horton. "Yale C/AIM Web Style Guide." January, 1997. http://info.med.yale.edu/caim/manual/index.html (6 Aug. 1997).
Three Book Suppliers Join OCLC PromptCat Service
Baker & Taylor Books, a leading supplier of books and related services, DA Information Services of Victoria, Australia, and Rittenhouse Book Distributors of King of Prussia, Pennsylvania, are the newest active participants in the OCLC PromptCat service, which provides cataloging to libraries for materials supplied by participating vendors.
The OCLC PromptCat service delivers cataloging records for any title supplied by participating vendors that has a monographic record in WorldCat (the OCLC Online Union Catalog). Records and library materials arrive at the library simultaneously, and the library's holding symbols are set in WorldCat.
Baker & Taylor distributes a wide range of products -- books, video, audio, software, and related services -- to school, public, university, and special libraries worldwide, as well as government agencies and more than 100,000 bookstores. One of Forbes magazine's Top 500 Private Companies for the past three years, Baker & Taylor distributes more than one million titles annually and maintains a database of more than three million current American, British, Australian, and European English-language titles.
Founded in 1951, DA Information Services supplies science, technology, medicine, and business information. In 1995, DA received ISO 9002 international quality standard accreditation and became the first Australian library and information supplier to achieve accreditation as a Quality Endorsed Company by Standards Australia.
Rittenhouse Book Distributors began as a retail medical bookstore 50 years ago. The company continues to specialize in health sciences information. Visit their Web site at http://www.rittenhouse.com.
These three new vendors bring the total number of participating book vendors to eight: Academic Book Center, Ambassador Book Service, Baker & Taylor Books, Blackwell North America, DA Information Services, Majors Scientific Books, Rittenhouse Book Distributors, and Yankee Book Peddler. Other vendors who have agreed to join the OCLC PromptCat service are Book Clearing House, Casalini Libri, Eastern Book Company, Iberbook International, and Puvill Libros.
OCLC will be introducing free label software in September 1997. The new label software will import labels from text files and then display, edit, and print them. Other features include the options to create new labels from a blank label workform, print labels in both immediate and batch modes, use pinfeed stock or sheets of label stock for laser printers, print multiple copies of the label, and specify print constants, ranges, and copy numbers. The label software will support the standard OCLC label formats of SL4, SL6, SLB, and SP1.
Users just import labels from OCLC micro products such as OCLC Passport for Windows (or DOS) software, OCLC CJK software, CAT ME Plus software, CatME for Windows software (when available), and OCLC CatCD for Windows software. The label software is also compatible with label files created by the PromptCat service from a new processing option available later this year.
The software is a 32-bit Windows-based product requiring either Microsoft Windows 95 or Windows NT (version 3.51 with Service Pack 5, or higher). OCLC will distribute it electronically via the OCLC Web site or anonymous FTP. Check the OCLC home page for availability announcements and for more information about downloading the software.
OCLC has released its prices for Cataloging MicroEnhancer (CatME) for Windows. CatME for Windows will only work with X.25 dial access and TCP/IP telecommunications protocols (dial, dedicated and Internet). Although OCLC will continue to sell the DOS product CAT ME Plus (product code CCD9354) until September 15, 1997, the DOS product should only be used if a library needs to use the CatME on dedicated lines. Package and upgrade prices for the Windows version are:
|CatME Single License|SOF9327|$99.00|
|CatME Site License|SOF9328|$349.00|
|CatME Single License|SOF9325|$395.00|
|CatME Site License|SOF9326|$1396.00|
OCLC has installed online software that validates new codes in several OCLC-MARC code lists. At the same time, OCLC has also made the MARC Field 350 (price) invalid. When entering the price, users should key price information in fields 020 subfield $c, 024 subfield $c, or 037 subfield $c. OCLC will release revision pages to the printed OCLC-MARC Code Lists in the fourth quarter of 1997.
For more information about the changes, check OCLC's cataloging or ILL system and type "news cat<F11>". A summary of the changes also appears on OCLC's Web site. OCLC has also updated the complete OCLC-MARC Code Lists on the Web site to reflect these changes. For the complete list on the OCLC Web site, see http://www.oclc.org/oclc/man/code/codetoc.htm.
Please remember to send the full ILL record with loan requests made via OCLC. With Passport for Windows, simply press <F12> to print the entire record. If you are still using Passport for DOS, the sequence of <Ctrl><F8>, <F4>, <Ctrl><F8> should also produce the desired copy.
Ask*IEEE, the official document delivery service of the Institute of Electrical and Electronics Engineers (IEEE), has joined the OCLC ILL Document Supplier Program. Libraries can access the Ask*IEEE document delivery service using the OCLC symbol, A8K. Ask*IEEE delivers documents via fax, aerial fax, overnight express or mail. Contact information is available under the symbol A8K in OCLC's online Name Address Directory. While logged onto either cataloging or ILL, type ":a8k<F11>" and view the Document Delivery record.
OCLC has added two databases to FirstSearch. The OCLC Union Lists of Periodicals database is now available and shows holdings for more than seven million journals and other covered items among OCLC member libraries. It can help users determine if a library holds a specific journal issue. OCLC updates the database semiannually.
Another addition, H.W. Wilson Select Full Text, contains indexed and abstracted articles from more than 430 periodicals, all with ASCII full text online. The database consists of records from the following H.W. Wilson databases: Readers' Guide Abstracts, Social Sciences Abstracts, Humanities Abstracts, General Science Abstracts and Wilson Business Abstracts. The database covers January 1994 to the present (please note that coverage of each periodical can begin in a different month and year).
OCLC is now using password-protected functions on its home page to deliver records and reports via the World Wide Web. Called "OCLC Product Services," the service allows libraries to download some reports previously only available on paper or via the Product Services Menu in PRISM. Records received in batches (e.g., PromptCat records) can be downloaded, although records cannot be uploaded via the Web. Technical Bulletin 225, "OCLC Product Services via the Web," describes the service. OCLC distributed e-mail versions describing these changes in early August. Technical Bulletins are available on OCLC's home page at: http://www.oclc.org/oclc/menu/tb.htm.
OCLC has also added two new subfields to the COMMUN2 Field in the Name Address Directory workform. The new subfields accommodate the URLs for the library local system and institution homepage. Subfield l will be added for recording the local system URL, and subfield h will be added for the institution homepage URL.
This is another in a series of articles written for Wisconsin InterLibrary Services.
These articles are being shared with other networks under a contract executed by the Network Alliance.
By Tom Zillner
Mail "spamming" is increasingly becoming a problem for Internet users. Spamming means the sending of huge quantities of junk e-mail indiscriminately, hoping that a few of the recipients will respond to solicitations for chain letters, pyramid schemes, get-rich-quick instruction methods, ads for female penpals or invitations to buy pornography. I have seen an acceleration of the pace at which this material arrives in my incoming e-mail, and an increase in the number of complaints I see online and in print articles, suggesting that other people are experiencing this annoying phenomenon as well.
Two interesting questions about the spam mail phenomenon are "Why me?" and "Where did this thing come from?" Both are easy to answer. The "Why me?" question expands to "How did they get my e-mail address?" There are several techniques for doing so, most having to do with collecting e-mail membership lists or postings to Usenet News. The easiest way to get a list of e-mail addresses is to send the commands to list software that will return the member list. Until recently, it was possible to get the membership of most listservs whether or not you were a member of the list. To combat spamming and provide some measure of privacy for list members, lists now often require that someone be a list member to get a list of members. So, the spammer simply signs up, gets the member list, and signs off again. What makes this easier is that most e-mail gatherers use automated processes--spam software. Thus, it is possible to obtain huge collections of e-mail addresses with ease.
Usenet News also offers opportunities to collect e-mail addresses. Usually, a Usenet posting carries the e-mail address of the person who originated it. There are a number of sites and search engines that collect News, and spam software can access these repositories. Similarly, if the spammer has difficulty collecting e-mail addresses through the process described above, spam software can search listserv collective repositories and extract e-mail addresses from actual listserv messages.
After the collection phase of the spam process, the e-mail addresses are checked against a master database and duplicate addresses removed before the new addresses are added to the master database. At this point, spamming can commence!
You should note that the simple spamming process I describe here can be considerably enhanced, although the spams I receive seldom reflect such sophisticated processing. What I am suggesting is that instead of indiscriminately grabbing e-mail addresses and circulating spam mail to everyone in a database, it is possible to categorize a person's interests according to the listservs to which they belong or the newsgroups to which they post. This would allow the kinds of targeted mailings found in traditional postal junk mail.
The second part of the spam process, actually generating e-mail messages, leads to some answers to the question, "Where did this thing come from?" Most often the spam mail does not have a valid "From" e-mail address, making it impossible to respond to the e-mail with a frothy or pithy reply. Some people who receive junk e-mail are also tempted to respond with so-called "mail bombs," sets of messages that make the e-mail server software at the originating site "crash," by sending a huge number of very large messages within a very short period of time. Faking the origination address makes it harder to trace the true source of the messages, although experienced investigators can often track the perpetrators. If you wish to respond to the solicitation, a separate e-mail address often appears in the body of the message, or a telephone number or postal address is provided.
Just as spammers fake their return addresses, the "To" address is also often not that of the recipient, making it harder to figure out how you became a target of the spam. For example, I might subscribe to one listserv as "Tom Zillner" and another as "Thomas Zillner." If the "To" field carries "Thomas Zillner" as part of its address, that could give me a clue about where the spammer culled my name.
An interesting question is how spammers can place fake e-mail addresses in their messages in the first place, given that conventional e-mail programs automatically attach genuine e-mail addresses in everyday communications on the Internet. Additionally, how do return messages get where they are going if the "To" address is incorrect? The key to understanding how this occurs is to look at what happens to an e-mail message after it is composed and launched into the ether of the Internet. My naive interpretation of the e-mail process had been the idea of a passive transport from the sender's computer to the recipient's. Several years ago my naiveté was shattered by learning the truth about the dynamic process of e-mail delivery. Instead of packets desultorily wandering the Net in a fashion similar to postal mail, the sending computer attempts to reach the recipient computer repeatedly until it succeeds.
Let us back up a bit first. Many people generate their e-mail messages using a mail package like Pegasus or Eudora, which are free or low-cost options. We do our editing on our PC or Mac, and then select Send. Depending on how the e-mail program is configured, the message is either sent onward immediately or queued for sending as a group. In any case, the message is routed to a mail sending and receiving program, most often on a computer different from our own. It joins other messages there that are being sent by us, our colleagues, or sometimes an entire organization or campus. At this point, spammers distort their originating addresses. Essentially the same thing happens at the receiving end of this process, with an intermediate computer often accepting the messages and relaying them to a user's local computer.
What follows is a technical explanation of the process of e-mail transport. I will try to make this explanation as straightforward as possible, but for the faint of heart, it is possible to skip the explanation and move to the gloss on how header information can be manipulated.
How do the messages actually move across the Internet? The mechanism employed is the Simple Mail Transfer Protocol (SMTP). There are many programs to carry out this protocol, depending on the computer and its operating system. For example, most UNIX systems that act as SMTP servers use a program called sendmail to handle this and other e-mail processes. One reason that mail is handled by a separate, sometimes dedicated, computer is that the mail processor must be active twenty-four hours a day, seven days a week, because the arrival of e-mail is unpredictable, happening at all hours of the day and night.
The sendmail or other software receives outgoing messages from client computers and deposits them in its queue, a temporary storage area for messages. Messages that arrive first go out first. Even if a message queue grows to several hundred, or even several thousand, messages, the program dispatches them rapidly to their destination, provided the destination answers promptly. Suppose I send a message to Bill Gates praising him for the quality job his programmers have done on Windows 95. (Remember, this is hypothetical!) The message that the sendmail program "sees" would look something like this:
From: Tom Zillner <email@example.com>
To: Bill Gates <firstname.lastname@example.org>
Subject: Wonderful Win 95

Dear Bill,

Your programmers and designers did a great job with Windows 95. I particularly like the idea of putting everything about the system and its programs into the Registry. That way, if I damage it, I won't be able to run ANY of my software.

Thanks again,
Your pal,
Tom
The e-mail message is divided into two parts: headers and the message body. The headers include everything that comes before the message itself, consisting of information about the message, its source, routing, and destination. Even at this stage, there might be more header fields already present than the ones above, and some more will be added as the process continues. For now, the message is as complicated as it needs to be for the purposes of this example.
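The split between headers and body can be seen with a few lines of Python. This is only a minimal sketch using the standard library's email package; the function name split_message is made up for illustration, and the addresses are the placeholders printed in this article, not working mailboxes.

```python
from email import message_from_string

# Sample text modeled on the hypothetical message above. A blank line
# separates the headers from the body.
RAW_MESSAGE = (
    "From: Tom Zillner <email@example.com>\n"
    "To: Bill Gates <firstname.lastname@example.org>\n"
    "Subject: Wonderful Win 95\n"
    "\n"
    "Dear Bill,\n"
    "Your programmers and designers did a great job with Windows 95.\n"
)


def split_message(raw):
    """Return (headers, body); headers is a list of (name, value) pairs."""
    msg = message_from_string(raw)
    return msg.items(), msg.get_payload()


headers, body = split_message(RAW_MESSAGE)
for name, value in headers:
    print(f"{name}: {value}")
print("--- body follows ---")
print(body)
```

Everything before the first blank line is treated as header fields; the rest is handed back untouched as the body.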
The sendmail or other SMTP software that encounters this message in its queue will try to initiate a connection with the computer identified by microsoft.com. If the recipient computer does not answer the request for a connection within a specified number of seconds, the e-mail message is returned to the queue. The software will try many times to send the message, but if it fails after a predetermined number of attempts, the message is "bounced" back to the sender.
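The try, requeue, and bounce cycle just described might be sketched as follows. This is an illustration under stated assumptions, not how sendmail is actually written: process_queue, fake_connect, and the limit of three attempts are all invented for the example, and the waiting period a real server inserts between attempts is left out.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical limit; real mail servers set their own


def process_queue(messages, try_connect, max_attempts=MAX_ATTEMPTS):
    """Attempt delivery in first-in, first-out order.

    try_connect(message) stands in for the real connection attempt and
    returns True on success. Returns (delivered, bounced).
    """
    queue = deque((msg, 0) for msg in messages)
    delivered, bounced = [], []
    while queue:
        msg, attempts = queue.popleft()
        if try_connect(msg):
            delivered.append(msg)
        elif attempts + 1 >= max_attempts:
            # Out of attempts: give up and return the message to its sender.
            bounced.append(msg)
        else:
            # No answer this time: put the message back at the end of the queue.
            queue.append((msg, attempts + 1))
    return delivered, bounced


def fake_connect(msg):
    """Pretend that one made-up host never answers."""
    return not msg.endswith("@unreachable.example")


delivered, bounced = process_queue(
    ["a@reachable.example", "b@unreachable.example"], fake_connect
)
print("delivered:", delivered)
print("bounced:", bounced)
```

The message for the unreachable host goes around the queue until its attempts run out, at which point it lands in the bounced list.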
If the software successfully connects with the recipient e-mail computer, it interacts to transfer the message. Such an interaction in the case of my hypothetical message might go something like this (S is sending software/computer, R is recipient software/computer):
S: [connects at the e-mail port of the recipient computer]
R: 220 microsoft.com Sendmail 8.1; Thu, 8 May 97 13:17:16-07:00
S: helo milkyway.wils.wisc.edu
R: 250 microsoft.com Hello milkyway.wils.wisc.edu, pleased to meet you
S: MAIL From: Tom Zillner <email@example.com>
R: 250 Tom Zillner <firstname.lastname@example.org>...Sender ok
S: RCPT To: Bill Gates <email@example.com>
R: 250 Bill Gates <firstname.lastname@example.org>...Recipient ok
S: data
R: 354 Enter mail, end with "." on a line by itself
S: Date: 8 May 1997
   From: Tom Zillner <email@example.com>
   To: Bill Gates <firstname.lastname@example.org>
   Subject: Wonderful Win 95
   Dear Bill,
   [The rest of the lines of the message are sent, one after the other.]
R: 250 ok
S: quit
R: 221 microsoft.com closing connection
R: [closes the e-mail port connection]
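To make the shape of this conversation concrete, here is a toy receiver that answers a list of sender lines with replies modeled on the ones above. It is a sketch only: toy_receiver and the host names are invented, nothing is sent over a network, and the real protocol has many more commands and checks. Like the software described in this article, it accepts whatever sender and recipient it is told.

```python
def toy_receiver(client_lines, hostname="receiver.example"):
    """Return the replies a very trusting receiver would give."""
    replies = [f"220 {hostname} ready"]
    in_data = False
    for line in client_lines:
        if in_data:
            # Inside the message: accept every line until a lone "." ends it.
            if line == ".":
                in_data = False
                replies.append("250 ok")
            continue
        verb = line.split(" ", 1)[0].lower()
        if verb == "helo":
            replies.append(f"250 {hostname} Hello {line[5:]}, pleased to meet you")
        elif verb == "mail":
            replies.append("250 Sender ok")  # no check of the sender address
        elif verb == "rcpt":
            replies.append("250 Recipient ok")
        elif verb == "data":
            in_data = True
            replies.append('354 Enter mail, end with "." on a line by itself')
        elif verb == "quit":
            replies.append(f"221 {hostname} closing connection")
            break
        else:
            replies.append("500 Command unrecognized")
    return replies


session = [
    "helo sender.example",
    "MAIL From: <someone@sender.example>",
    "RCPT To: <someone@receiver.example>",
    "data",
    "Subject: test",
    "",
    "Hello there.",
    ".",
    "quit",
]
for reply in toy_receiver(session):
    print("R:", reply)
```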
Seem complicated? Naaah! It is fairly straightforward. (By the way, the "helo" above is not a typo.) The exchanges are like a human conversation, although a stilted one with a very limited vocabulary. Notice that each time the sending computer sends information, the receiving computer acknowledges it with a response that includes a 3-digit number at the beginning. This is to simplify processing, allowing the software to examine this message code to determine the success or failure of its previous transmission. The rest of the message line is essentially for the benefit of humans who may be checking the e-mail transmission process in case of repeated problems with the software or other subtle problems with e-mail.
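The way software reads those 3-digit codes can be sketched briefly. In SMTP the first digit carries the verdict: 2 means the command succeeded, 3 means the receiver is waiting for more input, 4 signals a temporary failure, and 5 a permanent one. The function names below are invented for illustration, and the sketch handles only single-line replies like the ones in the exchange above.

```python
def parse_reply(line):
    """Split a single-line SMTP reply into (code, text)."""
    code_text, _, text = line.partition(" ")
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"not an SMTP reply line: {line!r}")
    return int(code_text), text


def classify(code):
    """Judge a reply by its first digit alone, as the software does."""
    first_digit = code // 100
    if first_digit == 2:
        return "success"
    if first_digit == 3:
        return "more input expected"
    if first_digit == 4:
        return "temporary failure"
    if first_digit == 5:
        return "permanent failure"
    return "unknown"


for reply in (
    "250 ok",
    '354 Enter mail, end with "." on a line by itself',
    "221 microsoft.com closing connection",
):
    code, text = parse_reply(reply)
    print(code, classify(code), "|", text)
```

The text after the code is never examined by classify, which matches the point that the wording is there for human readers.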
For all of you who skimmed through the preceding technical stuff about e-mail transport, you may want to resume a closer read at this point (unless you are just skimming the entire article, in which case carry on!).
The problem with the way in which the e-mail transport protocol behaves is that the recipient software seldom checks the validity of any of the information it is receiving. For example, the sendmail program only checks whether the computer identified in the helo command is really the one sending it. It does not check the validity of anything else transmitted by the sender. This means that a spammer can make up whatever "To" and "From" information he wants. Note also that the "To" and "From" information contained in the body of the message is not used in any way to transmit the message, so an unscrupulous sender could change these to read anything.
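Because the addresses used to move a message (those given in the MAIL and RCPT commands) are separate from the "From" and "To" lines inside the message, software can at least compare the two. The following is a minimal sketch of such a comparison using Python's standard library; header_matches_envelope and all of the addresses are invented for the example. A mismatch is only a hint, since mailing lists and forwarding services legitimately produce differences.

```python
from email import message_from_string
from email.utils import parseaddr


def header_matches_envelope(envelope_from, raw_message):
    """Return True when the From header's address equals the envelope sender."""
    msg = message_from_string(raw_message)
    _, header_addr = parseaddr(msg.get("From", ""))
    return header_addr.lower() == envelope_from.lower()


# A made-up message whose From line claims a different origin than the
# address given in the MAIL command.
SUSPECT = (
    "From: Trusted Friend <friend@known.example>\n"
    "To: Someone <someone@receiver.example>\n"
    "Subject: Make money fast\n"
    "\n"
    "Details inside.\n"
)

print(header_matches_envelope("bulk@sender.example", SUSPECT))   # prints False
print(header_matches_envelope("friend@known.example", SUSPECT))  # prints True
```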
What compounds the problem of spurious header information is the fact that many e-mail reading programs remove much of the header for the comfort and convenience of the human recipient. Most people are not interested in most of the header information. To trace spam or other problem messages, however, having the complete header is vital. Many e-mail programs provide capsule header information (e.g., "From", "Subject", "Date") but offer an option to display full header information upon request.
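With the full header in hand, the most useful lines for tracing are usually the "Received" fields, which each mail computer adds at the top of the message as it passes the message along. The sketch below lists them in travel order. It assumes Python's standard email package; received_chain is an invented name, and the host names and timestamps in the sample are made up. Keep in mind that a spammer can insert forged Received lines as well, so the entries added by your own site's mail computer are the most trustworthy.

```python
from email import message_from_string


def received_chain(raw_message):
    """Return the Received header values, earliest hop first."""
    msg = message_from_string(raw_message)
    # Each relay adds its line above the existing ones, so the message
    # lists the newest hop first; reverse to follow the path of travel.
    return list(reversed(msg.get_all("Received", [])))


SAMPLE = (
    "Received: from relay.example by mail.receiver.example; Thu, 8 May 1997 13:17:20 -0700\n"
    "Received: from origin.example by relay.example; Thu, 8 May 1997 13:17:16 -0700\n"
    "From: Someone <someone@origin.example>\n"
    "Subject: Hello\n"
    "\n"
    "Body text.\n"
)

for hop_number, hop in enumerate(received_chain(SAMPLE), start=1):
    print(hop_number, hop)
```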
Some people mischievously or feloniously modify the SMTP header to engage in e-mail forgery or to send inflammatory, threatening, or joke messages so that it appears that someone else originated the message. Usually, it is possible to track down felonious incidents, depending on how much time and expertise law enforcement officials devote to the process. Of course, spamming seldom gets the same effort. The most ignorant spammers, those who do not cover their tracks and have accounts on a host machine rather than running their own computer sites, can be detected, and an e-mail to the postmaster of the site results in their rapid ejection or the radical limiting of their e-mail privileges.
One obvious question that arises is why the software that carries out the mail transport protocol is so trusting, and why the protocol itself does not provide for more security and authentication. The answer is that most of this software is descended from programs that were put together at the beginning of the Internet, when spammers and "mail spoofing" were unknown. Similarly, SMTP was adopted in the same, more innocent times.
Lately, there have been some successes against the more aggressive e-mail spammers. For example, a Newsweek article (May 12, p. 90) about Sanford Wallace, self-styled "Spam King," notes that after America Online (AOL) and Wallace exchanged lawsuits over his inundation of AOL users with spam mail, they came to an agreement under which AOL users can block spam from Wallace and other spammers. The San Jose Mercury News reported that CompuServe negotiated a similar settlement. Earthlink Network, an Internet service provider (ISP), is suing Wallace's company, Cyber Promotions, for $3 million, claiming that if Earthlink allowed complete delivery of spams to its users, the volume would bring its system "to its knees." In fact, that is just what happened to Netcom, also an ISP, after another spammer sent 50,000 e-mails to its users, forcing it to shut down part of its network to recover.
Ultimately, this issue may land with Congress. Whether or not they will want to outlaw spam is an open question, particularly since it seems to me that any such legislation would have serious First Amendment implications. Overall, I think it is better to suffer a little spam for the sake of preserving free expression on the Internet. Of course, I will not be signing up for any pyramid schemes anytime soon. There is a limit to my support of the marketplace.
FEDLINK Technical Notes is published by the Federal Library and Information Center Committee. Send suggestions of areas for FLICC attention or for inclusion in FEDLINK Technical Notes to:
FLICC/FEDLINK: Phone (202) 707-4800 Fax (202) 707-4818
FEDLINK Fiscal Operations: Phone (202) 707-4900 Fax (202) 707-4999
Executive Director: Susan Tarr
Editor-In-Chief: Robin Hatziyannis
Editor/Writer: Jessica Clark
Editorial Assistant: Mitchell Harrison
FLICC was established in 1965 (as the Federal Library Committee) by the Library of Congress and the Bureau of the Budget for the purpose of concentrating the intellectual resources of the federal library and related information community. FLICC's goals are: to achieve better utilization of library and information center resources and facilities; to provide more effective planning, development, and operation of federal libraries and information centers; to promote an optimum exchange of experience, skill, and resources; to promote more effective service to the nation at large; and to foster relevant educational opportunities.