“Digitizing and organizing information in Libraries, an overview.”

By: Samer Alshawwa

IRLS 501 Final Project

I give permission for my work to be published in the SIRLS LIS Learning Showcase.
Introduction:

 

In this project I intend to present an overview of information organization in libraries through the digitizing of catalogs and metadata. The information I include is influenced by the reading I have been doing. This project is more an overview of the need for digital libraries and information organization than a critique. I will attempt to cover the different elements of the process of organizing information from an information professional's point of view, within the limits of my brief exposure to library science. Furthermore, I will include other suggested methods of preserving catalogs and information in libraries. I hope this paper will be of benefit to others.

_________________________________________________________________

In many respects the invention of the MARC record and related   standards has been the most important event in librarianship and   bibliography since the Library of Congress began its catalog card   distribution service early in this century.  It has enabled the   creation of immense multinational bibliographic databases for scholars   and researchers; it has allowed libraries to make use of automated   support for most basic library functions, such as cataloging,   acquisitions, and online public access catalogs.  It proved the value   of standard protocols and content guidelines in promoting the sharing   and processing of information.  And it put libraries, archives, and     others at the forefront of the electronic information revolution.  

But, in most respects, libraries are no longer at the forefront of that revolution. The electronic information environment has exploded outside of libraries in ways that we are all too familiar with, yet whose dimensions we cannot really grasp. New information technology is now being developed, for the most part, in the burgeoning private information industry, in computer science departments, and in scientific research centers.

That this has happened is of course overwhelmingly positive. Looked at one way, we librarians will no longer need to invent all our own standards and protocols and database systems from scratch. Better-capitalized and far more innovative groups are now taking care of that for us. How true this is can be illustrated by considering the database search and retrieval systems that libraries and the undercapitalized library automation industry have created for us and our patrons. We have been working on our library public access catalogs for over twenty years now, and what search and retrieval techniques have we implemented?

Implemented search and retrieval techniques:

Simple author searching
Simple title searching
Combined partial author-title searching
Simple subject searching
Keyword-Boolean searching

Yes, after twenty years, the major new retrieval technique we've made available to users of library catalogs, at least, is keyword-Boolean searching. One might say that, since much library cataloging data has the great advantage of controlled name and subject vocabularies created at great expense, little more in the way of retrieval technology was needed.

Unfortunately, there is ample evidence in the literature and in practice that we have not made it nearly easy enough in our online systems to use our own subject thesauri and classification schemes; nor have we created the functionality that would truly allow data to be used interactively to excavate all the "intelligence" we have built into our databases.
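To make the keyword-Boolean technique mentioned above concrete, here is a minimal sketch in Python. It is not taken from any actual library system; the sample records, the function names, and the way AND, OR, and NOT are expressed as the arguments must, any_of, and must_not are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical sample records; the titles and subjects are invented.
RECORDS = {
    1: {"title": "Introduction to Library Cataloging", "subjects": ["Cataloging"]},
    2: {"title": "Digital Libraries and Metadata", "subjects": ["Digital libraries", "Metadata"]},
    3: {"title": "Cataloging Digital Resources", "subjects": ["Cataloging", "Digital libraries"]},
}


def tokenize(text):
    """Split a field into lowercase keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each keyword to the set of record ids whose title or subjects contain it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for field in [rec["title"], *rec["subjects"]]:
            for word in tokenize(field):
                index[word].add(rec_id)
    return index


def search(index, all_ids, must=(), any_of=(), must_not=()):
    """Combine keywords with Boolean AND (must), OR (any_of), and NOT (must_not)."""
    result = set(all_ids)
    for word in must:
        result &= index.get(word.lower(), set())
    if any_of:
        union = set()
        for word in any_of:
            union |= index.get(word.lower(), set())
        result &= union
    for word in must_not:
        result -= index.get(word.lower(), set())
    return sorted(result)


INDEX = build_index(RECORDS)
print(search(INDEX, RECORDS, must=["cataloging", "digital"]))
```

Note that this sketch matches words exactly, so "library" and "libraries" are treated as different keywords. That limitation is one small illustration of why keyword searching alone does not exploit the controlled vocabularies built into catalog records.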

 

 

PRINCIPLES OF DIGITAL LIBRARIES

The purpose of a digital library is to provide coherent organization and convenient access to typically large amounts of digital information. The following principles provide working definitions of a digital library from both a conceptual and a practical standpoint:

A digital library is an integrated set of services for capturing, cataloging, storing, searching, protecting, and retrieving information.

Digital library services bring order where data floods and information mismanagement have caused much critical information to be incoherent, unavailable, or lost. Digital library architecture emphasizes organization, acquisition, preservation, and utilization of information. Digital library systems are realizations of an architecture in a specific hardware, networking, and software situation.

 

Core Capabilities of Digital Library Systems

Digital library systems compose a family of automated systems that together provide a comprehensive capability to manage the digital content of an enterprise. It is useful to divide the capabilities of digital library systems into the following areas:

capture or creation of content, indexing and cataloging (metadata), storage, search and query, asset and property rights protection, and retrieval and distribution.

Content exists in multiple sizes, formats, and media, each with accompanying technical challenges. Content may be structured or unstructured. It may have exact, precise meaning; or it may be fundamentally ambiguous. Content may directly or indirectly support a business process or function.

A digital library architecture shows how capabilities are realized and related, and does this at several levels. Digital library architectures show how business processes or functions are enhanced; they show how technology components fit together and how, in detail, components interoperate with each other.

Such functions and relationships, when reduced to a particular software and hardware implementation, lead to operational digital library systems.
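The six capability areas listed above can be pictured as operations on a single small service. The following Python sketch is only an illustration under stated assumptions: the class TinyDigitalLibrary, the DigitalObject record, and the simple "restricted" flag standing in for rights protection are all hypothetical, and a real digital library system would be far more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """One stored item: its bytes, its metadata, and a simple rights flag."""
    object_id: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    restricted: bool = False


class TinyDigitalLibrary:
    """Hypothetical in-memory service covering the six capability areas."""

    def __init__(self):
        self._store = {}  # storage: object id -> DigitalObject

    def capture(self, object_id, content):
        """Capture content and place it in storage."""
        obj = DigitalObject(object_id, content)
        self._store[object_id] = obj
        return obj

    def catalog(self, object_id, **metadata):
        """Attach descriptive metadata (cataloging and indexing)."""
        self._store[object_id].metadata.update(metadata)

    def protect(self, object_id, restricted=True):
        """Mark an object as rights-restricted."""
        self._store[object_id].restricted = restricted

    def search(self, term):
        """Return ids of objects whose metadata values contain the term."""
        term = term.lower()
        return sorted(
            oid for oid, obj in self._store.items()
            if any(term in str(value).lower() for value in obj.metadata.values())
        )

    def retrieve(self, object_id, authorized=False):
        """Deliver content, refusing restricted items to unauthorized users."""
        obj = self._store[object_id]
        if obj.restricted and not authorized:
            raise PermissionError(f"{object_id} is restricted")
        return obj.content
```

A short usage example: after lib = TinyDigitalLibrary(), calling lib.capture("obj-1", b"scan") and lib.catalog("obj-1", title="Map of Tucson") makes the object findable with lib.search("tucson") and deliverable with lib.retrieve("obj-1").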

 

Digital Libraries and Traditional Libraries

Digital library functions, insofar as they purport to organize information, may be compared with traditional library functions. Consider digitization, which technically is the conversion of analog to digital formats. A common human artifact, such as a bound book, loses value when simply scanned into bits. In a library context, where organization, access, protection, and preservation are important business functions, digitization technologies are starting points for a complicated set of computational processes that in the first instance reconstruct the cultural, conventional, and intuitive significance, structure, and external relationships that defined the original artifact. Additionally, digitization and other processes may be able to add value and support certain fiduciary responsibilities that resemble functions of traditional libraries.

In a similar way, other core capabilities of traditional libraries can be transposed to the digital domain. Cataloging is transposed to the generation of metadata, and is an area where much work needs to be done to develop automated, multidimensional indexing and cataloging procedures. Just as the public card catalog is a gateway to the holdings of a conventional library, search of content and metadata is the gateway to a digital library. Circulation in a conventional library transposes to network access, retrieval and delivery.

The fiduciary responsibilities of traditional libraries are related to issues of copyright protection and intellectual property rights. The table below relates digital library capabilities to well-known capabilities of traditional libraries. The point is that traditional libraries have established uniform business processes and highly interoperable data formats which support especially bibliographic catalogs, item ordering, and interlibrary loan. Although many of these procedures pre-date "digital" libraries, digital library design can benefit from the comparisons.

 

Comparison of Digital and Traditional Library Capabilities

Digital Library Capability    Traditional Library Capability
Capture                       Acquisitions and collection development
Catalog and Index             Cataloging rules and bibliographic control
Store                         Stacks, inventory management and shelf lists
Search                        Public card catalog
Protect                       Patron privileges and circulation rules consistent with public law and policy
Retrieve                      Loan management and interlibrary loans

 

Having made these comparisons, it must be emphasized that in the United States the digital library is not regarded as a technology related to library automation or the provision of integrated library systems for operating traditional libraries.

 

EVOLUTION OF THE ONLINE CATALOG

Connectivity

Library functions in the online catalog are now integrated. This linking of library software for different processes, such as circulation, cataloging, the catalog, and other databases, depends in turn on the connectivity of hardware: hard drives, CD-ROM drives, local area networks, and the Internet.

 

Integration of Technical Services.

In earlier systems, the integration of the circulation function within the catalog was considered an innovation. School library patrons in the '90s were quite familiar with the automated catalogs they used in both school and public libraries, where they could handle their own check-ins, check-outs, and reserves. Cataloging used to be handled off-line, but it has become more common to create a bibliographic record within the catalog system, or to search for it on a CD-ROM database or a Web site and download it to the local catalog. The Library of Congress, state networks (Texas Library Connection, Florida SUNLINK, etc.), and bibliographic networks such as OCLC are among the sources of MARC (Machine Readable Cataloging) databases available on the Internet. The online state networks also offer interlibrary loan (ILL) services. Acquisitions (orders) and serials may also be integrated into the school's management system. The streamlining of these technical services translates into improved access to information for the library's clients.

 

Integration of Public Services.

More visible to the catalog user than the integration of enhanced technical services is the integration of reference databases and the OPAC. The OPAC serves as an index, a gateway to full-text information. The citation in the catalog leads to a book-in-hand available in the local library or from another library via ILL. More and more high school libraries are adding reference databases to the catalog menu, accessed from a CD-ROM database or from the Internet. The references may be periodical indexes or full-text databases, encyclopedias, or special references. In the most recent developments, school libraries are putting their catalogs on the Web so that users can access them from any location. School media specialists are also cataloging Web sites and creating direct links from the bibliographic record (MARC tag 856) to the Web site address or URL.
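As an illustration of the link from a bibliographic record to a Web address, the sketch below represents a record as a plain Python list of fields and pulls the URL out of field 856, where subfield u carries the address. The record contents, the example.org address, and the function name linked_urls are assumptions for illustration only; a real catalog would read records in the MARC format with dedicated software rather than hand-built dictionaries.

```python
# Hypothetical, simplified record: each field has a tag and a list of
# (subfield code, value) pairs. Field 245 holds the title; field 856
# holds the electronic location, with subfield "u" as the URL and
# subfield "z" as a note shown to the public.
record = [
    {"tag": "245", "subfields": [("a", "Example Web resource")]},
    {"tag": "856", "subfields": [("u", "http://www.example.org/"),
                                 ("z", "Connect to Web site")]},
]


def linked_urls(rec):
    """Return every URL found in subfield u of the record's 856 fields."""
    return [
        value
        for fld in rec
        if fld["tag"] == "856"
        for code, value in fld["subfields"]
        if code == "u"
    ]


print(linked_urls(record))
```

An OPAC display routine could turn each returned address into a clickable link beside the catalog citation, which is the "direct link" described above.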

 

Networked Hardware and Software.

Early microcomputer systems were limited to stand-alone circulation modules, and while it is true that some school libraries have not progressed beyond that stage, or are not automated at all, the majority have several kinds of systems. A biennial survey of School Library Journal subscribers reports these figures for 1995-96: online catalog, 60 percent; circulation, 77 percent; LAN, 66 percent; telecommunications, Internet, and e-mail, 62 percent; and CD-ROM, 84 percent. These technologies support the integrated functions just described.

Windows and UNIX programs are becoming more evident in the school library environment. Mac systems are still favored by a minority, but these different platforms can co-exist in a network. As more schools have realized that the future is in telecommunications, libraries have been included (or sometimes led the race) in making plans for network development and Internet access.

 

User Interface.

Client-server architecture and interface standards like Z39.50 make it possible for microcomputer systems to access remote databases in the ways previously described. Workstations in the local library will now have a GUI (Graphical User Interface) that replaces most text commands. The elements of this interface will include colorful screens with scroll bars, pop-up windows, point-and-click menus, hot buttons for special reading lists or resources, and maps of the library. Different command languages are offered; searching options include browsing as well as keyword and Boolean. These bells and whistles are being put to the test of facilitating the search process.

 

USABILITY OF ONLINE CATALOGS

Searching the Catalog.

The first generation of OPAC development was characterized by a card catalog model with some information system features. The second generation, which describes current technology, improved the user interface in the ways just described. Nonetheless, users are still running into some of the same searching problems encountered in previous systems. These problems include difficulties in spelling or keying in search terms, understanding commands, finding or modifying search terms, using suitable search strategies, getting feedback, and interpreting displays.

 

The Research.

Christine Borgman, one of the major researchers in children's use of online catalogs, states that catalogs are still hard to use because they operate as if the user has a fixed information goal represented by an appropriate query. What really happens, Borgman says, is that users formulate questions in stages and only gradually come to the point where they can articulate a query. It was found that the points at which children's catalog searches break down depend on their previous experience, and that developing search strategies is more difficult for them than learning keying and commands. Borgman and others discovered that browsing is easier than keyword searching.
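One way to see why browsing can be easier than keyword searching is that a browse display lets the user recognize a heading instead of having to spell it exactly. The Python sketch below is a hypothetical illustration, not a description of any system studied in the research: the list of headings and the function browse are invented, and the standard bisect module is used to find where the typed term would file alphabetically.

```python
import bisect

# Hypothetical subject headings, kept in alphabetical order.
HEADINGS = sorted(
    ["Cats", "Dinosaurs", "Dogs", "Dolphins", "Horses", "Volcanoes"],
    key=str.lower,
)
KEYS = [heading.lower() for heading in HEADINGS]


def browse(term, before=1, after=3):
    """Show the headings filed around the spot where the term would sort."""
    pos = bisect.bisect_left(KEYS, term.lower())
    start = max(0, pos - before)
    return HEADINGS[start:pos + after]


# A misspelled term still lands the user next to the right heading.
print(browse("dinasors"))
```

An exact keyword match on the misspelled term would return nothing, while the browse display still places the intended heading on the screen for the user to pick.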

There seems to be general consensus among researchers that there needs to be more study of user-information interaction. While most observers expect systems to become more intuitive as the technology advances, it will be some time before such systems are widely available. This suggests an important role for all librarians to train users to search the catalog. It is not enough to teach technical skills such as keying or semantic skills such as understanding commands. The emphasis must be on teaching the information-seeking process.

 

 

Why are Libraries Digitizing?

One reason is space. It may take less space to store collections electronically, but the costs are high. Unlike off-site storage, you can't walk away and come back in thirty years and expect to be able to read your converted books. The infrastructure to migrate electronic documents reliably is not in place, and the costs and risks are high.

Another reason is that everyone else is doing it. In an attempt to be able to say they are creating digital collections, some libraries are undertaking conversion projects without understanding the resources they take and without careful analysis in their choice of collections. Developing internal expertise by carrying out exploratory conversion projects can bring definite benefits to a library, but if this is done without fairly broad-based institutional consideration and buy-in on the decision of what collections to digitize, the drain of money and professional time in such projects could easily derail other important programs.

There are other, better and more logical reasons for going digital in libraries. One is that electronic access is a big part of our future. The Internet is remaking higher education, as well as scholarly culture and communication. Libraries are uniquely placed to participate in shaping that future so that it serves the best interests of research and instruction.

Another reason is access. Electronic access is in many ways an improvement. Virtual collections can pull together disparate and large collections that couldn't be physically viewed at one time and place. The ability to tap image databases and to integrate text and images will enrich scholarship. Electronic journals with links to citations offer efficiency. Conventional scholarly research will be enhanced by electronic access to media collections. These materials, which have always been difficult to access, can now be incorporated in research publications and easily exchanged between scholars.

Yet another reason is information organization. Digital surrogates minimize handling of fragile materials, but the imaging process is demanding and must be done with oversight by library staff and with a high enough level of quality to ensure the reusability of the archival electronic file for as long as possible.

Another good reason for going digital is new scholarly tools. While full-text databases are not new, image databases are an exciting application of electronic access. They draw together images of different formats (objects, models, and plans, in addition to conventional images such as photographs and drawings), allowing scholars to reference a broad spectrum of visual materials. Furthermore, the ability to combine multimedia sources with print creates a different aesthetic and intellectual experience. We are still in the infancy of electronic delivery, but as the quantity and quality of electronic resources grow, we can expect to see innovative applications and new ways of utilizing research materials.

 

There are currently many library departments with an interest in managing digital conversion projects: systems departments, academic computing units, and special collections. Each brings a different and relevant form of expertise. We often hear that librarians shouldn't be doing imaging because electronic files are not sufficiently archival to warrant inclusion in the arsenal of preservation techniques. And this is currently the case. In limited instances, however, it may be legitimate to think of digital conversion as preservation. One such instance might be a black and white photo collection of a non-unique nature, which is rapidly deteriorating and for which sufficient funds for traditional film duplication do not exist. In this case, the choice is between some loss of information, plus the risk of uncertain future maintenance, and certain loss. Far more frequently offered as a reason for information organization staff's involvement is the belief that creating a digital surrogate will reduce use of the original. Yet there is reason to believe that the increased awareness of the items from their presence on the Web will lead to increased serious scholarly interest and a need to handle the original.

 

Preserving Information

A more promising basis for information organization's immediate involvement is "Preservation," the intertwining of traditional microfilming with digitizing. There is still active debate in the professional community concerning whether to scan or film first but the technique allows for the best of both worlds. We can continue to use microfilm as a long-lasting, low-maintenance archival format that can be converted to digital format as needed, by either the institution or a scholar onsite.

 

Digitizing is systematically related to microfilming, involving similar skills and workflow structures. Preservation professionals have done an excellent job of developing the field of microfilm to a high standard. They have imported and developed standards and guidelines to produce a well documented process, and they are beginning to do the same with digitizing.

Digitizing obviously involves many legitimate information organization and preservation issues: decision-making for repair and the actual repair prior to scanning, handling and transport to and through the scanning operation, the environmental concerns of the digital capture location and process, and the specifications and handling of the electronic surrogates to minimize the need for future scanning.

Digitizing creates both managerial and ethical choices. If it is not balanced with needs introduced by use, brittle collections, and exhibition, it may consume resources intended for conservation treatment. How much treatment should you give something before imaging? How much effort should you expend in editing an image as opposed to treating the original?

So far, the focus has been primarily on the conversion of paper-based collections to electronic forms. Soon the archiving of electronic documents and collections will become an information organization concern. Running digital conversion programs is an excellent way to become familiar with the technology and issues.

A final, and not the least significant, reason to involve information organization in digital conversion is the changing nature of research libraries and their priorities. If, as a field, we are not actively involved in the central issues of libraries, we risk becoming irrelevant. We are trained to evaluate the ways in which scholars use materials and to ensure that the necessities of collecting, arranging, and describing those materials do not damage or destroy the qualities that scholars find critical to their work. Increasingly those research materials will be electronic.

 

Realities of Libraries Today

One reality is downsizing. Changes in government funding of universities dictate that in many institutions all new initiatives (for everyone, not just libraries) must be funded from current budgets. It's not that many libraries are doing things which are unrelated to their mission; it's that we have to say which of these perfectly legitimate things we are not going to do anymore, or will do less of, so that we can add new services and programs to meet the needs of researchers.

Another reality is outsourcing. This is not completely new to libraries, but all traditional services are being systematically considered for contracting out.

Another library reality today is operational efficiency. Library processes are increasingly being reevaluated and traditional work flows are being altered to streamline activity and reduce the number of people who need to "touch" an item. As a result, staff must become familiar with related areas of technical processing outside their own department or specialty.

Enterprise is another reality. As universities look to replace the funding no longer available from governmental sources, and since endowments and tuition revenues are not adequate to close the gap, university officials are looking at programs to produce income. Library directors will be subject to this pressure as well. Money secured by a unit will not necessarily benefit the unit or even the libraries, but may simply fill a gap or deficit in a lowered operating budget.

Another reality is changing priorities. In the last thirty years each decade has seen a different area of librarianship capture interest and available funding: cataloging, then information organization, then digitization. It is very likely that digitization in support of information organization will gain the support it needs.

Change itself is a given. The rapid pace of change means the most important professional skill to acquire is learning how to learn.

Lack of resources is a reality in today's libraries. Along with decreased funding, libraries must cope with increased serial costs, digital conversion costs, and the acquisition of both traditional and new electronic collections. All library functions will be continuously evaluated for cost savings and relevance to service.

Shedding stereotypes is another reality. Libraries are working very hard to shed their traditional image of conservatism. While it is important to maintain our reputation for reliability, there is pressure to be seen as innovative by the university community. Increasingly, willingness to meet the needs of the community and a "can-do" attitude are seen as more important than the traditional concerns of the library profession. Traditional cataloging is one area that has come into conflict with the need to increase efficiency, giving rise to simplified catalog records.


Bibliography:

1. Given, Lisa M.; Olson, Hope A. Knowledge organization in research: A conceptual model for organizing data. Library & Information Science Research v. 25 no. 2 (2003) p. 157-76. ISSN: 0740-8188

2. Miller, Dick R. Putting XML to work in the library: tools for improving access and management.

3. Stockwell, Foster. A history of information storage and retrieval.

4. Hsieh-Yee, I. Cataloging and metadata education: asserting a central role in information organization. Cataloging and Classification Quarterly. 34 (1/2) 2002, p. 203-22. il. refs.

5. Carter, R. C. Managing cataloging and the organization of information: philosophies, practices and challenges at the onset of the 21st century. Catalogue and Index. (144) Spring 2002, p. 15-16.

6. Rekha, G.; Parameswaran, M. 'Knowledge Organization', 1988-1999: a bibliometric analysis. SRELS Journal of Information Management. 39 (4) Dec 2002, p. 355-62. il. tbls.


7. Carter, R. C. Managing cataloging and the organization of information: philosophies, practices and challenges at the onset of the 21st century. Journal of Internet Cataloging. 5 (2) 2002, p. 63-6.


8. Carter, R. C. Managing cataloging and the organization of information: philosophies, practices and challenges at the onset of the 21st century. Technicalities. 22 (1) Jan/Feb 2002, p. 12-13.

9. Schwarzwalder, R. The implementation of information technology in the corporate engineering library. Science and Technology Libraries. 19 (3/4) 2001, p. 189-205. il. refs.

10. Chen, K.-h. The design for authority-control systems in digital libraries. [In Chinese] Bulletin of Library and Information Science. (34) Aug 2000, p. 51-71. il. refs.

11. Merrill. A Code for Classifiers - Its scope and problems.

12. Osborn, A. The Crisis in Cataloging.

 

13. Pitti, D. Settling the Digital Frontier: The future of scholarly communication in the Humanities. URL: <http://sunsite.berkeley.edu/FindingAids/EAD.dpitti.html>

 

14. Buckland, M. Bibliographic Access Reconsidered. In Redesigning Library Services: A Manifesto. URL: <http://sunsite.berkeley.edu/Literature/Library/Redesigning/bibaccess.html>