mswangera2
Maggie Swanger
501 Assignment 2
Dec. 10, 2004

Metadata Quality


Table of Contents

1. Evaluation of Records
2. Literature Review & Basis for Revised Checklist
3. Revised Quality Checklist

4. Bibliography

  


1. Evaluation of Records

Bibliographic Record #1
Windows to the Universe: Geology
http://www.windows.ucar.edu/tour/link=/earth/geology/geology.html
Consistency
Good
Accuracy
Good overall, although description could be more extensive. Title was perhaps a judgment call. Windows to the Universe is the name of the larger site and is listed at the top of the geology section. Geology is listed under it, no colon.
Granularity
High (poor). The entire Windows site is huge, also including physics, astronomy, and space travel. Even the Geology section alone includes over 20 pages, some of which are linked into the physics section. Messy.
Subject Indexing
Exhaustive

Bibliographic Record #2
Deep Time
http://www.pbs.org/wgbh/evolution/change/deeptime/index.html
Consistency
Good
Accuracy
Most elements okay. Description is far too brief.
Granularity
Low (good). Although the page has a small subwindow to display a wealth of highlighted information, the overall page and URL remain constant and the content is cohesive. Breaking it down further is impossible.
Subject Indexing
Exhaustive

Bibliographic Record #3
DinoDictionary.com
http://www.dinodictionary.com/index.asp
Consistency
Good
Accuracy
Good. Description is brief, but appropriate for this rather straightforward site. Dataset as the type is perhaps inaccurate; Interactive Resource may be a better fit. Subject should be Dinosaurs--Dictionaries rather than just Dinosaurs.
Granularity
Low (good). Multiple pages, with each page a letter of the alphabet. Breaking the metadata records up by each page would be somewhat arbitrary and nonsensical.
Subject Indexing
Summative

Bibliographic Record #4
Understanding Geologic Time
http://www.ucmp.berkeley.edu/education/explorations/tours/geotime/index.html
Consistency
Good
Accuracy
One major flaw influenced the overall accuracy, but could be easily remedied. Although there is also an extensive Teacher's Guide linked from the URL given, only the student's lesson was cataloged. Either the URL should be changed to the first page of the student's lesson, or the entire thing should be recataloged. The record will be highly accurate if the URL is changed to: http://www.ucmp.berkeley.edu/education/explorations/tours/geotime/gtpage1.html
Granularity
Again, if the URL is changed, granularity will be relatively low. There are multiple pages, but they are intended to work together as a cohesive, step-by-step lesson.
Subject Indexing
Exhaustive

Bibliographic Record #5
On the Move, Continental Drift and Plate Tectonics
http://kids.earth.nasa.gov/archive/pangaea/index.html
Consistency
Good
Accuracy
Description should be longer.
Granularity
High (poor)
Subject Indexing
Exhaustive

Bibliographic Record #6
The Amazing Mundo
http://ology.amnh.org/earth/mundo/index.html
Consistency
Good
Accuracy
Mostly good; however, I should not have mentioned the bit about the Ology site in the description. This information is beyond the scope of the individual page being cataloged and does not belong.
Granularity
Low
Subject Indexing
Basically summative

Bibliographic Record #7
Mineral Matters
http://www.sdnhm.org/kids/minerals/index.html
Consistency
Mostly good, but in the relation element, in order to be consistent with my other records, I should have bolded Is Part Of.
Accuracy
Again, description should be longer.
Granularity
High
Subject Indexing
Exhaustive

Bibliographic Record #8
When Crocodiles Ruled
http://www.crocsrule.org/
Consistency
Mostly good, but in the relation element, in order to be consistent with my other records, I should have bolded Is Part Of.
Accuracy
Good
Granularity
Somewhat high, although the pages all work together under the main topic, more like chapters.
Subject Indexing
Exhaustive

Bibliographic Record #9
Astro-Venture Geology Training
http://astroventure.arc.nasa.gov/geology/training/index.html
Consistency
Good
Accuracy
Good
Granularity
Low
Subject Indexing
Exhaustive

Bibliographic Record #10
Geo Mysteries
http://www.childrensmuseum.org/geomysteries/index2.html
Consistency
Good
Accuracy
Mostly good, but the description only discusses chapter one. It should be expanded to include descriptions of the other activities.
Granularity
Somewhat high. Several different pages and interactive games are included in the Geo Mysteries site.
Subject Indexing
Exhaustive


2. Literature Review & Basis for Revised Checklist

The assessment of cataloging quality has been discussed for years. Nearly 100 years ago, Merrill (1912) recognized the main challenges of effective classification and characterized the process as "both an art and a science." The cataloging community still works toward the general adoption of overriding, specifically defined quality assessment standards, but thus far, even loose definitions of metadata quality are under new scrutiny for two main reasons: a) in the climate of ever-tightening budgets, cost-effectiveness must be taken into consideration, and b) the challenge of selecting and cataloging e-resources requires new standards and presents new difficulties. Now, with the information explosion that has stemmed from the Internet, the educational potential is enormous, but only if high-quality educational resources can be easily located and retrieved. Sounds like a job for librarians!


User Efficiency

Whenever cataloging is performed and its effectiveness evaluated, the needs of the user must be the guiding factor. As Paiste (2003) points out, librarianship is a service, a customer-oriented profession. She further maintains that it is not only reference librarians who shoulder and address the needs of users, but catalogers as well. In fact, since online catalogs are the primary library-user interface, the catalog itself handles the majority of user requests and also supports and makes possible the work of all other library employees. Even in 1958, Jackson emphasized the importance of card catalog usability, with accurate, cross-referenced access points, appropriate signage, and on-site librarians to aid in location and retrieval. Then and now, users are frequently hesitant to ask librarians for help, so the catalog and its bibliographic records must serve as intuitive, stand-alone tools. This dictates that catalogers remain in touch with the needs of users, both the library employees and the public.

This usability is not an item that can be added to an evaluative checklist for bibliographic records; rather, it is the overarching theme that all other methods of assessment must ultimately fall under.
For instance, from a user standpoint, keywords and the item's description are the most important components of the record. Most online public access catalogs (OPACs) reflect this user preference. The University of Arizona's OPAC, for instance, highlights the keyword search with a larger font. Readily usable catalogs rely upon both clear, high-quality metadata and a well-designed user interface.


Cost Effectiveness

Time and again, the issue of cost effectiveness has been raised in the literature (Graham, 1990; McCain, 2002; Paiste, 2003). Clearly, to a certain degree, the more time that can be allotted to cataloging a particular item, the better that bibliographic record will be. Of course, there are no easily generalizable formulas or ratios of time spent to metadata quality. The details can vary according to the cataloger's experience and knowledge of a subject, the complexity of the item, the type of metadata schema being used, the type of cataloging (original vs. copy), and a host of other variables. Considering all this, each library should weigh the costs and benefits and establish guidelines for catalogers.

Graham and Paiste lie near opposite ends of this cost-benefit comparison. Graham (1990) has questioned catalogers devoting an extensive amount of time to additional access points that are useless to the average user. He favors producing a higher quantity of "lean" records and getting materials on the shelves faster. In contrast, Paiste (2003) points out that because the catalog is the heart of the library, cataloging must be done with care in order to avoid future problems and confusion, and ultimately, this will save both users' and other staff members' time.

As with anything, there is perhaps a balance for every particular scenario. By copy cataloging or outsourcing whenever possible, using automation or non-professional staff for certain tasks, and diminishing superfluous checks and revisions, the process might be streamlined. However, the records must still exhibit high accuracy and include as much information as possible to assist users in locating materials.


Compatibility


In this time of rapid technological upheaval, librarians are developing new schemata, like Dublin Core, to handle the unique task of cataloging websites. Specialized digital libraries, such as the Digital Library for Earth System Education (DLESE), are springing up, but ever more are needed to keep pace with the unending stream of new websites. Technological compatibility is terribly important to the future of digital collection and preservation. Long-term access must be ensured and planned for (Farb & Riggio, 2004). Systems must be flexible enough to translate easily among various schemata (DC to MARC, for instance), which allows sharing and copy cataloging, and they must keep up with technologies, updating when necessary. With technologies sometimes changing on a yearly basis, digital librarians are learning to keep pace.
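As a rough illustration of translating among schemata, the sketch below maps a handful of Dublin Core elements to MARC tags and subfields. The four mappings are a simplified subset that loosely follows the Library of Congress Dublin Core to MARC crosswalk and should be checked against that crosswalk before any real use; the function name, the record layout, and the sample values are my own assumptions for illustration.

```python
# Simplified, illustrative subset of a Dublin Core to MARC mapping.
# Assumption: each DC element maps to exactly one (tag, subfield) pair.
DC_TO_MARC = {
    "title": ("245", "a"),
    "description": ("520", "a"),
    "subject": ("653", "a"),
    "publisher": ("260", "b"),
}


def crosswalk(dc_record):
    """Translate a DC record (a dict) into (tag, subfield, value) tuples.

    Elements without a mapping are returned separately so that nothing
    is lost silently during translation.
    """
    marc_fields = []
    unmapped = []
    for element, value in dc_record.items():
        if element not in DC_TO_MARC:
            unmapped.append(element)
            continue
        tag, subfield = DC_TO_MARC[element]
        # Repeatable elements (such as subject) may arrive as a list.
        values = value if isinstance(value, list) else [value]
        for item in values:
            marc_fields.append((tag, subfield, item))
    return marc_fields, unmapped


# Hypothetical record, made up for this example.
example = {
    "title": "Example resource title",
    "subject": ["Geology", "Plate tectonics"],
    "audience": "K-12 teachers",
}
fields, leftovers = crosswalk(example)
print(fields)
print(leftovers)
```

Reporting the unmapped elements, rather than dropping them, is one way to keep a translated record from quietly becoming less complete than its source.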


Accuracy & Completeness

Perhaps the most central concept of organization is accuracy. If metadata records do not accurately represent the items, then all else is lost; information may as well be thrown away if it cannot be logically retrieved. Similarly, misspellings and typos are deadly efficient at losing records (Barton et al., 2003).

Metadata should be as complete as time and resources allow (Bruce, 2004). All elements that apply to an item should be completed. Furthermore, the information in those elements should be as full and complete as possible. Again, the level of completeness depends upon the mission of the library, in other words, the budget.
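A completeness check of this kind could be partly automated. The following is a minimal sketch, assuming a record is stored as a simple dictionary keyed by element name; the function name, the list of applicable elements, and the sample record are hypothetical and not taken from any particular cataloging system.

```python
def missing_elements(record, applicable_elements):
    """Return the applicable elements that are absent or empty in the record."""
    missing = []
    for element in applicable_elements:
        value = record.get(element)
        # Treat whitespace-only strings the same as empty ones.
        if isinstance(value, str):
            value = value.strip()
        if not value:
            missing.append(element)
    return missing


# Hypothetical record, made up for this example.
record = {
    "title": "Example resource title",
    "description": "   ",
    "subject": "Paleontology--Paleocene",
}
applicable = ["title", "description", "subject", "identifier"]
print(missing_elements(record, applicable))  # ['description', 'identifier']
```

A check like this only shows whether an element has been filled in. Whether the content is full enough still requires a cataloger's judgment.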


Related to completeness is the importance of specificity. Subject indexing, for instance, should be as specific as possible. Consider an item about Paleocene paleontology. The LC subject Paleontology--Paleocene is preferable to simply Paleontology. Through attention to specificity, records will in turn be more accurate and easier to retrieve quickly and precisely.



Consistency & Repetition

Additionally, descriptive names and terms should be used with consistency. It is this consistency that is at the root of classification.
A specific method of improving consistency is the use of controlled vocabularies. Controlled vocabularies limit the infinite range of words and names to a manageable number, establish cross-referencing, and in short, make effective organization and retrieval possible. From simple synonym rings to thesauri and complex ontologies, controlled vocabularies should be used whenever they can be. Although highly useful, even indispensable, controlled vocabularies are, as Coleman et al. (2004) note, highly expensive as well. Fluency in complicated vocabularies requires extensive training and experience.

A concept related to the controlled use of terms is repetition. When considering flexible elements like keywords and description, excessive repetition of words only wastes space. For a user who is looking for a source on tabby cats, a book or webpage entitled "Tabby Cats" will already be retrieved with a keyword search on the title, so do not waste valuable metadata content by including "tabby" or "cats" in the keywords. Clearly there will be some cases, in the description perhaps, where overlap is unavoidable, but in general, repetition should be avoided.
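The tabby cats example can be expressed as a simple check. This is only a sketch under my own assumptions: it compares lowercase words, ignores punctuation, and treats a keyword as redundant only when every one of its words already appears in the title. The function name is hypothetical.

```python
import re


def redundant_keywords(title, keywords):
    """Return the keywords whose every word already appears in the title."""
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    flagged = []
    for keyword in keywords:
        words = re.findall(r"[a-z0-9]+", keyword.lower())
        if words and all(word in title_words for word in words):
            flagged.append(keyword)
    return flagged


print(redundant_keywords("Tabby Cats", ["tabby", "cats", "feline coat patterns"]))
# ['tabby', 'cats']
```

The check is deliberately literal. It would not catch plural or variant forms ("cat" against "Cats"), which is one reason a human review of keywords remains useful.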


Subject Indexing

An important issue of subject indexing is its exhaustivity. Indexing can be either summative or exhaustive. Those in the summative camp maintain that only the one main subject should be recognized, while exhaustive catalogers see the benefits of assigning subtopics to create a more thorough depiction of the item. Taylor (2004) points to the rationale behind these two ideologies as related to users -- document retrieval vs. information retrieval. While one is not necessarily better than the other, these overriding missions must be established by the library in order to guide catalogers. If we take another look at the issue of cost effective cataloging, we can see that a catalog designed with information retrieval in mind will be much more extensive and require much more work. On the other hand, a library that has chosen to limit its catalog's purpose to document retrieval will subscribe to Graham's (1990) "lean records" approach and likely only supply users with summative subject indexing.


Granularity

With online resources, attention must be paid to granularity. This can become an ambiguous concept when working with websites. Whereas the breadth of a book is quite clear -- it is a tangible item with a concrete, unchanging number of pages -- the dimensions of a web resource are generally unclear. Does a website constitute one "item?" What if the site is huge and links to non-related content? Is a single webpage one "item?" What if the content of one essay is spread across several consecutive webpages? Generally, the lower the granularity, the better the metadata, but this is clearly a classification concept that falls more in Merrill's (1912) art than science.


Currency

Finally, the issue of currency is closely allied with accuracy. Unlike static books or other physically tangible materials, websites are ever-changing. Generally, the best sites are updated regularly, so even for resources that have already been cataloged, the records must be updated regularly to remain accurate.
Ideally, the metadata in digital libraries should be checked and revised at designated, regular intervals and the date of the latest inspection recorded in the record. This presents one of the largest challenges to the cataloging of digital resources and clearly requires extensive man-hours.
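To illustrate how regular inspection might be scheduled, here is a minimal sketch that lists the records whose last inspection date is older than a chosen interval. The 90-day interval, the field names, and the sample records and dates are all assumptions made up for the example, not a recommendation drawn from the literature.

```python
from datetime import date, timedelta


def records_due_for_review(records, today, interval_days=90):
    """Return titles of records last checked at least interval_days ago."""
    cutoff = today - timedelta(days=interval_days)
    return [r["title"] for r in records if r["last_checked"] <= cutoff]


# Hypothetical records and inspection dates, made up for this example.
records = [
    {"title": "Example record A", "last_checked": date(2004, 10, 15)},
    {"title": "Example record B", "last_checked": date(2004, 6, 1)},
]
print(records_due_for_review(records, today=date(2004, 12, 10)))
# ['Example record B']
```

A list like this only tells the cataloger which records to revisit. Confirming that the URL still resolves and that the content still matches the description remains manual work.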


3. Revised Quality Checklist

Component | Specifics | General Assessment of the 10 Records
Accuracy | Any typos, misspellings? | No
Accuracy | Is the metadata an accurate representation of the item? | Usually (see individual problems in record assessments above)
Accuracy | Is there general adherence to standards of the schema (date formatted correctly, words spelled out or abbreviated, etc.)? | Yes
Completeness | Are all appropriate elements included? | Almost always
Completeness | Are they specific enough (especially in the case of subject indexing)? | Some subject headings could be more specific.
Consistency | Use of controlled vocabulary whenever possible? | Yes
Consistency | Names/terms consistent with one another? | Almost always
Repetition | Do keywords replicate information already included in other elements? | No
Subject Indexing | Summative or exhaustive? | Mostly exhaustive
Granularity | High or low? | Variable
Currency | Access -- Is the URL still good? | Yes
Currency | Has any change of content occurred since cataloging? | No
Currency | Compare update date on website (if available) to creation/update date on metadata. | Since resources were cataloged less than two months ago, all but one remain the same. (The content of the DinoDictionary has not changed, but the site now features a Christmas advertisement on the homepage.)
Language | Appropriate for the intended user group? | Yes, the audience for ENC Online is math and science K-12 teachers.


Added notes: Although it must be measured internally, either at the time of metadata creation or retrospectively through extrapolation based on total number of cataloger hours per total number of metadata records created, cost effectiveness must always be taken into consideration. Furthermore, effectiveness can ultimately be measured only by users, so on a larger scale, effectiveness of the user interface should be considered. Language has been included with user effectiveness in mind; however, it could be a controversial addition. There are obviously only certain elements in which this is practical. User-appropriate language can only be considered for elements such as keywords and description, and in cases involving multiple user groups, it could become ambiguous. Nonetheless, in the case of a digital library designed for use by children, appropriate language would clearly be necessary.


4. Bibliography

Barton, J., Currier, S., & Hey, J. M. N. (2003). Building quality assurance into metadata creation: an analysis based on the learning objects and e-prints communities of practice. <http://www.siderean.com/dc2003/201_paper60.pdf>
Accessed Nov. 27, 2004.

Bruce, T. (2004). The continuum of quality: defining, expressing, and exploiting metadata. <http://metadata-wg.mannlib.cornell.edu/forum/index.php?date=2004-05-21>
Accessed Dec. 2, 2004.

Coleman, A., et al. (2004). A framework for the future of educational digital libraries: metadata and vocabularies for learning.
<http://swiki.dlese.org/quality/uploads/1/QualityWG3final.pdf> Accessed Nov. 2004.
 
Digital Library for Earth System Education (DLESE). <http://www.dlese.org/dds/index.jsp> Accessed Dec. 2004.

Farb, S. E. & Riggio, A. (2004). Medium or message? A new look at standards, structures, and schemata for managing electronic resources. Library Hi Tech, 22(2), 144-152.

Graham, P. S. (1990). Quality in cataloging: making distinctions. The Journal of Academic Librarianship, 16(4), 213-218.

Jackson, S. L. (1958). Catalog use study: director's report. Chicago: American Library Association. pp. 1-3.

McCain, C. & Shorten, J. (2002). Cataloging efficiency and effectiveness. Library Resources and Technical Services, 46(1), 23-31.

Merrill, W. S. (1912). A code for classifiers--its scope and its problems: the one-topic book. The Library Journal, 37(5), 245-251.

Merrill, W. S. (1912). A code for classifiers--its scope and its problems: the two-topic book. The Library Journal, 37(6), 304-310.

Milstead, J. & Feldman, S. (1999). Metadata: cataloging by any other name. <http://www.onlineinc.com/onlinemag/OL1999/milstead1.html>
Accessed Nov. 2004.

Paiste, M. S. (2003). Defining and achieving quality in cataloging in academic libraries: a literature review. Library Collections, Acquisitions, & Technical Services, 27, 327-338.

Taylor, A. G. (2004). The Organization of Information, 2nd ed. Westport, Connecticut: Libraries Unlimited.