I give permission for my work to be part of the SIRLS Learning Showcase, both assignments 1 and 2.

Trevor Smith
IRLS 501, Dr. Coleman
December 10, 2004

Assignment 2
Metadata Evaluation

On Evaluating Metadata Quality for Educational Tasks

My favorite scene in the novel “Twenty Thousand Leagues Under the Sea” involves two of the main characters discussing classification.  Conseil, the personal servant of the illustrious French academic Professor Aronnax, is anxious to impress the Canadian harpooner, Ned Land, with his knowledge of fish taxonomy.  Ned has a much simpler classification technique: he divides the aquatic world into fish that can be eaten and fish that cannot be eaten, a system that fits his needs perfectly.  (Verne, 2000)  So it is with metadata: its quality must be evaluated in relation to its purpose.

Taylor (2004) defines the general purpose of metadata as “…to provide a level of data at which choices can be made as to which information packages one wishes to view or search…” without having to examine the actual resource.  It can also be described as “…structured, descriptive information about a resource...” (Coleman, deCharon, Frost, Ginger, & Raskin, 2004) for the purpose of finding, identifying, selecting, and obtaining it.  (Attig, 1998)

On that basis it would seem reasonable to judge the quality of metadata on the following characteristics (provided in assignment 2): consistency, accuracy, granularity, and level of subject indexing.
These are excellent general measures of metadata quality, but can additional or more specific criteria be found when the metadata will be used to describe resources in a specific domain?  A brief literature search was done to find articles related to metadata quality in education to develop a checklist for evaluating metadata quality for educational tasks.

An interesting place to start is with Peter Graham’s admission that “Quality is more difficult to define, and, though it is often assumed and praised in the literature of bibliographic control, it doesn't seem to be well delineated.” (Graham, 1990)  In other words, no matter what objective criteria we may have in mind, cataloging and metadata creation are still something of an art that must be evaluated with a level of subjectivity.

Standardization is an important criterion, and the one most frequently mentioned in the reviewed literature.  “It is this larger concept of standardization that information professionals understand and use.  They are concerned both with how to write down the descriptive information and what to write down.” (Milstead & Feldman, 1999) The type of external standardization that makes it possible to locate and exchange information (usually in the form of a controlled vocabulary) is called interoperability. (Barton, Currier, & Hey, 2003) Further metadata quality problems discussed in their paper include duplication of metadata over dissimilar learning objects, internally inconsistent terminology, description of characteristics of objects without describing the content, and over-use of software default values.  Clearly there are two aspects of consistency, internal and external.

Interoperability and the issue of external consistency are identified by Laurel Clyde: “The challenge will be to bring all the paths together in a set of standards that are widely accepted internationally across a range of Internet and other applications.” (Clyde, 2002)  But the application of controlled vocabularies and other standardization methodologies is a time-consuming task that requires skill and training, not to mention general agreement (currently lacking) on which standard is to be followed.  (Anido et al., 2002)

Roy Tennant of Library Journal has recent experience with the problems inconsistent metadata can cause.  He mentions lack of a controlled vocabulary, the wrong type of data or the mis-mapping of data in a metadata element, and encoding problems (especially the date). (Tennant, 2004)
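
Two of the problems Tennant describes, values that fall outside a controlled vocabulary and inconsistently encoded dates, are the kind that can be checked mechanically.  What follows is only a minimal sketch in Python.  The element names ("type", "date"), the short list of permitted resource types, and the choice of the YYYY-MM-DD date encoding are all assumptions made for illustration; none of them comes from Tennant's article or from the metadata evaluated in this paper.

```python
import re
from datetime import datetime

# Hypothetical controlled list for a "type" element (an assumption).
RESOURCE_TYPES = {"text", "image", "interactive resource"}

# Assumed date encoding: four-digit year, two-digit month, two-digit day.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def date_is_well_encoded(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Right shape, but not a real date (for example, February 30).
        return False
    return True


def find_problems(record):
    """List the consistency problems found in one metadata record (a dict)."""
    problems = []
    if record.get("type") not in RESOURCE_TYPES:
        problems.append("type: value not in controlled list")
    if not date_is_well_encoded(record.get("date", "")):
        problems.append("date: not encoded as YYYY-MM-DD")
    return problems


clean = find_problems({"type": "text", "date": "2004-12-10"})
messy = find_problems({"type": "Web page", "date": "December 10, 2004"})
```

Here the first record passes both checks and the second fails both.  A check like this can flag a value that is outside the list, but it cannot tell whether the right value was chosen, which still calls for a trained person.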

Tennant also discusses granularity in two articles.  His first observation is that “Excellent metadata is constructed in layers.” (Tennant, 2002)  By this he means high granularity where it is appropriate, and good (standardized) summative elements.  Tennant also points out that “Highly granular metadata doesn’t come cheap…[it] is hard-won and easily lost.  Identifying and appropriately encoding metadata elements usually requires a person—and one with training.” (Tennant, 2002)  Harvey examines the specifics of training and education that may be required to consistently produce quality metadata.  (Harvey, 2003)

Students and educators have special needs when it comes to locating educational resources.  So, metadata should be evaluated for its comprehensiveness when it comes to educational level and resource type.  Students also tend to be sensitive to the amount of time required to find and use a learning object.  (Augustine & Greene, 2002)  Insight about the possible metadata requirements for learning objects can also be gained on the IEEE Learning Technology Standards Committee website on Learning Object Metadata.  (Hodgins, 2004)

Finally, with regard to the importance of quality metadata creation, especially for online resources, Ron Chepesiuk has a warning.  “For the past 25 years, OPACs have been at the center of the library world…That era is over.  Ask any patron how many times a week he uses an OPAC and how many times he uses a web search engine.  The answer to that question should scare us.” (Chepesiuk, 1999)

Using the information provided in these articles and understanding gained in this class, the following is my proposed checklist for evaluating metadata quality for educational tasks.  The original four criteria have been included:
  1. Internal Standardization – Is the metadata consistent in the use of terms to describe the same things across all the elements?
  2. External Standardization – Do all the elements that should have a controlled vocabulary use one?  (Especially subject using LCSH, and the elements that use a list of controlled values.)
  3. Accuracy – Does the metadata reflect with precision the resource it is describing?  Do elements like title exactly match the resource?
  4. Completeness – Does the metadata capture the scope of the resource or does it leave out important aspects?
  5. Audience – Does the metadata include audience information essential to the educational mission of the resource (language, education level, learning time)?
  6. Uniqueness – Are there sufficient unique elements in the metadata to differentiate the resource from other, similar resources?
  7. Granularity – Does the description provided in the metadata go to a sufficient level of detail to accomplish its purpose?
  8. Non-text Objects – Are all non-text objects of value described by the metadata?  This is important because without appropriate metadata, search engines and other indexing tools will not identify these items.
  9. Level of Subject Indexing – Is the subject indexing summative or exhaustive?
  10. Objectivity – Is the metadata free of cultural, racial, and gender bias?  Are all the subjective evaluations indicated in the metadata made solely on the basis of evidence found in the resource?
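
The checklist above can also be written down as a small data structure so that each record is evaluated against the same ten items.  The following Python sketch is one hedged way of doing so.  The class name, field names, and the decision to treat eight items as yes/no answers and two as graded values are assumptions made for illustration, not part of the assignment.  The sample answers are those recorded for UBC Roadmap to Computing in Table 2.

```python
from dataclasses import dataclass

# The eight checklist items answered with yes or no.
YES_NO_CRITERIA = (
    "internal_standardization",
    "external_standardization",
    "accuracy",
    "completeness",
    "audience",
    "uniqueness",
    "non_text_objects",
    "objectivity",
)
# The two checklist items recorded as graded values.
GRANULARITY_LEVELS = ("very shallow", "shallow", "medium", "deep")
INDEXING_LEVELS = ("summative", "exhaustive")


@dataclass
class ChecklistResult:
    """One evaluator's answers for a single metadata record (hypothetical)."""
    resource: str
    answers: dict  # criterion name -> True (yes) or False (no)
    granularity: str
    subject_indexing: str

    def __post_init__(self):
        # Refuse incomplete or out-of-range evaluations.
        missing = [c for c in YES_NO_CRITERIA if c not in self.answers]
        if missing:
            raise ValueError(f"unanswered criteria: {missing}")
        if self.granularity not in GRANULARITY_LEVELS:
            raise ValueError(f"unknown granularity: {self.granularity}")
        if self.subject_indexing not in INDEXING_LEVELS:
            raise ValueError(f"unknown indexing level: {self.subject_indexing}")

    def failed(self):
        """Return the yes/no criteria answered 'no', in checklist order."""
        return [c for c in YES_NO_CRITERIA if not self.answers[c]]


ubc = ChecklistResult(
    resource="UBC Roadmap to Computing",
    answers={
        "internal_standardization": True,
        "external_standardization": True,
        "accuracy": True,
        "completeness": False,
        "audience": True,
        "uniqueness": True,
        "non_text_objects": False,
        "objectivity": True,
    },
    granularity="shallow",
    subject_indexing="summative",
)
print(ubc.failed())
```

Requiring an answer for every item keeps evaluations comparable from one resource to the next, but it does nothing to remove the subjectivity of the answers themselves.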

An evaluation of the metadata quality from Assignment 1

The following two tables summarize the evaluation of the metadata submitted as part of Assignment 1.  Table 1 looks at the criteria provided as part of the assignment.  Table 2 uses the checklist developed above.

Resource                                              Consistency  Accuracy  Granularity   Subject Indexing
A Brief History of the Internet                       yes          yes       deep          exhaustive
UBC Roadmap to Computing                              yes          yes       shallow       summative
CompuTREK:  Internet Error Messages                   yes          yes       deep          exhaustive
UofO Libraries:  Web Publishing Curriculum Resources  yes          yes       shallow       summative
TechEncyclopedia                                      no           yes       shallow       summative
Internet Archive Wayback Machine                      no           no        very shallow  summative
W3 Schools                                            yes          yes       shallow       summative
Computer Almanac                                      yes          yes       shallow       exhaustive
Free On-Line Dictionary Of Computing                  yes          yes       shallow       summative
Living Internet                                       yes          yes       medium        summative

Table 1
Quality of Metadata Submitted in Assignment 1
Criteria Provided in Assignment 2



                                                      Internal  External                                                            Non-text  Subject
Resource                                              Standard  Standard  Accuracy  Completeness  Audience  Uniqueness  Granularity   Objects   Indexing    Objectivity
A Brief History of the Internet                       yes       yes       yes       yes           yes       yes         deep          yes       exhaustive  yes
UBC Roadmap to Computing                              yes       yes       yes       no            yes       yes         shallow       no        summative   yes
CompuTREK:  Internet Error Messages                   yes       yes       yes       yes           yes       yes         deep          yes       exhaustive  yes
UofO Libraries:  Web Publishing Curriculum Resources  yes       no        yes       no            yes       yes         shallow       no        summative   yes
TechEncyclopedia                                      no        yes       yes       no            yes       yes         shallow       no        summative   yes
Internet Archive Wayback Machine                      no        no        no        no            no        yes         very shallow  no        summative   yes
W3 Schools                                            yes       yes       yes       no            yes       yes         shallow       no        summative   yes
Computer Almanac                                      yes       yes       yes       yes           yes       yes         shallow       yes       exhaustive  yes
Free On-Line Dictionary Of Computing                  yes       yes       yes       no            yes       yes         shallow       yes       summative   yes
Living Internet                                       yes       yes       yes       no            yes       yes         medium        no        summative   yes

Table 2
Quality of Metadata Submitted in Assignment 1
Proposed Checklist: Metadata Quality for Educational Tasks

Several factors are important to keep in mind relating to this evaluation of metadata quality.  The metadata creator was inexperienced, so there are likely to be mistakes or omissions that would not happen again were the assignment to be repeated.  The creator is also the evaluator, so the assessment is likely to be somewhat subjective; it is easy to believe that because you knew what you were thinking, an impartial reviewer would also know what you were thinking.  Also, the metadata was created from the beginning with the idea of serving an educational purpose at an academic institution.  If the metadata had been selected from more general sources, some of the specific items on the checklist would be much less likely to be at an acceptable level.

In general, the level of consistency was relatively high.  Table 1 lumps all the aspects of consistency together, whereas Table 2 breaks the criterion into Internal and External Standardization.  Where negative marks were issued, it usually had to do with improper use of the LCSH: either a heading that was blatantly wrong or one that was not specific enough.  There appeared to be no violations of controlled values.  Internal inconsistencies occurred in one set of metadata, and one resource was so broad that its scope made consistency difficult.

Accuracy tended to be very high.  This is not surprising if you consider that every element that could be copied was cut and pasted from the actual source.  Again, the tricky broad resource (Internet Archive Wayback Machine) did not get a yes in accuracy because the metadata referred to its function but not its content.

Completeness and Subject Indexing were closely correlated with the scope of the resource.  If it was a small single topic resource, it was easy to cover all the subject areas and do so in an exhaustive manner.  The big and broad resources generally were summarized and had some gaps in the completeness of their metadata descriptions.

Four values could be supplied in the Granularity columns: very shallow, shallow, medium, or deep.  Shallow was the most common value, indicating that finer granularity might be something to look at.  The one “very shallow” rating belongs to that big and broad resource, now a consistent theme.
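
The distribution of Granularity ratings can be tallied directly from the values recorded in the tables.  The short Python sketch below is offered only as a way of checking the count; the ratings are copied from the Granularity column, and the variable names are arbitrary.

```python
from collections import Counter

# Granularity ratings as recorded in the Granularity column of the tables.
granularity = {
    "A Brief History of the Internet": "deep",
    "UBC Roadmap to Computing": "shallow",
    "CompuTREK:  Internet Error Messages": "deep",
    "UofO Libraries:  Web Publishing Curriculum Resources": "shallow",
    "TechEncyclopedia": "shallow",
    "Internet Archive Wayback Machine": "very shallow",
    "W3 Schools": "shallow",
    "Computer Almanac": "shallow",
    "Free On-Line Dictionary Of Computing": "shallow",
    "Living Internet": "medium",
}

# Count how many resources received each rating.
counts = Counter(granularity.values())
for level in ("very shallow", "shallow", "medium", "deep"):
    print(f"{level}: {counts[level]}")
```

Six of the ten resources were rated shallow, two deep, one medium, and one very shallow.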

Audience and Uniqueness were both quite positive.  The worksheets used in gathering the metadata were very helpful in this regard.  They ensured that the specific needs of educators and students were addressed.  Every resource had Audience, Interactivity Type, and Typical Learning Time elements.  By following the process outlined in “Guide to Selecting and Cataloging Quality WWW Resources for the Small Library” (Coleman, 2004), appropriate metadata was gathered to distinctly identify each resource.

Non-text Objects were something of a problem.  For reasons mentioned above, it is a good idea to describe them with metadata at a somewhat granular level.  This is definitely an area for improvement.

Finally, Objectivity was positive across the board.  Part of this could be because the general subject area, online technology and the Internet, has a bit less potential for personal bias than topics in the humanities or social sciences.  Most likely it has to do with the biases of the evaluator exactly matching the biases of the metadata creator.


References

Anido, L. E., Fernandez, M. J., Caeiro, M., Santos, J. M., Rodriguez, J. S., & Llamas, M. (2002). Educational metadata and brokerage for learning resources. Computers and Education, 38(4), 351-374.

Attig, J. (1998). Committee on Cataloging:  Description and Access, Task Force on Metadata and the Cataloging Rules, Final Report, Metadata and Cataloging:  Supporting Common User Tasks. American Library Association. Retrieved November 28, 2004, from http://archive.ala.org/alcts/organization/ccs/ccda/tf-tei3.html

Augustine, S., & Greene, C. (2002). Discovering how students search a library Web site: a usability case study. College and Research Libraries, 63(4), 354-365.

Barton, J., Currier, S., & Hey, J. M. N. (2003). Building quality assurance into metadata creation: an analysis based on the learning objects and e-prints communities of practice. Dublin Core Conference. Seattle, WA. Retrieved December 1, 2004, from http://www.siderean.com/dc2003/201_paper60.pdf

Chepesiuk, R. (1999). Organizing the Internet: the "core" of the challenge. American Libraries, 30(1), 60-63.

Clyde, L. A. (2002). Metadata. Teacher Librarian, 30(2), 45-47.

Coleman, A., deCharon, A., Frost, C. O., Ginger, K., & Raskin, R. (2004). A Framework for the Future of Educational Digital Libraries:  Metadata and Vocabularies for Learning. DLESE Quality WG 3:Metadata Structures. Retrieved November 25, 2004, from http://swiki.dlese.org/quality/4

Coleman, A. S. (2004). Guide to Selecting and Cataloging Quality WWW Resources for the Small Library. Fairfield, CA: Learning Resources Association of the California Community Colleges.

Graham, P. S. (1990). Quality in cataloging: making distinctions. The Journal of Academic Librarianship, 16, 213-218.

Harvey, R. (2003). Promoting Quality Metadata in Libraries: The Role of Education. Malaysian Journal of Library & Information Science, 8(2), 79-93.

Hodgins, W. (2004). IEEE Learning Technology Standards Committee, WG12: Learning Object Metadata. IEEE. Retrieved December 4, 2004, from http://ltsc.ieee.org/wg12

Milstead, J. L., & Feldman, S. E. (1999). Metadata: cataloging by any other name. Online (Weston, Conn.), 23(1), 24-26.

Taylor, A. G. (2004). The Organization of Information. Westport, CT: Libraries Unlimited.

Tennant, R. (2002). The importance of being granular. Library Journal, 127(9), 32-34.

Tennant, R. (2002). Metadata as if libraries depended on it. Library Journal, 127(7), 32-34.

Tennant, R. (2004). Metadata's Bitter Harvest. Library Journal, 129(12), 32.

Verne, J. (2000). Twenty Thousand Leagues Under the Sea. World Wide School. Seattle, WA. Retrieved December 1, 2004, from http://www.worldwideschool.org/library/books/lit/adventure/TwentyThousandLeaguesUndertheSea/chap14.html


Appendix 1
Resources Used for Metadata Quality Analysis


  1. A Brief History of the Internet
    http://www.isoc.org/internet/history/brief.shtml

  2. UBC Roadmap to Computing
    http://www.roadmap.ubc.ca/index.html

  3. CompuTREK:  Internet Error Messages
    http://www.daviestrek.com/computrek/error.htm

  4. University of Oregon Libraries:  Web Publishing Curriculum Resources
    http://libweb.uoregon.edu/it/webpub/

  5. TechEncyclopedia
    http://www.techweb.com/encyclopedia/

  6. Internet Archive Wayback Machine
    http://www.archive.org/web/web.php

  7. W3 Schools
    http://www.w3schools.com/default.asp

  8. Computer Almanac:  Interesting and Useful Numbers About Computers
    http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/bam/www/numbers.html

  9. Free On-Line Dictionary Of Computing
    http://foldoc.doc.ic.ac.uk/foldoc/index.html

  10. Living Internet
    http://livinginternet.com/