Jessica Browning
Assignment #2: Final
December 10, 2004
(Essay)
Evaluating and Eliminating Inconsistencies in Metadata Creation
Part 1: Assignment 1 Criteria
Overview
In evaluating my metadata based upon the criteria
outlined in our assignment, I found many things to consider when
creating metadata. While some of the ideas concerning subject
indexing had been addressed in my initial report (assignment 1), other
ideas regarding consistency should have been addressed at the time I
was creating my initial metadata. Even though my inexperience in
metadata creation contributes to my lack of consistency, some errors
could--and should-- have been avoided by closer examination of my
sources and a closer proofreading of my metadata. This was
alluded to in the comments given to us regarding Assignment 1, but I
didn't fully realize exactly where my failures occurred until I
re-evaluated my sources and metadata. As a student, I had the
time/incentive to check my work; however, if I were creating my data as
an actual job for an actual library, I probably never would have
discovered some of the errors that I had simply because I would have
moved on to other projects.
1. Consistency and Subject Indexing: Standardizing Vocabulary in Metadata and Classification Schemes
My thinking about consistency in metadata creation was prompted by
the DLESE Quality EG 3 Metadata Structures white paper made available
as a lecture note. In the white paper, the work group discusses the
problems of "standardizing" metadata vocabularies to facilitate
information retrieval. While the white paper only discusses
information retrieval in terms of assigning subject terms to metadata,
I believe that a review of classification systems (in particular, the
Library of Congress cataloging system, which I used) can also be
beneficial at this time, given the research being done on metadata
vocabulary standardization.
When I researched my selected topic ("Japanese kanji online
resources") using Google, I encountered a variety of sites, not all of
them immediately relevant. Initially, I blamed my search skills;
however,
a quick visit to the Library of Congress's classification scheme
revealed that
every part of the Japanese language--from kanji to classical Japanese
to local
dialect--is lumped under PL501-699: Japanese Language. To me,
this seems
to be a little dismissive of the fact that Japanese is a complex
language made
up of many elements. One can argue that a broad classification is
more than adequate for the casual researcher (after all, perhaps not
everyone realizes that Japanese has many elements), but for the more
advanced user (a person who knows "exactly what they need") a broad
classification could slow the retrieval process because the user has
to pick through several resources to find what he/she needs.
In terms of metadata creation, inconsistency in subject
classification can also lead to problems with database mergers.
Suppose multiple libraries are sharing resources, which implies that
multiple metadata creators share, and will continue to share, virtual
resources as well. If one metadata creator uses the broad
classification "Japanese language" to describe both Japanese kanji
dictionaries and JLPT word lists, and another cataloger uses the terms
"Japanese kanji" and "JLPT--Japanese Language Placement Tests" to
describe the same respective resources, the result is discrepancies in
information retrieval as well as redundant records for the same
resource. Taylor
addresses this issue in chapter 11 of the text when she states that "if
some of the metadata has been created using close classification, and
some has
been created using broad classification, the combined effect will be
confusing
and less helpful" (308). However, I believe that the accuracy of
a metadata record in terms of subject terms chosen also depends on the
knowledge of the individual creator on a particular subject. I
state that only using "Japanese language" to subjectively classify
every resource is too broad, but another creator may disagree because
he/she is unaware, for instance, that Japanese can be expressed using 4
different alphabet systems, with at least 2 being used in a single
written sentence. But, to suggest that a creator take the time to
research every subject he/she is creating metadata for in order to
ensure accuracy would be a bit outrageous--individual creators are too
busy to research everything.
Conversely, though, mega search engine creators are also somewhat at
fault for
not "us[ing] classification theory" to create search databases (Taylor
319). Currently, there are no specific schemes written for metadata
creation--metadata creators use a pre-existing scheme
such
as
LOC. Or, in the case of some mega search engines such as Yahoo!,
mega
search engine technicians create categories that don't "take advantage
of
an already-existing classification" (Taylor
319). From my own metadata creating experience, I started by
consulting
the LOC classification scheme to determine the pre-established
classifications
for my subject. Not satisfied with lumping every resource into
the
"Japanese language" category only, I was forced to arbitrarily create
my own subject terms based upon my own study experiences.
However, chances
are another creator would either stick with the "Japanese language
only" idea in order to adhere to time-honored classification schemes or
else choose to creatively "re-invent the wheel" based upon his/her
experiences as a virtual user. If another creator and I
simultaneously created metadata for the same resource and then compared
our
work, we could discover that we have different subject terms (Coleman
21).
Therefore,
a patron using my library would find a kanji resource by typing
"Japanese
kanji", but a patron using my colleague's library would have to type
"Japanese language", or “Sino-Japanese characters” to find the same
resource. This, to me, is time-consuming, frustrating, and
ridiculously
unnecessary.
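The merger problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a `url` plus a list of `subjects`), the function names, and the example.org address are all hypothetical and are not taken from any real catalog system.

```python
from collections import defaultdict

# Hypothetical records: the URL and the record layout are illustrative only.
library_a = [
    {"url": "http://example.org/kanji-dictionary",
     "subjects": ["Japanese kanji"]},
]
library_b = [
    {"url": "http://example.org/kanji-dictionary",
     "subjects": ["Japanese language", "Sino-Japanese characters"]},
]


def merge_by_url(*catalogs):
    """Combine records that share a URL, collecting every subject term used."""
    merged = defaultdict(set)
    for catalog in catalogs:
        for record in catalog:
            merged[record["url"]].update(record["subjects"])
    return merged


def subject_conflicts(*catalogs):
    """Return the URLs whose subject terms differ from one record to another."""
    seen = defaultdict(list)
    for catalog in catalogs:
        for record in catalog:
            seen[record["url"]].append(frozenset(record["subjects"]))
    return sorted(url for url, term_sets in seen.items()
                  if len(set(term_sets)) > 1)
```

Keying on the URL means the two libraries' records collapse into one entry instead of becoming redundant records, and the list returned by `subject_conflicts` shows where the creators would need to agree on terms.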
I seem to present two different views on classification consistency in
this
section, and I have done so purposely in order to illustrate the
problems that
a metadata creator faces when making cataloging decisions.
On one hand, I suggest that sticking to pre-established
cataloging schemes alone to catalog metadata is a bad idea because of
user
search pattern changes as well as the broadness of some categories; on
the
other hand, I then suggest that creating too many arbitrary categories
could
lead to trouble as well. Naturally, the
optimal solution would be to write a new classification scheme based
upon a
combination of pre-established specific classification schemes and the
"browsing" searching technique that the casual Internet user has
adopted as a means of obtaining information. Dealing with this
idea in
its entirety would produce a rather large report that is inappropriate
for the
scope/topic of this assignment, so I will merely present my ideas and
move
on. The first method would be to form a mixed committee
consisting of
classification theorists and Internet mega search site technicians to
study the
browsing habits of users, thereby creating a scheme that combines
already-existing ideas with new ideas. The second method would be
for
classification theorists to study the Internet and create a new
classification
scheme that is largely based on "Internet browsing principles."
The first method officially acknowledges the Internet as a valid
information
resource and places responsibility on several parties to make changes
to better
information organization as a whole; the second method somewhat
abandons
already-existing ideas and also, in a way, undermines the entire field
of
classification theory by suggesting that retrieval by browsing is
better than
retrieval by specific searching. Either way would be a
time-consuming and
expensive process. However, I believe that a standard which
breaks down large chunks of subject information and
reflects today's Internet "browsing" technique would benefit both
patrons and catalogers.
In my initial report, I rather boldly stated that
"If I were actually
cataloging the resources for a library, I would perhaps try to further
subclassify the 'Japanese language' into smaller categories such as
'Japanese kanji' and 'Japanese grammar'." However,
after taking a bit of time away from my assignment, I now realize that
this may
not be the brightest idea I've ever had. After all, if a user is
expecting to find resources under "Japanese language", and I've
divided resources into my own arbitrary categories, then a patron will
have a
hard time finding anything. To counter this problem, I listed
"Japanese language" as a subject term, and then created my own terms
based upon what I "thought" should work. Then, within my
metadata, I tried to use the same arbitrarily created terms in order to
make my
metadata consistent. I was mostly able to stay within the same
terms for
all 10 resources. However, I still ended up with a couple of
unique subject terms such as "Japanese alphabet system" for Resource
#9: CMJ Grammar Online and "Japanese kanji furigana" for Resource #10:
Kanji Furigana For Japanese Learners. Even though the subject
terms accurately reflect the information available in the web site, I'm
not sure that a patron would think of those specific search terms to
find those resources. To completely cover all possible subject
terms, I would give Resource #9 two more subject terms: "Japanese
hiragana" and "Japanese katakana". Then, keeping in mind that a
beginning Japanese student may not yet know/remember the word
"furigana", I would give Resource #10 another subject term called
"Japanese kanji readings".
As a controlled vocabulary measure, I prefaced every subject listing
with “Japanese”. This was a good move in terms of making a
distinction between Japanese kanji and Chinese kanji, but it also
makes sense from a cataloger's point of view. According to the
metadata creation text, my use of "Japanese" also leads to improved
subject information retrieval. Using the same terms to describe a
group of resources was relatively easy: I just had to scroll up to see
what I had entered before. However, what would happen if I hadn't
cataloged everything at once? I would first need a good knowledge of
what was available in my library, and then be able to find the
appropriate record or records. Because I prefaced everything with
“Japanese”, it doesn’t matter so much whether I can remember if I said
“Japanese kanji” or “Japanese writing system”; I only have to search
for “Japanese language” or “Japanese”, and my resources will hit. In
addition, I was careful to use abbreviations as well as full text:
JLPT and Japanese Language Placement Test. While the actual name of
the test is “nihongo nouryoku shiken”, most English speakers taking
the test probably realize that materials are classified under the
English translation, and using roman characters to search for Japanese
names doesn’t always work well anyway.
2. Accuracy
In creating my metadata resources, I followed the
instructions provided in the cataloging text. According to our
instructions, we should "Take title from the actual information
resource" (24). I do, however, question a few of the titles given
to the web pages. For instance, as a researcher, I've been taught
that in doing research, one should only use resources that "sound"
credible. Therefore, a resource titled "NASA Guide to Locating
Venus In The Night Sky" is a credible source since NASA is an official
space agency, but "Bubba's Guide to Finding Venus Among All Them
Sparkly Dots" probably isn't so credible. Even when I was
locating resources to create metadata for, I unconsciously filtered out
sites that didn't sound so good based upon the titles, the URLs, and
the descriptions. But, now that I look back at my metadata, I
question whether, as a patron, I'd find the sources I chose to be very
valid. Most of my titles, such as "Kanji alive" and "Kanji a
Day", sound okay, but I'm not so sure about "Jeffrey's Japanese-English
Dictionary--Gateway" or "Charles Kelly's Online Japanese Language Study
Materials". In my first report, I had anticipated that the
credibility of these sites might be questioned, so I justified my
choices by explaining that "there aren't a lot of
"really good" kanji
sites available on the web--there are a few authoritative
sources, and those are what other Japanese pages link to".
Ironically enough, in spite of their titles, Jeffrey's Japanese-English
Dictionary and Charles Kelly's Japanese Study Material site are both
well-known by Japanese language students.
At the same time, the majority of my
titles all have the word "kanji" in them. This is an accurate
description of the particular web site, but when I review my metadata,
I have trouble remembering which web site corresponds to which
title! I can easily recall what Jeffrey's Japanese-English
Dictionary and Charles Kelly's Online Japanese Language Study Materials
contained, but I have to sometimes read the descriptions to remember
the differences between "Kanji alive" and "Kanji-Step" and "The Kanji
SITE".
Now, in objectively analyzing my own
work, I find a few things I want to
question. The biggest error I discovered was in Resource
#9: CMJ Grammar Online. The title itself suggests that the site
contains primarily grammar. Also, in my description, I only
mention the grammar resources available. Yet, in my subject
terms, I list Japanese kanji. Here, I have to admit my failure to
be accurate. If I were listing Japanese kanji as a search term,
and the search "Japanese kanji" yielded this site, I should have told
the patron how the site was
relevant. Another error I discovered was in Resource #3:
Kanji-Step. In my description, I stated that the user could see
an introduction to all aspects of the Japanese language, and I also
suggest that there are specific sections for hiragana and
katakana. However, in my subject terms, I only used Japanese
language and JLPT--Japanese Language Placement Test. To properly
fix my record, I would include some more subject terms, such as
"Japanese grammar", "Japanese hiragana", "Japanese katakana", and so
forth. Now that I reflect back on the time I was making the
record, though, I remember that my initial focus was only Japanese
kanji. Therefore, I was only listing subject terms that seemed
relevant to my chosen topic. But, I now realize that if I
were taking the time to create metadata for a resource, I should create
data for the site in its entirety, and not only the parts relevant to
my topic. This oversight also occurred in Resource #4: The Kanji
SITE, Resource #5: Charles Kelly's Online Japanese Language Study
Materials, and Resource #7: Nihongo Web.
3. Granularity
In my research and metadata creation, I mostly presented
web sites rather than individual web pages in my metadata. While
this may be questionable, a majority of my web sites featured
information that was immediately accessible, if not one click away,
from the home page. Since I used Google to find relevant sources,
and then documented the search terms I used as subject terms, I assumed
that the entire content of a web site could be potentially useful to a
patron. In addition, though, even individual patrons may have
separate reasons for searching "Japanese kanji". For instance,
one patron may input "Japanese kanji" to find the joyo kanji list (a
government-approved list of 1,945 daily-use kanji), whereas
another patron may be looking for proficiency examination kanji lists,
and another patron may want a definition and/or historical explanation
of what kanji is. In my own research interests, I'd rather be
presented with web sites and then allowed to filter the useful from the
non-useful rather than pointed to direct pages that immediately offer
answers. This idea, along with the idea that patrons may have
different needs that stem from one topic, led me to present information
from a "top-down" approach. In my case, being a "top-down"
researcher is okay, since I have the time to dig through web pages and
sources to find what I want. However, for busy people, being
asked to find the proverbial needle in the haystack in 3 minutes or
less might cause a bit of unnecessary stress. As an information
professional, whose job is to organize things so that they can be found
easily, I've just failed in my job by asking people to "find it
themselves". This concept is discussed in a DLESE study titled
"Merging Metadata and Content-Based Retrieval". Using a
conceptually-based hybrid educational resource discovery system that
combines metadata and content-based retrieval methods to organize
information, the authors conducted an experiment based upon two
hypothetical situations: a teacher searching for a tool to use for one
class, and a teacher searching for resources to use in teaching a
6-week module. In presenting an overview on the design
methodology and considerations that went into designing the system, the
authors state that "many K-12 science educators are teaching
out-of-area-... and thus lack confidence in their ability to evaluate
the quality of science education resources" (3). Lack of
confidence means that "...considerable time can be required to
comprehend a resource in order to determine if it is indeed relevant or
not" (3). As an information professional, I need to consider the
needs of my entire campus community/patron base rather than those of
one group. In creating my metadata, I focused more on students
because of the subject I chose--I presumed that Japanese language
professors were either going to already know kanji, or else they had
their own resources for verifying a kanji. This was a mistake, as
a professor may need a tool to recommend to students and only have 10
minutes between classes to find the URL to use in the next
lecture.
But, at the same time, this
assignment required that we locate and create metadata for 10
sources. Yet, in reality, 10 sources for one subject is a bit
much for a casual researcher. I discuss this in my first
assignment's assessment when I state that "In
a realistic library setting, though, I don't believe that 10 links
would be offered to patrons...if
I were a patron faced with
10 different kanji sites to choose from, it would take a long time to
explore and choose my resources." However, a metadata creator may
argue that more advanced students may be completing projects which
require more sources. In this case, though, I would suggest that
a student writing, for instance, a thesis, wouldn't be using the
Internet as his/her primary research source. Also, in the case of
my chosen subject, a more advanced student probably wouldn't be using
the material I selected--the content is more referential than
informative in nature. While "subject needs" and "pedagogical
needs" are named and outlined in our metadata handbook, two other
factors which must be considered are the type of information
institution being cataloged for and the needs of the overall patron
base. For instance, a university that doesn't offer Japanese as a
potential course of study may only need 1 or 2 sources which define
kanji, so a lot of time doesn't need to be spent on finding many useful
resources when 1 or 2 will suffice. Another instance would be a
public library, where a patron needs a kanji dictionary but the library
doesn't own a print version. Therefore, the patron and the
librarian navigate the Internet to find a good dictionary, and the
librarian decides to simply bookmark the page in case it's ever needed
again. But, a university offering both undergraduate and graduate
studies in the Japanese language would perhaps need more resources that
cover a variety of kanji aspects (then again, Japanese language
students will have other sources to verify kanji, such as electronic
and paper kanji dictionaries--after all, I'm a Japanese student, and I
use my electronic dictionary rather than the Internet to look up
kanji).
Part 2: Personal Criteria
In choosing my personal criteria to be included, I consulted several
different studies. While each study focused on metadata quality
and retrieval in general or within various organizations, each paper
seemed to reflect the same message: there is a lot of poorly
constructed metadata in existence. Unfortunately, after
consulting these references and applying the ideas to my own project, I
found that I, too, had made several mistakes. In my defense, many
of my errors can be attributed to my inexperience in metadata creation
(after all, this was my first time creating metadata records!).
However, if I were an employee in a research institution and I
contributed my metadata to a catalog, the information would be
incomplete or, worse yet, hidden from users. The concept of
hiding a resource is discussed by Barton, Currier and
Hey in "Building Quality Assurance into Metadata Creation: an Analysis
based on the Learning Objects and e-Prints Communities of Practice"
when they state that "at worst, poor quality metadata can mean that a
resource is essentially invisible... and remains unused" (1).
But, even in constructing metadata for the first time, I realized just
how easily information can be omitted due to time constraints or lack of
foresight on the user's part. In a realistic job setting, I
probably would not have executed this much follow-up on checking the
metadata for 10 resources. In a way, this is the problem being
addressed by the papers I studied--no one is checking their work after
they've completed it. The lack of follow-up leads to more errors
occurring, which causes more people to research metadata quality.
Yet, in the end, everyone seems to find the same results. Through
reading the literature and realizing just how much the information
applied to me--a beginning-level graduate student with little practical
experience--, I wondered why professionals hadn't taken more time to
apply the results to their own work. After all, what is the point
of studying content analysis in GILS, presenting the information
complete with specific statistics and points of error, yet doing
nothing to fix the problems that were found? It doesn't make
sense to me! While nothing can be perfect, I would certainly
think that striving for perfection-- or at least a high degree of
correctness to ensure high retrieval rates-- would be an objective of
any metadata creator. (Please forgive my seemingly
accusative statements, as I am addressing my own deficiencies as well!)
There is one question which has 'No' for every resource, and that is
"complete elements". In completing Assignment 1, I found that I
could never use Element #14: "Coverage". In addition, each
resource has individual fields missing due to lack of information;
however, the most basic elements such as title, subject and description
are all completed. I don't believe that the missing 'Coverage'
element would hinder a user's ability to find the records, but my
inconsistency in matching title key words with subject words and
descriptive phrases will.
While I have constructed a list of 11 more elements to include as
checklist points, most of them stem from the same question: "Is my
metadata record complete enough?" The ideas are concepts,
rather than direct quotes, gathered from my reading--therefore, I can't
really provide specific page numbers.
Informative
  Title Words: Does the resource description use words from the
    title? (Moen)
  Subject Words: Does the resource description use words cited as
    subject keywords? (Moen)
  Summary: Does the metadata creator write a new summary (not merely
    copy from the web site)? (Moen)
Completeness
  Complete Elements: Are all 15 DC Elements filled in? If not, what's
    missing? Is this from lack of information or the metadata
    creator's error? Would the missing elements hinder a researcher's
    ability to find the page by typing in alternative terms (not using
    subject keywords)? (Moen)
Title
  Complete Title: Does the title in the metadata reflect the title of
    the website?
  Real Title: Would a patron searching for this page/page's content
    use the title/title key words to find the page in question?
    (Sokvitne)
  Correct Publisher: Is the publisher formatted correctly according to
    the official publisher name? (Sokvitne)
  Popular Name: Is the publisher formatted correctly according to
    popular name? For instance, if a publisher's name has been
    shortened in the academic community, can you search for that
    publisher's shortened name and find it too? (Sokvitne)
Subject
  Abbreviations: Do I use widely-known abbreviations? If I do, do I
    spell them out? (Barton)
  Alternative Language: Are alternative language search terms used?
    (Chan)
  Correct Language: Are foreign language terms that have been
    translated into English translated correctly, and do they make
    sense to foreign users as well? (Chan)
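The first two checklist questions lend themselves to a mechanical check. The following is a minimal sketch, assuming a record is a simple dictionary with `title`, `description`, and `subjects` keys; the stop-word list, the function names, and the strict reading of "Subject Words" (every word of every subject term must appear in the description) are my own assumptions, not part of the cited studies.

```python
import re

# A small, hypothetical stop list; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "for", "of", "the", "to"}


def words(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def check_record(record):
    """Apply the 'Title Words' and 'Subject Words' checks to one record."""
    description = words(record["description"])
    title = words(record["title"]) - STOP_WORDS
    # A subject term counts as covered only if all of its words appear
    # in the description (a strict reading, chosen for this sketch).
    missing_subjects = [
        term for term in record["subjects"]
        if not (words(term) - STOP_WORDS) <= description
    ]
    return {
        "title_words": bool(title & description),
        "subject_words": not missing_subjects,
        "missing_subjects": missing_subjects,
    }
```

Calling `check_record` on each record would yield the "Yes"/"No" answers for these two rows, along with the subject terms that the description fails to mention. The remaining checklist questions require human judgment and are not covered by this sketch.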
| Criterion        | R#1 | R#2 | R#3 | R#4 | R#5 | R#6 | R#7 | R#8 | R#9 | R#10 |
| Title Words      | Yes | Yes | Yes | Yes | No  | Yes | No  | Yes | Yes | No   |
| Subject Words    | No  | Yes | Yes | No  | Yes | No  | No  | Yes | No  | No   |
| Summary          | Yes | No  | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes  |
| Comp. Elem.      | No  | No  | No  | No  | No  | No  | No  | No  | No  | No   |
| Comp. Title      | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes  |
| Real Title       | Yes | Yes | Yes | Yes | Yes | Yes | No  | Yes | No  | Yes  |
| Correct Pub.     | N/A | Yes | Yes | N/A | N/A | N/A | N/A | Yes | Yes | Yes  |
| Popular Name     | N/A | Yes | Yes | N/A | N/A | N/A | N/A | Yes | Yes | Yes  |
| Abbrev.          | N/A | N/A | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes  |
| Alt. Language    | N/A | Yes | Yes | Yes | Yes | Yes | No  | Yes | Yes | Yes  |
| Correct Language | N/A | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes  |
Part 3: Individual Web Site/Metadata Analysis
Resource #1: Jeffrey's Japanese-English Dictionary--Gateway
As a metadata creator, I made a rather large mistake. In
writing
my summary, I only used the word "dictionary" from the title, and I
didn't use either one of my subject terms. A researcher reading my
summary might wonder not only why the record even came up as a search
result but also whether it has any relevance to his/her
search. In addition, a researcher doing a search for "online
dictionary"--wanting an English-only dictionary--would probably get
this record. While this oversight can be somewhat attributed to
inexperience (it was my very first record), I probably would have never
discovered the problem if I were in a real work setting. Thus, I
created metadata that isn't easily retrievable or made relevant to a
finder.
Resource #2: Kanji Alive
This site was created by the University of Chicago to
showcase the innovative kanji search tool they built. The site
ranks high in terms of subject relevance and granularity, as the user
is directed right to the tool. However, as a metadata creator, I
made a couple more errors. In my subject description, I merely
copied from the web site rather than writing my own. To a
researcher, it may seem that the site was added 'at the last minute',
with little to no thought about its actual relevance. I will
admit that the University of Chicago's description is rather complete,
and it made sense to simply reuse it. But, of my subject
key terms, only the word "kanji" appears in the description.
This in itself is fine, as the web site is showcasing a kanji
tool. However, in my added notes, I should have explained how
"dictionary" and "language" were relevant.
Resource #3: Kanji-Step
The subject indexing terms on this metadata record were
mostly used correctly, as "Japanese Language" is the only subject term
available from the Library of Congress. But, in my summary, I use
a lot more words that should have been included as subject terms--such
as "kanji", "writing", "reading", "hiragana" and "katakana". The
resource itself is valuable--I used it to learn hiragana and katakana
before coming to Japan--, but I didn't create my metadata thoroughly
enough to reflect its usefulness. Because the resource is being
offered by a language school rather than an individual, the information
available and the teaching styles used in its presentation also reflect
the quality of the site as a learning tool. But, I'm afraid I hid
it a little bit by not including enough subject terms.
Resource #4: The Kanji SITE--A Guide For Students of Japanese Kanji
Here, I made the same error as the previous entry--I
loaded my description with lots of potential subject words, but used
none of them as subject key terms. However, one term I wisely
included was "JLPT", since the site offers specific preparation for
this exam. I also included the abbreviation as well as the full
term.
Resource #5: Charles Kelly's Online Japanese Language Study Materials
In choosing this site, I offered a plethora of links to
patrons. As a researcher who prefers to find her own way
sometimes, this seemed like a good move--"show me what's out there and
I'll decide what's good and what's not". But, in terms of
granularity, I stayed extremely top-level. A researcher who is
frantically trying to find a good resource would wonder why I didn't
filter out "the good from the bad" for them. If I were to
keep this site as a record, I might use the site to find the relevant
pages, create metadata records for the individual pages, and use the
DC.IsPartOf tag to cite the main URL I retrieved each page from.
Then, if a researcher wanted to pick
through the entire web site, he/she could do so.
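As a rough sketch of that idea, page-level records could each carry a pointer back to the site-level record. Everything here is an assumption made for illustration: the dictionary layout, the `is_part_of` key (standing in for the DC.IsPartOf tag), the function names, and the example.org addresses, which are placeholders rather than the site's real URLs.

```python
# Hypothetical site-level record; the identifier is a placeholder URL.
site = {
    "title": "Charles Kelly's Online Japanese Language Study Materials",
    "identifier": "http://example.org/japanese/",
}


def page_record(title, identifier, parent, subjects):
    """Build a page-level record that points back to its parent site."""
    return {
        "title": title,
        "identifier": identifier,
        "subjects": list(subjects),
        # Stands in for the DC.IsPartOf tag mentioned in the text.
        "is_part_of": parent["identifier"],
    }


def pages_of(parent, records):
    """Return every record that cites the given site as its parent."""
    return [r for r in records if r.get("is_part_of") == parent["identifier"]]


# Illustrative page; the title and URL are invented for this sketch.
kanji_quiz = page_record(
    "Kanji quiz page",
    "http://example.org/japanese/kanji-quiz.html",
    site,
    ["Japanese kanji"],
)
```

A patron in a hurry would land directly on the page-level record, while a patron who prefers to browse could follow `is_part_of` up to the whole site.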
I was also remiss in not including more words from the title in my
description. However, the description reflects the ideas of the
title phrase, so I didn't completely fail to match the two.
Resource #6: Kanji a Day
This record somewhat reflects my experience in metadata
creation, as there was more of a correlation between the title,
description, and subject terms used. But, I still only used
one of the subject terms in my description. In addition, the
creator entry may be questionable to a user--Rob who? (As it turns out,
that's the only name given on the site.) If I were a user, and I
only saw "Rob" as a creator entry, I might wonder if the site was a prank
entry and immediately dismiss it as not useful.
Resource #7: Nihongo Web
In this record, I listed the promise of Japanese
proverbs, pictures, hiragana and katakana, Japanese computing, and
vocabulary, but didn't include them as subject key words. Also,
the site builders called their creation "Nihongo Web", but I didn't
bother including "Nihongo" as a search term. In this case, I
assumed that the user would be thinking in English, and would type
"Japanese" rather than "Nihongo" to find sources on the Japanese
language. If I wanted to include this site in a digital library
(I perhaps wouldn't, as it isn't very subject-specific), I would have
to specifically list "Nihongo" as a subject term--otherwise, in a title
search, the record wouldn't hit.
Resource #8: Japanese Language School--MLC Meguro Language Center (Tokyo)
Once again, my subject key words aren't listed in the
description. The subject key terms are fairly accurate (to
completely fix the record, I would have to include hiragana and
katakana as well), but my error lies in my description. I would
have to modify the description to (1) make it a bit longer, since a
short description suggests that the metadata creator doesn't have much
confidence in recommending the site to others, and (2) specifically tell
the user that the things listed in my subject terms are
available.
Resource #9: CMJ Grammar Online
The title on this resource doesn't reflect my chosen
topic at all (the same problem also occurs on Resources #7 and #8), and
the short description also reflects doubt in recommending the site to
others. I listed many subject terms, but don't include them in my
description. In addition, I also made a subject key term
error. I listed "Japanese alphabet system". While this is
somewhat smart, as Japanese writing consists of hiragana, katakana and
kanji (sometimes all 3 of them in 1 sentence!), I should have broken
down "alphabet" and listed hiragana and katakana separately. Or,
if I wanted to be completely accurate, I should list "Japanese alphabet
system" on all of my records as a subject key term. However, if I
did so, another metadata creator may accuse me of being
redundant. I'm not sure at this point where the line between
being thorough and being redundant lies. I suppose I would have
to research the search habits of my metadata users to properly
determine this.
Resource #10: Kanji Furigana For Japanese Learners
Once again, I have a language problem. "Furigana"
means "kanji readings", but I didn't use the word "furigana" in my
description. This isn't a big error, but a search engine doesn't
know multilingual equivalents unless a data creator sets them
beforehand. An average user won't be searching for
"furigana" sites (only a Japanese language student would know this
word), and the description does explain how the page works. Also, the
page title reflects the idea that furigana is offered. Still, to be
accurate, I need to say "furigana" in my description. I may also
need to include a couple more subject key words, but at this point I'm
not sure what they are.
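The point that a search engine only knows multilingual equivalents if someone sets them beforehand can be illustrated with a small sketch. Everything here is an assumption made for illustration: the EQUIVALENTS table, the function names, and the simple substring matching are hypothetical and do not describe how any particular digital library's search works.

```python
# Hypothetical equivalents table that a metadata creator would maintain by hand.
EQUIVALENTS = {
    "nihongo": ["japanese"],
    "furigana": ["kanji readings"],
}


def expand_query(query, equivalents=EQUIVALENTS):
    """Return the query plus any equivalents registered for it (lowercased)."""
    q = query.lower().strip()
    terms = {q}
    for word, alternates in equivalents.items():
        if q == word:
            terms.update(alternates)   # e.g. "furigana" also tries "kanji readings"
        elif q in alternates:
            terms.add(word)            # e.g. "japanese" also tries "nihongo"
    return terms


def matches(record_text, query):
    """True if the record text contains the query or one of its equivalents."""
    text = record_text.lower()
    return any(term in text for term in expand_query(query))


# Last sentence of the Resource #10 description (Appendix 2).
description = ("Users insert a URL into the blank, and a pop-up appears "
               "with the kanji readings written over the kanji.")

print("furigana" in description.lower())   # plain match on the word alone
print(matches(description, "furigana"))    # match once the equivalent is registered
print(matches("Nihongo Web", "Japanese"))  # English query reaching a Japanese title
```

Without the table, a search on "furigana" misses this description entirely; with it, the record is found even though the description never uses the word.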
Part 3: Conclusion
The ability to reassess my initial metadata
allowed me to see several areas where improvements are needed.
While another professional may argue that I was a bit hard on myself
for being a first-time creator, I specifically chose to evaluate my own
data because I felt that I would be able to gain more insight on where
I need to improve as a metadata creator and information
professional. I experienced first-hand the difficulties that
metadata creators face in making records accessible to users.
When I created my first record for Assignment #1, it took me roughly 30
minutes to "fill in the blanks". At first, I thought that I was
being extremely slow due to inexperience; however, now I see that
creating quality metadata requires a time commitment similar to, if not
longer than, what I dedicated (my metadata was "okay", but not "really
good", I think). A metadata creator working in a professional
setting probably wouldn't be able to spend the time that I did, which
is why the retrieval rates are currently low in some cases. If
metadata creation is to become a permanent part of the information
organization world, then a regular cataloger cannot do both jobs
well due to time constraints--both "print materials" catalogers
and "virtual materials" catalogers are needed, and both positions
must undergo training specific to their field. Since the concept
of metadata creation is still relatively new, making the distinction in
terms of both job description and required education would be beneficial to
everyone--after all, catalogers and archivists study different subjects
within the field. While catalogers and archivists perform
similar tasks, both positions require specialist knowledge.
Therefore, print material cataloging and metadata cataloging should
also be different specialties.
Because of this experience, I am in more of a position to keep the
interests of 2 main groups in mind-- the educators who are using my
data to quickly find information and the researchers who are using my
data to browse a field. Because I've only been a student, I have
always seen the library from a researcher's point of view.
However, now I realize that an educator has different things to
consider when seeking information. And, as an information
professional, I have to keep both in mind. By default, every
information professional has spent time as a researcher (after all, we
were all students once), but not every information professional has
been an educator. I believe that this may also contribute to the
deficiencies in metadata creation--some people perhaps don't realize
the problems that educators with limited time and limited Internet
know-how (the Internet is relatively new, and the younger generation has
become more accustomed to using it as a search tool) face
when looking for specific tools, rather than lengthy reports, on the
Internet.
I think that it will be interesting to see how the information
organization field changes as a result of the "Internet generation"
(even I, as a 23-year-old graduate student, can only vaguely remember
the time before the Internet came to dominate society) joining the ranks of
information professionals. We bring a "just hit the
backspace/back button if it isn't right" attitude with us, but we are
coming into a profession whose theories are still embedded in the "you
have to destroy the card/use correction ribbon if you mess up"
mentality. In many ways, refitting the "old school" information
organization world with technology will require a lot of revision to
both if technology and the "old school" world are to coexist.
Appendix 1
| Resource Title (quoted verbatim from my metadata) | Resource URL |
| Jeffrey's Japanese-English Dictionary--Gateway | http://rut.org/cgi-bin/j-e/dict |
| Kanji alive | http://kanjialive.lib.uchicago.edu/main.php?overview.htm |
| Kanji-Step | http://www.kanjistep.com/index.html |
| The Kanji SITE-- A Guide For Students of Japanese Kanji | http://www.kanjisite.com/index.html |
| Charles Kelly's Online Japanese Language Study Materials | http://www.manythings.org/japanese |
| Kanji a Day | http://www.kanji-a-day.com |
| Nihongo Web | http://www.nihongoweb.com |
| Japanese Language School--MLC Meguro Language Center (Tokyo) | http://www.mlcjapanese.co.jp/Download |
| CMJ Grammar Online | http://mercury.ecis.nagoya-u.ac.jp/WebCMJ/ |
| Kanji Furigana For Japanese Learners | http://sp.cis.iwate-u.ac.jp/sp/lesson/j/doc/furigana.html |
Appendix 2
Here is my metadata from Assignment 1.
Resource #1
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Jeffrey's Japanese-English Dictionary--Gateway |
| DC.Identifier | | http://rut.org/cgi-bin/j-e/dict |
| DC.Description | | A gateway providing access to the EDICT and KANJIDIC dictionary databases. Users may also search a number of specialized dictionaries, as well as choose from 8 different types of Japanese-supporting browsers. Several features available for customization by the user. |
| DC.Subject | | Japanese kanji |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Available 1994-08-06 |
| DC.Creator | | Jeffrey Friedl |
| DC.Contributor | | Dr. Jim Breen |
| DC.Type | | Text.Dictionary |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Relation | | References Japanese--English Dictionary Server |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #2
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Kanji alive |
| DC.Identifier | | http://kanjialive.lib.uchicago.edu/main.php?overview.htm |
| DC.Description | | "Kanji alive is a searchable, web-based tool to help beginning and intermediate level students read and write Japanese kanji. It is freely available, cross-platform, and does not require any Japanese fonts." Users can also view kanji stroke order animation, listen to pronounciation, and look up further information about kanji. |
| DC.Subject | | Japanese kanji |
| DC.Subject | | Japanese dictionary |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Created 2002 |
| DC.Creator.CorporateName | | University of Chicago |
| DC.Publisher | | University of Chicago |
| DC.Rights | | Accessible freely. http://kanjialive.lib.uchicago.edu/main.php?credits.htm |
| DC.Type | | Text.Dictionary |
| DC.Type | | Image.Graphic |
| DC.Type | | Interactive.Multimedia |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #3
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Kanji-Step |
| DC.Identifier | | http://www.kanjistep.com/index.html |
| DC.Description | | Offers an introduction to not only Japanese kanji but all other aspects of the Japanese language, including reading, grammar, and writing. Materials are broken into 4 categories based upon the JLPT (Japanese Language Placement Test). Sound and illustration files are included in the hiragana and katakana sections. |
| DC.Subject | | JLPT--Japanese Language Placement Test |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Created 1999 |
| DC.Creator | | Japanese Language Resource Center (JLRC) |
| DC.Publisher | | Japanese Language Resource Center (JLRC) |
| DC.Rights | | Accessible freely. |
| DC.Type | | Interactive |
| DC.Type | | Image.Graphic |
| DC.Type | | Sound.Speech |
| DC.Type | | Text.Form |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #4
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | The Kanji SITE-- A Guide For Students of Japanese Kanji |
| DC.Identifier | | http://www.kanjisite.com/index.html |
| DC.Description | | Offers hiragana, katakana and kanji recognition practice. Kanji are classified according to JLPT Levels 4, 3, and 2. Users can view lists of Level kanji in vertical/horizontal format, practice reading, or complete random testing. |
| DC.Subject | | Japanese kanji |
| DC.Subject | | JLPT--Japanese Language Placement Test |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Created 1999-09 |
| DC.Creator | | Chris Jennings |
| DC.Type | | Interactive |
| DC.Type | | Image.Graphic |
| DC.Type | | Text.Form |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #5
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Charles Kelly's Online Japanese Language Study Materials |
| DC.Identifier | | http://www.manythings.org/japanese |
| DC.Description | | A page offering links to various aspects of the Japanese language, including kana and kanji quizzes, reading practice, JLPT vocabulary lists, and Japanese--English vocabulary quizzes. |
| DC.Subject | | Japanese language quizzes |
| DC.Subject | | JLPT--Japanese Language Placement Test |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Created 1999 |
| DC.Creator | | Charles Kelly |
| DC.Rights | | Accessible freely |
| DC.Type | | Interactive |
| DC.Type | | Image.Graphic |
| DC.Type | | Image.Photograph |
| DC.Type | | Text |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Relation | | IspartOf http://www.manythings.org |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #6
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Kanji a Day |
| DC.Title.Alternative | | kanji-a-day.com preparation for the jlpt |
| DC.Identifier | | http://www.kanji-a-day.com |
| DC.Description | | A customizable site requiring users to register. Users can create vocab/kanji study lists, study randomized vocab/kanji, access kanji and vocab dictionaries as well as a discussion forum, and participate in virtual chat with other users. |
| DC.Subject | | Japanese language quizzes |
| DC.Subject | | JLPT--Japanese Language Placement Test |
| DC.Subject | | Japanese kanji |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Modified 2004-10-06 |
| DC.Creator | | Rob |
| DC.Creator.Address | | rob@kanji-a-day.com |
| DC.Rights | | Accessible freely |
| DC.Type | | Text.Form |
| DC.Type | | Image.Graphic |
| DC.Type | | Interactive.Chat |
| DC.Type | | Text.Dictionary |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #7
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Nihongo Web |
| DC.Creator | | Yasuhiro Omoto |
| DC.Identifier | | http://www.nihongoweb.com |
| DC.Description | | Offers links to Japan pictures, teaching materials, Hiragana/Katakana stroke order, Japanese computing, and vocabulary. Also lists some Japanese proverbs. |
| DC.Subject | | Japanese language quizzes |
| DC.Subject | | JLPT--Japanese Language Placement Test |
| DC.Subject | | Japanese kanji |
| DC.Subject | | Japanese kanji sound files |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Modified 2003-12-02 |
| DC.Contributor | | Shindo, Naoshi (Material contributor, director) |
| DC.Contributor | | Ishida, Mayumi (Material contributor and designer) |
| DC.Contributor | | Uchida, Yoshiko (Material contributor) |
| DC.Contributor | | Schneider, Keiko (Material/Information Contributor) |
| DC.Contributor | | Uehara, Satoshi (Nihongo Web: Japan Coordinator) |
| DC.Rights | | Accessible freely |
| DC.Type | | Text |
| DC.Type | | Image.Graphic |
| DC.Type | | Sound.Speech |
| DC.Type | | Image.Moving.Animation |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #8
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Japanese Language School--MLC Meguro Language Center (Tokyo) |
| DC.Identifier | | http://www.mlcjapanese.co.jp/Download |
| DC.Description | | Provided by the MLC Language School as a free online resource for Japanese studies. Offers extensive information on the JLPT, as well as beginning-level grammar. |
| DC.Subject | | Japanese language quizzes |
| DC.Subject | | Japanese grammar worksheets |
| DC.Subject | | Japanese Language Placement Test (JLPT) |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Modified 2004-07 |
| DC.Creator.CorporateName | | MLC Meguro Language Center |
| DC.Publisher | | MLC Meguro Language School (Tokyo) |
| DC.Type | | Text |
| DC.Type | | Image.Graphic |
| DC.Type | | Sound.Speech |
| DC.Type | | Image.Moving.Animation |
| DC.Format | IMT | text/html |
| DC.Format | IMT | application/pdf |
| DC.Language | ISO639-1 | en |
| DC.Relation | URL | IsPartOf http://www.mlcjapanese.co.jp |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #9
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | CMJ Grammar Online |
| DC.Identifier | | http://mercury.ecis.nagoya-u.ac.jp/WebCMJ/ |
| DC.Description | | Created by Nagoya University to assist students in studying elementary-level Japanese grammar online. Features interactive grammar quizzes. |
| DC.Subject | | Japanese language quizzes |
| DC.Subject | | Japanese grammar information |
| DC.Subject | | Japanese alphabet system |
| DC.Subject | | Japanese kanji |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Modified 2004-09-23 |
| DC.Creator.CorporateName | | Nagoya University |
| DC.Publisher | | Nagoya University |
| DC.Type | | Text |
| DC.Type | | Image.Graphic |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Relation | URL | IsPartOf http://www.ecis.nagoya-u.ac.jp/default-e.htm |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Resource #10
Metadata Creator: Jessica Browning
Metadata Date Created: 2004-10-06
| Dublin Core attribute | Scheme (if any) | Value |
| DC.Title | | Kanji Furigana For Japanese Learners |
| DC.Identifier | | http://mercury.ecis.nagoya-u.ac.jp/WebCMJ/ |
| DC.Description | | Created by Iwate University as an English link to http://kids.goo.ne.jp, a site created to provide kanji readings for Japanese-based web pages. Users insert a URL into the blank, and a pop-up appears with the kanji readings written over the kanji. |
| DC.Subject | | Japanese Kanji furigana |
| DC.Subject | LCSH | Japanese language |
| DC.Date | ISO8601 | Created 2001-07-27 |
| DC.Creator.CorporateName | | Iwate University |
| DC.Publisher | | Iwate University |
| DC.Type | | Text |
| DC.Format | IMT | text/html |
| DC.Language | ISO639-1 | en |
| DC.Relation | | IsPartOf http://sp.cis.iwate-u.ac.jp/sp/lesson/j/doc/furigana.html |
| DC.Relation | | Replaces http://kids.goo.ne.jp (English version) |
| DC.Date.X-MetadataLastModified | ISO8601 | 2004-10-14 |
Bibliography
Barton, Jane, Currier, Sarah, & Hey, Jessie M.N. (2003). Building Quality Assurance into Metadata Creation: an Analysis based on the Learning Objects and e-Prints Communities of Practice. http://www.siderean.com/dc2003/201_paper60.pdf (last accessed 1 December 2004).
Chan, Lois M. & Zeng, Marcia L. (2002). Ensuring Interoperability among Subject Vocabularies and Knowledge Organization Schemes: A Methodological Analysis. 68th IFLA Council and General Conference, August 18-24, 2002. http://www.ifla.org/IV/ifla68/papers/008-122e.pdf (last accessed 2 December 2004).
Coleman, Anita S. (2004). Guide to Selecting and Cataloging Quality WWW Resources for the Small Library. Fairfield, CA: Learning Resources Association of the California Community Colleges.
Deniman, D., Sumner, T., Davis, L., Bhushan, S., & Fox, J. (2003). Merging Metadata and Content-Based Retrieval. Journal of Digital Information, 4(3). http://jodi.ecs.soton.ac.uk/Articles/v04/i03/Deniman/ (last accessed 2 December 2004).
DLESE Quality WG 3: Metadata Structures (2004). A Framework for the Future of Educational Digital Libraries: Metadata and Vocabularies for Learning. http://swiki.dlese.org/quality/uploads/4/QualityWG3.pdf (last accessed 2 December 2004).
Moen, William E., Stewart, Erin L., & McClure, Charles R. (1997). The Role of Content Analysis in Evaluating Metadata for the U.S. Government Information Locator Service (GILS): Results from an Exploratory Study. http://www.unt.edu/wmoen/publications/GILSMDContentAnalysis.htm (last accessed 2 December 2004).
Sokvitne, Lloyd (1999). An Evaluation of the Effectiveness of Current Dublin Core Metadata for Retrieval. http://www.vala.org.au/vala2000/2000pdf/Sokvitne.PDF (last accessed 2 December 2004).
Taylor, Arlene G. (2004). The Organization of Information. Westport, CT: Libraries Unlimited.