Advanced Metadata Management
Speaking Notes from a Tech Talks Presentation by Daniel Alemneh, December 6, 2006
- Introduction
- Metadata quality
- Factors influencing metadata quality
- Managing metadata quality
- UNTL metadata quality assurance mechanisms
- References
Introduction
Digital life cycle management starts at the point an item is created or selected for digitization (if not born-digital), continues through image cleanup, metadata capture, and derivative creation, and extends to ensuring long-term access. Digital library data quality has two aspects: the quality of the data in the objects themselves and the quality of the metadata associated with those objects. Metadata errors take many forms, but in whatever form they occur, they affect discovery, access, use, and long-term management of the resources.
Metadata quality
Metadata quality in terms of:
Error free
- Typographical Errors:
- Character transposition, e.g. 9198 for 1998
- Letter omission, e.g. Omt for omit
- Letter insertion, e.g. asnd for and
- Letter substitution or misstrokes, e.g. likw for like
- Adding/selecting wrong information in the wrong field/subfield
No omissions
- Null
- Incomplete information
Unambiguous
- Inconsistency, e.g. multiple spellings, multiple possible meanings, mixed cases, initials, etc.
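The error categories above can be illustrated with a short sketch. It is not taken from the talk: the function names `classify_typo` and `group_variants` are hypothetical, and the normalization rule (case-folding, dropping punctuation, collapsing spaces) is an assumption. It would catch mixed case and initials written with or without punctuation, but not genuinely different spellings.

```python
import re
from collections import defaultdict


def classify_typo(observed: str, intended: str) -> str:
    """Name the single-edit error that turns `intended` into `observed`."""
    if observed == intended:
        return "none"
    # Find the first position where the two strings differ.
    i = 0
    while i < min(len(observed), len(intended)) and observed[i] == intended[i]:
        i += 1
    if len(observed) == len(intended):
        # Two adjacent characters swapped, e.g. "9198" for "1998".
        if (i + 1 < len(observed)
                and observed[i] == intended[i + 1]
                and observed[i + 1] == intended[i]
                and observed[i + 2:] == intended[i + 2:]):
            return "transposition"
        # One character replaced, e.g. "likw" for "like".
        if observed[i + 1:] == intended[i + 1:]:
            return "substitution"
    elif len(observed) + 1 == len(intended):
        # One character dropped, e.g. "omt" for "omit".
        if observed[i:] == intended[i + 1:]:
            return "omission"
    elif len(observed) == len(intended) + 1:
        # One extra character typed, e.g. "asnd" for "and".
        if observed[i + 1:] == intended[i:]:
            return "insertion"
    return "other"


def group_variants(values):
    """Group values that differ only in case, punctuation, or spacing."""
    groups = defaultdict(set)
    for value in values:
        # Assumed normalization: casefold, drop punctuation, collapse spaces.
        key = re.sub(r"[^\w\s]", "", value.casefold())
        key = " ".join(key.split())
        groups[key].add(value)
    # Only keys with more than one surface form indicate inconsistency.
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}


for observed, intended in [("9198", "1998"), ("omt", "omit"),
                           ("asnd", "and"), ("likw", "like")]:
    print(observed, intended, classify_typo(observed, intended))
```

Classifying against a known intended value is only possible after a correction has been identified, so a function like this is better suited to reporting which error types are most common than to finding errors in the first place.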
Factors influencing metadata quality
Metadata quality characteristics depend on various factors, including:
Local requirements
- What type of objects will the repository contain? [Heterogeneity]
- How will they be described? And used? And by whom? [Granularity]
- What functionality is required locally?
- How will it be interfaced?
- What entry points will be used?
Collaborators' requirements
- What is required for interoperability? (Structure, semantics, and syntax.)
- Are requirements formal or informal?
- Will metadata be meaningful within aggregations of various kinds?
- Will access restrictions be imposed?
Cost
- Are resources sufficient to produce the required metadata quality? (very unlikely)
- If not, what are the priorities? (Cost-benefit analysis, CBA)
Managing metadata quality
The metadata quality assurance cycle:
Determine level of quality required
- Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements as well.
Determine nature of gap and how to close it
- Effectiveness, efficiency, practicability, scalability
Compromise
- One size does not fit all!
Prioritize
- Resources are unlikely to be available to meet all requirements
Test the workflow and the quality cycle
- Retest
UNTL metadata quality assurance mechanisms
- Metadata Template Creator
- Sample workflow
- Metadata quality assurance
- Metadata analysis tool demonstration
- NULL Values
- No Null values allowed for Mandatory elements (Title, Subject, Description, Language, Coverage, Resource Type, Format)
- List All Values
- Shows every value in the selected field and the number of times it occurs in the metadata.
- All descriptive elements can be visually inspected for errors with the aid of various tools, including:
- Highlighter (On/Off)
- Qualifiers (Use/Ignore)
- Sub-elements (All Qualifiers/Each Qualifier)
- List Values by Institution
- List by institution with the total number of records from each institution.
- List Values by Collection
- List by collection with the total number of records from each collection.
- Authorities Values
- Maintains details about every controlled list to ensure key uniqueness.
- Other Graphical Reports:
- Records Added Over Time
- Records Added Per Month
- Files Added Over Time
- Clickable Map of Texas:
- By Collection
- By Institution
- Word Clouds
- By Collection
- By Institution
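The checks listed above could be approximated as follows. This is a minimal sketch, not the UNT analysis tool itself. The record layout (a dict mapping element names to lists of values, plus hypothetical `institution` and `added` keys), the sample data, and all function names are assumptions made for illustration. Only the list of mandatory elements comes from the notes.

```python
from collections import Counter

# Mandatory elements as listed in the notes.
MANDATORY = ["Title", "Subject", "Description", "Language",
             "Coverage", "Resource Type", "Format"]


def null_elements(record):
    """Return mandatory elements that are missing or hold only blank values."""
    return [name for name in MANDATORY
            if not any(v and v.strip() for v in record.get(name, []))]


def list_all_values(records, element):
    """Count how often each value of `element` occurs across all records."""
    return Counter(v for r in records for v in r.get(element, []))


def records_by(records, key):
    """Total number of records per institution (or per collection)."""
    return Counter(r.get(key, "(unknown)") for r in records)


def records_added_per_month(records):
    """Count records per 'YYYY-MM', assuming ISO dates in the `added` key."""
    return Counter(r["added"][:7] for r in records if r.get("added"))


# Hypothetical sample records, invented for this example.
records = [
    {"institution": "Library A", "added": "2006-11-02",
     "Title": ["Annual Report"], "Subject": ["Water"], "Language": ["eng"]},
    {"institution": "Library A", "added": "2006-11-15",
     "Title": [""], "Subject": ["water"], "Language": ["eng"]},
    {"institution": "Library B", "added": "2006-12-01",
     "Title": ["Map of Texas"], "Subject": ["Water"], "Language": ["English"]},
]

for record in records:
    print(record["added"], null_elements(record))
print(list_all_values(records, "Language"))
print(records_by(records, "institution"))
print(records_added_per_month(records))
```

Listing every value with its count makes outliers easy to spot: in the sample data, a value that occurs once next to a near-identical value that occurs many times is a likely candidate for review.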
References
- UNT Libraries' Metadata Documentation
- MWI Resources
About the Author: Librarian Daniel Gelaw Alemneh is a metadata management specialist for the UNT Libraries' Digital Projects Unit.

