Speaking Notes from a Tech Talks Presentation by Daniel Alemneh, December 6, 2006
- Metadata quality
- Factors influencing metadata quality
- Managing metadata quality
- UNTL metadata quality assurance mechanisms
Digital life cycle management starts at the point an item is created or selected for digitization (if not born-digital), continues through image cleanup, metadata capture, and derivative creation, and extends to ensuring long-term access. Digital library data quality has two aspects: the quality of the data in the objects themselves, and the quality of the metadata associated with those objects. Metadata errors take many forms, but whatever the form, they affect discovery, access, use, and long-term management of the resources.
Common metadata quality problems include:
- Character transposition, e.g. 9198 for 1998
- Letter omission, e.g. omt for omit
- Letter insertion, e.g. asnd for and
- Letter substitution or misstrokes, e.g. likw for like
- Entering or selecting the wrong information, or placing it in the wrong field/subfield
- Incomplete information
- Inconsistency, e.g. multiple spellings, multiple possible meanings, mixed case, initials, etc.
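The four keying errors at the top of this list can be told apart mechanically when the intended value is known. The following is a minimal sketch, not part of the UNT tools; the function name `classify_typo` is hypothetical, and it only recognizes single-edit errors.

```python
def classify_typo(typed: str, intended: str) -> str:
    """Label a single-edit keying error; anything more complex is 'other'."""
    if typed == intended:
        return "none"
    if len(typed) == len(intended):
        # Positions where the two strings disagree.
        diffs = [i for i, (a, b) in enumerate(zip(typed, intended)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and typed[diffs[0]] == intended[diffs[1]]
                and typed[diffs[1]] == intended[diffs[0]]):
            return "transposition"  # two adjacent characters swapped
        return "other"
    if len(typed) + 1 == len(intended):
        # Omission: dropping one character from the intended value gives the typed one.
        if any(intended[:i] + intended[i + 1:] == typed for i in range(len(intended))):
            return "omission"
    if len(typed) == len(intended) + 1:
        # Insertion: dropping one character from the typed value gives the intended one.
        if any(typed[:i] + typed[i + 1:] == intended for i in range(len(typed))):
            return "insertion"
    return "other"
```

For example, `classify_typo("9198", "1998")` is labeled a transposition and `classify_typo("asnd", "and")` an insertion.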
The metadata quality characteristics depend on various factors, including:
- What type of objects will the repository contain? [Heterogeneity]
- How will they be described and used, and by whom? [Granularity]
- What functionality is required locally?
- How will it be interfaced?
- What entry points will be used?
- What is required for interoperability? (Structural, semantic, and syntactic)
- Are requirements formal or informal?
- Will metadata be meaningful within aggregations of various kinds?
- Will access restrictions be imposed?
- Are resources sufficient to produce the required metadata quality? (Very unlikely)
- If not, what are the priorities? (Cost-benefit analysis)
The metadata quality assurance cycle:
- Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements as well.
Determine the nature of the gap and how to close it
- Effectiveness, efficiency, practicability, scalability
- One size does not fit all!
- Resources unlikely to be available to meet all requirements
Metadata Template Creator
- Sample workflow
Metadata quality assurance
Metadata analysis tool demonstration
- No null values are allowed for mandatory elements (Title, Subject, Description, Language, Coverage, Resource Type, Format)
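A check of this kind can be sketched as follows. This is an illustration under assumptions, not the analysis tool's actual code: records are assumed to be plain dictionaries, and the element keys in `MANDATORY` and the function name `missing_mandatory` are hypothetical.

```python
# Assumed dictionary keys for the seven mandatory elements named above.
MANDATORY = ("title", "subject", "description", "language",
             "coverage", "resourceType", "format")

def missing_mandatory(record: dict) -> list:
    """Return the mandatory elements that are absent, empty, or whitespace-only."""
    missing = []
    for name in MANDATORY:
        values = record.get(name) or []
        if isinstance(values, str):
            values = [values]  # allow a single string as well as a list of values
        if not any(str(v).strip() for v in values):
            missing.append(name)
    return missing
```

A record with only a title and a subject would be reported as missing the other five elements.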
List All Values
- Shows every value in the selected field and the number of times it appears in the metadata.
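The idea behind such a listing is a simple frequency count, which makes inconsistent spellings and mixed case stand out. A minimal sketch, assuming each record is a dictionary mapping a field name to a list of values (the function name `list_all_values` is hypothetical):

```python
from collections import Counter

def list_all_values(records, field):
    """Count how often each distinct value of `field` occurs across all records."""
    counts = Counter()
    for record in records:
        for value in record.get(field, []):
            counts[value] += 1
    # Most frequent values first; rare variants end up at the bottom of the list.
    return counts.most_common()
```

In a listing like this, a value such as "Eng" appearing once next to hundreds of "eng" entries is an obvious candidate for review.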
All descriptive elements can be inspected visually for errors with the aid of various tools, including:
- Highlighter (On/Off)
- Qualifiers (Use/Ignore)
- Sub-elements (All Qualifiers/Each Qualifier)
List Values by Institution
- Lists values by institution, with the total number of records from each institution.
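The per-institution totals amount to grouping records by one attribute and counting. A minimal sketch, assuming each record carries its owning institution under a hypothetical "institution" key (the function name `records_per_group` is also hypothetical):

```python
from collections import Counter

def records_per_group(records, key):
    """Count records per value of `key`; records lacking the key are grouped as '(missing)'."""
    return Counter(record.get(key, "(missing)") for record in records)
```

Passing a different key, for instance "collection", gives the same kind of totals for another grouping.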
List Values by Collection
- Lists values by collection, with the total number of records from each collection.
- Maintain the details of every controlled list to ensure key uniqueness.
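Key uniqueness in a controlled list can be verified with a short script. The sketch below assumes a list is a sequence of (key, label) pairs and treats keys that differ only in case or surrounding whitespace as duplicates; both the representation and the function name `duplicate_keys` are assumptions for illustration.

```python
def duplicate_keys(entries):
    """Return {normalized key: [labels]} for every key that occurs more than once."""
    seen = {}
    for key, label in entries:
        norm = key.strip().lower()  # "ENG " and "eng" count as the same key
        seen.setdefault(norm, []).append(label)
    return {key: labels for key, labels in seen.items() if len(labels) > 1}
```

An empty result means every key in the list is unique under this normalization.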
Other Graphical Reports:
- Records Added Over Time
- Records Added Per Month
- Files Added Over Time
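The data behind the monthly and over-time record reports can be derived from the dates on which records were added. A minimal sketch, assuming those dates are available as `datetime.date` objects; the function names are hypothetical and the plotting step is left out.

```python
from collections import Counter
from datetime import date

def records_per_month(added_dates):
    """Return sorted (YYYY-MM, count) pairs: the 'per month' series."""
    return sorted(Counter(d.strftime("%Y-%m") for d in added_dates).items())

def cumulative(monthly):
    """Turn per-month counts into a running total: the 'over time' series."""
    total = 0
    series = []
    for month, count in monthly:
        total += count
        series.append((month, total))
    return series
```

Months with no additions are simply absent from the series; a charting step would need to decide whether to fill them with zeros.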
Clickable Map of Texas:
- By Collection
- By Institution
- By Collection
- By Institution
- NULL Values
- UNT Libraries' Metadata Documentation
- MWI Resources
About the Author: Daniel Gelaw Alemneh is a metadata management specialist for the UNT Libraries' Digital Projects Unit.