
Advanced Metadata Management


Speaking Notes from a Tech Talks Presentation by Daniel Alemneh, December 6, 2006

 

 

Introduction

 

Digital life cycle management starts at the point an item is created or selected for digitization (if not born-digital), continues through image cleanup, metadata capture, and derivative creation, and extends to ensuring long-term access. Digital library data quality has two aspects: the quality of the data in the objects themselves, and the quality of the metadata associated with those objects. Metadata errors take a variety of forms, but whatever their form, they affect discovery, access, use, and long-term management of the resources.

 

Metadata quality

 

Metadata quality can be considered in terms of three criteria:

 

Error free

  • Typographical errors:
    • Character transposition, e.g. 9198 for 1998
    • Character omission, e.g. omt for omit
    • Character insertion, e.g. asnd for and
    • Character substitution or misstrokes, e.g. likw for like

  • Entering or selecting wrong information, or placing information in the wrong field/subfield
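
The four typographical error types above can be told apart mechanically when both the typed value and the intended value are known. The following is a minimal sketch, not a tool described in the talk: the function name `classify_typo` and its return labels are illustrative assumptions, and it only recognizes single-keystroke errors.

```python
def classify_typo(typed: str, intended: str) -> str:
    """Classify a single-keystroke error between a typed and an intended string.

    Hypothetical helper for illustration; returns one of
    "none", "transposition", "omission", "insertion", "substitution", "other".
    """
    if typed == intended:
        return "none"

    if len(typed) == len(intended):
        # Positions where the two strings disagree.
        diffs = [i for i, (a, b) in enumerate(zip(typed, intended)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and typed[diffs[0]] == intended[diffs[1]]
            and typed[diffs[1]] == intended[diffs[0]]
        ):
            # Two adjacent characters were swapped.
            return "transposition"
        return "other"

    if len(typed) + 1 == len(intended):
        # Dropping one character from the intended string yields the typed one.
        if any(intended[:i] + intended[i + 1:] == typed for i in range(len(intended))):
            return "omission"

    if len(typed) == len(intended) + 1:
        # Dropping one character from the typed string yields the intended one.
        if any(typed[:i] + typed[i + 1:] == intended for i in range(len(typed))):
            return "insertion"

    return "other"


# The examples from the list above.
for typed, intended in [("9198", "1998"), ("omt", "omit"), ("asnd", "and"), ("likw", "like")]:
    print(typed, "->", intended, ":", classify_typo(typed, intended))
```

In practice the intended value is usually unknown, so a check like this is most useful against a controlled vocabulary or a list of previously accepted values.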

No omissions

  • Null
  • Incomplete information

Unambiguous

  • Inconsistency, e.g. multiple spellings, multiple possible meanings, mixed cases, initials, etc.
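
Some of these inconsistencies, such as mixed case and stray punctuation, can be surfaced by grouping values under a normalized key. The sketch below is an illustration under stated assumptions: the function names and the normalization rule (casefold, strip punctuation, collapse whitespace) are hypothetical choices, and the sample place names are invented for the example. It will not catch genuinely different spellings, initials, or multiple meanings.

```python
import re
from collections import defaultdict


def normalize(value: str) -> str:
    """Reduce a value to a comparison key: no punctuation, one case, single spaces."""
    no_punct = re.sub(r"[^\w\s]", "", value)
    return " ".join(no_punct.casefold().split())


def find_inconsistent_values(values):
    """Return {normalized key: sorted variants} for keys entered in more than one form."""
    groups = defaultdict(set)
    for value in values:
        groups[normalize(value)].add(value)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Invented sample values for illustration only.
sample = ["Denton, Texas", "denton texas", "DENTON, TEXAS", "Dallas"]
print(find_inconsistent_values(sample))
```

Each reported group is a candidate for review rather than a confirmed error; a cataloger still decides which variant is the preferred form.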

 

Factors influencing metadata quality


Metadata quality characteristics depend on various factors, including:

Local requirements 

  • What type of objects will the repository contain? [Heterogeneity]
  • How will they be described and used, and by whom? [Granularity]
  • What functionality is required locally?
  • How will it be interfaced?
  • What entry points will be used?

Collaborators' requirements

  • What is required for interoperability? (Structural, semantic, and syntactic.)
  • Are requirements formal or informal?
  • Will metadata be meaningful within aggregations of various kinds?
  • Will access restrictions be imposed?

Cost

  • Are resources sufficient to produce the required metadata quality? (Very unlikely.)
  • If not, what are the priorities? (Cost-benefit analysis, CBA)

Managing metadata quality

 

The metadata quality assurance cycle:

Determine level of quality required

  • Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements as well.

 

Determine nature of gap and how to close it

  • Effectiveness, efficiency, practicability, scalability

 

Compromise

  • One size does not fit all!

 

Prioritize

  • Resources are unlikely to be available to meet all requirements

 

Test the workflow and the quality cycle

  • Retest

[Figure: metadata quality]


UNTL metadata quality assurance mechanisms

 

  • Metadata Template Creator

    • Sample workflow

  • Metadata analysis tool demonstration

    • NULL Values
      • No null values are allowed for mandatory elements (Title, Subject, Description, Language, Coverage, Resource Type, Format)
    • List All Values
      • Shows every value in the selected field and the number of times it appears in the metadata.
      • All descriptive elements can be visually inspected for errors with the aid of various tools, including:
        • Highlighter (On/Off)
        • Qualifiers (Use/Ignore)
        • Sub-elements (All Qualifiers/Each Qualifier)
    • List Values by Institution
      • Lists values by institution, with the total number of records from each institution.
    • List Values by Collection
      • Lists values by collection, with the total number of records from each collection.
    • Authorities Values
      • Maintains the details of every controlled list to ensure key uniqueness.
    • Other Graphical Reports:
      • Records Added Over Time
      • Records Added Per Month
      • Files Added Over Time
      • Clickable Map of Texas:
        • By Collection
        • By Institution
      • Word Clouds
        • By Collection
        • By Institution

  • UNTL Metadata Quality Cycle
    [Figure: UNT quality cycle]
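
The NULL Values and List All Values reports described above can be approximated in a few lines. This is a rough sketch and not the UNTL tool itself: it assumes each record is a plain dictionary mapping an element name to a list of string values, and the function names `missing_mandatory` and `list_all_values` are hypothetical. Only the list of mandatory elements is taken from the notes.

```python
from collections import Counter

# Mandatory elements as listed in the notes above.
MANDATORY_ELEMENTS = (
    "Title", "Subject", "Description", "Language",
    "Coverage", "Resource Type", "Format",
)


def missing_mandatory(record):
    """Return the mandatory elements that are absent or hold only empty values."""
    missing = []
    for element in MANDATORY_ELEMENTS:
        values = record.get(element, [])
        # A value counts only if it is non-None and not just whitespace.
        if not any(value and value.strip() for value in values):
            missing.append(element)
    return missing


def list_all_values(records, element):
    """Return (value, count) pairs for one element, most frequent first."""
    counts = Counter()
    for record in records:
        counts.update(record.get(element, []))
    return counts.most_common()


# Invented sample records for illustration only.
records = [
    {
        "Title": ["County map"], "Subject": ["Maps"], "Description": ["A map."],
        "Language": ["eng"], "Coverage": ["Texas"], "Resource Type": ["image"],
        "Format": ["image"],
    },
    {"Title": ["Letter"], "Subject": [""], "Language": ["eng"]},
]

for number, record in enumerate(records, start=1):
    print(number, missing_mandatory(record))
print(list_all_values(records, "Language"))
```

Sorting by frequency puts the common values first and leaves rare ones, which are more likely to be typing errors, at the end of the list for visual review.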



About the Author:  Librarian Daniel Gelaw Alemneh is a metadata management specialist for the UNT Libraries' Digital Projects Unit.

 

This page is maintained by Daniel Alemneh. Last modified Monday, July 21, 2008, 03:45 PM.
