Enterprise User Data Catalog

The Enterprise User Data Catalog (EUDC) is a user-facing, centralized metadata repository that provides the data dictionary, data descriptions and glossary, and other metadata that a data analyst needs.

Unlike the Enterprise Data Mesh (EDM) Hive Metastore (the compute catalog), the EUDC enables data consumers to find pertinent information by searching the metadata that Contributors provide, such as the names of columns and fields within tables.
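
As a rough illustration of searching by column and field names, the sketch below matches a search term against the column names recorded for each table. It is a minimal sketch and not the EUDC implementation: the function name `find_tables`, the dictionary layout, and the table and column names are all hypothetical.

```python
from __future__ import annotations


def find_tables(catalog: dict[str, list[str]], term: str) -> list[str]:
    """Return the sorted names of tables that have a column containing `term`.

    `catalog` is a hypothetical mapping of table name -> list of column names.
    Matching is a case-insensitive substring test.
    """
    needle = term.lower()
    return sorted(
        table
        for table, columns in catalog.items()
        if any(needle in column.lower() for column in columns)
    )


# Hypothetical contributor-provided metadata, for illustration only.
example_catalog = {
    "customers": ["customer_id", "signup_date"],
    "orders": ["order_id", "customer_id", "order_total"],
    "warehouses": ["warehouse_id", "region"],
}

print(find_tables(example_catalog, "customer"))  # ['customers', 'orders']
```

A real catalog would typically search descriptions and glossary terms as well; this sketch covers only column names.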

Data Catalogs

The Enterprise Data Mesh differentiates between the internal, technical metadata store (compute catalog) and the external-facing consumer/contributor metadata store and user interface (user data catalog).

The compute catalog, implemented using Apache Hive Metastore, stores the structure information for the databases, tables, and partitions in the mesh, including column and column type information.

The Enterprise Data Mesh user data catalog, the Enterprise User Data Catalog (EUDC), stores profile information about the data assets and datasets in the EDM, supports data asset discovery and access, and makes structure metadata accessible to data consumers.

For more detail on the data catalog, including standards, assessments, briefings, and the concept of operations (ConOps), refer to the data catalog guidance.

User Data Catalog Capabilities

A user data catalog should provide the following capabilities: data search, data discovery, data curation, and management of privacy and security concerns. To that end, it must be able to document a data asset’s object metadata (its name, owner, accessibility, format, defining keywords, provenance, and location) as well as its structural metadata (its data attributes, data types, descriptions, and so on). The figure below depicts a conceptual data catalog architecture; alternative tools may be used to achieve similar capabilities.

This chart depicts the user data catalog operating concept: the data contributors, the data lake core, and the consumers, along with the generic services in each.
User Data Catalog Operating Concept
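
The object and structural metadata described in this section could be modeled as a simple record, as in the sketch below. It is one possible shape under stated assumptions, not the EUDC schema: the class names `CatalogEntry` and `Attribute`, the field names, the helper `undocumented_attributes`, and the sample values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Structural metadata for one data attribute (column or field)."""
    name: str
    data_type: str
    description: str = ""


@dataclass
class CatalogEntry:
    """Object metadata for one data asset, plus its structural metadata."""
    name: str
    owner: str
    accessibility: str
    data_format: str
    keywords: list[str]
    provenance: str
    location: str
    attributes: list[Attribute] = field(default_factory=list)


def undocumented_attributes(entry: CatalogEntry) -> list[str]:
    """Return the names of attributes that still lack a description."""
    return [a.name for a in entry.attributes if not a.description.strip()]


# Hypothetical entry, for illustration only.
orders = CatalogEntry(
    name="orders",
    owner="sales-data-team",
    accessibility="restricted",
    data_format="parquet",
    keywords=["sales", "orders"],
    provenance="order management system export",
    location="/example/path/orders",
    attributes=[
        Attribute("order_id", "string", "Unique identifier of the order"),
        Attribute("order_ts", "timestamp"),
    ],
)

print(undocumented_attributes(orders))  # ['order_ts']
```

A check like `undocumented_attributes` is one simple way to support curation, since it shows which attributes a contributor still needs to describe.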