Metadata makes it possible.
Metadata plays a central role in Podium, going far beyond its traditional role of documenting “data about data” through simple cataloging or classification. In Podium, the metadata catalog drives the enterprise data management process and enables many of Podium’s key capabilities. Metadata is used to structure, document, secure, and manage the data collection, ensuring it is a well-governed marketplace (rather than a swamp). IT, data stewards, and data consumers access and use metadata through the Podium GUI to find, understand, and work with data in the marketplace. Data operations teams, in turn, use Podium’s metadata to automate the integration of Podium workflows with other enterprise schedulers, applications, or repositories at scale.
Podium Collects Metadata Throughout the Data Management Process
Podium continuously adds new metadata to its metadata repository, recording each action or insight connected to a data element as it occurs. Starting with the source data itself, metadata is collected to document the expected source schema. Next, Podium validates every record of the full data set against the expected format. The metadata catalog is then enriched with profile statistics generated for each field. This continuous process lays the foundation for users to discover secure, accurate, and well-understood data across the enterprise and to generate insights from it.
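To make the ingest steps concrete, here is a minimal sketch of validating records against an expected schema and then profiling a field. It is not Podium's implementation: the schema format, `validate_record`, `profile_field`, and the sample records are all hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

# Hypothetical expected schema: field name -> Python type (not Podium's format).
EXPECTED_SCHEMA = {"id": int, "email": str, "age": int}


def validate_record(record, schema):
    """Return a list of problems found in one record; an empty list means valid."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif record[field] is not None and not isinstance(record[field], expected_type):
            problems.append(f"bad type for {field}: {type(record[field]).__name__}")
    return problems


def profile_field(records, field):
    """Compute simple profile statistics for one field across all records."""
    values = [r.get(field) for r in records]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1)[0][0] if non_null else None,
    }


# Illustrative sample data; the third record has a string where an int is expected.
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "b@example.com", "age": None},
    {"id": "3", "email": "a@example.com", "age": 51},
]

good = [r for r in records if not validate_record(r, EXPECTED_SCHEMA)]
bad = [r for r in records if validate_record(r, EXPECTED_SCHEMA)]
email_profile = profile_field(records, "email")
```

In this sketch the validation results and the per-field statistics would both be stored as metadata alongside the data set, so later users can judge its quality without rescanning the data.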
As data stewards, analysts, and business users work with the data, new metadata is created and added to the metadata catalog. Tags and comments, generated both manually and automatically, help characterize the quality, completeness, focus area, or usability of each data set. Newly generated copies of data sets are linked to their ancestors and descendants, and this detailed lineage information is captured as new metadata. Security and user access privilege details are also captured and maintained as metadata in the catalog.
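The lineage idea can be sketched as a small directed graph in which each derived data set points back to the data sets it was built from. This is an illustrative model only; the `LineageCatalog` class, its method names, and the data set names are assumptions, not part of Podium.

```python
from collections import defaultdict


class LineageCatalog:
    """Toy lineage store: records which data sets were derived from which."""

    def __init__(self):
        self._parents = defaultdict(set)   # child -> direct parents
        self._children = defaultdict(set)  # parent -> direct children

    def record_derivation(self, parent, child):
        """Register that `child` was generated from `parent`."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def _walk(self, start, edges):
        """Collect every node reachable from `start` by following `edges`."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def ancestors(self, name):
        return self._walk(name, self._parents)

    def descendants(self, name):
        return self._walk(name, self._children)


# Hypothetical data set names, for illustration.
catalog = LineageCatalog()
catalog.record_derivation("raw_orders", "clean_orders")
catalog.record_derivation("clean_orders", "orders_by_region")

upstream = catalog.ancestors("orders_by_region")
downstream = catalog.descendants("raw_orders")
```

Storing both directions of each link keeps ancestor and descendant lookups equally cheap, which matters when a steward needs to trace the impact of a change to a source data set.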
Podium’s Metadata Is Automatic and Open
Metadata in the Podium Metadata Repository can be easily shared with other enterprise metadata management platforms. The data model for the Podium Metadata Repository is documented, and the metadata in the repository can be accessed via Podium’s open RESTful APIs, metadata import/export function, Publish, and Reports/Dashboard.
Podium Crowdsources Business Metadata
In addition to capturing technical and operational metadata across the data marketplace process, Podium also collects a rich layer of business metadata. Podium users can crowdsource and share insights about different data entities by populating business definitions and names, blogs, and tags, as well as standard and custom properties. The more data stewards, SMEs, and data analysts work with the data, the better documented it becomes, replacing error-prone hand-offs between the business and IT with a self-service, collaborative process for handling shared data.
Intelligent Data Detection
Podium uses metadata to make data sets more valuable by identifying potentially important characteristics in the data and then helping users take action on those insights automatically at scale. For example, by combining detailed data profiling metadata describing each data set with a configurable pattern matching rules engine, Podium can identify, flag, and optionally take action on potentially sensitive data, duplicate data, corrupt data, or data requiring additional examination or governance. Podium’s intelligent data detection capability is easily configured to match the unique characteristics of different data sets or organizations.
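As a rough illustration of combining profiling output with pattern matching rules, the sketch below flags fields whose sampled values mostly match a configured pattern. The rule names, regular expressions, threshold, and `detect_tags` function are hypothetical and deliberately simplistic; they do not reflect Podium's actual rules engine or its configuration format.

```python
import re

# Hypothetical rule set: tag name -> pattern a sensitive value would fully match.
RULES = {
    "possible_email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "possible_us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def detect_tags(sample_values, rules=RULES, threshold=0.75):
    """Return sorted tags whose pattern fully matches at least `threshold`
    of the non-empty sampled values for a field."""
    values = [v for v in sample_values if v]
    if not values:
        return []
    tags = []
    for tag, pattern in rules.items():
        hits = sum(1 for v in values if pattern.fullmatch(v))
        if hits / len(values) >= threshold:
            tags.append(tag)
    return sorted(tags)


# Illustrative per-field value samples, as a profiling step might produce.
field_samples = {
    "contact": ["a@example.com", "b@example.com", "n/a", "c@example.org", "d@example.net"],
    "tax_id": ["123-45-6789", "987-65-4321"],
    "notes": ["call back Monday", "prefers email"],
}

flagged = {field: detect_tags(values) for field, values in field_samples.items()}
```

Using a match-rate threshold rather than requiring every value to match tolerates a few placeholder or dirty values in an otherwise sensitive field; the right threshold would depend on the data set and the organization's governance policy.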