Engineered to perform like never before.
Podium is an integrated, end-to-end Hadoop-native platform. The platform uses a set of shared services including data management, metadata, security, and governance to support a lights-out, operational, Data-As-A-Service platform.
The Podium Data Marketplace is built using seven core principles of open data lake architecture:
01 Metadata driven
02 Based on native Hadoop
03 Built on standard Hadoop integrations (No connectors required)
04 Supports a broad range of source formats including legacy data
05 Business friendly (intuitive GUI for non-programmers)
06 RESTful APIs
07 Certified across all distributions
The solution to your source and format problems.
Podium supports a broad range of source formats including relational databases, mainframe data sources and XML files. Even flat files. Non-standard data formats like mainframe files and XML hierarchical records are automatically converted to standardized character sets and formats. Easy and automatic.
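As a rough illustration of what character-set standardization involves (not Podium's actual implementation), the Python sketch below decodes a mainframe-style EBCDIC record and re-encodes it as UTF-8. The code page `cp037` and the helper name `standardize_record` are assumptions made for this example.

```python
def standardize_record(raw: bytes, source_codec: str = "cp037") -> bytes:
    """Decode a record from an EBCDIC code page and re-encode it as UTF-8.

    cp037 (EBCDIC US/Canada) is an assumed source encoding; a real
    mainframe feed may use a different code page.
    """
    text = raw.decode(source_codec)
    # Fixed-width mainframe fields are usually space padded, so trim the tail.
    return text.rstrip().encode("utf-8")


# Build a sample EBCDIC record, then convert it to a standardized encoding.
ebcdic_record = "HELLO   ".encode("cp037")
utf8_record = standardize_record(ebcdic_record)
```

The same pattern applies to other source encodings by passing a different `source_codec`.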
Greater power and flexibility for users.
With Podium, enterprises build a persistent data repository: a collection of data that serves as the foundation for your Data Marketplace. Podium’s approach contrasts directly with extract, transform and load (ETL) tools, which move data from sources to a target without creating a repository of historical data for analysis. This persistent data repository is key to Podium’s ability to provide truly self-service, on-demand access to data in the marketplace. By maintaining data in the marketplace at each stage of maturity, Podium gives users greater power over data selection and the flexibility to choose whatever data best matches their analytic needs.
The magic is in the metadata.
Metadata is the underlying magic in Podium's revolutionary approach. It drives the data marketplace processes and supports many of the platform's key functions and capabilities. Within a Podium Data Marketplace, metadata is directly linked to the data it describes. Both data and metadata are stored locally in the data lake environment. Tight coupling of data and metadata produces a better structured and easier-to-manage data lake. The approach is so effective, in fact, that a data lake can now be built and maintained by analysts rather than by IT experts, programmers, or Hadoop specialists. Podium also collects a rich layer of business metadata by allowing users working in the GUI to crowd-source and share insights, comments, tags and definitions associated with different data entities.
Realize the true promise of Hadoop, finally.
The Podium platform has been tested and certified on all major Hadoop distributions, including Cloudera, Hortonworks, MapR, and IBM BigInsights. Customers can deploy Podium either on premises or in the cloud on AWS, Azure, or other providers. Podium also incorporates new Apache Hadoop-related projects including Hive, Pig, Tez and Spark, and leverages Cloudera’s Sentry and Hortonworks’ Ranger for security. As a fully native Hadoop solution, Podium executes all access and processing activities on the cluster. This leverages Hadoop’s massive parallel processing architecture to drive faster performance. As a result, every enterprise can now reap the enormous performance, scalability and economic benefits of Hadoop.
Easily publish data for any use.
Publish enables one-time or recurring replication of datasets across enterprise cluster environments. Administrators define Publish Targets and the file format, then execute the job immediately or schedule it on a one-time or recurring basis. Datasets can be published to various destination types including HADOOP (and Hive), LOCALFILE, HDFS, FTP, AWS S3, and RDBMS.
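To make the moving parts of a publish job concrete, here is a minimal Python sketch of how such a definition could be modeled. Only the destination type names come from the list above. The `PublishJob` class, its field names, the `"PARQUET"` format value, and the cron-style schedule string are hypothetical and do not reflect Podium's actual configuration format.

```python
from dataclasses import dataclass
from typing import Optional

# Destination types named in the text above.
DESTINATION_TYPES = {"HADOOP", "LOCALFILE", "HDFS", "FTP", "AWS S3", "RDBMS"}


@dataclass(frozen=True)
class PublishJob:
    """Hypothetical model of a publish job; all field names are illustrative."""
    dataset: str
    destination_type: str
    file_format: str
    schedule: Optional[str] = None  # None means a one-time run

    def __post_init__(self) -> None:
        # Reject destination types that are not in the supported list.
        if self.destination_type not in DESTINATION_TYPES:
            raise ValueError(f"unsupported destination type: {self.destination_type}")

    @property
    def recurring(self) -> bool:
        """True when the job carries a schedule, False for a one-time run."""
        return self.schedule is not None


# One-time publish to HDFS, and a nightly recurring publish to an RDBMS.
one_time = PublishJob("customers", "HDFS", "PARQUET")
nightly = PublishJob("orders", "RDBMS", "PARQUET", schedule="0 2 * * *")
```

Validating the destination type at construction time means a misconfigured job fails when it is defined rather than when it runs.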
Simple, easy-to-use interface.
With Podium, users drive every aspect of the data marketplace process through a simple, intuitive graphical user interface, from ingesting data into the data marketplace and enhancing data quality to building custom data sets and publishing them for use. Driven by Podium’s metadata layer, the GUI gives users comprehensive information on the underlying data collection and robust tools to search, manage and prepare data for business users. By eliminating the need for specialized programming or technical skills, Podium’s GUI allows a much broader group of users to work with data in the marketplace and get value from this next-generation enterprise data asset.
Podium's RESTful API is backed by a robust library of function calls. Transformation workflows can be called from a job scheduler, which activates the Podium Marketplace to perform tasks in automated batch execution mode. Used this way, Podium can be configured to automatically update Marketplace data through periodic imports of new data from source systems or to execute transformations, perform cleansing jobs or deliver data through exports to downstream applications and/or platforms.
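The scheduler-driven pattern described above can be sketched as follows. This is an illustration only: the URL path `/workflows/execute`, the JSON payload shape, the bearer-token header, and the function name `build_workflow_request` are all assumptions made for the example and are not taken from Podium's API documentation.

```python
import json
from urllib.request import Request


def build_workflow_request(base_url: str, workflow_name: str, token: str) -> Request:
    """Build (but do not send) an HTTP request asking the platform to run a workflow.

    The endpoint path and payload fields are hypothetical placeholders.
    """
    payload = json.dumps({"workflow": workflow_name}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/workflows/execute",  # hypothetical path
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


# A job scheduler (cron, for example) would build and then send this request.
request = build_workflow_request(
    "https://podium.example.com/api", "nightly_import", "secret-token"
)
```

A scheduler entry would pass the request to `urllib.request.urlopen` and check the response status to decide whether the batch run succeeded.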