Petl is a Python library with a companion command-line tool that supports Extract, Transform, and Load (ETL) workflows on tabular data. Petl can read and write several dozen formats, including Excel, XML, JSON, delimited text, and HTML. Supported transformations include converting individual values, splitting compound values from one cell into multiple cells, SQL-like joins across several tables, deduplication, and validation against a set of constraints. Petl also includes utility functions for statistical analysis and interactive inspection of the data. The ETL Pipelines section of the Introduction page gives a brief illustration of a simple ETL workflow. As a Python package, Petl can be installed via `pip`. On many Linux and BSD systems, Petl is also available through the system's package manager.