Lineage Tracker

This is the developer documentation for the Lineage Tracker component available in PISTIS factories.

Introduction

The Lineage Tracker monitors the evolution of datasets within the PISTIS ecosystem. It records every dataset create/update/delete operation, allowing users to track the lineage and family tree of any dataset. It captures details such as who performed an operation, what changed, and when that change occurred.

The API endpoints can be used to document operations, retrieve lineage information, and verify data integrity via blockchain anchoring. Most endpoints require authorization with a Bearer token issued by the PISTIS Keycloak.

Access

The API of the Lineage Tracker is available through: https://{FACTORYNAME}.pistis-market.eu/srv/lineage-tracker/
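
As a minimal sketch of how a client might address the API, the snippet below builds an endpoint URL from the base address above and attaches a Bearer token. The factory name `my-factory`, the query parameter `dataset_id`, and the token value are illustrative assumptions and are not taken from the Lineage Tracker specification; the Swagger documentation gives the actual parameter names. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def lineage_url(factory_name, endpoint, params=None):
    """Build a Lineage Tracker URL for the given factory and endpoint."""
    base = f"https://{factory_name}.pistis-market.eu/srv/lineage-tracker/"
    url = base + endpoint.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return url


def authorized_request(url, token, method="GET"):
    """Create a request carrying the Keycloak token as a Bearer credential."""
    return Request(url, method=method, headers={"Authorization": f"Bearer {token}"})


# "my-factory" and "dataset_id" are hypothetical placeholders.
req = authorized_request(
    lineage_url("my-factory", "get_dataset_lineage", {"dataset_id": "1234"}),
    token="<access-token>",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here because it needs a reachable factory and a valid token.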

Usage

Operation Documentation

Endpoints for documenting dataset lifecycle events.

| Method | Endpoint | Description |
| POST | /create_dataset | Document the creation of a new dataset. |
| POST | /update_dataset | Document an update to an existing dataset. |
| POST | /delete_dataset | Document the deletion of a dataset. |
| POST | /delete_family_tree | Document the deletion of an entire dataset family tree. |
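
The following sketch shows how a lifecycle event might be submitted to one of these POST endpoints. It assumes the endpoint accepts a JSON body, and the payload fields (`dataset_id`, `user`, `description`) and the factory name `my-factory` are hypothetical; the real request schema is defined in the Swagger documentation. The request is built but not sent.

```python
import json
from urllib.request import Request

# Hypothetical factory name; replace with the actual {FACTORYNAME}.
BASE_URL = "https://my-factory.pistis-market.eu/srv/lineage-tracker/"


def build_operation_request(endpoint, payload, token):
    """Create an authorized POST request with a JSON-encoded body."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        BASE_URL + endpoint.lstrip("/"),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed content type
        },
    )


# All payload field names below are illustrative assumptions.
create_req = build_operation_request(
    "create_dataset",
    {"dataset_id": "1234", "user": "alice", "description": "Initial upload"},
    token="<access-token>",
)
print(create_req.get_method(), create_req.full_url)
```

The same helper would cover `/update_dataset`, `/delete_dataset`, and `/delete_family_tree` by changing the endpoint name and payload.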

Information Retrieval

Endpoints for retrieving lineage, history, and statistics.

| Method | Endpoint | Description |
| GET | /get_dataset_family_tree | Retrieve the complete family tree of a dataset. |
| GET | /get_dataset_lineage | Retrieve the lineage path from root to a specific dataset. |
| GET | /get_dataset_history | Retrieve the complete operation history of a dataset. |
| GET | /get_dataset_status | Retrieve the last operation performed on a dataset. |
| GET | /get_user_history | Retrieve the history of operations performed by a specific user. |
| GET | /get_node_stats | Retrieve statistics (depth, children count) for a dataset family tree. |
| GET | /get_dataset_num_operations | Get the count of each operation type for a dataset. |
| GET | /get_datasets_diff | Retrieve differences between two datasets. |
| GET | /get_datasets_diff_limit | Retrieve a limited sample of differences between two datasets. |

Lineage Transfer

Endpoints for moving lineage information between environments (e.g., Factory and Cloud).

| Method | Endpoint | Description |
| GET | /read_lineage | Export the lineage content from root to a specific dataset. |
| POST | /write_lineage | Import/paste dataset lineage into the store. |
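
A transfer between two environments could be sketched as a read from the source followed by a write to the target. This is an illustration under several assumptions: the query parameter `dataset_id`, the JSON content type, the use of a single token for both environments, and the idea that the output of `/read_lineage` can be posted unchanged to `/write_lineage` are not confirmed by this page. The `send` argument is injectable so that the flow can be exercised without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def send_over_http(request):
    """Send a request and return the raw response body."""
    with urlopen(request) as response:
        return response.read()


def transfer_lineage(source_base, target_base, dataset_id, token, send=send_over_http):
    """Export lineage from the source store and import it into the target store."""
    auth = {"Authorization": f"Bearer {token}"}

    # Step 1: export the lineage (parameter name is a hypothetical placeholder).
    query = urlencode({"dataset_id": dataset_id})
    read_req = Request(f"{source_base}read_lineage?{query}", headers=auth, method="GET")
    exported = send(read_req)

    # Step 2: import the exported content, passed through unchanged.
    write_req = Request(
        f"{target_base}write_lineage",
        data=exported,
        headers={**auth, "Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
    return send(write_req)
```

A call might look like `transfer_lineage(factory_base, cloud_base, "1234", token)`, where both base URLs end with a trailing slash and the cloud address is whatever the target environment exposes.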

Blockchain Integration

Endpoints for data integrity verification.

| Method | Endpoint | Description |
| GET | /blockchain-hash | Retrieve the blockchain hash for a dataset family tree. |

Admin Operations

Administrative endpoints for maintenance.

| Method | Endpoint | Description |
| POST | /admin/regenerate_family_tree_descriptions | Regenerate update descriptions for a family tree. |
| POST | /admin/regenerate_blockchain_hash | Regenerate and store the blockchain hash. |
| POST | /admin/purge_dataset_family_tree | Permanently delete a family tree from the triple store. |

To learn about the API endpoints in detail, take a look at the Swagger documentation.