Definitions

This page contains the key definitions used throughout the DataStori portal and documentation.


Cloud Infrastructure

DataStori creates and runs data pipelines entirely in the customer's cloud. Pipelines are orchestrated from the DataStori portal (https://app.datastori.io), which spins up servers and other components in the customer's cloud on demand when a data pipeline is due to run, and shuts them down once the run completes. Customers need to set up their cloud infrastructure on AWS, Microsoft Azure or GCP, all of which are supported by DataStori.


Source

A source is the cloud application from which data is ingested. A source may make its data available via API endpoints, emailed CSV files, SQL databases, or files in SFTP folders. Examples of sources include NetSuite, HubSpot, JIRA, BambooHR, ServiceTitan and Zoho Books.


Destination

A destination is the data store to which ingested data is written. DataStori writes output data to the customer's cloud storage: AWS S3, Azure Blob Storage or Google Cloud Storage. In addition, DataStori can write data (all of it or a selected subset) to any SQLAlchemy-supported database, such as Snowflake, PostgreSQL, MySQL or Azure SQL, in the customer's cloud.
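
As a minimal sketch, the Python snippet below shows how ingested rows could be written to a SQLAlchemy-supported destination such as PostgreSQL. The connection URL, table name and sample rows are hypothetical placeholders, not part of DataStori's API.

    # Sketch: writing ingested rows to a SQLAlchemy-supported destination.
    # The connection URL, table and rows are hypothetical placeholders.
    import pandas as pd
    from sqlalchemy import create_engine

    # Hypothetical destination: a PostgreSQL database in the customer's cloud.
    engine = create_engine(
        "postgresql+psycopg2://user:password@db.example.internal:5432/analytics"
    )

    # Hypothetical subset of ingested data.
    rows = pd.DataFrame(
        [
            {"invoice_id": 1001, "amount": 250.00, "currency": "USD"},
            {"invoice_id": 1002, "amount": 99.50, "currency": "USD"},
        ]
    )

    # Append the rows to a destination table, creating it if it does not exist.
    rows.to_sql("invoices", engine, if_exists="append", index=False)

The same pattern applies to any other SQLAlchemy-supported database; only the connection URL changes.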


Integration

An integration is a connection between a source and a destination, for example NetSuite -> Azure SQL. It specifies the data copy and automation parameters, including data deduplication, pipeline schedules, source columns and data backload. DataStori supports integrations from one source to multiple destinations.
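
As a non-authoritative illustration, the Python dictionary below sketches the kinds of parameters an integration captures. The field names and values are assumptions for this example and do not reflect DataStori's actual configuration schema.

    # Hypothetical integration parameters; field names are illustrative only.
    integration = {
        "name": "netsuite-to-azure-sql",
        "source": "NetSuite",
        "destination": "Azure SQL",
        "deduplication_keys": ["transaction_id"],  # columns used to drop duplicate rows
        "schedule": "0 2 * * *",                   # run daily at 02:00 (cron syntax)
        "source_columns": ["transaction_id", "account", "amount", "posted_date"],
        "backload_from": "2023-01-01",             # earliest historical data to load
    }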


Data Pipeline

A data pipeline copies specific data from a source to a destination using an integration. For example, a data pipeline can be built to copy the General Ledger table from NetSuite (the source) to Azure SQL (the destination) using the NetSuite -> Azure SQL integration.
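
To make the relationship between a pipeline and its integration concrete, the sketch below shows a hypothetical pipeline definition that copies one table through the integration sketched above. All names and the structure are assumptions for illustration; DataStori's actual objects may differ.

    # Hypothetical sketch of a data pipeline built on top of an integration.
    pipeline = {
        "integration": "netsuite-to-azure-sql",  # the integration defined earlier
        "source_table": "GeneralLedger",         # table to copy from NetSuite
        "destination_table": "general_ledger",   # table to write in Azure SQL
    }

    def run_pipeline(p):
        # In DataStori, a run like this is triggered from the portal and executes
        # on ephemeral compute in the customer's cloud.
        print(f"Copying {p['source_table']} via {p['integration']} "
              f"to {p['destination_table']}")

    run_pipeline(pipeline)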