Overview
DataStori runs data pipelines and stores data in the customer's cloud, so that customer data never leaves the customer's IT environment. Data pipelines are managed and orchestrated in the DataStori web application, which is hosted in AWS US. DataStori supports AWS, Azure and GCP as customer cloud environments.
Prerequisites for DataStori
To set up and use DataStori, the customer needs to provide the following:
- Cloud infrastructure: AWS, Azure or GCP account.
- Data sources: Administrator access to the source applications from which data is to be fetched.
- Data storage destinations: Host and credentials of the cloud storage and (optional) databases where output data is to be written.
Tutorials
These tutorials illustrate how DataStori sets up data pipelines to connect a data source (ServiceTitan) to a destination data store (Azure SQL).
Steps to Onboard an Application
DataStori follows a fully managed process to onboard an application, which includes data pipeline setup, execution and data storage. This process typically takes 7-10 working days. Here is an overview of the steps followed:
- Set up the customer's cloud for data ingestion. DataStori ensures that the customer's cloud configuration is compliant with the customer's data security and privacy policies.
- Help generate API credentials for the source application. With OAuth2, multiple client authentication methods and API tokens to choose from, setting up API authentication can be challenging. Our team helps you decide on the best option and implement it. If the source application does not expose its APIs, email, SFTP or SQL credentials are used instead to access the required data.
- Define data load strategies for the first set of data pipelines, then create and test the pipelines. Finally, set them up on a production schedule.
- Check that the pipeline outputs (ingested data) are being written to the defined data destinations: blob storage and, optionally, SQL tables.
- Set up a demo Power BI, Tableau or Excel report on the output data, to show how it can be consumed for analytics, reporting and other business purposes.
- Train the customer's team to set up new data pipelines, authenticate APIs, add new sources and support the pipelines by managing alerts and failures.
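The steps above mention data load strategies without listing them. As a rough illustration only, the sketch below contrasts two strategies commonly used in data pipelines: a full load and an incremental load driven by a last-updated watermark. This is not DataStori's implementation; the function names, the `id` and `updated_at` fields and the in-memory dictionary standing in for a destination table are all assumptions made for the example.

```python
from datetime import datetime


def full_load(source_rows):
    """Full load: rebuild the destination from every row in the source."""
    return {row["id"]: row for row in source_rows}


def incremental_load(destination, source_rows, watermark):
    """Incremental load: upsert only rows changed after the watermark.

    Returns the new watermark to store for the next run.
    """
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            destination[row["id"]] = row  # insert new row or overwrite old one
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical first run: load everything and remember the latest timestamp.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "open"},
    {"id": 2, "updated_at": datetime(2024, 1, 5), "status": "closed"},
]
destination = full_load(rows)
watermark = max(row["updated_at"] for row in rows)

# Hypothetical later run: only rows newer than the watermark are applied.
changed = [
    {"id": 2, "updated_at": datetime(2024, 1, 9), "status": "reopened"},
    {"id": 3, "updated_at": datetime(2024, 1, 8), "status": "open"},
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "open"},
]
watermark = incremental_load(destination, changed, watermark)
```

A full load is simpler and self-correcting but re-reads the whole source on every run, while an incremental load moves less data but depends on the source exposing a reliable change timestamp. Which one suits a given pipeline depends on the source application and data volume.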
To learn more, write to contact@datastori.io.