The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
MS Azure Data Lake Storage Gen2 (ADLS Gen2) Connector Documentation
User Persona: Paxata User - Paxata Admin - Data Source Admin or IT/DevOps
Note: This document covers all configuration fields available during Connector setup. Your Admin may have already filled in some fields at an earlier configuration step, and those fields may not be visible to you. For more information on Paxata’s Connector Framework, please see here.
Also: Your Admin may have given this Connector a different name in the list of Data Sources.
This connector allows you to connect to Azure Data Lake Storage Gen2 for import and export. The following fields are used to define the connection parameters.
Name: Name of the data source as it will appear to users in the UI.
Description: Description of the data source as it will appear to users in the UI.
Something to consider: You can connect Paxata to multiple Azure Data Lake Storage Gen2 accounts, so a descriptive name helps users identify the appropriate data source.
Azure Data Lake Storage Gen2 Configuration
Data Store Root Directory: The apparent root path accessible by this connector. Use "/" to access all files in the file system.
Azure Storage Account Name: The name of your storage account, which forms the subdomain of your Azure storage URL. Storage account names must be 3 to 24 characters long, may contain only numbers and lowercase letters, and must be unique within Azure.
File System Name: The name of the file system within the storage account. This is sometimes called the "container" name.
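The three fields above can be checked and combined before they are entered into the connector. The following is a minimal sketch, not part of Paxata: the helper names `is_valid_storage_account_name` and `build_dfs_url` are hypothetical, and the URL form assumes the public Azure cloud endpoint `dfs.core.windows.net` (sovereign clouds use a different suffix).

```python
import re

# Rule from the field description: 3 to 24 characters,
# lowercase letters and digits only.
_ACCOUNT_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule."""
    return _ACCOUNT_NAME_RE.fullmatch(name) is not None


def build_dfs_url(account: str, file_system: str, root_dir: str = "/") -> str:
    """Compose an ADLS Gen2 endpoint URL from the three connector fields.

    Hypothetical helper for illustration only; it assumes the public
    Azure cloud suffix and does not contact Azure.
    """
    if not is_valid_storage_account_name(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    path = root_dir.strip("/")
    base = f"https://{account}.dfs.core.windows.net/{file_system}"
    return f"{base}/{path}" if path else base
```

For example, with a hypothetical account `mydatalake01` and file system `raw`, a root directory of "/" maps to the top of the file system, while "/sales/2021/" maps to that folder within it.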
Azure Data Lake Storage Gen2 Authentication Settings
From the drop-down, select the preferred authentication method for ADLS Gen2 storage and fill out the required fields.
Storage Account Access Key: Enter the Storage Account Access Key in the field. This is sometimes referred to as a “Shared Key”.
Active Directory Username/Password: Enter the Azure Active Directory username and password associated with your account.
Note: You must grant Paxata access to read and write data within your Microsoft account; otherwise, you will get an error when attempting to connect. To grant access, click the ‘Test Data Source’ button in the Data Source setup panel and follow the ‘Grant Access’ link. This takes you to your Microsoft account, where you can log in and grant access. Then return to Paxata to continue.
Data Import Information
The connector will present a browsable directory hierarchy starting at the location defined in the Data Store Root Directory field.
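To illustrate what an "apparent root" means for browsing, the sketch below maps a path chosen in the browser onto a path inside the file system. It shows the general idea only and is not the connector's actual implementation: the function name `resolve_under_root` is hypothetical, and rejecting paths that escape the root is an assumption about how such a restriction would typically behave.

```python
import posixpath


def resolve_under_root(root_dir: str, browsed_path: str) -> str:
    """Map a browsed path to a file system path under `root_dir`.

    Hypothetical helper: treats `browsed_path` as relative to the
    configured root and refuses paths that climb above it.
    """
    # Normalize the configured root to a single leading slash.
    root = posixpath.normpath("/" + root_dir.strip("/"))
    # Join the browsed path under the root and collapse "." and "..".
    candidate = posixpath.normpath(
        posixpath.join(root, browsed_path.lstrip("/"))
    )
    # With root "/" everything is reachable; otherwise the result
    # must be the root itself or something beneath it.
    if root != "/" and candidate != root and not candidate.startswith(root + "/"):
        raise ValueError(f"path escapes the root directory: {browsed_path!r}")
    return candidate
```

With a root of "/landing", browsing to "sales/q1.csv" resolves to "/landing/sales/q1.csv", and nothing outside "/landing" is reachable. With a root of "/", the whole file system is visible.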
Via SQL Query
Q: Can we have both ADLS Gen1 and ADLS Gen2 Connectors in the same Paxata account?
A: Yes! The two Connectors can coexist and will not interfere with each other.