(warning) The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.

DataRobot Connector Documentation

User Persona: Paxata User - Paxata Admin - Data Source Admin

Note: This document covers all configuration fields available during Connector setup. Some fields may have already been filled out by your Admin at an earlier step of configuration and may not be visible to you. For more information on Paxata’s Connector Framework, please see here.

Also: Your Admin may have named this Connector something else in the list of Data Sources.

Configuring Paxata

This connector allows you to connect to DataRobot for Library imports and exports. The following fields are used to define the connection parameters.

General

  • Name: Name of the data source as it will appear to users in the UI.

  • Description: Description of the data source as it will appear to users in the UI.

Something to consider: You can connect Paxata to multiple DataRobot accounts, so a descriptive name helps users identify the appropriate data source.

DataRobot Configuration

  • Server URL: The server URL for DataRobot. For example: https://app.datarobot.com

Authentication Configuration

  • Authentication Type: Select the authentication type to use:

    • API Key
      • Key: The DataRobot API Key
    • User Credentials

      • Email: Email or username for authenticating with DataRobot.

      • Password: Password for authenticating with DataRobot.

        • Note: Multi-factor authentication is not supported and will result in an error.
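The fields above can be pictured as a small settings object. The following is a minimal sketch for illustration only, not Paxata's implementation: the class name DataRobotConnectorSettings, its attribute names, and the validate helper are all hypothetical and simply mirror the fields listed in this section.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class DataRobotConnectorSettings:
    """Hypothetical container mirroring the connector fields described above."""
    name: str
    server_url: str
    auth_type: str  # "API Key" or "User Credentials"
    description: str = ""
    api_key: Optional[str] = None
    email: Optional[str] = None
    password: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; an empty list means OK."""
        problems = []
        parsed = urlparse(self.server_url)
        # The URL needs a scheme and a host, e.g. https://app.datarobot.com
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("Server URL must look like https://app.datarobot.com")
        if self.auth_type == "API Key":
            if not self.api_key:
                problems.append("Key is required for API Key authentication.")
        elif self.auth_type == "User Credentials":
            if not self.email or not self.password:
                problems.append("Email and Password are required for User Credentials.")
        else:
            problems.append("Authentication Type must be 'API Key' or 'User Credentials'.")
        return problems
```

For example, DataRobotConnectorSettings(name="DataRobot (Marketing)", server_url="https://app.datarobot.com", auth_type="API Key", api_key="...").validate() returns an empty list, while a server URL entered without the https:// prefix is reported as a problem.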

Data Import & Export Information

Via Browsing

  • To import from or export to the AI Catalog:
    • Upon import, select the "AI Catalog" option to view all available datasets. Select the desired dataset to see a preview and adjust the import settings.
      • Note: If you receive an error during import stating that you do not have permission to download datasets from the AI Catalog, you need to adjust your settings in DataRobot. Log in to DataRobot and click the user icon at the top-right → Settings → Optional Products → check Enable AI Catalog Downloads → Save. Then come back to DataRobot Paxata to continue.
    • Upon export, select the "AI Catalog" option, then click "Select". Name the dataset and click "Export". 

Via SQL Query

Not supported

FAQ/Troubleshooting/Common Issues

What if I export the model I’ve generated in DataRobot and want to run that code where my data lives?

  • Paxata has over 50 other Connectors and can likely still send the prepped data to the appropriate location. If Paxata does not support Connectivity to the service/storage location you require, please reach out to your Customer Success Representative. 

Why can’t I import my dataset?

  • Issue 1: Paxata's integration with DataRobot’s AI Catalog supports importing only “Snapshotted” datasets. The data in “Not snapshotted” datasets is not actually stored in DataRobot; it is retrieved from the original data source when the dataset is used. In Paxata’s case, that would mean DataRobot would first import the full dataset from the data source, and only then would Paxata begin importing it. For “Not snapshotted” datasets, it is much more efficient to pull the data directly from the data source into Paxata. To determine if your dataset is a “Snapshot”, go to the AI Catalog, select the dataset in question, and look at the “Status” in the right-hand panel.
  • Issue 2: If you receive an error stating that you do not have permission to download datasets from the AI Catalog, you need to adjust your settings in DataRobot. Log in to DataRobot and click the user icon at the top-right → Settings → Optional Products → check Enable AI Catalog Downloads → Save. Then come back to DataRobot Paxata to continue.
  • Issue 3: If you receive an error stating "Mapping for <column name> not found, expected one of [ <column name list>]", you need to adjust your settings in DataRobot. Log in to DataRobot and click the user icon at the top-right → Settings → CSV export → uncheck Include BOM → Save. Then come back to DataRobot Paxata to continue.
  • Issue 4: If you receive an error like "Failed to retrieve datasets. Cause: Unprocessable Entity", it likely indicates that the optional beta feature List Data Mesh Workspaces in AI Catalog is enabled in the DataRobot application. Disable that setting in DataRobot and come back to DataRobot Paxata.
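As background for Issue 3, a byte order mark (BOM) at the start of a CSV file can end up attached to the first column name, which is one plausible reason a column lookup would fail. The sketch below only illustrates that general effect using Python's standard library; the sample payload and the header_of helper are made up for this example and say nothing about how Paxata or DataRobot parse files internally.

```python
import csv
import io

# A tiny CSV payload that starts with a UTF-8 byte order mark (BOM).
raw = b"\xef\xbb\xbfid,amount\n1,10\n2,20\n"


def header_of(payload: bytes, encoding: str) -> list:
    """Decode the payload and return the CSV header row."""
    text = payload.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


with_bom = header_of(raw, "utf-8")         # BOM stays glued to the first name
without_bom = header_of(raw, "utf-8-sig")  # BOM is stripped while decoding

print(with_bom)     # ['\ufeffid', 'amount']
print(without_bom)  # ['id', 'amount']
```

With the BOM left in place, a lookup for the column "id" does not match the header, because the stored name is "\ufeffid". Unchecking Include BOM in DataRobot avoids producing such a file in the first place.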

When I export a new version of my dataset, does it appear as such in the AI Catalog?

  • Yes, versions of datasets with the same name will appear under the Version History tab of the AI Catalog, rather than as a new dataset.

Requirements for Data Exports to the AI Catalog

Datasets exported to DataRobot must meet the following criteria:

  • At least 100 rows.
  • At least 2 columns.
  • Have valid column names.
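
The criteria above can be checked before starting an export. The following is a minimal sketch, assuming the data is available as CSV text; the function name check_export_ready is hypothetical, and because this page does not define what makes a column name "valid", the sketch assumes only that names must be non-blank and unique.

```python
import csv
import io

MIN_ROWS = 100    # from the requirements above
MIN_COLUMNS = 2   # from the requirements above


def check_export_ready(csv_text: str) -> list:
    """Return a list of problems that would likely block an export."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["Dataset is empty."]
    header, data = rows[0], rows[1:]
    problems = []
    if len(data) < MIN_ROWS:
        problems.append(f"Only {len(data)} rows; at least {MIN_ROWS} are required.")
    if len(header) < MIN_COLUMNS:
        problems.append(f"Only {len(header)} columns; at least {MIN_COLUMNS} are required.")
    # Assumption: "valid" is treated here as non-blank and unique.
    names = [h.strip() for h in header]
    if any(not n for n in names):
        problems.append("Blank column name found.")
    if len(set(names)) != len(names):
        problems.append("Duplicate column names found.")
    return problems
```

An empty result means the dataset passes these assumed checks; DataRobot may apply further column-name rules that are not covered here.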