Warning: The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.

MongoDB Connector Documentation

User Persona: Paxata User, Paxata Admin, Data Source Admin

Note: This document covers all configuration fields available during Connector setup. Your Admin may have already filled out some fields at an earlier configuration step, and those fields may not be visible to you. For more information, see the documentation on Paxata's Connector Framework.

Also: Your Admin may have named this Connector something else in the list of Data Sources.

Configuring Paxata

This connector allows you to connect to MongoDB for browsing and importing available data. The following fields are used to define the connection parameters.


  • Name: Name of the data source as it will appear to users in the UI.
  • Description: Description of the data source as it will appear to users in the UI.

Something to consider: You may connect Paxata to multiple MongoDB deployments, so a descriptive name helps users identify the appropriate data source.

MongoDB Configuration

  • Server: The hostname or IP address of the server hosting the MongoDB instance. If connecting to a replica set, use the hostname or IP address of one of the servers.

  • Port: The port for connecting to MongoDB. The default port is 27017.

  • User: The MongoDB user.

  • Password: The MongoDB user's password.

  • Use SSL: Whether SSL is used for the connection to MongoDB.

  • Timeout: The number of seconds to wait until an operation times out. If set to 0, operations never time out.
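As a rough illustration of how the fields above fit together, the following sketch models them as a small Python class with basic sanity checks. The class name, attribute names, and validation rules are assumptions made for this example; they are not part of Paxata or of the connector. Only the default port (27017) and the meaning of a Timeout of 0 come from the field descriptions above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MongoConnectionSettings:
    """Hypothetical container mirroring the connector's MongoDB fields."""

    server: str                 # hostname or IP address of the MongoDB server
    port: int = 27017           # default MongoDB port, per the Port field
    user: str = ""
    password: str = ""
    use_ssl: bool = False
    timeout_seconds: int = 0    # 0 means operations never time out

    def validate(self) -> List[str]:
        """Return a list of problems found; an empty list means none."""
        problems = []
        if not self.server.strip():
            problems.append("Server must not be empty")
        if not 1 <= self.port <= 65535:
            problems.append("Port must be between 1 and 65535")
        if self.timeout_seconds < 0:
            problems.append("Timeout must be 0 or a positive number of seconds")
        return problems

    def timeout_description(self) -> str:
        """Describe the timeout behavior in plain words."""
        if self.timeout_seconds == 0:
            return "never times out"
        return f"times out after {self.timeout_seconds} s"
```

For example, `MongoConnectionSettings(server="mongo.example.com").validate()` returns an empty list, because the placeholder hostname is non-empty and the defaults are within range.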

Database Type Configuration

  • Database Type: Whether to connect to a standalone MongoDB instance or to a replica set.

  • Database Name: The MongoDB database name. The database name is required when connecting to a standalone instance. When connecting to a replica set, if this field is left blank, all available databases are displayed in the import UI.

  • If connecting to a replica set:

    • Replica Set: A comma-separated list of secondary servers, each in server:port form. This allows you to specify multiple servers in addition to the one configured in Server and Port.

    • Read Preference: Strategy for reading from a replica set. See Read Preferences for more details.
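The Replica Set field format described above (comma-separated server:port entries, added to the server given in Server and Port) can be sketched as follows. The function names are hypothetical, the fallback to port 27017 for entries without a port is an assumption, and IPv6 addresses are not handled. The read preference names are MongoDB's five standard modes; the labels shown in the connector's UI may differ.

```python
from typing import List, Tuple

# MongoDB's standard read preference modes.
READ_PREFERENCES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def parse_replica_set(value: str, default_port: int = 27017) -> List[Tuple[str, int]]:
    """Parse 'host1:port1, host2:port2' into a list of (host, port) pairs."""
    members = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No port given: assume the default port (an assumption of this sketch).
            members.append((entry, default_port))
        elif not host:
            raise ValueError(f"missing host in entry: {entry!r}")
        else:
            members.append((host, int(port)))
    return members


def seed_list(server: str, port: int, replica_set: str) -> List[Tuple[str, int]]:
    """Combine the Server/Port pair with the Replica Set entries, without duplicates."""
    seeds = [(server, port)]
    for member in parse_replica_set(replica_set):
        if member not in seeds:
            seeds.append(member)
    return seeds


def is_valid_read_preference(mode: str) -> bool:
    """Check a mode name against MongoDB's standard read preference modes."""
    return mode in READ_PREFERENCES
```

With the placeholder hosts `db1.example.com:27017` as Server and Port and `db2.example.com:27018` in Replica Set, `seed_list` yields both servers, with the configured server first.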

Data Import Information

Via Browsing

Browsing will allow you to view a list of databases and collections within MongoDB. If connecting to a MongoDB replica set, all databases can be browsed if no database was specified in the configuration. If a database was specified, only the collections within that database can be browsed.
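The browsing rule above can be summarized in a few lines of Python. This is only a sketch of the behavior as described here, not the connector's implementation; the function name and the "standalone" and "replica_set" type values are hypothetical.

```python
from typing import List


def browsable_databases(database_type: str, database_name: str,
                        available: List[str]) -> List[str]:
    """Return the databases a user could browse under the rules described above."""
    name = database_name.strip()
    if name:
        # A configured database limits browsing to that database's collections.
        return [name]
    if database_type == "replica_set":
        # Replica set with no database configured: every database is browsable.
        return list(available)
    # A standalone instance requires a database name.
    raise ValueError("Database Name is required for a standalone instance")
```

For example, a replica set configured with a blank Database Name exposes every database in `available`, while the same replica set configured with `sales` (a placeholder name) exposes only `sales`.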

Via SQL Query

This Connector is built on top of JDBC, using a driver provided by CData, and therefore allows data to be imported using SQL SELECT queries. Note that queries use SQL as documented in CData's driver documentation, not the JavaScript-based DSL used in the MongoDB shell.
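To make the distinction concrete, the sketch below builds a simple SQL SELECT statement of the kind this import mode expects and contrasts it with a roughly equivalent MongoDB shell call. The collection name `restaurants` and its fields are made up for illustration, and the helper function is hypothetical. Only plain SELECT syntax is shown; consult CData's driver documentation for the supported dialect, including how nested fields are addressed.

```python
import re
from typing import Optional, Sequence

# Accept only simple names so the sketch never emits unexpected identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_select(collection: str, columns: Sequence[str],
                 where: Optional[str] = None) -> str:
    """Build a basic SELECT over a collection, which is exposed as a table."""
    for name in (collection, *columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    column_list = ", ".join(columns) or "*"
    sql = f"SELECT {column_list} FROM {collection}"
    if where:
        # The WHERE clause is passed through as-is; it is not sanitized here.
        sql += f" WHERE {where}"
    return sql


# SQL form, as used for import through this Connector (hypothetical collection).
query = build_select("restaurants", ["name", "borough"], "borough = 'Queens'")

# Roughly equivalent MongoDB shell form, which is NOT what this Connector accepts.
shell_equivalent = 'db.restaurants.find({borough: "Queens"}, {name: 1, borough: 1})'
```

Here `query` holds `SELECT name, borough FROM restaurants WHERE borough = 'Queens'`.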

Best Practices

  • Paxata uses MongoDB as its metadata store. Connecting this Connector to Paxata's own MongoDB replica set is not an intended or recommended use case. We cannot guarantee the performance or correct functioning of the Connector, or of Paxata itself, when it is used in this way.
    • If your intent is to read Paxata metadata, periodically create a backup of the data in Paxata's MongoDB metadata store, restore the backup to a separate MongoDB instance, and point the Connector at that instance.
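One possible way to script the backup-and-restore approach above is with MongoDB's standard `mongodump` and `mongorestore` tools, sketched here in Python. This document does not prescribe a backup method, so treat this as an assumption. The function names, URIs, and dump directory are placeholders, and both tools must be installed and reachable on the PATH.

```python
import subprocess
from typing import List, Tuple


def backup_restore_commands(source_uri: str, target_uri: str,
                            dump_dir: str) -> Tuple[List[str], List[str]]:
    """Build the mongodump and mongorestore command lines for one refresh."""
    dump = ["mongodump", f"--uri={source_uri}", f"--out={dump_dir}"]
    # --drop removes each collection on the target before restoring it,
    # so repeated refreshes do not accumulate stale documents.
    restore = ["mongorestore", f"--uri={target_uri}", "--drop", dump_dir]
    return dump, restore


def refresh_copy(source_uri: str, target_uri: str, dump_dir: str) -> None:
    """Dump the source deployment, then restore it into the separate instance."""
    dump, restore = backup_restore_commands(source_uri, target_uri, dump_dir)
    subprocess.run(dump, check=True)      # raises CalledProcessError on failure
    subprocess.run(restore, check=True)
```

A scheduler such as cron could call `refresh_copy` at whatever interval suits your needs; the Connector is then configured against the target instance only.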