The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
Your Paxata Administrator must enable this feature in your application.
Is the sheer amount of data in your source datasets growing and affecting your ability to efficiently import and work with that data in a Paxata Project? This is a common problem for business analysts as the volume of data, and the number of sources from which it comes, continue to grow. To address this problem, Paxata offers the Interactive Mode feature, which lets you get to work faster on a portion of your data, at a portion size that you decide is right for your Project needs. You can then efficiently and interactively prep that portion in a Paxata Project without ever having to bring all of that data into the Project.
The major advantages of the Interactive Mode feature include:
you don't need to wait for the entire dataset to load into your Library before you can begin working with it in a Paxata Project. Instead, you define a portion size for datasets, and when that portion size is reached, that data is available for prep in a Project while the remainder of the dataset continues loading in the Library.
when you've finished prepping your data in the Project, you can easily apply the transformations to all of the data in the native datasets through the Automatic Project Flows feature.
you can always reset the dataset portion that you want to work with in Interactive Mode. For example, after working in a Project with a portion limit of 50k rows per dataset, you may realize you actually need larger portions from each dataset. Changing the portion size is a one-step operation for your Paxata Administrator. Your Project then dynamically recognizes that your portion limits have changed and gives you the option to refresh your datasets to pick up the new data.
your interactive experience in Paxata Projects is optimized because you only need to work with the defined portions of your datasets in your Project.
you have more flexibility in how you work with large Projects. Paxata Projects in Interactive Mode have a row limit that defines the maximum number of rows that can be prepared within a Project. This limit is set by your Paxata Administrator and is useful because it allows the Administrator to ensure you have the optimal interactive experience based on available system resources.
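The portion behavior described in the list above can be pictured with a small sketch. This is not Paxata code or a Paxata API: the `LoadingDataset` class, its methods, and the `portion_limit` name are hypothetical, chosen only to illustrate the idea that a fixed-size portion becomes usable as soon as enough rows have arrived, while the rest of the dataset keeps loading.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class LoadingDataset:
    """Hypothetical model of a dataset that is still loading into a library."""

    portion_limit: int                      # assumed admin-defined row portion
    rows: List[Row] = field(default_factory=list)
    fully_loaded: bool = False

    def load_batch(self, batch: Iterable[Row]) -> None:
        """Append the next batch of rows as it arrives from the source."""
        self.rows.extend(batch)

    def finish(self) -> None:
        """Mark the dataset as completely loaded."""
        self.fully_loaded = True

    @property
    def portion_ready(self) -> bool:
        """True once the portion is reached or the whole dataset has loaded."""
        return self.fully_loaded or len(self.rows) >= self.portion_limit

    def interactive_portion(self) -> List[Row]:
        """Return the rows available for interactive prep (at most the limit)."""
        if not self.portion_ready:
            raise RuntimeError("interactive portion is not available yet")
        return self.rows[: self.portion_limit]


# Usage: the portion is available before loading has finished.
dataset = LoadingDataset(portion_limit=3)
dataset.load_batch([{"id": 1}, {"id": 2}])
dataset.load_batch([{"id": 3}, {"id": 4}])
portion = dataset.interactive_portion()   # first 3 rows only
```

In this sketch, a dataset smaller than the portion limit becomes available once it has finished loading, which is an assumption about the behavior rather than something the documentation states.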
By default, Interactive Mode is not enabled for your Paxata Projects and you will need to contact your Paxata Administrator to enable it. Before enabling Interactive Mode, you should consider the following points:
for existing Projects, use the Profiling feature for the datasets in those Projects. Profiling the datasets gives you fuller insight into the data you have, which informs your choice of the optimal portion size for your datasets.
for existing Projects whose datasets now have more rows than the portion size you've defined, those Projects will not be dynamically updated to remove any rows. Instead, when you open those Projects, you will have the option to use the Refresh Datasets feature to enforce the row portion for each dataset. The row portions are applied only if you elect to refresh those datasets.
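The opt-in refresh behavior for existing Projects can be sketched as follows. Again, this is only an illustration under stated assumptions: `ProjectDataset`, `needs_refresh`, and `refresh` are hypothetical names, not part of Paxata, and a limit of `None` is used here to stand for a Project created before any portion size was defined.

```python
from typing import Iterable, List, Optional


class ProjectDataset:
    """Hypothetical view of one dataset as it is held inside a Project."""

    def __init__(self, source_rows: Iterable[object], portion_limit: Optional[int]):
        self._source: List[object] = list(source_rows)
        # None means no portion was enforced when the Project was created.
        self.applied_limit: Optional[int] = portion_limit
        self.rows: List[object] = self._source[:portion_limit]

    def needs_refresh(self, current_limit: Optional[int]) -> bool:
        """True when the configured limit differs from the one last applied."""
        return current_limit != self.applied_limit

    def refresh(self, current_limit: Optional[int]) -> None:
        """Apply the current limit; rows change only when this is called."""
        self.rows = self._source[:current_limit]
        self.applied_limit = current_limit


# Usage: an existing Project keeps all of its rows until the user refreshes.
existing = ProjectDataset(range(10), portion_limit=None)
if existing.needs_refresh(current_limit=4):
    existing.refresh(current_limit=4)     # the user elects to refresh
```

The point of the design is that changing the limit never alters a Project on its own; the row portion takes effect only through the explicit `refresh` call.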
After Interactive Mode is enabled, an icon displays to indicate that you're operating on a portion of the dataset. When you mouse over the icon, the row portion value is displayed so that you can quickly see the value enforced for all datasets.
Tip: to determine the total number of rows in a dataset, go to the Library page, where that total is displayed for each dataset.
Additionally, the Library page provides information specific to the Interactive Mode feature so that you can determine:
the loading status of a dataset and know when its interactive portion is available for use in a Project.
the AnswerSets that have been published from Projects in Interactive Mode.