The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
Use this article to update an existing dataset with a new version.
Data is always changing. Even if you just imported data into Paxata®, there’s a chance the data is outdated. Updating the data in a dataset allows you to import a dataset as a new version of an existing dataset. After you update a dataset, you can easily use the new version in an existing Project. See the Update the Project Datasets article for more information.
When updating a dataset, the new version can be a dataset where:
Only the values have changed: the structure and format are the same.
The format or structure has changed, for example, a column was added or removed.
Everything has changed: it is a completely different dataset with new values, a new structure, and a new format.
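The three cases above can be told apart by comparing the column names of the existing version with those of the incoming data. The following is a minimal sketch in Python, not part of Paxata: the function name `classify_update` and the three labels it returns are hypothetical, and it compares column names only (it does not inspect data types or file format).

```python
from typing import Sequence


def classify_update(old_columns: Sequence[str], new_columns: Sequence[str]) -> str:
    """Classify a new dataset version by comparing column names only.

    Hypothetical helper for illustration; the labels are not Paxata terms.
    """
    old, new = list(old_columns), list(new_columns)

    # Same columns in the same order: only the values can have changed.
    if old == new:
        return "values-only"

    # At least one shared column: columns were added, removed, or reordered.
    if set(old) & set(new):
        return "structure-changed"

    # No columns in common: treat it as a completely different dataset.
    return "different-dataset"


if __name__ == "__main__":
    print(classify_update(["id", "name"], ["id", "name"]))
    print(classify_update(["id", "name"], ["id", "name", "email"]))
    print(classify_update(["id", "name"], ["sku", "price"]))
```

A check like this can be useful before adding a version, because a structural change is more likely to affect the steps of a Project that uses the dataset than a values-only change.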
Update a dataset with new data
Follow these steps to update a dataset with new data.
On the Library screen, hover the cursor over the dataset you want to update.
Result: The dataset actions appear.
Click the "More Actions" button.
Result: The Add Version option appears. Click Add Version to select it.
Locate and select the dataset to import.
To select a dataset from a ...
Select the data source from the Select Data Source drop-down.
Locate the dataset you want to import.
To select the dataset, click Select.
Note: If a SQL statement was used during the initial import, the SQL statement is retained and can be used again to update the data in the dataset.
Click Upload Local File.
Click the Upload File panel and select the datasets.
Result: The dataset is added to the list in the You Selected panel. Paxata displays the Your Options panel for the dataset and a preview of the dataset.
Check the preview of the dataset. To adjust the import settings, see the Adjust the import settings section of the Import Datasets article.
Result: Your data is imported as a new version and is ready to be prepped in a Project.
The following definitions apply to terms used in this document.
Like a dataset except that it is the published result of your data prep
The data on which all other action in the Project will be performed
The source of your dataset
Data that is imported into the Data Library is called a dataset
The combination of the functionality of filters with the power of histograms