The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
Use this menu to select one or more columns and then change the data in those columns to:
trim leading and trailing spaces from cells in the column
collapse consecutive, multiple spaces into a single space
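As a rough illustration of what these two operations do to a single cell value, here is a minimal Python sketch. The function names trim_spaces and collapse_spaces are hypothetical and are not part of Data Prep. The sketch also assumes that only the space character is affected; the product may treat tabs or other whitespace differently.

```python
import re


def trim_spaces(value: str) -> str:
    """Remove leading and trailing spaces from a cell value."""
    # Assumption: only the space character is stripped, not tabs or newlines.
    return value.strip(" ")


def collapse_spaces(value: str) -> str:
    """Replace each run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", value)


# Applying both operations to one cell value.
cell = "  New    York  "
cleaned = collapse_spaces(trim_spaces(cell))
print(repr(cleaned))  # 'New York'
```

Note that collapsing spaces on its own does not remove a leading or trailing space; it only shortens each run to one space.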
Select Columns for change operation
The default Change menu indicates the column you have selected for the change operation and the "into" drop-down field confirms the operation you have selected. To select additional columns for the operation, click the drop-down arrow adjacent to the column name. A menu opens for selecting multiple or all columns in your dataset.
When selecting more than one column for a change operation, you can select columns by either Name or Criteria. The selection options are significant because they determine how your data will be dynamically updated with the change.
Select Columns by Name: applies the change only to the specific columns you select.
To select columns for the change operation:
click the check box adjacent to the column(s) that you want to select.
click the top-most check box to select all columns.
use the Columns and Types filters at the top of the panel to quickly filter down to the columns you want to select for the operation.
use the search function to locate a column by name.
Select Columns by Criteria: applies the change to any column that meets the criteria you specify. For example, if you have String type columns in your dataset and you specify a change operation to trim leading and trailing spaces for all columns of String type, then all existing columns of this type in your dataset—and any new String type columns that are introduced to the dataset prior to this Step—will be dynamically transformed to remove the leading and trailing spaces.
To select columns based on criteria:
optionally specify the data type of the column—Boolean, DateTime, Number or String.
optionally specify the pattern for the column name—starts with, contains, equals or ends with.
Notice that the header message updates to indicate the number of columns you have selected based on those criteria. The number of selected columns may later increase or decrease if new data brought into an earlier Step introduces or removes columns that meet your criteria.
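The way a criteria-based selection follows changes to the dataset can be sketched in Python. This is an illustrative model only, not how Data Prep is implemented: the Column and Criteria classes and the select_by_criteria function are hypothetical names, and case-sensitive name matching is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str  # one of "Boolean", "DateTime", "Number", "String"


@dataclass(frozen=True)
class Criteria:
    dtype: Optional[str] = None  # optional data type filter
    match: Optional[str] = None  # "starts with", "contains", "equals", "ends with"
    pattern: str = ""            # text to compare against the column name

    def matches(self, col: Column) -> bool:
        if self.dtype is not None and col.dtype != self.dtype:
            return False
        if self.match is None:
            return True
        # Assumption: name matching is case-sensitive.
        checks = {
            "starts with": col.name.startswith,
            "contains": lambda p: p in col.name,
            "equals": lambda p: p == col.name,
            "ends with": col.name.endswith,
        }
        return checks[self.match](self.pattern)


def select_by_criteria(columns: List[Column], criteria: Criteria) -> List[str]:
    """Re-evaluate the criteria against whatever columns currently exist."""
    return [c.name for c in columns if criteria.matches(c)]


columns = [Column("first_name", "String"), Column("age", "Number"),
           Column("city", "String")]
all_strings = Criteria(dtype="String")
print(select_by_criteria(columns, all_strings))  # ['first_name', 'city']

# A new String column introduced earlier in the project is picked up as well.
columns.append(Column("last_name", "String"))
print(select_by_criteria(columns, all_strings))  # ['first_name', 'city', 'last_name']
```

A selection by Name, in contrast, would behave like a fixed list of column names that does not grow when new columns appear.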
After completing a change operation, click Save in the Steps panel. The change you made to all applicable columns is saved in a single Step.
Note: if you switch between the Name and Criteria options before saving the change operation, your selections are remembered and a link to "Restore last selection" returns you to your initial selection method.