Warning: The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.

Columns tool: rename, reorder, remove columns in a Project

Use

Use this article to edit the column names, order, and availability in a Project.

Note: For convenience, you may hide columns from your personal view of the data on the grid, which is different from removing them. A hidden column remains included for all Project filters and Step calculations. See Hide Columns in a Project for details.


Introduction

As you prepare your data, you will sometimes need to change your columns. The Columns in Current Dataset panel serves two purposes. First, it displays all of the columns currently in your Project, along with each column’s source and type (String, Number, or DateTime). Second, it gives you the ability to:

  • Rename columns
  • Reorder columns
  • Remove columns

Overview

The following is an overview of the elements you will work with when you edit the columns in your Project.



Element | Description
Column Filters | Filters the Column List by selected columns, renamed columns, or data type.
Column List | A list of your columns and the type of data they contain. The columns are listed in the order they appear in your data.
Columns Tool | Opens the Columns in Current Dataset panel. This is where you will edit the columns in your Project.
Data Preview | The data in your Project. You will see your data change as you prep it.
Quick Move | Moves the column to the beginning or the end of your dataset.



Rename columns: individually and in bulk

Individual column rename

Follow these steps to change the name of a column.

  1. From Tools, click Columns.
     Result: The Columns in Current Dataset panel appears.
  2. In the Column Name section, click the name of the column you want to rename.
  3. Type the new name for the column.
     Result: The Old Column Name section appears and displays the column's original name. The Data Preview displays the updated column name.
     Note: If you change your mind about the new name, click Reset to restore the column's original name, or click Reset All to restore the original names of all the columns you renamed.
  4. Click Save.
     Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview.


Bulk Renaming

The bulk renaming feature lets you rename all columns using a single comma-separated string. In the field below the Columns List panel, type the new column names and separate each name with a comma. The column names displayed in the Columns List panel update as you type. You can also paste comma-separated column names from a header file to quickly rename all of the columns in your dataset.
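
To make the idea of a comma-separated rename string concrete, here is a minimal Python sketch. It is not the product's implementation. The function name `bulk_rename` is hypothetical, and the sketch assumes that names are applied to columns by position and that blank or missing entries leave a column's current name unchanged.

```python
def bulk_rename(columns, names_csv):
    """Return a new column list renamed positionally from a comma-separated string."""
    # Split on commas and trim surrounding whitespace from each name.
    new_names = [name.strip() for name in names_csv.split(",")]
    renamed = list(columns)
    # Assumption: the Nth name applies to the Nth column; blank entries and
    # columns beyond the end of the string keep their current names.
    for i, name in enumerate(new_names[: len(columns)]):
        if name:
            renamed[i] = name
    return renamed


current = ["col_a", "col_b", "col_c"]
print(bulk_rename(current, "id, first_name, last_name"))
```

Pasting a header row from a file works the same way in this sketch: the pasted text is just another comma-separated string.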


Reorder a column

Follow these steps to change the location of a column.

  1. From Tools, click Columns.
     Result: The Columns in Current Dataset panel appears.
  2. In the Column Type section, position your pointer over the column you want to move.
     Result: The cursor changes to the move cursor.
  3. Determine how many columns you want to move, then move them:
       • One column: drag the column to the new location.
       • A contiguous group of columns: click the first column you want to move, hold Shift and click the last column in the series, then drag the group of columns to the new location or use the Quick Move.
       • Multiple columns, not contiguous: click the first column you want to move, hold CTRL (on a PC) or Command (on a Mac) and click the additional columns to move, then drag the group of columns to the new location or use the Quick Move.
     Result: The Data Preview displays the column in its new position.
  4. Click Save.
     Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview.
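
The effect of the Quick Move on column order can be illustrated with a short Python sketch. This is only an illustration, not the product's code. The function name `quick_move` is hypothetical, and the sketch assumes that a moved group keeps its current relative order.

```python
def quick_move(columns, selected, to_end=False):
    """Move the selected columns to the beginning or end of the column order."""
    chosen = set(selected)
    # Assumption: the selected columns keep their current relative order.
    moved = [c for c in columns if c in chosen]
    rest = [c for c in columns if c not in chosen]
    return rest + moved if to_end else moved + rest


order = ["id", "name", "city", "state", "zip"]
# A non-contiguous selection moved to the beginning of the dataset.
print(quick_move(order, ["city", "zip"]))
# A contiguous selection moved to the end of the dataset.
print(quick_move(order, ["name", "city"], to_end=True))
```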



Remove a column

Follow these steps to remove a column from the Project.

Note: After you remove a column, it’s no longer available for use in the Project. You won’t be able to use the column for subsequent Steps and errors will occur in any subsequent Steps that rely on a column you removed. A removed column can only be made available again by returning to the original Step where it was removed and selecting it again to include in your data.

  1. From Tools, click Columns.
     Result: The Columns in Current Dataset panel appears.
  2. Determine how many columns to remove, then clear their check boxes:
       • One column: for the column you want to remove, click to clear the Column check box.
       • A contiguous group of columns: for the first column you want to remove, click to clear the Column check box, then hold Shift and click to clear the Column check box for the last column in the series.
     Result: The column information in the Column List is shaded. The Data Preview displays the dataset without the column.
  3. Click Save.
     Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview.
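
As the note at the start of this section warns, later Steps that rely on a removed column will produce errors. The following Python sketch shows one way to reason about that dependency before you remove a column. The step records, the `uses` field, and the function name `steps_broken_by_removal` are all hypothetical and do not reflect how Projects are stored.

```python
def steps_broken_by_removal(steps, removed):
    """Return the names of later Steps that reference any removed column."""
    removed = set(removed)
    # A Step is affected if it uses at least one of the removed columns.
    return [step["name"] for step in steps if removed & set(step["uses"])]


# Hypothetical Steps that come after the column removal.
later_steps = [
    {"name": "Filter by state", "uses": ["state"]},
    {"name": "Compute full_name", "uses": ["first_name", "last_name"]},
]
print(steps_broken_by_removal(later_steps, ["last_name"]))
```

In this sketch, removing `last_name` would affect only the Step that computes `full_name`. Hiding the column instead of removing it would affect none.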



Glossary

The following are definitions for terms used in this document.

Term | Definition
AnswerSet | Like a dataset, except that it is the published result of your data prep.
Base dataset | The data on which all other actions in the Project will be performed.
Data source | The source of your dataset.
Dataset | Data that is imported into the Data Library is called a dataset.
Filtergram | The combination of the functionality of filters with the power of histograms.