The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
Columns tool: rename, reorder, remove columns in a Project
Use
Use this article to edit the column names, order, and availability in a Project.
Note: for convenience, you may hide columns from your personal view of the data on the grid—which is different from removing them. A hidden column remains included for all Project filters and Step calculations. See Hide Columns in a Project for details.
Introduction
As you prepare your data, you will sometimes need to change your columns. The Columns in Current Dataset panel serves two purposes. First, it displays all of the columns currently in your Project, along with each column’s source and type (String, Number, or DateTime). Second, it gives you the ability to:
- Rename columns
- Reorder columns
- Remove columns
Overview
The following is an overview of the elements you will work with when you edit the columns in your Project.
Element | Description |
Column Filters | Filters the Columns List by:
|
Column List | A list of your columns and the type of data they contain. The columns are listed in the order they appear in your data. |
Columns Tool | Opens the Columns in Current Dataset panel. This is where you will edit the columns in your Project. |
Data Preview | The data in your Project. You will see your data change as you prep it. |
Quick Move | Moves the column to the beginning or the end of your dataset. |
Rename Columns—individual columns and in bulk
Individual column rename
Follow these steps to change the name of a column.
Step | Action |
1 | From Tools, click Columns. Result: The Columns in Current Dataset panel appears. |
2 | In the Column Name section, click the name of the column you want to rename. |
3 | Type the new name for the column. Result: The Old Column Name section appears and displays the column's original name. The Data Preview displays the updated column name. |
4 | Click Save. Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview. |
Bulk Renaming
The bulk renaming feature lets you rename all columns using a single, comma-separated string. In the field below the Column List panel, type the new column names, separating each name with a comma. The column names displayed in the Column List panel update accordingly. You can also paste in comma-separated column names from a header file to quickly rename all of the columns in your dataset.
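The following is a minimal Python sketch of how a comma-separated rename string might map onto existing columns. It is an illustration only, not the Data Prep implementation: the function name `bulk_rename` is hypothetical, and the sketch assumes that names are matched to columns by position and that a blank entry leaves the column name unchanged.

```python
def bulk_rename(current_names, new_names_csv):
    """Return column names after applying a comma-separated rename string.

    Assumptions for this sketch (not confirmed product behavior):
    names are matched to columns by position, and blank entries or
    columns past the end of the string keep their current name.
    """
    new_names = [name.strip() for name in new_names_csv.split(",")]
    if len(new_names) > len(current_names):
        raise ValueError("more names supplied than columns in the dataset")

    renamed = list(current_names)  # copy so the input list is not modified
    for index, name in enumerate(new_names):
        if name:
            renamed[index] = name
    return renamed


# Hypothetical column names, e.g. pasted from a header file.
columns = ["cust_id", "fname", "lname", "signup_dt"]
print(bulk_rename(columns, "Customer ID, First Name, Last Name, Signup Date"))
```

Matching by position means the order of names in the string matters more than their spelling, so it is worth checking the Column List against the Data Preview before you save.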
Reorder a column
Follow these steps to change the location of a column.
Step | Action |
1 | From Tools, click Columns. Result: The Columns in Current Dataset panel appears. |
2 | In the Column Type section, position your pointer over the column you want to move. Result: The cursor changes to the move cursor. |
3 | Determine how many columns you want to move. Result: The Data Preview displays the column in its new position. |
4 | Click Save. Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview. |
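As a rough mental model of reordering, the sketch below treats the dataset's columns as an ordered list of names. It is not the product's implementation: `move_column` and `quick_move` are hypothetical helpers, with `quick_move` mirroring the Quick Move element that sends a column to the beginning or the end of the dataset.

```python
def move_column(columns, name, new_index):
    """Return a new column order with `name` placed at `new_index`."""
    if name not in columns:
        raise KeyError(name)
    reordered = [column for column in columns if column != name]
    # Clamp the target so out-of-range positions land at either end.
    new_index = max(0, min(new_index, len(reordered)))
    reordered.insert(new_index, name)
    return reordered


def quick_move(columns, name, to_end=False):
    """Move a column to the beginning (default) or the end of the dataset."""
    return move_column(columns, name, len(columns) if to_end else 0)


# Hypothetical column names used only for illustration.
columns = ["id", "name", "city", "amount"]
print(move_column(columns, "amount", 1))
print(quick_move(columns, "id", to_end=True))
```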
Remove a column
Follow these steps to remove a column from the Project.
Note: After you remove a column, it’s no longer available for use in the Project. You won’t be able to use the column in subsequent Steps, and errors will occur in any subsequent Steps that rely on it. A removed column can be made available again only by returning to the Step where it was removed and selecting it again to include in your data.
Step | Action |
1 | From Tools, click Columns. Result: The Columns in Current Dataset panel appears. |
2 | Determine how many columns to remove. Result: The column information in the Column List is shaded. The Data Preview displays the dataset without the column. |
3 | Click Save. Result: Your change is saved as a Step in your Project. The column is updated in the Data Preview. |
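To illustrate why removing a column can cause errors later in a Project, the sketch below models Steps as a simple ordered list, each recording the columns it relies on. This data model and the function `steps_broken_by_removal` are assumptions made for illustration; they do not reflect how Data Prep stores Steps internally.

```python
def steps_broken_by_removal(steps, removal_index, removed_column):
    """Return the names of later Steps that rely on the removed column.

    `steps` is a list of dicts with "name" and "uses" keys (a hypothetical
    model); `removal_index` is the position of the Step that removes the column.
    """
    later_steps = steps[removal_index + 1:]
    return [step["name"] for step in later_steps if removed_column in step["uses"]]


# Hypothetical Project history used only for illustration.
steps = [
    {"name": "Remove column 'region'", "uses": ["region"]},
    {"name": "Filter on 'amount'", "uses": ["amount"]},
    {"name": "Compute 'region_total'", "uses": ["region", "amount"]},
]
print(steps_broken_by_removal(steps, 0, "region"))
```

Running a check like this before removing a column corresponds to reviewing the later Steps in your Project for any that depend on it.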
Glossary
The following are definitions of terms used in this document.
Term | Definition |
AnswerSet | Like a dataset except that it is the published result of your data prep |
Base dataset | The data on which all other action in the Project will be performed |
Data source | The source of your dataset |
Dataset | Data that is imported into the Data Library is called a dataset |
Filtergram | The combination of the functionality of filters with the power of histograms |