The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.
Executes a search and replace on a text string. It is composed of three arguments:
The text to be searched
What you are trying to find
The text that should replace it
To determine if a string of text appears in another piece of text without conducting any replacements, see FIND. The FIND function also has the advantage of a slightly more usable syntax—however, the trade-off is that it is slightly less powerful in pattern matching, as well.
REGEXP(arg1, arg2, arg3)
Number of Arguments:three arguments
Argument Requirements:nThe first argument must be text, columns that contain text, or a function that returns text; the second and third arguments are composed of character combinations that define the search and replace activity
Special Notes: This function is recommended for advanced users. The complexity of the
(sometimes called “regex”) pattern matching syntax is such that very few outside the realm of UNIX-based administration or software engineering have a complete mastery of subject. Since detailed instructions of this non-intuitive syntax are beyond the scope of this article, resources specific to the topic should be considered. (See the “Additional Resources,” section below.)
Example 1:REGEXP("The/quick/sly/fox." , "/", "-") would return The-quick-sly-fox
Example 2: REGEXP(column1 ,"[^0-9]", "") would delete any character from the text in Column 1 that is not a number.
A simple web search will return many potential resources for understanding regular expressions. Rather than attempt to recreate that list here, there are two suggestions below that—while they may not confer expertise in and of themselves—serve as potential starting points for the subject.