Warning: The Data Prep (Paxata) documentation is now available on the DataRobot public documentation site. See the Data Prep section for user documentation and connector information. After the 2021.2 SP1 release, the content on this site will be removed and replaced with a link to the DataRobot public documentation site.

FIND

This function determines whether one piece of text (a word or any other string) occurs within a second piece of text. It returns a number giving the position at which the first piece of text begins in the second string, counted in characters with the first character of the second string as position 1. If there are multiple matches, FIND returns only the position of the first match, not any later ones.

The function also accepts an optional third argument in the form of a number. This number is the position (counted in characters) in the second string where you want the search to begin. If the third argument is omitted, the second string is searched beginning at the first character.

If the first string is not found in the second string, then the function returns 0.
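The behavior described above can be sketched in Python. This is only an illustration of the semantics, not Data Prep's implementation: the helper name find and its parameter names are hypothetical, and because this page does not say how FIND treats a start position below 1, the sketch simply rejects one.

```python
def find(search_text: str, within_text: str, start: int = 1) -> int:
    """Illustrative stand-in for FIND(arg1, arg2, [arg3]); not the product's code."""
    if start < 1:
        # Assumption: positions are 1-based, so anything below 1 is rejected here.
        raise ValueError("start must be 1 or greater")
    # str.find is 0-based and returns -1 on a miss; adding 1 gives
    # 1-based positions and turns "not found" into 0.
    return within_text.find(search_text, start - 1) + 1


sentence = "The quick sly fox jumped over the lazy brown dog laying next to the other dog."
print(find("dog", sentence))      # 46 (first match only)
print(find("dog", sentence, 47))  # 75 (search starts after the first "dog")
print(find("cat", sentence))      # 0  (no match)
```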

Rules
Generalization:
FIND(arg1,arg2,[arg3])

  • Number of Arguments: two or three arguments
  • Argument Requirements: The first two arguments must be text, columns that contain text, or a function that returns text; the optional third argument must be a number, a column that contains a number, or a function that returns a number
  • Special Notes: This function is case sensitive, so it treats “The” as a different string than “the”. When using this function, the two pieces of text must match EXACTLY—including capitalization.

This function matches pieces of text, not just whole words. Any sequence of characters can be used as the search string, so the text “jump” is found in the string “jumped” at position 1.
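The case-sensitivity and partial-word notes above can be checked with a small Python sketch. The find helper here is a hypothetical stand-in that mimics the documented behavior (1-based positions, 0 when there is no match); it is not the Data Prep function itself.

```python
def find(search_text: str, within_text: str, start: int = 1) -> int:
    # Hypothetical stand-in for FIND: 1-based position of the first match, 0 if none.
    return within_text.find(search_text, start - 1) + 1


print(find("the", "The end"))  # 0: "the" and "The" differ in capitalization
print(find("The", "The end"))  # 1: exact match, including capitalization
print(find("jump", "jumped"))  # 1: a match inside a longer word still counts
print(find("ed", "jumped"))    # 5: any run of characters can be the search text
```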

Examples

Example 1: FIND("the", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.") would return a value of 31. The capitalized “The” at position 1 is not a match because the function is case sensitive.

Example 2: FIND("dog", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.") would return a value of 46, which corresponds to the first occurrence of “dog” in the second string.

Example 3: FIND("dog", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.", 47) would return a value of 75 because the third argument value of 47 pushes the start of the search past character 46 (where the first “dog” occurs) and forces the function to find the second “dog” in the string.
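Example 3 suggests a general pattern: feeding the previous result plus one back in as the third argument steps through successive matches. The Python sketch below illustrates that idea only; find and find_all are hypothetical helper names, not Data Prep functions, and rejecting an empty search string is an assumption made to keep the sketch simple.

```python
def find(search_text: str, within_text: str, start: int = 1) -> int:
    # Hypothetical stand-in for FIND: 1-based position of the first match, 0 if none.
    return within_text.find(search_text, start - 1) + 1


def find_all(search_text: str, within_text: str) -> list[int]:
    """Collect every match position by restarting the search after each hit."""
    if not search_text:
        # Assumption: an empty search string is rejected in this sketch.
        raise ValueError("search_text must not be empty")
    positions = []
    start = 1
    while True:
        pos = find(search_text, within_text, start)
        if pos == 0:  # 0 means no further match
            break
        positions.append(pos)
        start = pos + 1  # resume one character past the last match
    return positions


sentence = "The quick sly fox jumped over the lazy brown dog laying next to the other dog."
print(find_all("dog", sentence))  # [46, 75]
print(find_all("the", sentence))  # [31, 65, 70]
```

The third position for "the" (70) falls inside the word "other", which follows from the function matching pieces of text rather than whole words.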