String Functions
The query service supports the following functions for string manipulations.
Function | Return Type | Input Arguments | Description |
---|---|---|---|
chr({x}) | Varchar |
| Returns the Unicode code point n as a single character string. |
codepoint({s}) | Integer |
| Returns the Unicode code point of the only character of string. |
concat({s1}, {s2}, ...) | Varchar |
| Returns the concatenation of the input strings. For example:
|
concat_ws({s1}, {s2}, ...) | Varchar |
| Returns the concatenation of input using s1 as a separator. If string1 is null, then the return value is null. Any null values provided in the arguments after the separator are ignored. For example:
|
from_utf8({b}) | Varchar |
| Corrects invalid UTF-8 data. It decodes a UTF-8 encoded string from binary. Any invalid UTF-8 sequences are replaced with the Unicode replacement character U+FFFD. |
hamming_distance({s1}, {s2}) | Bigint |
| Returns the Hamming distance of s1 and s2. A Hamming distance is the number of positions at which the corresponding characters are different. The two input strings must have the same length. |
length({s}) | Bigint |
| Returns the length of the input in characters. |
levenshtein_distance({s1}, {s2}) | Bigint |
| Returns the Levenshtein edit distance of s1 and s2. The Levenshtein distance is the minimum number of single-character edits like insertions, deletions, or substitutions that are needed to change s1 into s2. |
lower({s}) | Varchar |
| Returns the input string in lowercase. |
lpad({s}, {size}, {padstring}) | Varchar |
| Left pads the input string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. The size can't be negative and padstring can't be non-empty. |
ltrim({s}) | Varchar |
| Removes leading whitespace from the input string. |
luhn_check({s}) | Boolean |
| Tests whether a string of digits is valid according to the Luhn algorithm. |
normalize({s}) | Varchar |
| Transforms string with NFC normalization form. |
position({substring}, {s}) | Bigint |
| Returns the starting position of the first instance of substring in s. Positions start with 1. If not found, 0 is returned. |
replace({s}, {search_string}) | Varchar |
| Removes all instances of search_string from s. |
reverse({s}) | Varchar |
| Returns s with the characters in reverse order. |
rpad({s}, {size}, {padstring}) | Varchar |
| Right pads the input string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. The size can't be negative, and the padstring can't be non-empty. |
rtrim({s}) | Varchar |
| Removes trailing whitespace from the input. |
split({s}, {delimiter}) | Array[] |
| Splits the input string on delimiter and returns an array. |
split_part({s}, {delimiter}, {index}) | Varchar |
| Splits the input string on delimiter and returns the field index. Field indexes start with 1. If the index is larger than the number of fields, then null is returned. |
split_to_map({s}, {entry_delimiter}, {key_value_delimiter}) | Map<varchar, varchar> |
| Splits the input string by entry_delimiter and key_vcalue_delimiter and returns a map. The entry_delimiter splits the string into key-value pairs. The key_value_delimiter splits each pair into key and value. |
split_to_multimap({s}, {entry_delimiter}, {key_value_delimiter}) | Map<varchar, varchar> |
| Splits the input string by entry_delimiter and key_value_delimiter, and returns a map containing an array of values for each unique key. The entry_delimiter splits the string into key-value pairs. The key_value_delimiter splits each pair into key and value. The values for each key are the same order as they appear in s. |
starts_with({s}, {substring}) | Boolean |
| Tests whether the substring is a prefix of the input string. |
strpos({s}, {substring}) | Bigint |
| Returns the starting position of the first instance of the substring in the input string. Positions start with 1. If not found, 0 is returned. |
substr({s}, {start}) | Varchar |
| Returns the rest of the input string from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string. |
substring({s}, {start})) | Varchar |
| Returns the rest of the input string from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string. |
to_utf8({s}) | Varbinary |
| Returns a string encoded into a UTF-8 varbinary representation. |
translate({source_string}, {from_string}, {to_string}) | Varchar |
| Returns the source_string translated by replacing characters found in the from_string with the corresponding characters in the to_string. |
trim({s}) | Varchar |
| Removes the leading and trailing whitespace from the input string. |
upper({s}) | Varchar |
| Returns the input string in uppercase. |
word_stem({word}, {lang}) | Varchar |
| Returns the stem of the input word in the language lang. |