String Functions and Operators
Applies to: ✅ Data Cloud SQL ✅ Tableau Hyper API
This section describes functions and operators for examining and manipulating string values. Consider the automatic space-padding that occurs when using the whitespace-padded character type.
SQL defines some string functions that use key words, rather than commas, to separate arguments. The strings defined in the SQL standard are shown in the table.
| Function | Return Type | Description | Example |
|---|---|---|---|
string || string | text | String concatenation. NULL arguments aren’t ignored, in contrast to concat. | 'Hy' || 'per' → 'Hyper' 'Hy' || NULL → NULL |
string || non-string or non-string ||string | text | String concatenation with one non-string input | 'Value:' || 42 → 'Value: 42' |
bit_length(string) | int | Number of bits in string | bit_length('jose') → 32 |
char_length(string) or character_length(string) | int | Number of characters in string | char_length('jose') → 4 |
lower(string) | text | Convert string to lower case | lower('TOM') → 'tom' |
octet_length(string) | int | Number of bytes in string | octet_length('jose') → 4 |
overlay(string placing string from int for int) | text | Replace substring | overlay('Txxxxas' placing 'hom' from 2 for 4) → 'Thomas' |
position(substring in string [, start => s] [, occurrence => n]) | int | Location of nth occurrence of the specified substring starting at position s | position('om' in 'Thomas') → 3 |
substring(string from int for int) | text | Extract substring | substring('Thomas' from 2 for 3) → 'hom' |
trim(leading | trailing | both characters from string) | text | Remove the longest string containing only characters from characters (a space by default) from the start, end, or both ends (both is the default) of string | trim(both 'xyz' from 'yxTomxx') → 'Tom' |
upper(string) | text | Convert string to upper case | upper('tom') → 'TOM' |
Some versions of these functions use the regular function invocation syntax together with additional, non-standard string manipulation functions. Those functions are:
| Function | Return Type | Description | Example |
|---|---|---|---|
ascii(string) | int | ASCII code of the first character of the argument. For UTF8 returns the Unicode code point of the character. | ascii('x') → 120 |
btrim(string text , characters text) | text | Remove the longest string consisting only of characters in characters (a space by default) from the start and end of string. | btrim('xyxtrimyyx', 'xyz') → 'trim' |
chr(int) | text | Character with the given code. For UTF8 the argument is treated as a Unicode code point. The NULL (0) character is not allowed because text data types cannot store such bytes. | chr(65) → 'A' |
concat(str "any" [, str "any" [, ...] ]) | text | Concatenate the text representations of all the arguments. Unlike ||, NULL arguments are ignored. | concat('abcde', 2, NULL, 22) → abcde222 |
decode(string text, format text) | bytea | Decode binary data from textual representation in string. Options for format are the same as in encode. | decode('MTIzAAE=', 'base64') → \x3132330001 |
encode(data bytea, format text) | text | Encode binary data into a textual representation. Supported formats are base64, hex, escape. escape converts zero bytes and high-bit-set bytes to octal sequences (\``<nnn>) and doubles backslashes. | encode('123\000\001', 'base64') → MTIzAAE= |
ends_with(string text, suffix text) | boolean | Return true if string ends with suffix. | ends_with('alphabet', 'bet') → t |
initcap(string) | text | Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. | initcap('hi THOMAS') → 'Hi Thomas' |
left(str text, n int) | text | Return first <n> characters in the string. When <n> is negative, return all but last |<n>| characters. | left('abcde', 2) → 'ab' |
length(string) | int | Number of characters in string | length('jose') → 4 |
lpad(string text, length int , fill text) | text | Fill up the string to length by prepending the characters fill (a space by default). If the string is longer than length, then it’s truncated (on the right). | lpad('hi', 5, 'xy') → 'xyxhi' |
ltrim(string text , characters text) | text | Remove the longest string containing only characters from characters (a space by default) from the start of string. | ltrim('zzzytest', 'xyz') → 'test' |
md5(data text or bytea) | text | Calculates the MD5 hash of data, returning the result in hexadecimal. | md5('abc') → 900150983cd24fb0d6963f7d28e17f72 |
quote_ident(string text) | text | Return the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary, that is, if the string contains non-identifier characters or would be case-folded. Embedded quotes are properly doubled. | quote_ident('Foo bar') → '"Foo bar"' |
quote_literal(string text) | text | Return the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single quotes and backslashes are properly doubled. quote_literal returns null on null input. If the argument might be null, quote_nullable is often more suitable. | quote_literal(E'O\'Reilly') → 'O''Reilly' |
repeat(string text, number int) | text | Repeat string the specified number of times. | repeat('Hyper!', 2) → 'Hyper!Hyper!' |
replace(string text, from text, to text) | text | Replace all occurrences in string of substring from with substring to. | replace('abcdefabcdef', 'cd', 'XX') → 'abXXefabXXef' |
reverse(str) | text | Return reversed string. | reverse('abcde') → 'edcba' |
right(str text, n int) | text | Return last <n> characters in the string. When <n> is negative, return all but first |<n>| characters. | right('abcde', 2) → 'de' |
rpad(string text, length int , fill text) | text | Fill up the string to length by appending the characters fill (a space by default). If the string is already longer than length, then it’s truncated. | rpad('hi', 5, 'xy') → 'hixyx' |
rtrim(string text , characters text) | text | Remove the longest string containing only characters from characters (a space by default) from the end of string. | rtrim('testxxzx', 'xyz') → 'test' |
sha256(data text or bytea) | bytea | Calculates the sha256 hash of data, returning the result as binary data. | sha256('abc') → \xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad |
space(length int) | text | Return a string of whitespaces in the specified length | space(5) → (5 whitespace characters) |
split_part(string text, delimiter text, field int) | text | Split string on delimiter and return the given field, counting from one. When n is negative, returns the |n|-th-from-last field. | split_part('abc~@~def~@~ghi', '~@~', 2) → 'def' |
starts_with(string text, prefix text) | boolean | Return true if string starts with prefix. | starts_with('alphabet', 'alph') → t |
strpos(string, substring) | int | Location of specified substring (same as position(substring in string), but note the reversed argument order). | strpos('high', 'ig') → 2 |
substr(string, from , count) | text | Extract substring (same as substring(string from from for count)). | substr('alphabet', 3, 2) → 'ph' |
to_base64(data int bigint or bytea) | text | Convert data to its equivalent base64 representation. | to_base64(2147483647) → 'f////w==' |
to_hex(data int bigint or bytea) | text | Convert data to its equivalent hexadecimal representation. | to_hex(2147483647) → '7fffffff' |