Supported transformation functions - Data cleansing

No-code Transformation formulas for Data cleansing.

In the formula input, you can can write basic or complex expression to transform the data. Expressions are composed of either a value, a function, an operation, or another expression in parenthesis. See below for the description of each one of them.

Functions can be nested, ex: SHA256(LOWER(user.email))

Available functions

FUNCTION NAMEWHAT IT DOESExample

EXTRACT(text, delimiter, position)

Extracts from the given text the substring at the given position, after splitting by the given delimiter.

EXTRACT("a_b_c", "", 1) returns "b"

EXTRACT("a_b_c", "", 0) returns "a"

SUBSTITUTE(text, search_text,substitute_text)

Substitutes new_text for old_text in a text string. Use SUBSTITUTE when you want to replace specific text in a text string.

SUBSTITUTE("The-Clone-Wars", "-", ".")returns "The.Clone.Wars"

REPLACE(sourceStr, pattern, replaceStr)

Replaces every match of pattern (a regex expression) in sourceStr.

REPLACE("The Clone Wars", "\\s", ".") returns "The.Clone.Wars"

REPLACE('0123456789', '([0-9]{2})', '$1-') returns

01-23-45-67-89

SHA256(stringToHash)

Returns an SHA256 hash of the given string

SHA256("test") returns "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

MD5(stringToHash)

Returns an MD5 hash of the given string

MD5("test") returns "098f6bcd4621d373cade4e832627b4f6"

COALESCE(value1, value2, ...)

Returns the first value from the list that isn’t empty

COALESCE("123", "hello") returns "123"

CONCAT(string1, string2, ...)

Concatenate multiple values into one property. Separators could be specified: CONCAT(prop1, "

", prop2, "

TRIM(text)

Removes all spaces, new lines, tabulations at the beginning and at the end of a text.

TRIM(" \Hello World \n") returns "Hello World"

SELECT(sourceStr, pattern, <position>)

Returns the first match of the pattern (a regex expression) in sourceStr. If the third parameter is set, it will return the corresponding group inside the match.

SELECT("From A to Z alphabet", 'A(.*?)Z") returns "A to Z"

LOWER(text)

Converts all uppercase letters in a text string to lowercase

LOWER("ABC") returns "abc"

UPPER(text)

Converts all lowercase letters in a text string to uppercase

UPPER("abc") returns "ABC"

NUMBER(value1)

Converts string to number

NUMBER("123.456") returns 123.456

NUMBER("not a number") returns null

ENCODE_BASE64(value1)

Encode a string to base64

ENCODE_BASE64("hello") returns "aGVsbG8="

DECODE_BASE64(value1)

Decode a string from base64

DECODE_BASE64("aGVsbG8=") returns "hello"

TIMESTAMP()

Returns the current date timestamp

TIMESTAMP () returns a value like 1674832519845 (changes with the current time)

LEFT(text, length)

Returns the requested number of characters from the beginning of a string or number.

LEFT("Hello world", 3) returns "Hel"

RIGHT(text, length)

Returns the requested number of characters from the end of a string or number.

RIGHT("Hello world", 2) returns "ld"

CHAR(text, position)*

Returns the character at the specified position (start at 0).

CHAR("Hello world", 5) returns "o"

IF(condition,resultIfTrue,resultIfFalse)

Returns the second argument if the first argument is true, or the third argument otherwise

IF(city="Paris", ListA, ListB)

ISEMPTY(value)

Returns whether the value is empty or not

ISEMPTY("abc") returns false

SIZE(array or object)

Returns the number of elements in the list

SIZE(items) returns the number of items

LENGTH(string)

Returns the number of characters in the string

LENGTH("abc") returns 3

SUBSTRING(text, pos1, pos2)

Returns a new string composed of the characters between the two position defined by pos1 and pos2. The positions start at 0. If a negative number is given, the positions start from the end of the string. If pos2 is omitted, the end of the string is used.

SUBSTRING("hello beautiful world", 6, 14) returns "beautiful"

JSON_PARSE(string)

Parse a JSON string. Usefull to create an object property that will contain all the properties present inside a stringyfied JSON you may have inside another event property

JSON_PARSE('{"optin":1,"message":"ok"}') returns an object with an optin key equals to 1 and message key equals to "ok"

VALUE_FROM_JSON(string, path)

Extract a value from a data string formatted in JSON

VALUE_FROM_JSON('{"info":{"code":42},"message":"ok"}', 'info.code')

returns

42

GET(property name)

GET(name) return the value of the given event property. Useful for property names that contains a special character like "-"

GET("droid-type")

returns

"r2d2" GET("invalidProperty") returns null

GETHEADER(name)

GETHEADER(name) return the value of the given http header

GETHEADER("Accept-Language")

returns

"fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3"

GETCOOKIE(name)

GETCOOKIE(name) return the value of the given http cookie

GETCOOKIE("cart")

returns

"empty"

DATEPARSE(str, format)

Takes a date in a specific format and converts it to a proper date

DATEPARSE('2023-05-23', 'yyyy-MM-dd') returns 2023-05-23T00:00:00.000+01:00 see more about format

DATEFORMAT(date, format)

Takes a date in a ISO-8601 format and converts it to the specified format

DATEFORMAT('2023-01-31T23:59:00.000+01:00', 'yyyy-MM-dd HH:mm') returns 2023-01-31 23:59 see more about format

DATEADD(date, interval, unit)

Adds the interval as a unit (years, months, weeks, days, hours, minutes, seconds)

DATEADD('2023-01-23T00:05:00.000+01:00', 3, 'days') returns 2023-01-26T00:05:00.000+01:00

DATESUB(date, interval, unit)

Removes the interval as a unit (years, months, weeks, days, hours, minutes, seconds)

DATESUB('2023-01-23T00:05:00.000+01:00', 3, 'days') returns 2023-01-20T00:05:00.000+01:00

DATEDIFF(date1, date2, unit)

Returns the interval between two dates in unit

DATEDIFF('2023-01-20T00:05:00.000+01:00', '2023-01-23T00:05:00.000+01:00', 'days') returns 0

AGE(date, unit)

Returns the elapsed time since the given date in unit (if unit is not specified, years by default)

AGE('2022-01-20T00:05:00.000+01:00') returns 1 in 2023

YEAR(date), MONTH(date), DAY(date), HOUR(date), MINUTES(date), SECONDS(date)

Returns the dates specified unit

YEAR('2023-01-20T00:05:00.000+01:00') returns 2023, MONTH for the same date returns 1, etc

TIMEZONE(date)

Returns the timezone in UTC

TIMEZONE('2023-01-20T00:05:00.000+01:00') returns UTC+1

TODAY()

Returns the current date in ISO-8601

TODAY() might return 2023-01-20T00:05:00.000+01:00, assuming the current date is the 20th of January 2023

Operators

Operator

Description

Example

=, ==

Return true if the left part is equal to the right part. If one of the part is an array, check if some values are in equal between the two parts.

gender = "F"

!=, <>

Inverse of the above

gender != "F"

AND or &&

Do a boolean AND between the two parts

age >18 AND gender = "M"

OR or ||

Do a boolean OR between the two parts

age >18 OR gender = "M"

NOT(expression)

Do a boolean NOT on the expression

NOT(age>18)

IN()

Returns true if the left is equals to at least one value on the right.

currency IN("€", "$", "£")

!IN

Returns true if none of the value on the right is equal to the value on the left.

currency !IN("€", "$", "£")

<, >, <=, or >=

Compare two values If one of the values is an array, check that at least one of the value match.

age > 18

BETWEEN()

Check if the left value is between the two values passed as arguments.If the left value is an array, check that at least one of the value match.

age BETWEEN(7, 80)

EXISTS()

Check if the property exists

EXISTS(my_property)

~ or !~

Check if the left value match the regex in the right value (or doesn’t match if !~). The regex language is the javascript one. If the left value is an array, check that at least one of the value match.

email ~ "@example.com"

STARTSWITH(), ENDSWITH(), or CONTAINS()

Check if the left value starts with, ends with, or contains the right value. If the left value is an array, check that at least one of the value match.

lastname STARTSWITH("DE")

* or /

Multiplication or division

3*5

+ or -

Addition / concatenation or substraction

1+2

Examples

  1. Scenario: Create a Flag that shows if Consumers’ Primary Address is in California IF(EXTRACT(city_state, '-', 1) == " CA", "TRUE", "FALSE")

  2. Scenario: Set a value with multiple conditions. The value equals "12345" for country FR and environment prod, else equals 98888 for country DE and environment dev: \

```xquery
IF(country = "FR" AND environment_work = "prod", "12345", IF(country = "DE" AND environment_work = "dev", "98888")
```

Last updated