Version 6

# Intro

You can use this reference guide to learn about supported functions in the Add Formula and Group By tiles in Magic ETL v2.

Important: There are significant behavioral differences in Magic ETL v2. Read the Upgrading to Magic ETL v2 (Beta) article before converting mission-critical Magic ETL DataFlows. Failure to do so may risk an unintended change to your DataFlow’s behavior.

# Supported Functions

The following sections give examples of the available functions that can be used inside of Magic ETL. However, for a complete list you will need to refer to the Functions list inside of the Formula Editor as seen here:

Important: While the Formula Editor looks similar to the Beast Mode editor, they are not the same. You can copy any Beast Mode you have made into the Formula Editor and it will work, but not all of the functions allowed in the Formula Editor will work inside of a Beast Mode. This is due to the increased processing power available to DataFlows versus the Analyzer.

# Aggregate Functions

 Function Name Description Example APPROXIMATE COUNT (DISTINCT) Returns the approximate count of a number of unique values in a column. APPROXIMATE_COUNT_DISTINCT(X) CORR Returns the correlation coefficient of independent variable x and dependent variable y. It is computed for non-null pairs of values only. For non-null data, it is equivalent to: COVAR_POP(y, x) / (STDDEV_POP(x) * STDDEV_POP(y)). It is an aggregate function. VAR_POP() returns NULL if there were no matching rows. CORR(x,y) COUNT Returns a count of the number of non-NULL values of expr. The result is an integer value. It is an aggregate function. COUNT(*) counts the total number of rows. COUNT() returns 0 if there were no matching rows. COUNT(expr) COVAR_POP Returns the population covariance of independent variable x and dependent variable y. It is computed for non-null pairs of values only. For non-null data, it is equivalent to: (SUM(x*y) - SUM(x) * SUM(y) / COUNT(*)) / COUNT(*). It is an aggregate function. VAR_POP() returns NULL if there were no matching rows. COVAR_POP(x,y) COVAR_SAMP Returns the sample covariance of independent variable x and dependent variable y; the denominator is the sample size minus one. It is computed for non-null pairs of values only. For non-null data, it is equivalent to: (SUM(x*y) - SUM(x) * SUM(y) / COUNT(*)) / (COUNT(*) - 1). It is an aggregate function. VAR_POP() returns NULL if there were no matching rows. COVAR_SAMP(x,y) FIRST_NON_NULL_VALUE Returns the first non-null value in expr, or NULL if no non-NULL values were found. It is an aggregate function. FIRST_NON_NULL_VALUE(expr) FIRST_VALUE Returns the first value in expr, including NULL. It is an aggregate function. FIRST_VALUE(expr) LAST_NON_NULL_VALUE Returns the last non-null value in expr, or NULL if no non-NULL values were found. It is an aggregate function. LAST_NON_NULL_VALUE(expr) LAST_VALUE Returns the last value in expr, including NULL. It is an aggregate function. LAST_VALUE(expr) MAX Returns the largest, or maximum, value of expr. MAX() can also take a string argument in which case it returns the maximum string value.   It is an aggregate function.   MAX() returns NULL if there were no matching rows. MAX(expr) MEDIAN Returns the estimated median value of expr. It is an aggregate function. MEDIAN(expr) MIN Returns the smallest, or minimum, value of expr. MIN() can also take a string argument in which case it returns the minimum string value.   It is an aggregate function.   MIN() returns NULL if there were no matching rows. MIN(expr) PERCENTILE Returns the ntile percentile of expr. It is an aggregate function. PERCENTILE(expr, ntile) STD STD() is a synonym for STDDEV_SAMP(). STD(expr) STDDEV STDDEV() is a synonym for STDDEV_SAMP(). STDDEV(expr) STDDEV_POP Returns the population standard deviation of expr (the square root of VAR_POP()). It is an aggregate function. STDDEV_POP() returns NULL if there were no matching rows. STDDEV_POP(expr) STDDEV_SAMP Returns the sample standard deviation of expr (the square root of VAR_SAMP()). It is an aggregate function. STDDEV_POP() returns NULL if there were no matching rows. STDDEV_SAMP(expr) SUM Returns the sum of expr. If the return set has no rows, SUM() returns NULL. It is an aggregate function. SUM(expr) VAR_POP Returns the population standard variance of expr. It considers rows as the whole population, not as a sample, so it has the number of rows as the denominator. Variance is calculated by working out the mean for the set for each number, subtracting the mean and squaring the result, and finally calculating the average of the resulting differences It is an aggregate function. VAR_POP() returns NULL if there were no matching rows. VAR_POP(expr) VAR_SAMP Returns the sample standard variance of expr. The denominator is the number of rows minus one. Variance is calculated by working out the mean for the set for each number, subtracting the mean and squaring the result, and finally calculating the average of the resulting differences. It is an aggregate function. VAR_POP() returns NULL if there were no matching rows. VAR_SAMP(expr) VARIANCE VARIANCE() is a synonym for VAR_SAMP(). VARIANCE(expr) VARIANCE_POP VARIANCE_POP() is a synonym for VAR_POP(). VARIANCE_POP(expr) VARIANCE_SAMP VARIANCE_SAMP() is a synonym for VAR_SAMP(). VARIANCE_SAMP(expr)

# Statistical Functions

 Function Name Description Example ANOVAF Returns the F-value of a one-way Analysis of Variance of the samples. At least two samples must be provided, and each sample must have at least two non-null values. Note that a one-way ANOVA on two samples is equivalent to the Student's t-test (see the TTEST function). See also: ANOVAP() ANOVAF(sample1,sample2[,sample3[,...,sampleN]]) ANOVAP Returns the p-value of a one-way Analysis of Variance of the samples. At least two samples must be provided, and each sample must have at least two non-null values. Note that a one-way ANOVA on two samples is equivalent to the Student's t-test (see the TTEST function). See also: ANOVAF() ANOVAP(sample1,sample2[,sample3[,...,sampleN]]) TTEST Returns the p-value result of a Student's t-test. Arguments sample1 and sample2 are two numeric columns with at least 2 non-null values each. The tails argument is either 1 or 2, indicating a one- or two-tailed test. The type argument is a number 1 through 3. Type 1 is a paired sample test; for this type of test, values in both sample1 and sample2 must be non-null or be ignored. Type 2 is for two samples with the assumption of equal variance. Type 3 is for two samples without the assumption of equal variance. TTEST(sample1,sample2,tails,type)

# Logical Functions

 Function Name Description Example ALL_TRUE Returns TRUE if all arguments evaluate to TRUE, otherwise returns FALSE. Can be invoked with one or more arguments. ALL_TRUE(cond1,cond2,...) ANY_TRUE Returns TRUE if any arguments evaluate to TRUE, otherwise returns FALSE. Can be invoked with one or more arguments. ANY_TRUE(cond1,cond2,...) BOOLEAN Returns the value of expr converted to BOOLEAN type (TRUE, FALSE, or NULL). BOOLEAN(expr), CAST(expr AS BOOLEAN) COALESCE Returns the first non-NULL value in the list, or NULL if there are no non-NULL values. At least one parameter must be passed. COALESCE(value,...) ERROR When evaluated, produces an error with the given error string. Depending on the transformation settings, this may halt execution entirely. ERROR(error_string) IF If test is TRUE, returns exprTrue. Otherwise, returns exprFalse. The return type is the type that best matches both exprTrue and exprFalse. For example, if exprTrue has type LONG and exprFalse has type DOUBLE, the return type will be DOUBLE, even if test is TRUE. IF(test,exprTrue,exprFalse) IFERROR If the evaluation of expr1 does not produce an error, IFERROR() returns expr1; otherwise it returns expr2. Note IFERROR() will only handle errors produced at runtime. If expr1 is produces an error before even being evaluated, e.g. if it attempts to convert between inconvertible types, IFERROR() will still produce that error. IFERROR(expr1,expr2) IFNULL If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL(expr1,expr2) NULLIF Returns NULL if expr1 = expr2 is true, otherwise returns expr1. This is the same as CASE WHEN expr1 = expr2 THEN NULL ELSE expr1 END. NULLIF(expr1,expr2)

# String Functions

 Function Description Example CHAR_LENGTH Returns the length of the given string argument, measured in characters. A multi-byte character counts as a single character. This means that for a string containing five two-byte characters, LENGTH(), whereas CHAR_LENGTH() returns 5. If the argument is NULL, it returns NULL. If the argument is not a string value, it is converted into a string. CHAR_LENGTH(str) CONCAT Returns the string that results from concatenating the arguments. May have one or more arguments. All arguments are converted to strings. CONCAT() returns NULL if any argument is NULL. CONCAT(str1,str2,...) DYNACAST Returns the value of expr converted to the same type as type_expr. type_expr is not evaluated; it is only used to determine the type to which expr should be converted. In most cases, type_expr be a column reference. DYNACAST(expr, type_expr) GROUP_CONCAT Returns the string that results from concatenating the arguments, separated by the separator, for the whole group. May have one or more arguments, plus the optional separator. If not set, the separator is the comma character '',''. All arguments are converted to strings.  It is an aggregate function. GROUP_CONCAT() returns null if there were no matching rows. GROUP_CONCAT(str1,str2,...[SEPARATOR sep]) INITCAP Returns str with the start of every word in uppercase and all other characters in lowercase. INITCAP(str) INSTR Returns the position of the first occurrence of substring substr in string str. If any argument is NULL, returns NULL. INSTR(str,substr) LCASE LCASE() is a synonym for LOWER(). LCASE(str) LEFT Returns the leftmost len characters from the string str, or NULL if any argument is NULL. LEFT(str,len) LENGTH Returns the length of the string str, measured in bytes using the UTF-8 encoding. A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5. If str is not a string value, it is converted into a string. If str is NULL, the function returns NULL. LENGTH(str) LISTAGG LISTAGG() is a simplified version of GROUP_CONCAT(). The first argument is required, and is the string to be concatenated. The second argument is optional, and is the separator string. If not set, the separator is the empty string. LISTAGG(str1[,sep]) LOWER Returns the string str with all characters changed to lowercase. LOWER(str) LPAD Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters. LPAD(str, len, padstr) LTRIM Returns the string str with leading space characters removed. LTRIM(str) REGEXP_LIKE Performs a pattern match of a string expression str against a pattern pat. The pattern can be an extended regular expression. The optional flags argument is a string that may contain any or all the following characters specifying how to perform matching: c: Case-sensitive matching. i: Case-insensitive matching. m: Multiple-line mode. n: The . character matches line terminators. REGEXP_LIKE(str,pat[,flags]) REGEXP_REPLACE Replaces occurrences in the string str that match the regular str specified by the pattern pat with the replacement string repl, and returns the resulting string. If str, pat, or repl is NULL, the return value is NULL. The optional flags argument behaves as it does in REGEXP_LIKE(). REGEXP_REPLACE(str,pat,repl[,flags]) REPLACE Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str. REPLACE(str,from_str,to_str) RIGHT Returns the rightmost len characters from the string str, or NULL if any argument is NULL. RIGHT(str,len) SPLIT_PART Returns the Nth substring of str split by sep. If N is zero or smaller, returns null. If there are fewer than N substrings after str has been split by the given separator, returns empty string. Example: SPLIT_PART(''a~|~bc~cb~|~d~|~e'', ''~|~'', 2) = ''bc~cb'' SPLIT_PART(str,sep,N) STR_CLEAN Replaces NULL values in str with the empty string, and removes all non-whitespace control characters. These are a subset of the operations performed by STR_NORMALIZE(). STR_CLEAN(str) STR_CONTAINS Returns TRUE if the string str contains the string search, FALSE if it does not, or NULL if either argument is NULL. STR_CONTAINS(str, search) STR_DIGITS Returns a new string consisting of only the digit characters in str in the order they appear. Converse of STR_REMOVE_DIGITS(). STR_DIGITS(str) STR_ENDS_WITH Returns TRUE if the string str ends with the string search, FALSE if it does not, or NULL if either argument is NULL. STR_ENDS_WITH(str, search) STR_NORMALIZE Normalizes str by performing the following operations: 1. NULL values in str are replaced with the empty string. 2. Leading and trailing whitespace characters are removed. 3. All other sequences of one or more whitespace characters are replaced with a single space ('' ''). 4. All non-whitespace control characters are removed. This is the more thorough version of STR_CLEAN(). STR_NORMALIZE(str) STR_REMOVE_DIGITS Returns str with all digit characters removed. Converse of STR_DIGITS(). STR_REMOVE_DIGITS(str) STR_STARTS_WITH Returns TRUE if the string str starts with the string search, FALSE if it does not, or NULL if either argument is NULL. STR_STARTS_WITH(str, search) STRING Returns the value of expr converted to a STRING. Locale, timezone, and date-format settings all affect this behavior. STRING(expr), CAST(expr AS STRING) SUBSTRING The forms without a len argument return a substring from string str starting at position pos. The forms with a len argument return a substring len characters long from string str, starting at position pos. It is also possible to use a negative value for pos. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos in any of the forms of this function. By default, the position of the first character in the string from which the substring is to be extracted is reckoned as 1. If any argument is NULL, returns NULL. SUBSTRING(str,pos), SUBSTRING(str FROM pos), SUBSTRING(str,pos,len), SUBSTRING(str FROM pos FOR len) UCASE UCASE() is a synonym for UPPER(). UCASE(str) UPPER Returns the string str with all characters changed to uppercase. UPPER(str)