# Tolerances and roundings

# Tolerances

A tolerance can only be assigned to column defined as a *value* (meaning that a tolerance can’t be applied to *key* columns) and, for the moment, is only supported for column defined with types *numeric*, *dateTime* and *text*.

## Tolerance for numeric values

### Absolute tolerance

If you apply a tolerance of 0.001 to a numeric column, two rows (with same keys) will be considered as equal if the absolute difference between them is less or equal to 0.001.

This is especially helpful in Business Intelligence when you need to compare two queries and you don’t bother about small differences. Another case in Business Intelligence is also when you customers give you some hints about values expected in reports, they are usually a bit imprecise.

It’s also possible to take some short-cuts and write directly in the *equalTo* tag the columns’ definition. This xml element has the following meaning: all columns are keys except the last one. This last column is a value column where a tolerance of 0.001 must applied.

If you want you can combine both notations. The definition provided in an xml element named column has always the precedence to the tolerance provided the xml element named *equalTo*. It means that following notation (for a result-set with 6 columns):

must be interpreted as: columns 0, 1 and 4 are keys, column 2 must be ignored, column 3 is a value with a tolerance of 0.001 and column 5 is also a value but with a tolerance of 10.

### Relative tolerance

You can express a tolerance, relative to the expected value, by the means of the *%* symbol. This only applies to *numeric* values. When comparing the expected and actual value, the comparer will apply a tolerance of the percentage defined.

To illustrate this, if you’ve two rows with values of 40 and 100, an absolute tolerance of 10 will allow the actual values to be respectively in the intervals [30;50] and [90;110]. A relative tolerance of 10% will change the intervals to [36;44] and [90;110].

The xml syntax requires a % in the attribute *tolerance*.

### Absolute and relative tolerance

Sometimes, you want to use a relative tolerance but you also want to bound thsi tolerance and express that this tolerance cannot be more (or less) than a specific value. It’s possible to achieve this by specifying the value in percentage and also the bound (min or max) between brackets for the xml attribute *tolerance*.

### One-sided tolerance

Sometimes, you want to specify that the tolerance should only be applied on the right or on the left of the expected value. In this case, you must specify the symbols *+* or *-* before the percentage of the absolute value.

## Tolerance for type “date and time”

If you want to specify the tolerance for *dateTime* columns, you must express the tolerance in days, hours, minutes, seconds and milliseconds. The correct syntax for two days and an half is

If the tolerance must be set to 15 minutes, you will write:

At the moment only *absolute tolerance* is supported and its not possible to apply *one-sided tolerance*.

Reminder, tolerances are only applied to *value* columns and never to *key* columns!

## Tolerance for type “text”

### Ignore case

You can specify that two values defined in a column with type *text* should be compared without taking care of the case. To achieve this, you must define an attribute *tolerance* and set it to *ignore-case*.

### Fuzzy Matching

You can define that two values of type ‘text’ should be similar and not exactly the same. Following algorithms are supported:

- Hamming Distance
- Jaccard Distance
- Jaro Distance
- Jaro-Winkler Distance
- Levenshtein Distance
- Longest Common Subsequence
- Longest Common Substring
- Overlap Coefficient
- Ratcliff-Obershelp Similarity
- Sorensen-Dice Distance
- Tanimoto Coefficient

For more information about these algorithms or about how it’s implemented, refer to the third-party library FuzzyStrings.

Use the attribute *tolerance* and specify the name of the algorithm and the threshold value.

If you wish, you can combine the different algorithms by seperating them with a comma. In this case you can’t specify a numeric threshold but must rely on one of the three predefined values: *strong*, *normal* or *weak*.

# Roundings

The roundings are another set of tools to express that two values are equal if they are close to each other. At the opposite of *tolerance*, *rounding* is applied to both expected and actual values. If after the rounding’s operation, the two values are strictly equal then the comparison will be positive (and else negative).

The roundings are related to .Net methods Round, Floor and Ceiling. The rounding rules will be the same that their corresponding equivalent in .Net.

The method used must be specified in the column xml definition by the means of values: round, ceiling and floor in the attribute *rounding-style*.

### Numeric columns

Nevertheless, the rounding methods are different in same points to their .Net equivalent. Each of them require a *step*. If the step is less than 1, the rounding will be applied to the decimal part of the value.

**Example 1**: For a value of 10.52912 with a step of 0.1, the rounding will return

- with a
*floor*style : 10.5 - with a
*round*style : 10.5 - with a
*ceiling*style : 10.6

**Example 2**: For a value of 10.52912 with a step of 0.01, the rounding will return

- with a
*floor*style : 10.52 - with a
*round*style : 10.53 - with a
*ceiling*style : 10.53

If the step is greater than 1, the rounding will be applied to the integer part of the value.

**Example 3**: For a value of 10529.12 with a step of 10, the rounding will return

- with a
*floor*style : 10520 - with a
*round*style : 10530 - with a
*ceiling*style : 10530

**Example 4**: For a value of 10529.12 with a step of 20, the rounding will return

- with a
*floor*style : 10520 - with a
*round*style : 10520 - with a
*ceiling*style : 10540

### Date and time columns

It’s also possible to apply the same kind of roundings for dateTime columns. The rounding must be less than or equal to one day.

**Example 5**: For a value of 2013-10-17 14:47:00 with a step of “1” (day), the rounding will return

- with a
*floor*style : 2013-10-17 00:00:00 - with a
*round*style : 2013-10-18 00:00:00 - with a
*ceiling*style : 2013-10-18 00:00:00

**Example 6**: For a value of 2013-10-17 14:47:00 with a step of “00:15:00” (15 minutes), the rounding will return

- with a
*floor*style : 2013-10-17 14:45:00 - with a
*round*style : 2013-10-17 14:45:00 - with a
*ceiling*style : 2013-10-17 15:00:00