Setup for Data Quality Rules engine
NBi can be used as a data quality rules engine. In this context, the supported assertions are equal-to (subset-of, superset-of), unique-rows, and row-count (all/no/single/some-rows).
The main difference between a data quality rules engine and an automated test framework lies in the reporting. With an automated testing tool, you usually don’t care about the details when a test succeeds. In contrast, a data quality rules engine must report, at a minimum, how many rows were successful. To achieve this, NBi lets you configure a few specific attributes that you wouldn’t use when targeting an automated testing tool.
A valid configuration for using NBi as a data quality rules engine sets the format to Json and the mode to Always (instead of the default values, Markdown and OnFailure respectively).
<nbi>
<failure-report-profile
threshold-sample-items="50"
max-sample-items="25"
expected-set="None"
actual-set="None"
analysis-set="Sample"
format="Json"
mode="Always"
/>
</nbi>
With these two settings, NBi always adds information about the tests, in JSON format, to the result file.
The JSON content is embedded in the message element of the XML file containing the results. The JSON document contains three items:
- actual: table information for the actual result-set
- expected: table information for the expected result-set (only for equal-to, subset-of, superset-of)
- analysis: information about one or more tables produced by the analysis
Depending on the type of constraint, the analysis item can contain tables such as unexpected-rows, missing-rows, duplicated-rows, …
Each table information contains the row count, the row count of the sampled result-set when sampling applies, the structure of the result-set (for each column, properties such as role, type, tolerance, roundings, …), and the rows. The value of each cell is represented as text, independently of the underlying type. This choice makes it possible to capture specific values such as (any), (value) or (empty).
<test-case name="NBi.NUnit.Runtime.TestSuite.Simple equalTo with Failure" description="" executed="True" result="Error" success="False" time="1.351" asserts="1">
<categories>
<category name="Execution" />
</categories>
<failure>
<message><![CDATA[NBi.NUnit.Runtime.CustomStackTraceAssertionException : {"expected":{"total-rows":2,"table":{"columns":[{"position":0,"name":"Column1","role":"KEY","type":"Text"},{"position":1,"name":"Column2","role":"VALUE","type":"Numeric"}],"rows":[["Alpha","1"],["Beta","3"]]}},"actual":{"total-rows":2,"table":{"columns":[{"position":0,"name":"Column1","role":"KEY","type":"Text"},{"position":1,"name":"Column2","role":"VALUE","type":"Numeric"}],"rows":[["Alpha","1"],["Beta","2"]]}},"analysis":{"unexpected":{"total-rows":0},"missing":{"total-rows":0},"duplicated":{"total-rows":0},"non-matching":{"total-rows":1,"table":{"columns":[{"position":0,"name":"Column1","role":"KEY","type":"Text"},{"position":1,"name":"Column2","role":"VALUE","type":"Numeric"}],"rows":[[{"value":"Beta"},{"value":"2","expectation":"3"}]]}}}}]]></message>
<stack-trace><![CDATA[<?xml version="1.0" encoding="utf-16"?>...]]></stack-trace>
</failure>
</test-case>
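To consume this output, the JSON document has to be separated from the exception name that precedes it in the message element. The following Python sketch is one possible way to do that; it is not part of NBi. The names `extract_report`, `summarize` and `SAMPLE` are hypothetical, the sample is a trimmed version of the result shown above, and the code only reads the failure/message element because that is the only location illustrated here.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed version of the result above (assumption: same layout as the
# documented sample). The CDATA wrapper is omitted for brevity;
# ElementTree returns the same text either way.
SAMPLE = """<test-case name="Simple equalTo with Failure" executed="True" result="Error" success="False">
  <failure>
    <message>NBi.NUnit.Runtime.CustomStackTraceAssertionException : {"analysis":{"unexpected":{"total-rows":0},"missing":{"total-rows":0},"duplicated":{"total-rows":0},"non-matching":{"total-rows":1}}}</message>
  </failure>
</test-case>"""


def extract_report(message_text):
    """Return the JSON document embedded in a message text, or None."""
    # The message starts with the exception type, so skip to the first brace.
    start = message_text.find("{")
    if start == -1:
        return None
    # raw_decode parses one JSON value from `start` and ignores trailing text.
    report, _end = json.JSONDecoder().raw_decode(message_text, start)
    return report


def summarize(result_xml):
    """Yield (test name, row count per analysis table) for each test case."""
    root = ET.fromstring(result_xml)
    for case in root.iter("test-case"):
        message = case.find("./failure/message")
        if message is None or not message.text:
            continue
        report = extract_report(message.text)
        if report is None:
            continue
        counts = {
            name: table.get("total-rows", 0)
            for name, table in report.get("analysis", {}).items()
        }
        yield case.get("name"), counts


if __name__ == "__main__":
    for name, counts in summarize(SAMPLE):
        print(name, counts)
```

Because every cell value is stored as text, any numeric comparison on the rows would need an explicit conversion based on the column type reported in the table structure.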