Describe the bug
The output data produced by the link phase changes each time the model is run.
To Reproduce
Steps to reproduce the behavior:
- Follow quick start instructions:
docker pull zingg/zingg:0.3.4
docker run -it zingg/zingg:0.3.4 bash
- Change matchType in examples/febrl/configLink.json from 'exact' to 'fuzzy' (resolves an issue with this example in version 0.3.4 related to Issue 427)
- Run the 'febrl' model in link mode:
./scripts/zingg.sh --phase link --conf examples/febrl/configLink.json
- Examine the output files (/tmp/zinggOutput)
- Re-run steps 3 and 4 (without making changes to the configuration or input files) and observe different results. Results differ in the number of output rows, the subset of input datasets included in the output, and the z_score values.
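One way to quantify the difference between two runs is sketched below. It is only an illustration and makes several assumptions: that the link output is written as CSV part files, that /tmp/zinggOutput was copied to a separate directory after each run (the names run1 and run2 are hypothetical), and that comparing whole rows as text is good enough. The helper names load_rows and compare_runs are made up for this example and are not part of Zingg.

```python
import csv
from collections import Counter
from pathlib import Path


def load_rows(output_dir, pattern="*.csv"):
    """Read every CSV part file in output_dir into a multiset of rows.

    Assumption: the output is plain CSV; adjust the pattern or the parsing
    if the configured output format differs.
    """
    rows = Counter()
    for part in sorted(Path(output_dir).glob(pattern)):
        with part.open(newline="") as handle:
            for row in csv.reader(handle):
                rows[tuple(row)] += 1
    return rows


def compare_runs(first_dir, second_dir):
    """Summarize how two saved copies of the link output differ."""
    first = load_rows(first_dir)
    second = load_rows(second_dir)
    return {
        "rows_first": sum(first.values()),
        "rows_second": sum(second.values()),
        # Counter subtraction keeps only rows with a positive surplus.
        "only_in_first": sum((first - second).values()),
        "only_in_second": sum((second - first).values()),
    }
```

Calling compare_runs("run1", "run2") on the two saved copies would return zero for both only_in_* counts if the runs were identical. Because z_cluster labels may legitimately differ between runs, that column would probably need to be dropped from each row before comparing.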
Expected behavior
My expectation is that sequential runs without config or input file changes would produce identical results (except, perhaps, in z_cluster labels).