Automate merge and confirmation

The Merge Policy and Auto-Confirm Policy available when creating a matcher allow you to automatically merge and confirm detected match groups in certain cases.

Merge policy cases

In the merge policy, you define a confidence score threshold above which the match group is automatically merged into a golden record.

For example, if you set the value for Create a golden record from new master records to 80, and a match group made up only of new master records appears with a confidence score higher than 80, it is merged into a new golden record.
If a match group of new master records appears with a confidence score of 80 or lower, it is not merged automatically. Instead, it is proposed to the data steward for merging.
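
The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name `is_auto_merged` is hypothetical, and the only behavior taken from the text is that the score must be strictly higher than the threshold.

```python
def is_auto_merged(confidence_score: float, threshold: float) -> bool:
    """Hypothetical helper: True when a match group is merged automatically."""
    # Strict comparison: a score equal to the threshold is not merged
    # automatically and is proposed to the data steward instead.
    return confidence_score > threshold


print(is_auto_merged(81, 80))  # True: merged into a golden record
print(is_auto_merged(80, 80))  # False: proposed to the data steward
```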

The various cases under which an automated merge may take place are listed and explained below:

  • Create a golden record from new master records: This is a frequent case, which occurs when new master records are loaded into the hub and matched/merged. Initial data loads fall into this case.

  • Merge unconfirmed golden records: This case occurs when existing master records attached to golden records that have not been confirmed are modified, possibly causing those golden records to merge. In this case, if two unconfirmed golden records merge, one ceases to exist, and the survivor may see its values modified.

  • Merge confirmed golden records: This case occurs when existing master records attached to golden records that have been confirmed are modified, possibly causing those golden records to merge. In this case, if two confirmed golden records merge, one of them ceases to exist, and the other one may see its values modified.

  • Merge unconfirmed with confirmed golden records: This case occurs when existing unconfirmed golden records are about to merge with a golden record that has been confirmed. In this case, unconfirmed golden records may cease to exist, and the confirmed golden record may see its values modified.

  • Add new master record to an unconfirmed golden record: This case occurs when a new master record is about to be merged with a golden record that has not been confirmed. In this case, the golden record values may change. This is typically the case for loads following the initial load.

  • Add new master record to a confirmed golden record: This case occurs when a new master record is about to be merged with a golden record that has been confirmed. In this case, the golden record values may change. This is typically the case for loads following the initial load.

  • Merge golden records previously split by the user: This case occurs when two groups manually split by a data steward re-match due to a new record matching both groups. In this case, existing golden records reviewed by the data steward may cease to exist.

A group may fall into several cases at once. When it does, its confidence score must exceed the thresholds of all these cases for the automated merge to happen.
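
The rule for groups that fall into several cases can be sketched as a check against every applicable threshold. The case keys, the threshold values, and the function `exceeds_all_thresholds` below are hypothetical names chosen for illustration. They are not product identifiers.

```python
from typing import Iterable, Mapping

# Hypothetical case keys and threshold values, for illustration only.
MERGE_POLICY = {
    "create_from_new_masters": 80,
    "add_new_master_to_unconfirmed": 85,
    "merge_unconfirmed": 90,
}


def exceeds_all_thresholds(
    score: float, cases: Iterable[str], policy: Mapping[str, float]
) -> bool:
    """True when the score is strictly above the threshold of every case."""
    case_list = list(cases)
    if not case_list:
        raise ValueError("a match group must fall into at least one case")
    return all(score > policy[case] for case in case_list)


# A group that both adds a new master record to an unconfirmed golden
# record and causes two unconfirmed golden records to merge.
group_cases = ["add_new_master_to_unconfirmed", "merge_unconfirmed"]

print(exceeds_all_thresholds(88, group_cases, MERGE_POLICY))  # False: 88 does not exceed 90
print(exceeds_all_thresholds(92, group_cases, MERGE_POLICY))  # True: above both 85 and 90
```

In this sketch, the strictest applicable threshold effectively decides whether the merge is automated.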

Auto-confirm cases

There are two cases for automatically confirming merged golden records:

  • When the match group’s confidence score is above a certain threshold, the resulting golden record can be automatically marked as confirmed.

  • Singletons, that is, golden records composed of a single master record, can be automatically confirmed. Note that singletons that have match suggestions (records matched, but with a score not high enough to merge automatically) are not automatically confirmed.

For example, if a match group with a confidence score of 80 was automatically merged, and the Auto-Confirm Golden Record threshold was set to 79, then this group is also marked as confirmed. If the threshold is set to 85, then this group is merged but not marked as confirmed.
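
The two auto-confirm rules can be sketched as follows. The function and parameter names are hypothetical. The sketch assumes that "above" means strictly higher than the threshold, as in the merge policy.

```python
def confirms_merged_group(score: float, auto_confirm_threshold: float) -> bool:
    """Hypothetical helper: is an automatically merged group also confirmed?"""
    return score > auto_confirm_threshold


def confirms_singleton(auto_confirm_singletons: bool, has_match_suggestions: bool) -> bool:
    """Hypothetical helper: is a singleton golden record confirmed automatically?"""
    # A singleton with pending match suggestions is never auto-confirmed.
    return auto_confirm_singletons and not has_match_suggestions


print(confirms_merged_group(80, 79))   # True: merged and confirmed
print(confirms_merged_group(80, 85))   # False: merged but not confirmed
print(confirms_singleton(True, True))  # False: match suggestions are pending
```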

Patterns for automating merge and confirmation

Pattern #1: No unmonitored change

In this pattern, the data steward reviews all records, and all of them are treated with the same importance. No change is made and no record is created without a user confirming it.

Solution:

  • Set all the values in the Merge Policy and Auto-Confirm Policy to 100

  • Un-select Auto-Confirm Singletons.

Pattern #2: No stewardship

In this pattern, the hub merges and confirms all content. Fixes will take place on demand on confirmed golden records.

Solution:

  • Set all the values in the Merge Policy and Auto-Confirm Policy to 0

  • Select Auto-Confirm Singletons.

Pattern #3: Delayed stewardship

In this pattern, the hub merges all content and confirms no record. The steward monitors and confirms all records after they are merged.

Solution:

  • Set all the values in the Merge Policy to 0

  • Set Auto-Confirm Golden Records to 100

  • Un-select Auto-Confirm Singletons.

Alternatively, to reduce stewardship overhead, you may select Auto-Confirm Singletons to avoid reviewing the singletons.

Pattern #4: Merge all then review suspicious matches

In this pattern, the hub merges all content but the steward must review suspicious matches.

Solution:

  • Set all the values in the Merge Policy to 0

  • Set Auto-Confirm Golden Records to 80 (adjust this value to your match rules scores)

  • Select Auto-Confirm Singletons.

The steward can then review the suspicious unconfirmed golden records (those with a confidence score of 80 or lower).

Pattern #5: Manually merge suspicious matches

In this pattern, the hub merges and confirms confident matches but the steward must manually merge others.

Solution:

  • Set all the values in the Merge Policy to 80 (adjust this value to your match rules scores)

  • Set Auto-Confirm Golden Records to 80 (adjust this value to your match rules scores)

  • Select Auto-Confirm Singletons.

Pattern #6: Prevent golden-record deletion

In this pattern, the hub must prevent golden records from ceasing to exist without the data steward’s approval, but allows other confident merge operations to take place automatically.

Solution:

  • Set Merge confirmed golden records, Merge unconfirmed golden records, Merge unconfirmed with confirmed golden records, and Merge golden records previously split by the user to 100.

  • Set other values in the Merge Policy to 80 (adjust this value to your match rules scores)

  • Set Auto-Confirm Golden Records to 80 (adjust this value to your match rules scores)

  • Select Auto-Confirm Singletons.
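
As an illustration of Pattern #6, the sketch below models the merge policy as a dictionary and evaluates a few match groups against it. The case keys and the function `merges_automatically` are hypothetical. The sketch also assumes that confidence scores never exceed 100, so that a threshold of 100 can never be exceeded.

```python
# Hypothetical keys standing for the seven merge policy cases.
PATTERN_6_MERGE_POLICY = {
    "create_from_new_masters": 80,
    "add_new_master_to_unconfirmed": 80,
    "add_new_master_to_confirmed": 80,
    # Cases in which a golden record may cease to exist are blocked.
    "merge_unconfirmed": 100,
    "merge_confirmed": 100,
    "merge_unconfirmed_with_confirmed": 100,
    "merge_previously_split": 100,
}


def merges_automatically(score: float, cases: list[str]) -> bool:
    """True when the score is strictly above the threshold of every case."""
    return all(score > PATTERN_6_MERGE_POLICY[case] for case in cases)


# A confident group made up only of new master records is merged.
print(merges_automatically(85, ["create_from_new_masters"]))  # True

# Even a near-certain merge of two confirmed golden records waits for the steward.
print(merges_automatically(99, ["merge_confirmed"]))  # False

# A new record that bridges two unconfirmed golden records falls into a
# blocked case as well, so the whole group is left to the steward.
print(merges_automatically(95, ["add_new_master_to_unconfirmed", "merge_unconfirmed"]))  # False
```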