Survivorship

Survivorship defines which data survives in golden records.

A survivorship rule defines, for attributes in fuzzy-matched and ID-matched entities, how golden record values are computed. It is composed of:

  • A consolidation rule, defining how to consolidate values from duplicate records (detected by the matcher) into a single (golden) record.

  • An override rule, defining how values authored by users override the consolidated value in the golden record.

Consolidation strategy

The consolidation rule defines, using a consolidation strategy, how to choose the best values in the consolidation process.

The consolidation strategies are listed in the table below, along with the equivalent SemQL expression for each strategy.

Strategy | Description | Expression

Custom Ranking | The SemQL ranking expression is used to rank duplicate records, and the first value by rank is used. The expression is an ORDER BY clause and can specify ascending (ASC) or descending (DESC) order as well as the position of null values (NULLS FIRST or NULLS LAST). | [SemQL expression], PubID ASC, SourceID ASC

Largest/Smallest Value | Values are sorted using their type-specific sort method (e.g., alphabetical for strings). For example: Mozart is larger than Beethoven (M is after B in the alphabet). Note that binary attributes do not support this consolidation strategy. | [value] ASC or [value] DESC

Longest/Shortest Value | The lengths of the values are ordered. For example: Mozart is shorter than Beethoven (string size). | LENGTH([value]) ASC or LENGTH([value]) DESC

Most Frequent Value | The first most frequent non-null value. | Specific

Preferred Publisher | Publishers are manually ordered. The first one returning a value for the field is used. | Specific
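
As an illustration only, the value-based strategies (Largest/Smallest, Longest/Shortest, Most Frequent) can be approximated in Python as follows. This is not SemQL or product code: the function names are hypothetical, and skipping null values in the Largest/Smallest and Longest/Shortest functions is an assumption of the sketch rather than documented behavior.

```python
from collections import Counter


def _non_null(values):
    # Assumption of this sketch: nulls (None) never win a value-based strategy.
    return [v for v in values if v is not None]


def largest(values):
    """Largest Value: type-specific ordering, highest value first."""
    candidates = _non_null(values)
    return max(candidates) if candidates else None


def smallest(values):
    """Smallest Value: type-specific ordering, lowest value first."""
    candidates = _non_null(values)
    return min(candidates) if candidates else None


def longest(values):
    """Longest Value: ordered by length, longest first."""
    candidates = _non_null(values)
    return max(candidates, key=len) if candidates else None


def shortest(values):
    """Shortest Value: ordered by length, shortest first."""
    candidates = _non_null(values)
    return min(candidates, key=len) if candidates else None


def most_frequent(values):
    """Most Frequent Value: the first most frequent non-null value."""
    counts = Counter(_non_null(values))
    # most_common() lists equal counts in first-encountered order.
    return counts.most_common(1)[0][0] if counts else None


composers = ["Mozart", "Beethoven"]
largest(composers)   # "Mozart": M sorts after B
shortest(composers)  # "Mozart": 6 characters against 9
```

Each function takes the values of a single attribute, which mirrors the constraint that these strategies apply to one attribute per rule.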

For all strategies except Most Frequent Value, the ranking expression stores a SemQL expression used to sort records when the strategy alone leaves an ambiguity.
For example, an ambiguity arises when two duplicates from the same publisher have different values for a field and the Preferred Publisher strategy is used. The expression is an ORDER BY clause and can specify ascending (ASC) or descending (DESC) order as well as the position of null values (NULLS FIRST or NULLS LAST).
Only the Custom Ranking and Preferred Publisher strategies work for consolidation rules involving multiple attributes. If you want to use, for example, the Largest Value strategy for two attributes, then you must define two rules, one for each attribute.
A Skip Null option is available for the Custom Ranking and Preferred Publisher strategies to avoid null values in the consolidation process. For single-attribute rules, the first non-null value after ranking is consolidated. This option is not supported for multi-attribute rules.
If you do not want to configure the consolidation, use the Custom Ranking strategy with no ranking expression set.
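
The interplay between publisher order, the tie-breaking ranking expression, and the Skip Null option can be sketched in Python as follows. This is a simplified model, not the product's implementation: the MasterRecord class, the publisher codes, and the use of the source ID in ascending order as the tie-breaker (standing in for a ranking expression such as SourceID ASC) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterRecord:
    publisher: str          # hypothetical publisher code, e.g. "CRM"
    source_id: str          # ID of the record in the publisher's system
    value: Optional[str]    # value of the single consolidated attribute


def preferred_publisher(records, publisher_order, skip_null=False):
    """Return the value of the first record after ranking.

    Records are ranked by the manual publisher order, then by source ID
    ascending to settle ties within one publisher (assumed tie-breaker).
    Publishers missing from the order are ranked last.
    """
    position = {code: i for i, code in enumerate(publisher_order)}
    ranked = sorted(
        records,
        key=lambda r: (position.get(r.publisher, len(position)), r.source_id),
    )
    if skip_null:
        # Skip Null: keep the first non-null value after ranking.
        ranked = [r for r in ranked if r.value is not None]
    return ranked[0].value if ranked else None


duplicates = [
    MasterRecord("CRM", "2", "Wolfgang"),
    MasterRecord("CRM", "1", None),
    MasterRecord("ERP", "7", "Ludwig"),
]
preferred_publisher(duplicates, ["CRM", "ERP"])                  # None
preferred_publisher(duplicates, ["CRM", "ERP"], skip_null=True)  # "Wolfgang"
```

Without Skip Null, the top-ranked record wins even when its value is null; with it, ranking is unchanged but null values are passed over.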

Override strategy

The override rule defines, using an override strategy, how to overwrite consolidated values with values authored by the users.

The override strategies are listed in the table below.

Strategy | Description

Always Authored in the MDM | The value is always authored in the MDM application. Values coming from other sources (if any) are always ignored. This attribute remains null until a user explicitly changes the value. The attribute is therefore ignored for consolidation.

No Override | The value is always consolidated according to the consolidation rules. The application does not allow overriding data for this attribute.

Override - until consolidated value changes | Value override is allowed. Values coming from the publishers consolidate according to the consolidation rules. When an override is made, the value is maintained until the value consolidated from the publishers changes. When this happens, the new consolidated value wins against the authored value. The system reverts to the defined consolidation rules to arbitrate survivorship for the next value changes from the publishers.

Override - until next user changes | Value override is allowed. Values coming from the publishers consolidate according to the consolidation rules. When an override is made, the value is maintained until a user enters a new value or removes the override.
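
The four override strategies can be read as a small decision function. The Python sketch below is one possible model, not the product's implementation: the OverrideStrategy enum, the Override class, and the idea of remembering the consolidated value seen when the override was made are assumptions introduced to make the behavior concrete.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class OverrideStrategy(Enum):
    ALWAYS_AUTHORED = auto()
    NO_OVERRIDE = auto()
    UNTIL_CONSOLIDATED_CHANGES = auto()
    UNTIL_NEXT_USER_CHANGE = auto()


@dataclass
class Override:
    value: Optional[str]                     # value authored by the user
    consolidated_at_override: Optional[str]  # consolidated value at that time


def resolve(strategy, consolidated, override):
    """Return (golden value, override to keep for the next run)."""
    if strategy is OverrideStrategy.ALWAYS_AUTHORED:
        # Publisher values are ignored; null until a user authors a value.
        return (override.value if override else None), override
    if strategy is OverrideStrategy.NO_OVERRIDE or override is None:
        return consolidated, None
    if (strategy is OverrideStrategy.UNTIL_CONSOLIDATED_CHANGES
            and consolidated != override.consolidated_at_override):
        # The consolidated value changed: it wins and the override is dropped.
        return consolidated, None
    # The override stands until a user changes or removes it.
    return override.value, override


authored = Override(value="Amadeus", consolidated_at_override="Wolfgang")
resolve(OverrideStrategy.UNTIL_CONSOLIDATED_CHANGES, "Wolfgang", authored)
# ("Amadeus", authored): publishers still send the same consolidated value
resolve(OverrideStrategy.UNTIL_CONSOLIDATED_CHANGES, "W. A.", authored)
# ("W. A.", None): the new consolidated value replaces the override
```

Returning the override to keep lets a caller feed it into the next run, so an override dropped under "until consolidated value changes" does not come back if the publishers later resend the old value.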

Grouping attributes in survivorship rules

A survivorship rule applies to a single attribute or a set of attributes of a given entity.

When a rule has multiple attributes, all these attributes work together for the consolidation and override process:

  • Attributes in the group consolidate together with the same consolidation rule.

  • When a user overrides one of these attributes, all of them are considered overridden. In the user interface, when the Override button is clicked for one attribute, override is also toggled to the on position for all the attributes in the group.

  • When the override is removed (according to the override rule) from one of these attributes, all attributes in the group lose their override simultaneously.

When moving an attribute from one group to another governed by a different survivorship rule, any overridden value within this attribute causes all other attributes in the new group to be considered overridden as well. As a result, the values of these attributes may be nullified during the subsequent execution of the certification process, unless override values have previously been defined. To minimize potential negative impacts on data consistency and accuracy, designers should consider moving attributes with pre-existing overridden values to a group that does not include any other attributes.
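
The group behavior described above, including the risk of nullified values, can be pictured with the following Python sketch. It is a deliberately simplified model under stated assumptions: the AttributeGroup class and the attribute names are hypothetical, and treating an overridden attribute with no authored value as null is a reading of the warning above, not a description of the certification process itself.

```python
class AttributeGroup:
    """Attributes governed by one survivorship rule (hypothetical model)."""

    def __init__(self, attributes):
        self.attributes = list(attributes)
        self.overridden = False  # one override state shared by the group
        self.authored = {}       # user-authored values, keyed by attribute

    def override(self, attribute, value):
        if attribute not in self.attributes:
            raise KeyError(attribute)
        # Overriding one attribute marks the whole group as overridden.
        self.overridden = True
        self.authored[attribute] = value

    def remove_override(self):
        # Removing the override applies to every attribute in the group.
        self.overridden = False
        self.authored.clear()

    def golden_values(self, consolidated):
        if not self.overridden:
            return {a: consolidated.get(a) for a in self.attributes}
        # Assumption: an overridden attribute without an authored value
        # ends up null instead of falling back to the consolidated value.
        return {a: self.authored.get(a) for a in self.attributes}


address = AttributeGroup(["Street", "City"])
consolidated = {"Street": "1 Main St", "City": "Vienna"}
address.override("City", "Salzburg")
address.golden_values(consolidated)
# {"Street": None, "City": "Salzburg"}: Street has no authored value
```

In this model, an attribute that arrives in a group with an existing override behaves like City here, which is why isolating such attributes in their own group avoids affecting the others.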

The Default flag is defined for one and only one survivorship rule for an entity. The rule with the Default flag applies to all attributes not handled by any other rule of this entity.

Master ID survivorship rule

The Master ID survivorship rule is a specific rule, identified by its own icon.
This rule does not have an override strategy. It only defines how the ID of one of the master records attached to a golden record consolidates into this golden record.

This master record is preferred when references to the golden record need to be re-attached to a master record. For example, if the golden record splits, references to the golden record will preferably "follow" this master record.