Point-in-time tables (Release 4.1.7.)

With the release of 4.1.7., we moved further into the Business Vault. We already had business views.Pits and bridges were the next objects to include.
Pits are included in this release and will be discussed below.

In the Data Vault 2.0., pits are used as an easy to query combination of one entity in your Data Vault.

If you have a hub with multiple sats, by having sat splits or just by combining multiple sources, a pit will reflect that information at certain points in time. In the application itself there are multiple possibilities to define those time intervals.

A quick overview of how to define pits in the tool.

The pits are located beneath the Business Vault tab in the menu.

For one Data Vault release, one or more Business Vault releases can be created.

The pit screen itself in the application will have two overviews. One with the Dv objects related to the pits and one with the different Pit objects. The pit object can be seen as a configuration on how to set up the pits and can be reused for multiple hubs, links and direct links.

In the application when wanting to add a pit with a certain configuration, just select the hubs, links and direct links and click the +PIT button to add a new configuration. If you already have an existing configuration, you can just click ‘Add to PIT’.

The pit configurations we support at the moment are detailed and snapshot pits.

This means that if you choose Detailed, all the different dates that are found in the selected date column will be the different points in time to be included. If you choose snapshot, you will have the possibility to define an interval, expressed in seconds, minutes, days, weeks, months and years.

Last possibility is the selection of the column which is the base for the interval processing.

When having configured a pit, your BV release can be locked.

The business vault release will contain the metadata needed to create the business views and the pit tables which are based upon a snapshot date view.

The DDL generation itself will generate the code for the pit tables. The ETL generation will generate the Business Views, the Snapshot date views and the functions to load the pit tables. The views will be used for initial and incremental load. The Pit tables will have to mappings per object. One for the initial load and one for the incremental load.

When generating, an extra drop list with all your business vault release will be available.

The BV itself can now also be generated separately.


Release version 4.1.0: linking entities over and between sources

As of version 4.1.0. Vaultspeed can work with multi-master tables within a single source. 

One of the most powerful Data Vault features is connecting tables from different sources and within sources. In Vaultspeed a table can be single-master or multi-master and it can be a slave of another table when grouped.

About single-master and multi-master hubs 

Hubs, i.e. the entities that contain keys, can be either single master of multi master: 

  • Single-master hubs store one source without any extra sources or source object business key fields. 
  • Multi-master hubs store an extra field to identify from which source the data comes or, in case of a multi-master hub within one source, from which source table the data comes.

The Vaultspeed master orchestration

In Vaultspeed you can define on source level if a table/hub is single master or multi master. This makes it easier to add another source to the same hub for a next release: there is no need to migrate and recalculate the hash keys because a source business key field is already included.

If an original setup did not account for this, Vaultspeed will still recognize when a value is changed to multi master in a next release. A migration ELT-mapping is then automatically generated to recalculate the hash keys with the extra source business key.

Defining a single- or multi-master hub is just groundwork for the ‘real deal’.

Hub group management

Hubs connect multiple sources to a Data Vault. These building blocks can store keys from different source tables identified with a source business key. Grouping entities with the same meaning creates one hub only.

The group predictor integrated in the Hub Management tool connects objects from different sources with the same name automatically. (Any unwanted results can always be unlinked.) Other objects with different names can be grouped together manually. 

In the following example SRC3 contains the table employees, SRC2 contains the table persons. Content-wise, these two groups belong to the same entity and are grouped by dragging and dropping the hub_persons from the unlinked hubs canvas to the group canvas.

Unlinked hubs on the left, grouped hubs on the right

Persons of SRC2 is grouped together with employees of SRC3

The impact of the grouping is: 

  • The tool knows that the entities “employees” and “persons” are the same.
  • One hub is generated with a source business key. But every entity gets its own satellite.

From now on only tables from the same source can be grouped. The big difference between grouping over and grouping within a source is the extra element that is needed to identify the source: the source entity name.

An example:

  • Grouping over sources: SRC2 (source business key: S2) object persons – SRC3 (source business key: S3) object employees are grouped in ‘EMPLOYEES’

When loading from source object persons:

    • Source business key Value = ‘S2’
    • Satellite name = SAT_S2_EMPLOYEES

When loading from source object employees: 

    • Source business key Value = ‘S3’
    • Satellite name = SAT_S3_EMPLOYEES
  • Grouping within a source: SRC2 (source business key: S2) object persons – SRC2 (source business key: S2) object employees are grouped in ‘EMPLOYEES’

When loading from source object persons:

    • Source business key Value = ‘S2_PERSONS’
    • Satellite name = SAT_S2_PERSONS_EMPLOYEES

When loading from source object employees: Source business key Value = ‘S3’

    • Source business key Value = ‘S2_EMPLOYEES’
    • Satellite name = SAT_S2_EMPLOYEES_EMPLOYEES

Hub master-slave management

For hubs grouped together, a relationship can be defined. One golden rule: ensure that all business keys in one object cover all business keys in another one. 

An example: the employees table has all the business keys of the persons table. This means that the employees can be master and the persons can be slave. In the ETL logic this means that only one mapping is generated to load the master. Since the slave has the same hash keys, there is no need to load them. 

But each entity will have its own satellite and can have its own unique descriptive data. If both entities are master, both are loaded into the hub with their own source identifier.

In the Hub Management tool, a source can be made slave or master for all tables at once or on group level. 

Master/Slave management on source level

Master/Slave management on group level

Hub business key concatenation

When the hubs are grouped and the master/slave relation is set, the concatenation of the business keys can be edited. 

Different entities can have the same or different business keys. In a master/slave relationship, only the master keys are loaded into the hub. For master-master relationships with the same entities but different keys, a concatenated business key can be set. This will store the different business keys in one column.

An example: 

  • The persons table identifies a person using its first and last names.
  • The employees table identifies a person using an employee_id. 

In this case a concatenated business key makes sure that first and last names stay connected and stored together with the the employee_id. In the business vault a link is created to keep the employee_id and the first and last names together. 

When both entities are using a same identifier, but just have different data, concatenation is an option. In the Name management tool, this can be controlled through a toggle switch. 

Be aware that the concatenation is only possible for multi-master hubs. In the same screen the group itself can get another short name and an abbreviated name. Short names are used to create the link name e.g. LNK_COUN_EMPL, the abbreviated names are used for the hub and satellites.