Whitepaper: Accelerate your business taxonomy mapping with VaultSpeed

Jonas De Keuster

13 January 2022|In Events, Knowledge|By Jonas De Keuster

Learn how the VaultSpeed automation tool is designed to transfer any business taxonomy you can think of into a raw Data Vault layer.
The Raw Data Vault (RDV) contains what Data Vault 2.0 calls the ‘Single Version of the Facts’. Facts are nothing more than the raw, historical, unfiltered data from the sources.

The Business Data Vault (BDV) aligns business keys/terms from the source system with the different business views in order to ensure compliance. Different viewpoints coexist and are all regarded by Data Vault 2.0 to be valid versions of the truth.

You’ll discover that these ‘versions of the truth’ and the ‘Single Version of the Facts’ can truly blend.




Automation² API, Matillion for Azure Synapse and Custom ETL settings (Release 4.2.6)

VaultSpeed is creating a habit of launching a major release just before the holiday season! This year is no exception, so 4.2.6 is loaded with Santa's gifts!

So, what's new?

We completely redesigned our API to make it publicly accessible and consumable. Matillion ETL for Azure Synapse is now available. And we added functionality to your data pipelines: you can now customize your ETL mappings and add additional code.

Automation² API

Our API has been substantially reworked. You can now call API endpoints for all the data and actions available in our application. We're proud of this achievement, which makes VaultSpeed the first tool to deliver a REST API for Data Vault automation. The API enables further integration with other tools and allows users to truly automate the automation: Automation².

REST API docs

 

With our API, you can start automating tasks such as:

  • the creation of a new source version
  • the configuration for similar sources
  • loading metadata into your preferred data lineage or data governance tool
  • the import of business view definitions
  • the migration of your existing Data Vaults into VaultSpeed
  • and much more

The screenshot below shows the setup for automatic agent download via the API.

 

Download the agent using curl to the API endpoint
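
For readers who prefer a script over curl, the same kind of call can be sketched in Python. This is a minimal sketch only: the base URL, the endpoint path, the token and the use of a bearer token header are all assumptions made for illustration, so take the real values from the REST API docs.

```python
import shutil
import urllib.request

# All three values below are hypothetical placeholders, not real VaultSpeed
# settings. Look up the actual base URL, path and authentication scheme in
# the REST API docs.
BASE_URL = "https://vaultspeed.example.com/api"
AGENT_PATH = "/agent/download"
API_TOKEN = "my-api-token"


def build_agent_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},  # bearer auth is an assumption
        method="GET",
    )


def download_agent(request: urllib.request.Request, target_file: str) -> None:
    """Send the request and stream the response body to a local file."""
    with urllib.request.urlopen(request) as response, open(target_file, "wb") as out:
        shutil.copyfileobj(response, out)


request = build_agent_request(BASE_URL, AGENT_PATH, API_TOKEN)
print(request.get_method(), request.full_url)
# download_agent(request, "agent.zip")  # uncomment once the real values are filled in
```

Building the request separately from sending it makes it easy to check the URL and headers before a scheduler runs the download unattended.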

 

Another example:

This screenshot illustrates how VaultSpeed metadata is extracted from the API using Matillion ETL for Snowflake. A schedule running this mapping would sync all Data Vault lineage and metadata straight into Snowflake!

 

VaultSpeed API to Snowflake mapping built in Matillion ETL

 

Data Vault metadata loaded in Snowflake

 

Not all endpoints will be included in our standard licenses, but some will always be available, such as downloading the Agent or the Airflow plugin.

Matillion Synapse

On the ETL side, we've added support for Matillion ETL for Azure Synapse. VaultSpeed now generates Matillion ETL code for Synapse Data Vaults.

Matillion users can automate the pipelines that load data in the Data Vault area and focus on tailor-made transformations in the other layers of the cloud data warehouse.

It is good to know that Matillion has just released CDC support. This opens the opportunity to land the data from different sources, making Data Vault's integration even more effortless.

Our current support for Matillion includes both Snowflake and Synapse, and we are looking to extend it to other cloud data platforms in the near future.

 

Generated SAT mapping in Matillion for Synapse

Data pipelines

This release contains a significant development to make your data pipelines run smoother.

VaultSpeed has offered the possibility to add custom code snippets to generated DDL code for quite some time now. Think of examples like DDL for transient tables or partitioning definitions.

We’re now allowing users to add custom code to the generated mappings as well. Depending on your preferred ETL solution, different settings can be applied.

The example below shows SQL procedures.

 

Example of custom ETL snippets added to procedures

 

The possibilities are endless: from changing execution grants by adding "Execute as owner" to adding custom logging statements with row counts after every DML statement.

The complete documentation can be found at https://vaultspeed.atlassian.net/wiki/spaces/VPP/pages/2701370816/Generating+Code#ETL-Settings.

Other important changes:

  • We added the possibility to define DDL settings for all the standard BV objects (no VaultSpeed Studio templates at this stage, those will be added later).
  • The initial load STG mappings not only use the extraction table but also the SATs to look up BKs. This comes in handy, mainly for delta generations when loading the initial data for a new object with references to an existing object.
  • We added two extra DV parameters: CAST_TO_NVARCHAR_IN_HASH and CAST_TO_VARCHAR_IN_HASH. These control the hashing behavior by determining which type the business keys are cast to before they are hashed. These parameters are beneficial for SQL Server and Synapse and are mutually exclusive.
  • A new logic applies to the BV release creation to catch the cases where bridges become invalid. When an object gets deleted from a Data Vault while being used in a bridge, the initial BV release created when locking that DV release will be unlocked. No code can be generated for this new DV release. It can only be rendered after resolving the issue in the bridge and locking the business vault. While there is still an invalid bridge in a BV, hovering over the (grayed out) lock button will display the faulty bridge.
  • Hard deletes can now be generated for ODI. The deletes are implemented using a setBeginCmd containing the delete SQL statement.
  • We updated our template language to allow for repeating templates. You can now generate a query for every SAT of a HUB or all DV objects in a bridge in the VaultSpeed Studio templates.

Example of the template code:

Template $ DVO_TEMPL 
templaterepeatedbycomponent DVO
...
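
To see why the two hash-cast parameters in the list above matter, remember that SQL Server and Synapse store NVARCHAR values as UTF-16 and VARCHAR values in a single-byte code page. The same business key therefore yields different bytes, and different hashes, depending on the cast. The sketch below imitates that effect in Python. It is an illustration and not VaultSpeed's generated code: MD5, the Latin-1 stand-in for the VARCHAR code page and the function name are assumptions.

```python
import hashlib

# Byte encodings that approximate the two SQL Server string types.
# Latin-1 is a stand-in for a single-byte VARCHAR code page (assumption).
ENCODINGS = {"NVARCHAR": "utf-16-le", "VARCHAR": "latin-1"}


def hash_business_key(business_key: str, cast_to: str) -> str:
    """Hash a business key after 'casting' it to the given string type."""
    raw_bytes = business_key.encode(ENCODINGS[cast_to])
    return hashlib.md5(raw_bytes).hexdigest().upper()  # MD5 is an assumption


key = "CUST-001"
as_nvarchar = hash_business_key(key, "NVARCHAR")
as_varchar = hash_business_key(key, "VARCHAR")

# Same key, different cast, different hash.
print(as_nvarchar == as_varchar)
```

Because the cast changes the hash, it should be chosen once and applied consistently across all sources that feed the same hub.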

So, lots of new stuff to play with for next year. All we want to do now is wish you happy holidays. More exciting features are coming in 2022.
Spoiler alert, some of them involve Spark Streaming!

 

 


VaultSpeed meets Matillion ETL (Release 4.2.5)

Building your cloud data warehouse just got a whole lot easier. Matillion ETL and VaultSpeed have teamed up to bring you automated, cloud-native data integration powered by Data Vault 2.0. This release brings you automated ETL generation for Matillion on Snowflake. Support for Azure Synapse is coming soon.

 

About Matillion ETL

Matillion helps data teams get insights and results faster with a cloud-native data integration platform. Its low-code, no-compromise approach turbocharges your data ingestion and transformation workflows while taking full advantage of every native capability in your cloud data platform of choice (Snowflake, Amazon Redshift, Delta Lake on Databricks, Google BigQuery, and Microsoft Azure Synapse).

Matillion is the only cloud-native ETL platform built for enterprise deployment with advanced collaboration, security and data sovereignty protection at its core. This helps data teams of all sizes get ahead of the curve, stay competitive, and deliver game-changing value back to their organizations.

Deliver Cloud integration systems Faster

VaultSpeed’s automation engine can now deliver generated ETL mappings for Matillion. This drastically decreases Time to Market for any cloud data warehouse project. It also increases the level of scalability.

Matillion supports an abundance of data source connectors so you can easily load and transform any type of source data into your cloud Data Warehouse or Data Lake.

Both Matillion and VaultSpeed support modern cloud data platforms such as Snowflake, Azure Synapse, Delta Lake on Databricks, Amazon Redshift and Google BigQuery. For the moment, the VaultSpeed integration with Matillion supports Snowflake only, with Synapse following soon.

The automated integration process starts by harvesting your source metadata into VaultSpeed and modelling your Data Vault 2.0 data model towards any business model. VaultSpeed will do the heavy lifting for you.

Afterwards, you are able to auto-deploy ETL code and workflows to the Matillion ETL repository and DDL towards your target environment.

 

 

Once you finish development, Matillion takes care of loading your data from source to target using VaultSpeed’s auto-generated mappings. Matillion’s ETL engine ensures excellent loading performance. You are able to schedule, run and monitor your data flows from within Matillion's interface.

You can additionally build custom business logic using Matillion’s intuitive ETL designer tools, and you can automate custom business rules into the solution by coding custom VaultSpeed Studio automation templates.

 

Matillion ETL mappings and flows generated by VaultSpeed

Other changes

We also added a few other cool features in this release:

  • You can now generate ETL for only a specific object. If you select a source object, VaultSpeed will generate all the code to load the resulting Data Vault objects.
    If you select a Data Vault object, we will give you all the code to load that specific object. You can limit further by also specifying a source (see example below). If a Business Vault object is selected, then only the code for that object will be generated.

 

  • We also added support for multi-active Satellites without a subsequence attribute. This is only available for objects with no CDC or no CDC incremental, since we need all the records per key to be delivered with each load.
  • We renamed the mapping counter to a more generic name: VaultSpeed Automation Units. This should avoid some confusion, since mappings are not the only product that you can generate with VaultSpeed. More information can be found in the documentation: VaultSpeed Automation Unit (VAU).
  • The format mask for source attributes is now a free text field instead of a selection menu. You can use this feature to easily convert data into the correct format if you like (e.g. a char to a date field). Note that this format string must be valid for the target database.
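
The selection rules from the first bullet above can be pictured as a simple filter over mapping metadata. The sketch below is only an illustration of those rules: the Mapping class, the object names and the select_mappings function are hypothetical and are not part of VaultSpeed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mapping:
    """One generated load mapping (hypothetical structure)."""
    source: str          # source system, e.g. "CRM"
    source_object: str   # table in that source
    target_object: str   # Data Vault or Business Vault object that gets loaded


def select_mappings(
    mappings: List[Mapping],
    source_object: Optional[str] = None,
    target_object: Optional[str] = None,
    source: Optional[str] = None,
) -> List[Mapping]:
    """Keep only the mappings that match every filter that was supplied."""
    selected = []
    for m in mappings:
        if source_object is not None and m.source_object != source_object:
            continue
        if target_object is not None and m.target_object != target_object:
            continue
        if source is not None and m.source != source:
            continue
        selected.append(m)
    return selected


ALL_MAPPINGS = [
    Mapping("CRM", "CUSTOMERS", "HUB_CUSTOMER"),
    Mapping("CRM", "CUSTOMERS", "SAT_CRM_CUSTOMERS"),
    Mapping("ERP", "CLIENTS", "HUB_CUSTOMER"),
    Mapping("ERP", "CLIENTS", "SAT_ERP_CLIENTS"),
]

# Source object selected: everything needed to load the resulting objects.
by_source_object = select_mappings(ALL_MAPPINGS, source_object="CUSTOMERS")
# Data Vault object selected: all loads for that object, from every source.
by_target = select_mappings(ALL_MAPPINGS, target_object="HUB_CUSTOMER")
# Data Vault object limited further to a single source.
by_target_and_source = select_mappings(ALL_MAPPINGS, target_object="HUB_CUSTOMER", source="ERP")
```

Selecting a Business Vault object corresponds to the target_object filter on its own: only the code for that one object is kept.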




VaultSpeed Studio Demo

Jonas De Keuster

23 August 2021|In Events, Knowledge|By Jonas De Keuster

Build custom templates to create and load objects in the Business Vault Area automatically.

 

VaultSpeed provides prebuilt templates that accelerate the development of the integration layer.
Our new module, VaultSpeed Studio, automates other parts of the data warehouse, such as pre-staging or presentation areas requiring custom logic. Think of converting prices to your local currency as an example.

Watch this demo to find out how easy we’ve made it to build, test, and put your custom templates to work to generate any logic required.




VaultSpeed and consultancy Datavault partner to empower organisations to accelerate Data Warehousing

Leading UK Data Warehouse consultancy chooses VaultSpeed automation tool to make their customers’ business more agile.

 

[Leuven, Belgium; Hampshire, United Kingdom] - August 16, 2021 – VaultSpeed, the data warehouse automation company, today announced a strategic partnership with Datavault, a leading consultancy, specializing in delivering data warehouse solutions to maximise organizations’ investment returns in Data Warehousing and to speed up Business Intelligence. VaultSpeed’s Data Warehouse Integration platform lets Datavault provide its customers a cost-effective, low-risk way to modernise their data infrastructure while leveraging their current investments in Data Warehouse technology and ETL-tools.

In order to compete with today’s disruptors, organizations of every size and industry are investing in data warehouses to facilitate analytical and reporting processes that help make data-backed routine and strategic business decisions. And while virtually every company employs a data-driven approach, they must overcome the complexities associated with combining data from different sources and handling large amounts of structured and unstructured data. Data Vault 2.0 modeling helps organize and prepare all the incoming data. The VaultSpeed platform is designed to automate these data integration processes, speed up implementation and increase agility when new sources are added.

Both Datavault and VaultSpeed provide solutions that are built on hands-on experience in many projects in different industries. This partnership enables us to focus on our core platform while Datavault solves for complex enterprise data warehouse challenges and reduces IT friction for customers.

Piet De Windt, CEO VaultSpeed

Consultancy Datavault has extensive experience deploying Data Vault 2.0 solutions and works with clients throughout the life-cycle of projects from strategic alignment, business case development, through design, implementation and ongoing operational support.

We are delighted to be partnering with VaultSpeed to deliver data warehouse automation solutions to save our clients both time and money when modernizing their data and analytics platforms using Data Vault 2.0.

Neil Strange, CEO Datavault

About VaultSpeed

VaultSpeed is a Belgium-based software company. Its data warehouse automation solution speeds up the process of data integration through a best-in-class tool built on the Data Vault 2.0 methodology. More and more companies worldwide rely on VaultSpeed to simply build and maintain their enterprise data hub. The tool connects with most popular ELT(ETL)-tools, source and target technologies. VaultSpeed has recently closed a €3.6 million Series A round led by Fortino Capital Partners. The company was carved out from the Cronos Group, which remains on board through its investment arm Co-foundry.

For more information, visit https://www.vaultspeed.com

About Datavault

Datavault is a specialist consultancy dedicated to helping clients achieve best practice in data warehousing, business intelligence, information governance and analytics.

The company is expert in delivering agile data and analytics solutions using the latest best practice in Data Vault 2.0, DataOps, Continuous Integration and testing. Services embrace the whole data platform lifecycle from strategy and business case preparation through to proof-of-concepts; architectural design; project delivery; coaching and training. A growing global blue chip client base comes from a range of industries including banking, insurance, retail and government.

For more information, visit https://www.data-vault.com/

Media Contacts:

Piet de Windt
CEO VaultSpeed
piet.dewindt@vaultspeed.com

Neil Strange
CEO Datavault
neil.strange@data-vault.com




WWDVC Keynote Video: Automation in the Real World

Jonas De Keuster

5 July 2021|In Events, Knowledge|By Jonas De Keuster

There are a lot of aspects that can make or break your data warehouse project.
Discover how to overcome challenges in cloud architecture, business requirements and time to market with case studies from Eurocontrol, Olympus and Bank Degroof Petercam.




WWDVC Roadmap Video: Survival of the Fastest

Jonas De Keuster

5 July 2021|In Events, Knowledge|By Jonas De Keuster

We are living in an environment that continues to evolve. That’s why we’re happy to share a sneak peek of our Roadmap and how we see the world of data warehouse automation evolving in the near and not-so-near future.



Service Desk 2.0

We updated our service desk! VaultSpeed offers out of the box data vault automation for which we include support in case of any issues. We’ve been working hard to improve our support desk with some great new features to ensure that it will be easier to find the help you’re looking for.

 

The new service portal

Support Levels

Service desk tickets are handled at 3 levels:

  1. First-line support: the entry support level where all information is gathered for your request and solved quickly using known solutions.
  2. Second-line support: if first-line support cannot quickly solve the request, more in-depth analysis is needed. The second line will also decide if development needs to be included.
  3. Third-line support: this is the development team that will dive into the code to fix the issue.

Support Types

Self Service

Enter your question in the search box and the system will automatically look for possible solutions. If the proposed documents do not help you in any way, choose whether you have an “Account Management” request or a “Technical Assistance” request.

Technical Assistance

Customers can request technical assistance through the service desk by reporting technical issues, suggesting improvements, making feature requests, etc.

Account Management

The service desk is also used for account management. You can use it to request a new user, remove a user, obtain billing information, request training, request on-site implementation assistance, etc.

Support License Plans

All VaultSpeed licences contain basic support and come with 24h response times. We also sell additional Support packages based on your needs.

 

Support pricing

 




Source copy, Databricks and Apache Airflow 2.0 (Release 4.2.4)

We’re back with a new release, and it is stuffed with new features.
We added support for Databricks and updated our Flow Management connector to work with Apache Airflow 2.0. VaultSpeed users can now also copy an entire source configuration. These, and many more changes, come with VaultSpeed R4.2.4!

Databricks

Run your Data Vault in the Databricks data lakehouse!
You are now able to generate and deploy Spark code to Databricks and run it with Airflow. The deployment will create Spark SQL notebooks in Databricks for all your Data Vault mappings. Airflow will launch those jobs, running the Notebooks. Integration with Azure Data Factory is coming soon.
The target Database type is still Spark, but the ETL generation type has to be set to Databricks SQL.

 

Airflow 2.0

Apache Airflow 2.0 brings a truckload of great new features like a modernized user interface, the Airflow API, improved performance of the scheduler, the TaskFlow API and others. VaultSpeed now supports Airflow 2.0. The VaultSpeed plugin for Airflow and all generated code have been reworked. All code will still work for previous Airflow versions. Just like before, once you’ve installed our plugin into your Airflow environment, Airflow becomes VaultSpeed aware. You’re able to generate and deploy workflows and run all the code needed to load your Data Vault.

 

Copy Sources

Users will also have the ability to copy existing sources. In some cases, an organization will need to integrate multiple sources that share a lot of similarities between them.

To give an example: Company ABC has the same version of their Sales CRM running in both Europe and the US. The only difference is that they have a few additional modules activated in the US.

Using the source copy functionality, they can now copy the entire source configuration from EU Sales to US Sales. All you need to do is identify and configure objects or settings that are specific only for the new source, but you can now skip all similar configuration you had already done for the EU source.

Using this functionality can obviously save a lot of time when integrating similar sources into your Data Vault model.

 

 

User Experience Improvements

The new release comes with a few other changes like a better screen to create a new data vault release. It has become a lot easier to indicate which version of which sources you would like to include in a specific data vault release. You can also choose to exclude certain sources from your release.

 

We also made it possible to mark objects in the source editor as completed. The completed objects will be highlighted in green. This status can be toggled by right-clicking on an object. The selection page can filter out completed objects, and there is also a button to remove all completed objects from the canvas. This allows you to track progress in your source modelling and get things organized.

 

Business Keys

We made it easier to change and re-order business keys in the Data Vault. We added a new screen where for each hub group the business keys of the grouped objects can be renamed and reordered, and the business keys of the hubs in the group can be reordered to match. So the keys in the different sources can now have different orders and names and still result in the same hash key calculation.
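
The effect of reordering can be illustrated with a small sketch. It assumes, purely for illustration, that a hub hash key is an MD5 over the business key values joined by a delimiter in the hub's key order. The column names, the delimiter and the function are hypothetical and do not reproduce VaultSpeed's generated code.

```python
import hashlib
from typing import Mapping, Sequence


def hub_hash_key(record: Mapping[str, str], key_columns: Sequence[str], delimiter: str = "|") -> str:
    """Hash the business key values in the order given by key_columns."""
    joined = delimiter.join(record[column] for column in key_columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Two sources describe the same customer with different column names and order.
crm_row = {"CUST_NO": "1001", "CTRY": "BE"}
erp_row = {"COUNTRY": "BE", "CLIENT_ID": "1001"}

# Per-source key order, arranged to match the hub: country first, then number.
crm_key_order = ["CTRY", "CUST_NO"]
erp_key_order = ["COUNTRY", "CLIENT_ID"]

crm_hash = hub_hash_key(crm_row, crm_key_order)
erp_hash = hub_hash_key(erp_row, erp_key_order)
print(crm_hash == erp_hash)
```

If one source kept its own order (for example number first, then country), the joined string would differ and the two rows would land on different hub keys, which is exactly what the reordering screen helps to avoid.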

 

 

We added a similar ability to reorder the linked hubs in many-to-many links (and non-historized links). In a separate screen, you can change the order of the hubs included in a many-to-many link or non-historized link.

Other changes

  • We renamed the “build flag” property to “ignored” everywhere in the application.
  • Added extra template variables for the custom deploy scripts in the agent. Instead of only the zip name, you can now also get the generation id, the generation info, and the generation type, similar to the git commit message functionality. Example:
    deploy.cmd = sh C:\Users\name\Documents\agent\deploy.sh {zipname} {code_type} "{info}"
  • The compare functionality in the source graphical overview will now skip ignored releases. This means that it will compare with the last non-ignored locked release before the current one.
  • We added support for overlapping loading windows to the Azure Data Factory FMC. This can be configured by using the following parameters: FMC_OVERLAPPING_LOADING_WINDOWS, FMC_WINDOW_OVERLAP_SIZE, FMC_WINDOW_OVERLAP_TYPE.
  • The metadata export has been converted to a task to support exporting data for very large Data Vaults. Previously, the export would time out and not return a file if it took too long.
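
To make the deploy.cmd placeholders from the list above more concrete, here is a minimal sketch of how such a command line expands. It uses Python's str.format as a stand-in for the agent's own substitution, which is not described here. The shortened script path and the sample values are hypothetical.

```python
import shlex

# Shortened, hypothetical command template using the placeholders from the
# deploy.cmd example: {zipname}, {code_type} and {info}.
DEPLOY_CMD = 'sh deploy.sh {zipname} {code_type} "{info}"'


def render_deploy_cmd(template: str, zipname: str, code_type: str, info: str) -> str:
    """Fill the placeholders; str.format stands in for the agent's substitution."""
    return template.format(zipname=zipname, code_type=code_type, info=info)


cmd = render_deploy_cmd(DEPLOY_CMD, "generation_42.zip", "DDL", "initial load for CRM")

# Split the command the way a POSIX shell would, to see the script arguments.
args = shlex.split(cmd)
print(args)
```

The quotes around {info} matter: the generation info is free text that may contain spaces, and quoting keeps it together as a single argument for the deploy script.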

More releases are coming!




Meet EON Collective: our new Integration Partner in North America

Important news from the partnership front! We have recently teamed up with EON Collective. EON is a group of highly experienced data professionals located in the USA and Canada. EON will act as an integration partner for VaultSpeed in the region.

We’re delighted to announce EON Collective as our newest integration partner in North America. EON have a strong focus on automation and are very familiar with Data Vault 2.0. Their expertise in data warehousing and data integration is impressive and we’re happy to team up with such a strong player.

Piet De Windt - CEO VaultSpeed

Every EONite team lead has over 20 years of experience in their discipline. They have all, at one time or another, worked for one of the world's largest consulting firms, and all understand that real change doesn't have to cost an arm and a leg. With that in mind, EON Collective's team developed the tools that lower the hours needed to bring you real results. They help organizations gain validated business insights faster and with greater flexibility, and help companies ensure business value through proven methodology and automated tools.

They are partnering up with VaultSpeed as their preferred solution for data warehouse automation:

We are very excited about being VaultSpeed's North American integration partner. Automation is a key component of any successful Data Vault implementation and we feel VaultSpeed's automation strategy in combination with the ADEPT methodology for Data Vault implementation is the perfect combination.

We are also looking forward to working with VaultSpeed as we start to integrate some of our ADEPT technology with the VaultSpeed solution.

Robert Scott - CTO EON Collective

The power of EON is having the collective capability to work alongside their clients utilizing the ADEPT Managed Solution. EON ADEPT links process model analysis and data-oriented analysis. In fact, ADEPT is not limited to automated process discovery based on event data. It also answers a wide variety of clients' performance and compliance questions based on the identified solution's operational metrics. ADEPT was built with the simple goal of greatly reducing the cost of consulting.

We should also mention that EON are joining us at the World Wide Data Vault Conference starting May 17th. Any questions on how to get started with VaultSpeed in their region and about the ADEPT integration can be addressed. We can highly recommend Keith Belanger’s keynote presentation “Is your Data Vault speaking your language?”.