Generative AI is changing the world.
Every company will be an AI company, and VaultSpeed is no exception. But what implications does this hold for our product, our team, our customers, and the future of data warehouse automation?
To delve into these questions, we sat down with key figures at VaultSpeed: Piet De Windt, CEO; Dirk Vermeiren, CTO; and Kurt Janssens, a machine learning expert who recently joined the team. Kurt has held practically every job function involving data and previously co-founded Brainjar, an AI consultancy firm.
"Every scale-up needs an AI strategy," asserts Piet De Windt. "Not just for fundraising or PR, but because of customer expectations and the revenue growth it generates. AI applications are giving our industry an enormous boost. Cloud data warehouses will process 75% of AI workloads in 2024."
“Companies looking to embrace AI and advanced analytics recognize the need for a shift in their data strategy too,” he adds. “Let’s face it, the battle will not be won by whoever has the best AI tooling. It will be won by whoever has the best data readily available to feed ever-larger models.
“We are already a key player when it comes to delivering a data foundation for advanced analytics and AI. VaultSpeed is complementary to solutions such as Alteryx or C3AI because machine learning models deliver better results when trained on the reliable data input that automated data transformation provides. VaultSpeed ensures that you have your data modeled and structured for learning before your competitors have theirs.”
Not yet for automated data transformation
“Even though AI reshapes industries,” Dirk Vermeiren explains, “it will not generate or adapt our template code. VaultSpeed templates are designed to deliver ETL and DDL code, and that code is built on years of data integration experience to navigate sheer complexity. We don’t want AI algorithms to modify in any way the million lines of code that our tool generates, because that code must remain 100% accurate. It is all about stability, adherence to standards, and performance optimization.
“AI is not a solution for automated data integration and transformation, and will not be anytime soon. However, AI can play a role in niche, specialized business areas with a more limited number of combinations and greater uniqueness, leveraging the right business taxonomy and ontology, recognizing patterns, and generating ETL for very business-specific cases. Moreover, AI can certainly support our engineering team in creating their code, acting as a fine-tuned co-pilot.”
Bridging human expertise with machine learning
Kurt Janssens emphasizes that AI can be pivotal in helping our customers use our tool better. He says, “VaultSpeed is the only data automation provider that stores all data on the usage of its tool. We have enough data to learn from and support our users in handling our tool in the best possible way.
“To illustrate this, the traditional way of setting up a data warehouse was to populate and model it gradually, source by source. VaultSpeed, however, starts with the bigger picture immediately, suggesting a comprehensive Data Vault model based on the metadata it harvests from all the sources.
“AI can help us implement a continuous heuristic approach for suggestions we are not 100% sure about. Each time, we ask the data modeler for feedback, as a form of continuous learning.
“AI can also help explain to the data engineer why our tool proposes a particular data model or particular object types.”
Towards autonomous data warehousing
“The end vision of VaultSpeed is autonomous data warehousing," says Piet De Windt. "We want to automate data integration and transformation as much as possible, going from 70%, which is where we are now, to ultimately 90 or 95%, so that data teams can free up time to deliver advanced analytics and data products that add business value. AI can generate additional metadata to make our tool more performant and more exact, pushing automation to its limits."
Dirk Vermeiren emphasizes, “AI will certainly play a crucial role in advancing towards autonomous data warehousing. We've identified various domains, such as NLP and metadata harvesting, and are currently analyzing our usage data to determine the most effective applications of machine learning in our product roadmap. Kurt's primary responsibility will be to validate our ideas and align them with the growing capabilities of generative AI. We're planning to move quickly and will announce our steps in the coming months.”