
January 07, 2026

Hacking OpenTelemetry: dynamic auth for Langfuse behind Google IAP

When Langfuse runs behind Google Identity-Aware Proxy, short-lived tokens can silently break OpenTelemetry exports. This article shares a pragmatic monkey-patching workaround to enable dynamic authentication and preserve full LLM observability without restarting services.

August 12, 2025

Decentralized but Never Disordered: Bruno Freitag’s Principles for Practical Data Mesh

Bruno Freitag, author of 'Data Mesh Design,' shares pragmatic strategies for building scalable, modular, and semantically consistent data mesh architectures. Discover how to avoid overengineering, maintain governance without gatekeeping, and deliver business value fast.

July 15, 2025

From Decentralized Data to Autonomous Agents: A Talk with Eric Broda

Eric Broda, author of 'Implementing Data Mesh' and the upcoming 'Agentic Mesh,' shares his insights on the evolution from data mesh to agent-driven systems. Explore cultural and technical challenges, provisioning dynamics, and what it takes to build enterprise-ready autonomous agents.

November 11, 2024

Why Your Business Loses if Data Lake is Not in Apache Iceberg

Explore how Apache Iceberg transforms data lake management by addressing the limitations of traditional data storage, including data consistency, schema evolution, and query performance. Learn how Iceberg’s advanced features enhance reliability and efficiency.

October 25, 2024

DataOps or DataOops? How to Avoid Pitfalls in Data Mesh Teams

In decentralized data environments like data mesh, DataOps is essential for streamlining workflows but can also introduce challenges. This article explores how to balance autonomy and governance while maintaining efficiency. Discover key strategies and tools to navigate the complexities of DataOps and keep your data pipelines consistent and secure.

October 16, 2024

When Terraform Turns Terraterror, or About Why DSL Beats Custom Terraform Scripts

In today's data-driven organizations, especially those embracing the principles of the data mesh architecture, flexibility is on every manager’s PowerPoint slide. The need for scalable, adaptable, and autonomous infrastructure is at an all-time high.

October 7, 2024

Sustainability in the Cloud: How Cloud Manufacturing Supports Green Initiatives

Whether you follow Elon Musk down the Ice Age rabbit hole, see climate change as part of bigger cycles we can barely influence, or think the environmental shifts happen mostly due to our interaction with the planet, one thing is for certain: sustainability is one of the most sought-after grails of technological advancement and will remain so for decades to come. And yes, it does matter.

August 27, 2024

Impossible or Inevitable? - Anticipating Fully Autonomous Systems in Cloud Manufacturing

Cloud Manufacturing allows seamless integration of AI and machine learning into production processes, enabling automated decision-making and process optimization. That may sound lackluster, but according to Gartner, AI-driven automation can lead to a 30% increase in efficiency by reducing manual errors and optimizing resource use, and that’s big!

August 5, 2024

Cloud Manufacturing: The Future of Industry 4.0

Throughout history, humanity has leveraged technology to improve our species’ living conditions, marking significant shifts in development during four industrial revolutions. Most of us understand that the trajectory of human development leans on new technologies, and that our century is steeped in ideas that are currently transforming our lives en masse. And even though some of us still like reenacting our hunter-gatherer story in various ways, the majority agree we will be better off staying on the tech wagon.

May 9, 2024

Data Platforms vs Large Datasets— Dremio & Snowflake match

As much as we like Dremio and the open source community around it, we need to point out that the company advertises its platform as capable of performing well without the need for data movement, which is a great feature if you have data available in different formats and locations, right? The thing is that, based on our experience, the claim only holds for relatively small datasets. To work at scale, a bumpier road has to be taken: one needs to build pipelines that convert the data to Apache Iceberg. By no means a huge hindrance, but a slight hiccup, perchance.

March 27, 2024

Domain-Driven Data Architecture: The Data Mesh Concept in Practice

An American futurologist, Robert Anton Wilson, used to say that “the measure of a system’s viability is the measure of information propagation”. In tech, data, understood as information that has been translated into a form that is efficient for movement or processing, is the most crucial element in all the systems we are building, and will be building, in the foreseeable future. In this piece we will take a look at a new way of handling data and at the decentralized environments built to better disseminate it across our organisations: the data mesh.

March 1, 2024

A small piece from a huge world of Large Language Models… or, what can LLMs do for You?

Imagine a company swamped with thousands of customer messages handled solely by customer support teams. Truth be told, one does not have to imagine, as we’ve all been there, either as employees or as customers awaiting responses that should have been provided off the cuff. Luckily enough, with the rise of smart solutions that quickly sort out and understand all the customers’ messages, and even respond to each customer personally, we’ve reached the threshold of the ultimate live chat experience.

February 23, 2024

Not your typical Data Warehouse — a tech dive into modern data platforms

Data warehouses and data lakehouses are crucial components of modern data management, providing a centralised repository for storing and managing large volumes of structured and unstructured data. In this article, let us take a look at BigQuery and Snowflake, which are not your typical cloud-based data warehouses, and Dremio — a platform that facilitates access to data from various sources without the need for data movement, distinguishing it from traditional warehouse solutions.