Tech Blog | insightify.io

January 07, 2026

OpenTelemetry, Langfuse, LLM Observability, Google IAP, Cloud Infrastructure

Hacking OpenTelemetry: dynamic auth for Langfuse behind Google IAP

We recently needed full LLM observability and immediately hit a constraint: our infrastructure sits behind Google Identity-Aware Proxy (IAP), and its tokens die every hour.

The solution we are happy with, at least for the time being ;), is an ugly, yet reliable, monkey-patching workaround. We decided to share it as it might come in handy for you as well.

Integrating observability tools like Langfuse is usually trivial. You drop in your API keys, add a decorator, and watch the traces roll in. The SDK handles the rest, batching spans and sending them over HTTP.

But what happens when your infrastructure is locked down? Things start to fall apart.

In our environment, every request, including background traces sent by the Langfuse SDK, requires a fresh OIDC token in the Proxy-Authorization header to pass through Google Identity-Aware Proxy (IAP).

The catch? Google tokens expire after 1 hour.

It is important to note that this challenge is specific to Langfuse SDK v3, which fully adopted OpenTelemetry (OTEL) as its backbone. While older versions (v2) allowed for simpler, direct control over the HTTP client, v3 delegates data transmission to the standard OTLPSpanExporter. This architectural shift brought powerful features like auto-instrumentation, but it came with a trade-off: rigidity.

The standard OTEL exporter initializes headers once at startup.

At minute 61, the token expires. The background exporter starts hitting 401 Unauthorized, and we lose all data until the service is manually restarted.

We needed a way to rotate headers inside a running session, without restarting the container.

Since the SDK doesn’t natively support a token refresh callback, we had two choices:

1. Fork the SDK: A maintenance nightmare.
2. Monkey patching: Modifying the library's internal behavior at runtime.

We chose option 2. As we said, it isn't pretty, but it works.

Monkey patching the OTLP exporter

We decided to intercept the export method of the underlying OpenTelemetry (OTEL) exporter. The plan was to check the token age before every batch export and refresh it if necessary.

1. Capturing the original method

First, we grab a reference to the original, unmodified code. We do this at the module level to ensure we have the clean version before any other initialization runs.

2. The dynamic wrapper

We created a wrapper function, _dynamic_header_export, and this is where the ugly magic happens. A simple singleton config object tracks last_refresh_time and the cached token.

3. Applying the patch

Finally, during our app initialization (before any traces are generated), we swap the methods.

Why is this "ugly" (and necessary)?

Let's be honest about why this is a hack:

Internal dependency: We are relying on OTLPSpanExporter having a _session attribute. If OpenTelemetry refactors its code in a future update, this will break.
Global state: We are manipulating global module state to swap methods.
No guarantee: This bypasses the public API contracts of the library.

Conclusion?

Ideally, native support for dynamic headers would exist – in fact, PR #1318 in the Langfuse repo proposes this exact feature, but it remains unmerged due to low community interest 🙂.

Until that lands, this workaround is our bridge. It allows us to maintain strict security (short-lived tokens) without sacrificing observability.

And while monkey-patching is fragile, it solves a critical infrastructure problem transparently. The rest of our application code (@observe decorators) remains completely unaware that this complexity exists – which, in the end, is exactly what you want from a temporary fix.