Proxy
Data Fabric provides a cache-enabled REST proxy to various external services. The list of currently proxied services is documented on the root API landing.
Purpose
The proxy serves a few primary purposes for Data Fabric clients, including:
- Central location for all data outreach (simplifying network egress controls and management).
- Ability to cache response data for reducing latency and operating in DDIL environments.
- Instrumented for observability to monitor all data request activity across all data consumer applications (clients of the proxy).
Enablement
Each proxied service must first be [enabled from the Catalog]({{< ref "/ui/catalog#enabling-a-data-source" >}}). During enablement, the Data Fabric user (or system account) provides the credentials necessary to authenticate with the target service.
The credentials are then stored securely in a Kubernetes Secret and used by Data Fabric to invoke the target service on behalf of the user (or system account).
If a client attempts to invoke a proxied service without first enabling it, Data Fabric will respond with a `407 Proxy Authentication Required` error code and a message indicating the service has not been enabled for that client.
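A client can treat this status as a signal that the service still needs to be enabled, rather than as a generic failure. The following is a minimal sketch, assuming the client already has the numeric status code and response message from whatever HTTP library it uses. `ServiceNotEnabledError` and `raise_if_not_enabled` are hypothetical names chosen for illustration and are not part of Data Fabric.

```python
from http import HTTPStatus


class ServiceNotEnabledError(Exception):
    """Hypothetical error: the proxied service has not been enabled for this client."""


def raise_if_not_enabled(status_code: int, message: str = "") -> None:
    """Raise ServiceNotEnabledError when the proxy answers with 407.

    Any other status code is left for the caller to handle.
    """
    if status_code == HTTPStatus.PROXY_AUTHENTICATION_REQUIRED:  # 407
        raise ServiceNotEnabledError(
            message or "Service is not enabled; enable it from the Catalog first."
        )
```

Calling `raise_if_not_enabled(response_status, response_text)` right after each proxied request keeps the enablement check in one place.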
Caching
One benefit of going through the Data Fabric proxy is its built-in cache. For any given request, the response from the upstream service is cached locally. On the next request for the same data, the proxy responds with the previously cached response. Once the response expires from the cache, the next request again attempts to fetch from the upstream service (and caches the new response).
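The lifecycle described above (fetch, cache, serve from cache, expire, fetch again) can be modeled with a small in-memory TTL cache. This is only an illustrative sketch and not Data Fabric's implementation: the `TtlCache` class, the single fixed TTL, and the use of the request path as the cache key are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy model of a read-through cache with a fixed time-to-live."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a cache key to (expiry time, cached response).
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, fetch_upstream: Callable[[str], str]) -> Tuple[str, bool]:
        """Return (response, from_cache) for the given key."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Entry is still fresh: serve it without contacting upstream.
            return entry[1], True
        # Missing or expired: fetch from upstream and cache the new response.
        response = fetch_upstream(key)
        self._entries[key] = (now + self._ttl, response)
        return response, False
```

The injectable `clock` parameter exists only so the expiry behavior can be exercised without waiting for real time to pass.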
Cached Responses
If a response came from the cache, the proxy will add a `Cache-Ttl` header to the response.
The value will be the time left until the response expires from the cache.
Cache-Ttl: 55m47s
The absence of `Cache-Ttl` in the response indicates the response was fetched from the upstream service.
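A client can use this header to tell cache hits from upstream fetches and to see how long a cached response remains valid. The sketch below assumes the value always follows the hours/minutes/seconds shape of the example (`55m47s`); the full set of formats the proxy may emit is not specified here, so the pattern is an assumption. `parse_cache_ttl` and `cache_ttl_remaining` are hypothetical helper names.

```python
import re
from datetime import timedelta
from typing import Mapping, Optional

# Assumed format, inferred from the example value: optional hours, minutes,
# and seconds parts such as "1h5m", "55m47s", or "30s".
_TTL_PATTERN = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?$")


def parse_cache_ttl(value: str) -> timedelta:
    """Convert a Cache-Ttl value such as '55m47s' into a timedelta."""
    match = _TTL_PATTERN.match(value.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"Unrecognized Cache-Ttl value: {value!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


def cache_ttl_remaining(headers: Mapping[str, str]) -> Optional[timedelta]:
    """Return the remaining TTL for a cached response, or None if the
    header is absent (meaning the response came from upstream)."""
    for name, value in headers.items():
        if name.lower() == "cache-ttl":  # HTTP header names are case-insensitive
            return parse_cache_ttl(value)
    return None
```

For example, `cache_ttl_remaining({"Cache-Ttl": "55m47s"})` yields a `timedelta` of 55 minutes and 47 seconds, while a header mapping without `Cache-Ttl` yields `None`.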
Controlling the Cache Behavior
There may be scenarios where you want to invalidate or skip over the cache for a particular request (testing a new upstream service, forcing a cache update, etc.).
To do this, the Data Fabric proxy supports an optional `Cache-Mode` request header, which can have one of the following values:
- `Default` - Default cache behavior (same as omitting the `Cache-Mode` header altogether).
- `None` - Ignore the cache completely (strictly pass through to the upstream service). Useful for testing whether the upstream is currently available without impacting the cache.
- `Only` - Only return the cached response. If no cached response exists, returns `404`. Useful for avoiding any external outreach attempts.
- `Invalidate` - Ignore any existing cached response and fetch from the upstream service, then update the cache with the new response. Useful for purging the cache of old data.
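The header values listed above can be captured in a small enum so that a typo in the mode name fails fast on the client side instead of reaching the proxy. This is a sketch under stated assumptions: `CacheMode` and `with_cache_mode` are hypothetical names, and how the proxy treats unrecognized values is not covered here.

```python
from enum import Enum
from typing import Dict, Mapping, Optional


class CacheMode(str, Enum):
    """The documented Cache-Mode request header values."""

    DEFAULT = "Default"
    NONE = "None"
    ONLY = "Only"
    INVALIDATE = "Invalidate"


def with_cache_mode(
    headers: Optional[Mapping[str, str]] = None,
    mode: Optional[CacheMode] = None,
) -> Dict[str, str]:
    """Return a copy of the headers with Cache-Mode set.

    Passing mode=None omits the header, which is equivalent to Default.
    """
    merged = dict(headers or {})
    if mode is not None:
        # CacheMode(...) raises ValueError for anything outside the four values.
        merged["Cache-Mode"] = CacheMode(mode).value
    return merged
```

The resulting dictionary can be passed as the request headers to any HTTP client, for example `with_cache_mode({"Accept": "application/json"}, CacheMode.INVALIDATE)` to force a refresh of a cached response.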