
RET: Intelligent Data Retrieval for Dataspaces

  • Laura Gavrilut
  • 2 days ago
  • 3 min read

In modern dataspaces, accessing and using data from providers should be straightforward, but technical barriers often prevent non-technical users from taking full advantage of available data. Data consumers face challenges in composing complex REST calls with specific parameter formatting requirements, and retrieved data may not match their expected schema or format. This is where the DS2 RET (Data Retrieval) module comes in.


What is RET?

RET is an intelligent data retrieval module that uses machine learning technology to help data consumers obtain and use data from dataspace providers. It bridges the gap between data availability and data usability by automating the complex technical steps required to access, transform, and consume data.


At its core, RET leverages Large Language Model (LLM) technology to understand OpenAPI specifications and automatically compose correct REST calls, format parameters, and apply necessary data transformations. The module guides users through parameter input, executes data retrieval, and optionally transforms the data to match target application requirements.


RET enables participants to:

  • Access provider data through an intuitive GUI without technical REST expertise

  • Automatically format complex parameters like timestamps to required specifications

  • Retrieve data and save it to files or pipe it directly into applications

  • Transform data schemas automatically to match target application requirements

  • Execute the complete data acquisition pipeline with minimal manual intervention


Why RET?

Dataspaces promise to make data more accessible, but technical complexity often stands in the way. Data providers typically offer data through REST APIs that require specific parameter formats—such as RFC 3339 timestamps or entity IDs—which can be challenging for non-technical users. Additionally, retrieved data may use different field names, units, or formats than what the consumer's application expects.
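As a small illustration of the formatting burden, the sketch below converts a date typed in a simple form-style format into an RFC 3339 timestamp. The input format and the helper name to_rfc3339 are assumptions made for this example and do not come from RET itself.

```python
from datetime import datetime, timezone


def to_rfc3339(value: datetime) -> str:
    """Format a datetime as an RFC 3339 timestamp in UTC."""
    if value.tzinfo is None:
        # Assumption for this sketch: naive datetimes are treated as UTC.
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# What a user might type into a form field (hypothetical input format).
user_input = "2024-03-01 14:30"
parsed = datetime.strptime(user_input, "%Y-%m-%d %H:%M")
timestamp = to_rfc3339(parsed)
print(timestamp)  # 2024-03-01T14:30:00Z
```

A provider API that expects RFC 3339 would typically reject the raw input string, which is exactly the kind of detail a non-technical user should not have to know.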


Without intelligent tooling, data consumers must manually:

  • Study OpenAPI specifications to understand parameter requirements

  • Format parameters correctly (e.g., converting natural language dates to specific timestamp formats)

  • Compose and test REST calls

  • Handle errors and retry with corrections

  • Transform retrieved data to match their application's schema

  • Integrate the data into their workflows


RET addresses these challenges by providing an AI-powered interface that handles the technical complexity automatically. Using LLM function calling technology, RET interprets OpenAPI specifications, guides users through parameter input in natural language, composes properly formatted REST calls, and even selects appropriate transformations to align source and target data schemas.
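To make the function calling idea more concrete, here is a minimal sketch of how one OpenAPI operation could be turned into a tool definition that an LLM can fill in. The article does not say which LLM or tool format RET uses, so the operation (getMeasurements), its parameters, and the output shape are all hypothetical; the shape only resembles the name, description, and JSON Schema structure that several LLM APIs accept.

```python
import json

# Hypothetical fragment of a provider's OpenAPI specification.
openapi_operation = {
    "operationId": "getMeasurements",
    "summary": "Return measurements for one entity in a time window",
    "parameters": [
        {"name": "entityId", "in": "query", "required": True,
         "schema": {"type": "string"}},
        {"name": "from", "in": "query", "required": True,
         "schema": {"type": "string", "format": "date-time"}},
        {"name": "limit", "in": "query", "required": False,
         "schema": {"type": "integer"}},
    ],
}


def operation_to_tool(operation: dict) -> dict:
    """Map an OpenAPI operation to a generic function-calling tool definition."""
    properties = {}
    required = []
    for param in operation.get("parameters", []):
        # Reuse the parameter's JSON Schema so type and format hints survive.
        properties[param["name"]] = dict(param["schema"])
        if param.get("required"):
            required.append(param["name"])
    return {
        "name": operation["operationId"],
        "description": operation.get("summary", ""),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


tool = operation_to_tool(openapi_operation)
print(json.dumps(tool, indent=2))
```

Because the "format": "date-time" hint is carried over from the specification, the model has the information it needs to produce a correctly formatted timestamp from the user's natural-language input.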


By automating these technical steps, RET helps participants move from complex API interactions toward simple, guided data access that works for both technical and non-technical users.


RET Architecture Overview

The RET module consists of two main components: a graphical user interface (GUI) and an intelligent backend that handles program logic. The module is available in both web-based and local GUI versions to support different deployment scenarios.


Across the GUI and the backend, the RET architecture provides several key capabilities:

  • GUI Layer, which displays available data sources and guides users through parameter input without requiring knowledge of formatting requirements

  • LLM-Powered Backend, which uses Large Language Model technology for function calling—composing REST calls from user parameters based on OpenAPI specifications

  • Automatic Error Correction, which feeds failed REST calls back to the LLM with error messages, requesting corrected versions (up to three attempts)

  • Transformation Engine, which automatically selects and applies data transformations to align source data with target application schemas

  • Pipeline Executor, which manages the complete flow from data retrieval through transformation to application execution or file storage

  • OpenAPI Integration, which consumes OpenAPI specifications to understand data source requirements and returned data schemas
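The automatic error correction described in the list above can be sketched as a simple retry loop. This is an illustration under stated assumptions, not RET's actual code: the LLM and the provider are replaced by stand-in functions (fake_compose, fake_execute), and "up to three attempts" is interpreted here as three calls in total.

```python
MAX_ATTEMPTS = 3  # the article mentions up to three attempts


class CallFailed(Exception):
    """Raised when the provider rejects a REST call."""


def retrieve_with_correction(compose, execute, user_params):
    """Compose a call, run it, and feed any error back for a corrected retry."""
    error = None
    for _ in range(MAX_ATTEMPTS):
        # On retries, the previous error message is passed to the composer.
        request = compose(user_params, error)
        try:
            return execute(request)
        except CallFailed as exc:
            error = str(exc)
    raise CallFailed(f"gave up after {MAX_ATTEMPTS} attempts: {error}")


def fake_compose(params, previous_error):
    # Stand-in for the LLM: it fixes the timestamp only after seeing an error.
    start = params["from"]
    if previous_error is not None:
        start = start.replace(" ", "T") + ":00Z"
    return {"path": "/measurements", "query": {"from": start}}


def fake_execute(request):
    # Stand-in for the provider: it rejects timestamps that do not end in "Z".
    if not request["query"]["from"].endswith("Z"):
        raise CallFailed("400: 'from' must be an RFC 3339 timestamp")
    return {"status": 200, "query": request["query"]}


result = retrieve_with_correction(
    fake_compose, fake_execute, {"from": "2024-03-01 14:30"}
)
print(result)
```

In this run the first call fails, the error message is handed back to the composer, and the second call succeeds. Capping the loop keeps a persistently failing request from retrying forever.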

 

RET integrates with the rest of the DS2 ecosystem. It retrieves OpenAPI specifications through the DS2 Catalogue to understand provider data offerings. The module can inject API tokens automatically for secure access, and it supports pipelining retrieved data directly to target applications with automatic schema transformation.
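As a rough sketch of what token injection might look like once the parameters are known, the example below builds a GET request with Python's standard library and attaches the token as a header. The base URL, path, parameter names, and the use of a Bearer scheme are assumptions for illustration; the article only states that API tokens are injected automatically. The request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url: str, path: str, query: dict, api_token: str) -> Request:
    """Compose a GET request and inject the API token as a header."""
    url = f"{base_url.rstrip('/')}{path}?{urlencode(query)}"
    # Assumption: the provider expects a Bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {api_token}"}
    return Request(url, headers=headers, method="GET")


request = build_request(
    "https://provider.example/api",  # hypothetical provider endpoint
    "/measurements",
    {"entityId": "sensor-42", "from": "2024-03-01T14:30:00Z"},
    api_token="EXAMPLE_TOKEN",
)
print(request.full_url)
print(request.get_header("Authorization"))
```

The user supplies only the entity and the time window; the token never has to appear in the GUI.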


The transformation capability is particularly powerful. RET gives the LLM the source schema (from the OpenAPI specification), the target schema, and a library of transformation functions. From these, it can automatically determine which transformations are needed, such as renaming columns, converting units, or reformatting data, and apply them without manual configuration.
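One way to picture this is as a small library of named functions plus a plan that lists which ones to apply and with which arguments. In the sketch below the plan is written by hand to stand in for the LLM's selection; the function names, field names, and plan format are hypothetical and are not taken from RET.

```python
# Hypothetical library of transformation functions.
def rename_column(rows, old, new):
    """Rename one field in every row."""
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]


def scale_column(rows, column, factor):
    """Multiply one numeric field by a constant, e.g. for unit conversion."""
    return [{**row, column: row[column] * factor} for row in rows]


TRANSFORMATIONS = {
    "rename_column": rename_column,
    "scale_column": scale_column,
}


def apply_plan(rows, plan):
    """Apply each step of a transformation plan in order."""
    for step in plan:
        func = TRANSFORMATIONS[step["name"]]
        rows = func(rows, **step["args"])
    return rows


# Source data uses kilowatts and "ts"; the target expects watts and "timestamp".
source_rows = [{"power_kw": 1.5, "ts": "2024-03-01T14:30:00Z"}]

# Stand-in for the plan an LLM might select from the two schemas.
plan = [
    {"name": "rename_column", "args": {"old": "ts", "new": "timestamp"}},
    {"name": "scale_column", "args": {"column": "power_kw", "factor": 1000}},
    {"name": "rename_column", "args": {"old": "power_kw", "new": "power_w"}},
]

target_rows = apply_plan(source_rows, plan)
print(target_rows)  # [{'power_w': 1500.0, 'timestamp': '2024-03-01T14:30:00Z'}]
```

Restricting the model to a fixed library means it only chooses and parameterizes known functions, so each step of the resulting plan can be inspected before it runs.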


In this way, RET acts as an intelligent bridge between dataspace providers and consumers: a user-friendly interface where complex data access becomes simple, parameters are automatically formatted, and data is delivered in the exact format needed by the consuming application.

 
