Natural-Language-Driven Transfer Function Optimization for Volumetric Visualization Using Multi-Agent LLMs
Abstract
This project presents a natural-language-driven system for volumetric visualization that uses an agentic workflow to design transfer functions aligned with user-specified objectives. The underlying research question is how multi-agent LLMs can be used to optimize transfer functions for volumetric visualization, enabling users to specify visual goals in natural language while the system plans and refines suitable mappings automatically.
Instead of directly manipulating low-level transfer-function controls, the user issues natural-language instructions, and the system decomposes each request into steps that resemble an expert's workflow: analysing the intensity distribution, proposing a transfer function that maps intensities to colour and opacity, rendering multiple views, and assessing alignment with the stated objective (a sketch of this loop follows the examples below). Example instructions include:
- “Highlight the metallic objects in the backpack and remove everything else.”
- “Emphasize the vertebral column and suppress soft tissue.”
- “Make high-density regions opaque and low-density regions semi-transparent.”
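The refinement loop can be summarised as follows. This is a minimal, illustrative sketch in Python; the class and method names (`TransferFunction`, `propose`, `render_views`, `assess`, `refine`) are hypothetical stand-ins, not the project's actual identifiers.

```python
from dataclasses import dataclass, field

@dataclass
class TransferFunction:
    # Piecewise intensity -> opacity control points, plus a named colour map
    # (fixed to "plasma" in the current implementation).
    opacity_points: list[tuple[float, float]] = field(default_factory=list)
    colormap: str = "plasma"

@dataclass
class Feedback:
    satisfied: bool
    notes: str = ""

def optimize_transfer_function(goal, volume, knowledge, evaluator, renderer,
                               max_cycles=5):
    """Orchestrator loop: propose a mapping, render it, assess it, refine it."""
    tf = knowledge.propose(goal, volume)           # Knowledge Agent
    for _ in range(max_cycles):
        views = renderer.render_views(volume, tf)  # multiple camera angles
        feedback = evaluator.assess(goal, views)   # Evaluation Agent, image-based
        if feedback.satisfied:
            break
        tf = knowledge.refine(tf, goal, feedback)
    return tf
```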
The pipeline integrates a GPU-accelerated renderer (VTK/PyVista), a Knowledge Agent for transfer-function generation, an Evaluation Agent for image-based quality assessment, and an Orchestrator that coordinates the loop. In the current implementation, the Knowledge Agent parameterizes the transfer function primarily through piecewise intensity–opacity segments, while the renderer applies a fixed plasma colour map; the overall design, however, is agnostic to the specific colour model.
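To make the parameterisation concrete, the sketch below shows how such a piecewise opacity mapping and the fixed plasma colour map could be applied with PyVista. The synthetic volume and the specific opacity values are assumptions for illustration, not the project's actual data or generated segments.

```python
import numpy as np
import pyvista as pv

# Illustrative piecewise intensity -> opacity segments, standing in for the
# Knowledge Agent's output: low intensities nearly transparent, high
# intensities opaque. The specific values are assumptions for this sketch.
opacity = [0.0, 0.0, 0.05, 0.3, 0.9, 1.0]  # sampled across the scalar range

# Synthetic stand-in volume; the prototype renders real datasets instead.
vol = pv.ImageData(dimensions=(64, 64, 64))
vol["intensity"] = np.random.default_rng(0).random(vol.n_points)

plotter = pv.Plotter(off_screen=True)  # headless rendering on the server
plotter.add_volume(vol, scalars="intensity", opacity=opacity, cmap="plasma")
plotter.screenshot("view.png")
```

In the pipeline, off-screen renders like this one, taken from several viewpoints, are what the Evaluation Agent assesses against the stated objective.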
A working prototype is deployed on a Debian server with AMD GPU support and accessed via a SvelteKit web client.
Case-study experiments on real-world volumetric datasets show that the system can converge to visually meaningful transfer functions within a small number of refinement cycles, without any manual editing of the underlying mapping. The results indicate that natural-language control, coupled with a multi-agent LLM workflow, is a viable interface for volume visualization, and that the system offers a practical platform for exploring intelligent, LLM-driven transfer-function planning and refinement.