Context-Aware XR Interfaces

Overview

Data-Driven Interface Design.

The Augmented Perception Lab developed MineXR, a design mining workflow and data platform for collecting and analyzing personalized Extended Reality (XR) user interactions and experiences.

This research project uses the MineXR dataset to develop an LLM pipeline that outputs user interface components from a given initial context, with the goal of creating context-aware interfaces for AR/VR headsets. The work was conducted as a research internship at Carnegie Mellon University's Human-Computer Interaction Institute in collaboration with Meta Reality Labs.

Research Question

How might we develop a personalized context-aware user interface with LLMs, informed by MineXR data?

Context

Understanding LLM Pipelines.

While brainstorming our workflow and developing each LLM agent, we referenced several papers that generate UI components using LLMs. We also explored examples of adaptive interfaces in various contexts. From the paper "BISCUIT: Scaffolding LLM-Generated Code with Ephemeral UIs in Computational Notebooks", we studied how dedicated LLM agents test code, make changes, and render those changes in a visual interface for the user. Although that paper addresses teaching programming, its LLM agent structure informed our own pipeline.

BISCUIT LLM pipeline

We also referred to the paper "LLMR: Real-time Prompting of Interactive Worlds using Large Language Models", which focuses on generating UI components with an LLM pipeline to design VR interfaces. This showed us how to create components in an AR/VR context. One LLM agent we planned to develop in our pipeline was a code generator, which could be built in Unity with reference to the Microsoft Mixed Reality Toolkit.

LLMR pipeline

Research

Exploring the MineXR Dataset.

I examined the MineXR workflow, which enables researchers to collect and analyze users' personalized XR interfaces in various contexts. For our integration with LLMs, I focused on the MineXR dataset, which consists of 695 XR widgets (cropped app interfaces) in 109 unique XR layouts.

MineXR Data Types

Extracted MineXR Dataset

I experimented with the contexts given in the dataset (environment + task) and the corresponding widgets and screen descriptions. I aimed to refine the dataset into the form best suited to supplement the LLM.

Exploration

Planning the Workflow.

We first explored LLMs and different LLM agents by asking questions, creating prompts of varying specificity, and documenting our findings. Based on these explorations, we developed an initial framework for the pipeline and defined the desired input and output of each LLM agent.

This pipeline contains three initial components:

Task Generator: The input is the context, which includes the environment and task. The output is a list of tasks that would be useful in that context.

Application Planner: The input is the list of tasks. The output is a set of example apps that support those tasks.

Functionality Planner: The input is the list of apps. The output is a list of functionalities for each app type and a categorization of each app type as primary, periphery, or ambient.
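The three components above can be sketched as a chain of functions, each consuming the previous stage's output. This is a minimal sketch under stated assumptions, not the project's actual implementation: the LLM type, the prompt wording, the JSON reply format, and the stub_llm stand-in with its sample data are all hypothetical and chosen only to make the data flow concrete.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical LLM interface: takes a prompt string, returns the raw reply text.
LLM = Callable[[str], str]

CATEGORIES = {"primary", "periphery", "ambient"}


@dataclass
class Context:
    environment: str
    task: str


def task_generator(llm: LLM, ctx: Context) -> List[str]:
    """Context (environment + task) -> list of useful tasks."""
    prompt = (
        f"TASKS: The user is in '{ctx.environment}' doing '{ctx.task}'. "
        "Return a JSON list of tasks that would be useful."
    )
    return json.loads(llm(prompt))


def application_planner(llm: LLM, tasks: List[str]) -> List[str]:
    """Tasks -> example apps that support them."""
    prompt = "APPS: Return a JSON list of example apps for these tasks: " + json.dumps(tasks)
    return json.loads(llm(prompt))


def functionality_planner(llm: LLM, apps: List[str]) -> Dict[str, dict]:
    """Apps -> functionalities plus a primary/periphery/ambient category per app."""
    prompt = (
        "FUNCTIONALITY: Return a JSON object mapping each app to its "
        "functionalities and a category (primary, periphery, or ambient). Apps: "
        + json.dumps(apps)
    )
    plan = json.loads(llm(prompt))
    # Reject replies that use a category outside the three allowed values.
    for app, entry in plan.items():
        if entry["category"] not in CATEGORIES:
            raise ValueError(f"unknown category for {app}: {entry['category']}")
    return plan


def run_pipeline(llm: LLM, ctx: Context) -> Dict[str, dict]:
    tasks = task_generator(llm, ctx)
    apps = application_planner(llm, tasks)
    return functionality_planner(llm, apps)


def stub_llm(prompt: str) -> str:
    """Canned replies standing in for a real model (illustrative data only)."""
    if prompt.startswith("TASKS"):
        return '["follow a recipe", "track cooking time"]'
    if prompt.startswith("APPS"):
        return '["Recipe viewer", "Timer"]'
    return json.dumps({
        "Recipe viewer": {"functionalities": ["show current step"], "category": "primary"},
        "Timer": {"functionalities": ["count down"], "category": "ambient"},
    })
```

Calling run_pipeline(stub_llm, Context("kitchen", "cooking dinner")) walks all three stages with the canned replies. Passing the model in as a plain callable keeps each stage testable without a live LLM.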

Development

Developing the Pipeline.

The goal was to leverage the MineXR dataset within the pipeline so that the outputs (functionalities) would be better informed and more closely aligned with the dataset.

To this end, we experimented with introducing the dataset at multiple points in the pipeline:

as an example within a prompt
as a debugger LLM
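Both options can be illustrated with simple prompt builders. The sketch below is one possible reading, not the project's actual prompts: the record fields ("context", "widgets"), the prompt wording, and the interpretation of the "debugger LLM" as a second model that reviews a draft against dataset examples are all assumptions.

```python
from typing import Dict, List


def format_examples(examples: List[Dict[str, str]], k: int) -> List[str]:
    """Render the first k dataset records as Context/Widgets pairs."""
    lines: List[str] = []
    for ex in examples[:k]:
        lines.append(f"Context: {ex['context']}")
        lines.append(f"Widgets: {ex['widgets']}")
        lines.append("")
    return lines


def build_example_prompt(context: str, examples: List[Dict[str, str]], k: int = 2) -> str:
    """Option 1: dataset records as few-shot examples inside the planner prompt."""
    lines = ["You plan XR interface functionality for a given context.", ""]
    lines += format_examples(examples, k)
    # The final, unanswered pair is the one the model is asked to complete.
    lines.append(f"Context: {context}")
    lines.append("Widgets:")
    return "\n".join(lines)


def build_debugger_prompt(draft: str, examples: List[Dict[str, str]], k: int = 2) -> str:
    """Option 2: a reviewing prompt that checks a draft against dataset records."""
    lines = ["Review the draft below against these reference layouts.", ""]
    lines += format_examples(examples, k)
    lines.append(f"Draft: {draft}")
    lines.append("Return a corrected draft.")
    return "\n".join(lines)
```

In the first variant the dataset shapes the output before generation; in the second it is used after generation to correct a draft, at the cost of one extra model call.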

We implemented an LLM agent pipeline that effectively integrates the MineXR dataset to generate precise, context-aware interface components across diverse locations and task scenarios.

Experimentation

Testing and Optimizing the Workflow.

I first developed a score that quantifies output quality so that pipeline variants could be compared. The score depends on three factors: the overlap of app categories in the functionality output, the semantic similarity of the functionalities themselves, and the overlap of primary apps between the LLM output and the dataset.

We then built pipeline tests and compared modified LLM agents, prompts, and factors such as the number of apps, app types, functionality descriptions, and categorization into primary, periphery, and ambient layouts.
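A score combining these three factors could look like the sketch below. It is a minimal illustration, not the formula used in the project: Jaccard overlap for the two set comparisons, bag-of-words cosine similarity as a lightweight stand-in for an embedding-based semantic similarity, the equal weights, and the dictionary keys are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as a perfect match."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine_bow(text_a: str, text_b: str) -> float:
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def pipeline_score(
    output: Dict[str, List[str]],
    reference: Dict[str, List[str]],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),  # assumed equal weights
) -> float:
    """Weighted sum of category overlap, functionality similarity, and primary-app overlap."""
    category_overlap = jaccard(set(output["categories"]), set(reference["categories"]))
    similarity = cosine_bow(
        " ".join(output["functionalities"]), " ".join(reference["functionalities"])
    )
    primary_overlap = jaccard(set(output["primary_apps"]), set(reference["primary_apps"]))
    w_cat, w_sim, w_pri = weights
    return w_cat * category_overlap + w_sim * similarity + w_pri * primary_overlap
```

Here output is one pipeline result and reference is the matching dataset entry, each with "categories", "functionalities", and "primary_apps" lists. Because every factor lies between 0 and 1 and the weights sum to 1, the combined score also stays in that range, which makes runs comparable across prompts and agents.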

Final Thoughts

Learnings...

Building structure from ambiguity was a challenge throughout this research project. It was difficult to tame something as unstructured as LLM outputs and find a way to create more deterministic, reproducible outputs for our pipeline.

I learned more about semantic similarity scoring and how best to quantify seemingly subjective metrics.

With more time...

I would test the UI outputs further to understand how reproducible they are and how closely they match the layouts in the initial dataset.

It would be valuable to see how more recent frameworks and AI agents perform, now that newer LLMs offer fine-tuning capabilities.

It would also be interesting to consider different interaction patterns and user contexts, such as educational settings or collaborative use.
