MinD-Vis AI system decodes images from MRI scans


A new AI system reconstructs images from MRI data two-thirds more accurately than older systems. This is made possible by more data and diffusion models.

Can AI models decode thoughts? Research groups, including one at Meta led by Jean-Rémi King, are experimenting with language models to decode words or phrases from MRI data.

Recently, a research group presented an AI system that decodes MRI data from a person watching a video into text describing some of the visible events.

These technologies could one day lead to advanced interfaces allowing, for example, people with disabilities to better communicate with their environment or to control a computer.

A new study now relies on diffusion models to reconstruct images from human MRI data. Diffusion models are also used in advanced image AI systems such as DALL-E 2 or Stable Diffusion. They generate images by gradually removing noise.

MinD-Vis relies on diffusion and 340 hours of MRI

Researchers from the National University of Singapore, the Chinese University of Hong Kong, and Stanford University present “Sparse Masked Brain Modeling with Double-Conditioned Latent Diffusion Model for Human Vision Decoding”, or MinD-Vis for short.

The work aims to create a diffusion-based AI model that can decode visual stimuli from brain data, laying a foundation for linking human vision and machine vision.

MinD-Vis learns to reconstruct images from MRI scans. | Image: Chen et al.

First, the AI system learns an efficient representation of the MRI data through self-supervised learning. The resulting embeddings then serve as the condition under which the diffusion model generates images.
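
The self-supervised step can be illustrated with a toy masking routine. This is only a sketch of the general masked-modeling idea suggested by the system's name, not the researchers' code: the function name `mask_patches`, the patch size, and the 75% mask ratio are assumptions chosen for illustration, and a plain list of numbers stands in for a flattened MRI segment. A model trained this way sees only the visible patches and learns to predict the hidden ones.

```python
import random


def mask_patches(signal, patch_size, mask_ratio, seed=0):
    """Split a 1-D signal into patches and hide a random subset of them.

    Hypothetical helper for illustration; not taken from the MinD-Vis code.
    """
    # Cut the signal into consecutive, non-overlapping patches
    # (any remainder at the end is dropped).
    n_patches = len(signal) // patch_size
    patches = [signal[i * patch_size:(i + 1) * patch_size] for i in range(n_patches)]

    # Pick the patches to hide; the rest stay visible to the encoder.
    rng = random.Random(seed)
    n_masked = int(n_patches * mask_ratio)
    masked_idx = sorted(rng.sample(range(n_patches), n_masked))
    masked_set = set(masked_idx)

    visible = [(i, p) for i, p in enumerate(patches) if i not in masked_set]
    targets = [(i, patches[i]) for i in masked_idx]  # what the model must predict
    return visible, targets


# Stand-in for one flattened MRI segment (assumed values, not real data).
signal = [float(v) for v in range(32)]
visible, targets = mask_patches(signal, patch_size=4, mask_ratio=0.75)
print(len(visible), "visible patches,", len(targets), "hidden patches")
```

With 32 values and a patch size of 4, the signal splits into 8 patches, of which 6 are hidden and 2 remain visible.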

For training, the team relies on data from the Human Connectome Project and the Generic Object Decoding dataset. In total, the training data comprises 136,000 MRI segments from 340 hours of MRI scans, the largest dataset to date for a brain decoding AI system.
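
To put the two figures in relation, a quick back-of-the-envelope calculation gives the average scan time per segment. It assumes that the 136,000 segments evenly cover the full 340 hours, which the publication summary here does not state; the function name is made up for this example.

```python
def seconds_per_segment(total_hours, n_segments):
    """Average scan time per segment, assuming segments tile the total evenly."""
    return total_hours * 3600 / n_segments


avg = seconds_per_segment(340, 136_000)
print(f"{avg:.1f} s per segment")  # prints "9.0 s per segment"
```

Under that assumption, each segment corresponds to roughly nine seconds of scanning.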


MinD-Vis captures semantic details and image features

While the first dataset consists entirely of MRI data, the second includes 1,250 different images from 200 classes. The team selected 50 of the images for testing.

For further validation of their approach, the researchers also relied on the Brain, Object, Landscape Dataset, which includes 5,254 pairs of MRI recordings and images.

According to the publication, MinD-Vis clearly outperforms older models: the system is 66% better at capturing semantic content and 41% better in the quality of the generated images.

Even so, the system is still far from being able to read thoughts reliably: despite the improvement, its accuracy in capturing semantic content is only 23.9%.
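
The two semantic figures can be combined to estimate where older systems stood. The sketch below assumes that the 66% gain is a relative improvement on the same accuracy metric that yields 23.9%; the article does not spell this out, so the result is an estimate rather than a reported number. The function name is hypothetical.

```python
def implied_baseline(new_score, relative_gain):
    """Score of the older system if new_score = baseline * (1 + relative_gain)."""
    return new_score / (1.0 + relative_gain)


baseline = implied_baseline(23.9, 0.66)
print(f"Implied accuracy of older systems: {baseline:.1f}%")  # about 14.4%
```

Under this assumption, previous systems would have reached about 14.4% on the same task.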

The image quality and semantic accuracy of MinD-Vis are significantly better than previous systems, but remain weak overall. | Image: Chen et al.

In addition, the quality of the reconstructed images varied between subjects, which the team describes as a well-known phenomenon in this field of research. Some of the image classes tested were also not included in the training dataset. More data could therefore further improve the quality of the system.

More information and examples are available on the MinD-Vis project page.
