Researchers may bridge the gap between biology, agriculture and computer science by using AI-based conversational VR
Modern agricultural sensors allow unprecedented amounts of 3D data to be gathered for research. These capabilities play an important role in understanding complex biological systems such as plant growth.
Data from the sensors are central to creating realistic "digital twins": data-driven computer models built from real-world experiments and field measurements, a process known as plant phenotyping. It is estimated that up to 80% of this data is never touched, and this unexplored data may hold the answers to many important research questions.
Researchers at Purdue University and the University of Arizona are looking to put this and future captured data to use. They will develop a novel AI-based conversational virtual reality (VR) platform called VR-Bio-Talk that addresses some of the barriers to effectively utilizing these large and valuable data sets.
The main goal is to bridge the gap between data science and research by enabling researchers to pose questions in natural language. The platform will automatically convert those questions into code, extract the requested information, and display the results in VR.
Researchers will use VR-Bio-Talk to converse with agricultural data from the Field Scanalyzer, one of the largest outdoor automated phenotyping facilities. Through natural language interactions, they will explore and analyze the data without needing to understand how it is organized or stored.
VR-Bio-Talk will be deployed on various datasets and tested with a diverse group of users, ranging from domain experts to users with limited knowledge of biology and agriculture.
"One of the main goals of this project is to bridge the domain gap between biology, agriculture and computer science researchers," says Bedrich Benes, professor and associate head of the Department of Computer Science at Purdue University.
"Traditional data science projects require domain expertise to process the data. Someone needs to write programs to extract information from data. We aim to lower this barrier by using the latest advances in AI for voice recognition and conversational interfaces."
VR-Bio-Talk is a research project that will let users issue verbal queries such as, "select all plants older than two weeks and calculate their average height."
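The article does not show the code the platform would generate for such a query. As a minimal sketch, assuming a hypothetical list of plant records with `age_days` and `height_cm` fields, the generated query code might look like this:

```python
from dataclasses import dataclass

@dataclass
class Plant:
    plant_id: str
    age_days: int
    height_cm: float

# Hypothetical sample records standing in for phenotyping data.
plants = [
    Plant("p1", 10, 32.0),
    Plant("p2", 21, 48.5),
    Plant("p3", 30, 51.5),
]

# "Select all plants older than two weeks and calculate their average height."
selected = [p for p in plants if p.age_days > 14]
average_height = sum(p.height_cm for p in selected) / len(selected)
print(average_height)  # 50.0
```

The record layout and field names here are illustrative assumptions, not the project's actual schema; the point is that a one-sentence spoken request maps to a short filter-and-aggregate program.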
The VR application will visually immerse the researcher in a 3D representation of the field environment, where they can interact with each individual plant. Each plant is enhanced with its associated data, such as height and leaf area index, among other variables.
Researchers will be able to ask follow-up questions such as, "from these selected plants, show me all leaves affected by a disease and compute their area." The conversational agent will allow for the extraction of information and will aid research discovery in an unprecedented way.
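What makes such a follow-up conversational is that it operates on the result of the previous query. A minimal sketch of that behavior, again assuming hypothetical plant and leaf records (not the project's actual data model):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Leaf:
    area_cm2: float
    diseased: bool

@dataclass
class Plant:
    plant_id: str
    age_days: int
    leaves: List[Leaf] = field(default_factory=list)

plants = [
    Plant("p1", 21, [Leaf(12.0, True), Leaf(9.5, False)]),
    Plant("p2", 30, [Leaf(8.0, True)]),
    Plant("p3", 10, [Leaf(5.0, True)]),
]

# First query: plants older than two weeks (kept as conversational context).
selected = [p for p in plants if p.age_days > 14]

# Follow-up: "from these selected plants, show me all leaves
# affected by a disease and compute their area."
diseased_leaves = [leaf for p in selected for leaf in p.leaves if leaf.diseased]
total_area = sum(leaf.area_cm2 for leaf in diseased_leaves)
print(total_area)  # 20.0
```

Note how the follow-up filters `selected` rather than `plants`: carrying that context from one utterance to the next is what a conversational agent adds over one-shot queries.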
"Extracting information from data has always required programming expertise," says Duke Pauli, associate professor and director of the Center for Agroecosystems Research at the University of Arizona.
"Having the ability to break down technical barriers and let users converse natively with their data will enable researchers to focus more on their science and generate greater societal impact."
This project brings together researchers from different disciplines. Benes, leader of this effort, has 25 years of experience modeling digital vegetation. Voicu Popescu, associate professor of computer science at Purdue University, has worked in VR research, 3D interaction, and fast rendering of large datasets.
Alejandra Magana, the W.C. Furnas Professor of Enterprise Excellence in computer and information technology and professor of engineering education at Purdue University, leads the human-computer interaction and learning aspects of VR-Bio-Talk, ensuring that the interface responds adequately to people with different abilities and that the experience results in conceptual understanding and learning gains.
"VR devices have a great interface when it comes to selecting the desired view through natural head motions, but when it comes to data analysis, the interface is quite unintuitive. Our hands-free conversational agent has the potential to generalize to other domains beyond agriculture," says Popescu.
Pauli leads the plant phenotyping efforts, has overseen multiple large experiments utilizing the Field Scanalyzer, and provides domain expertise.
Nirav Merchant, the director of the University of Arizona Data Science Institute, is the principal investigator for NSF CyVerse, a cyberinfrastructure project for managing data and analysis workflows from phenotyping platforms such as the Field Scanalyzer. He will provide access to vast amounts of open datasets and methods to rapidly extract information from them.
Another underlying novelty of VR-Bio-Talk is the 3D reconstruction of plants into their digital twins—their data-driven computer counterparts.
"We have over two petabytes of data that potentially include information yet to be discovered," says Merchant.
"With recent advances and maturity in conversational technologies and the ability to chat and interact with large systems, we believe that this platform has the potential to dramatically change the way we extract information and knowledge from data."
The project and its outcomes will be rigorously validated.
"The overall user experience will be delivered via experiential learning and will involve iterative cycles of design-based research to ensure it delivers new insights," says Magana.
Provided by Purdue University