
OmicsSuite: A customized and pipelined suite for analysis and visualization of multi-omics big data

October 13th, 2023
The OmicsSuite native framework architecture and its integration of the Posit Shiny framework. Yellow arrows show logical steps, while blue arrows indicate dataset inputs and result outputs. Credit: Horticulture Research

With the advancements in high-throughput sequencing technologies such as Illumina, PacBio, and 10X Genomics platforms, and gas/liquid chromatography-mass spectrometry, large volumes of biological data in multiple formats can now be obtained through multi-omics analysis.

Bioinformatics is constantly evolving and seeking breakthroughs to solve multi-omics problems. However, most experimental biologists find it challenging to analyze data using command-line interfaces, coding, and scripting. Based on our experience with multi-omics, we have developed OmicsSuite, a desktop suite that comprehensively integrates statistics with multi-omics analysis and visualization.

The suite has 175 sub-applications in 12 categories, including Sequence, Statistics, Algorithm, Genomics, Transcriptomics, Enrichment, Proteomics, Metabolomics, Clinical, Microorganism, Single Cell, and Table Operation. We created the user interface with Sequence View, Table View, and intelligent components based on JavaFX and the popular Shiny framework.

The multi-omics analysis functions were developed on BioJava and more than 300 packages provided by the R CRAN and Bioconductor communities, and they encompass over 3,000 adjustable parameter interfaces. OmicsSuite can directly read multi-omics raw data in FastA, FastQ, MAF, mzML, Matrix, and HDF5 formats, and its programs emphasize data transfer between applications and pipeline analysis.
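
As a minimal illustration of what reading one of these formats involves, the sketch below parses FastA text in Python. It is not OmicsSuite's own reader, which the text says is built on BioJava and R packages; the function name parse_fasta and the sample records are hypothetical.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FastA-formatted lines into {record_id: sequence}.

    Hypothetical helper for illustration only; not part of OmicsSuite.
    """
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first token after '>' as the record ID.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[current_id] += line
    return records


# Made-up sample data with one wrapped sequence.
example = """>seq1 demo record
ATGCGT
ACGT
>seq2
GGCC
"""
records = parse_fasta(example.splitlines())
for name, sequence in records.items():
    print(name, len(sequence))
```

A desktop suite would add validation and support for very large files; this sketch only shows the header/sequence structure of the format.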

OmicsSuite can produce pre-publication images and tables, allowing users to focus on biological aspects. OmicsSuite offers multi-omics step-by-step workflows that can be easily applied to horticultural plant breeding and molecular mechanism studies in plants. It enables researchers to freely explore the molecular information contained in multi-omics big data.

Over the past decade, the widespread application of next-generation sequencing (NGS) technologies represented by Illumina Solexa and HiSeq platforms, as well as third-generation sequencing (TGS) technologies led by PacBio Sequel and Oxford Nanopore platforms, has revolutionized the fields of molecular, evolutionary, and computational biology. Currently, omics diversity and integrated multi-omics analysis are thriving research fields, and multidimensional analysis of biological features and mechanisms has become more precise and efficient.

With the widespread application of sequencing and mass spectrometry technologies, the quantity and storage types of data generated are also rapidly increasing. For example, data generated by PacBio HiFi sequencing has longer reads and higher throughput, and storage formats based on GC/LC-MS (gas/liquid chromatography mass spectrometry) such as mzXML (mass spectrometric data in eXtensible Markup Language), mzData (mass spectrometric Data), and mzML are gradually evolving to meet more diverse needs.

At the same time, data from 10X Genomics, ranging from Chromium matrix data generated by single-cell transcriptomics to Visium HDF5 (Hierarchical Data Format version 5) data generated by spatial transcriptomics, have become more complex and harder to interpret. As a result, there are significant differences in the analytical pipelines, methods, and programs employed in data parsing for downstream bioinformatic analysis across genomics, transcriptomics, proteomics, metabolomics, microbial omics, and single-cell omics.

OmicsSuite native framework architecture

OmicsSuite is an innovative framework for analyzing and visualizing multi-omics data in a workflow. The JavaFX library provides user interface (UI) control methods, parameter component classes, web engine support, and other interface display and friendly interaction functions through a series of sub-libraries such as javafx-controls, javafx-graphics, and javafx-web.

The interfaces and analysis parameters of all 175 sub-applications are implemented through various components provided by JavaFX. Each sub-application provides essential interfaces such as uploading example datasets or user data files, parameter synchronization and feedback, and outputting results.

Two critical libraries, Rserve and REngine (provided by org.rosuda.REngine), implement real-time communication with the R environment through daemon threads. On the R side, Rserve provides instant responses to call signals from Java.
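
The daemon-thread request/response pattern described above can be sketched generically. The example below is written in Python rather than Java and uses a stand-in function in place of an R backend, so it does not call Rserve or REngine; the names fake_r_eval and BackendWorker are hypothetical.

```python
import queue
import threading
from typing import Callable


def fake_r_eval(expression: str) -> str:
    """Stand-in for an R evaluation (assumption: real code would ask Rserve)."""
    return f"evaluated: {expression}"


class BackendWorker:
    """Serves evaluation requests on a daemon thread, off the calling thread."""

    def __init__(self, evaluate: Callable[[str], str]) -> None:
        self._evaluate = evaluate
        self._requests: queue.Queue = queue.Queue()
        # daemon=True: the thread does not keep the process alive at exit.
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self) -> None:
        while True:
            expression, reply = self._requests.get()
            reply.put(self._evaluate(expression))

    def submit(self, expression: str) -> queue.Queue:
        """Queue a request; the returned one-slot queue will hold the reply."""
        reply: queue.Queue = queue.Queue(maxsize=1)
        self._requests.put((expression, reply))
        return reply


worker = BackendWorker(fake_r_eval)
pending = worker.submit("mean(c(1, 2, 3))")
result = pending.get(timeout=5)  # blocks only when the caller wants the value
print(result)
```

The caller can keep handling interface events between submit and get, which is the reason for placing the backend conversation on a separate daemon thread.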

OmicsSuite uses Java as its real-time running environment and R as its function execution environment, and it bundles a large number of Java and R modules as well as the Shiny app framework. Even so, users only need to install the binary file to use the full functionality out of the box, without configuring any additional environment or dependencies.

A computer with a 4-core CPU, 4 GB of memory, and 256 GB of storage can run OmicsSuite normally. For single-cell analysis, we recommend at least a 6-core CPU and 8 GB of memory; with this configuration, a test PBMC (Peripheral Blood Mononuclear Cells) dataset of 2,700 single cells takes approximately three minutes to process.

The study, "OmicsSuite: a customized and pipelined suite for analysis and visualization of multi-omics big data," has been published in Horticulture Research.

OmicsSuite layout and sub-application user interface. (A) The layout of OmicsSuite; (B) The sub-application user interface is divided into three sections from top to bottom: data upload and preview, parameter components, and results preview and download. Credit: Horticulture Research

UI design and data interface

OmicsSuite has redesigned the UI of JavaFX to provide a modern and improved operating experience for users. The default layout features a multi-level menu bar at the top of the window, a shortcut access bar at the bottom, a collapsible toolbox on the left, a home page in the middle, and a meta information and version update record panel on the right.

The menu bar allows users to quickly launch sub-applications through multi-level categorization. When a sub-application is started, the layout switches to the user interface, with the analysis page of the application in the middle and the application details on the right.

The analysis page comprises, from top to bottom, a data section, a parameter component section, and a result section. The fixed components Progress, Demo, Clear, and Submit are the task management components; they display the current status, run example data, clear the current task, and submit a new task, respectively.

Other common components such as Themes, Colors, Fonts, Figure Width, Figure Height, and Figure DPI belong to the parameter specification components. These components implement a unified theme and color scheme for OmicsSuite and standardize the default output image in 10.00 × 6.18-inch (300 dpi) form, following the golden ratio.
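
The default figure size can be checked with a little arithmetic. The short sketch below derives the height from the width and the golden ratio and converts both to pixels at 300 dpi; the pixel dimensions are computed here and are not stated in the source.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2  # about 1.618

width_in = 10.00  # default figure width in inches, from the text
dpi = 300         # default resolution, from the text

# Height that gives a golden-ratio aspect, rounded to two decimals.
height_in = round(width_in / GOLDEN_RATIO, 2)

# Pixel dimensions implied by the inch sizes and the resolution.
width_px = round(width_in * dpi)
height_px = round(height_in * dpi)

print(f"{width_in:.2f} x {height_in:.2f} in -> {width_px} x {height_px} px")
```

Under these assumptions the default image works out to 3000 × 1854 pixels.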

Overview of the 12 categories and 175 sub-applications of OmicsSuite. Credit: Horticulture Research

Sub-applications overview and classification

Bioinformatics encompasses biology (such as multi-omics) and methodology (such as statistics and advanced algorithms). Therefore, OmicsSuite continuously improves multi-omics analysis and visualization functions based on the foundation of statistical analysis, providing users with a comprehensive one-stop solution. Currently, there are 12 categories with 175 sub-applications.

The categories are: Sequence, Statistics, Algorithm, Genomics, Transcriptomics, Enrichment, Proteomics, Metabolomics, Clinical, Microorganisms, Single Cell, and Table Operation. OmicsSuite can analyze almost all multi-omics data, and each category corresponds to different professional data formats. Applications in the Sequence category typically require a FastA sequence file; applications in the Genomics category require MAF (Mutation Annotation Format) files; applications in the Metabolomics category require compressed mzML files; and applications in the Single Cell category require compressed Matrix or HDF5 files.
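
The category-to-format pairing above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary covers the four categories whose formats are named in the text, and the names REQUIRED_FORMAT and required_format are hypothetical rather than part of OmicsSuite.

```python
from typing import Optional

# Input formats named in the text for four of the twelve categories.
# Other categories are left out because the text does not list their formats.
REQUIRED_FORMAT = {
    "Sequence": "FastA",
    "Genomics": "MAF",
    "Metabolomics": "mzML (compressed)",
    "Single Cell": "Matrix or HDF5 (compressed)",
}


def required_format(category: str) -> Optional[str]:
    """Return the input format for a category, or None if it is not listed."""
    return REQUIRED_FORMAT.get(category)


for category in ("Sequence", "Genomics", "Clinical"):
    print(category, "->", required_format(category))
```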

To summarize, the key features of OmicsSuite include:

  1. User-friendly interactive experience, convenient demo running button, complete parameter components, and table and image preview windows.
  2. Comprehensive coverage of multi-omics analysis and visualization functions, particularly in metabolomics and single-cell analysis workflows.
  3. Support for reading most multi-omics raw files, such as LC-MS data in mzML format, single-cell 10X Genomics Chromium matrix data, and Visium HDF5 data.
  4. A complete basic visualization system, an intuitive operation interface for dimensionality reduction algorithms (PCA, PCoA, tSNE, etc.) and clustering algorithms (Kmeans, Hclust, AGNES, etc.), and an SEM model construction and evaluation system.
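
For readers unfamiliar with the clustering algorithms named in the last item, the sketch below shows a bare-bones k-means (Lloyd's algorithm) on made-up 2-D points. It is a generic textbook version in pure Python, not the R implementation that OmicsSuite calls; the function names, the sample points, and the fixed starting centers are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: Sequence[Point], initial_centers: Sequence[Point],
           iterations: int = 10) -> Tuple[List[int], List[Point]]:
    """Run a fixed number of Lloyd iterations; return labels and centers."""
    centers = list(initial_centers)  # copy so the caller's data is untouched
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with its nearest center's index.
        labels = [
            min(range(len(centers)),
                key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centers


# Two well-separated groups of made-up points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centers)
```

In a graphical tool, the number of clusters and the iteration limit would be typical candidates for adjustable parameter components.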

More information:
Ben-ben Miao et al, OmicsSuite: a customized and pipelined suite for analysis and visualization of multi-omics big data, Horticulture Research (2023). DOI: 10.1093/hr/uhad195

Provided by Nanjing Agricultural University

Citation: OmicsSuite: A customized and pipelined suite for analysis and visualization of multi-omics big data (2023, October 13) retrieved 28 November 2024 from https://sciencex.com/wire-news/458660292/omicssuite-a-customized-and-pipelined-suite-for-analysis-and-vis.html