
Unsupervised spectral feature selection algorithms for high dimensional data

November 13th, 2023

Detecting informative features is a significant and challenging task when carrying out explainable analysis and building interpretable AI systems for high dimensional data, especially for data with very few samples and no label information.

Unsupervised feature selection algorithms are a natural way to meet this challenge, especially in the big data era. However, existing unsupervised feature selection approaches usually cannot precisely identify the most discriminative features in high dimensional data with few samples.

To address these challenges, a research team led by Juanying Xie published their research in Frontiers of Computer Science.

The team proposes two novel unsupervised spectral feature selection algorithms. Both group features into clusters using an advanced Self-Tuning spectral clustering algorithm based on local standard deviation, so that the globally optimal feature clusters can be detected as far as possible.
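The feature-clustering step can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the function names, the neighbourhood size `k`, and the toy data are hypothetical, and the local scale is assumed here to be the root-mean-square distance from a feature to its `k` nearest neighbours (the spread of the neighbours around that feature). The paper's exact definition of local standard deviation may differ.

```python
import numpy as np

def local_scale_affinity(F, k=3):
    """F has one row per feature. Returns a feature-by-feature affinity matrix."""
    # Pairwise Euclidean distances between feature vectors.
    diff = F[:, None, :] - F[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # Distances to the k nearest neighbours (column 0 is the feature itself).
    knn = np.sort(D, axis=1)[:, 1:k + 1]
    # ASSUMPTION: local scale = RMS distance to the k nearest neighbours.
    sigma = np.sqrt((knn ** 2).mean(axis=1)) + 1e-12
    # Self-tuning affinity: each pair is scaled by both local scales.
    A = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(A, 0.0)
    return A

def spectral_embedding(A, n_clusters):
    """Row-normalised top eigenvectors of the normalised affinity matrix."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1) + 1e-12)
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    top = vecs[:, -n_clusters:]          # keep the largest n_clusters
    return top / (np.linalg.norm(top, axis=1, keepdims=True) + 1e-12)

# Toy data (hypothetical): six features measured on four samples,
# forming two tight groups of three features each.
base1 = np.array([1.0, 2.0, 3.0, 4.0])
base2 = np.array([10.0, -5.0, 7.0, 0.0])
F = np.vstack([base1 + 0.01 * i for i in range(3)] +
              [base2 + 0.01 * i for i in range(3)])

A = local_scale_affinity(F, k=2)
E = spectral_embedding(A, n_clusters=2)
```

In a full pipeline the rows of `E` would then be grouped, for example with k-means, to obtain the feature clusters; that final step is omitted here. On the toy data, rows of `E` belonging to the same group point in the same direction, while rows from different groups are nearly orthogonal.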

Entropy-based and cosine-similarity-based feature ranking techniques are proposed, one for each algorithm, so that a representative feature can be selected from each cluster. Together these features form the subset on which an explainable classification system is built. This ensures, as far as possible, that the selected features are representative and independent of each other.
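Picking one representative feature per cluster can be sketched as below. The release does not give the ranking formulas, so both scores are assumptions made for illustration: the entropy score is the Shannon entropy of a histogram of the feature's values, and the cosine score is the mean absolute cosine similarity between a feature and the other members of its cluster. All names and the toy data are hypothetical.

```python
import numpy as np

def entropy_score(F, members, i, bins=4):
    # ASSUMPTION: higher histogram entropy marks a more informative feature.
    counts, _ = np.histogram(F[i], bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def cosine_score(F, members, i):
    # ASSUMPTION: the most representative feature is the one most similar,
    # on average, to the other features in its own cluster.
    others = [j for j in members if j != i]
    if not others:
        return 1.0
    sims = [abs(F[i] @ F[j]) /
            (np.linalg.norm(F[i]) * np.linalg.norm(F[j]) + 1e-12)
            for j in others]
    return float(np.mean(sims))

def pick_representatives(F, labels, score):
    """Return {cluster label: index of the top-scoring feature in that cluster}."""
    reps = {}
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        reps[c] = max(members, key=lambda i: score(F, members, i))
    return reps

# Toy data (hypothetical): four features on four samples, with cluster
# labels assumed to come from a previous feature-clustering step.
F = np.array([[1.0, 0.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 2.0]])
labels = [0, 0, 0, 1]

reps_cosine = pick_representatives(F, labels, cosine_score)
reps_entropy = pick_representatives(F, labels, entropy_score)
```

Taking a single feature from each cluster is what keeps the selected subset close to mutually independent, since features that behave alike end up in the same cluster and only one of them is kept.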

Extensive experiments and rigorous statistical tests demonstrate that these unsupervised spectral feature selection algorithms outperform the peer algorithms they were compared with. The features they detect have strong discriminative capability in downstream classifiers for omics data, so an AI system built on them would be reliable and explainable. This makes it possible to build a transparent and trustworthy medical diagnostic system from an interpretable AI perspective.

Future research could study a general way of finding an appropriate parameter for the advanced Self-Tuning spectral clustering based on local standard deviation. Another goal is to reduce the computing cost of detecting the optimal feature subset in very high dimensional data, such as SNP data.

More information:
Mingzhao Wang et al, Unsupervised spectral feature selection algorithms for high dimensional data, Frontiers of Computer Science (2022). DOI: 10.1007/s11704-022-2135-0

Provided by Frontiers Journals

Citation: Unsupervised spectral feature selection algorithms for high dimensional data (2023, November 13) retrieved 24 June 2026 from https://sciencex.com/wire-news/461352231/unsupervised-spectral-feature-selection-algorithms-for-high-dime.html
