The Impact of Representation on the Optimization of Marker Panels for Single-cell RNA Data

Nobile, MS;
2021-01-01

Abstract

The increasing number of single-cell transcriptomic and single-cell RNA sequencing studies is allowing for a deeper understanding of the molecular processes underlying the normal development of an organism as well as the onset of pathologies. These studies continuously refine the functional roles of known cell populations, and provide their characterization as soon as putatively novel cell populations are detected. In order to isolate the cell populations for further tailored analysis, succinct marker panels (composed of a few cell surface proteins and clusters of differentiation molecules) must be identified. The identification of these marker panels is a challenging computational problem due to its intrinsic combinatorial nature, which makes it NP-hard. Genetic Algorithms (GAs) have been successfully used in Bioinformatics and other biomedical applications to tackle combinatorial problems. We present here a GA-based approach to the identification of succinct marker panels. Since the performance of a GA is strictly related to the representation of the candidate solutions, we propose and compare three alternative representations, each of which implicitly introduces different constraints on the search space. For each representation, we fine-tune the parameter settings to calibrate the GA, and we show that different representations yield different performance: the most relaxed representations (in which the GA can also evolve the number of genes in the panel) turn out to be the most effective, especially in the case of 0-knowledge problems. Our results also show that the marker panels identified by GAs can outperform manually curated solutions.
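The abstract does not spell out the three representations, so the following is only an illustrative sketch of how an encoding can constrain the search space. It assumes two generic encodings that are not taken from the paper: a fixed-size list of gene indices, where the panel size never changes, and a binary mask over all genes, where the panel size is free to evolve. All function names and parameters (`n_genes`, `k`, `p`, `rate`) are hypothetical, and no fitness function is shown because the abstract does not define one.

```python
import random


def random_fixed_size_panel(n_genes, k, rng):
    """Integer encoding (assumed): exactly k distinct gene indices."""
    return sorted(rng.sample(range(n_genes), k))


def random_mask_panel(n_genes, rng, p=0.1):
    """Binary-mask encoding (assumed): one bit per gene, size not fixed."""
    return [1 if rng.random() < p else 0 for _ in range(n_genes)]


def mask_to_indices(mask):
    """Decode a binary mask into the list of selected gene indices."""
    return [i for i, bit in enumerate(mask) if bit]


def mutate_fixed_size(panel, n_genes, rng):
    """Replace one gene with a gene outside the panel; size stays k.

    Assumes k < n_genes, so at least one gene lies outside the panel.
    """
    outside = [g for g in range(n_genes) if g not in panel]
    child = list(panel)
    child[rng.randrange(len(child))] = rng.choice(outside)
    return sorted(child)


def mutate_mask(mask, rng, rate=0.05):
    """Flip each bit independently; the panel may grow or shrink."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]
```

Under these assumptions, the fixed-size encoding restricts the search to panels of exactly `k` genes, whereas the mask encoding lets mutation add or drop genes, which loosely corresponds to the "relaxed" setting in which the number of genes in the panel is itself evolved.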
2021
2021 IEEE Congress on Evolutionary Computation (CEC)
Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5004823