With the advent of microarrays, it is possible to obtain large amounts of gene expression data, which provide a picture of the transcriptional activity of genes in an organism. Microarray technology uses the sequence resources created by the genome projects and other sequencing efforts to answer the question of which genes are expressed in a particular cell type or organism, at a particular time, under particular conditions. For instance, microarrays allow comparison of gene expression between normal and diseased (e.g. cancerous) cells.
Advanced microarray data analysis can be split into three main tasks:

  1. Clustering (grouping genes based on their expression patterns, in order to infer functional classes of non-annotated genes)
  2. Classification (diagnostic categorization of cancer versus non-cancer tissues, discrimination among different subtypes of tumors, as well as drug response prediction or cancer prognosis)
  3. Gene network reconstruction (inferring gene regulatory networks (GRNs) from expression data; GRNs are the on-off switches and rheostats of a cell operating at the gene level, and they dynamically orchestrate the level of expression of each gene in the genome by controlling whether and how vigorously that gene is transcribed into RNA)
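
As a rough illustration of the first task, the sketch below groups genes whose expression profiles are strongly correlated across conditions. It is only a minimal example under stated assumptions: the gene names, expression values, the 0.9 correlation threshold, and the helper names `pearson` and `cluster_genes` are all hypothetical, and single-linkage grouping on Pearson correlation is just one of many possible clustering choices.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_genes(profiles, threshold=0.9):
    """Single-linkage grouping: merge genes whose correlation is >= threshold."""
    names = list(profiles)
    parent = {g: g for g in names}  # union-find forest, one tree per cluster

    def find(g):
        # Follow parent links to the cluster representative (with path halving).
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for g in names:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(c) for c in clusters.values())


# Made-up expression levels for four hypothetical genes under four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.1, 6.0, 3.9, 2.0],
}
clusters = cluster_genes(profiles)
print(clusters)
```

In this toy data, geneA and geneB rise together across conditions while geneC and geneD fall together, so the two pairs end up in separate groups. In the setting described above, an unannotated gene that lands in a group of genes with known function would become a candidate for the same functional class.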