This document discusses approaches for clustering gene expression microarray data, covering both the genes and the tissue samples. It describes modeling the tissue and gene spaces with mixture models, which account for correlations between genes and yield elliptical clusters. Dimensionality is reduced by first screening the genes and then clustering the retained genes into groups, so that tissues can be represented in a lower-dimensional space based on a smaller number of unobservable factors that regulate gene interactions. Overlapping gene sequences are grouped into superclusters, and hash tables that store pairwise sequence IDs allow the overlaps to be validated efficiently.
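
As a rough illustration of the supercluster idea, the sketch below records each pair of sequence IDs in a hash table so that a candidate overlap is validated at most once, then merges overlapping sequences into superclusters. It is a minimal sketch under stated assumptions, not the method the document itself uses: the function names `has_overlap` and `build_superclusters`, the suffix-prefix overlap test, the `min_len` threshold, and the union-find grouping are all hypothetical choices made for the example.

```python
def has_overlap(a, b, min_len):
    """Hypothetical overlap test: a suffix of one sequence equals a prefix
    of the other, with at least min_len matching characters."""
    for x, y in ((a, b), (b, a)):
        for k in range(min_len, min(len(x), len(y)) + 1):
            if x[-k:] == y[:k]:
                return True
    return False


def build_superclusters(seqs, candidate_pairs, min_len=3):
    """Group sequences into superclusters (assumed design, for illustration).

    seqs            -- dict mapping sequence ID -> sequence string
    candidate_pairs -- iterable of (id_a, id_b); may contain repeats
                       and the same pair in either order
    """
    # Union-find parent table: every sequence starts in its own cluster.
    parent = {sid: sid for sid in seqs}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Hash table keyed on the ordered ID pair; the value is the result of
    # the overlap validation for that pair.
    validated = {}
    for a, b in candidate_pairs:
        if a == b:
            continue
        key = (a, b) if a < b else (b, a)
        if key in validated:
            continue  # already checked, skip the expensive comparison
        validated[key] = has_overlap(seqs[key[0]], seqs[key[1]], min_len)
        if validated[key]:
            parent[find(key[0])] = find(key[1])

    # Collect the members of each supercluster by their root.
    clusters = {}
    for sid in seqs:
        clusters.setdefault(find(sid), []).append(sid)
    return sorted(sorted(c) for c in clusters.values()), validated


if __name__ == "__main__":
    seqs = {1: "ACGTAC", 2: "TACGGA", 3: "GGATTT", 4: "CCCCCC"}
    pairs = [(1, 2), (2, 1), (2, 3), (3, 4), (1, 2)]
    clusters, validated = build_superclusters(seqs, pairs)
    print(clusters)
    print(len(validated), "pairs validated")
```

Normalizing each pair to (smaller ID, larger ID) before the lookup means that repeated candidates, in either order, hit the same hash table entry, so the comparison cost is paid once per distinct pair.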