Abstract
In Systems Biology, gene expression data are crucial for designing biological system circuitry. While clustering and soft computing techniques are commonly used for classification, Information Theory-based entropy functions, particularly multivariate entropy, remain underutilized for deriving biological inferences. With the advent of high-throughput data acquisition systems, more quantitative data are now available, which increases the relevance of Information Theory-based applications and, at the same time, creates a demand for a user-friendly, automated analytical framework. This work presents an automated computational framework for the systematic exploration of molecular data, designed to facilitate the construction of biological process-based networks. Algorithms based on multivariate Information Theory have been implemented on three platforms: one proprietary environment (MATLAB) and two open-source environments (GNU Octave and Python). All implementations are ready to use, allowing researchers to analyze their data on the platform of their choice. The algorithms have been successfully tested on published gene expression datasets.
