Since Covert and Palsson published their study on transcriptional regulation in constraint-based metabolic models in 2002, more than fifty methods have been developed to investigate how integrating gene expression data affects the content and predictive accuracy of a GEM. In 2004, Åkesson et al. used gene expression data as an additional constraint on the metabolic fluxes in yeast. Afterwards, further algorithms were devised, e.g. GIMME, E-Flux, MBA, Moxley, MADE, RELATCH, INIT, mCADRE, tINIT, CORDA, E-Flux2, SPOT and FASTCORE. These algorithms differ in their assumptions and mathematical formulations. However, they can be classified in different ways:
In this study we focussed on the following integration methods:
Requires a core set of reactions.
The FASTCORE algorithm starts with a core set of reactions that are forced to be active in the final model. FASTCORE then finds a minimal number of additional reactions needed to support this core. Each iteration of the algorithm computes a new sparse mode that aims to maximize the support of the mode inside the core set while minimizing the support outside the core set.
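The notion of "support inside and outside the core set" can be made concrete with a small sketch. This is not the FASTCORE implementation (which solves linear programs); it only counts, for a given candidate mode, how many active reactions fall inside and outside the core. The reaction names, the flux values and the activity tolerance `eps` are hypothetical.

```python
def support(mode, eps=1e-4):
    """Reactions carrying non-negligible flux in a mode (eps is an assumed tolerance)."""
    return {rxn for rxn, flux in mode.items() if abs(flux) >= eps}


def score_mode(mode, core, eps=1e-4):
    """Return (active core reactions, active non-core reactions) for a mode."""
    active = support(mode, eps)
    return len(active & core), len(active - core)


# Hypothetical toy data: three core reactions and two candidate modes.
core = {"R1", "R2", "R3"}
mode_a = {"R1": 1.0, "R2": 1.0, "R3": 0.0, "R4": 1.0, "R5": 1.0}
mode_b = {"R1": 1.0, "R2": 1.0, "R3": 1.0, "R4": 1.0, "R5": 0.0}

print(score_mode(mode_a, core))  # (2, 2)
print(score_mode(mode_b, core))  # (3, 1)
```

In this toy comparison `mode_b` is the better candidate, since it activates more of the core while using fewer non-core reactions.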
Requires one set of expression data
Requires thresholding
Requires objective function
The GIMME algorithm first runs FBA to calculate the maximum possible flux through the stated functionalities (e.g. growth rate). Then, GIMME eliminates reactions whose mRNA transcript levels are below a given threshold. However, if the resulting model is no longer functional, i.e. it cannot achieve the desired objective, GIMME adds sets of the removed reactions back into the model in a way that minimizes the deviation from the expression data.
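One common way to quantify the "deviation from the expression data" is a penalty that grows with the flux through each below-threshold reaction, weighted by how far its expression lies under the threshold. The sketch below only evaluates that penalty for given flux distributions; it does not run FBA or solve the optimization. The function name, reaction names, expression values and threshold are hypothetical.

```python
def gimme_inconsistency(fluxes, expression, threshold):
    """Sum of (threshold - expression) * |flux| over below-threshold reactions."""
    score = 0.0
    for rxn, flux in fluxes.items():
        level = expression.get(rxn)
        # Reactions without expression data are not penalized in this sketch.
        if level is not None and level < threshold:
            score += (threshold - level) * abs(flux)
    return score


# Hypothetical toy data: two routes that both reach the objective.
expression = {"R1": 12.0, "R2": 3.0, "R3": 8.0, "R4": 1.0}
threshold = 5.0
route_a = {"R1": 1.0, "R2": 1.0, "R3": 0.0, "R4": 0.0}
route_b = {"R1": 1.0, "R2": 0.0, "R3": 1.0, "R4": 1.0}

print(gimme_inconsistency(route_a, expression, threshold))  # 2.0
print(gimme_inconsistency(route_b, expression, threshold))  # 4.0
```

Both routes use one below-threshold reaction, but `route_a` would be preferred under this penalty because the reaction it reinstates (`R2`) is closer to the threshold than `R4`.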
Requires discretization of data (lowly and highly expressed)
Optionally takes a minimum flux threshold for "expressed" reactions
The INIT algorithm maximizes the activation of reactions associated with highly expressed genes, while minimizing the use of reactions associated with absent proteins. One of the new features of INIT is the relaxation of the steady-state condition to allow a net accumulation rate for internal species. This accumulation prevents the removal of the reactions that are essential for the synthesis of those species.
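The effect of relaxing the steady-state condition can be illustrated with a toy mass-balance check. This is a minimal sketch, not the INIT formulation itself: it only compares a strict balance (net production of every internal species equals zero) with a relaxed one (net production may be positive). The two-metabolite network, the function names and the tolerance `tol` are hypothetical.

```python
def net_production(stoich, fluxes):
    """Net production rate of each metabolite, i.e. one row of S times v."""
    return {
        met: sum(coef * fluxes[rxn] for rxn, coef in row.items())
        for met, row in stoich.items()
    }


def satisfies_balance(stoich, fluxes, allow_accumulation, tol=1e-9):
    """Check the strict (S v = 0) or relaxed (S v >= 0) mass balance."""
    for rate in net_production(stoich, fluxes).values():
        if rate < -tol:
            return False  # net consumption is never allowed here
        if not allow_accumulation and rate > tol:
            return False  # strict steady state forbids accumulation
    return True


# Hypothetical network: R1 produces A, R2 converts A to B, nothing consumes B.
stoich = {"A": {"R1": 1, "R2": -1}, "B": {"R2": 1}}
fluxes = {"R1": 1.0, "R2": 1.0}

print(satisfies_balance(stoich, fluxes, allow_accumulation=False))  # False
print(satisfies_balance(stoich, fluxes, allow_accumulation=True))   # True
```

Under the strict balance the only feasible flux through `R2` is zero, so the reaction that synthesizes `B` would be a candidate for removal; with accumulation allowed it can stay active.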
Requires one set of expression data
Requires discretization of data (lowly, moderately, and highly expressed)
Does not require an objective function
The iMAT algorithm categorizes gene expression data into three classes: highly, moderately, and lowly expressed genes. Then, iMAT solves a mixed integer linear programming (MILP) problem that maximizes the number of active reactions associated with highly expressed genes and minimizes the number of active reactions associated with lowly expressed genes. As in GIMME, reactions that conflict with the expression data may be kept when they are needed to obtain a functional model. However, unlike GIMME, iMAT does not need an objective function. Instead, iMAT requires that reactions classed as highly expressed carry a minimum flux in order to count as active.
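The three-class discretization and the quantity that the MILP tries to maximize can be sketched as follows. This only scores a given flux distribution against the classes; it does not solve the MILP. The cutoffs `low` and `high`, the minimum flux `eps`, and all reaction names and values are hypothetical.

```python
def discretize(expression, low, high):
    """Assign each reaction to 'low', 'moderate' or 'high' using assumed cutoffs."""
    classes = {}
    for rxn, level in expression.items():
        if level >= high:
            classes[rxn] = "high"
        elif level <= low:
            classes[rxn] = "low"
        else:
            classes[rxn] = "moderate"
    return classes


def imat_agreement(fluxes, classes, eps=1.0):
    """Count high reactions with |flux| >= eps plus low reactions with zero flux."""
    score = 0
    for rxn, label in classes.items():
        flux = abs(fluxes.get(rxn, 0.0))
        if label == "high" and flux >= eps:
            score += 1
        elif label == "low" and flux == 0.0:
            score += 1
    return score


# Hypothetical toy data.
expression = {"R1": 10.0, "R2": 0.5, "R3": 4.0, "R4": 9.0}
classes = discretize(expression, low=1.0, high=8.0)
fluxes = {"R1": 2.0, "R2": 0.0, "R3": 2.0, "R4": 0.0}

print(classes)
print(imat_agreement(fluxes, classes))  # 2
```

Here `R1` (high, active) and `R2` (low, inactive) agree with the data, `R4` (high, inactive) does not, and the moderately expressed `R3` does not contribute to the score either way.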
| In a Nutshell | FASTCORE | GIMME | INIT | iMAT |
|---|---|---|---|---|
| Optimization | LP | LP | MILP | MILP |
| Function required | | | | |
| Omics required | | | | |
| Computational cost | | | | |