abstract = "One of the long-standing open challenges in
computational systems biology is the topology inference
of gene regulatory networks from high-throughput omics
data. Recently, two community-wide efforts, DREAM4 and
DREAM5, have been established to benchmark network
inference techniques using gene expression
measurements. In these challenges, the overall top
performer was the GENIE3 algorithm. This method
decomposes the network inference task into separate
regression problems for each gene in the network in
which the expression values of a particular target gene
are predicted using all other genes as possible
predictors. Next, using tree-based ensemble methods, an
importance measure for each predictor gene is
calculated with respect to the target gene, and a high
feature importance is considered putative evidence
of a regulatory link between the two genes. The
contribution of this work is twofold. First, we
generalize the regression decomposition strategy of
GENIE3 to other feature importance methods. We compare
the performance of support vector regression, the
elastic net, random forest regression, symbolic
regression and their ensemble variants in this setting
to the original GENIE3 algorithm. To create the
ensemble variants, we propose a subsampling approach
which allows us to cast any feature selection algorithm
that produces a feature ranking into an ensemble
feature importance algorithm. We demonstrate that the
ensemble setting is key to the network inference task,
as only ensemble variants achieve top performance. As
a second contribution, we explore the effect of using
rank-wise averaged predictions of multiple ensemble
algorithms as opposed to only one. We name this
approach NIMEFI (Network Inference using Multiple
Ensemble Feature Importance algorithms) and show that
this approach outperforms all individual methods in
general, although on a specific network a single method
can perform better. An implementation of NIMEFI has
been made publicly available.",