At DNAlytics, we analyze large data sets every day in a predictive context, for diagnostic, prognostic, or theranostic purposes. On top of predictive modeling, we also have to identify, among a wide set of candidate markers, the relevant “markers” to incorporate in those models.

Whether those data sets qualify as “Big Data” is open to debate. Big Data is generally understood as a large number of observations, each described by a limited number of features. In our context of clinical research, the situation is generally the opposite: a limited number of observations (generally patients) described by a very high number of features (candidate markers).

That is why, at DNAlytics, we decided to develop and/or maintain a series of key software elements supporting our analysis effort. We do not commercialize these pieces of software. Rather, we make the best use of them for you! Most of our developments are done in R, an open source language that is very common in the biostatistics, bioinformatics, and data mining communities. Below you will find several software elements that we develop and/or maintain, and for which comments and feedback are welcome!

REED – Rapid and Easy Evaluation of Datasets

Predictive modeling is a complex science. But what is more frustrating than obtaining poor results, or no results at all, after having invested time and money in a data mining project? At DNAlytics, we clearly understand this. On our side, it is also a pity to have to announce such a poor project outcome to our customers. We definitely don’t like it. That is why we now offer a very fast evaluation of the potential value of your data, and for free!



LIBLINEAR is a software library for classification and regression on very large data sets. DNAlytics has incorporated this software, originally developed in C/C++, into the R framework.



jForest is a general machine learning framework implementing tree-ensemble-based classification methods. The package comes in Java and R flavors.



BLISS stands for Biomarker List Interpretation Simple Software. Because nothing makes you lose more time than performing a bibliographic, functional, pathway, drug, and mutation analysis based on a list of candidate biomarkers, let BLISS do the work for you!
