Data-driven precision medicine

Advanced data mining to enhance medical decisions based on complex, heterogeneous and scarce health data.


Learn about our data science expertise in precision medicine:

  1. What challenges do we address?
  2. What technology do we master?
  3. What do we publish about our projects and the projects of our customers?


Precision medicine challenges

Health care is facing several challenges:

  • Many medical indications are still subject to wrong or inefficient diagnoses and therapeutic choices.
  • The industry struggles to come up with new, effective therapeutics.
  • Medical technologies have huge societal costs.
  • Health payers' budgets are under ever-increasing pressure.
  • Patients are increasingly empowered and ask for increasingly tailored management of their health.
  • In all fields of health care, from fundamental research to market surveys, we observe a data deluge.

In that context, DNAlytics offers a tailored and on-demand consultancy service in data mining. By making use of Artificial Intelligence techniques, we build new decision-support tools from public and private datasets. As an introduction to what we do, we suggest watching the short video at the top of this page.

Our data mining platform is compatible with many data production technologies and with a wide range of medical areas and needs. Although this is not always explicit from the start, most of the projects we work on pursue three objectives:

  1. Build and validate predictive solutions
  2. Identify a small set of relevant (bio)markers from a vast set of possible markers
  3. Prototype software applications making those solutions available to practitioners
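
As a rough illustration of how the first two objectives interact, here is a minimal sketch in plain Python. It is not DNAlytics' actual pipeline: the toy data, the function names (`top_k_markers`, `nearest_centroid_predict`, `loo_accuracy`, `make_toy_data`), the mean-difference ranking and the nearest-centroid classifier are all assumptions chosen for brevity. What it shows is that marker selection is repeated inside each validation fold, so that the held-out patient never influences which markers are kept.

```python
import random


def top_k_markers(X, y, k):
    """Rank features by absolute difference of class means; return k indices."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(sum(pos) / len(pos) - sum(neg) / len(neg)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(X, y, markers, sample):
    """Predict the class whose centroid (on the selected markers) is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in markers]
        dist = sum((sample[j] - c) ** 2 for j, c in zip(markers, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, k):
    """Leave-one-out validation with marker selection redone in every fold."""
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        markers = top_k_markers(X_train, y_train, k)  # training patients only
        hits += nearest_centroid_predict(X_train, y_train, markers, X[i]) == y[i]
    return hits / len(X)


def make_toy_data(n_patients=20, n_features=200, n_informative=3,
                  shift=5.0, seed=0):
    """Hypothetical data: Gaussian noise, plus a shift on the first features."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_patients)]
    X = []
    for label in y:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += shift * label
        X.append(row)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("selected markers:", sorted(top_k_markers(X, y, 3)))
    print("leave-one-out accuracy:", loo_accuracy(X, y, 3))
```

Selecting markers once on the full dataset and validating afterwards would leak information from the held-out patients and tends to give over-optimistic accuracy estimates, which is why the selection step sits inside the loop.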

In most of our projects, we use software components that combine publicly available code with our own code library. A part of our website is dedicated to the relevant software components developed and/or maintained at DNAlytics. We also invite you to read about some of the projects we have carried out in the past.

About Big Data: In most cases, when the GAFAs (Google, Amazon, Facebook, Apple, …) discuss the concept of “big data”, they refer to their own situation, where many observations (millions of users) are available, each of them described by a limited set of features. In this context, statistical learning, i.e. Machine Learning, is a comparatively easy task. In medicine, the situation is usually the opposite: due to financial, ethical and logistical constraints, databases generally include a very limited number of observations (patients) relative to the number of features describing each of them (tens of thousands of genes and other covariates). In that context, Machine Learning is much harder, and it is precisely the context in which DNAlytics has developed its own recognized expertise. That, too, makes us unique.
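
One way to see why the "few patients, many features" setting is harder is the following sketch in plain Python. It is an illustration under stated assumptions, not a description of our methods: the labels and features are pure random noise, and the function names (`pearson`, `best_noise_correlation`) and the sizes used are hypothetical. It reports the strongest absolute correlation found between any noise feature and the labels as the number of features grows.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def best_noise_correlation(n_patients=20, feature_counts=(10, 100, 5000),
                           seed=0):
    """Strongest |correlation| with the labels among the first k noise features.

    Labels and features are random, so any strong correlation is spurious.
    """
    rng = random.Random(seed)
    labels = [0] * (n_patients // 2) + [1] * (n_patients - n_patients // 2)
    rng.shuffle(labels)

    targets = sorted(feature_counts)
    best, results = 0.0, {}
    for j in range(1, targets[-1] + 1):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        best = max(best, abs(pearson(feature, labels)))
        if j in targets:
            results[j] = best  # running maximum over the first j features
    return results


if __name__ == "__main__":
    for k, r in best_noise_correlation().items():
        print(f"{k:>5} noise features: best |correlation| = {r:.2f}")
```

Because the maximum is taken over a growing set of features, it can only increase with the number of features screened. With a small number of patients, some purely random feature will typically look like a convincing marker, which is why careful validation matters in this setting.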



Our core expertise is in data science. We build predictive models and identify which markers, or combinations of markers (in a broad sense), should contribute to these models. These models make it possible to produce predictions such as diagnosis, prognosis, treatment guidance, adverse event prediction, etc.

We are able to deal with very large datasets (many patients) and very broad datasets (many features). To manage and analyse data, we rely on our own code, written mainly in R and C, and on various open-source libraries.

To obtain the computing power we need, we make heavy use of cloud computing solutions such as Amazon Web Services (AWS). We are an AWS Consulting Partner.

We are able to deal with very large datasets of many different kinds: epigenetics (e.g. methylation), genetics (DNA), transcriptomics (mRNA, lncRNA, miRNA, …), proteomics (e.g. mass spectrometry), ELISA, metabolomics, clinical, epidemiological, psychological and demographic data.

These data can come from many data production technologies: Ion Torrent, Affymetrix, Illumina, TaqMan, …

On top of that, we can tap into publicly available datasets to complete the data provided by our customers.

We also master more classical statistics (hypothesis testing; design, writing and execution of statistical analysis plans). When conducting clinical studies, we can also deploy electronic data capture (EDC) tools, such as OpenClinica.



Here is a list of our publications (scientific communications, patents and software libraries).

Scientific communication


To which we contributed as (co-)inventors, or on which we have a license:


For more information about our software libraries, go to the dedicated page.