Short bio

Since September 2018, I have been an Associate Professor of Bioinformatics at UCLouvain, in Belgium, and director of the Computational Biology and Bioinformatics (CBIO) group. I am located in the de Duve Institute, on the medical campus in Brussels, where I run a research group and teach at the Faculty of Pharmacy and Biomedical Sciences (FASB).

I am an avid open research advocate and make every possible effort to make my research reproducible and openly available. I am a Software Sustainability Institute fellow, a Data and Software Carpentry instructor, and an affiliated member of the Bioconductor project.

I am also involved in the Bullied Into Bad Science campaign, an initiative by and for early career researchers who aim for a fairer, more open and ethical research and publication environment. In 2017-2018, I was also part of the eLife Early-career advisory group and, since 2017, I have been an #ASAPbio ambassador.

Before that, I was a Senior Research Associate in the Department of Biochemistry at the University of Cambridge. It’s at the Cambridge Centre for Proteomics and the Computational Proteomics Unit (my former research group in Cambridge) that I started working on various aspects of quantitative and spatial proteomics, developing new methods and implementing computational tools with a strong emphasis on rigorous and reproducible data analysis. I am also a visiting scientist in the PRIDE team at the European Bioinformatics Institute, and the Cambridge Computational Biology Institute.

And even before that, during my MSc and PhD work, I studied micro-evolutionary genetic patterns of the Broom leaf beetle Gonioctena variabilis in Southern Europe (Gatto et al., 2008), the application of short interspersed mobile elements (SINEs) to study the evolution of cetaceans, and the applicability of the General Time Reversible nucleotide substitution model in the light of differential lineage sorting (Gatto et al., 2006). I also spent 3 years in industry working on genomics and transcriptomics data, in particular the microarray quality control (Shi et al., 2010).

Research

As pointed out by D. Donoho, “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.” This directly applies to high throughput biology data analysis, and I strongly believe that being able to reproduce the complete set of results, replicate an analysis with new data and track the evolution of the work that led to the scientific novelty are essential aspects of the process of doing research. Hence, I regard the development of scientific software, as well as agile and robust analysis methodologies that facilitate reproducible research, as an important aspect of my scientific activity.

Clarity and traceability of the data and the analysis methodology enable us to better understand what we do, how and why we do it and consequently exploit the data and comprehend the biology. While not sufficient, these are nevertheless necessary requirements for effective data-driven science.

The collaborative and interdisciplinary nature of much of the research in biology calls for open approaches, influenced by the open source and free/libre software movements, from communication between stakeholders, open research and development to open dissemination of all research outputs.

Proteomics

My work on the design and implementation of reproducible mass spectrometry-based proteomics data analysis pipelines has materialised in the development of the MSnbase (Gatto et al., 2012) package to manipulate, process and analyse quantitative proteomics data. The MSnbase infrastructure also supports the work on statistical learning applied to spatial proteomics (see below). The synapter package and the associated publications (Bond et al., 2013 and Shliaha et al., 2013) address MSE label-free quantitation, optionally including ion mobility separation.

Spatial proteomics

In biology, localisation is function: knowledge of the localisation of proteins is of paramount importance to assess and study their function, and spatial proteomics is the systematic study of the sub-cellular localisation of proteins and changes thereof (Gatto et al., 2010). Since 2010, I have developed novel software and machine learning approaches enabling more reliable and systematic inference of protein localisations using quantitative proteomics. This work has materialised in the pRoloc package (Gatto et al., 2014) that implements various established classification algorithms, effective visualisation techniques (Gatto et al., 2015) as well as novelty detection (Breckels et al., 2013) and transfer learning, harvesting GO annotations of microscopy-based methods to improve the spatial resolution of experimental spatial proteomics data (Breckels et al., 2016).

Teaching

Over the years I have been involved in many teaching activities, including beginner and advanced R courses, genome biology, proteomics bioinformatics, integrative omics, and scientific computing as part of the MPhil in Computational Biology in Cambridge, as well as several Software and Data Carpentry bootcamps. All my teaching material is available in the TeachingMaterial repository.

Please do get in touch if you are interested in running workshops.

Publications

See also my Google Scholar profile.

Research articles

Oliver M. Crook, Laurent Gatto, Paul D. W. Kirk Fast approximate inference for variable selection in Dirichlet process mixtures, with an application to pan-cancer proteomics arXiv:1810.05450 2018.

Aikaterini Geladaki, Nina Kocevar Britovsek, Lisa M. Breckels, Tom S. Smith, Claire M. Mulvey, Oliver M. Crook, Laurent Gatto, Kathryn S. Lilley LOPIT-DC: A simpler approach to high-resolution spatial proteomics bioRxiv 378364; doi: 10.1101/378364.

Laurent Gatto, Lisa M Breckels, Kathryn S Lilley Assessing sub-cellular resolution in spatial proteomics experiments bioRxiv 377630; doi: 10.1101/377630.

Segeritz CP, Rashid ST, Cardoso de Brito M, Paola MS, Ordonez A, Morell CM, Kaserman JE, Madrigal P, Hannan N, Gatto L, Tan L, Wilson AA, Lilley K, Marciniak SJ, Gooptu B, Lomas DA, Vallier L. hiPSC hepatocyte model demonstrates the role of unfolded protein response and inflammatory networks in α(1)-antitrypsin deficiency. J Hepatol. 2018 Jun 4. pii: S0168-8278(18)32113-5. doi: 10.1016/j.jhep.2018.05.028.

Nett I, Mulas C, Gatto L, Lilley KS, Smith A. Negative feedback via RSK modulates Erk‐dependent progression from naïve pluripotency EMBO reports (2018) e45642 doi:10.15252/embr.201745642.

Crook OM, Mulvey CM, Kirk PDW, Lilley KS, Gatto L. A Bayesian Mixture Modelling Approach For Spatial Proteomics bioRxiv; doi: 10.1101/282269.

Thul PJ, et al. A subcellular map of the human proteome. Science. 2017 May 11. pii: eaal3321. doi:10.1126/science.aal3321. [Epub ahead of print] PubMed PMID:28495876.

Mulvey CM, Breckels LM, Geladaki A, Kocevar Britovsek N, Nightingale DJH, Christoforou A, Elzek M, Deery MJ, Gatto L, Lilley KS. Using HyperLOPIT to perform high-resolution mapping of the spatial proteome. Nature Protocols, 12, 1110–1135 (2017) doi:10.1038/nprot.2017.026 (See the F1000Research workflow for details on the computational side of the protocol.)

Leprevost FD, et al. BioContainers: An open-source and community-driven framework for software standardization. Bioinformatics. 2017 Mar 30. doi:10.1093/bioinformatics/btx192. [Epub ahead of print] PubMed PMID:28379341.

Breckels LM, Mulvey CM, Lilley KS and Gatto L. A Bioconductor workflow for processing and analysing spatial proteomics data F1000Research 2016, 5:2926 (doi:10.12688/f1000research.10411.1). [Software: MSnbase, pRoloc, pRolocGUI]

Wieczorek S, Combes F, Lazar C, Giai Gianetto Q, Gatto L, Dorffer A, Hesse A, Coute Y, Ferro M, Bruley C, and Burger T. DAPAR & ProStaR: software to perform statistical analyses in quantitative discovery proteomics Bioinformatics 2016, doi:10.1093/bioinformatics/btw580.

Perez-Riverol Y, Gatto L, Wang R, Sachsenberg T, Uszkoreit J, Leprevost Fda V, Fufezan C, Ternent T, Eglen SJ, Katz DS, Pollard TJ, Konovalov A, Flight RM, Blin K, Vizcaino JA. Ten Simple Rules for Taking Advantage of Git and GitHub. PLoS Comput Biol. 2016 Jul 14;12(7):e1004947. doi:10.1371/journal.pcbi.1004947 PMID:27415786.

Breckels LM, Holden S, Wojnar D, Mulvey CM, Christoforou A, Groen AJ, Kohlbacher O, Lilley KS, Gatto L. Learning from heterogeneous data sources: an application in spatial proteomics. PLoS Comput Biol. 2016 May 13;12(5):e1004920 doi:10.1371/journal.pcbi.1004920 (Software)

Fabre B, Korona D, Groen A, Vowinckel J, Gatto L, Deery MJ, Ralser M, Russell S, Lilley KS. Analysis of the Drosophila melanogaster proteome dynamics during the embryo early development by a combination of label-free proteomics approaches, Proteomics, 2016 (PMID:27029218, Publisher)

Lazar C, Gatto L, Ferro M, Bruley C, Burger T. Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. J Proteome Res. 2016 Apr 1;15(4):1116-25. (Publisher, PMID:26906401, Software: CRAN and Bioconductor)

Christoforou A, Mulvey CM, Breckels LM, Geladaki A, Hurrell T, Hayward PC, Naake T, Gatto L, Viner R, Arias AM, Lilley KS. A draft map of the mouse pluripotent stem cell spatial proteome. Nat Commun. 2016 Jan 12;7:9992 doi:10.1038/ncomms9992 (PMID:26754106, data, PRIDE, resource)

Gatto L, Hansen KD, Hoopmann MR, Hermjakob H, Kohlbacher O and Beyer A. Testing and validation of computational methods for mass spectrometry. J Proteome Res. 2015 (PubMed).

Mulvey CM, Schröter C, Gatto L, Dikicioglu D, Baris Fidaner I, Christoforou A, Deery MJ, Cho LT, Niakan KK, Martinez-Arias A, Lilley KS. Dynamic proteomic profiling of extra-embryonic endoderm differentiation in mouse embryonic stem cells. Stem Cells. 2015 Jun 8. doi: 10.1002/stem.2067 (PubMed).

Gatto L, Breckels LM, Naake T and Gibb S Visualisation of proteomics data using R and Bioconductor. Proteomics. 2015 Feb 18. doi:10.1002/pmic.201400392 (PubMed, Publisher and software: Bioconductor, github).

Huber W et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods. 2015 Jan 29;12(2):115-21 (PubMed, Publisher).

Hiemstra TF et al. Human urinary exosomes as innate immune effectors, J Am Soc Nephrol. 2014 Sep;25(9):2017-27. (PubMed, Publisher).

Nikolovski N, Shliaha PV, Gatto L, Dupree P and Lilley KS Label free protein quantification for plant Golgi protein localisation and abundance, Plant Physiol. pp.114.245589; First Published on August 13, 2014; doi:10.1104/pp.114.245589 (Publisher, PubMed)

Griss J, et al. The mzTab Data Exchange Format: communicating MS-based proteomics and metabolomics experimental results to a wider audience, Mol Cell Proteomics. 2014 June 30. (Publisher)

Tomizioli M, et al. Deciphering thylakoid sub-compartments using a mass spectrometry-based approach, Mol Cell Proteomics. 2014 May 28. (Publisher, PubMed)

Gatto L, et al. A foundation for reliable spatial proteomics data analysis, Mol Cell Proteomics. 2014 Aug;13(8):1937-52. (Publisher, PubMed, software, press coverage)

Walzer M, et al. qcML: an exchange format for quality control metrics from mass spectrometry experiments, Mol Cell Proteomics. 2014 Apr 23. (PubMed).

Vizcaíno J.A. et al. ProteomeXchange: globally co-ordinated proteomics data submission and dissemination, Nature Biotechnology 2014, 32, 223–226. (PubMed)

Gatto L., Breckels L.M, Burger T, Wieczorek S. and Lilley K.S. Mass-spectrometry based spatial proteomics data analysis using pRoloc and pRolocdata, Bioinformatics, 2014 (software, PubMed, publisher, software and data).

Groen A., Sancho-Andrés G., Breckels LM., Gatto L., Aniento F. and Lilley K.S. Identification of Trans Golgi Network proteins in Arabidopsis thaliana root tissue Journal of Proteome Research, 2013 (PubMed, publisher).

Wilf N.M. et al. RNA-seq reveals the RNA binding proteins, Hfq and RsmA, play various roles in virulence, antibiotic production and genomic flux in Serratia sp. ATCC 39006 BMC Genomics 2013, 14:822.

Gatto L. and Christoforou A. Using R and Bioconductor for proteomics data analysis, Biochim Biophys Acta - Proteins and Proteomics, 2013. (PubMed, pre-print and software: Bioconductor, github).

Bond N.J., Shliaha P.V, Lilley K.S., and Gatto L. Improving qualitative and quantitative performance for MSE-based label free proteomics, J. Proteome Res., 2013 (PubMed, publisher, software).

Shliaha P.V, Bond N.J., Gatto L. and Lilley K.S. The Effects of Travelling Wave Ion Mobility Separation on Data Independent Acquisition in Proteomics Studies, J. Proteome Res., 2013 (PubMed, publisher, software).

Breckels L.M., Gatto L., Christoforou A., Groen A.J., Lilley K.S. and Trotter M.W.B. The Effect of Organelle Discovery upon Sub-Cellular Protein Localisation, Journal of Proteomics, 2013 (PubMed, software).

Chambers M. et al. A Cross-platform Toolkit for Mass Spectrometry and Proteomics, Nature Biotechnology 30, 918–920, 2012 (PubMed, pdf, software [1|2]).

Gatto L. and Lilley K.S. MSnbase - an R/Bioconductor package for isobaric tagged mass spectrometry data visualisation, processing and quantitation, Bioinformatics, 28(2), 288-289, 2012 (PubMed, pdf, software).

Capuano F., Bond N.J., Gatto L., Beaudoin F., Napier J., Benvenuto E., Lilley K.S, and Baschieri S. LC-MS/MS methods for absolute quantification and identification of proteins associated to chimeric plant oil bodies, Analytical Chemistry, Dec 15;83(24):9267-72, 2011 (PubMed - data).

Foster J.M., Degroeve S., Gatto L., Visser, M., Wang, R., Griss J., Apweiler R. and Martens L. A posteriori quality control for the curation and reuse of public proteomics data, Proteomics 11(11):2182-94, 2011 (PubMed, pdf).

Lilley K.S., Deery M.J. and Gatto L. Challenges for Proteomics Core Facilities, Proteomics 11: 1017–1025, 2011 (PubMed, pdf).

Gatto L., Vizcaíno J.A., Hermjakob H., Huber W. and Lilley K.S. Organelle proteomics experimental designs and analysis Proteomics, 10:22, 3957-3969, 2010 (PubMed, pdf).

MAQC Consortium The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models Nature Biotechnology 28, 827–838 2010 (PubMed, pdf).

Gatto L., Mardulyn P. and Pasteels J.M. Morphological and mitochondrial DNA analysis suggest the presence of a hybrid zone between two species of the leaf beetle Gonioctena variabilis species complex in southern Spain, Biological Journal of the Linnean Society, 2008, 94(1), 105-114 (abstract, pdf).

Danis B., George T.C., Goriely S., Dutta B., Renneson J., Gatto L., Fitzgerald-Bocarsly P., Marchant A., Goldman M., Willems F. and De Wit D. Interferon regulatory factor 7-mediated responses are defective in cord blood plasmacytoid dendritic cells. Eur J Immunol. 2008 Feb;38(2):507-17. (PubMed, pdf).

Gatto L., Catanzaro D. and Milinkovitch M.C. Assessing the Applicability of the GTR Nucleotide Substitution Model Through Simulations Evolutionary Bioinformatics 2006:2 (PubMed, pdf).

Book chapters

Christoforou A., Mulvey C., Breckels LM., Gatto L. and Lilley KS. Spatial Proteomics: Practical Considerations for Data Acquisition and Analysis in Protein Subcellular Localisation Studies in Quantitative Proteomics, 185-210, The Royal Society of Chemistry, 2014.

Breckels LM, Gibb S, Petyuk V and Gatto L R for Proteomics in Proteome Informatics, The Royal Society of Chemistry, November 2016.

Technical Notes

Gatto L. Data Management Plan for a Biotechnology and Biological Sciences Research Council (BBSRC) Tools and Resources Development Fund (TRDF) Grant, Research Ideas and Outcomes (2017), doi:10.3897/rio.3.e11624.

Gatto, L. and Schretter, C. Designing Primer Pairs and Oligos with OligoFaktorySE. EMBnet.news North America, 15, oct. 2009 (pdf,software).

Schretter, C. and Gatto, L. A Tiny Queuing System for Blast Servers December, 2005 (short and slightly longer versions).

Web/press coverage

News in Proteomics Research blog posts about R for proteomics, R and Bioconductor, spatial proteomics and handling of missing values.

GenomeWeb Statistical Design Remains Sticking Point for Proteomics Experiments Jul 27, 2017.

Publishers put the squeeze on ResearchGate, 11 Oct 2017.

Nature News & Comment Correspondence Preprints help journalism, not hinder it, 29 Aug 2018.

Software

I have developed and contributed to many open source R/Bioconductor packages, in particular proteomics software and data packages, all of which are available on my own and my group’s GitHub pages. See Gatto and Christoforou, 2014, Gatto et al., 2014 and the RforProteomics vignettes for an overview of the R/Bioconductor infrastructure for mass spectrometry and proteomics.

Talks

Forthcoming talks:

  • Current plans are to be at the Howard Hughes Medical Institute in Chevy Chase, MD in February 2018 and at the Northeastern University in Boston, MA in May 2018.

  • A talk presenting R/Bioconductor for proteomics and applications at the Sainsbury Laboratory in Norwich on 15 January 2018.

Open source and open development proteomics software at the EuBIC 2018 developer’s meeting, 9 - 12 January 2018, Ghent, Belgium.

A longer talk about the Bullied Into Bad Science campaign at the OpenConCam 2017 conference on 16 November 2017. The content and slides are available here.

A short talk, I won’t be #BulliedIntoBadScience!, as part of the panel discussion Next-generation initiatives advancing open, at OpenCon, 11 - 13 November 2017, Berlin.

Mapping the sub-cellular proteome, 8 November 2017, Leibniz Institute on Aging, Jena, Germany.

Open Science in Practice, 25 September 2017, Lausanne, Switzerland. An early career researcher’s view on modern and open scholarship.

Proteomics Method Forum, Oxford, UK, 22-23 June 2017. The Bioconductor project - analysis and comprehension of high-throughput proteomics data.

Research Data Management Forum, London, UK, 9th June 2017. An early career researcher’s view on modern and open scholarship … and careers.

Office of Scholarly Communication Training - How to Get the Most Out of Modern Peer Review, Cambridge, UK, 30 Mar 2017. The role of peer-reviewers in promoting open science.

European Bioconductor Developer Meeting, Zurich, Switzerland, 6 - 7 December 2016. MSnbase2 - disk access is the limit.

Cambridge Computational Biology Institute, UK, 16 November 2016. Mapping the sub-cellular proteome: Computational analyses of high-throughput mass spectrometry-based spatial proteomics data.

Dialogue on methods for ecology, Cambridge, UK, 15 November 2016, Learning from heterogeneous data in spatial proteomics.

Quantitative Proteomics and Data Analysis, Chester, UK, 4 - 5 April 2016. Inspection, visualisation and analysis of quantitative proteomics data (slides, vignette).

Introduction to Integrative Omics: proteomics, European Bioinformatics Institute, Hinxton, UK, 8 March 2016.
