Marek Gagolewski

Dr habil. Marek Gagolewski

(Pronounced like Mark Gaggle-Eve-Ski)

Senior Lecturer in Applied Artificial Intelligence
School of Information Technology, Deakin University
Melbourne-Burwood Campus, Room T2.20, 221 Burwood Hwy, Burwood, VIC 3125, Australia
Associate Professor in Computational Data Science (on long-term leave)
Faculty of Mathematics and Information Science, Warsaw University of Technology
ul. Koszykowa 75, 00-662 Warsaw, Poland
Emails (pick one – and only one):
marekgagolewski·com (main)
m.gagolewskideakin·edu·au (academic)
ORCID: 0000-0003-0637-6028
See also: CV


Researcher in the Science of Data (with particular emphasis on modelling complex phenomena and developing usable, general-purpose algorithms)

  • Research interests: machine learning; data fusion, aggregation, and clustering; computational statistics; mathematical modelling in informetrics, sports analytics, and science of science – amongst others
  • Author or co-author of 75 publications (see featured papers), including 33 journal papers in outlets such as Proceedings of the National Academy of Sciences (PNAS), Information Fusion, International Journal of Forecasting, Statistical Modelling, R Journal, Information Sciences, IEEE Transactions on Fuzzy Systems, and Journal of Informetrics
  • Eligible Principal Supervisor at PhD level (principal supervisor of 3 PhD and 11 MSc by research students from commencement through to successful completion) – feel welcome to contact me if you have interesting research ideas and the essential capabilities: programming skills (Python, R, C, etc.), data wrangling, matrix algebra, probability and statistics, and optimisation

Free (Libre) and Open Source Data Analysis Software Developer

Data Science, Machine Learning, and Statistical Computing Tutor & Trainer

Recent News

2020-11-23 new paper

Interpretable sport team rating models based on the gradient descent algorithm

Jan Lasek and I authored a paper that will soon appear in the International Journal of Forecasting, where we introduce several new (and efficient) rating models for teams (football/soccer in particular) based on the gradient descent algorithm. Read more…

2020-11-13 research grant

ARC 2021 Discovery Project

Our (Gleb Beliakov, Simon James, and yours truly) 2021 Discovery Project Beyond black-box models: Interaction in eXplainable Artificial Intelligence has been approved by the Australian Research Council. Read more…

2020-09-09 software

R Package stringi 1.5.3 Released

A new, major release of my R package stringi brings quite a few new features and bug fixes. Read more…

2020-09-07 software

Tutorial on stringi

A comprehensive tutorial on the stringi package is now available.

2020-08-17 software

stringi Has a New Website

I have created a new home(page) for my stringi package.

2020-07-31 software

Python and R package genieclust 0.9.4

A reimplementation of my robust hierarchical clustering algorithm Genie is now available on PyPI and CRAN. It is now even faster and equipped with many more features, including noise point detection. More details, documentation, benchmarks, and tutorials are available online.

2020-07-08 new paper

Paper on SimilaR in R Journal

SimilaR: R Code Clone and Plagiarism Detection by Maciej Bartoszuk and me has been accepted for publication in the R Journal. Read more…

2020-06-08 new paper

Paper in PNAS: Three Dimensions of Scientific Impact

In a paper recently published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS) (doi:10.1073/pnas.2001064117; joint work with Grzesiek Siudem, Basia Żogała-Siudem and Ania Cena), we consider the mechanisms behind one’s research success as measured by one’s papers’ citability. By acknowledging that the perceived esteem might be a consequence not only of how valuable one’s works are but also of pure luck, we arrived at a model that can accurately recreate a citation record based on just three parameters: the number of publications, the total number of citations, and the degree of randomness in the citation patterns. As a by-product, we show that a single index will never be able to embrace the complex reality of scientific impact. However, three indices can already provide a reliable summary. Read more…

Browse all news