Marek Gagolewski — Home Page

Recommended literature for data science students (undergraduate)

This list is a work in progress.

Last update: 2025-04-28.

See also: the postgraduate version.

Introductory Mathematics

  1. Rasiowa, H., Introduction to Modern Mathematics, North Holland, 2014 (Polish 🇵🇱: Wstęp do matematyki współczesnej, PWN, 2013)

  2. Deisenroth, M.P., Faisal, A.A., Ong, C.S., Mathematics for Machine Learning, Cambridge University Press, 2020 🔓

  3. Bishop, C., Pattern Recognition and Machine Learning, Springer, 2006 🔓

    • Chapters 1–5 give a good overview of the kind of maths you are expected to master

  4. Gentle, J.E., Matrix Algebra: Theory, Computations and Applications in Statistics, Springer, 2024

    • Probably not for beginners, but a good source for a second exposure to the topic

Probability and Statistics

  1. Bartoszyński, R., Niewiadomska-Bugaj, M., Probability and Statistical Inference, Wiley, 2007

  2. Efron, B., Hastie, T., Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, Cambridge University Press, 2016

  3. Gentle, J.E., Random Number Generation and Monte Carlo Methods, Springer, 2003

  4. Gentle, J.E., Theory of Statistics (draft), 2020 🔓

  5. Gentle, J.E., Computational Statistics, Springer, 2009

  6. Tufte, E.R., The Visual Display of Quantitative Information, Graphics Press, 2001

Programming, Algorithms

  1. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C., Introduction to Algorithms, MIT Press and McGraw-Hill, 2009

  2. Gagolewski, M., Deep R Programming, 2025 🔓

  3. Gagolewski, M., Minimalist Data Wrangling with Python, 2025 🔓

  4. Knuth, D.E., The Art of Computer Programming I: Fundamental Algorithms, Addison-Wesley, 1997

  5. Knuth, D.E., The Art of Computer Programming II: Seminumerical Algorithms, Addison-Wesley, 1997

  6. Knuth, D.E., The Art of Computer Programming III: Sorting and Searching, Addison-Wesley, 1997

Machine/Statistical Learning and Data Mining

  1. Aggarwal, C.C., Data Mining. The Textbook, Springer, 2015

  2. Blum, A., Hopcroft, J., Kannan, R., Foundations of Data Science, Cambridge University Press, 2020 🔓

    • Beginner students should note an interesting chapter on the curse of dimensionality (Chap. 2)

    • Applications of the SVD matrix factorisation in data science (Chap. 3)

  3. Murphy, K.P., Probabilistic Machine Learning: An Introduction, MIT Press, 2022 🔓

  4. Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning, Springer, 2017 🔓

  5. Devroye, L., Györfi, L., Lugosi, G., A Probabilistic Theory of Pattern Recognition, Springer, 1996

  6. Koronacki, J., Ćwik, J., Statystyczne systemy uczące się, EXIT, 2008 🇵🇱

Exploratory Data Analysis, Visualisation, Scientific Writing, etc.

  1. Anna Kozak’s lecture notes – Exploratory Data Analysis, Data Visualisation Techniques

  2. Oetiker, T. et al., The Not So Short Introduction to LaTeX 2ε, 2023 🔓

  3. Trzeciak, J., Writing Mathematical Papers in English: A Practical Guide, EMS Press, 2005 (see also his Mathematical English Usage – a Dictionary: https://emis.de/monographs/Trzeciak/)

  4. Tufte, E.R., The Visual Display of Quantitative Information, Graphics Press, 2001

Other

  1. Ginsberg, B., The Fall of the Faculty: The Rise of the All-Administrative University and Why It Matters, OUP, 2011

  2. Goldacre, B., Bad Science, Fourth Estate, 2008

  3. Goldacre, B., Bad Pharma, Fourth Estate, 2012

  4. Spicer, A., Business Bullshit, Routledge, 2018

  5. Bergstrom, C.T., West, J.D., Modern-day oracles or bullshit machines? How to thrive in a ChatGPT world, 2025

Copyright © by Marek Gagolewski. Some rights reserved. Licensed under CC BY-NC-ND 4.0. Built with Sphinx and a customised Furo theme. Last updated on 2025-05-19T14:06:26+0200. This site will never display any ads: it is a non-profit project. It does not collect any data.