Software¶
 
I’ve been programming since the age of 7; my first computer was the C64.
I believe most software should be free.
Shortcuts:
○ genieclust (Python+R)
○ stringi (R+[1])
○ clustering-benchmarks (language-agnostic)
○ quitefastmst (Python+R)
○ stringx (R)
○ realtest (R)
My developer (“social media”) profiles: ○ GitHub ○ StackOverflow
Check out my open-access textbooks: ○ Deep R Programming ○ Minimalist Data Wrangling with Python
genieclust  (Python+R)¶
 
Fast and robust hierarchical clustering with noise point detection: Genie finds meaningful clusters. It does so quickly, even in large datasets.
- Paper on the - genieclustpackage in SoftwareX (doi:10.1016/j.softx.2021.100722)
- Paper on the Genie algorithm in Information Sciences (doi:10.1016/j.ins.2016.05.003) 
stringi (R)¶
 
stringi (pronounced “stringy”, IPA [strinɡi]) is THE R[1] package for very fast, portable, correct, consistent, and convenient string/text processing in any locale or character encoding. It is one of the most often downloaded packages on CRAN.
- Paper on - stringiin the Journal of Statistical Software (DOI:10.18637/jss.v103.i02)
clustering-benchmarks (Python, R, etc.)¶
 
A framework for benchmarking clustering algorithms
- Paper in SoftwareX (DOI:10.1016/j.softx.2022.101270) 
quitefastmst  (Python+R)¶
 
Euclidean and Mutual Reachability Minimum Spanning Trees
stringx (R)¶
 
Drop-in replacements for base string functions powered by
stringi
realtest (R)¶
 
Where expectations meet reality: Realistic unit testing in R
Other¶
- TurtleGraphics (R) (maintained by Barbara Żogała-Siudem) - Learn R programming while having a jolly time 
- agop (R) - Aggregation operators and preordered sets in R 
- genie (R) - A new, fast, and outlier resistant hierarchical clustering algorithm (superseded by - genieclust)
- SimilaR (R) (maintained by Maciej Bartoszuk) - R source code similarity evaluation - ○ CRAN ○ GitHub ○ Paper on the SimilaR package in the R Journal (DOI:10.32614/RJ-2020-017) 
- FuzzyNumbers (R) - Tools to deal with fuzzy numbers in R 
- CITAN (R) - CITation ANalysis toolpack [deprecated]