Genie finds meaningful clusters and is fast even on large data sets.
stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package for very fast, portable, correct, consistent, and convenient string/text processing in any locale or character encoding. It is one of the most often downloaded packages on CRAN.
realtest: When Expectations Meet Reality: Realistic Unit Testing in R
TurtleGraphics: Learn R Programming While Having a Jolly Time
(maintained by Barbara Żogała-Siudem)
agop: Aggregation Operators and Preordered Sets in R
SimilaR: R Source Code Similarity Evaluation
(maintained by Maciej Bartoszuk)
FuzzyNumbers: Tools to Deal with Fuzzy Numbers in R
CITAN: CITation ANalysis Toolpack [deprecated]
See comment lines for a detailed description of each dataset.
airlines <- read.csv("nycflights13_airlines.csv.gz", comment.char="#")
head(airlines)
##   carrier                     name
## 1      9E        Endeavor Air Inc.
## 2      AA   American Airlines Inc.
## 3      AS     Alaska Airlines Inc.
## 4      B6          JetBlue Airways
## 5      DL     Delta Air Lines Inc.
## 6      EV ExpressJet Airlines Inc.
import pandas as pd
airlines = pd.read_csv("nycflights13_airlines.csv.gz", comment="#", compression="gzip")
airlines.head()
To print comment lines, call, e.g.:
import gzip
with gzip.open("nycflights13_airlines.csv.gz", "rt") as f:
    while True:
        x = f.readline().strip()
        if not x.startswith("#"): break
        print(x)
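If both the comment lines and the data are needed at once, the two steps above can be combined. The following is only a minimal sketch using the Python standard library instead of pandas. The helper name read_commented_csv and the small demo file are hypothetical, made up for illustration, and not part of the datasets. The sketch also assumes that comment lines always start with "#" in the first column.

```python
import csv
import gzip
import os
import tempfile


def read_commented_csv(path):
    """Return (comments, rows) from a gzipped CSV whose comment lines start with '#'."""
    comments, data_lines = [], []
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        for line in f:
            if line.startswith("#"):
                # Keep the comment text without the leading '#' and whitespace.
                comments.append(line.lstrip("#").strip())
            else:
                data_lines.append(line)
    # The first non-comment line is treated as the CSV header.
    rows = list(csv.DictReader(data_lines))
    return comments, rows


# Build a tiny stand-in file (made-up contents, not the real dataset).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv.gz")
with gzip.open(demo_path, "wt", encoding="utf-8", newline="") as f:
    f.write("# demo: carrier codes\n# source: made up for this sketch\n")
    f.write("carrier,name\n9E,Endeavor Air Inc.\nAA,American Airlines Inc.\n")

comments, rows = read_commented_csv(demo_path)
print(comments)
print(rows[0]["name"])
```

With the real files, the same helper could be called on, e.g., "nycflights13_airlines.csv.gz". Note that, unlike comment="#" in pandas, this sketch does not strip a "#" that appears in the middle of a data line.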
Licensed under CC-by-SA 3.0; see readme.txt for more details.
Hadley Wickham's nycflights13-0.2.1 (licensed under CC0, gzipped) – on-time data for all flights that departed NYC (i.e., JFK, LGA, or EWR) in 2013:
Hadley Wickham's babynames-0.2.1 (licensed under CC0, gzipped) – US Baby Names 1880-2014:
Hadley Wickham's fueleconomy-0.1 (licensed under CC0, gzipped) – fuel economy data from the EPA, 1985-2015:
Hadley Wickham's nasaweather-0.1 (licensed under GPL-3, gzipped):
The following datasets are included in the datasets package for GNU R:
"If you can implement something, this means you understand it."
Nowadays I develop most of my software with: