stringi 1.7.2
Another major update of `stringi` brings a rewritten version of `stri_sprintf`, support for custom rule-based transliteration, extraction of named regex capture groups, and many other enhancements.
Changes since v1.6.2:
- [BACKWARD INCOMPATIBILITY] `%s$%` and `%stri$%` now use the new `stri_sprintf` function (see below) instead of `base::sprintf`.
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub<-` and `stri_sub_all<-`, providing a negative `length` no longer results in the corresponding input string being altered.
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub` and `stri_sub_all`, a negative `length` results in the corresponding output being `NA` or not extracted at all, depending on the setting of the new argument `ignore_negative_length`.
- [BACKWARD INCOMPATIBILITY, BUGFIX, NEW FEATURE] In `stri_subset*` and their replacement versions, `pattern` and `value` cannot be longer than `str` (but now they are recycled if necessary).
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] `stri_sub*` now accept the `from` argument being a matrix like `cbind(from, length=length)`. Unnamed columns or any other names are still interpreted as `cbind(from, to)`. Also, the new argument `use_matrix` can be used to disable the special treatment of such matrices.
- [DOCUMENTATION] It has been clarified that the syntax of `*_charclass` (e.g., used in `stri_trim*`) differs slightly from regex character classes.
- [NEW FEATURE] #420: `stri_sprintf` (alias: `stri_string_format`) is a Unicode-aware replacement for and enhancement of the base `sprintf`: it adds a customised handling of `NA`s (on demand), computing field size based on code point width, outputting substrings of at most given width, variable width and precision (both at the same time), etc. Moreover, `stri_printf` can be used to display formatted strings conveniently.
- [NEW FEATURE] #153: `stri_match_*_regex` now extract capture group names.
- [NEW FEATURE] #25: `stri_locate_*_regex` now have a new argument, `capture_groups`, which allows for extracting positions of matches to parenthesised subexpressions.
- [NEW FEATURE] `stri_locate_*` now have a new argument, `get_length`, whose setting may result in generating from-length matrices (instead of from-to ones).
- [NEW FEATURE] #438: `stri_trans_general` now supports rule-based as well as reverse-direction transliteration.
- [NEW FEATURE] #434: `stri_datetime_format` and `stri_datetime_parse` are now also vectorised with respect to the `format` argument.
- [NEW FEATURE] `stri_datetime_fstr` has a new argument, `ignore_special`, which defaults to `TRUE` for backward compatibility.
- [NEW FEATURE] `stri_datetime_format`, `stri_datetime_add`, and `stri_datetime_fields` now call `as.POSIXct` more eagerly.
- [NEW FEATURE] `stri_trim*` now have a new argument, `negate`.
- [NEW FEATURE] `stri_replace_rstr` converts `gsub`-style replacement strings to `stri_replace`-style.
- [INTERNAL] `stri_prepare_arg*` have been refactored; buffer overruns in the exception handling subsystem are now avoided.
- [BUGFIX] A few functions (`stri_length`, `stri_enc_toutf32`, etc.) did not throw an exception on an invalid UTF-8 byte sequence (and merely issued a warning instead).
- [BUGFIX] `stri_datetime_fstr` did not honour `NA_character_` and did not parse format strings such as `"%Y%m%d"` correctly. It has now been completely rewritten (in C).
- [BUGFIX] `stri_wrap` did not recognise the width of certain Unicode sequences correctly.
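
A few of the changes listed above can be tried out with a minimal sketch like the one below. It assumes stringi 1.7.2 or later is installed; the input strings and variable names are made up for illustration and are not taken from the package documentation.

```r
library(stringi)

# stri_sprintf: a Unicode-aware formatter; here a string is right-aligned
# in a field of width 5
padded <- stri_sprintf("[%5s]", "abc")
print(padded)

# stri_match_first_regex: names of capture groups are used as column names
m <- stri_match_first_regex("lunch=pizza", "(?<when>\\w+)=(?<what>\\w+)")
print(m[1, "what"])

# stri_sub: a matrix whose second column is named "length" is interpreted
# as (from, length) rather than (from, to)
piece <- stri_sub("hello world", cbind(7, length = 5))
print(piece)
```

Passing `use_matrix=FALSE` to `stri_sub` should turn off the special treatment of matrix-valued `from` shown in the last call.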