stringi-search-coll {stringi}    R Documentation

Locale-Sensitive Text Searching in stringi


Description

The string searching facilities described on this page locate a specific piece of text in a locale-sensitive way. Note that locale-sensitive searching, especially in non-English text, is a much more complex process than it seems at first glance.

Locale-Aware String Search Engine

All stri_*_coll functions in stringi use ICU's StringSearch engine, which implements a locale-sensitive string search algorithm. Matches are defined in terms of “canonical equivalence” between strings.
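As a minimal sketch of what canonical equivalence means in practice, the snippet below searches for a precomposed letter inside its decomposed form. The sample strings are illustrative assumptions, not taken from this page, and the outcome of the collator-based call may depend on the ICU version in use, so it is printed rather than asserted.

```r
library(stringi)

precomposed <- "\u0105"    # U+0105 LATIN SMALL LETTER A WITH OGONEK
decomposed  <- "a\u0328"   # "a" followed by U+0328 COMBINING OGONEK

# A fixed (code point-wise) search sees two different sequences,
# so it does not find the precomposed letter in the decomposed text.
fixed_hit <- stri_detect_fixed(decomposed, precomposed)

# A collator-based search compares collation elements instead, so
# canonically equivalent strings are expected to match.
coll_hit <- stri_detect_coll(decomposed, precomposed)

print(c(fixed = fixed_hit, coll = coll_hit))
```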

Tuning the Collator's parameters lets matching correctly account for accented letters, conjoined letters, ignorable punctuation, and letter case.

For more information on ICU's Collator and search engine, and on how to tune them in stringi, refer to stri_opts_collator.
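The effect of tuning can be sketched with the collator's strength setting. The example assumes the "en" locale and made-up sample words; it only illustrates how lowering strength relaxes matching with respect to case and accents.

```r
library(stringi)

text <- c("Resume", "r\u00e9sum\u00e9", "resume")

# strength = 3 (the default): case and accent differences both matter
s3 <- stri_detect_coll(text, "resume",
  opts_collator = stri_opts_collator(locale = "en", strength = 3))

# strength = 2: case differences are ignored, accents still matter
s2 <- stri_detect_coll(text, "resume",
  opts_collator = stri_opts_collator(locale = "en", strength = 2))

# strength = 1: only base letters are compared
s1 <- stri_detect_coll(text, "resume",
  opts_collator = stri_opts_collator(locale = "en", strength = 1))

print(rbind(s3, s2, s1))
# s3: FALSE FALSE  TRUE
# s2:  TRUE FALSE  TRUE
# s1:  TRUE  TRUE  TRUE
```

Passing an explicit locale keeps the result independent of the session's default locale.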

Please note that ICU's StringSearch-based functions are often slow. They are not designed for speed; they are designed to give correct results in natural language processing tasks.
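A rough way to see the cost is to time a collator-based search against a fixed-pattern search on the same data. This is only a sketch with made-up input. Timings vary by machine and ICU version, so no figures are shown.

```r
library(stringi)

# Illustrative plain-ASCII input; both engines should agree on it.
haystack <- rep(c("the quick brown fox", "jumps over the lazy dog"), 5000)

t_coll  <- system.time(hits_coll  <- stri_detect_coll(haystack, "lazy"))
t_fixed <- system.time(hits_fixed <- stri_detect_fixed(haystack, "lazy"))

# Compare user, system and elapsed times of the two calls.
print(rbind(coll = t_coll[1:3], fixed = t_fixed[1:3]))

# TRUE when the two searches returned the same hits.
same <- identical(hits_coll, hits_fixed)
print(same)
```

When locale sensitivity is not needed, as with this input, the fixed-pattern functions described in stringi-search-fixed are the faster option.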


References

ICU String Search Service – ICU User Guide

L. Werner, Efficient Text Searching in Java, 1999

See Also

Other search_coll: stri_opts_collator, stringi-search

Other locale_sensitive: %s<%, stri_compare, stri_count_boundaries, stri_duplicated, stri_enc_detect2, stri_extract_all_boundaries, stri_locate_all_boundaries, stri_opts_collator, stri_order, stri_split_boundaries, stri_trans_tolower, stri_unique, stri_wrap, stringi-locale, stringi-search-boundaries

Other stringi_general_topics: stringi-arguments, stringi-encoding, stringi-locale, stringi-package, stringi-search-boundaries, stringi-search-charclass, stringi-search-fixed, stringi-search-regex, stringi-search

[Package stringi version 1.1.7 Index]