2015-12-13 new paper

Accepted Paper in IEEE TFS

A short paper entitled H-index and other Sugeno integrals: Some defects and their compensation, by Radko Mesiar and Marek Gagolewski, has been accepted for publication in IEEE Transactions on Fuzzy Systems.
Abstract: The famous Hirsch index was introduced only about 10 years ago. Despite that, it is already widely used in many decision-making tasks, such as the evaluation of individual scientists, research grant allocation, or even production planning. It is known that the h-index is related to the discrete Sugeno integral and the Ky Fan metric introduced in the 1940s. The aim of this paper is to propose a few modifications of this index as well as other fuzzy integrals -- also on bounded chains -- that lead to better discrimination of some types of data that are to be aggregated. All of the suggested compensation methods try to retain the simplicity of the original measure.
2015-12-01 new paper

Accepted Paper in European Physical Journal B

Agent-based model for the h-index – Exact solution by Żogała-Siudem B., Siudem G., Cena A., and Gagolewski M. has been accepted for publication in European Physical Journal B (assigned doi:10.1140/epjb/e2015-60757-1).
Abstract: Hirsch's h-index is perhaps the most popular citation-based measure of scientific excellence. In 2013 G. Ionescu and B. Chopard proposed an agent-based model for this index to describe the process of generating publications and citations in an abstract scientific community. With such an approach one can simulate a single scientist's activity, and by extension investigate the whole community of researchers. Even though this approach predicts the h-index from bibliometric data quite well, only a solution based on simulations was given. In this paper, we complete their results with exact, analytic formulas. What is more, thanks to our exact solution we are able to simplify the Ionescu-Chopard model, which allows us to obtain a compact formula for the h-index. Moreover, a simulation study designed to compare the approximate and exact solutions is included. The last part of this paper presents an evaluation of the obtained results on a real-world data set.

IPMU 2016 Special Session:
Computational Aspects of Data Aggregation and Complex Data Fusion

We are happy to invite you to submit your contribution(s) to the special session entitled Computational Aspects of Data Aggregation and Complex Data Fusion within the 16th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2016) that will be held on June 20-24, 2016 in Eindhoven, The Netherlands.

Important dates:

  • Paper submission: January 8, 2016
  • Notification of acceptance/rejection: March 1, 2016
  • Camera-ready papers: March 31, 2016

The proceedings of IPMU 2016 will be published in Communications in Computer and Information Science (CCIS) with Springer. Papers must be prepared in the LNCS/CCIS one-column page format. The length of papers is 12 pages in this special LaTeX2e format. For the details of submission click here.

Please feel free to disseminate this information to other researchers that may potentially be interested in the session. For the details on the Session click here.
2015-10-22 software

stringi 1.0-1 Now on CRAN

Notable changes since v0.5-2:

* [GENERAL] #88: C++ API is now available for use in, e.g., Rcpp packages, see for an example.

* [BUGFIX] #183: Floating point exception raised in `stri_sub()` and
`stri_sub<-()` when `to` or `length` was a zero-length numeric vector.

* [BUGFIX] #180: `stri_c()` warned incorrectly (recycling rule) when using more
than 2 elements.
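
The two fixes above concern ordinary vectorized calls. A minimal sketch, assuming stringi 1.0-1 or later is installed; the input strings are made up for illustration:

```r
library(stringi)

# Three arguments of compatible lengths: the length-1 separator is recycled
# against the two length-2 vectors, so no recycling warning is expected.
joined <- stri_c(c("a", "b"), "-", c("1", "2"))

# `to` and `length` are alternative ways of delimiting a substring.
sub_to  <- stri_sub("stringi", from = 1, to = 3)      # characters 1..3
sub_len <- stri_sub("stringi", from = 4, length = 4)  # 4 characters from position 4

print(joined)
print(sub_to)
print(sub_len)
```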
2015-09-23 new paper

Accepted Paper in Journal of Applied Statistics

"How to improve a team's position in the FIFA ranking – A simulation study" by Lasek J., Szlavik Z., Gagolewski M., and Bhulai S. has been accepted for publication in Journal of Applied Statistics.
Abstract: In this paper, we study the efficacy of the official ranking for international football teams compiled by FIFA, the body governing football competition around the globe. We present strategies for improving a team's position in the ranking. By combining several statistical techniques we derive an objective function in a decision problem of optimal scheduling of future matches. The presented results display how a team's position can be improved. Along the way, we compare the official procedure to the famous Elo rating system. Although it originates from chess, it has been successfully tailored to ranking football teams as well.
2015-09-18 award

Scholarship for Outstanding Young Scientists

I am happy to announce that I have been awarded a scholarship for outstanding young scientists from the Ministry of Science and Higher Education, Republic of Poland (36 months). According to the Ministry, scholarships are awarded to scientists below the age of 35 who conduct high-quality research and have impressive scientific achievements. Here is the complete list of laureates (in Polish).
2015-06-22 software

stringi 0.5-2 Now on CRAN

A new release of the stringi package is available on CRAN. As of now, about 850 CRAN packages depend (either directly or recursively) on stringi. Quite recently, the package was also listed among the most downloaded R extensions.

Notable changes since v0.4-1:

* [BACKWARD INCOMPATIBILITY] The second argument to `stri_pad_*()` has
been renamed `width`.

* [GENERAL] #69: `stringi` is now bundled with ICU4C 55.1.

* [NEW FUNCTIONS] `stri_extract_*_boundaries()` extract text between text
boundaries.

* [NEW FUNCTION] #46: `stri_trans_char()` is a `stringi`-flavoured
`chartr()` equivalent.

* [NEW FUNCTION] #8: `stri_width()` approximates the *width* of a string
in a more Unicodish fashion than `nchar(..., "width")`

* [NEW FEATURE] #149: `stri_pad()` and `stri_wrap()` are now by default based on
code point widths instead of the number of code points. Moreover, by default
`stri_wrap()` no longer gets rid of non-breaking, zero width, etc. spaces.

* [NEW FEATURE] #133: `stri_wrap()` silently allows for `width <= 0`
(for compatibility with `strwrap()`).

* [NEW FEATURE] #139: `stri_wrap()` gained a new argument: `whitespace_only`.

* [NEW FUNCTIONS] #137: date-time formatting/parsing:
    * `stri_timezone_list()` - lists all known time zone identifiers
    * `stri_timezone_set()`, `stri_timezone_get()` - manage current default time zone
    * `stri_timezone_info()` - basic information on a given time zone
    * `stri_datetime_symbols()` - localizable date-time formatting data
    * `stri_datetime_fstr()` - convert a `strptime`-like format string
    to an ICU date/time format string
    * `stri_datetime_format()` - convert date/time to string
    * `stri_datetime_parse()` - convert string to date/time object
    * `stri_datetime_create()` - construct date-time objects
    from numeric representations
    * `stri_datetime_now()` - return current date-time
    * `stri_datetime_fields()` - get values for date-time fields
    * `stri_datetime_add()` - add specific number of date-time units
    to a date-time object

* [GENERAL] #144: Performance improvements in handling ASCII strings
(these affect `stri_sub()`, `stri_locate()` and other string index-based
operations).

* [GENERAL] #143: Searching for short fixed patterns (`stri_*_fixed()`) now
relies on the current `libC`'s implementation of `strchr()` and `strstr()`.
This is very fast e.g. on `glibc` utilizing the `SSE2/3/4` instruction set.

* [BUILD TIME] #141: a local copy of `icudt*.zip` may be used on package
install; see the `INSTALL` file for more information.

* [BUILD TIME] #165: the `./configure` option `--disable-icu-bundle`
forces the use of system ICU when building the package.

* [BUGFIX] locale specifiers are now normalized in a more intelligent way:
e.g. `@calendar=gregorian` expands to `DEFAULT_LOCALE@calendar=gregorian`.

* [BUGFIX] #134: `stri_extract_all_words()` did not accept `simplify=NA`.

* [BUGFIX] #132: incorrect behavior in `stri_locate_regex()` for matches
of zero lengths

* [BUGFIX] stringr/#73: `stri_wrap()` returned `CHARSXP` instead of `STRSXP`
on empty string input with `simplify=FALSE` argument.

* [BUGFIX] #164: `libicu-dev` usage used to fail on Ubuntu
(`LIBS` shall be passed after `LDFLAGS` and the list of `.o` files).

* [BUGFIX] #168: Build now fails if `icudt` is not available.

* [BUGFIX] #135: C++11 is now used by default (see the `INSTALL` file,
however) to build `stringi` from sources. This is because ICU4C uses the
`long long` type which is not part of the C++98 standard.

* [BUGFIX] #154: Dates and other objects with a custom class attribute
were not coerced to the character type correctly.

* [BUGFIX] Force ICU `u_init()` call on `stringi` dynlib load.

* [BUGFIX] #157: many overfull hboxes in the package PDF manual have been
corrected.
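
A few of the additions listed above can be tried out as follows. This is a minimal sketch, assuming stringi 0.5-2 or later is installed; the input strings and dates are made up for illustration, and the variable names are not part of the package:

```r
library(stringi)

# The second argument of stri_pad_*() is called `width` as of 0.5-2.
padded <- stri_pad_left("abc", width = 5)

# stri_width() estimates the display width; stri_length() counts code points.
wide     <- "\u4f60\u597d"        # two wide CJK characters
n_points <- stri_length(wide)
n_cols   <- stri_width(wide)

# Padding is now based on widths, so only two spaces are appended here.
padded_wide <- stri_pad_right(wide, width = 6)

# stri_trans_char() maps characters one-to-one, like chartr().
translated <- stri_trans_char("id.123", ".", "_")

# Date-time helpers: build a time point, shift it by a week, and format it.
start <- stri_datetime_create(2015, 6, 22, hour = 12, minute = 0, second = 0,
                              tz = "UTC")
later <- stri_datetime_add(start, 7, units = "days", tz = "UTC")
label <- stri_datetime_format(later, "yyyy-MM-dd", tz = "UTC")

print(padded)
print(c(n_points, n_cols))
print(translated)
print(label)
```

The time zone is passed explicitly in every date-time call so that the result does not depend on the default time zone of the machine running the code.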
2015-04-30 software

stringr Now Powered by stringi

I'm happy to announce that starting from the 1.0.0 release, the stringr package for R is now powered by stringi. For more details, read more here.