User-Defined Binary Operators in R

Implementing custom binary operators in R
R
Published

February 1, 2024

One of the things I miss most when working in R is Python’s built-in string methods and string manipulation functions. The two methods I miss the most are startswith and endswith. Here’s an example of how they work in Python:

In [1]: "apple".endswith("e")
True
In [2]: "rubbersoul".startswith("rubber")
True
In [3]: "billiondollar".endswith("babies")
False

Everything in Python is an object, and all Python string objects expose these two methods (and many others). I wanted to make the same functionality available in R while maintaining the simplicity of the Python approach. I found a way to accomplish this using user-defined binary operators in R.

Binary Operators

User-defined binary operators in R consist of a string of characters between two % characters. Some frequently used built-in operators of this form include %/% for integer division and %% for the modulus. Declaring a binary operator is identical to declaring any other function, except that the name must be wrapped in backticks. Here’s an implementation of %startswith% and %endswith%:

# Example of declaring user-defined binary operators in R.

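`%startswith%` = function(teststr, testchars) {
    #   `teststr`: The target string, or a character vector of target strings.
    # `testchars`: The character(s) to test for at the start of `teststr`.
    #              Interpreted by `grepl` as a regular expression.
    return(grepl(paste0("^", testchars), teststr))
}

`%endswith%` = function(teststr, testchars) {
    #   `teststr`: The target string, or a character vector of target strings.
    # `testchars`: The character(s) to test for at the end of `teststr`.
    #              Interpreted by `grepl` as a regular expression.
    return(grepl(paste0(testchars, "$"), teststr))
}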
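
One caveat with the grepl-based definitions above: testchars is treated as a regular expression, so characters such as "." or "(" are not matched literally. Below is a minimal sketch of a literal-matching alternative, assuming R 3.3.0 or later, where base R provides startsWith and endsWith. The operator names %startswith_fixed% and %endswith_fixed% are hypothetical and chosen only for illustration:

```r
# Hypothetical literal-matching variants; the operator names are illustrative.
# startsWith() and endsWith() are base R functions (R >= 3.3.0) that compare
# fixed strings, so regex metacharacters in `testchars` are taken literally.
`%startswith_fixed%` = function(teststr, testchars) {
    startsWith(teststr, testchars)
}

`%endswith_fixed%` = function(teststr, testchars) {
    endsWith(teststr, testchars)
}

# With a regex, "." matches any character, so this is TRUE even though
# "apple" does not start with a literal ".":
regex_result = grepl(paste0("^", "."), "apple")

# The fixed-string variants treat "." as an ordinary character.
fixed_result = "apple" %startswith_fixed% "."        # FALSE
ext_result   = "file.txt" %endswith_fixed% ".txt"    # TRUE
```

Whether this matters depends on the inputs. For plain alphabetic prefixes and suffixes like the ones used in this post, both approaches give the same answers.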

Once these definitions have been read into the current session, both individual strings and vectors of strings can be passed to either operator to test for the specified leading or trailing character(s). For example, if I had the following vector:

months = c("January", "February", "March", "April", "May", "June", "July", 
           "August", "September", "October", "November", "December")

And wanted to test whether the elements of months start with “J”, %startswith% could be used as follows:

> months %startswith% "J"
[1]  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE

Similarly, to check whether elements of months end with “ber”, run:

> months %endswith% "ber"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
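
Because each operator is an ordinary function bound to a special name, the infix calls above are shorthand for regular function calls. The sketch below repeats the %startswith% definition and the months vector so that it runs on its own, then compares the infix and prefix forms. The variable names infix, prefix and via_do_call are illustrative only:

```r
# Definition repeated from above so this block is self-contained.
`%startswith%` = function(teststr, testchars) {
    return(grepl(paste0("^", testchars), teststr))
}

months = c("January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December")

# The infix form...
infix = months %startswith% "J"

# ...is equivalent to calling the function by its backtick-quoted name.
prefix = `%startswith%`(months, "J")

# Like any other function object, it can also be handed to do.call().
via_do_call = do.call(`%startswith%`, list(months, "J"))

identical(infix, prefix)         # TRUE
identical(infix, via_do_call)    # TRUE
```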

To obtain the indices of the elements of months ending with “ber”, we can use %endswith% in conjunction with which:

> months %endswith% "ber"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
> which(months %endswith% "ber")
[1]  9 10 11 12
> months[which(months %endswith% "ber")]
[1] "September" "October"   "November"  "December"
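
Since the operators return a logical vector, the result can also be used to subset directly, or be passed to helpers such as sum and any. A brief sketch, with the %endswith% definition and the months vector repeated so the block runs on its own (the variable names ber_months, n_ending_y and any_ending_z are illustrative only):

```r
# Definition repeated from above so this block is self-contained.
`%endswith%` = function(teststr, testchars) {
    return(grepl(paste0(testchars, "$"), teststr))
}

months = c("January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December")

# Logical subsetting: same result as months[which(months %endswith% "ber")].
ber_months = months[months %endswith% "ber"]

# Count the months ending in "y" (January, February, May, July).
n_ending_y = sum(months %endswith% "y")      # 4

# Check whether any month ends in "z".
any_ending_z = any(months %endswith% "z")    # FALSE
```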