The mclosest function calculates the closest rows between two matrices
(or data frames) considering pairwise differences between values in columns
of x and table. It returns the index of the closest row in table for
each row in x.
Arguments
- x
numeric matrix or data frame representing the query data. Each row in x will be compared to every row in table. Both x and table are expected to have the same number of columns, and the columns are expected to be in the same order.
- table
numeric matrix or data frame containing the reference data to be matched with each row of x. Each row in table will be compared to every row in x. Both table and x are expected to have the same number of columns, and the columns are expected to be in the same order.
- ppm
numeric representing a relative, value-specific parts-per-million (PPM) tolerance that is added to tolerance (default is 0).
- tolerance
numeric accepted tolerance. Defaults to tolerance = Inf, thus for each row in x the closest row in table is reported, regardless of the magnitude of the (absolute) difference.
Value
integer vector of indices indicating the closest row of table for
each row of x. If no suitable match is found for a row in x based on the
specified tolerance and ppm, the corresponding index is set to NA.
Details
If, for a row of x, two rows of table are equally close, only the index of
the first of these rows is returned.
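The tie-breaking rule can be pictured with base R's which.min, which also reports the first of several equal minima. This is only an analogy: the dists vector below is made up for illustration and is not a description of how mclosest computes or ranks differences internally.

```r
## Hypothetical distances from one row of `x` to four rows of `table`.
## Rows 2 and 4 are equally close (illustrative values only).
dists <- c(3, 1, 2, 1)

## `which.min` returns the position of the first minimum, so the tie
## between rows 2 and 4 resolves to row 2.
first_closest <- which.min(dists)
first_closest
#> [1] 2
```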
For both the tolerance and ppm arguments, if their length is different to
the number of columns of x and table, the input argument will be
replicated to match it.
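A minimal sketch of what this replication means for a length-one tolerance, using the data from the Examples section. It assumes that mclosest is provided by the MsCoreUtils package (the package name is not stated on this page) and only covers the scalar case; how a longer vector of mismatched length is handled is not specified here.

```r
## Inputs taken from the Examples section.
x <- data.frame(a = 1:5, b = 3:7)
table <- data.frame(c = c(11, 23, 3, 5, 1), d = c(32:35, 45))

## A length-one tolerance, expanded by hand to one value per column.
tolerance <- 25
tol_per_col <- rep(tolerance, ncol(x))
tol_per_col
#> [1] 25 25

## Assumption: `mclosest` comes from the MsCoreUtils package. If it is
## installed, the scalar and the expanded form should give the same
## result according to the replication rule described above.
if (requireNamespace("MsCoreUtils", quietly = TRUE)) {
    res_scalar <- MsCoreUtils::mclosest(x, table, tolerance = tolerance)
    res_vector <- MsCoreUtils::mclosest(x, table, tolerance = tol_per_col)
    print(identical(res_scalar, res_vector))
}
```

Passing one value per column explicitly makes the intent visible and leaves room for column-specific tolerances when the columns are on different scales.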
Examples
x <- data.frame(a = 1:5, b = 3:7)
table <- data.frame(c = c(11, 23, 3, 5, 1), d = c(32:35, 45))
## Get for each row of `x` the index of the row in `table` with the smallest
## difference of values (per column)
mclosest(x, table)
#> [1] 1 1 3 1 1
## If the absolute difference is larger than `tolerance`, return `NA`. Note
## that the tolerance value of `25` is applied to the difference in each
## pair of corresponding columns of `x` and `table`.
mclosest(x, table, tolerance = 25)
#> [1] NA NA NA NA 1
