The mclosest function calculates the closest rows between two matrices (or data frames) considering pairwise differences between values in columns of x and table. It returns the index of the closest row in table for each row in x.
Arguments
- x
numeric matrix or data frame representing the query data. Each row in x will be compared to every row in table. Both x and table are expected to have the same number of columns, and the columns are expected to be in the same order.
- table
numeric matrix or data frame containing the reference data to be matched with each row of x. Each row in table will be compared to every row in x. Both table and x are expected to have the same number of columns, and the columns are expected to be in the same order.
- ppm
numeric representing a relative, value-specific parts-per-million (PPM) tolerance that is added to tolerance (default is 0).
- tolerance
numeric accepted tolerance. Defaults to tolerance = Inf, thus for each row in x the closest row in table is reported, regardless of the magnitude of the (absolute) difference.
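To illustrate how ppm and tolerance combine, the following minimal sketch computes the maximal accepted absolute difference for a single value. It assumes that the PPM term is derived from the query value itself (value * ppm / 1e6); the helper max_accepted_diff is hypothetical and not part of the documented interface.

```r
## Hypothetical helper (not part of the documented API): maximal accepted
## absolute difference for one query value, assuming the PPM term scales
## with that value and is added to the absolute tolerance.
max_accepted_diff <- function(value, ppm = 0, tolerance = Inf) {
    tolerance + value * ppm / 1e6
}

## Absolute tolerance of 0.01 plus 10 ppm of 500 (i.e. 0.005)
d <- max_accepted_diff(500, ppm = 10, tolerance = 0.01)
d

## With the default `tolerance = Inf` the limit is infinite, so no
## difference is ever rejected
max_accepted_diff(500, ppm = 10)
```

Under this assumption, a non-zero ppm only matters when tolerance is finite.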
Value
integer vector of indices indicating the closest row of table for each row of x. If no suitable match is found for a row in x based on the specified tolerance and ppm, the corresponding index is set to NA.
Details
If, for a row of x, two rows of table are equally close, only the index of the first row is returned.
For both the tolerance and ppm arguments, if their length differs from the number of columns of x and table, the input argument is replicated to match it.
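Both behaviours can be mimicked with base R functions. The sketch below only illustrates them and is not taken from the function's source: the distance vector d and the variables ncols and tol_per_col are made up for the example.

```r
## Tie handling: `which.min()` returns the position of the first minimum,
## so of two equally close candidates the first one wins.
d <- c(4, 10, 6, 4, 15)      # made-up distances; positions 1 and 4 tie
first_min <- which.min(d)
first_min

## Replication: expand a single setting to one value per column.
ncols <- 2                   # number of columns of `x` and `table`
tolerance <- 25
tol_per_col <- rep(tolerance, length.out = ncols)
tol_per_col
```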
Examples
x <- data.frame(a = 1:5, b = 3:7)
table <- data.frame(c = c(11, 23, 3, 5, 1), d = c(32:35, 45))
## Get for each row of `x` the index of the row in `table` with the smallest
## difference of values (per column)
mclosest(x, table)
#> [1] 1 1 3 1 1
## If the absolute difference is larger than `tolerance`, return `NA`. Note
## that the tolerance value of `25` is applied to the difference of each
## pair of columns in `x` and `table`.
mclosest(x, table, tolerance = 25)
#> [1] NA NA NA NA 1