Title: | String Distance Calculation with Tidy Data Principles |
---|---|
Description: | Calculation of string distance following the tidy data principles. Built on top of the 'stringdist' package. |
Authors: | Colin Fay [aut, cre], Dmytro Perepolkin [ctb] |
Maintainer: | Colin Fay <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.4 |
Built: | 2024-12-05 06:13:21 UTC |
Source: | https://github.com/colinfay/tidystringdist |
Get all combinations from a dataframe column or from a list
tidy_comb(data, base, ...)

## S3 method for class 'data.frame'
tidy_comb(data, base, ...)

## Default S3 method:
tidy_comb(data, base, ...)
data |
data object containing the list of words, either a list or a data.frame |
base |
the base word to compare with all the words |
... |
if data is a data.frame, the column containing the words to combine |
a tibble with all possible combinations of elements from a list
tidy_comb(iris, "this", Species)
tidy_comb(state.name, "Paris")
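To make the shape of the result concrete, here is a minimal base-R sketch of pairing one base word with every element of a vector. The helper name `pair_with_base` is hypothetical and not part of the package. The column names `V1` and `V2` are taken from the defaults of `tidy_stringdist()`, but which column holds the base word, and whether duplicates are dropped, are assumptions of this sketch rather than documented behaviour.

```r
# Hypothetical helper, not part of tidystringdist: pair every unique
# element of `words` with a single base word.
# Assumptions: duplicates are dropped, and the base word goes in V2.
pair_with_base <- function(words, base) {
  words <- unique(as.character(words))  # as.character() also handles factors
  data.frame(V1 = words, V2 = base, stringsAsFactors = FALSE)
}

# One row per distinct species, each paired with the base word "this"
pairs <- pair_with_base(iris$Species, "this")
print(pairs)
```

A plain data.frame is used here to avoid extra dependencies; the package itself is documented to return a tibble.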
Get all combinations from a dataframe column
tidy_comb_all(data, ...)

## S3 method for class 'data.frame'
tidy_comb_all(data, ...)

## Default S3 method:
tidy_comb_all(data, ...)
data |
a list or a data.frame with the elements to combine |
... |
if data is a data.frame, the column containing the words to combine |
a tibble with all possible combinations of elements from a list
tidy_comb_all(iris, Species)
tidy_comb_all(state.name)
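The idea of "all combinations" can be illustrated with base R's `combn()`. The sketch below is not the package's implementation: the helper name `all_pairs` is hypothetical, and it assumes that pairs are unordered, that duplicates are removed first, and that the output columns are named `V1` and `V2` (the defaults expected by `tidy_stringdist()`).

```r
# Hypothetical helper, not part of tidystringdist: build every unordered
# pair of distinct values with base R's combn().
# Note: combn() raises an error when fewer than two distinct values remain.
all_pairs <- function(words) {
  words <- unique(as.character(words))
  pairs <- utils::combn(words, 2)  # 2-row matrix, one column per pair
  data.frame(V1 = pairs[1, ], V2 = pairs[2, ], stringsAsFactors = FALSE)
}

# n distinct values give n * (n - 1) / 2 pairs
print(all_pairs(c("a", "b", "c")))
print(nrow(all_pairs(state.name)))
```

Because the number of pairs grows quadratically with the number of distinct values, it can be worth deduplicating or filtering the input before combining a long column.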
Tidy stringdist calculation
tidy_stringdist(df, v1 = V1, v2 = V2,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"),
  ...)
df |
a dataframe containing the strings to compare |
v1 |
the name of the first column |
v2 |
the name of the second column |
method |
one of the methods implemented in the stringdist package — "osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex". See |
... |
other parameters passed to |
a tibble with string distance
proust <- tidy_comb_all(c("Albertine", "Françoise", "Gilberte", "Odette", "Charles"))
tidy_stringdist(proust)
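For readers who want to see what a row-wise distance over a two-column table looks like without installing anything, the sketch below uses base R's `utils::adist()`, which computes a generalized Levenshtein distance. It is only an illustration of the tidy layout: the helper name `add_lv` is hypothetical, it covers a single metric (comparable to the "lv" method), and it does not reproduce the package's output format.

```r
# Hypothetical helper, not part of tidystringdist: append a Levenshtein
# distance column to a data.frame holding string pairs in V1 and V2.
add_lv <- function(df) {
  df$lv <- mapply(
    # adist() returns a 1 x 1 matrix for two single strings
    function(a, b) as.integer(utils::adist(a, b)),
    df$V1, df$V2,
    USE.NAMES = FALSE
  )
  df
}

pairs <- data.frame(
  V1 = c("Albertine", "Albertine", "Gilberte"),
  V2 = c("Gilberte", "Odette", "Odette"),
  stringsAsFactors = FALSE
)

# Each row keeps its pair of strings next to the computed distance
print(add_lv(pairs))
```

Keeping one pair per row with the distance in its own column is what lets the result be filtered, sorted, or joined like any other table.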