---
title: "Getting started"
author: "Colin Fay"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Getting started with {tidystringdist}

## About {tidystringdist}

`{tidystringdist}` is a package that extends the [{stringdist}](https://github.com/markvanderloo/stringdist) package with tidy data principles. The idea is to perform string distance calculations and combine them with the data manipulation and visualisation functions from the tidyverse.

## Installing {tidystringdist}

You can install the latest stable version from CRAN with:

```{r eval = FALSE}
install.packages("tidystringdist")
```

Or the dev version from GitHub:

```{r eval = FALSE}
# install.packages("remotes")
remotes::install_github("ColinFay/tidystringdist")
```

## {tidystringdist} basic workflow

## `tidy_comb()`

`tidy_comb_all()` returns all the possible combinations of the elements of a vector, or of a column of a data.frame, while `tidy_comb()` combines two vectors:

```{r}
library(tidystringdist)
tidy_comb_all(LETTERS[1:3])
tidy_comb_all(iris, Species)
```

```{r}
tidy_comb("Paris", state.name)
```

## Compute string distance

Once you've got this data.frame, you can use `tidy_stringdist()` to compute string distances. This function takes a data.frame, the two columns containing the strings, and one or more stringdist methods.

```{r example, warning = FALSE, error = FALSE, message = FALSE}
comb <- tidy_comb_all(state.name)
tidy_stringdist(comb)
```

By default, all the methods are computed. You can select specific methods with the `method` argument:

```{r}
comb <- tidy_comb_all(state.name)
tidy_stringdist(comb, method = c("osa", "jw"))
```
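
Since the result is a regular data.frame, it can be passed on to tidyverse verbs, which is the idea described in the About section. The following is only a minimal sketch: it assumes that {dplyr} is installed, that the combination columns are named `V1` and `V2`, and that each distance column is named after its method (here `jw`).

```r
library(tidystringdist)
library(dplyr)

# All pairs of US state names
# (assumption: the two combination columns are called V1 and V2)
comb <- tidy_comb_all(state.name)

# Compute a single distance, then keep the ten closest pairs.
# Assumption: the distance column is named after the method, i.e. `jw`.
closest <- comb %>%
  tidy_stringdist(method = "jw") %>%
  arrange(jw) %>%
  head(10)

closest
```

Sorting by one distance and keeping the smallest values is one simple way to surface near-duplicate strings; any other {dplyr} verb, such as `filter()` with a threshold, could be used in the same place.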