Colloquium 2023 Data Wrangling with Tidyverse

From SHARCNETHelp
Revision as of 13:06, 1 September 2023 by Syam

Tidyverse is a cohesive set of packages for doing data science in R. We demonstrated its graphics portion (ggplot2) in prior talks. In this one we demonstrate the data munging portions (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by restoring the underlying data hierarchy implicit in the layout of a 500-page reference PDF file, given only the words on each page and their bounding boxes.