Colloquium 2024 Data Wrangling with Tidyverse (part 2)

From SHARCNETHelp

Tidyverse is a cohesive set of packages for doing data science in R. In an earlier talk, we began reviewing the data-munging portions of tidyverse (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by using them to reconstruct the data hierarchy in a 500-page reference PDF given only the words on each page and their bounding boxes. This talk completes that reconstruction.

If you have not seen the first part, or wish to review it, you can find it here:

https://www.youtube.com/watch?v=8_Q-WwqY_Og 

For completeness, we also covered the graphical portion of tidyverse (ggplot2) in a separate talk: https://www.youtube.com/watch?v=PR2Rs0W4zYg