find_annotations() takes a data frame, identifies possible annotations
contained within it, and returns them as a named list. guess_annotations()
is a low-level helper that extracts annotations and returns them as a tibble
of cell values with their row and column positions.
Arguments
- df
A data frame object
- type
Whether the data frame is in "sheet" format or "cells" format
- title_first
Whether the first annotation should be treated as the table title
- guess_source
Whether to guess a source note from the annotations
- .row_var
When using type = "cells", the name of the variable with row positions
- .col_var
When using type = "cells", the name of the variable with column positions
- .value_var
When using type = "cells", the name of the variable with cell values
Details
Data frames have a declared type, which must be either "sheet" format (the
default) or "cells" format. "sheet" format is a standard two-dimensional data
frame format, such as those read in by utils::read.csv() or
readxl::read_excel(). "cells" format is for data frames where each row
represents a cell from a spreadsheet and contains a variable for the cell's
value, and separate variables providing the row and column positions.
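To make the difference between the two formats concrete, the following is a minimal base R sketch that reshapes a small "sheet"-format data frame into one row per cell. The data and the column names `row`, `col` and `value` are assumptions chosen for illustration; the package does not necessarily require these names or build "cells" data this way.

```r
# A small "sheet"-format data frame (hypothetical data, for illustration only)
sheet_df <- data.frame(
  col1 = c("Table 1", "species", "Adelie"),
  col2 = c(NA, "bill_length_mm", "38.791"),
  stringsAsFactors = FALSE
)

# Reshape to one row per cell. The names `row`, `col` and `value` are
# assumptions made for this sketch, not names required by the package.
sheet_mat <- as.matrix(sheet_df)
cells_df <- data.frame(
  row = as.vector(row(sheet_mat)),   # row position of each cell
  col = as.vector(col(sheet_mat)),   # column position of each cell
  value = as.vector(sheet_mat),      # cell contents, in column-major order
  stringsAsFactors = FALSE
)

cells_df
```

A data frame shaped like `cells_df` is what type = "cells" is described as expecting, with the three columns identified through .row_var, .col_var and .value_var.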
By default, find_annotations() tries to parse the annotations found by
guess_annotations(). With title_first = TRUE, the first annotation found in a
data frame is assumed to provide a title or label for the table contained in
the data frame. With guess_source = TRUE, the annotations are searched for
one starting with either "Source:", "Data source:" or "Source data:".
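The parsing described above can be sketched in base R as follows. This is an illustration of the described behaviour, not the package's implementation: the list element names `title`, `source` and `notes` are hypothetical, and case-sensitive prefix matching is an assumption.

```r
# Annotations as they appear in the `annotation` column in the Examples
annotations <- c(
  "Table 1",
  "An example sheet",
  "This table is based on data in the palmerpenguins R package",
  "Source: {palmerpenguins} R package"
)

# title_first = TRUE: treat the first annotation as the table title
title <- annotations[1]
remaining <- annotations[-1]

# guess_source = TRUE: find the first annotation starting with a known prefix
# (case-sensitive matching is an assumption of this sketch)
source_pattern <- "^(Source|Data source|Source data):"
source_idx <- grep(source_pattern, remaining)[1]  # NA when nothing matches

if (is.na(source_idx)) {
  source_note <- NA_character_
  notes <- remaining
} else {
  source_note <- remaining[source_idx]
  notes <- remaining[-source_idx]
}

# Hypothetical element names; the package's named list may differ
parsed <- list(title = title, source = source_note, notes = notes)
parsed
```

For the example data this yields the same split into title, source and notes as the find_annotations() output shown in the Examples.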
When using type = "cells", the variables identifying the row, column and
cell values are specified by .row_var, .col_var and .value_var respectively.
Examples
example_df <- tibble::tibble(
  col1 = c(
    "Table 1", "An example sheet", "species", "Adelie", "Gentoo", "Chinstrap",
    "This table is based on data in the palmerpenguins R package",
    "Source: {palmerpenguins} R package"
  ),
  col2 = c(NA_character_, NA_character_, "bill_length_mm", "38.791",
           "47.505", "48.834", NA_character_, NA_character_),
  col3 = c(NA_character_, NA_character_, "bill_depth_mm", "18.346",
           "14.982", "18.421", NA_character_, NA_character_)
)
example_df
#> # A tibble: 8 × 3
#>   col1                                                        col2         col3
#>   <chr>                                                       <chr>        <chr>
#> 1 Table 1                                                     NA           NA
#> 2 An example sheet                                            NA           NA
#> 3 species                                                     bill_length… bill…
#> 4 Adelie                                                      38.791       18.3…
#> 5 Gentoo                                                      47.505       14.9…
#> 6 Chinstrap                                                   48.834       18.4…
#> 7 This table is based on data in the palmerpenguins R package NA           NA
#> 8 Source: {palmerpenguins} R package                          NA           NA
find_annotations(example_df)
#> ── Notes found in `example_df` ─────────────────────────────────────────────────
#> Title: Table 1
#> Source: Source: {palmerpenguins} R package
#> Notes:
#> • An example sheet
#> • This table is based on data in the palmerpenguins R package
guess_annotations(example_df)
#> # A tibble: 4 × 3
#>     row   col annotation
#>   <int> <int> <chr>
#> 1     1     1 Table 1
#> 2     2     1 An example sheet
#> 3     7     1 This table is based on data in the palmerpenguins R package
#> 4     8     1 Source: {palmerpenguins} R package