Import cells from an OpenDocument Spreadsheet (ODS) file. In the resulting dataset each cell is its own row with columns providing information about the cell's position, value types and containing formulas.
Arguments
- path
The ODS file
- sheet
The sheet name or index number, set to NA for all sheets
- quick
Whether to use the quick reading process
- quiet
Whether to silence console messages (recommended for bulk processing)
Details
The aim of read_ods_cells()
is to extract the constituent value(s) and
other information about cells stored in an ODS file and to present that
in a "tidy" format that allows further programmatic manipulation. It is
modelled after the functionality of tidyxl::tidy_xlsx()
which performs a
similar role for Microsoft Excel spreadsheets.
There are between two to four presentations of a cell's value in the resulting tibble:
all cells have a
base_value
, a character vector providing the "most raw" version of a cell's value;all cells also have a
cell_content
, a character vector providing the version of a cell's value as seen by the user of a spreadsheet application (i.e. having applied number formatting rules);for float, currency and percentage value types, cells will have a
numeric_value
with the raw value of the cell as a base R numeric vector, currency value types also have acurrency_symbol
providing the 3-character ISO currency symbol;for date value types, cells will have a
date_value
, a character vector with the date or date-time in ISO 8601 format;for time value types, cells will have a
time_value
, a character vector with the time duration in ISO 8601 format.
Processing the ODS XML is a memory intensive process, you can achieve
significant speed enhancements by setting the quick
argument to FALSE.
This process will extract only a minimum of information about the cells,
namely: sheet
, row
, col
, value_type
and a base_value
. The
base_value
when using the quick
argument will combine the raw value
stored for float and percentage value types with the cell_content
(i.e.
the formatted character string) for all other value types.
More details on the types of information extracted by read_ods_cells()
,
including examples of how the different value types are stored in the ODS
file format and extracted by read_ods_cells()
can be found in the
vignette, vignette("read_cells", package = "tidyods")
.
Examples
example <- system.file("extdata", "basic_example.ods", package = "tidyods")
example_cells <- read_ods_cells(example, 1)
dplyr::glimpse(example_cells)
#> Rows: 28
#> Columns: 28
#> $ sheet <chr> "penguins", "penguins", "penguins", "penguins",…
#> $ address <chr> "A1", "B1", "C1", "D1", "A2", "B2", "C2", "D2",…
#> $ row <dbl> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4,…
#> $ col <dbl> 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4,…
#> $ cell_type <chr> "cell", "cell", "cell", "cell", "cell", "cell",…
#> $ is_empty <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ value_type <chr> "string", "string", "string", "string", "string…
#> $ cell_content <chr> "species", "female", "bill_length_mm", "body_ma…
#> $ base_value <chr> "species", "female", "bill_length_mm", "body_ma…
#> $ numeric_value <dbl> NA, NA, NA, NA, NA, NA, 40.39041, 4043.49315, N…
#> $ currency_symbol <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ boolean_value <lgl> NA, NA, NA, NA, NA, FALSE, NA, NA, NA, TRUE, NA…
#> $ date_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ time_value <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ has_formula <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ formula <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ has_error <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ error_type <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ has_annotation <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ annotation <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ is_merged <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ merge_colspan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ merge_rowspan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ merge_shape <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ cell_style <chr> NA, "Default", "Default", "Default", NA, NA, NA…
#> $ row_style <chr> "ro1", "ro1", "ro1", "ro1", "ro1", "ro1", "ro1"…
#> $ col_style <chr> "co1", "co2", "co3", "co4", "co1", "co2", "co3"…
#> $ col_default_cell_style <chr> "Default", "ce56", "ce57", "ce58", "Default", "…