Get data from table id. The data of a CBS opendata table is in so-called wide format: each Measure has its own column.
Usage
cbs4_get_data(
  id,
  catalog = "CBS",
  ...,
  query = NULL,
  name_measure_columns = TRUE,
  show_progress = interactive() && !verbose,
  download_dir = file.path(tempdir(), id),
  verbose = getOption("cbsodata4.verbose", FALSE),
  sep = ",",
  as.data.table = FALSE,
  base_url = getOption("cbsodata4.base_url", BASEURL4)
)
Arguments
- id
Identifier of the Opendata table. Can be retrieved with
cbs4_get_datasets()
- catalog
Catalog in which the dataset is to be found.
- ...
Optional selections on the data, passed through to cbs4_download(). See Examples.
- query
Optional query in OData v4 syntax (overrides any selection given in ...).
- name_measure_columns
logical, if TRUE the Title of the measure is used as the column name.
- show_progress
if TRUE, shows progress of the data download. Cannot be used together with verbose.
- download_dir
Directory in which the data and metadata are downloaded. By default this is a temporary directory, but it can be set manually.
- verbose
if TRUE, prints the steps taken to retrieve the data.
- sep
Separator to be used to download the data.
- as.data.table
logical, should the result be of type data.table?
- base_url
Possible other URL which implements the same protocol.
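The effect of name_measure_columns can be pictured with a small offline sketch. It uses made-up data and a toy stand-in for the MeasureCodes metadata; the column names Identifier and Title, and all codes and titles, are assumptions for illustration and this is not the package's own implementation.

```r
# Toy wide-format result with measure codes as column names (made-up data).
data <- data.frame(
  Perioden = c("2019MM12", "2020MM01"),
  M1 = c(10, 11),
  M2 = c(20, 21),
  stringsAsFactors = FALSE
)

# Toy stand-in for a MeasureCodes metadata table; the column names
# Identifier and Title are assumptions made for this sketch.
measure_codes <- data.frame(
  Identifier = c("M1", "M2"),
  Title = c("Turnover", "Orders"),
  stringsAsFactors = FALSE
)

# Replace each measure code in the column names by its title;
# columns without a matching code (the dimensions) keep their name.
idx <- match(names(data), measure_codes$Identifier)
names(data)[!is.na(idx)] <- measure_codes$Title[idx[!is.na(idx)]]

print(names(data))
```

With name_measure_columns = FALSE one would presumably keep the code-style names, which can be easier to use in scripts.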
Value
A data.frame() or data.table() object. See Details.
Details
The returned data.frame() has the following columns:
- For each dimension, a separate column with category identifiers. Category labels can be added with cbs4_add_label_columns() or found in cbs4_get_metadata(). Date columns can be added with cbs4_add_date_column().
- For each Measure / Topic, a separate column with values. Units can be found in cbs4_get_metadata() (MeasureCodes).
For a long format instead of wide format, see cbs4_get_observations(), which has one Measure column and one Value column.
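The difference between the two layouts can be illustrated offline with base R. The sketch below builds a tiny long-format table (one Measure column, one Value column) and pivots it to the wide layout described here; the codes and values are made up, and the package itself may produce the wide format differently.

```r
# Toy long-format data: one Measure column and one Value column,
# mimicking the shape described for cbs4_get_observations().
# All codes and values are made up for illustration.
long <- data.frame(
  Perioden = c("2019MM12", "2019MM12", "2020MM01", "2020MM01"),
  Measure  = c("M1", "M2", "M1", "M2"),
  Value    = c(10, 20, 11, 21),
  stringsAsFactors = FALSE
)

# Pivot to wide format: one row per Perioden, one column per Measure,
# which is the shape described for cbs4_get_data().
wide <- reshape(
  long,
  idvar     = "Perioden",
  timevar   = "Measure",
  direction = "wide"
)

# reshape() prefixes the new columns with "Value."; strip that prefix.
names(wide) <- sub("^Value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

Wide format is convenient when measures are compared side by side; long format tends to suit filtering and grouping by Measure.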
See also
Other data-download: cbs4_download(), cbs4_get_observations()
Examples
if (interactive()){
# filter on Perioden (see meta$PeriodenCodes)
cbs4_get_data("84287NED"
, Perioden = "2019MM12" # december 2019
)
# filter on multiple Perioden (see meta$PeriodenCodes)
cbs4_get_data("84287NED"
, Perioden = c("2019MM12", "2020MM01") # december 2019, january 2020
)
# to filter on an additional dimension, add it as another named argument
# filter on Perioden and BedrijfstakkenBranchesSBI2008
cbs4_get_data("84287NED"
, Perioden = "2019MM12" # december 2019
, BedrijfstakkenBranchesSBI2008 = "T001081"
)
# filter on Perioden with contains
cbs4_get_data("84287NED"
, Perioden = contains("2020")
, BedrijfstakkenBranchesSBI2008 = "T001081"
)
# filter on Perioden with multiple contains
cbs4_get_data("84287NED"
, Perioden = contains(c("2019MM1", "2020"))
, BedrijfstakkenBranchesSBI2008 = "T001081"
)
# filter on Perioden with contains or equal to "2019MM12"
cbs4_get_data("84287NED"
, Perioden = contains("2020") | "2019MM12"
, BedrijfstakkenBranchesSBI2008 = "T001081"
)
# This all works on observations too
cbs4_get_observations( id = "80784ned" # table id
, Perioden = "2019JJ00" # Year 2019
, Geslacht = "1100" # code for total gender
, RegioS = contains("PV") # provinces
, Measure = "M003371_2" # topic selection
)
# supply your own odata 4 query
cbs4_get_data("84287NED", query = "$filter=Perioden eq '2019MM12'")
# an odata 4 query will overrule other filter statements
cbs4_get_data("84287NED"
, Perioden = "2018MM12"
, query = "$filter=Perioden eq '2019MM12'"
)
# With query argument an odata4 expression with other (filter) functions can be used
cbs4_get_observations(
id = "80784ned" # table id
,query = paste0( # odata4 query
"$skip=4", # skip the first 4 rows of the filtered result
"&$top=20", # then slice the first 20 rows of the filtered result
"&$select=Measure,Geslacht,Perioden,RegioS,Value", # omit the Id and ValueAttribute fields
"&$filter=endswith(Measure,'_1')") # filter only Measure ending on '_1'
)
}
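To make the relation between named selections and the query argument more concrete, the following offline sketch composes an OData v4 $filter string from exact-match selections. The helper sketch_filter() is hypothetical and not part of cbsodata4; how the package translates selections internally may well differ.

```r
# Hypothetical helper (not part of cbsodata4): combine named exact-match
# selections into one OData v4 $filter expression.
sketch_filter <- function(...) {
  selections <- list(...)
  clauses <- vapply(names(selections), function(column) {
    # Values for the same column are alternatives, so join them with "or".
    alternatives <- sprintf("%s eq '%s'", column, selections[[column]])
    paste0("(", paste(alternatives, collapse = " or "), ")")
  }, character(1))
  # Different columns must all match, so join them with "and".
  paste0("$filter=", paste(clauses, collapse = " and "))
}

query <- sketch_filter(
  Perioden = c("2019MM12", "2020MM01"),
  BedrijfstakkenBranchesSBI2008 = "T001081"
)
print(query)
```

A string built this way could be passed as query = query to cbs4_get_data(), in which case, as noted above, it takes precedence over any named selections.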