Abstract
This document provides guidance on writing R functions for data quality assessments to ensure exchangeable R code by using a homogeneous structure of functions and data. To this end, conventions regarding the input/output of data are defined, naming conventions are introduced, and documentation requirements are stated. In addition, we provide recommendations regarding (machine-readable) output and visualisation.
1 University Medicine of Greifswald
✉ Correspondence: Stephan Struckmann <stephan.struckmann@uni-greifswald.de>
Code developed in teams, especially if intended to be used by a larger community, needs to be comprehensible for all team members and ideally also for anyone trying to use that code. Here we provide some conventions that resulted from the 1st project phase.
Since the data quality project producing these conventions aims at R as the main programming language for statistical calculations, all examples and the wording will refer to R standards (see R documentation).
The concept distinguishes two types of data sources:

- Study data:
  - Clinical data: measurements (organised within variables) intended to be subject to data quality assessments.
  - Process information: all data providing information on the measurement process, such as time, ambient variables, or the respective device or examiner.
- Meta data:
  - The expected characteristics of study data on the level of each variable, for example labels, limits, or missing codes, as well as the allocation of the respective process information organised in variables.
  - Further tables referencing descriptions, such as labels of missing codes.
For further information see Richter et al.
In this concept, study data and meta data have a 1:1 correspondence, i.e. each column in the study data is identifiable in the meta data (see the example structures below).
Table: Example study and meta data structures
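For illustration, a minimal sketch of such corresponding data frames; all variable names, labels, and values are made up:

# two study variables, one of them a process variable (the observer)
study_data <- data.frame(
  v00001 = c(120, 135, NA), # systolic blood pressure
  v00002 = c(1, 2, 1)       # observer ID
)

# one meta data row per study data column
meta_data <- data.frame(
  VAR_NAMES    = c("v00001", "v00002"),
  LABEL        = c("SBP_0", "OBSERVER_0"),
  KEY_OBSERVER = c("v00002", NA)
)

# the 1:1 correspondence: every study data column is identifiable in the meta data
stopifnot(all(colnames(study_data) %in% meta_data$VAR_NAMES))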
So-called process variables store meta data about the measurement process. The content of these variables represents measurements and is therefore stored with the study data. The names of the variables to use are passed in a function argument. Process variable names can also be stored in the attributes of a study variable. Such variable attributes referring to other variables are usually prefixed by KEY_. Some such key attributes are listed in the table below. There is a wrapper function named pipeline_vectorized to automatically extract this information from the meta data and to provision parallel function calls with the respective function arguments. This is primarily used for calling functions of the dimension Accuracy for a set of variables at once, because this dimension’s functions frequently require process variables (e.g. the observer or the device) as grouping arguments. For brevity, we here present pseudo-code:
my_function_4 <- function(resp_var,
                          group_vars,
                          study_data,
                          meta_data) {
  s_data <- study_data[, resp_var]
  group_data <- study_data[, group_vars]
  # ...
}
Calling this function using pipeline_vectorized would work as follows:

named_list_of_results <-
  pipeline_vectorized(fct = my_function_4,
                      resp_vars = c("SBP_2", "DBP_2", "HF_2"),
                      study_data = study_data,
                      meta_data = meta_data,
                      label_col = LABEL,
                      args_from_meta = c(group_vars = KEY_OBSERVER),
                      mc.cores = 4)
# results are a named list of the univariate results:
named_list_of_results$SBP_2
named_list_of_results$HF_2
named_list_of_results$DBP_2
Later, this may also be extended by using classes for the variable-based function arguments:
my_function_5 <- function(resp_var,
                          group_vars,
                          study_data,
                          meta_data) {
  s_data <- study_data[, resp_var]
  if (inherits(group_vars, 'process_var_att')) {
    # drop = TRUE yields the attribute value itself, not a one-column data frame
    group_data <- study_data[, subset(meta_data, VAR_NAMES == resp_var,
                                      group_vars, drop = TRUE)]
  } else {
    group_data <- study_data[, group_vars]
  }
  # ...
}
proc_var <- function(x) {
  class(x) <- 'process_var_att'
  x
}

my_function_5(
  'SBP_0',
  proc_var('KEY_OBSERVER'),
  study_data,
  meta_data
)
All implementations of the project were developed to be applied either alone or in a vectorized reporting pipeline. …
Functions addressing variables are divided into two sub-types: addressing one variable only (univariate) or addressing many variables (multivariate). For the univariate functions, vectorisation can be performed using the pipeline_vectorized function, given that they follow these conventions. Such functions perform calculations on the study data to detect quality issues.
In large studies there are thousands of variables, so quality assurance officers need some guidance to find problematic variables quickly without going through a huge per-variable QA report. Therefore, so-called aggregation functions are introduced that work on the output of the functions that work on variables (the indicator functions). Such aggregation functions differ technically from the indicator functions in that their input is the output of other functions. They therefore depend on a sound definition of the primary functions’ output; in particular, that output has to rate a specific data quality aspect of one or more study variables. For example, such a function could calculate the percentage of a study section’s variables displaying a relevant number of missing values.
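A minimal sketch of such an aggregation function, assuming the univariate indicator output is a named vector of per-variable missing counts (the function name and the threshold are hypothetical):

# Sketch: percentage of variables whose missing count exceeds a threshold.
# 'missing_counts' is assumed to be the output of a univariate indicator
# function: a named vector with one missing count per variable.
agg_prop_vars_with_missings <- function(missing_counts, threshold = 0) {
  n_affected <- sum(missing_counts > threshold, na.rm = TRUE)
  100 * n_affected / length(missing_counts)
}

# usage (made-up numbers): one of three variables exceeds the threshold
agg_prop_vars_with_missings(c(SBP_0 = 0, DBP_0 = 12, HF_0 = 3), threshold = 5)
## 33.33333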
Functions that do not directly address QA issues but perform consistency checks, data preparation, pipelining, and other auxiliary tasks are called helper functions; they are described in the section Use of Helper Functions.
# function calls:
my_function_1(resp_vars = colnames(study_data), co_vars = character(0), group_vars = NA,
              label_col = 'LABEL', study_data = study_data, meta_data = meta)
try(
  my_function_2(resp_vars = colnames(study_data), co_vars = character(0), group_vars = NA,
                label_col = 'LABEL', study_data = study_data, meta_data = meta) # expect to stop
)
R code should be structured as follows (derived from http://style.tidyverse.org/ (Hadley Wickham) and inspired by https://google.github.io/styleguide/Rguide.xml):
# required packages/code should be specified prior to user-defined functions -----------
library(ggplot2)

# source required functions prior to the function --------------------------------------
# such code will later be embedded in the R package
source("some_other_function.R")

my_function <- function(x, formal_1, formal_n) {
  # start with all checks that safeguard applicability of the function ----------------
  if (missing(x) || length(x) == 0L || mode(x) != "numeric")
    stop("'x' must be a non-empty numeric vector")
  if (missing(formal_1) || missing(formal_n))
    stop("'formal_1' and 'formal_n' must be specified")
  # main body of the function ---------------------------------------------------------
  x_mod <- ... # …
  # call of nested function -----------------------------------------------------------
  result <- some_other_function(x_mod)
  # the output ------------------------------------------------------------------------
  return(result)
}
Since the targeted output is an R library (namely dataquieR), library and source should only be used during the internal drafting of code. In the R package, external libraries must be listed in the DESCRIPTION file (generally in its Imports section) of the package and can be imported into the package namespace using roxygen2 comments.
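For illustration, a hedged sketch of such an import; the imported ggplot2 functions are examples only, and the package must additionally be listed under Imports in DESCRIPTION:

# In DESCRIPTION (not R code):
#   Imports:
#       ggplot2

# In the R source file, roxygen2 comments import into the package namespace:
#' My example indicator function
#'
#' @importFrom ggplot2 ggplot aes geom_bar
#' @export
my_function <- function(study_data, meta_data) {
  # ggplot() can now be used here without calling library(ggplot2)
  # ...
}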
To ensure generic usability of R scripts, they should be organised in functions whose input arguments must not be handled in a static fashion. This is necessary because the names and the number of variables, meta data attributes, and process variables, as well as the names of the data frames, are not known a priori. All functions must be able to handle whatever variables and data sets are used, as long as these meet some structural preconditions as outlined above.

This comprises:

- no hard coded variable names
- no hard coded expected lengths of variable lists
- no hard coded data frame names
- no function-embedded meta data attributes
- no hard coded thresholds for the decision making of quality assessments

To avoid misunderstandings: hard coded names of meta data attribute fields must be used to properly retrieve related information. All necessary information to run the scripts is transferred via an appropriate function call; a sketch contrasting both styles follows below.
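A hedged sketch of the difference; all names, the data frame my_study, and the meta data attribute HARD_LIMIT_UP are hypothetical:

# violates the conventions: hard coded data frame and variable names
bad_check <- function() {
  sum(my_study$SBP_0 > 140, na.rm = TRUE)
}

# follows the conventions: variables, data and meta data are passed as
# arguments; only the meta data attribute field name is hard coded
good_check <- function(resp_vars, study_data, meta_data) {
  vapply(resp_vars, function(v) {
    limit <- subset(meta_data, VAR_NAMES == v, "HARD_LIMIT_UP", drop = TRUE)
    sum(study_data[[v]] > as.numeric(limit), na.rm = TRUE)
  }, numeric(1))
}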
We intentionally do not use the synonymous term “function parameters” to avoid ambiguities regarding the statistical term parameter related to probability distributions.
In the following, we give a table listing standardised function argument names. Functions can have additional arguments, but for the ones listed below, conventions exist. Two of them are mandatory.
In the table above, two arguments (study_data and meta_data) are mandatory for all indicator functions. Both are declared to be data frames, which are explained in the following.
The table with function arguments lists arguments mostly referring to study or process variables. There may be additional arguments, such as certain threshold values, or arguments affecting the format of the generated output, like specific colours or fonts. The latter should not be part of the functions in the future, because the output, including ggplot2 plots, can be formatted later. The types of additional outputs depend on the specific use-cases. Thresholds, too, may be generalised later, so using threshold arguments is discouraged in favour of returning filterable results.
All function arguments are user input, so they have to be verified carefully.
For arguments referring to study variables, there is a family of utility functions for this purpose: util_correct_variable_use and util_correct_variable_use2. These can check input arguments referring to variable names. Some examples:
util_correct_variable_use("resp_vars", # check function argument resp_vars
allow_null = TRUE, # allow resp_vars being NULL
allow_more_than_one = TRUE, # allow more than one entry in resp_vars
allow_any_obs_na = TRUE, # allow resp_vars in study_data contain NAs (see stats::na.fail)
need_type = "integer | float") # allow variabes of metadata-declared types integer or float
util_correct_variable_use("group_vars", # check function argument group_vars
allow_null = TRUE, # allow group_vars being NULL
allow_more_than_one = TRUE, # allow more than one entry in group_vars
allow_any_obs_na = TRUE, # allow group_vars in study_data contain NAs (see stats::na.fail)
need_type = "!float") # allow variabes of all possible metadata-declared types except float
Please refer to the full documentation of util_correct_variable_use / util_correct_variable_use2 for an exhaustive reference.
Note that util_correct_variable_use* are utility functions and hence intended for package-internal use only. The package dataquieR does not export these functions; they will only be found if called from within that package or if called explicitly with the disadvised :::-operator during development. While drafting functions, we recommend importing all used functions into the global environment as follows:
util_correct_variable_use <- dataquieR:::util_correct_variable_use
Checks for arguments not referring to variables can be performed using standard R functions such as is.numeric, na.fail, missing, is.null, length, stopifnot, and inherits. Be careful with is.integer: this function checks the declared storage type of a vector but not whether its values are whole numbers:
a <- 12
is.integer(a)
## [1] FALSE
b <- as.integer(12)
is.integer(b)
## [1] TRUE
a == b
## [1] TRUE
identical(a, b)
## [1] FALSE
str(a)
## num 12
str(b)
## int 12
Therefore, we have included a utility function as proposed in the manual page of is.integer, which is called util_is_integer. This function behaves as expected and returns TRUE also for the variable a from the example above. As for all utility functions, util_is_integer is not exported by the dataquieR package but can be accessed from functions within the package. Again, we recommend copying the function to the global environment when drafting a function without compiling the package.
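A sketch of such a whole-number check, following the example on the manual page of is.integer (the actual util_is_integer in dataquieR may differ in details):

# TRUE if all (non-missing) values are whole numbers within a tolerance,
# regardless of the declared storage type
util_is_integer <- function(x, tol = .Machine$double.eps^0.5) {
  is.numeric(x) && all(abs(x - round(x)) < tol, na.rm = TRUE)
}

util_is_integer(12)   # TRUE, although is.integer(12) is FALSE
util_is_integer(12.5) # FALSE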
my_function_1 <- function(resp_vars,  # vector of response variables, i.e. each of
                                      # these variables is analysed
                          co_vars,    # vector of additional variables used for
                                      # adjustment or similar
                          group_vars, # CAVE: currently only one grouping variable
                          label_col,  # meta data variable attribute to use for naming
                                      # variables in the output
                          study_data, # data frame of study records
                          meta_data   # data frame of meta data attributes
                          ) {
  # Replace the column names of the data in "study_data" by the corresponding short variable
  # labels. This step ensures comprehensible output. Convention: not more than 20 characters.
  # "meta_data" must provide a row for each column in "study_data"; a unique and alphanumeric
  # label must be contained.
  translations <- setNames(meta_data[[label_col]], nm = meta_data$VAR_NAMES)
                                      # a named vector translating names to labels
  translationEnv <- as.environment(as.list(translations))
                                      # convert it to an environment for use with mget
  translated <- mget(colnames(study_data), translationEnv)
                                      # use mget to get translated column labels
  ds1 <- study_data                   # do not modify the original data frame
  colnames(ds1) <- unlist(translated) # use the translated labels as new column names
  r <- lapply(seq_along(ds1), function(v) {
    # assumes meta_data rows are ordered like the study_data columns
    sum(meta_data[v, "INCL_SOFT_LIMIT_UP"] < ds1[, v])
  })
  names(r) <- colnames(ds1)
  r <- simplify2array(r)
  return(r)
}
The mapping of variable names to meta data variable labels is performed by the utility function util_prepare_dataframes, which can be used like a C macro. After util_prepare_dataframes has been called without arguments from a function that follows the conventions listed here, a new object named ds1 is created in the function’s local environment. Using this, the function above looks as follows:
my_function_1b <- function(resp_vars,  # vector of response variables, i.e. each of
                                       # these variables is analysed
                           co_vars,    # vector of additional variables used for
                                       # adjustment or similar
                           group_vars, # CAVE: currently only one grouping variable
                           label_col,  # meta data variable attribute to use for naming
                                       # variables in the output
                           study_data, # data frame of study records
                           meta_data   # data frame of meta data attributes
                           ) {
  util_prepare_dataframes()
  r <- lapply(seq_along(ds1), function(v) {
    sum(meta_data[v, "INCL_SOFT_LIMIT_UP"] < ds1[, v])
  })
  names(r) <- colnames(ds1)
  r <- simplify2array(r)
  return(r)
}
Note that util_prepare_dataframes is a utility function and hence intended for package-internal use only. The package dataquieR does not export that function; it will only be found if called from within that package or if called explicitly with the disadvised :::-operator during development. While drafting functions, we recommend importing all used functions into the global environment as follows:
util_prepare_dataframes <- dataquieR:::util_prepare_dataframes
Once a function has been integrated into dataquieR, it will find the package-internal functions without any tweaks.
Please refer to the full documentation of util_prepare_dataframes for an exhaustive reference.
my_function_2 <- function(resp_vars,  # vector of response variables, i.e. each of
                                      # these variables is analysed
                          co_vars,    # vector of additional variables used for
                                      # adjustment or similar
                          group_vars, # CAVE: currently only one grouping variable
                          label_col,  # meta data variable attribute to use for naming
                                      # variables in the output
                          study_data, # data frame of study records
                          meta_data   # data frame of meta data attributes
                          ) {
  ## in case of a function that handles one variable at once:
  if (length(resp_vars) > 1)
    stop("my_function_2 cannot handle more than one variable at once.")
  # ...
}
All functions should carefully check all their input and abort the execution with understandable error messages if some preconditions are not met. To cover the most common cases, some utility functions have been implemented (util_prepare_dataframes and util_correct_variable_use). util_prepare_dataframes checks, for the function it has been called from, whether the mandatory standard function arguments study_data and meta_data provide the expected valid data and whether these two data frames match. util_correct_variable_use can be called for each argument referring to one or more variables by their names. It can be parameterised to check for the most common mistakes, e.g. too few / too many variable names, or referred variables of unsuitable data types.
There are more helper functions besides the two mentioned in the section Checks to be performed / robustness. All internal helper functions should be prefixed by util_. The util_ functions will not be exported by the R package, because they are not intended to be used by end users directly. Because the users of the functions will also need some helper functions for processing data and generating quality reports, there are two more prefixes, namely prep_ for general data processing and pipe_ for functions related to automated report generation.
Documentation in this project is function specific, depending on whether the user is expected to edit the code.
Please refer to roxygen2’s package documentation, the R documentation about packages, and the roxygen2 vignettes.
These functions will mostly be used by the users and therefore have two routes for documentation:

a. for handling and meaning of the code, use [RMarkdown](https://rmarkdown.rstudio.com/) for all documentation, i.e. [function help pages](http://r-pkgs.had.co.nz/vignettes.html#markdown) and [package vignettes](https://roxygen2.r-lib.org/articles/markdown.html).
b. for integration into the R-package, use [Roxygen2 comments](https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html).
All functions should be documented using comments as above and also using Roxygen2 comments.
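A hedged sketch of such a Roxygen2 header for a function following the conventions above; the tags are standard roxygen2 tags, the texts are placeholders:

#' Example indicator function
#'
#' Detects some data quality issue in the given study variables.
#'
#' @param resp_vars  the variables to be analysed
#' @param label_col  meta data variable attribute to use for naming variables
#' @param study_data data frame of study records
#' @param meta_data  data frame of meta data attributes
#'
#' @return a named list, e.g. with a SummaryTable and a SummaryPlot
my_function <- function(resp_vars, label_col, study_data, meta_data) {
  # ...
}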
The structure of study data has to comply with the following conventions to be applicable in our framework:
- Study data are usually stored in tables (in R we use instances of the class data.frame, data frames).
- Study data frames have one sample/patient per row and one variable per column. This corresponds to a “wide format”. Conversion from long/narrow format to wide format can be performed in R using several packages (a base R sketch follows below this list).
- The column headers of study data frames are variable names.
- Variable names must be unique.
- Variable names do not contain blanks or other non-alphanumeric characters except for dots and underscores. They do not start with non-alphanumeric characters.
- In case of repeated measurements, the names of variables measured repeatedly should receive a suffix indicating the measurement order (e.g. blood_01, blood_02, blood_03).
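As referenced in the list above, a minimal long-to-wide conversion can be done in base R with reshape (tidyr::pivot_wider or data.table::dcast would work as well; the data are made up):

# hypothetical long format: one row per participant and measurement time
long <- data.frame(
  id    = c(1, 1, 2, 2),
  time  = c(1, 2, 1, 2),
  blood = c(5.1, 5.4, 4.9, 5.0)
)

# wide format: one row per participant, suffixes indicate the measurement order
wide <- reshape(long, idvar = "id", timevar = "time",
                direction = "wide", sep = "_0")
colnames(wide) # "id" "blood_01" "blood_02"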
Meta data are arguments for the indicator functions. They are provided to these functions as meta data frames in their function argument meta_data. For functions that handle only one variable at once, the structure of the meta data is identical to that for multivariate functions. All functions extract the relevant columns from the full meta data frame provided to them.¹ For further details see the specific examples below.
The output of a data quality function must contain the following elements:

- the data quality related results as text, graph, or table,
- if possible, machine-readable output of the data underlying the results (particularly for graphs), preferably in the form of a data frame,
- output that is usable in RMarkdown files.
It is desirable not to implement a new function for each output option. Returned data frames as well as ggplot2-based graphics can be modified and laid out later. If unavoidable, we accept function arguments to control the output.
The output of the functions is given as a named R list. The following names are consented:

- SummaryTable: a data frame containing the tabular results
- SummaryPlot: a ggplot2 graph visualising the results

These will be amended by:

- DQvalue: a summary value rating the assessed data quality aspect

If a function provides specific output for a set of response variables (resp_vars missing or a vector), these specific outputs should be elements in the list, named by the VAR_NAMES. Additionally, such functions can still provide a SummaryTable and/or a SummaryPlot for all variables. Also, a summary DQvalue should be available.
As an example, a function may generate the data frame below as its primary result:
df1
This data frame can then be used for a respective graph and both results are returned:
# COMMENT: call ggplot
p1 <- ggplot(df1, aes(x = x1, y = y_prob)) +
  theme_bw() +
  geom_bar(aes(fill = cave), stat = "identity") +
  scale_fill_manual(values = c("#2166AC", "#B2182B"), guide = "none") +
  geom_errorbar(aes(ymin = lcl, ymax = ucl), width = 0.1) +
  geom_line(data = df2, aes(x = x2, y = y_line, color = "#E69F00"),
            linewidth = 2) +
  scale_color_manual(values = c("#E69F00"), guide = "none")

return(list(SummaryTable = df1, SummaryPlot = p1))
Functions are written without precise knowledge of the application context. Therefore it must be ensured that information remains readable, even if, for example, the number of variables or clusters grows very large. To support this:

- categories of known dimension should be mapped to the horizontal axis,
- categories of unknown dimension (e.g. the number of variables) should be mapped to the vertical axis.

This applies primarily to printed text document formats (pdf, docx). For an HTML display of results, limitations to the handling of axes apply to a lesser degree. These conventions should later be controllable by a function argument to facilitate compliance with external restrictions.
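A hedged ggplot2 sketch of this convention with made-up data: the category of unknown dimension (the variables) grows along the vertical axis, so the plot stays readable regardless of the number of variables:

library(ggplot2)

set.seed(1)
results <- data.frame(                # made-up per-variable results
  variable    = paste0("VAR_", 1:25),
  pct_missing = runif(25, 0, 30)
)

# known dimension (percentage scale) horizontally,
# unknown dimension (variables) vertically
ggplot(results, aes(x = pct_missing, y = variable)) +
  geom_col() +
  theme_minimal()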
The relevant data quality information should be available not only through colours but also through additional elements, e.g. the magnitude of an effect size, or a line indicating a range or variance.
Using ggplot2 allows the colours to be manipulated later. Nevertheless, we recommend the following conventions for colours.
For discrete scales, such as interviewers or centres, we recommend generating colour-blind friendly figures as recommended here: http://bconnelly.net/2013/10/creating-colorblind-friendly-figures/
We have augmented the list by two colours: grey and brown.
| QS_Name | hex_code | red | green | blue |
|---|---|---|---|---|
| qs_black | #000000 | 0 | 0 | 0 |
| qs_gray | #B0B0B0 | 176 | 176 | 176 |
| qs_orange | #E69F00 | 230 | 159 | 0 |
| qs_skyblue | #56B4E9 | 86 | 180 | 233 |
| qs_green | #009E73 | 0 | 158 | 115 |
| qs_yellow | #F0E442 | 240 | 228 | 66 |
| qs_blue | #0072B2 | 0 | 114 | 178 |
| qs_red | #D55E00 | 213 | 94 | 0 |
| qs_purple | #CC79A7 | 204 | 121 | 167 |
| qs_brown | #8C510A | 140 | 81 | 10 |
These colours are not recommended for the representation of data quality issues.
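For convenience, the palette from the table above can be kept as a named vector and passed to ggplot2’s manual scales; a small sketch:

# the recommended colour-blind friendly palette as a named vector
qs_colours <- c(
  qs_black   = "#000000", qs_gray  = "#B0B0B0", qs_orange = "#E69F00",
  qs_skyblue = "#56B4E9", qs_green = "#009E73", qs_yellow = "#F0E442",
  qs_blue    = "#0072B2", qs_red   = "#D55E00", qs_purple = "#CC79A7",
  qs_brown   = "#8C510A"
)

# e.g. for a discrete scale such as observers:
# ggplot(...) + scale_fill_manual(values = unname(qs_colours))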
For continuous scales, such as the magnitude of effects, we recommend generating colour-blind friendly figures as recommended here: http://colorbrewer2.org/#type=sequential&scheme=PuBu&n=9. Alternatively, we recommend the use of viridis: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
These colours are not recommended for the representation of data quality classifications.
To graph data quality issues, we recommend generating colour-blind friendly figures as recommended here: http://colorbrewer2.org/#type=sequential&scheme=PuBu&n=9. The red pole should always be used to identify problems.
We assume a function produces the following data frame:
Using the R package “formattable”, this data frame can be formatted as follows (crude example):

Please see some annotations on the use of this R package; other important options are DT::datatable and kableExtra (extending knitr::kable).
Colours for tables should be based on the colours mentioned above for graphical output.
Data quality related output should:

- allow for an overview of all checked data structures (e.g. variables),
- allow for an overview of all checked data structures with a data quality finding,
- allow for an overview of all checked data structures with a data quality finding crossing a defined threshold,
- use space as efficiently as possible,
- allow for an understanding of tables or graphs without using other information sources.
The outline of attributes is only defined as far as it is needed to run a single data quality assessment routine. Variable attributes comprise static meta data attributes (e.g. limits for a metric study variable) and process variable assignments (e.g. the study variable that stores the ID of the device used to measure some outcome variable).
A list of attributes is provided in the table below with suggested naming conventions. Attributes starting with the prefix KEY_ contain for each single study variable its references to other study variables identified by their respective VAR_NAMES entry. The meta data attribute VARIABLE_ROLE categorises the variables. An automated analysis of the Accuracy dimension related properties of a study variable considers all KEY_ attributes of that study variable that refer to study variables of the category PROCESS.
meta_atts_table <- openxlsx::read.xlsx("media/variable_attributes.xlsx")
DT::datatable(meta_atts_table, options = list(pageLength = min(20, nrow(meta_atts_table))), elementId = "variable_attributes")
The prefix INCL_ is used for generated variable attributes added automatically for internal use. The formats mentioned in the table above are:
| Name | Description | Examples |
|---|---|---|
| String | character data | labels such as BPSYST_01 |
| Numeric | numeric data | variable order numbers |
| Enumeration(A, B, C) | categorical data with the listed categories | data types |
| Assignment | assignments expressed using = and separated using ǀ | 0 = females ǀ 1 = males |
| CSV | comma separated values | missing code lists like 99999, 88888, 12345 |
| Interval | interval notation using [ ] / ( ) for including/excluding limits and Inf / -Inf for open intervals | [50;Inf), [0;10], (-Inf;2] |
| Variable Reference | reference to some other variable storing meta information about each value of the current variable | variable names as given in the meta data attribute VAR_NAMES, e.g. v00019 or SBP_0 |
The recommended attributes will be expanded according to the needs of the project and can be extended to the needs of the users of the generated R routines.
To facilitate editing/creating meta data attributes, a Shiny App has been implemented.
Some meta data vary regarding their length across study variables. Lists of different length cannot be represented in a rectangular data frame. This is the case e.g. for missing lists.
Missing codes for two study variables:

- study variable “v33247”: 99980, 99974, 99976, 99982, 99975, 99992, 99990, 99989, 99995
- study variable “v33259”: 99998, 99999, 99984, 99990, 99975, 99982, 99992, 99977, 99986, 99989, 99981, 99987, 99983
Such structures are given as comma separated strings within the meta data data frames. A function has to know about the exact meaning of such an entry and can then handle it.
# Example use in a function
my_function_3 <- function(study_data, meta_data) {
  ml_var1 <- subset(meta_data, VAR_NAMES == "var_1", select = "MISSING_LIST", drop = TRUE)
  ml_var1_vector <- trimws(strsplit(ml_var1, ",", fixed = TRUE)[[1]]) # split and strip blanks
  value1 <- 99980
  value2 <- 75.35
  if (value1 %in% ml_var1_vector)
    print("For var_1, value1 is a missing code")
  if (value2 %in% ml_var1_vector)
    print("For var_1, value2 is a missing code")
}
my_function_3(study_data = study_data, meta_data = meta_data)
my_function_3(study_data = study_data, meta_data = meta_data)
## [1] "For var_1, value1 is a missing code"
If missing codes are used consistently for all variables of one analysis, the corresponding functions from the dimension Completeness can generate output with labels, if a table translating missing codes to labels is given. This table should be in a CSV format using ; as field separator and containing a header line. The two columns should be CODE_VALUE and CODE_LABEL. Such files can be read by readr::read_csv2 or utils::read.csv2. An example is explained here and available from here.
An example is given below:
CODE_VALUE;CODE_LABEL
99980;Missing - other reason
99981;Missing - exclusion criteria
99982;Missing - refusal
99983;Missing - not assessable
99984;Missing - technical problem
[...]
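Assuming the table is stored in a file named missing_codes.csv (a hypothetical name), it can be read as follows:

# read.csv2 uses ";" as field separator and expects a header line
code_labels <- utils::read.csv2("missing_codes.csv", stringsAsFactors = FALSE)
head(code_labels) # columns CODE_VALUE and CODE_LABEL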
Contradiction checks are widely used in all kinds of studies. A classical example is the number of pregnancies for male study participants not being zero. Such checks are available in the dimension Consistency. They are provided as a separate table, which can be referred to by the meta data attribute CONTRADICTIONS but which, more importantly, vice versa refers to variable labels (variable attribute LABEL).
Each rule refers to two items, which can be variable names, values, or lists of levels/categories (at least one item must refer to a variable). The rules also refer to a function parameterised by these items. The rule then expresses a contradiction.
The available functions are given below (A and B refer to the referred items), each illustrated with an example:

- A is greater than B: e.g. age at baseline should not be greater than age at followup.
- A is less than B: e.g. age at followup should not be less than age at baseline.
- A has a given level while B exceeds a value: e.g. an answer is yes but age is greater than 70.
- A has a given level while B has one of a set of levels: e.g. if smoking is no, the value of cigarette consumption should not be heavy, frequently, or rarely.
- A is not equal to B although both should match: e.g. gender at baseline and gender at followup should usually be the same.
- A is present while B is missing: e.g. the expected birth-date is available but the presumed date of fertilisation is not.
is not.The rules are stored in a CSV file
using #
as field separator and containing a header line.
Each rule has an ID to be amended in the meta data for easier finding
variables covered by contradiction rules. The rules may also have labels
for improved output formatting and tags/categories for stratifying and
aggregating contradictions. The names of the columns are:
An example is given below:
ID#Function_name#A#A_levels#A_value#B#B_levels#B_value#Label
1001#A_less_than_B_vv#AGE_1#NA#NA#AGE_0#NA#NA#Age follow-up
1002#A_not_equal_B_vv#SEX_1#NA#NA#SEX_0#NA#NA#Sex follow-up
1003#A_less_than_B_vv#EDUCATION_1#NA#NA#EDUCATION_0#NA#NA#Education follow-up
1004#A_levels_and_B_levels_ll#EATING_PREFS_0#vegetarian#NA#MEAT_CONS_0#1-2d a week | 3-4d a week | 5-6d a week | daily#NA#Nutrition inconsistency vegetarian
1005#A_levels_and_B_levels_ll#EATING_PREFS_0#vegan#NA#MEAT_CONS_0#1-2d a week | 3-4d a week | 5-6d a week | daily#NA#Nutrition inconsistency vegan
[...]
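Assuming the rules are stored in a file named contradictions.csv (a hypothetical name), they can be read with the custom field separator:

# "#" is the field separator; the file contains a header line
rules <- utils::read.csv("contradictions.csv", sep = "#",
                         stringsAsFactors = FALSE)
rules$Function_name # e.g. "A_less_than_B_vv"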
To facilitate editing/creating contradiction rules, a Shiny App has been implemented.
Function arguments in R: Technically, R passes arguments by reference but employs copy-on-write, which makes arguments look like they are passed by value. Therefore, passing around large constant data frames is usually not a performance problem, except for specific forms of parallel computing.↩︎
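This copy-on-write behaviour can be observed with base R’s tracemem; a small sketch:

df <- data.frame(x = 1:3)
tracemem(df)                           # report whenever df is copied

f_read  <- function(d) nrow(d)         # only reads its argument
f_write <- function(d) { d$x <- 0; d } # modifies its argument

f_read(df)  # no copy is reported: the data frame is passed without duplication
f_write(df) # a tracemem message appears: the modification triggers a copy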