This function takes a data frame and creates a data dictionary that records, for each variable, its name, a human-readable name, its type, and a description. If a model is specified, the function uses OpenAI's API to generate this information from the characteristics of the data frame.
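To make the four fields concrete, here is a minimal base R sketch of what an unfilled data dictionary might look like. It is not the package's implementation: the helper name `scaffold_dictionary` and the column names `variable`, `name`, `type`, and `description` are assumptions chosen for illustration.

```r
# Hypothetical helper, not part of the package: builds an empty
# dictionary with one row per column of `data`.
scaffold_dictionary <- function(data) {
  stopifnot(is.data.frame(data))
  data.frame(
    variable = names(data),
    # Rough readable label: replace dots and underscores with spaces.
    name = gsub("[._]+", " ", names(data)),
    # First class of each column, e.g. "numeric" or "factor".
    type = vapply(data, function(col) class(col)[1], character(1)),
    # Descriptions are left blank to be filled in later.
    description = NA_character_,
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

dict <- scaffold_dictionary(iris)
print(dict)
```

In this sketch the descriptions start out as `NA`, which is roughly the state a person or a model would then complete.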
Usage
create_data_dictionary(
data,
file_path,
model = NULL,
sample_n = 5,
grouping = NULL,
force = FALSE
)
Arguments
- data
A data frame to create a data dictionary for.
- file_path
The file path to save the data dictionary to.
- model
The ID of the OpenAI chat completion model to use for generating descriptions (see openai::list_models()). If NULL (default), only a scaffold for the data dictionary is created.
- sample_n
The number of rows to sample from the data frame to use as input for the model. Default 5.
- grouping
A character vector of column names to group by when sampling rows from the data frame for the model. Default NULL.
- force
If TRUE, overwrite the file at file_path if it already exists. Default FALSE.
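The interaction between sample_n and grouping can be illustrated with a short base R sketch. This is an assumption about the behaviour, not the package's code: it supposes that up to sample_n rows are drawn within each group, and the helper name `sample_rows` is hypothetical.

```r
# Hypothetical helper: draw up to `sample_n` rows, per group when
# `grouping` is supplied, otherwise from the whole data frame.
sample_rows <- function(data, sample_n = 5, grouping = NULL) {
  take <- function(df) {
    # Never ask for more rows than the (sub)frame contains.
    df[sample.int(nrow(df), min(sample_n, nrow(df))), , drop = FALSE]
  }
  if (is.null(grouping)) {
    return(take(data))
  }
  # One data frame per observed combination of grouping values.
  groups <- split(data, data[grouping], drop = TRUE)
  out <- do.call(rbind, lapply(groups, take))
  rownames(out) <- NULL
  out
}

set.seed(1)
rows <- sample_rows(iris, sample_n = 2, grouping = "Species")
print(rows)
```

Sampling within groups, as sketched here, would ensure that every level of the grouping columns is represented in the rows sent to the model, which a plain random sample of a few rows cannot guarantee.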