Clean missing data above an acceptable threshold
Source: R/clean_and_downsample.R
clean_missing_data.Rd
This function removes trials and participants that exceed specified thresholds for missing data. Two main parameters control the cleaning: one removes trials with too much missing data, and the other removes participants who lose more than a specified proportion of their trials. An optional parameter lets you specify the total number of trials expected for each participant, which is then used as the denominator when calculating the proportion of missing trials.
Usage
clean_missing_data(
data,
pupil,
trial_threshold = 1,
subject_trial_threshold = 1,
total_trials_expected = NULL
)
Arguments
- data
Your data of class PupillometryR.
- pupil
A column name denoting pupil size.
- trial_threshold
A proportion of missing data over which a trial is considered lost.
- subject_trial_threshold
A proportion of missing trials over which a participant is considered lost.
- total_trials_expected
(Optional) The total number of trials expected for each participant. If specified, it will be used to calculate the proportion of missing trials. If not specified, the proportion is calculated based on the total number of trials in the data.
Value
A cleaned PupillometryR dataframe with trials and participants exceeding the thresholds removed.
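The two thresholds and the optional total_trials_expected described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `sketch_clean()`, the toy data frame, and its `Pupil` column are hypothetical, and the sketch assumes that a trial is dropped when its proportion of missing samples is strictly greater than trial_threshold, and that a participant is dropped when the proportion of lost trials is strictly greater than subject_trial_threshold.

```r
# Toy data (hypothetical): one row per sample, NA marks a missing pupil value.
toy <- data.frame(
  ID    = rep(c("1", "2"), each = 8),
  Trial = rep(rep(c("A", "B"), each = 4), times = 2),
  Pupil = c(1, 2, 3, 4,     1, NA, NA, NA,   # participant 1: A complete, B 75% missing
            NA, NA, NA, 1,  NA, NA, NA, NA), # participant 2: A 75% missing, B all missing
  stringsAsFactors = FALSE
)

# Illustrative helper, NOT the PupillometryR function.
sketch_clean <- function(data, trial_threshold = 1, subject_trial_threshold = 1,
                         total_trials_expected = NULL) {
  # Step 1: proportion of missing samples for each participant/trial pair.
  per_trial <- aggregate(data.frame(Missing = is.na(data$Pupil)),
                         by = list(ID = data$ID, Trial = data$Trial),
                         FUN = mean)
  # Assumption: a trial is lost when its missing proportion exceeds the threshold.
  kept <- per_trial[per_trial$Missing <= trial_threshold, ]

  # Step 2: proportion of lost trials per participant.
  ids <- unique(data$ID)
  n_kept <- vapply(ids, function(i) sum(kept$ID == i), numeric(1))
  n_total <- if (is.null(total_trials_expected)) {
    # Default denominator: the number of trials present in the data.
    vapply(ids, function(i) sum(per_trial$ID == i), numeric(1))
  } else {
    # Optional denominator: the number of trials each participant should have.
    rep(total_trials_expected, length(ids))
  }
  prop_lost <- 1 - n_kept / n_total
  good_ids <- ids[prop_lost <= subject_trial_threshold]

  # Keep only samples from surviving trials of surviving participants.
  keep_key <- paste(kept$ID, kept$Trial)[kept$ID %in% good_ids]
  data[paste(data$ID, data$Trial) %in% keep_key, ]
}

# Participant 1 loses one of two trials (0.5), participant 2 loses both (1).
res <- sketch_clean(toy, trial_threshold = 0.5, subject_trial_threshold = 0.5)

# With 4 trials expected, participant 1 has lost 3 of 4 (0.75) and is dropped too.
res_expected <- sketch_clean(toy, trial_threshold = 0.5,
                             subject_trial_threshold = 0.5,
                             total_trials_expected = 4)
```

In this sketch, the default thresholds of 1 remove nothing, because no proportion can exceed 1. Supplying total_trials_expected makes the participant-level check stricter whenever some trials are absent from the data altogether, since those absent trials then count as lost.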
Examples
data(pupil_data)
Sdata <- make_pupillometryr_data(data = pupil_data,
subject = ID,
trial = Trial,
time = Time,
condition = Type)
new_data <- downsample_time_data(data = Sdata,
pupil = LPupil,
timebin_size = 50,
option = 'mean')
#> Calculating mean pupil size in each timebin
calculate_missing_data(data = new_data, pupil = LPupil)
#> # A tibble: 48 × 3
#> ID Trial Missing
#> <chr> <fct> <dbl>
#> 1 1 Easy1 0
#> 2 1 Hard1 0
#> 3 1 Easy2 0
#> 4 1 Hard2 0
#> 5 1 Easy3 0
#> 6 1 Hard3 0
#> 7 2 Easy1 0
#> 8 2 Hard1 0
#> 9 2 Easy2 0
#> 10 2 Hard2 0
#> # ℹ 38 more rows