# VPT - Difference of Means

library(splithalfr)

This vignette describes a scoring method for Visual Probe Task (VPT) data, similar to Mogg and Bradley (1999): the difference of mean reaction times (RTs) between probe-at-test and probe-at-control conditions, computed for correct responses only, after removing RTs below 200 ms and above 520 ms.

# Dataset

Load the included VPT dataset and inspect its documentation.

data("ds_vpt", package = "splithalfr")
?ds_vpt

## Relevant variables

The columns used in this example are:

• UserID, which identifies participants
• block_type, in order to select assessment blocks only
• patt, in order to compare trials in which the probe is at the test or at the control stimulus
• response, in order to select correct responses only
• rt, in order to drop RTs outside of the range [200, 520] and calculate means per level of patt
• thor, which is the horizontal position of the test stimulus
• keep, which indicates whether the probe was superimposed on the stimuli or replaced them

## Data preparation

Select only trials from assessment blocks.

ds_vpt <- subset(ds_vpt, block_type == "assess")

## Counterbalancing

The variables patt, thor, and keep were counterbalanced. Below we illustrate this for the first participant.

ds_1 <- subset(ds_vpt, UserID == 1)
table(ds_1$patt, ds_1$thor, ds_1$keep)

# Scoring the VPT

## Scoring function

The scoring function calculates the score of a single participant as follows:

1. select only correct responses
2. drop responses with RTs outside of the range [200, 520]
3. calculate the mean RT of the remaining responses per level of patt, and subtract the mean for patt == "yes" from the mean for patt == "no"

fn_score <- function (ds) {
  # Steps 1 and 2: keep correct responses with RTs within [200, 520]
  ds_keep <- ds[ds$response == 1 & ds$rt >= 200 & ds$rt <= 520, ]
  # Step 3: mean RT per level of patt, then their difference
  rt_yes <- mean(ds_keep[ds_keep$patt == "yes", ]$rt)
  rt_no <- mean(ds_keep[ds_keep$patt == "no", ]$rt)
  return (rt_no - rt_yes)
}
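
To make the three scoring steps concrete, they can be traced by hand on a small data frame. This is only an illustrative sketch: the `toy` data and the `toy_*` names are made up for this example and are not part of ds_vpt or of splithalfr.

```r
# Hypothetical toy trials for one participant (not real VPT data)
toy <- data.frame(
  patt     = c("yes", "yes", "yes", "no", "no", "no"),
  response = c(1, 1, 0, 1, 1, 1),
  rt       = c(300, 400, 350, 450, 150, 500)
)

# Steps 1 and 2: the incorrect trial (rt 350) and the too-fast trial (rt 150) drop out
toy_keep <- toy[toy$response == 1 & toy$rt >= 200 & toy$rt <= 520, ]

# Step 3: mean RT per level of patt, then the difference
toy_rt_yes <- mean(toy_keep$rt[toy_keep$patt == "yes"])  # mean of 300 and 400
toy_rt_no  <- mean(toy_keep$rt[toy_keep$patt == "no"])   # mean of 450 and 500
toy_score  <- toy_rt_no - toy_rt_yes
toy_score
```

With these numbers the two means are 350 and 475, so the score is 125. A positive score means responses were faster when patt was "yes".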

## Scoring a single participant

Let’s calculate the VPT score for the participant with UserID 23. NB: this score has also been calculated manually via Excel in the splithalfr repository.

fn_score(subset(ds_vpt, UserID == 23))

## Scoring all participants

To calculate the VPT score for each participant, we will use R’s native by function and convert the result to a data frame.

scores <- by(
ds_vpt,
ds_vpt$UserID,
fn_score
)
data.frame(
UserID = names(scores),
score = as.vector(scores)
)

# Estimating split-half reliability

## Calculating split scores

To calculate split-half scores for each participant, use the function by_split. The first three arguments of this function are the same as for by. An additional set of arguments allows you to specify how to split the data and how often. In this vignette we will calculate scores of 1000 bootstrapped splits. The trial properties patt, thor and keep were counterbalanced in the VPT design. We will stratify splits by these trial properties. See the vignette on splitting methods for more ways to split the data.

The by_split function returns a data frame with the following columns:

• participant, which identifies participants
• replication, which counts replications
• score_1 and score_2, which are the scores calculated for each of the split datasets

Calculating the split scores may take a while. By default, by_split uses all available CPU cores, but no progress bar is displayed. Setting ncores = 1 will display a progress bar, but processing will be slower.

split_scores <- by_split(
ds_vpt,
ds_vpt$UserID,
fn_score,
replications = 1000,
stratification = paste(ds_vpt$patt, ds_vpt$thor, ds_vpt$keep)
)
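
To illustrate what stratifying a split means, the base R sketch below splits one participant's trials in two halves so that each combination of trial properties is equally represented in both parts. It is a simplified illustration, not by_split's implementation: it draws one plain random split without replacement, it stratifies on two properties only, and the `toy` data (including the "left" and "right" values for thor) are made up.

```r
set.seed(1)

# Hypothetical toy trials for one participant: 2 x 2 design, 4 trials per cell
toy <- expand.grid(patt = c("yes", "no"), thor = c("left", "right"), rep = 1:4)
toy$rt <- round(runif(nrow(toy), 250, 500))

# One stratum label per trial, built the same way as the stratification argument
strata <- paste(toy$patt, toy$thor)

# Within each stratum, randomly assign half of the trials to part 1
in_part_1 <- logical(nrow(toy))
for (rows in split(seq_len(nrow(toy)), strata)) {
  picked <- rows[sample.int(length(rows), length(rows) %/% 2)]
  in_part_1[picked] <- TRUE
}
part_1 <- toy[in_part_1, ]
part_2 <- toy[!in_part_1, ]

# Each stratum contributes the same number of trials to both parts
table(strata[in_part_1])
table(strata[!in_part_1])
```

A scoring function would then be applied to `part_1` and `part_2` separately, giving one pair of scores for this participant and this split.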

## Calculating reliability coefficients

Next, the output of by_split can be analyzed in order to estimate reliability. By default, functions are provided that calculate Spearman-Brown adjusted Pearson correlations (spearman_brown), Flanagan-Rulon (flanagan_rulon), Angoff-Feldt (angoff_feldt), and Intraclass Correlation (short_icc) coefficients. Each of these coefficient functions can be used with split_coefs to calculate the corresponding coefficients per split, which can then be plotted or averaged via a simple mean. A bias-corrected and accelerated bootstrap confidence interval can be calculated via split_ci.

# Spearman-Brown adjusted Pearson correlations per replication
coefs <- split_coefs(split_scores, spearman_brown)
# Distribution of coefficients
hist(coefs)
# Mean of coefficients
mean(coefs)
# Confidence interval of coefficients
split_ci(split_scores, coefs)
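
As a rough illustration of what "coefficients per split" means, the sketch below computes a Spearman-Brown adjusted Pearson correlation for each replication of a small made-up table of split scores, using base R only. The adjustment used is the standard formula 2r / (1 + r). The `toy_splits` data and the helper `sb_adjust` are assumptions for this example, and the package's own spearman_brown and split_coefs may differ in details.

```r
# Hypothetical split scores: 2 replications x 4 participants
toy_splits <- data.frame(
  replication = rep(1:2, each = 4),
  participant = rep(1:4, times = 2),
  score_1 = c(10, 20, 30, 40, 12, 18, 33, 41),
  score_2 = c(12, 19, 33, 38,  9, 22, 28, 43)
)

# Spearman-Brown adjustment of a correlation between two half-length tests
sb_adjust <- function(r) 2 * r / (1 + r)

# One coefficient per replication: correlate the two parts across participants
toy_coefs <- sapply(
  split(toy_splits, toy_splits$replication),
  function(ds) sb_adjust(cor(ds$score_1, ds$score_2))
)
toy_coefs

# Average over replications
mean(toy_coefs)
```

Averaging over many replications reduces the dependence of the estimate on any single, arbitrary way of splitting the trials.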