3 - Item Response Theory with Stan
In this lab, you will explore item response theory and Bayesian modelling with the Stan language.
Setup
First, you need to install Stan. This may take several minutes :))
[ ]:
import numpy as np
import pandas as pd
# Colab setup (courtesy of Justin Bois)
# N.B. This cell may take several minutes to complete (3 mins on the instructor's machine)
import os, sys, subprocess
cmd = "pip install --upgrade iqplot bebi103 arviz cmdstanpy watermark"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process.communicate()
import cmdstanpy; cmdstanpy.install_cmdstan()
CmdStan install directory: /root/.cmdstan
Installing CmdStan version: 2.35.0
Downloading CmdStan version 2.35.0
Download successful, file: /tmp/tmpcn44248l
Extracting distribution
DEBUG:cmdstanpy:cmd: make build -j1
cwd: None
Unpacked download as cmdstan-2.35.0
Building version cmdstan-2.35.0, may take several minutes, depending on your system.
DEBUG:cmdstanpy:cmd: make examples/bernoulli/bernoulli
cwd: None
Installed cmdstan-2.35.0
Test model compilation
True
Next, you need to download the data and Stan template here. Save it to your own Google Drive as in previous labs, and then mount your drive.
[ ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
Unzip the files into a folder (you will be able to find this folder if you click the folder icon in your left sidebar):
[ ]:
!unzip -qq '/content/drive/MyDrive/irt4ancm.zip'
The following cell prints a list of all of the segments used in the experiment, so that you can find and listen to the results. All of the audio was extracted from the official YouTube videos of the Eurovision Song Contest finals.
Background
The data in this lab come from the Eurovision Song Contest edition of the Hooked on Music experiment. You can try the experiment here. In this experiment, people were presented with segments from Eurovision songs and were asked if they’d ever heard the song. If their answer was ‘yes’, the song was muted for a few seconds and then resumed. In some trials, the song resumed at the right point. In others, it resumed a bit earlier or later. Participants were then also asked whether the second segment was the ‘right’ continuation of the first or not.
[ ]:
segment_df = pd.read_csv('irt4ancm/segment_list.csv')
segment_df = segment_df.set_index('segment')
segment_df
| segment | song | country | year | artist | title | start_position | segment_type |
|---|---|---|---|---|---|---|---|
| 1 | 1 | Ukraine | 2016 | Jamala | 1944 | 0.000 | i |
| 2 | 1 | Ukraine | 2016 | Jamala | 1944 | 7.925 | v |
| 3 | 1 | Ukraine | 2016 | Jamala | 1944 | 39.500 | c |
| 4 | 1 | Ukraine | 2016 | Jamala | 1944 | 72.043 | v |
| 5 | 1 | Ukraine | 2016 | Jamala | 1944 | 132.559 | b |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 433 | 69 | Czechia | 2019 | Lake Malawi | Friend of a Friend | 78.128 | v |
| 434 | 70 | Denmark | 2019 | Leonora | Love Is Forever | 61.508 | v |
| 435 | 71 | Cyprus | 2019 | Tamta | Replay | 66.212 | v |
| 436 | 73 | Slovenia | 2019 | Zala Kralj & Gašper Šantl | Sebi | 70.698 | v |
| 437 | 75 | Serbia | 2019 | Nevena Božović | Kruna | 106.544 | v |

437 rows × 7 columns
Lab
Open the irt4ancm.stan file in the right-hand pane. You will make any adjustments to your model there. Here is a breakdown of what the main code blocks in irt4ancm.stan are doing:
data {
int<lower=1> M; // number of observations
int<lower=1> N; // number of participants
int<lower=1> I; // number of song segments
int<lower=1> J; // number of songs
array[M] int<lower=0,upper=1> is_recognised; // was the segment recognised?
array[M] int<lower=0,upper=1> is_verified; // was the segment verified correctly?
array[M] int<lower=1,upper=N> participant; // participant number
array[M] int<lower=1,upper=I> seg; // segment number
array[M] int<lower=1,upper=J> song; // song number
array[M] int<lower=0,upper=1> continuation_correctness; // did the verification segment restart in the correct place?
vector<lower=0>[M] recognition_time; // how long did it take to recognise the segment?
matrix[I,10] audio_features; // audio features for each segment
vector[N] sophistication; // Goldsmith's music sophistication for each participant
}
This block is simply defining the variables corresponding to the data we’re going to fit the model on.
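Before sampling, it can help to check that your data dictionary matches these declarations. Here is a minimal sketch with invented toy values (the field names come from the data block above; the real inputs are the JSON files provided with the lab):

```python
import json

# A toy data dictionary with the same fields the Stan data block expects.
# Sizes are tiny and all values are invented purely for illustration.
toy_data = {
    "M": 3, "N": 2, "I": 2, "J": 1,
    "is_recognised": [1, 0, 1],
    "is_verified": [1, 0, 0],
    "participant": [1, 1, 2],
    "seg": [1, 2, 1],
    "song": [1, 1, 1],
    "continuation_correctness": [1, 0, 1],
    "recognition_time": [2.1, 5.3, 1.8],
    "audio_features": [[0.0] * 10, [0.0] * 10],  # one row of 10 features per segment
    "sophistication": [0.5, -0.2],               # one value per participant
}

# Every per-observation array must have length M,
# and audio_features must have I rows.
per_observation = ["is_recognised", "is_verified", "participant", "seg",
                   "song", "continuation_correctness", "recognition_time"]
assert all(len(toy_data[k]) == toy_data["M"] for k in per_observation)
assert len(toy_data["audio_features"]) == toy_data["I"]

# CmdStanPy accepts either a dict like this or a path to a JSON file as data=.
print(json.dumps({"M": toy_data["M"], "N": toy_data["N"]}))
```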
parameters {
real mu_delta;
real<lower=0> sigma_theta; // participant prior SD
real<lower=0> sigma_delta; // difficulty prior SD
vector[N] theta; // participant abilities
vector[I] delta; // segment difficulties
}
Here is where you declare your parameters. Any parameter that you plan to include in the model needs to be specified here. In this case, we’re starting off with a Rasch (1PL) model, where the only parameters are \(\theta\) (the participant’s ability) and \(\delta\) (the difficulty of the segment). As you experiment with more complex models (2PL, 3PL, and 4PL), you will need to add more parameters. For each parameter, we should also specify our hypothesis about its distribution, which is what the following block (included inside model) does:
// Hyperpriors
mu_delta ~ std_normal();
sigma_theta ~ std_normal(); // Stan automatically cuts off the negative values
sigma_delta ~ std_normal(); // Stan automatically cuts off the negative values
// Priors
theta ~ normal(0, sigma_theta);
delta ~ normal(mu_delta, sigma_delta);
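As the comments note, a std_normal() prior on a parameter declared with <lower=0> is truncated at zero, i.e. it becomes a half-normal. Here is a quick Python sketch (for intuition only, not part of the lab code) of what that prior implies for the scale parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# |Z| with Z ~ Normal(0, 1) has the same distribution as a std_normal()
# prior truncated at zero, so absolute values mimic the half-normal prior.
half_normal = np.abs(rng.standard_normal(200_000))

# Roughly 95% of the prior mass for sigma_theta / sigma_delta sits below
# about 2, which is weakly informative on the logit scale.
print(np.quantile(half_normal, 0.95))
```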
How do we decide which distribution to choose for each parameter? Unless you have a specific educated guess, starting with a normal distribution is usually a good choice. Let’s now move to the core part of the code, where our model is actually defined:
is_verified[m] ~ bernoulli_logit(theta[participant[m]] - delta[seg[m]]);
This line essentially means that the variable you are trying to model (is_verified) is distributed as (~) a Bernoulli distribution whose probability of success is the inverse logit of \(\theta-\delta\). When editing the irt4ancm.stan file, you might actually want to start from this line. You can choose to predict either the is_verified or the is_recognised variable.
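To build intuition for bernoulli_logit, here is a small Python sketch (a hypothetical helper for illustration, not part of the template) of the success probability it implies:

```python
import math

def inv_logit(z):
    """Inverse logistic function: maps log-odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Under the Rasch model, the log-odds of success are z = theta - delta.
print(inv_logit(1.0 - 1.0))        # ability == difficulty -> 0.5
print(inv_logit(2.0 - 0.0) > 0.5)  # able participant, easy item -> True
```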
[ ]:
from cmdstanpy import CmdStanModel
As we’ve just seen, the ‘recipe’ for our model is defined in the stan file. Now we will actually compile the model. Every time you change the model, you will need to save the stan file and recompile it by running the cell below.
💡 Tip: Colab will probably save your edits to the stan file automatically. However, be aware that the saved changes may be available only within the session. In other words, when the notebook runtime is disconnected, you may lose the edits you make to your stan file. To make sure you don’t lose your progress, you may want to use one of the following strategies:
- Unzip the irt4ancm.zip folder directly in Google Drive and make sure you edit the stan file saved to your Drive.
- Alternatively, simply download the stan file before closing Colab: hover over the stan file in the left pane, click the three dots, and select ‘Download’.
[ ]:
model = CmdStanModel(model_name="irt4ancm", stan_file="irt4ancm/irt4ancm.stan")
13:02:37 - cmdstanpy - WARNING - CmdStanModel(model_name=...) is deprecated and will be removed in the next major version.
WARNING:cmdstanpy:CmdStanModel(model_name=...) is deprecated and will be removed in the next major version.
13:02:37 - cmdstanpy - INFO - compiling stan file /content/irt4ancm/irt4ancm.stan to exe file /content/irt4ancm/irt4ancm
INFO:cmdstanpy:compiling stan file /content/irt4ancm/irt4ancm.stan to exe file /content/irt4ancm/irt4ancm
DEBUG:cmdstanpy:cmd: make STANCFLAGS+=--filename-in-msg=irt4ancm.stan /content/irt4ancm/irt4ancm
cwd: /root/.cmdstan/cmdstan-2.35.0
DEBUG:cmdstanpy:Console output:
--- Translating Stan model to C++ code ---
bin/stanc --filename-in-msg=irt4ancm.stan --o=/content/irt4ancm/irt4ancm.hpp /content/irt4ancm/irt4ancm.stan
--- Compiling C++ code ---
g++ -std=c++17 -pthread -D_REENTRANT -Wno-sign-compare -Wno-ignored-attributes -Wno-class-memaccess -I stan/lib/stan_math/lib/tbb_2020.3/include -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.84.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials -DBOOST_DISABLE_ASSERTS -c -Wno-ignored-attributes -x c++ -o /content/irt4ancm/irt4ancm.o /content/irt4ancm/irt4ancm.hpp
--- Linking model ---
g++ -std=c++17 -pthread -D_REENTRANT -Wno-sign-compare -Wno-ignored-attributes -Wno-class-memaccess -I stan/lib/stan_math/lib/tbb_2020.3/include -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.84.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials -DBOOST_DISABLE_ASSERTS -Wl,-L,"/root/.cmdstan/cmdstan-2.35.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/root/.cmdstan/cmdstan-2.35.0/stan/lib/stan_math/lib/tbb" /content/irt4ancm/irt4ancm.o src/cmdstan/main.o -ltbb stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_idas.a stan/lib/stan_math/lib/sundials_6.1.1/lib/libsundials_kinsol.a stan/lib/stan_math/lib/tbb/libtbb.so.2 -o /content/irt4ancm/irt4ancm
rm /content/irt4ancm/irt4ancm.o /content/irt4ancm/irt4ancm.hpp
13:02:49 - cmdstanpy - INFO - compiled model executable: /content/irt4ancm/irt4ancm
INFO:cmdstanpy:compiled model executable: /content/irt4ancm/irt4ancm
We fit the model here using all_plays.json, which contains a complete set of data. You may find it more interesting to explore rec_only.json as an alternative, which contains only plays where the participant claimed to recognise the segment.
[ ]:
fit = model.sample(data="irt4ancm/all_plays.json")
DEBUG:cmdstanpy:cmd: /content/irt4ancm/irt4ancm info
cwd: None
13:02:53 - cmdstanpy - INFO - CmdStan start processing
INFO:cmdstanpy:CmdStan start processing
DEBUG:cmdstanpy:idx 0
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/content/irt4ancm/irt4ancm', 'id=1', 'random', 'seed=84057', 'data', 'file=irt4ancm/all_plays.json', 'output', 'file=/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 1
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/content/irt4ancm/irt4ancm', 'id=2', 'random', 'seed=84057', 'data', 'file=irt4ancm/all_plays.json', 'output', 'file=/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_2.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 2
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/content/irt4ancm/irt4ancm', 'id=3', 'random', 'seed=84057', 'data', 'file=irt4ancm/all_plays.json', 'output', 'file=/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_3.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
DEBUG:cmdstanpy:idx 3
DEBUG:cmdstanpy:running CmdStan, num_threads: 1
DEBUG:cmdstanpy:CmdStan args: ['/content/irt4ancm/irt4ancm', 'id=4', 'random', 'seed=84057', 'data', 'file=irt4ancm/all_plays.json', 'output', 'file=/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_4.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
13:05:30 - cmdstanpy - INFO - CmdStan done processing.
INFO:cmdstanpy:CmdStan done processing.
DEBUG:cmdstanpy:runset
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
cmd (chain 1):
['/content/irt4ancm/irt4ancm', 'id=1', 'random', 'seed=84057', 'data', 'file=irt4ancm/all_plays.json', 'output', 'file=/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
retcodes=[0, 0, 0, 0]
per-chain output files (showing chain 1 only):
csv_file:
/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv
console_msgs (if any):
/tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_0-stdout.txt
DEBUG:cmdstanpy:Chain 1 console:
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = false (Default)
thin = 1 (Default)
adapt
engaged = true (Default)
gamma = 0.05 (Default)
delta = 0.8 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
save_metric = false (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
num_chains = 1 (Default)
id = 1 (Default)
data
file = irt4ancm/all_plays.json
init = 2 (Default)
random
seed = 84057
output
file = /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = profile.csv (Default)
save_cmdstan_config = false (Default)
num_threads = 1 (Default)
Gradient evaluation took 0.007894 seconds
1000 transitions using 10 leapfrog steps per transition would take 78.94 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Iteration: 100 / 2000 [ 5%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 300 / 2000 [ 15%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 500 / 2000 [ 25%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 700 / 2000 [ 35%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 900 / 2000 [ 45%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1001 / 2000 [ 50%] (Sampling)
Iteration: 1100 / 2000 [ 55%] (Sampling)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1300 / 2000 [ 65%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1500 / 2000 [ 75%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1700 / 2000 [ 85%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 1900 / 2000 [ 95%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
Elapsed Time: 48.617 seconds (Warm-up)
29.932 seconds (Sampling)
78.549 seconds (Total)
13:05:30 - cmdstanpy - WARNING - Non-fatal error during sampling:
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Consider re-running with show_console=True if the above output is unclear!
WARNING:cmdstanpy:Non-fatal error during sampling:
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 32, column 2 to column 33)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in 'irt4ancm.stan', line 33, column 2 to column 40)
Consider re-running with show_console=True if the above output is unclear!
Stan has a handy set of diagnostics that can warn you of any problems with your model fit. For the purposes of this lab, you will probably not have time to fix any problems, but you can report on them in the assignment.
[ ]:
print(fit.diagnose())
DEBUG:cmdstanpy:cmd: /root/.cmdstan/cmdstan-2.35.0/bin/diagnose /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_2.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_3.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_4.csv
cwd: None
Processing csv files: /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv, /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_2.csv, /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_3.csv, /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_4.csv
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
If the model is (mostly) problem-free, you can look at a summary of the parameter values. Remember that we get not a specific value but rather a whole posterior distribution for each parameter. Stan reports the mean, Monte Carlo standard error, standard deviation, and (most popular in the literature) the 5%/50%/95% quantiles.
The final three columns are convergence statistics. As a (very) rough rule of thumb, you want N_Eff to be above 400 and R_hat to be less than 1.05.
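Since fit.summary() returns a pandas DataFrame, you can filter it by these rules of thumb. A sketch using a toy frame with the same column names (the values are invented, and the thresholds are the rough rules of thumb above, not cmdstanpy defaults):

```python
import pandas as pd

# Toy stand-in for fit.summary(): same column names, invented values.
summary = pd.DataFrame(
    {"N_Eff": [871.7, 120.2, 2209.5], "R_hat": [1.0037, 1.0331, 1.0019]},
    index=["lp__", "mu_delta", "delta[433]"],
)

# Flag parameters that miss the rough rules of thumb:
# N_Eff above 400 and R_hat below 1.05.
flagged = summary[(summary["N_Eff"] < 400) | (summary["R_hat"] > 1.05)]
print(flagged.index.tolist())  # ['mu_delta']
```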
[ ]:
fit.summary()
DEBUG:cmdstanpy:cmd: /root/.cmdstan/cmdstan-2.35.0/bin/stansummary --percentiles= 5,50,95 --sig_figs=6 --csv_filename=/tmp/tmpmu1ubl7v/stansummary-irt4ancm-rukd60vu.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_1.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_2.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_3.csv /tmp/tmpmu1ubl7v/irt4ancmlqtbjyc1/irt4ancm-20240906130253_4.csv
cwd: None
| | Mean | MCSE | StdDev | 5% | 50% | 95% | N_Eff | N_Eff/s | R_hat |
|---|---|---|---|---|---|---|---|---|---|
| lp__ | -5525.360000 | 0.895102 | 26.427900 | -5568.770000 | -5525.080000 | -5482.130000 | 871.729 | 7.37628 | 1.003680 |
| mu_delta | 1.259720 | 0.008797 | 0.096447 | 1.102020 | 1.260350 | 1.420080 | 120.208 | 1.01716 | 1.033090 |
| sigma_theta | 1.673200 | 0.002529 | 0.074205 | 1.552020 | 1.670900 | 1.798820 | 861.049 | 7.28591 | 1.003320 |
| sigma_delta | 0.813807 | 0.001415 | 0.044432 | 0.743547 | 0.812273 | 0.889880 | 986.007 | 8.34326 | 1.001050 |
| theta[1] | -1.869670 | 0.013852 | 0.785863 | -3.246340 | -1.820170 | -0.692069 | 3218.810 | 27.23650 | 0.999585 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| delta[433] | 0.739824 | 0.007818 | 0.367495 | 0.144554 | 0.739232 | 1.343020 | 2209.490 | 18.69600 | 1.001850 |
| delta[434] | 1.223600 | 0.008058 | 0.379633 | 0.606406 | 1.212650 | 1.860400 | 2219.480 | 18.78050 | 1.000750 |
| delta[435] | 1.011530 | 0.008252 | 0.384406 | 0.382054 | 1.006030 | 1.650400 | 2169.800 | 18.36010 | 1.003760 |
| delta[436] | 1.620390 | 0.008185 | 0.380285 | 1.005200 | 1.610360 | 2.249280 | 2158.780 | 18.26690 | 1.003100 |
| delta[437] | 2.033020 | 0.009591 | 0.422329 | 1.374380 | 2.026730 | 2.750340 | 1938.970 | 16.40690 | 1.001530 |

936 rows × 9 columns
You may want to rearrange the parameter estimates into a NumPy array (or even a pandas DataFrame, once your model has more parameters), so that you can look at the \(\delta\) estimates for each segment.
[ ]:
deltas = fit.stan_variables()['delta'].mean(axis=0)
By simply sorting the array, you can then identify the segments corresponding to the highest and lowest \(\delta\) values.
[ ]:
# segment indices (0-based) corresponding to the lowest deltas
print(np.argsort(deltas)[:10])
# segment indices (0-based) corresponding to the highest deltas
print(np.flip(np.argsort(deltas))[:10])
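One caveat: np.argsort returns 0-based positions, whereas the segment index in segment_df starts at 1. A toy sketch (with invented values) of how to map sorted positions back to segment rows:

```python
import numpy as np
import pandas as pd

deltas = np.array([0.9, -1.2, 0.3])  # invented posterior means for 3 segments
segment_df = pd.DataFrame(
    {"title": ["1944", "1944", "1944"]},
    index=pd.Index([1, 2, 3], name="segment"),
)

# argsort gives 0-based positions into deltas; segment numbers start at 1,
# so add 1 before looking rows up in segment_df.
easiest = np.argsort(deltas)[:2] + 1  # lowest delta = easiest segments
print(easiest.tolist())  # [2, 3]
print(segment_df.loc[easiest, "title"].tolist())
```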
Edit irt4ancm.stan to try different models. Ask Ashley for help with the syntax! Handy distributions include:
- ~ std_normal() for a standard normal (or half-normal) distribution
- ~ normal(mu, sigma) for a normal distribution with specified mean and standard deviation
- ~ lognormal(mu, sigma) for a log-normal distribution (handy for discrimination parameters)
- ~ bernoulli(p) for a Bernoulli distribution parameterised by the probability of success
- ~ bernoulli_logit(z) for a Bernoulli distribution parameterised by the log-odds z, where the probability of success is the inverse logistic function of z.
The full 4PL IRT model looks like this:
\(\mathrm{P}[x_{ni} = 1] = \gamma_i + (\zeta_i - \gamma_i) \frac{\mathrm{e}^{\alpha_i(\theta_n - \delta_i)}}{1 + \mathrm{e}^{\alpha_i(\theta_n - \delta_i)}}\)
For the 3PL, \(\zeta\) is fixed to 1.
For the 2PL, \(\zeta\) is fixed to 1 and \(\gamma\) is fixed to 0.
For the 1PL (Rasch model), \(\zeta\) is fixed to 1, \(\gamma\) is fixed to 0, and \(\alpha\) is fixed to 1.
Don’t forget to add priors as you add more parameters!
WARNING: In the 2PL, 3PL, and 4PL models, \(\theta\) needs to follow a standard normal distribution, and there can be no hyperparameter \(\sigma_\theta\). Otherwise the model is not identified, and Stan will run into many problems while sampling.
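The 4PL formula above can be sketched as a plain Python function (a hypothetical helper for intuition, not something the Stan template needs), with defaults that reduce it to the simpler models:

```python
import math

def irt_probability(theta, delta, alpha=1.0, gamma=0.0, zeta=1.0):
    """4PL response probability; the defaults reduce it to the Rasch (1PL) model."""
    logistic = 1.0 / (1.0 + math.exp(-alpha * (theta - delta)))
    return gamma + (zeta - gamma) * logistic

# 1PL: when ability equals difficulty, the success probability is exactly 0.5.
print(irt_probability(0.0, 0.0))  # 0.5
# 3PL: a guessing parameter gamma puts a floor on the probability,
# even for a very weak participant.
print(irt_probability(-10.0, 0.0, gamma=0.25))  # just above 0.25
```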
ASSIGNMENT
1. Explore 1-, 2-, 3- and 4-parameter IRT models for the Hooked on Music data according to the template. Which segments are most difficult? Which are easiest? Most/least discriminating? Are the guessing parameters what you would expect? (Tip: if you’re struggling with the syntax, here is the IRT section of the Stan Users Guide.)
2. Explore an alternative data model (e.g., using rec_only.json or focussing on is_recognised instead of is_verified), again with 1-, 2-, 3- and 4-parameter IRT models. How do your results compare to what you found in Step 1?
3. Write a short report (less than one page) summarising your findings and (to the extent you can) any musical explanations or surprising findings based on what you can hear in the songs.
⚠️ Please make sure you include your irt4ancm.stan file(s) in the assignment submission, as well as the notebook and the PDF with your report.
Additional Resources
- The cmdstanpy documentation
- The Stan User’s Guide