Technical caveats with advanced CI methods (bca, studentized) implemented for block bootstrap methods #801

@mjwillson

Reading around the topic, if I understand correctly, there are some technical caveats to the way this library computes CIs for block bootstrap methods. I wanted to check whether you agree and, if so, whether you have any opinion, rationale, or citation on whether these are reasonable approximations, or on how much of a problem they are in practice.

IID jackknife used for BCa intervals with non-IID data

When using method='bca' with block bootstrap methods, the library uses a jackknife procedure to estimate the acceleration a, but this is the IID jackknife, which doesn't account for serial dependence in the data. One suggestion might be to use a leave-k-out moving block jackknife with the same block size as the outer bootstrap, but I'm not sure whether anything has been proven about it in this specific context, and I wasn't able to find any papers specifically recommending it. I did find a paper [1] recommending a more complicated modification to the BCa method to make it suitable for a block bootstrap setting. However, it assumes that the statistic is a function of a vector mean and that the function's gradient is available.

It seems that common R routines like boot.ci have the same problem: they use the IID jackknife for BCa intervals even when (as is apparently common) they are applied to the results of time-series bootstraps from tsboot etc. So this practice may be widespread, but I'd like to understand whether there's any justification for it as a reasonable approximation.

[1] Götze, F. & Künsch, H. R. (1996). "Second-order correctness of the blockwise bootstrap for stationary observations". The Annals of Statistics, 24(5), 1914–1933.
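
To make the jackknife suggestion above concrete, here is a minimal sketch of how the acceleration could be computed from delete-one-block jackknife values instead of leave-one-out values. It is not this library's implementation: the function name `jackknife_acceleration` and its parameters are hypothetical, and for simplicity it deletes non-overlapping blocks rather than the overlapping blocks of a moving block jackknife. As noted above, I don't know of a proof that this is valid for BCa intervals, so treat it as an illustration only.

```python
import numpy as np


def jackknife_acceleration(x, statistic, block_size=1):
    """BCa acceleration from delete-one-block jackknife values.

    Hypothetical helper for illustration. block_size=1 reduces to the
    usual IID (leave-one-out) jackknife. Blocks are non-overlapping;
    any trailing remainder of the series is never deleted.
    """
    x = np.asarray(x)
    n = len(x)
    n_blocks = n // block_size
    theta = np.empty(n_blocks)
    for i in range(n_blocks):
        start = i * block_size
        # Drop one contiguous block and recompute the statistic on the rest.
        reduced = np.delete(x, slice(start, start + block_size), axis=0)
        theta[i] = statistic(reduced)
    d = theta.mean() - theta
    denom = 6.0 * np.sum(d**2) ** 1.5
    # Degenerate case: all jackknife values identical.
    return float(np.sum(d**3) / denom) if denom > 0 else 0.0


# Illustrative data: AR(1) series with skewed innovations (values are arbitrary).
rng = np.random.default_rng(0)
n = 240
eps = rng.exponential(size=n) - 1.0
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + eps[t]

a_iid = jackknife_acceleration(x, np.mean)                   # IID jackknife
a_block = jackknife_acceleration(x, np.mean, block_size=12)  # block jackknife
print(f"IID acceleration:   {a_iid:.4f}")
print(f"block acceleration: {a_block:.4f}")
```

Comparing the two values on serially dependent data gives a rough feel for how much the choice of jackknife changes the acceleration, and hence the BCa interval endpoints.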

Same block length used for nested bootstrap and outer bootstrap with studentized intervals

IIUC, the theory around consistency for nested applications of the block bootstrap [2] requires the inner block length to grow at a slower rate than the outer block length as a function of sample size / sequence length. But here the inner block length is set equal to the outer block length. It seems one needs at least a way to specify the inner block length (and ideally some automated way to set it, as we have for the outer block length).

[2] Lahiri, S. N. (2003). Resampling Methods for Dependent Data. Springer.
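
As a rough sketch of what a separately specified inner block length could look like, the following nested moving block bootstrap takes `outer_block` and `inner_block` as independent arguments. All names here (`mbb_resample`, `studentized_ci`, and their parameters) are hypothetical and not this library's API. The block lengths in the example are arbitrary, not derived from any selection rule, and the sketch ignores finer points such as how the bootstrap statistic should be centred under block resampling.

```python
import numpy as np


def mbb_resample(x, block_size, rng):
    """One moving block bootstrap resample of the same length as x."""
    n = len(x)
    n_blocks = -(-n // block_size)  # ceiling division
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_size)).ravel()[:n]
    return x[idx]


def studentized_ci(x, statistic, outer_block, inner_block,
                   n_outer=200, n_inner=50, alpha=0.05, seed=0):
    """Studentized (bootstrap-t) interval with distinct block lengths.

    Hypothetical sketch: the outer resamples use outer_block, and the
    inner resamples that estimate each standard error use inner_block.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    theta_hat = statistic(x)
    theta_star = np.empty(n_outer)
    t_star = np.empty(n_outer)
    for b in range(n_outer):
        xb = mbb_resample(x, outer_block, rng)
        theta_star[b] = statistic(xb)
        # Inner bootstrap on the outer resample, with its own block length.
        inner = [statistic(mbb_resample(xb, inner_block, rng))
                 for _ in range(n_inner)]
        se_b = np.std(inner, ddof=1)
        t_star[b] = (theta_star[b] - theta_hat) / se_b
    se_hat = np.std(theta_star, ddof=1)
    q_lo, q_hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
    return theta_hat - q_hi * se_hat, theta_hat - q_lo * se_hat


# Illustrative AR(1) data; block lengths 12 and 4 are arbitrary choices.
rng = np.random.default_rng(1)
x = np.empty(240)
x[0] = rng.normal()
for t in range(1, len(x)):
    x[t] = 0.6 * x[t - 1] + rng.normal()

lower, upper = studentized_ci(x, np.mean, outer_block=12, inner_block=4)
print(f"95% studentized CI for the mean: [{lower:.3f}, {upper:.3f}]")
```

Exposing `inner_block` as its own argument would at least let users follow the slower-growth requirement manually until an automated choice exists.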
