exploratory analysis | Štěpán Bahník

In the last year or so, talk about pre-registration has become increasingly frequent in psychology. Some see it as one of the possible remedies for problems that undermine the trustworthiness of published results, but not everyone is convinced that the benefits of pre-registration outweigh its disadvantages. Here, we focus on one critique of pre-registration that is sometimes raised: some people argue that pre-registration precludes exploration of data and therefore prevents important serendipitous findings. This would in turn slow down the scientific process, a cost which, these critics argue, is just not worth it.

However, we believe that such critiques are groundless and stem from a misunderstanding of how pre-registration works in practice. For this reason, we want to share our experience of conducting and publishing a pre-registered study in which we tested our main hypothesis first and then explored the heck out of our data.

It is true that if all analyses had to be pre-registered, exploratory analysis would not be possible. But that is not the case: you are supposed to pre-register only your main hypothesis and how you are going to test it. And we can all agree that experiments should test a specific hypothesis and that researchers should know in advance which specific manipulation is supposed to influence which specific outcome. If your experiment does not do that, you should think about its design a bit more. But if you have a specific hypothesis and a way to test it in mind, there is no reason why you cannot commit to that hypothesis by putting your whole design in a time-stamped document (e.g. on OSF). After you commit to your study design and hypothesis, there is little possibility that you will be able to fool yourself (and others) into believing that you predicted an outcome that you in fact did not. However, this does not mean that you cannot go crazy analyzing your data after you do that initial test of the main hypothesis. The only thing to watch for is that you keep your confirmatory and exploratory findings separate, not only in the results section but also in the discussion. It may be tempting to focus on interpreting the exploratory findings, especially when the data do not support your original hypothesis. However, that would defeat the purpose of pre-registration, which lies in making the results of hypothesis testing trustworthy and reliable.

All this writing about the possibility of doing exploratory analysis after a pre-registered confirmatory analysis has been a bit abstract. Fortunately, we can illustrate the idea with a concrete example from our own study. We were interested in a moderator influencing the effectiveness of a positive psychology exercise. In particular, the “Three good things in life” exercise asks you to write down, each day, three good things that happened to you during that day. Initial research suggested that this exercise may improve happiness and decrease depressive symptoms. However, this effect did not appear immediately, but only after some time had passed. Now, it is possible that recalling three things that went well during a day may not be easy for some people. And there is a literature on processing fluency which suggests that any difficulty encountered during the recollection may be interpreted as a sign that your day was not that good. Recalling the things probably gets easier with practice, and the effect of the exercise may thus occur only after a certain time. Which leads us to our study.

We recruited 204 students who did the exercise on our website for two weeks. However, not all students wrote three good things. They were randomly assigned to write one to ten things each day. Our hypothesis was simple: we expected that writing more things would lead to a smaller increase in life satisfaction from the pre-exercise to the post-exercise measurement. It did not.
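To make the design described above concrete, here is a minimal sketch in Python of random assignment followed by a single confirmatory test. It is only an illustration under stated assumptions: the data are synthetic and generated with no true effect, the function names (assign_conditions, ols_slope) are hypothetical, and the simple regression slope stands in for whatever model was actually specified in the pre-registration, which this post does not describe.

```python
import random
from statistics import mean


def assign_conditions(n_participants, min_things=1, max_things=10, seed=0):
    """Randomly assign each participant a number of things to write per day."""
    rng = random.Random(seed)
    return [rng.randint(min_things, max_things) for _ in range(n_participants)]


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (simple linear regression)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Synthetic data only: these are NOT the study's data or results.
rng = random.Random(1)
n_things = assign_conditions(204)                # 204 participants, 1 to 10 things
pre = [rng.gauss(0, 1) for _ in n_things]        # hypothetical pre-exercise scores
post = [p + rng.gauss(0, 1) for p in pre]        # hypothetical post-exercise scores
change = [b - a for a, b in zip(pre, post)]

# Confirmatory test, fixed in advance: a negative slope would mean that
# writing more things goes with a smaller increase in life satisfaction.
slope = ols_slope(n_things, change)
print(f"slope of change on number of things: {slope:.3f}")
```

The point of the sketch is the ordering: the predictor, the outcome, and the test are fixed before any data exist, so anything computed afterwards on the same data is clearly exploratory.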

After testing the pre-registered primary hypothesis, it is possible to ask a lot of additional questions about the exercise. And this is where exploratory analysis comes into play. Did the number of good things influence life satisfaction only a week or six weeks after the exercise? No. Did it influence positive or negative affect instead of life satisfaction? No, it did not. Could it be that participants did not find the recollection hard because they did not follow the instructions of the exercise? It does not seem so. Did the participants who wrote more things actually consider the exercise harder? A little.

We asked other questions in the exploratory analysis, but it should already be clear that pre-registration did not by any means stop us from exploring the data.

Hopefully, we have shown that pre-registration in no way precludes exploration of data. But all exploration takes place in the “garden of forking paths”, where decisions about how to proceed with the analysis are contingent on the data at hand. In this sense, pre-registration does not take anything away; it just makes truly confirmatory hypothesis testing possible by making it clear that the test is not conducted within the garden of forking paths.

Furthermore, pre-registration has other benefits. Because it forces you to think in advance much more deeply about your hypothesis, the current literature, study design, sample size, and exclusion criteria, pre-registered studies tend to be better thought through. And as Anna van’t Veer noted in her post, when conducting a pre-registered study you do not have to do more work, you just do it in a different order. Of course, that is true only if you are not used to collecting large amounts of data on many different hypotheses and then writing up only those that “worked out”. In this way, pre-registration could improve the publication (as in “making something public”) of null results and help to decrease the high proportion of false positives in the psychology knowledge base.

This post was written together with Marek Vranka.

Štěpán Bahník


Good things about pre-registration