Preregistration can improve research by providing transparency around data collection methods, analysis plans, and rules for data exclusion. This can help reduce biases that occur once the data are in front of you.
When research is preregistered, the researchers are describing their plans in advance of their study and submitting it to a registry. For this activity you will find and review a preregistration for a study in your discipline.
To find a preregistration: Go to the OSF Registries and browse the registrations. Search for something in your discipline or that interests you and read the preregistration. You may also wish to explore the OSF Preregistration Outreach Packet, which includes examples.
To Review: Select a preregistration and read it. While reading it, think about the following questions:
- Does the preregistration have complete metadata, such as title, authors, license, etc?
- Does the preregistration clearly describe the study’s hypotheses, variables, and analyses? Is there an outline of the study design? Can you tell what the independent and dependent variables are?
- Does the preregistration include information about how any data will be collected?
Complete this Activity
After reading the preregistration, please link to the preregistration and share your thoughts about it in the comment section below. Do you feel like you have enough information to evaluate the research question and study design? Why or why not?
Image Credit: Preregistration badge from the Badges to Acknowledge Open Practices project by Blohowiak, Benjamin B et al, CC-By Attribution 4.0 International
I read the preregistration for, “What is the value of privacy?” by Rabia Ibrahim Kodapanakkal registered on March 18, 2020. The OSF link is here: https://osf.io/qgw57. The metadata is clearly indicated, although I think the title could be more clear to specify what context of privacy they are researching. I see at the bottom of the preregistration it lists a better title: “What value do people place on their close friend’s privacy?” so I am not sure if that is the correct title.
The study hypothesis and analyses are clear; however, only the dependent variables are listed. I still found the study design easy to understand and follow, and I am glad that the researcher provided the context and definitions of the terms they will be studying. I do not see the method of data collection, but it can be inferred that it will be conducted through a survey, as they have listed the questions they want to ask.
I have a general understanding of the study design, but not enough to evaluate its quality. The fact that it is missing an explicit data collection method makes it difficult to evaluate, and while it describes the sample size of 350 people and some questions they plan to ask, I still need to know exactly how it will be conducted. The research question and hypothesis are clearer. I would have liked to see a rationale behind why this topic was chosen (significance) and its implications for society as a whole.
I did some searches around anthropology and ethnography as a way of seeing how widely used OSF preregistrations are in my discipline. There are few hits for these searches, and very few of the studies would be viewed as ethnographic or anthropological in a department. Those that focus on anthropological data are predominantly meta-reviews/text-analysis of association proceedings or of archaeological data. None that I observed were pre-registrations for ‘traditional’ participant observation fieldwork. I did find one preregistration with an above-average robustness that tags itself as “anthropology” and has a research team member in a department of anthropology (Penn State) so I’ve chosen to review it – found here: https://osf.io/sqf5b
The project metadata is okay: title and authors appear (though only one of the authors has any profile on the site, i.e. info about institutional affiliation, website etc.). There is no license mentioned. The study’s hypotheses, variables, and analyses are fairly clear, clear enough that a reader could evaluate these. The study design is clearly articulated, though some of the variable descriptions are unclear. There are some formatting/display hiccups throughout.
The preregistration does include information about how the data is collected, but this part is somewhat odd in the context of the pre-registration since it mentions that data was previously collected in 2010 and 2018. That said, much of the statistical analysis – what is to be done with the data – remains to be done as articulated in the preregistration.
Overall, the use of the preregistration is clear. Even where there are shortcomings – those apart from disciplinary/methodological quibbles – I learned a lot about how OSF preregistration could be useful for anthropological study design. It was helpful to familiarise myself with this system.
I performed a search for “educational technology” in the OSF registry, which returned 19 studies. I decided to review “Cognitive Style and Mobile Technology in E-learning in Undergraduate Medical Education” (https://clinicaltrials.gov/ct2/show/study/NCT02971735). The project metadata appears to be very detailed, including title, author, sponsor, collaborators, etc. All the details of the study are clearly outlined, including a detailed study description, study type, intervention type, inclusion criteria, and study outcomes. What I thought was really handy, and hadn’t expected to see, were links to relevant publications. Very cool!
I reviewed a study titled “Digital Literacy and Corrective Interventions to Reduce Fake News in India: A Field Experiment.”
Complete metadata?
Most of the metadata (author, institutional affiliation, etc.) was complete, with notable inclusions such as IRB approval date. I could not find mention of a license.
Study design?
The hypotheses and study design are explained well. The author also included a PDF version of their pre-analysis plan, which was cool to see. Independent and dependent variables were also included.
Data collection?
I think it is very clear how the study will be conducted, both on the main registration page but also in the pre-analysis plan. It is quite thorough!
Link: https://osf.io/4sjtm
I do feel like there is enough information to evaluate the research questions because the method and pre-analysis are thoroughly and clearly described.
I looked up the paper titled “Promoter-intrinsic and local chromatin features determine gene repression in lamina-associated domains”, which I had read (and then used in class) when it came out without knowing that it had been pre-registered. The metadata are relatively clear, but I found very little in terms of the study design, hypotheses, etc. On the other hand, there were full raw data sets, code, and part of a lab book. I must say that I was also a bit disappointed to see that the study was registered April 10, 2019 and the paper was published in May 2019! So basically, this pre-registration was not a true pre-registration but rather something done at the last minute after the final article reporting on the study had been already accepted for publication. What is the point, then? Why pre-register a study after it is already complete and accepted by a top journal (where all kinds of supplementary and raw files are also available)? It seems to defeat the purpose.
I reviewed the paper entitled: Police interrogators’ experiences conducting suspect interrogation in child sexual abuse investigations. This was created on 2019-04-23 and updated: 2020-09-21. The contributors are:
Mikaela Magnusson, Malin Joleby, Emelie Ernberg, Timothy J. Luke, Sara Landström and Marthe Lefsaker Sakrisvold
The OSF link is: https://osf.io/s7y35/
The headings included:
Design and Procedure
Recruitment, Sampling, and Stopping Rule
Exclusion and inclusion criteria
Survey design
Quantitative analyses.
The metadata seems to be well laid out, but there are no statistics or data included as yet. I did not notice author or institution affiliations. On the whole, there seemed to be a lot of procedural data laid out in the paper.
Hello,
I looked at the following preregistration:
Journal Impact Factor vs Citation Counts
Author(s): Dasapta Erwin Irawan
Last edited: July 16, 2017 UTC
OSF link: https://osf.io/2jnb9
Metadata?
Expected metadata fields for the preregistration are robust.
The following fields are present:
Title, Research question, Hypothesis, Sample Information, Design details, Variable details, Analyses details (including scripts).
Outline of the study design?
Below is the outline presented in the document, explaining the process to be followed by researchers to gather and analyze the data:
1. Access databases: Scopus, Google Scholar, Ms Academic
2. Search for papers in geoscience field
3. Download citation data from the three databases
4. Combine the data from the databases
5. Run data cleaning
6. Analyze data: simple linear regression
7. Draw some conclusions
The data will be collected under the following headings:
journal title, author, JIF, citation counts, and publisher.
For each citation data, we will specifically harvest: article title, author, journal name, journal publisher, impact factor 2016, 5 year impact factor, and citation counts.
Licensing?
I did not see any information about licensing. Since funding is yet to be secured, this item appears to be an open question.
Study’s hypotheses, variables, and analyses?
The hypothesis is clearly stated:
Journal Impact Factors do not control citations counts of articles in the Geoscience field
Variables are explained here:
1. We measure the following variables: journal title, author, JIF, citation counts, and publisher.
2. We will evaluate the correlation between JIF and citation count using simple linear regression equation. R2 values are going to be used to evaluate the correlation. Good correlation is when the R2 value larger or equal to 0.75.
3. Then we will analyze the possible reasons that relate to such correlations.
How the data will be collected and analyzed is also clearly stated –
The sample size is 1000 citations. They will be collected from Google Scholar and Scopus database. We use Google Scholar to get wider scope of citation data. “Publish or Perish” app (from Anne-Wil Harzing) will be used as the main tool to harvest Google Scholar citation data. We will use Scopus service from Institut Teknologi Bandung library service. In both databases, we will use the following search terms (in title): “geology”, “geoscience”, “earth science”, “physics”, “math”, “health”, “medicine”, “humaniora”, “literature”, and “arts”. Each search results will be aggregated and filtered from double entries. For each citation data, we will specifically harvest: article title, author, journal name, journal publisher, impact factor 2016, 5 year impact factor, and citation counts.
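The "aggregated and filtered from double entries" step could look something like the sketch below. This is only an illustration, not the author's method: the record layout (dictionaries with a "title" key) and the helper names normalize_title and merge_and_dedupe are assumptions of mine, and matching on a normalized title is just one plausible way to spot duplicates across Google Scholar and Scopus exports.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace (hypothetical helper)."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def merge_and_dedupe(*sources):
    """Combine record lists from several databases, keeping the first copy of each title.

    Each record is assumed to be a dict with at least a "title" key.
    """
    seen = set()
    merged = []
    for records in sources:
        for record in records:
            key = normalize_title(record["title"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


# Made-up example records, not data from the preregistration.
scholar = [{"title": "Geology of Java: An Overview", "citations": 12}]
scopus = [
    {"title": "geology of java - an overview", "citations": 10},
    {"title": "Earth Science Trends", "citations": 3},
]
combined = merge_and_dedupe(scholar, scopus)
```

A real pipeline would probably prefer a DOI match where one is available, since two different articles can share a title.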
Analyses?
Then simple linear regression will be applied to the data to look for any visible correlation.
Simple linear regression is the only statistical model in this project. We will split the analysis based on field of study: life and earth science, humanities/literature/arts, health and medical science, physics and math, and social science.
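To make the planned analysis concrete, here is a minimal sketch of a simple linear regression of citation counts on JIF, run separately per field, with the R2 >= 0.75 rule from the preregistration applied to each fit. The record keys ("field", "jif", "citations") and the function names are my own assumptions, and the numbers are invented purely for illustration.

```python
from collections import defaultdict


def simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values have no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit with an intercept, R2 is the squared correlation.
    r2 = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def fit_by_field(records, threshold=0.75):
    """Group records by field and fit one regression per group (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        groups[record["field"]].append((record["jif"], record["citations"]))
    results = {}
    for field, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        try:
            slope, intercept, r2 = simple_linear_regression(xs, ys)
        except ValueError:
            continue  # skip fields that cannot be fitted
        results[field] = {
            "slope": slope,
            "intercept": intercept,
            "r2": r2,
            "good": r2 >= threshold,  # the preregistration's "good correlation" rule
        }
    return results


# Invented example data: one field with a perfect linear trend, one without.
records = [
    {"field": "earth", "jif": 1, "citations": 2},
    {"field": "earth", "jif": 2, "citations": 4},
    {"field": "earth", "jif": 3, "citations": 6},
    {"field": "earth", "jif": 4, "citations": 8},
    {"field": "humanities", "jif": 1, "citations": 1},
    {"field": "humanities", "jif": 2, "citations": 3},
    {"field": "humanities", "jif": 3, "citations": 1},
    {"field": "humanities", "jif": 4, "citations": 3},
]
results = fit_by_field(records)
```

Writing the regression this way also makes the roles explicit: JIF is the predictor and citation count is the outcome.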
Independent and dependent variables?
These details were not discussed in the document; see the description of analysis above.
Hello,
I found the licensing information on the side, for the following item:
OSF link: https://osf.io/2jnb9
Date registered
July 4, 2017
Registration DOI
10.17605/OSF.IO/2JNB9
License
CC-BY Attribution 4.0 International
As a higher education professional and researcher who has conducted narrative inquiry and personal history self-study (teacher education) research, I picked the following: Kiltz, L., Jansen, E., Rinas, R., Daumiller, M., & Fokkens-Bruinsma, M. (2020, April 6). ‘If They Struggle, I Can’t Sleep Well Either’: The Interaction Between Student and Teacher Well-Being in Higher Education.
OSF Link: https://osf.io/2bu7v
Date registered: April 14, 2020
DOI: n.a.
License: None.
Does the preregistration have complete metadata, such as title, authors, license, etc?
-No, this has incomplete information. Includes the title and authors, but no license. No specific institutions named (just Germany and the Netherlands, one institution per).
Does the preregistration clearly describe the study’s hypotheses, variables, and analyses? Is there an outline of the study design? Can you tell what the independent and dependent variables are? Does the preregistration include information about how any data will be collected?
-Yes. Methods are described: sixteen interviews were conducted. 1) six university students 2) ten university teachers. “Since this project uses qualitative research methods, there is no clear outcome variable, but only themes which emerge during the coding process and which could answer the research questions.” Data collection and analysis procedures are described quite clearly, including sampling and interview procedures. “The participants were recruited via already existing contacts. People involved in the project asked potential participants if they would be willing to take part in the study as well as if they themselves knew people who would want to. The interviews were held partly at the university and partly at the students’ homes, according to their preference. They took around 1-1,5 hours and were audio-recorded.”
I reviewed a preregistration in psychology (https://osf.io/vp9cf).
Does the preregistration have complete metadata, such as title, authors, license, etc?
The data has already been collected with some pre-analysis data manipulation done (e.g., regarding missing values). Information about the title, authors, and license are reported.
Does the preregistration clearly describe the study’s hypotheses, variables, and analyses? Is there an outline of the study design? Can you tell what the independent and dependent variables are?
The hypotheses, variables (multiple IVs and DVs), method, and analyses are spelt out clearly. The authors additionally did a good job with reporting how data exclusion criteria will be applied, and a good job with describing details regarding how data will be prepared.
Does the preregistration include information about how any data will be collected?
The only drawback with the pre-registration is that the data has already been collected and some pre-analysis data manipulation has been done. Notably, the sample size was not justified. However, details regarding power and data manipulation are spelt out very clearly, as well as the justification for them.
I am interested in education research topics so I chose the preregistration titled “Bringing the Theory and Measurement of Teaching into Alignment” preprint to review. The OSF link is: https://osf.io/fnhvw/
Reflection Question: Do you feel like you have enough information to evaluate the research question and study design? Why or why not?
This preprint seemed to have all the elements of an interesting research topic with an informative abstract, list of authors, Creative Commons license, a list of references and a link to “supplemental materials”. Unfortunately the “supplemental materials” file is locked and permission from the author is needed. The wiki also has no data. Without access to an explanation on how the research topic was done it is impossible to evaluate the design of the study. Perhaps the authors feel the information in the preprint is sufficient but an overall summary of how the project and research was done would help readers evaluate the reciprocity.
I definitely learned the importance of outlining the study design in the OSF preprint. I’m going to work to add some of my research but with more details on reciprocity and the design. A worthwhile exercise indeed!
I reviewed the preregistration for “Students’ Beliefs About Open-Access Textbooks” by Elyssa Twedt and Carlee Beth Hawkins (https://doi.org/10.17605/OSF.IO/3ZMDF). The metadata indicates title, authors, description and date. The only thing missing would be an open license.
The authors clearly describe the research questions, the variables and the statistical tests they will use with their data.
The authors also acknowledge that they had already collected the data from a sample of 166 psychology students. Therefore, they specified how they were going to analyze their data and the number of items measured by their questionnaire. I think they also could have shared the questionnaire they applied so others could replicate their study on a different sample.
I read the preregistration for “Machine learning computational tools to assist the performance of systematic reviews: A mapping review”, registered on March 17, 2021.
The OSF link is: https://doi.org/10.17605/OSF.IO/9S4GM
On the platform is the summary with an accessible and detailed attachment of the information within the protocol.
The metadata is indicated clearly. The research question and hypothesis are clear and I generally understand the study design; however, there is not enough to further evaluate the quality. There is an elaborate provision of the background context, search strategy, eligibility, process and quality assessment. The project plans to extract data and compile it in a Summary of Findings table before mapping identified tools. The results will be in the form of a description (graphic representation) of the obtained data.
I read the preregistration for Effects of Ocrevus in Relapsing Multiple Sclerosis (MOBILE-RMS)
link: https://clinicaltrials.gov/ct2/show/record/NCT04387734
There is a detailed description of what variables they are going to collect and who the participant population is. However there was no information on analysis. The only information we have is in their summary where they say “The purpose of this study is to test if people with relapsing multiple sclerosis (RMS) can improve ambulatory functions after one-year treatment with Ocrevus in comparison with platform therapy. Sixty qualified individuals with RMS will be evenly assigned into two groups: Ocrevus and Platform. Each group will receive the respective treatment following the FDA regulations over the one-year course. Their ambulatory functions will be assessed five times three months apart. In addition, they will receive brain MRI scans three times six months apart. Their ambulatory functions and MRI measurements will be compared between groups over time to fulfill the purposes of this study.” So I know it is a longitudinal study (over 1 year) and that they are collecting a whole bunch of metrics but I don’t know how they will be comparing them and how they will decide if ambulation is “improved”.
I must say that I’m rather disappointed in the lack of information, especially reading other posts here where clear details on all aspects of the study are listed.
I work in science education and am specifically interested in information literacy and decision making in our everyday lives related to science. I searched using the terms science AND education AND misinformation. I scrolled through some results and found this study, which interested me:
Doi: https://doi.org/10.17605/OSF.IO/G2MPE
Study Title: Scientific thinking and decision-making in everyday life: Part 1, Exploratory study
The Preregistration has all the metadata I can think of – title, authors, license information (CC-BY Attribution 4.0 International), information about the associated project, as well as tags related to subjects and key words.
The preregistration also describes the hypotheses, the variables explored, and how data will be collected and analyzed. This study is exploring multiple variables; while they were identified in a table, I might like to see a more detailed description of each variable, and perhaps inclusion of some of the instruments (e.g., survey questions) that are being used. I appreciated that they offered information on data cleaning and quality checks / data exclusion and missing data, as well.
Overall, I really appreciated this opportunity to read about proposed research and will be checking back to read more about this research as it continues (this is part 1 of a larger study).