The Developmental Profile, Fourth Edition (DP-4) demonstrates excellent reliability when examined by a number of methods. Test users can feel confident using this assessment to measure a child’s developmental strengths and weaknesses.
Internal Consistency Reliability
Internal consistency reliability in the DP-4 was measured using the split-half method, and was calculated using raw scores for each of the five scales. The internal consistency reliability for the General Development Scale was measured using the formula for reliability of linear combinations.
Different age groupings were used on different forms to ensure a sufficient number of individuals in each group. The internal consistency reliability estimates for each of the forms were all above .80 except for the Parent/Caregiver Checklists, which were .76 and .79. Many of the coefficients were in the excellent range, with values above .90.
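As an illustration of the split-half approach described above, the following is a minimal sketch rather than the publisher's actual procedure. It assumes an odd/even item split and a Spearman-Brown correction, neither of which is specified in the text, and it uses one common textbook formula for the reliability of a unit-weighted sum of scales. All function names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """Split-half reliability for one scale.

    item_scores: one row per child, each row a list of item raw scores.
    Assumption: items are split into odd and even positions, and the
    half-test correlation is stepped up with the Spearman-Brown formula.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    return 2 * r_half / (1 + r_half)

def composite_reliability(variances, reliabilities, composite_variance):
    """Reliability of a unit-weighted sum of scales (assumed formula).

    1 - sum(var_i * (1 - r_i)) / var_composite
    """
    error_variance = sum(v * (1 - r) for v, r in zip(variances, reliabilities))
    return 1 - error_variance / composite_variance

# Hypothetical item responses for four children on a four-item scale.
rows = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0]]
print(round(split_half_reliability(rows), 3))
```

A real analysis would use the full item set for each scale and a much larger sample; the tiny data set here only shows the mechanics.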
Standard Error of Measurement (SEM)
SEM provides an index of how close a child’s observed score is to the true score that would be reached without any measurement error. These SEM values are converted into confidence intervals, each showing a range of scores that likely contains the true score.
A 95% confidence interval indicates a 95% probability that the range of scores surrounding the observed score contains the true score. SEM values demonstrate strong reliability in the DP-4.
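The relationship between reliability, SEM, and a confidence interval can be sketched as follows. This uses the standard classical test theory formula, SEM = SD × √(1 − reliability), and a normal-theory multiplier of 1.96 for a 95% interval. The standard deviation, reliability, and observed score in the example are hypothetical values chosen for illustration, not figures from the DP-4 manual.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical SEM: score standard deviation times sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def confidence_interval(observed, sem, z=1.96):
    """Symmetric interval around the observed score (z=1.96 gives 95%)."""
    margin = z * sem
    return observed - margin, observed + margin

# Hypothetical inputs: SD of 15, reliability of .91, observed score of 100.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_interval(observed=100, sem=sem)
print(round(sem, 2), round(low, 1), round(high, 1))
```

The higher the reliability coefficient, the smaller the SEM and the narrower the resulting interval.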
Test-Retest Reliability
The stability of DP-4 scores is represented in test-retest reliability. Test-retest studies of the DP-4 used nationally representative test groups, with a two-week interval between the two administrations (Time 1 and Time 2).
Because the intervals were of such a brief duration, drastic changes in test scores would not be expected. Effect sizes were calculated on a scale from 0 to 1, with lower values indicating no clinically meaningful difference between administrations. The Parent/Caregiver Interview yielded effect sizes ranging from .03 to .07. The effect sizes for the Parent/Caregiver Checklist ranged from .04 to .22. The Teacher Checklist range was .04 to .25.
These small effect sizes show no clinically meaningful differences between the two administrations, which indicates good test-retest reliability of the DP-4.
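The text does not state which effect size statistic was used. As one plausible illustration, the sketch below computes a standardized mean difference between Time 1 and Time 2 scores, dividing the absolute mean difference by the pooled standard deviation. The choice of statistic, the function name, and the scores are all assumptions made for the example.

```python
import math
from statistics import mean, stdev

def standardized_mean_difference(time1, time2):
    """Absolute mean difference divided by the pooled standard deviation.

    Assumption: this form of effect size is used only as an illustration;
    the source does not say how its effect sizes were computed.
    """
    pooled_sd = math.sqrt((stdev(time1) ** 2 + stdev(time2) ** 2) / 2)
    return abs(mean(time2) - mean(time1)) / pooled_sd

# Hypothetical standard scores for five children at two administrations.
time1 = [98, 102, 100, 104, 96]
time2 = [99, 103, 101, 105, 97]
print(round(standardized_mean_difference(time1, time2), 3))
```

A value near 0 means the average score barely moved between administrations relative to the spread of scores, which is the pattern expected from a stable instrument over a short interval.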
Interrater Reliability
The DP-4 underwent two interrater reliability studies. One was conducted with the Parent/Caregiver Interview and the other with the Parent/Caregiver Checklist. Interrater reliability was estimated using the intraclass correlation coefficient.
The intraclass correlation coefficients ranged from .60 to .92 for the Parent/Caregiver Interview and .73 to .86 for the Parent/Caregiver Checklist. This indicates a high level of agreement between various raters who complete the same form for the same case.
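Several forms of the intraclass correlation coefficient exist, and the text does not say which one was used. The sketch below assumes the two-way random effects, absolute agreement, single-rater form, often written ICC(2,1), computed from the mean squares of a two-way analysis of variance. The function name and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per child (case), one column per rater.
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for cases, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical scores from two raters describing the same five children.
ratings = [[100, 102], [85, 88], [110, 107], [95, 95], [120, 118]]
print(round(icc_2_1(ratings), 3))
```

Values close to 1 indicate that most of the variation in scores comes from real differences between children rather than from disagreement between raters.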
Cross-Form Consistency
The DP-4 underwent two cross-form consistency studies. One compared the Parent/Caregiver Interview with the Teacher Checklist and the other compared the Parent/Caregiver Checklist with the Teacher Checklist.
The results from the Parent/Caregiver Interview and the Teacher Checklist showed effect sizes ranging from .01 to .10, indicating small differences between the two forms. The results from the Parent/Caregiver Checklist and the Teacher Checklist also showed small effect sizes ranging from .01 to .11.
Alternate Form Reliability
For alternate form reliability in the DP-4, the same parent completed two different forms for the same child. The resulting effect sizes ranged from .05 to .18, indicating strong alternate form reliability.
These five measures all indicate strong reliability of the DP-4.