Abstract: By analyzing the responses of 170 tertiary-level students to a form of the old version of the CET6 test, we demonstrate how multivariate generalizability theory can be used to evaluate the reliability of a language test form with a complex structure. The results show that, for the given group of students and the population it represents, the objective part of the test as a whole has a generalizability coefficient of 0.921 and an index of dependability of 0.907, but the three sections vary greatly in reliability: Listening Comprehension has a generalizability coefficient of 0.769 and an index of dependability of 0.744, Reading Comprehension 0.551 and 0.503, and Vocabulary and Structure 0.802 and 0.782. Further analysis reveals that 23 of the 70 items are inconsistent with their sections: 6 with Listening Comprehension, 10 with Reading Comprehension, and 7 with Vocabulary and Structure. If responses to the inconsistent items are not counted, both the overall reliability coefficient and the reliability coefficient for each section improve significantly. The coefficient for the total score rises to 0.937, that for Listening Comprehension to 0.831, that for Reading Comprehension to 0.773, and that for Vocabulary and Structure to 0.859. On the basis of this analysis, five suggestions are proposed for language testers in China.