Keio SFC 2003, Faculty of Policy Management, English, Question 1: Full Text

 You have probably taken many tests in your life. Perhaps some questions have occurred to you as you struggled through a test. How good is the test I am taking? Does it really work? These questions could and occasionally do result in long hours of useless discussion. Subjective opinions, hunches, and personal biases may lead either to extravagant claims regarding what a particular test can accomplish, or to its stubborn rejection. The only way questions such as these can be conclusively answered is by empirical trial. The objective evaluation of tests primarily involves the determination of the reliability and the validity of the test in specified situations.

 In terms of testing, reliability means consistency. Test reliability refers to the consistency of scores obtained by the same persons when [1](1. examined 2. retested 3. considered) with the same or equivalent test. For example, if a child receives an IQ of 110 on Monday and an IQ of 80 when retested on Friday, it is obvious that little confidence can be placed in either score. Likewise, if on one vocabulary test a student gets 40 words correct and on another test of similar difficulty the same student gets only 20 words correct, then [2](1. either 2. each 3. neither) test can be taken as a dependable measure of the student's verbal ability. In both of these examples, it is possible that only one of the two scores is [3](1. in 2. by 3. on) error, but this could only be demonstrated by further retesting. Whether one of the scores or neither is an adequate measure of the individual’s ability cannot be established without additional information.

 Before a test is released for general use, a thorough, objective check of its reliability should be carried out. There are different types of test reliability, as well as methods of measuring reliability. Reliability can be checked with reference to [4](1. fluctuations 2. escalation 3. duration) over time, the particular selection of items or behavior sample constituting the test, the role of different examiners or scorers, and other aspects of the testing situation. It is essential to specify the type of reliability and the method employed to determine it, because the same test may [5](1. fit 2. vary 3. apply) in these different aspects. The number and nature of individuals on whom reliability was checked should [6](1. realistically 2. however 3. likewise) be considered. With such information, the test user should be able to predict how reliable the test would be for any given group.

 Undoubtedly the most important question to be asked about any test concerns its validity. Validity refers to the degree to which the test actually measures what it intends to measure. Validity provides a direct check on how well a test fulfills its function. The determination of validity usually requires independent, external criteria of whatever the test is designed to measure. For example, if a medical aptitude test is to be used in selecting promising applicants for medical school, ultimate success in medical school would be [7](1. a criterion 2. an outcome 3. a consequence). The process of determining the validity of such a test would [8](1. begin 2. operate 3. pass) by administering the test to a large group of students at the time of their admission to medical school. Later, some measure of performance in medical school would be obtained for each student on the basis of grades, ratings by instructors, success or failure in completing medical training, and similar criteria. Such a [9](1. composite 2. dependent 3. similar) measure would constitute the criterion against which each student's initial test score is then correlated. The measure of this correlation is called the validity coefficient. A high correlation between the initial test scores and the measure of each student's performance would [10](1. expect 2. signify 3. prove) that those individuals who scored high on the test had been relatively successful in medical school. This would indicate a high validity coefficient. A low correlation would show little correspondence between test scores and criterion measure and would indicate a poor validity coefficient for the test. The validity coefficient enables researchers to determine how closely any individual's criterion performance could be [11](1. revealed 2. detached 3. predicted) from that individual's test score.
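As a study aid for the paragraph above, the validity coefficient can be sketched as a correlation between admission test scores and a later criterion measure. The passage does not say which correlation statistic is meant; the sketch below assumes the Pearson correlation, and the function name `pearson_r` and all the numbers are hypothetical illustrations, not data from the passage.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: aptitude test scores at admission, and a composite
# criterion measure (for example, averaged grades) collected later.
admission_scores = [52, 61, 70, 75, 88]
criterion_measure = [2.1, 2.8, 3.0, 3.4, 3.9]

validity_coefficient = pearson_r(admission_scores, criterion_measure)
print(f"validity coefficient: {validity_coefficient:.2f}")
```

A value near 1 would correspond to what the passage calls a high validity coefficient, and a value near 0 to little correspondence between test scores and the criterion.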

 In a similar manner, tests designed for other purposes can be validated against appropriate criteria. A vocational aptitude test, for example, can be validated against the on-the-job success of a trial group of new employees. A pilot aptitude test can be validated against achievement in flight training. Tests [12](1. demonstrated 2. designed 3. defined) for broader and more varied uses are validated against a number of criteria and their validity can be established only by the gradual accumulation of data from many different kinds of investigations.

 There is an apparent paradox in the concept of test validity that needs to be addressed. If it is necessary to follow up the subjects of a test, or in other ways try to obtain an independent measure of what the test is trying to predict, then [13](1. how can we 2. why not 3. do we need to) dispense with the test? The answer is to be found in the distinction between the validation control group and the groups on which the test will [14](1. eventually 2. conclusively 3. inevitably) be used for operational purposes. Before a particular test is ready for general use, its validity must be established on a representative [15](1. sample 2. mass 3. portion) of subjects. The scores of these persons are not used for operational purposes, but serve only in the process of testing the test. If the test proves valid on a control group, it can then be used on other groups without [16](1. holding 2. putting 3. resorting) back to other criterion measures.

 It might be argued that tests themselves are not needed; that over time the criterion measures will indicate the same information that a given test is trying to predict. But such a procedure would be [17](1. often 2. so 3. virtually) wasteful of time and energy as to be prohibitive in most instances. Imagine the consequences, for example, if all of the applicants for a job were hired, or all of the students who wish to attend a school were admitted, and then a final decision was made only after time [18](1. has determined 2. determines 3. had determined) which individuals were most likely to do the job well or satisfactorily finish the schooling. It is the very wastefulness of this procedure and its emotional impact on individuals that tests are designed to [19](1. measure 2. minimize 3. highlight). By means of a test, a person’s present level of required skills, knowledge and other relevant characteristics can be assessed [20](1. with 2. on 3. for) a determinable margin of error. The more reliable and valid the test, the smaller will be this margin of error.
