The correlation between students’ reading and listening scores in a standardized TOEFL test

This study analyzed the correlation between reading and listening in the TOEFL ITP test and how much reading predicts listening. It involved 50,684 reading and listening scores from the 2015-2019 test periods of undergraduate students at one of the state universities in Malang. The data were collected using the standardized TOEFL ITP issued by ETS. Using the Pearson product-moment correlation and linear regression analysis, the results demonstrated that reading and listening had a significant, linear, and strong correlation (.682), and that reading significantly predicted 46.5% of the variance of listening. The results support the hypothesis that the two language input skills, reading and listening, are significantly correlated and predict one another. The results also suggest that correlation among language skills occurs not only between reading and writing and between listening and speaking, but also between reading and listening. Moreover, the results suggest combining reading and listening activities in the classroom.


INTRODUCTION
The correlation between reading and listening has long been discussed in the literature. In language teaching, the correlation is also implicitly included in activities such as identifying the main idea, stated details, unstated details, implied details, and vocabulary in context. Unfortunately, this discussion has not attracted as much attention as the correlation between reading and writing or between listening and speaking, and only a few research studies have discussed the correlation between these skills. Despite this scarcity, some researchers have reached a consensus that reading and listening are closely correlated and enhance one another in language classrooms (e.g., Blonder et al., 2019; Jiang et al., 2018; Wolf et al., 2019). Overlooking this information hinders teachers from developing effective teaching methods and assessments (Palmer, 1997).
The notion that reading and listening correlate suggests the possibility of teaching both skills together. Ranto Rozak et al. (2019) mentioned that combining reading and listening, such as reading while listening, can reduce student teachers' foreign language listening anxiety. Devine (1976) mentioned some techniques combining reading and listening that remain largely uncharted in teaching. The first is selecting three critical reading and listening skills, such as distinguishing between fact and opinion, recognizing a writer's or speaker's bias, and noting loaded or emotionally charged words. The second is teaching critical reading skills to one group and critical listening skills to a second group. The last is testing the reading group with a listening test and the listening group with a reading test; however, this technique might affect the test validity. The combination of these skills has been applied in classrooms and has proved effective in improving students' reading and listening ability (e.g., Babayiğit & Stainthorp, 2014; Begeny et al., 2009; Blonder et al., 2019; Jiang et al., 2018).
In addition, Nation and Yamamoto (2012) asserted that reading can take part in listening activities, such as reading while listening or reading summaries of materials before listening to them. Conversely, listening activities can also work in a reading course. In line with this, Case (2009) mentioned some strategies for teaching reading and listening at the same time, such as radio news; movies with subtitles; reading a summary or review before watching a movie, radio play, or TV episode; listening and reading to check; listening and reading in preparation for speaking; matching the listening to the texts; listening to a song and matching it with a description; and finding the mistakes in a summary of a story.
Similarly, reflecting upon my anecdotal experience of teaching TOEFL preparation classes, the correlation between reading and listening also holds true in the TOEFL. Test-takers are strongly advised to master the recognition of minimal pairs, which commonly serve as distractors in the multiple-choice options. The following item is a classic example from the listening section: it uses similar-sounding words such as cash, glass, and crash in place of grass, so both listening and reading comprehension are required to choose the correct answer.
(Man) How long until you will be ready to leave?
(Woman) First, I need to water the grass.
(Narrator) What does the woman mean?
Answer choices:
a) She has to wait for some cash.
b) The waiter is bringing a glass of water.
c) The lawn is too dry.
d) She needs to watch out for a crash.
(Taken from Phillips, 2001)
This example clearly represents the demand for two comprehension processes, reading and listening. Test-takers should therefore decode the spoken and written words accurately and quickly to avoid the distractors and answer on time.
The time constraint of short-term memory affects the comprehension of the whole discourse during reading. Consequently, slowly sounding out each word, however accurately, is unlikely to help achieve comprehension. What is needed is the ability to recognize words quickly and accurately, often called reading fluency. Some studies have shown the significant contribution of reading fluency to comprehension (e.g., Álvarez-Cañizo et al., 2015; Cotter, 2012; Lems, 2012; Talada, 2007). Besides, cognitive resources are limited; thus, the more resources are used for decoding, the fewer remain for comprehension. The same occurs when answering the listening section of the TOEFL test. The more the test-takers scrutinize each choice and read back and forth, the more likely they are to miss the next question, because they have only twelve seconds to read and decide on the best answer. As a result, they often resort to selecting answers at random.
This phenomenon was first explained by Huey (1968), who asserted the existence of "inner speech" that occurs while reading. While reading a passage, the brain processes the printed words and breaks down the sound of each letter. The words are then read out loud in the brain to reach comprehension. Based on this notion, Gough and Tunmer (1986) and Hoover and Gough (1990) shed further light on the relationship, in what was later called the Simple View of Reading (SVR). This model resulted from the belief that reading and listening comprehension are mastered through similar, shared cognitive processes. It proposed that reading comprehension is the product of decoding ability and listening comprehension ability.
In recent years, there has been increasing interest in the correlation between reading and listening in language tests. Some research studies have analyzed this correlation in various language tests (Bozorgian, 2012; Tiendas, 2018; Hastuti & Kalim, 2019). Hastuti and Kalim (2019) focused on the correlation of reading and listening in a local TOEFL-PBT test. They involved 121 students of STKIP PGRI Sidoarjo majoring in Mathematics Education, English Education, and History Education. Before the test, the students received intensive TOEFL preparation for ten days, with two hours for each meeting. After the treatment, the students took the test, and their scores were correlated using the Pearson correlation coefficient. The analysis revealed that students' reading and listening scores were strongly correlated (.60-.79). Earlier, Tiendas (2018) examined the correlation among reading, listening, and writing of 104 intermediate EFL students in the Cambridge Preliminary English Test for Schools published by UCLES. The data analysis found that the correlation between reading and listening was stronger than that between listening and writing. Bozorgian (2012) also reported that reading and listening had a stronger correlation (.735) than listening and writing (.643) and listening and speaking (.654) in the IELTS scores of 1,800 Iranians.
Indeed, those studies scrutinized the correlation between reading and listening in language tests and provided detailed information on it. However, in my opinion, Hastuti and Kalim's (2019) study has a dual research design: its experimental-like procedure and its use of a local TOEFL-PBT test might affect the validity of the correlation result. Besides, the limited number of participants needs to be broadened to allow better generalization of the result. Similarly, Tiendas' (2018) study focused on only two proficiency levels, A2 and B2, as he used the Cambridge PET for Schools test. I believe that the proficiency levels of the students involved in his study were quite possibly higher than those levels. Consequently, the result of the correlation was limited and cannot be generalized to other language proficiency levels. Additionally, Bozorgian (2012)  opinion, was essential to identify the variance of listening predicted by reading to provide beneficial to discover teaching technique and assessment.
Above all, to the best of my knowledge, few studies have analyzed the correlation of those skills in the standardized TOEFL ITP test issued by ETS, despite the fact that this test is well known in academic and professional settings and is used in more than 130 countries and 900 universities (Onaizi, 2019). In my opinion, this void leaves the discussion of the correlation between reading and listening incomplete, especially in language tests. The gap also means that teachers and testers are not informed that reading and listening are correlated, can be taught together, and predict one another. This information is important for advancing teaching, particularly for discovering engaging teaching techniques and improved assessment. Moreover, the limited focus and number of participants of the previous studies need to be enlarged to make the findings generalizable.
Reflecting upon these gaps, the aim of this article is to explore the relationship between reading and listening in a standardized test, the TOEFL ITP issued by ETS. The study also focuses on how much reading predicts listening in the test, expressed as a percentage. The results fill in the missing information in the discussion of reading and listening. Practically, the information can be used to develop teaching techniques and assessment in the classroom setting.
Therefore, the central question of this paper is: how do students' reading and listening comprehension scores correlate in a standardized test? It is specified into two research questions. The first is whether students' reading comprehension scores significantly correlate with their listening comprehension scores in the TOEFL ITP. The second is how much students' reading comprehension scores predict their listening comprehension scores in the TOEFL ITP.
Since the present study was a correlational study, the following hypotheses were used to determine the correlation and the prediction:
H0: There is no correlation between students' reading comprehension scores and their listening comprehension scores in the TOEFL ITP.
H0: Students' reading comprehension scores do not predict their listening comprehension scores in the TOEFL ITP.
H1: The higher the students' reading comprehension scores, the higher their listening scores in the TOEFL ITP.
H2: Students' reading comprehension scores predict their listening scores in the TOEFL ITP.

METHOD
This research followed a correlational research design, which presents a correlation coefficient between reading and listening scores and a prediction of the variance (Creswell, 2011; Latief, 2013). The quantitative data consisted of 50,684 college students' reading and listening scores from the 2015-2019 test administrations. The participants were university students who took the TOEFL ITP test as a graduation requirement at one of the state universities in Malang.
In this research, I did not administer the test myself because of the ETS test protocol. To gain access to the scores, I asked permission from the director of the language center. After receiving approval, I copied the TOEFL ITP scores from the institution's data server and saved them in Microsoft Excel 2010. The scores were then classified by test period to calculate the total number of scores involved in the study.
The correlation coefficient (R) was calculated using SPSS to determine the direction and the degree of association. The data analysis covered several processes. The first process was analyzing the direction of the correlation using the scatterplot and the summary table from SPSS. The direction was identified as positive or negative. The second process measured the degree, or strength, of association using the table of correlation coefficient interpretations adopted from Cohen et al. (2007). Table 1 shows the benchmark proposed by Cohen et al. (2007).

Table 1. Correlation value and interpretation
[Table 1 columns: Value; Interpretation (e.g., Very strong)]

Once the degree was obtained, it was necessary to test the strength of the reading scores in predicting listening by squaring the correlation coefficient (R²) using linear regression analysis. For example, if the correlation between reading and listening were r = .70, then r² = .49, meaning that the variance in reading scores predicts 49% of the variance in listening scores.
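As a small worked illustration of the squaring step, the sketch below converts a correlation coefficient into the proportion of shared variance. The function name shared_variance is hypothetical; the original analysis was run in SPSS, not with this code.

```python
def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance two variables share.

    Hypothetical helper for illustration only.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# The worked example from the text: r = .70 gives r squared = .49.
example = shared_variance(0.70)
print(f"r = 0.70 -> r^2 = {example:.2f}")
```

Multiplying the returned value by 100 gives the percentage of variance in one score that is accounted for by the other.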
The last process of the data analysis concerned the level of significance (the chance of being wrong). It was set at .05, or 5%, the common level in education. This level was used to determine whether r is significant (p < .05) or not significant (p > .05). If the result showed p < .05, the alternative hypothesis was supported by the empirical evidence, so the null hypothesis was rejected. Otherwise, if the result showed p > .05, the alternative hypothesis was not supported by the empirical evidence, so the null hypothesis was not rejected.
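The decision rule can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function names are hypothetical, and the t statistic shown is the standard textbook test for a correlation coefficient with n - 2 degrees of freedom, which the source does not describe explicitly. The sample values in the usage lines are made up.

```python
import math

ALPHA = 0.05  # significance level of 5%, as stated in the text


def t_statistic(r: float, n: int) -> float:
    """Standard t statistic for H0: no correlation (n - 2 degrees of freedom)."""
    if n < 3:
        raise ValueError("at least three paired scores are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Apply the decision rule: reject H0 only when p is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Hypothetical values for illustration only.
print(decide(0.001))
print(decide(0.20))
print(f"t = {t_statistic(0.70, 100):.2f}")
```

With a very large sample, even a modest correlation yields a large t statistic, which is one reason the strength of a correlation is interpreted separately from its significance.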
All processes are depicted in the following flowchart (Figure 1).

RESULTS AND DISCUSSION
To answer the research questions, 50,684 college students' reading and listening scores collected during the 2015-2019 test administrations were analyzed using the Pearson product-moment correlation and linear regression analysis. The first two sub-sections below present the results of the analysis answering the research questions, and the two sub-sections that follow discuss those results.

Measuring Relationships between Reading and Listening in TOEFL ITP
The first research question examined the correlation between reading and listening scores in the standardized TOEFL ITP. To answer it, the Pearson product-moment correlation was run. The analysis focused on the direction, strength, and level of significance of the correlation.
The result of the correlation is illustrated by a scatterplot. Figure 2 shows an overview of the positive direction between reading and listening scores. As expected, the direction of the correlation between reading and listening is positive, with a straight line rising from left to right. It indicates that the higher the reading score, the higher the listening score.

Figure 2. Result of Correlation based on Scatterplot
To determine the strength and the significance level, a statistical analysis was carried out. Table 2 presents the summary statistics for the strength and the significance level of the correlation. The analysis revealed that the strength of the positive correlation was 0.682. According to Cohen et al. (2007), this value is considered a strong correlation, from which a good prediction can be made from one variable to the other. The table also shows the significance level (p-value), which was reported as 0.00. Since the p-value is less than 0.05, the correlation is significant. In short, the alternative hypothesis is supported and the null hypothesis is rejected.
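For readers who want to see how a Pearson coefficient is obtained from paired scores, the sketch below computes it directly from the definition. It is an illustration only: the function name pearson_r is hypothetical, the five score pairs are made-up values rather than data from this study, and the study itself used SPSS.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired section scores, not taken from the study.
reading = [45, 50, 52, 58, 61]
listening = [47, 49, 55, 54, 60]
r = pearson_r(reading, listening)
print(f"r = {r:.3f}, positive direction: {r > 0}")
```

A positive value corresponds to the rising pattern seen in a scatterplot, and its magnitude is what a benchmark table classifies as weak or strong.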

Predicting Students' Listening Scores through Their Reading Scores
The next question asked how much reading predicts listening. To answer it, linear regression analysis was run. As presented in Table 2 above, the value of R squared was 0.465, or 46.5%. According to Moore et al. (2013), this value is considered weak and has a low effect size. The result was unexpected because the value indicates that reading does not explain much of the variation in listening. To test whether reading contributes significantly to predicting listening, the F-test was used. The analysis revealed a significance level of 0.00, which is lower than 0.05. It means that reading significantly predicts the listening score, although the prediction is weak.
The t-test was also used to determine the precise value of the reading prediction. The analysis showed that the coefficient of reading in predicting listening scores is 0.631. It means that whenever the reading score rises by 1 point, the predicted listening score rises by 0.631 points. Besides, the constant of the model was 17.829 points. Based on these results, the model of listening score prediction is listening = 17.829 + 0.631 × reading.
All in all, reading significantly predicts listening, although the prediction is considered weak. It predicts 46.5% of the variance in listening scores. Since the prediction is significant, the alternative hypothesis is supported and the null hypothesis is rejected.
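The fitted model reported in this sub-section can be applied as in the sketch below. The constant and the slope are taken from the text; the function name predict_listening and the sample reading score of 50 are assumptions made for illustration.

```python
INTERCEPT = 17.829  # constant of the model, as reported in the text
SLOPE = 0.631       # change in predicted listening per 1 point of reading


def predict_listening(reading_score: float) -> float:
    """Predicted listening score from the reported linear model."""
    return INTERCEPT + SLOPE * reading_score


# Hypothetical reading score, chosen only to show the arithmetic.
sample = 50
print(f"reading = {sample} -> predicted listening = {predict_listening(sample):.3f}")
```

Because R squared is 46.5%, an individual's actual listening score can differ noticeably from the value this line predicts.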

Understanding the Relationship between Reading and Listening in Standardized TOEFL ITP Test
Several reports have shown that reading and listening have a strong relationship. It is interesting to note that both are input skills, or passive skills. This fact might be intriguing since discussions of correlation among language skills commonly focus on the relationship between listening and speaking and between reading and writing. However, the current study showed that reading and listening also have a significant correlation, especially in language tests. This finding was also reported by some previous studies.
The current finding is consistent with Bozorgian (2012), who reported that reading and listening had a stronger relationship than listening and writing and listening and speaking in the IELTS test. The same was found by Tiendas (2018), who presented a stronger correlation between reading and listening than between listening and writing in the Cambridge Preliminary English Test for Schools. It also accords with Hastuti and Kalim's (2019) study, which showed that reading strongly correlated with listening in the local TOEFL-PBT test. These studies, together with the present finding, undergird the notion that reading and listening correlate across language tests. In terms of the correlation value, however, this outcome differs from Bozorgian's (2012) study, which suggested a correlation between reading and listening higher than the .682 found in the present study. Bozorgian (2012) found that the correlation between reading and listening in IELTS was .735, higher than in the standardized TOEFL ITP test.
A possible explanation for this might be the difference in format between the standardized TOEFL ITP and IELTS. Muijselaar et al. (2017) argued that the relationship between reading and listening might vary if the two tests differ largely in aspects such as administration time, format, or task. The obvious difference is in the listening task. In my opinion, test-takers are required to read more in IELTS than in the standardized TOEFL ITP. They need to fill in blank forms by following the audio as the source of information while reading the items to be completed. The high correlation in IELTS is also affected by the construct measured by IELTS and that of academic reading in the target space (Weir, 2009b). In line with this, Dornyei (2001) asserted that a high correlation is caused by the significant contribution of the situation to the particular task.
Another possible explanation for the discrepancy is the scoring system. The standardized TOEFL ITP converted scores range from 45 to 50 for listening and 65 to 67 for the reading section, with 677 for the total test score covering three sections. On the other hand, each section score and the total score of IELTS are reported on a 9-band scale in half-band (0.5) increments, gathered from four sections. Besides, the larger scale of TOEFL scores compared with IELTS scores affects the range of the scores in each section; for example, an IELTS listening score of 5.5 would correspond to standardized TOEFL ITP listening scores of 513-547 (ETS, 2010).
The value of the correlation in the present study is close to that of Hastuti and Kalim (2019), who found an R-value of .60-.79 in the local TOEFL-PBT test. This result could be explained by the likeness between the local TOEFL-PBT test and the standardized TOEFL ITP test used in this study. The similar results of the two tests are caused by the similar format, construct validity, and administration of the tests. The local TOEFL-PBT test consults and follows the pattern of the standardized TOEFL ITP test. In designing the test specification, it uses similar topics and the same format as the standardized test, such as the topics of the listening audio and the reading passages, the number of questions, and the time allotment. In addition, the level of difficulty differs only slightly, since the institution sometimes uses the textbook designed by ETS. The only difference between the tests lies in the regular updates made by the test designer. The standardized test is continually renewed and upgraded, while no renewal schedule is specified for the local test.
The proximity of the R-values of the local TOEFL-PBT test and the standardized TOEFL ITP test may also partly be explained by the high reliability of the standardized test. According to Brown (2004: 68), a standardized test is designed to meet high reliability and particular standard objectives. Therefore, the scores from the two tests are not significantly different, and consequently the correlation results are similar.
In addition, the theory undergirding those tests is similar. The standardized TOEFL ITP and the local TOEFL-PBT test are built on the same linguistic theory, known as structural linguistics. The theory explains that language is grouped into two layers: form and meaning (Sulistyo, 2009). Because the tests share these characteristics, the R-values of the local TOEFL-PBT test and the standardized TOEFL ITP test are close.
Jurnal Penelitian dan Pengkajian Ilmu Pendidikan: e-Saintika, July 2021 Vol. 5, No. 2

Projecting Listening Scores through Reading Scores in Standardized TOEFL ITP Test
After analyzing the correlation of reading and listening, the analysis was extended to see what percentage of listening is predicted by reading. The result revealed that reading significantly predicts listening. The value of the prediction was estimated at 46.5%; that is, reading predicts 46.5% of the variance of listening. In other words, 46.5% of the variation in listening is shared with reading, and the rest is predicted by other components.
Based on the benchmark set by Moore et al. (2013), the value of the prediction is considered weak or low. This value is contrary to expectation because it shows that reading does not explain much of the variation in listening. To find the precise value of the prediction, I used the F test. The analysis revealed that the prediction is significant, with a coefficient of 0.631: whenever the reading score increases by 1 point, the listening score increases by 0.631 points. A comparable finding was reported by Wolf et al. (2019), who found that reading predicted 34% of the variance of listening.
The value can be logically explained by the value of the correlation and the overlap in the comprehension process. In the comprehension process, both skills require vocabulary. Wolf et al. (2019) claimed that vocabulary takes the biggest portion of the aspects shared between reading and listening. Vocabulary is essential to understanding language input in both the oral and the written modality. This also implies that the comprehension process is affected not by the modality but by vocabulary. The difference between the two skills is that the reader can reread the passage, while the listener cannot. Other than prior knowledge and vocabulary, the demands on attention and memory are greater in listening than in reading (Wolf et al., 2019). Because of these similarities and differences, the value of the prediction is not perfect and tends to be weak or low.
In the context of answering questions in the standardized TOEFL ITP test, the test-takers have only 12 seconds to answer a listening question. It means that they must be able to read fluently and have a sufficient range of vocabulary to answer correctly and on time. Reading fluency helps them recognize the words while reading the questions and multiple-choice options, while vocabulary mastery helps them understand the audio and the options. Given this, it is possible that vocabulary and reading fluency contribute to the value of the prediction (Wolf et al., 2019).
The weak value of the prediction might be explained by the other contributing aspects shared by reading and listening. According to Wolf et al. (2019), verbal short-term memory, verbal working memory, visual memory, sustained aural attention, and inhibition might contribute to the prediction, but not as remarkably as vocabulary and reading fluency. Futransky (1992) argued that verbal working memory contributed less to problems experienced in reading and listening comprehension. Comprehension instead requires vocabulary, as stated by Hogan et al. (2014), who mentioned that complex and academic texts demand vocabulary. In language tests, the background knowledge of the test-takers might also not significantly predict listening, since test designers do not work with familiar recordings or passages that have been published or used by other parties.
The result of the prediction must be interpreted with caution because of the limited theory and literature explaining the prediction. However, the result suggests the hypothesis that the value of the prediction is due to the modal-specific process (the comprehension process) and shared contributing aspects (vocabulary), as stated by Wolf et al. (2019). In addition, the value of the prediction is not due to verbal short-term memory, verbal working memory, visual memory, sustained aural attention, or inhibition.
The findings of the study imply that reading and listening could be taught together because they correlate with each other. Valentini et al. (2018) also asserted that combining reading and listening increases the ability to learn vocabulary more than using a single mode of language input. This information is beneficial for teachers designing classroom activities and tests. Another valuable cue from the correlation of reading and listening concerns teaching techniques. Devine (1967) and Nation and Yamamoto (2012) mentioned that reading and listening can be combined; for example, the teacher may direct the students to read information on a particular topic before hearing the audio version in listening. This activity is expected to give the students better comprehension. The activity can also be done the other way around.
The present results have shown that reading and listening are correlated in the standardized TOEFL ITP test, as in the other tests studied. They also present the percentage of listening predicted by reading. The results were obtained from a large set of quantitative data on reading and listening scores. Therefore, the results help fill the gap in the discussion of reading and listening in language tests and provide an undergirding theory for combining reading and listening activities in language classrooms.
Nevertheless, the present study only scrutinized the correlation and prediction between reading and listening in a language test. Therefore, the results might not be relevant to the correlation and prediction in achievement tests. Besides, since discussion and literature on the variance affecting the prediction between reading and listening are quite rare, further research could focus on these voids to complete the discussion of reading and listening, so that the information will be useful for combining teaching strategies and assessing students.

CONCLUSION
The present research investigated the correlation between reading and listening and measured how much reading predicts listening in the context of the standardized TOEFL ITP test issued by ETS. The results revealed that reading and listening are significantly correlated, with a linear, strong, and positive correlation of .682. It was also found that reading significantly predicted listening, although the value of the prediction is considered weak or low (46.5%).
Besides, the significance tests showed levels lower than 5%. Therefore, the correlation value and the squared correlation value are large enough to be considered significant, and it can be concluded that all the alternative hypotheses are supported and the null hypotheses are rejected.

RECOMMENDATION
The results can be informative for teachers and language test designers. They show that reading and listening can be taught together and combined in classroom activities. For assessment, however, the finding is only able to show the correlation of both skills in language tests, because the prediction is low. Therefore, further studies are needed to ensure that reading and listening can be combined in a task in a language test. Additionally, further studies might focus on measuring other contributing aspects shared by those skills in order to build a stable undergirding theory for combining the skills in classroom activities.