THE COMPARISON OF BOARDING SCHOOL STUDENTS’ CAPABILITY IN SOLVING HOTS QUESTIONS IN THE ISLAMIC HISTORY SUBJECT

This research was conducted to determine and compare the ability of 11th-grade boarding school students of MAN I and MAN II Surakarta to solve HOTS questions in the Islamic History subject. The research applied a comparative quantitative method. The results reveal that, among MAN I students, 22% scored in the low category, 55% in the medium category, and 23% in the high category. Meanwhile, among MAN II students, 14% scored in the low category, 49% in the medium category, and 37% in the high category. There was no significant difference between the two schools in the ability to solve HOTS questions in Islamic History. With HOTS-based evaluation, students can develop their ability to solve HOTS questions, and teachers can consider implementing HOTS-based evaluation.


INTRODUCTION
In response to developments in education at the international level, Curriculum 2013 was designed with various improvements. These improvements included the assessment of learning outcomes, which was expected to help students improve higher-order thinking skills (HOTS). HOTS could encourage students to think broadly and deeply about the material. 1 HOTS is a high-level ability demonstrated by expressing something back to others. 2 This capability was one of the aspects highlighted in producing quality human resources. 3 One of the subjects in which evaluation using HOTS has been carried out is Islamic Cultural History (hereafter SKI, Sejarah Kebudayaan Islam). Implementing HOTS-based evaluation for SKI seemed difficult because SKI is textual, while HOTS emphasizes students' ability to take ibrah (moral value), wisdom, and lessons from Islamic history, to imitate outstanding figures, and then to relate them to social, cultural, political, economic, scientific, technological, artistic, and other phenomena in order to develop Islamic culture and civilization in the present and future. 4 HOTS questions had been implemented in the boarding school programs of MAN I Surakarta (hereafter M1) and MAN II Surakarta (hereafter M2). HOTS was implemented in the final examination and the formative test (Interview, 4/2/19). The formative test archive shows that about 40% of the questions at M1 were HOTS questions, with an average score of 88.0, while about 20% of the questions at M2 were HOTS questions, with an average score of 76.0. On these figures, M1 students appeared to have a higher ability in solving HOTS questions than M2 students.
Both schools had applied learning components that complied with the requirements and needs of the students. The observation conducted on February 4, 2019 at both schools yielded the following information. First, the facilities and infrastructure of both schools were classified as very good and complete. Second, the support and commitment of the Department of Education and the school principals to improving students' HOTS were shown by the holding of conferences and workshops on HOTS-based learning and evaluation since 2017. Third, both M1 and M2 met the requirement of adequate teacher education qualifications. 5 Thus, the various supporting factors for achieving HOTS at the two schools had been fulfilled both procedurally and operationally. However, the results of the formative tests on SKI using HOTS showed quite striking differences.
Based on this background, the researcher focused on students' ability to solve HOTS questions. The evaluation was conducted by giving HOTS questions of the same level and evaluation standard to grade XI students of the MAN I Surakarta (M1) and MAN II Surakarta (M2) boarding schools. The research analyzed the ability of M1 and M2 students to solve HOTS questions in SKI, and it compared the abilities of the two groups.
This is comparative research, which investigates the differences between two or more sample groups with respect to the event being studied. 6 The procedure consisted of four stages: first, determining and focusing the research problem; second, determining the population and the research sample; third, collecting the data; and fourth, analyzing the data. 7 The research was conducted from February to April 2019, collecting data in the form of documents and achievement tests. The documentation provided the initial data in the form of formative test scores, and the achievement tests consisted of multiple-choice questions. Once the data were obtained, they were analyzed to draw the conclusions of this research.
The population comprised 125 students, consisting of 84 students from M1 and 41 students from M2. From this population, samples were taken using the probability random sampling technique. The sample comprised 95 students, determined using the Slovin formula with a 5% error rate.
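The sample size above can be checked with a short sketch of the Slovin formula, n = N / (1 + N e^2). The function name and the choice to round to the nearest whole student are assumptions made for illustration; the text does not state how the result was rounded.

```python
def slovin_sample_size(population: int, error_rate: float) -> int:
    """Slovin's formula: n = N / (1 + N * e^2).

    Rounding to the nearest whole number is an assumption; some
    authors round up instead.
    """
    n = population / (1 + population * error_rate ** 2)
    return round(n)


population = 84 + 41  # M1 and M2 students, as reported in the text
print(slovin_sample_size(population, 0.05))  # 125 / 1.3125 = 95.24, so 95
```

With N = 125 and e = 0.05 the formula gives about 95.24, which agrees with the 95 students reported.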

RESULT
For the tests carried out in the study, the researcher used the Product Moment correlation to test the validity of the instrument, 8 and Cronbach's alpha to test its reliability. 9 The normality of the data was tested using the Shapiro-Wilk test, 10 while the homogeneity of the data was tested using the Levene test. 11 The proposed hypothesis was that there was a difference in the ability to solve HOTS questions in the Islamic History subject between grade XI students of MAN I and MAN II Surakarta, which was tested using the independent t-test. 12 Based on the unit analysis, the ability of grade XI students of M1 to solve HOTS questions, with a sample (N) of 54 students, is as follows:

Table 0.1 The Result of Unit Analysis Performed by XI Grade Students of M1

No   Unit Analysis         Score
1    Highest score (Db)    100
2    Lowest score (Dk)     40
3    Standard Deviation    16.6

The data were then grouped into categories in the form of a frequency distribution, which shows the amount of data in each category. This made the data more informative and easier to understand. The frequency distribution of the ability of M1 students to solve HOTS questions was tabulated. Based on that table, there were seven categories of students' abilities in solving HOTS questions. These were regrouped into three main categories: low, medium, and high. The low category consisted of "very less" and "less". The medium category consisted of "good enough" and "good". The high category consisted of "incredible" and "exquisite".
Based on the table above, of the 54 M1 students sampled, 12 students (22%) scored in the low category, 30 students (55%) scored in the medium category, and 12 students (23%) scored in the high category. The accompanying diagram shows that the ability of M1 students to solve HOTS questions falls mainly in the medium category, with scores in the 58-84 interval obtained by 30 students (55%).
For M2, based on the unit analysis, the ability of grade XI students to solve HOTS questions, with a sample (N) of 41 students, is as follows:

Table 0.3 The Result of Unit Analysis Performed by XI Grade Students of M2

No   Unit Analysis         Score
1    Highest Score (Db)    100
2    Lowest Score (Dk)     40
3    Median (Me)           73
5    Mode (Mo)             87
6    Standard Deviation    16.6

From the frequency distribution table for M2, of the 41 students sampled, 6 students (14%) scored in the low category, 20 students (49%) scored in the medium category, and 15 students (37%) scored in the high category. The accompanying diagram shows that the ability of class XI M2 students to solve HOTS questions falls mainly in the medium category, with scores in the 58-84 interval obtained by 20 students (49%).
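The regrouping into low, medium, and high categories can be sketched as follows. Only the medium interval of 58-84 comes from the text; treating scores below 58 as low and scores above 84 as high is an assumption, and the score list is hypothetical rather than the study's data.

```python
from collections import Counter


def categorize(score: float) -> str:
    """Map a score to a category. Only the 58-84 medium interval is
    taken from the text; the low and high boundaries are assumed."""
    if score < 58:
        return "low"
    if score <= 84:
        return "medium"
    return "high"


def category_shares(scores):
    """Return {category: (count, percentage rounded to a whole number)}."""
    counts = Counter(categorize(s) for s in scores)
    total = len(scores)
    return {c: (counts[c], round(100 * counts[c] / total))
            for c in ("low", "medium", "high")}


# Hypothetical scores for illustration only.
scores = [40, 53, 60, 67, 73, 73, 80, 87, 93, 100]
print(category_shares(scores))
# {'low': (2, 20), 'medium': (5, 50), 'high': (3, 30)}
```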

Normality Test
The data normality test is intended to ensure that the sample data come from normally distributed populations. 13 The normality test used in this study was the Shapiro-Wilk test. This test can be used for interval and ratio data, in large or small amounts, including grouped data. 14 The decision for the Shapiro-Wilk test is based on the resulting sig value. If the sig value > 0.05, the data are normally distributed. Conversely, if the sig value < 0.05, the data are not normally distributed. 15 The results show that the ability of students from both schools to solve HOTS questions is normally distributed, because the sig value of M1 is 0.099 and the sig value of M2 is 0.053.
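The decision rule described above can be written out as a small sketch. The function name is hypothetical, and the sig values are the ones reported in the text rather than recomputed from raw scores.

```python
ALPHA = 0.05  # significance threshold used in the text


def is_normal(sig: float, alpha: float = ALPHA) -> bool:
    """Decision rule from the text: sig > alpha means the data are
    treated as normally distributed."""
    return sig > alpha


# Sig values as reported for the Shapiro-Wilk test.
reported = {"M1": 0.099, "M2": 0.053}
for school, sig in reported.items():
    verdict = "normal" if is_normal(sig) else "not normal"
    print(f"{school}: sig = {sig} -> {verdict}")
```

Both reported values pass the rule, although the value for M2 lies only slightly above the 0.05 threshold.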

Homogeneity Test
The homogeneity test is used to show that two or more groups of sample data come from populations that have the same variance. 16 The homogeneity test used in this study is the Levene test. The decision for the Levene test is based on the resulting sig value. If the sig value is > 0.05, the data are homogeneous, and if the sig value is < 0.05, the data are not homogeneous. 17 The results show that the ability of students from both schools to solve HOTS questions is homogeneous, because the sig value for the homogeneity test is greater than 0.05, namely 0.721.
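For readers who want to see what the Levene test computes, the sketch below calculates the Levene W statistic using absolute deviations from each group mean. Whether the study used the mean-centred or the median-centred variant is not stated, so the mean-centred form is an assumption here. The sig value additionally requires an F distribution, which this sketch leaves to a statistics package, and the two score lists are hypothetical.

```python
from statistics import mean


def levene_statistic(*groups):
    """Levene's W based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    between = sum(len(zi) * (zm - z_grand) ** 2
                  for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data: the second group is twice as spread out as the first.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(round(levene_statistic(group_a, group_b), 4))  # 2.0571
```

A W close to zero indicates similar spread in the groups; larger values indicate unequal variances.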

Hypothesis Test
A hypothesis is a provisional conclusion based on theoretical study that still has to be tested empirically. 18 A hypothesis is formulated as a declarative sentence stating whether there is a relationship, difference, or influence between two or more variables. 19 In this study, the researchers used the independent t-test to test the hypothesis. 20 The hypotheses proposed in this study are as follows:

Ho = There is no difference in the ability to solve HOTS questions in SKI between Class XI students of the MAN I and MAN II Surakarta Boarding Schools.
Ha = There is a difference in the ability to solve HOTS questions in SKI between Class XI students of the MAN I and MAN II Surakarta Boarding Schools.
The decision for the independent t-test is based on the resulting sig value. If the sig value is > 0.05, Ho is accepted; if the sig value is < 0.05, Ha is accepted. 21 The sig value for the ability of M1 and M2 students to solve HOTS questions is 0.325. Thus, Ho was accepted: there was no significant difference between the two schools in the ability to solve HOTS questions in SKI.
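As a rough plausibility check, the independent t statistic can be recomputed from the summary statistics given in the Discussion (means of 72 and 75.24, standard deviations of 16.6 and 16.5, and samples of 54 and 41). The sketch assumes equal variances, in line with the Levene result, and approximates the two-sided sig value with a normal distribution. Because it starts from rounded summaries rather than raw scores, it is not expected to reproduce the reported 0.325 exactly.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Summary statistics as reported in the Discussion (M1 first, then M2).
t, df = pooled_t(72, 16.6, 54, 75.24, 16.5, 41)

# Normal approximation to the two-sided sig value (an assumption; the
# exact value would come from a t distribution with df degrees of freedom).
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))

print(f"t = {t:.3f}, df = {df}, approximate sig = {p_approx:.3f}")
print("Ho accepted" if p_approx > 0.05 else "Ha accepted")
```

The approximate sig value is well above 0.05, which is consistent with the decision to accept Ho.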

DISCUSSION
HOTS is a high-level thinking ability that involves transferring one concept to another by processing, linking, analyzing, and critiquing existing information in the cognitive domain. 22 This capability consists of the ability to analyze, evaluate, and create with information. 23 As HOTS began to develop among students, the standard of assessment indirectly changed. HOTS in students was measured using HOTS-based assessment instruments.
In the unit analysis of the ability of M1 students to solve HOTS questions, with a sample (N) of 54 students, the highest score (Db) was 100, the lowest score (Dk) was 40, the mean (x) was 72, the median (Me) was 73, the mode (Mo) was 80, and the standard deviation was 16.6.
The data show the frequency of the ability of M1 students to solve HOTS questions. Of the 54 students sampled, 12 students scored in the low category, 30 students scored in the medium category, and 12 students scored in the high category.
Meanwhile, in the unit analysis of the ability of M2 students to solve HOTS questions, with a sample (N) of 41 students, the highest score (Db) was 100, the lowest score (Dk) was 40, the mean (x) was 75.24, the median (Me) was 73, the mode (Mo) was 87, and the standard deviation was 16.5.
The data show the frequency of the ability of M2 students to solve HOTS questions. Of the 41 students sampled, 6 students scored in the low category, 20 students scored in the medium category, and 15 students scored in the high category. This showed that the average score obtained by students from both schools is in the medium category, and that the average score of M2 is 3.24 points higher than that of M1.
The data normality test was intended to ensure that the sample data came from normally distributed populations. The normality test used in this study was the Shapiro-Wilk test. This test can be used for interval and ratio data, in large or small amounts, including grouped data.
The decision for the Shapiro-Wilk normality test was based on the resulting sig value. From the data obtained, the ability of students from both schools to solve HOTS questions was normally distributed, because the sig value of M1 was 0.099 and the sig value of M2 was 0.053.
The homogeneity test was intended to show that two or more groups of sample data came from populations that had the same variance. The homogeneity test used in this study was the Levene test, and the decision was based on the resulting sig value. From the data obtained, the ability of students from both schools was homogeneous, because the sig value for the homogeneity test was greater than 0.05, namely 0.721.
The hypothesis was tested by applying the t-test and using the resulting sig value. The sig value for the ability of students from both schools to solve HOTS questions was 0.325. Thus, Ho was accepted: there was no significant difference in the students' ability to solve HOTS questions. The difference that emerged in the average achievement test score was 3.24 points.
This conclusion, that there was no significant difference in the students' ability to solve HOTS questions, contrasted with the results of the formative tests obtained by both schools. When related to several supporting and inhibiting factors for students' HOTS, the results of the hypothesis test were consistent with various theories about these factors.
Regarding the factors supporting students' HOTS, the learning components in each school met the students' needs in the learning process. These components included facilities and infrastructure, support and commitment from the Education Office and the school principal, and adequate teacher qualifications.
Concerning facilities and infrastructure, both madrasas have sufficiently complete infrastructure to support the learning process. Likewise, the support and commitment of the Education Office and the Heads of the Madrasahs in improving students' higher-order thinking skills were properly fulfilled at both M1 and M2. This can be seen from the implementation of HOTS-based learning and evaluation since 2017 and the socialization of HOTS learning and evaluation materials. Therefore, these factors did not differentiate the two schools.
Regarding teacher qualifications, M1 and M2 already had qualified teachers and teaching staff with educational backgrounds in line with their teaching fields. This can be seen from the prerequisites that had to be met before they were accepted to work at the schools.
Regarding the inhibiting factors, all students from both schools had low ability in SKI. This was because students might regard SKI as a boring and complicated subject, which indirectly affected learning and evaluation, making learning less than optimal and producing less than optimal results.
Another inhibiting factor was the students' lack of accuracy in reading, understanding, and choosing among the answer options of the HOTS questions. This can be seen from some of the mistakes made by students in these three respects, as well as from complaints made by some students during the discussion of the HOTS questions. In this respect, M1 students made more such mistakes, which indirectly contributed to the low HOTS-based evaluation scores for SKI.
In addition, low student literacy also affected students' HOTS. Based on interviews with the students and the Islamic History teachers of M1, students' literacy was classified as low. This was due to the students' low enthusiasm for reading, especially reading the main handbooks for learning. For M2 students, a lack of understanding of concepts was one of the inhibiting factors in solving HOTS questions, because students were still too fixated on the textbook.
The last inhibiting factor relates to the less than optimal learning process and the learning strategy being carried out. According to the observation, the classroom environment of M1 was more conducive to the learning process and to the improvement of students' HOTS. In contrast, M2 had a less conducive classroom environment, since its school building is located next to fairly busy public places. However, even though the learning environment was not conducive, M2 students had a quite competitive cognitive environment and were enthusiastic during the learning process.
Another factor related to family relationships. The researcher gave less attention to this factor because students had limited interaction with family members: they stayed in the boarding school dormitories and spent most of their time there.
Competition among M2 students was higher than among M1 students, and M2 students were more enthusiastic. This enthusiasm serves as one of the supporting factors for achieving HOTS. Meanwhile, the learning strategies implemented in both schools were unfortunately conventional.
The hypothesis test showed a discrepancy with the formative test data. This was because each school applied different evaluation standards. When the research applied the same evaluation standard to both schools, the results showed no significant difference between the two samples.
To conclude, there was no input gap between students of M1 and M2. This indirectly affected their ability to solve HOTS questions in SKI, especially in the 2018/2019 academic year, in which there was no significant difference between them. This was influenced by various supporting factors, which include adequate facilities and infrastructure, adequate teacher qualifications, and the support and commitment of the Education Office and the school principals to improving the quality of education, especially in implementing HOTS. Meanwhile, the factors inhibiting students' HOTS include low initial ability, a lack of thoroughness in solving questions, a lack of understanding of concepts, low literacy, and a less than ideal learning process.

CONCLUSION
This research concludes that, of the 54 sampled Class XI students of the MAN I Surakarta Boarding School, 12 students (22%) scored in the low category, 30 students (55%) scored in the medium category, and 12 students (23%) scored in the high category in the ability to solve HOTS questions. Meanwhile, of the 41 sampled students of the MAN II Surakarta Boarding School, 6 students (14%) scored in the low category, 20 students (49%) scored in the medium category, and 15 students (37%) scored in the high category. There is no significant difference between the two schools in the ability to solve HOTS questions in the Islamic Cultural History subject. It is hoped that applying HOTS questions in the Islamic Cultural History subject can improve students' abilities to analyze, evaluate, and create. It is also hoped that teachers will consider applying HOTS and HOTS-based assessment in line with student capacity. Furthermore, this research is hoped to increase understanding of the importance of learning that improves students' abilities in line with the demands of educational development at the international level through the implementation of the 2013 Curriculum.