Your conditions: 蔡艳
  • Development of Online Calibration Method Based on SCAD Penalty and EM Perspective in CD-CAT: a study based on the G-DINA model

    Subjects: Psychology >> Psychological Measurement submitted time 2023-11-22

    Abstract: Cognitive diagnostic computerized adaptive testing (CD-CAT) provides a timely and accurate diagnosis of an examinee’s strengths and weaknesses in the measured content. The diagnosis can serve as a reference for further study or remediation planning, thus meeting the practical need for efficient and detailed test results. Successful implementation of CD-CAT depends on an item bank, but maintaining the bank is very challenging. A psychometrically popular choice for maintaining an item bank is online calibration. Currently, few online calibration methods for CD-CAT can calibrate the Q-matrix and item parameters simultaneously, and the existing methods were mostly developed for the deterministic inputs, noisy "and" gate (DINA) model. Compared with the DINA model, the generalized DINA (G-DINA) model has been more widely applied because it is less restrictive and can accommodate a large share of the test data in psychological and educational assessment. An online calibration method that jointly calibrates the Q-matrix and item parameters for less constrained models such as G-DINA would therefore be of clear practical value.
    In the current study, a new online calibration method suitable for the G-DINA model, SCADOCM, was proposed. SCADOCM is built on the smoothly clipped absolute deviation (SCAD) penalty and marginal maximum likelihood estimation via the expectation-maximization algorithm (MMLE/EM). For a new item j, a log-likelihood function with the SCAD penalty is formulated from the examinees’ responses to this item and their marginal attribute mastery probabilities, and the q-vector of the new item is estimated with the SCAD-based q-vector estimator. The EM algorithm is then used to estimate the item parameters of new item j from the posterior distributions of the examinees’ attribute patterns, their responses to item j, and the estimated q-vector.
    To examine the performance of the proposed SCADOCM and compare it with the SIE method, two simulation studies (Study 1 and Study 2) were conducted. Study 1 was based on a simulated item bank, while Study 2 was based on a real item bank (Internet addiction item bank; Shi, 2017). Four factors were manipulated: the calibration sample size (nj = 50, 100, 500, 1000, and 2000), the distribution of the attribute patterns (uniform vs. higher-order vs. normal), the item quality (U(0.05, 0.15) vs. U(0.1, 0.3)), and the online calibration method (SCADOCM vs. SIE). The results showed that (1) SCADOCM had satisfactory calibration accuracy and calibration efficiency and was superior to the SIE method; the traditional SIE method was not applicable to the G-DINA model, and its Q-matrix estimation accuracy was low under all experimental conditions. (2) Under most conditions, the item calibration accuracy of SCADOCM and SIE increased with calibration sample size and item quality, and was higher under the uniform and higher-order distributions than under the normal distribution. (3) The calibration efficiency of SCADOCM decreased as the calibration sample grew but was little affected by item quality or the attribute pattern distribution; the calibration efficiency of SIE likewise decreased as the calibration sample grew and was little affected by item quality. Moreover, the calibration efficiency of the SIE method was slightly lower under the normal distribution than under the uniform and higher-order distributions.
    In summary, SCADOCM achieved higher item calibration accuracy and calibration efficiency than the SIE method, whereas the traditional SIE method is not suitable for the G-DINA model. The study thus provides an efficient and accurate method for item calibration in CD-CAT and offers important support for the wider application of CD-CAT in practice.
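
    The SCAD penalty named in this abstract has a standard closed form (Fan and Li, 2001), which the following minimal Python sketch evaluates. The function name `scad_penalty` and the default `a = 3.7` are assumptions made for illustration; the abstract does not say how SCADOCM sets its tuning constants or how the penalty enters the q-vector estimator.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient.

    theta : coefficient value (the penalty depends on |theta| only)
    lam   : tuning parameter, must be positive
    a     : shape constant, must exceed 2 (3.7 is a common default,
            assumed here; not taken from the abstract)
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * t
    if t <= a * lam:
        # Middle range: quadratic segment that tapers the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for value in (0.5, 1.0, 2.0, 3.7, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

    The three branches join continuously at |theta| = lam and |theta| = a * lam, which is why the penalty behaves like the lasso near zero yet leaves large coefficients essentially unpenalized.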

  • Change point analysis: A new method to detect aberrant responses in psychological and educational testing

    Subjects: Psychology >> Developmental Psychology submitted time 2023-03-28 Cooperative journals: 《心理科学进展》

    Abstract: Change point analysis (CPA), one of the most widely used methods in statistical process control, has been introduced to psychological and educational measurement in recent years to detect aberrant response patterns. CPA improves on traditional methods in that, besides detecting aberrant response patterns, it can pinpoint the locations of change points, which supports efficient cleansing of response data. The method tests whether there is a point that divides the complete response sequence into two parts with different statistical properties, using a person-fit statistic (PFS) to quantify the difference between the two sub-sequences. Future research should pay more attention to detecting multiple change points, making full use of other informative data such as response times, developing non-parametric indices, and extending the existing person-fit statistics to polytomous and multidimensional tests, so as to enhance the applicability and power of CPA.
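
    The search over candidate change points described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the statistic used here (a standardized difference in mean residuals between the two sub-sequences) is a hypothetical stand-in for a person-fit statistic and is not one of the indices from the CPA literature, and the model-expected probabilities in `expected` are assumed to be known.

```python
import math


def detect_change_point(responses, expected, min_len=2):
    """Scan every admissible split point of a dichotomous response sequence.

    responses : list of 0/1 item scores in administration order
    expected  : model-implied success probabilities, each strictly in (0, 1)
    min_len   : minimum number of items on each side of the split

    Returns (k, stat): responses[:k] and responses[k:] are the two
    sub-sequences that differ most under the illustrative statistic.
    """
    n = len(responses)
    if n != len(expected):
        raise ValueError("responses and expected must have equal length")
    if n < 2 * min_len:
        raise ValueError("sequence too short for the requested min_len")

    # Standardized residual of each response under the assumed model.
    resid = [(x - p) / math.sqrt(p * (1 - p)) for x, p in zip(responses, expected)]

    best_k, best_stat = None, -1.0
    for k in range(min_len, n - min_len + 1):
        left = sum(resid[:k]) / k
        right = sum(resid[k:]) / (n - k)
        # Difference in mean residuals, scaled by its standard error.
        stat = abs(left - right) / math.sqrt(1 / k + 1 / (n - k))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    # Hypothetical examinee who answers well for 10 items, then poorly.
    scores = [1] * 10 + [0] * 10
    probs = [0.5] * 20
    print(detect_change_point(scores, probs))
```

    In practice the maximum statistic would still have to be compared with a critical value before a sequence is flagged as aberrant; how that value is obtained is outside this sketch.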

  • Development of a polytomous cognitive diagnosis model based on the partial credit model approach

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: Currently, a large number of cognitive diagnosis models (CDMs) have been proposed to satisfy the demands of cognitively diagnostic assessment. However, most existing CDMs are suitable only for dichotomously scored items. In practice, educational and psychological tests contain a large number of polytomously scored items, so it is necessary to develop CDMs for polytomous data. Under the item response theory (IRT) framework, polytomous models can be divided into three categories: (i) cumulative probability (or graded-response) models, (ii) continuation ratio (or sequential) models, and (iii) adjacent-category (or partial-credit) models. Several efforts have been made to develop polytomous partial-credit CDMs, including the general diagnostic model (GDM; von Davier, 2008) and the partial credit DINA (PC-DINA; de la Torre, 2012) model. However, the existing polytomous partial-credit CDMs need improvement in two respects: (1) They do not consider the relationship between attributes and response categories, because they assume that all response categories of an item measure the same attributes. This may result in a loss of diagnostic information, because different response categories could measure different attributes. (2) More importantly, the PC-DINA is based on the reduced DINA model. The current polytomous CDMs are therefore established under strong assumptions and lack the advantages of a general cognitive diagnosis model. The current article proposes a general partial credit diagnostic model (GPCDM) for polytomous responses with less restrictive assumptions.
    Item parameters of the proposed model can be estimated using marginal maximum likelihood estimation via the expectation-maximization (MMLE/EM) algorithm. Study 1 examined (1) whether the EM algorithm can accurately estimate the parameters of the proposed model, and (2) whether using an item-level Q-matrix (referred to as the Item-Q) to analyze data generated by a category-level Q-matrix (referred to as the Cat-Q) reduces the accuracy of parameter estimation. Results showed that when the Cat-Q was used to fit the data, the maximum RMSE was less than 0.05, and the minimum pattern match rate (PMR) was 0.9 and 0.8 when the number of attributes was 5 and 7, respectively. These results indicated that item and person parameters could be recovered accurately with the proposed estimation algorithm. The results also showed that when the Item-Q was used to fit data generated by the Cat-Q, the estimation accuracy of both the item and person parameters could be reduced. It is therefore suggested that, when constructing polytomously scored items for cognitively diagnostic assessment, the item writer should try to identify the association between attributes and categories. In the process, more diagnostic information may be extracted, which in turn helps improve diagnostic accuracy. Study 2 applied the proposed model to the TIMSS (2007) fourth-grade mathematics assessment to demonstrate its application and feasibility and to compare it with the existing GDM and PC-DINA models. The results showed that, compared with the GDM and PC-DINA models, the new model had better test-level model fit, higher attribute reliability, and better diagnostic performance.
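
    As a reminder of what the adjacent-category (partial-credit) family mentioned in this abstract means, the sketch below converts a set of step logits into category probabilities. It shows the generic construction only and is not the GPCDM itself: the function name `adjacent_category_probs` is hypothetical, and how each step logit would depend on an examinee's attribute pattern and a category-level q-vector is not specified here.

```python
import math


def adjacent_category_probs(step_logits):
    """Category probabilities for scores 0..M from M adjacent-category logits.

    step_logits[h - 1] is log(P(X = h) / P(X = h - 1)), the log-odds of
    scoring h rather than h - 1. In a diagnostic model these logits would
    be functions of the attribute pattern (an assumption of this sketch).
    """
    # Cumulative sums give the unnormalized log-probability of each score.
    cumulative = [0.0]
    for z in step_logits:
        cumulative.append(cumulative[-1] + z)

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical 3-category item (scores 0, 1, 2) with two step logits.
    print(adjacent_category_probs([1.0, -0.5]))
```

    Because each logit compares only two neighbouring scores, different steps can in principle be tied to different attributes, which is the motivation the abstract gives for a category-level Q-matrix.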

  • A dual-objective CD-CAT item selection strategy based on the Gini index

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: Existing literature has shown that dual-objective CD-CAT can facilitate the achievement of measurement objectives for both formative and summative assessments. The Gini index can serve as a measure of the uncertainty of a random variable, since a smaller Gini value indicates a lower degree of uncertainty. Hence, this paper proposed a Gini-index-based item selection method for dual-objective CD-CAT that measures the changes in the posterior probability of the knowledge state and in the confidence interval of the latent trait estimate. By adopting Bayesian decision theory, the potential information about participants can be detected from their responses and from the changes in the posterior probability distributions of the two random variables. Monte Carlo simulation was used to test the performance of the Gini, ASI, IPA, and JSD selection methods. The item banks measured 5 attributes and consisted of 250 items in total, with each item measuring at most 3 attributes. The true knowledge state of each participant was generated by the HO-CDM and by multivariate normal models (both means were 0 and the covariance coefficients were 0.8 and 0.2, respectively). G-DINA, DINA, and R-RUM were adopted as the cognitive diagnostic models, and the item bank for each of these three models included both CDM and 2PL parameters. Specifically, the CDM parameters were generated by a G-DINA package in R, with the slipping and guessing parameters randomly drawn from a uniform distribution on the range 0.05 to 0.25. The 2PL parameters were estimated with the mirt package from 3,000 participants’ responses to all items in the item banks. Four indexes, namely the pattern match ratio, the root mean square error of the latent trait, the chi-square value, and the time needed for item selection, were adopted to compare the efficiency of the item selection methods.
    The value of each index was the mean over 10 repeated simulations of 1,000 participants’ responses in all item banks. The results showed that (1) the Gini and IPA selection methods performed similarly in terms of pattern match ratio, root mean square error of the latent trait, and chi-square value. Both methods had high measurement precision and low sensitivity to the CDM and to the distribution of participants’ cognitive patterns, making both applicable to item banks featuring a mixture of cognitive diagnosis models. By comparison, the Gini method slightly outperformed the IPA method in pattern match ratio, and its item selection time was only one-tenth that of the IPA method. (2) Both the Gini and ASI selection methods are weighted linear combination approaches. The two methods performed very similarly in the short test. In the long test, however, although the item selection time of the ASI method was only one-third that of the Gini method, the Gini method was superior in measurement accuracy and chi-square value. (3) Although the JSD method outperformed the Gini method in uniformity of item bank usage and item selection time, its measurement accuracy was far lower. To summarize, the Gini, IPA, and ASI selection methods all have good measurement accuracy and hence are all recommended for short tests. For medium and long tests with a limited number of attributes and a smaller item bank, the Gini and IPA selection methods are recommended. As the number of attributes and the item bank size grow, the Gini method is recommended. When there are high correlations among attributes, a large number of attributes, and a large item bank, the ASI and JSD selection methods are recommended, with the ASI method slightly outperforming the JSD method in measurement accuracy.
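
    The Gini index used in this abstract as an uncertainty measure can be illustrated with a short Python sketch. It covers only the knowledge-state part (the posterior over attribute patterns) for a dichotomous item; the function names and the rule of picking the item with the smallest expected posterior Gini are assumptions made for illustration, and the weighting with the latent trait estimate that makes the published method dual-objective is not reproduced.

```python
def gini(probs):
    """Gini index of a discrete distribution: 1 - sum of squared probabilities.

    It is 0 when all mass sits on one state and grows as the distribution
    spreads out, so smaller values mean less uncertainty.
    """
    return 1.0 - sum(p * p for p in probs)


def expected_posterior_gini(prior, p_correct):
    """Expected Gini of the knowledge-state posterior after one candidate item.

    prior     : current posterior probability of each knowledge state
    p_correct : P(correct response | knowledge state) for the candidate item
    """
    p1 = sum(pi * pc for pi, pc in zip(prior, p_correct))  # P(X = 1)
    p0 = 1.0 - p1                                          # P(X = 0)

    expected = 0.0
    if p1 > 0:
        post_correct = [pi * pc / p1 for pi, pc in zip(prior, p_correct)]
        expected += p1 * gini(post_correct)
    if p0 > 0:
        post_wrong = [pi * (1 - pc) / p0 for pi, pc in zip(prior, p_correct)]
        expected += p0 * gini(post_wrong)
    return expected


if __name__ == "__main__":
    prior = [0.5, 0.5]
    # Hypothetical items: one discriminates the two states, one does not.
    print(expected_posterior_gini(prior, [0.9, 0.1]))
    print(expected_posterior_gini(prior, [0.5, 0.5]))
```

    Under this illustrative rule, the discriminating item would be preferred because it leaves less expected uncertainty about the knowledge state.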

  • An efficient new online calibration method for CD-CAT: The entropy-based information gain and EM perspective

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: Cognitive diagnostic computerized adaptive testing (CD-CAT) combines the advantages of cognitive diagnosis (CD) and computerized adaptive testing (CAT): it can offer detailed diagnostic feedback for each examinee with fewer items and less testing time, which makes it a promising field. An item bank is a prerequisite for implementing CD-CAT, but its maintenance is very challenging. One effective way to maintain the item bank is online calibration. To date, only a few online calibration methods in the CD-CAT context can calibrate the Q-matrix and item parameters simultaneously, and their computational efficiency needs further improvement. It is therefore important to develop more online calibration methods that jointly calibrate the Q-matrix and item parameters. Inspired by the single-item estimation (SIE) method proposed by Chen et al. (2015) and by the information gain criterion used in feature selection, this study proposed an online calibration method based on the information gain of entropy (IGEOCM). The proposed method jointly calibrates the Q-matrix and item parameters in a sequential manner. The calibration process for new items is as follows: First, for a new item j, the q-vector is calibrated by maximizing the entropy-based information gain computed from the examinees’ attribute patterns and their responses to item j. Second, the item parameters of item j are estimated by the EM algorithm based on the posterior distribution of the examinees’ attribute patterns, their responses to item j, and the q-vector estimated in the first step. The two steps are repeated for every other new item, so that the Q-matrix and item parameters are estimated item by item.
    Two simulation studies were conducted to examine whether the IGEOCM could accurately and efficiently calibrate the Q-matrix and item parameters of new items under different calibration sample sizes (40, 80, 120, 160, and 200), different attribute pattern distributions (uniform, higher-order, and multivariate normal), different numbers of new items answered by each examinee (4, 6, and 8), and different item selection algorithms (posterior-weighted Kullback-Leibler, PWKL; the modified PWKL, MPWKL; the generalized deterministic inputs, noisy "and" gate model discrimination index, GDI; and Shannon entropy, SHE). Furthermore, the performance of the proposed method was compared with the SIE, SIE-R-BIC, and RMSEA-N methods. The results indicated that (1) the IGEOCM worked well in terms of calibration accuracy and estimation efficiency under all conditions and outperformed the SIE, SIE-R-BIC, and RMSEA-N methods overall. (2) The accuracy of item calibration increased with sample size for all calibration methods under all conditions. (3) The SIE, SIE-R-BIC, RMSEA-N, and IGEOCM performed better under the uniform and higher-order distributions than under the multivariate normal distribution. (4) The number of new items answered by each examinee had a negligible impact on the calibration accuracy and computational efficiency of the SIE, SIE-R-BIC, RMSEA-N, and IGEOCM. (5) The item selection algorithm in CD-CAT affected the Q-matrix calibration accuracy of the SIE and IGEOCM methods.
    Under the higher-order and multivariate normal distributions, the SIE method and IGEOCM had higher Q-matrix calibration accuracy when the item selection algorithm was MPWKL or GDI. On the whole, although the proposed IGEOCM is competitive and outperforms the conventional methods in both calibration precision and computational efficiency, research on online calibration methods in CD-CAT still needs to be deepened and expanded.
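
    The first calibration step described in this abstract scores a candidate q-vector by the information gain of entropy. The sketch below shows the generic feature-selection quantity under stated assumptions: examinees' attribute patterns are treated as known 0/1 tuples, responses are dichotomous, and the helper names are hypothetical. It is not the IGEOCM algorithm itself, whose exact criterion is not given in the abstract.

```python
import math
from collections import defaultdict


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) response."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(responses, patterns, q_vector):
    """Information gain about item responses from a candidate q-vector.

    responses : 0/1 scores of the examinees on the new item
    patterns  : attribute patterns of the same examinees (tuples of 0/1)
    q_vector  : candidate q-vector (tuple of 0/1 over the same attributes)

    Examinees are grouped by their mastery of the attributes the candidate
    q-vector requires; the gain is H(X) minus the group-weighted H(X | group).
    """
    n = len(responses)
    groups = defaultdict(list)
    for score, alpha in zip(responses, patterns):
        key = tuple(a for a, required in zip(alpha, q_vector) if required)
        groups[key].append(score)

    total_entropy = binary_entropy(sum(responses) / n)
    conditional = sum(
        len(g) / n * binary_entropy(sum(g) / len(g)) for g in groups.values()
    )
    return total_entropy - conditional


if __name__ == "__main__":
    # Hypothetical data: the response depends only on the first attribute.
    patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
    responses = [0, 0, 1, 1]
    for q in [(1, 0), (0, 1), (1, 1)]:
        print(q, information_gain(responses, patterns, q))
```

    Note that a q-vector requiring more attributes partitions examinees more finely and so can never have lower empirical information gain than a coarser one; a usable selection rule therefore needs some way to prefer the simpler q-vector, which this sketch leaves out.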

  • An efficient new online calibration method for CD-CAT: The entropy-based information gain and EM perspective

    Subjects: Psychology >> Psychological Measurement submitted time 2021-07-30

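    Abstract: Item replenishing is essential to maintaining the item bank of cognitive diagnostic computerized adaptive testing (CD-CAT), and online calibration is an important way to replenish items. Drawing on the idea of feature selection in data mining, this study proposes an efficient online calibration method based on the information gain of entropy (denoted IGEOCM), which uses examinees’ responses to both new and operational items to jointly estimate the Q-matrix and item parameters of the new items. A Monte Carlo simulation was conducted to verify the performance of the new method and to compare it with the existing online calibration methods SIE (Chen et al., 2015), SIE-R-BIC, and RMSEA-N (谭青蓉, 2019). The results show that IGEOCM has good item calibration accuracy and item estimation efficiency under all experimental conditions and outperforms SIE and the other existing methods overall; in addition, IGEOCM needs less time to calibrate new items than SIE and the other methods. In sum, the study provides a more efficient and accurate method for replenishing items in a CD-CAT item bank.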

  • A New Dual-Objective CD-CAT Item Selection Method Based on the Gini Index

    Subjects: Psychology >> Psychological Measurement submitted time 2020-09-02

    Abstract: Existing literature has shown that dual-objective CD-CAT can facilitate the achievement of measurement objectives for both formative and summative assessments. The Gini index can serve as a measure of the uncertainty of a random variable, since a smaller Gini value indicates a lower degree of uncertainty. Hence, this paper proposed a Gini-index-based item selection method for dual-objective CD-CAT that measures the changes in the posterior probability of the knowledge state and in the confidence interval of the latent trait estimate. By adopting Bayesian decision theory, the potential information about participants can be detected from their responses and from the changes in the posterior probability distributions of the two random variables. Monte Carlo simulation was used to test the performance of the Gini, ASI, IPA, and JSD selection methods. The item banks measured 5 attributes and consisted of 250 items in total, with each item measuring at most 3 attributes. The true knowledge state of each participant was generated by the HO-CDM and by multivariate normal models (both means were 0 and the covariance coefficients were 0.8 and 0.2, respectively). G-DINA, DINA, and R-RUM were adopted as the cognitive diagnostic models, and the item bank for each of these three models included both CDM and 2PL parameters. Specifically, the CDM parameters were generated by a G-DINA package in R, with the slipping and guessing parameters randomly drawn from a uniform distribution on the range 0.05 to 0.25. The 2PL parameters were estimated with the mirt package from 3,000 participants’ responses to all items in the item banks. Four indexes, namely the pattern match ratio, the root mean square error of the latent trait, the chi-square value, and the time needed for item selection, were adopted to compare the efficiency of the item selection methods.
    The value of each index was the mean over 10 repeated simulations of 1,000 participants’ responses in all item banks. The results showed that (1) the Gini and IPA selection methods performed similarly in terms of pattern match ratio, root mean square error of the latent trait, and chi-square value. Both methods had high measurement precision and low sensitivity to the CDM and to the distribution of participants’ cognitive patterns, making both applicable to item banks featuring a mixture of cognitive diagnosis models. By comparison, the Gini method slightly outperformed the IPA method in pattern match ratio, and its item selection time was only one-tenth that of the IPA method. (2) Both the Gini and ASI selection methods are weighted linear combination approaches. The two methods performed very similarly in the short test. In the long test, however, although the item selection time of the ASI method was only one-third that of the Gini method, the Gini method was superior in measurement accuracy and chi-square value. (3) Although the JSD method outperformed the Gini method in uniformity of item bank usage and item selection time, its measurement accuracy was far lower. To summarize, the Gini, IPA, and ASI selection methods all have good measurement accuracy and hence are all recommended for short tests. For medium and long tests with a limited number of attributes and a smaller item bank, the Gini and IPA selection methods are recommended. As the number of attributes and the item bank size grow, the Gini method is recommended. When there are high correlations among attributes, a large number of attributes, and a large item bank, the ASI and JSD selection methods are recommended, with the ASI method slightly outperforming the JSD method in measurement accuracy.

  • Change point analysis: A new method to detect aberrant responses in psychological and educational testing

    Subjects: Psychology >> Psychological Measurement submitted time 2020-05-12

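    Abstract: Change point analysis (CPA) has only recently been introduced into psychological and educational measurement. Compared with traditional methods, CPA can not only detect examinees with aberrant responses but also automatically and precisely locate the change point, allowing response data to be cleaned efficiently. Its principle is to determine whether a response sequence contains a point (the change point) that divides the sequence into two parts with different statistical properties, using a person-fit statistic (PFS) to quantify the difference between the two sub-sequences. Future work could extend single change point analysis to multiple change points, incorporate information such as response times, construct non-parametric indices, and extend existing indices to polytomously scored or multidimensional tests, so as to improve the applicability and power of CPA.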

  • A method of Q-matrix validation for polytomous response cognitive diagnosis model based on relative fit statistics

    Subjects: Psychology >> Psychological Measurement submitted time 2019-09-16

    Abstract: Cognitive diagnostic assessments (CDAs) can provide fine-grained diagnostic information about students’ knowledge states and thereby help teachers teach in accordance with students’ aptitude. The development of cognitive diagnosis models (CDMs) for polytomous response data expands the application scope of CDAs. As the basis of CDAs, the Q-matrix has attracted increasing attention because its construction, typically performed by domain experts, is subjective. This subjectivity inevitably leads to some misspecifications in the Q-matrix, which, if left unchecked, can have a serious negative impact on CDAs. To reduce expert subjectivity and improve the correctness of the Q-matrix, several objective Q-matrix validation methods have been proposed. Many of them target dichotomous CDMs, however, and research on Q-matrix validation under polytomous CDMs is still lacking. To address this concern, this research applied several relative fit statistics (i.e., -2LL, AIC, BIC) to Q-matrix validation for polytomous CDMs. The validation process is as follows: First, the reduced Q-matrix is taken as the set of potential q-vectors; it contains the possible q-vectors when attributes are independent. When validating the q-vector of the first category of item j, each possible q-vector in the reduced Q-matrix is used in turn as the q-vector of that category, while the Q-matrix of the remaining items is left intact. For each candidate, the item parameters and the students’ attribute patterns are estimated, and -2LL, AIC, and BIC are calculated accordingly. The candidate with the largest likelihood (or the smallest AIC/BIC) is taken as the q-vector of the first category of item j. The q-vector of the next category of item j is obtained in the same way.
    The algorithm stops when the validated Q-matrix is the same as the previous Q-matrix, or when every item has been visited. To improve the efficiency of the method, a sequential search algorithm was proposed. Several simulation studies were conducted to evaluate the effectiveness and practicality of these methods, and their performance was compared with that of the stepwise method (Ma & de la Torre, 2019). Three experimental factors were considered in the simulation studies: sample size, Q-matrix error type, and CDM. The results show that (1) the BIC method can be used for Q-matrix validation under polytomous response CDMs, and it performs better than the stepwise method. (2) In general, the three methods rank from best to worst as the BIC method, the AIC method, and the -2LL method. (3) The performance of the Q-matrix validation methods is affected by sample size, and increasing the sample size improves the accuracy of Q-matrix validation. In sum, this study examined Q-matrix validation methods for polytomous response CDMs and found that the BIC method can be used for this purpose. The proposed method can not only improve the accuracy of Q-matrix specification but also increase the level of model-data fit. In addition, the data-based Q-matrix validation method can reduce the workload of experts in Q-matrix construction and improve the classification accuracy of cognitive diagnosis.
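
    The comparison of candidate q-vectors by relative fit can be sketched in a few lines of Python. The sketch assumes that the -2 log-likelihood and the number of free parameters for each candidate have already been obtained by refitting the model, which is the expensive part and is not shown; the function names and all numeric values are hypothetical.

```python
import math


def fit_statistics(neg2ll, n_params, n_obs):
    """Relative fit statistics computed from a fitted model's -2 log-likelihood."""
    return {
        "-2LL": neg2ll,
        "AIC": neg2ll + 2 * n_params,
        "BIC": neg2ll + n_params * math.log(n_obs),
    }


def select_q_vector(candidates, n_obs, criterion="BIC"):
    """Return the candidate q-vector with the smallest value of the criterion.

    candidates : dict mapping q-vector -> (neg2ll, n_params) after refitting
    n_obs      : number of examinees
    criterion  : "-2LL", "AIC", or "BIC"
    """
    return min(
        candidates,
        key=lambda q: fit_statistics(*candidates[q], n_obs)[criterion],
    )


if __name__ == "__main__":
    # Hypothetical fits for one item category with two candidate q-vectors.
    fits = {(1, 0): (1000.0, 2), (1, 1): (995.0, 4)}
    for name in ("-2LL", "AIC", "BIC"):
        print(name, select_q_vector(fits, n_obs=500, criterion=name))
```

    With these made-up numbers, -2LL and AIC favour the larger q-vector while BIC, whose penalty grows with the sample size, favours the simpler one; this only illustrates how the three criteria can disagree and says nothing about which is more accurate.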
