ChinaXiv.org, the Chinese Academy of Sciences preprint platform for scientific papers

3 results in total.

Search conditions: 田雪涛 (Tian Xuetao)

1. ChinaXiv:202312.00248

Automated Scoring of Open-ended Situational Judgment Tests

Subjects: Psychology >> Psychological Measurement; submitted 2023-12-21

Xu Jing, Luo Fang, Ma Yanzhen, Hu Luming, Tian Xuetao

Abstract: Situational Judgment Tests (SJTs) have gained popularity for their distinctive test content and high face validity. However, traditional SJT formats, particularly those using multiple-choice (MC) options, have been criticized for their susceptibility to test-taking strategies. Open-ended and constructed-response (CR) formats offer a promising way to address this issue, but their wider adoption is hindered mainly by the cost of manual scoring. In response to this challenge, we propose an open-ended SJT with a written constructed-response format for assessing teacher competency. This study established a scoring framework that uses natural language processing (NLP) to automate the scoring of response texts, and then evaluated the validity of the system. The study constructed a teacher competency model with four dimensions: student-oriented, problem-solving, emotional intelligence, and achievement motivation. An open-ended situational judgment test was then developed to measure teachers' ability to handle typical teaching dilemmas. Responses were collected from 627 primary and secondary school teachers, and 6,000 response texts from 300 participants were scored manually against predefined criteria. To automate scoring, supervised learning was used to classify responses at both the document level and the sentence level. Several deep learning models, including the convolutional neural network (CNN), recurrent neural network (RNN), long short-term memory (LSTM), C-LSTM, RNN+attention, and LSTM+attention, were implemented and compared in terms of the agreement between human and machine scoring. The validity of the automatic scoring was also verified.
The open-ended situational judgment test had a Cronbach's alpha coefficient of 0.91 and showed good fit in a confirmatory factor analysis conducted in Mplus. Criterion-related validity was supported by significant correlations between test results and several aspects of teaching, including instructional design, classroom evaluation, homework design, job satisfaction, and teaching philosophy. Among the machine scoring models evaluated, the CNN performed best, with scoring accuracy ranging from 70% to 88% and high consistency with expert scores (r = 0.95; quadratic weighted kappa, QWK = 0.82). The correlation coefficients between human and computer ratings on the four dimensions (student-oriented, problem-solving, emotional intelligence, and achievement motivation) were approximately 0.9. The model also predicted scores accurately on new text datasets, which is evidence that it generalizes well.
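As an illustration of the reliability statistic reported above, the following is a minimal sketch of Cronbach's alpha in plain Python. It is not the authors' code: the function name `cronbach_alpha` and the respondents-by-items data layout are assumptions made for this example, and sample variances are used throughout.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items score table.

    item_scores: list of rows, one per respondent, each row holding
    that respondent's score on each of the k items (assumed layout).
    """
    if len(item_scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    # alpha = k / (k - 1) * (1 - sum of item variances / total variance)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1 when items rise and fall together across respondents, which is what a value such as 0.91 indicates.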
This study explored automated scoring for open-ended situational judgment tests using rigorous psychometric methods. To establish validity, it focused on one application: the evaluation of teacher competency traits. Fine-grained scoring guidelines were formulated, and state-of-the-art NLP techniques were used for text feature recognition and classification. The main findings are as follows: (1) open-ended SJTs can support precise scoring criteria grounded in key behavioral response elements; (2) sentence-level text classification outperforms document-level classification, with CNNs classifying responses most accurately; and (3) the scoring model performs robustly and agrees closely with human scoring, which suggests that it could partially replace manual scoring.
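The human-machine agreement statistics named in this abstract (Pearson r and QWK) can be computed as in the sketch below. This is an illustrative implementation under stated assumptions, not the authors' pipeline: scores are assumed to be integers on a known scale, and the function names `quadratic_weighted_kappa` and `pearson_r` are chosen for this example.

```python
import math
from collections import Counter


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Quadratic weighted kappa between two lists of integer scores."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    k = max_score - min_score + 1
    if k < 2:
        raise ValueError("the score scale needs at least two categories")

    n = len(human)
    observed = Counter(zip(human, machine))  # joint counts of (human, machine)
    human_hist = Counter(human)              # marginal counts per rater
    machine_hist = Counter(machine)

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[(i, j)]
            den += weight * human_hist[i] * machine_hist[j] / n
    if den == 0:
        # Both raters gave one identical constant score: perfect agreement.
        return 1.0
    return 1.0 - num / den


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

Unlike raw accuracy, QWK penalizes a disagreement by the squared distance between the two scores, so a machine score that misses by one point costs less than one that misses by three.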

Hits: 618; Downloads: 150
2. ChinaXiv:202303.08692

Predicting Shyness Traits in Elementary School Students and Constructing a Language Style Model

Subjects: Psychology >> Social Psychology; submitted 2023-03-27; cooperating journal: Acta Psychologica Sinica (《心理学报》)

Luo Fang, Jiang Liming, Tian Xuetao, Xiao Mengge, Ma Yanzhen, Zhang Sheng

Abstract: The present study explored a new method of measuring shyness based on the online writing texts of 1306 elementary school students. A supervised learning method was used to map students' labels (derived from their scale results) to their text features (extracted from online writing texts with a psychological dictionary). Key feature sets were built for the different dimensions of shyness, and a machine learning model was constructed on the selected features to predict shyness automatically. The labels were obtained from the "National School Children Shyness Scale", which the elementary students completed online. The scale covers three dimensions of shyness: shy behavior, shy cognition, and shy emotion. For each dimension, students with a Z-score above 1 were labeled as shy and the rest as normal. Students' online writing texts were collected from "TeachGrid" (https://www.jiaokee.com/), an online learning platform where students write texts. The dictionary used in the present study was Textmind, a widely used Chinese psychological dictionary developed on the basis of Linguistic Inquiry and Word Count (LIWC). Because Textmind was compiled mainly from adult corpora, we modified the original dictionary to ensure the validity of the extracted features, expanding its categories and vocabulary with the real writing texts of elementary students. The revised dictionary contained 118 categories, and features were extracted with it. A chi-square algorithm was applied to identify the features that best distinguish the shy and normal groups. The three resulting sets of key features confirmed significant lexical differences between shy and normal individuals.
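The labeling rule and the chi-square feature screening described above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes the population standard deviation for the Z-score, binarizes each dictionary category to present or absent in a student's text before computing a 2x2 chi-square statistic, and uses hypothetical function and category names.

```python
from statistics import mean, pstdev


def label_by_z_score(scores, threshold=1.0):
    """Label a student 1 (shy) if the Z-score exceeds the threshold, else 0."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population SD; an assumption of this sketch
    if sigma == 0:
        return [0] * len(scores)
    return [1 if (s - mu) / sigma > threshold else 0 for s in scores]


def chi_square_2x2(feature_present, labels):
    """Chi-square statistic for a binary feature against a binary label."""
    a = b = c = d = 0
    for present, label in zip(feature_present, labels):
        if present and label:
            a += 1  # feature present, shy
        elif present:
            b += 1  # feature present, normal
        elif label:
            c += 1  # feature absent, shy
        else:
            d += 1  # feature absent, normal
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant feature or label carries no information
    return n * (a * d - b * c) ** 2 / denom


def rank_features(feature_table, labels):
    """Return feature names sorted from most to least discriminative."""
    scores = {name: chi_square_2x2(column, labels)
              for name, column in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `rank_features({"category_a": [1, 1, 0, 0], "category_b": [1, 0, 1, 0]}, [1, 1, 0, 0])` places the hypothetical `category_a` first, because its presence coincides with the shy label while `category_b` is independent of it.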
Some of the selected features were shared across dimensions, reflecting textual expression common to shy individuals (e.g., shy individuals used fewer words per sentence and fewer social words than their normal counterparts), whereas others reflected characteristics unique to one dimension (e.g., perception words predicted shy behavior, suggesting that individuals high in shy behavior frequently felt they were being watched). Based on the selected features, Python 3.6.2 was used to construct six prediction models: Decision Tree, Random Forest, Support Vector Machine, Logistic Regression, K-Nearest Neighbor, and Multilayer Perceptron. Overall, Random Forest achieved the best results in the present study. The F1 scores were 0.582, 0.552, and 0.545 for shy behavior, shy cognition, and shy emotion, respectively, showing that it is feasible to predict the shyness characteristics of elementary school students automatically from their written language. Applying word embeddings and deep learning models could further improve the prediction.
Keywords: shyness, online writing, psychological dictionary, text mining, language style model
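For reference, an F1 score of the kind reported in this abstract can be computed as in the minimal sketch below. The abstract does not say whether F1 was computed on the shy class or averaged across classes, so treating the shy class (label 1) as the positive class is an assumption here, and `f1_score_binary` is a name chosen for this example.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because only students with a Z-score above 1 are labeled shy, the classes are imbalanced, and F1 on the minority class is more informative than plain accuracy in that setting.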

Hits: 215; Downloads: 98
3. ChinaXiv:202108.00005

A New Type of Mental Health Assessment Using Artificial Intelligence Technique

Subjects: Psychology >> Psychological Measurement; submitted 2021-08-06

Jiang Liming, Tian Xuetao, Ren Ping, Luo Fang

Abstract: The rapid development and application of artificial intelligence technology has made mental health assessment increasingly intelligent. Intelligent assessment could address the shortcomings of traditional mental health assessment methods, reduce the rate of misdiagnosis, and improve diagnostic efficiency, which is critical to large-scale screening and early warning of mental health problems. Intelligent mental health assessment is currently at an early stage of development. Existing studies have been mainly data-driven: researchers use online behavioral data and data from portable devices, aiming for higher prediction accuracy. However, the interpretability of the assessment results is not yet ideal. To address these problems, more emphasis should be placed on knowledge and experience from the field of psychology, which would make the research more targeted, refined, reliable, and valid. These are essential directions for the further development and application of intelligent mental health assessment.

Hits: 3318; Downloads: 1020
