Performance of medical students on a virtual reality simulator for knee arthroscopy: an analysis of learning curves and predictors of performance
BMC Surgery volume 16, Article number: 14 (2016)
Ethical concerns about surgical training on patients, limited working hours with fewer cases per trainee and the potential to better select talented persons for arthroscopic surgery have raised interest in simulator training for arthroscopic surgery. The purpose of this study was to analyze the learning curves of novices using a knee arthroscopy simulator and to correlate their performance with potentially predictive factors.
Twenty medical students completed visuospatial tests and were then subjected to a simulator training program of eight 30 min sessions. Their test results were quantitatively correlated with their simulator performance at initiation, during and at the end of the program.
The mean arthroscopic performance scores (z-score in points) at the eight test sessions were 1. -35 (range, -126 to -5), 2. -16 (range, -30 to -2), 3. -11 (range, -35 to 4), 4. -3 (range, -16 to 5), 5. -2 (range, -28 to 7), 6. 1 (range, -18 to 8), 7. 2 (range, -9 to 8), 8. 2 (range, -4 to 7). Scores improved significantly from sessions 1 to 2 (p = 0.001) and 3 to 4 (p = 0.001), with a borderline improvement from sessions 2 to 3 (p = 0.052), but not thereafter. None of the investigated parameters predicted performance or development of arthroscopic performance.
Novices improve significantly within four 30 min virtual knee arthroscopy simulator training sessions but not thereafter within the setting studied. No factors predicting talent or the speed and magnitude of skill improvement could be identified.
Simulator training is gaining popularity and there are many reasons for the rising interest in its role in orthopaedic resident selection and education. The restriction of residents' work hours directly correlates with fewer hours in the operating room observing and performing operations under supervision, to the extent that residents do not feel prepared for this technically difficult field after regular residency training. Arthroscopy simulators have been built for nearly every joint, and it has been shown that training on simulators improves simulator skills, but a clear correlation between simulator training and improved arthroscopic skills in the operating room has not yet been fully confirmed [3–6]. To date, transfer validity has been shown in only one study. The role of factors potentially facilitating surgical skill development, such as parental job, experience with computer games, or talents such as spatial sense or manual skill, has not been assessed.
For the selection of a future arthroscopic surgeon it would be helpful to know whether factors outside the operating room, or even outside arthroscopic simulation, could identify a certain potential for becoming a skilled arthroscopic surgeon. Such knowledge could be pathbreaking for both the educator and the candidate when selecting a specialty. In non-orthopaedic surgical fields, studies have tested the aptitude of future surgeons with psychomotor testing. This approach seems useful but remains debatable, even though baseline visuospatial abilities have been shown to correlate with operative skill [8–14].
So far it is not known whether visuospatial and/or other factors are predictive of arthroscopic simulator performance. We therefore designed a prospective study to analyze the learning curve and to test the hypothesis that baseline factors and measures of visuospatial, motor and mathematical abilities of medical students could predict knee simulator performance and its progression, using a validated reality-based knee arthroscopy simulator.
Prior to participation in the study, all participants gave their oral and written informed consent. The IRB waived the need for ethical approval (Kanton Zürich, Kantonale Ethikkommission, Nr. 10-2016). Twenty medical students (11 females and 9 males; 5 left-handed and 15 right-handed), recruited from the University of Zurich, with a mean age of 23 years (range 19 to 34) and between the first and the fifth year of medical school, volunteered to participate in this study.
Arthroscopic simulator course protocol
The system consists of a passive haptic knee arthroscopy simulator with a plastic mock-up of a right knee and predefined lateral and medial portals. During all exercises the participants were accompanied and supervised by one of the co-authors. All participants received an identical standardized introduction to the simulator and were taught how to handle the 30 degree arthroscope and the tools (standard punch and shaver to perform partial meniscectomy) that had to be used during the exercises. In addition, all participants were allowed to familiarize themselves with the simulator for exactly 3 min, using the camera through the lateral portal and instruments through the medial portal. The knee simulator used in this study has been described previously, and its face and construct validity have been established and reported.
During weeks 0 to 4, all participants completed a standardized simulator training comprising eight sessions of six exercises each (30 min per session).
Three identical exercises had to be performed in all eight sessions to obtain a continuous record for every participant. The first exercise tested triangulation, the second tested the handling of a partial meniscectomy with the punch and/or the shaver, and the third tested the removal of foreign bodies (stars). Additionally, at each session three new tasks of increasing difficulty were introduced to evaluate the learning curve when adapting to a new situation. These three new tasks again consisted of triangulation, partial meniscectomy and removal of foreign bodies in different variations. There was a time limit of five minutes for every exercise.
After all of the above exercises had been performed, the metrics from the simulator (operation time in seconds and overall distance in centimetres travelled by the tools and camera inside the knee simulator during a task) were extracted from the computer. Furthermore, the number of foreign bodies that had been removed was counted.
Each practice session was completed except for participant 12 who did not finish his last practice session. Therefore score 8 of participant 12 is missing.
Fig. 1 depicts the principles of our study methods. In week 0, all participants completed a questionnaire and underwent tests measuring manual skill, spatial sense and mathematical skills.
Demographic data and questionnaire
The questionnaire consisted of questions regarding demographic data (age, years in medical school, gender, handedness), sports activities, playing an instrument, playing videogames, operating room experience, the parents' (medical) profession and the planned future medical (surgical) specialty (Table 1).
Test A: Manual skill
The ESCAPA manual skill test (Reference: http://cisa85.de/tmp/MultitaskingTest.htm) is used by the US Air Force in the evaluation of pilots. It tests fine motor manual skill at the computer with the practiced hand using a computer mouse.
ESCAPA is played in a small white box containing five rectangles: four move automatically, and one, a square, can be moved by the player. The object of the game is to navigate the square by dragging it with the mouse within the small white playing field for as long as possible, avoiding both the randomly moving blocks and the surrounding border. The time achieved was recorded in seconds.
Test B: Spatial sense
This included a test that is used in the acceptance test of all Swiss medical schools (Reference: http://www.ztd.ch/w/index.php?title=EMS#Berufseignung_und_Studieneignung).
It tests spatial sense and mental rotation and was proven to correlate with passing the complete medical school exams and achieving the Swiss medical degree according to the Swiss medical board. The test included five different three-dimensionally arranged hose figures, which had to be mentally rotated correctly. Twenty-four of those exercises had to be completed in 12 min (Fig. 2).
Test C: Mathematics
In this test, a two-dimensional representation of a three-dimensional object is given. The task is to figure out which of the 4 three-dimensional objects is the correct representation of the given two-dimensional map. Twelve of those exercises had to be completed in 10 min (Fig. 3). This exercise is used in IQ tests and gives an impression of the spatial sense and mental construction skill (Reference: http://www.fibonicci.com/de/raumliches-vorstellungsvermogen/test-schwierig/).
The statistical analysis and the development of the z-score were performed by an independent professional biostatistician using SPSS version 20 for Mac (SPSS, Inc., Chicago, IL). To compare the performance of the students, we used a standard score (z-score) built from the means of sessions 4 to 8 of all participants for time, time per task (item), camera and instrument distance in centimeters and the number of removed foreign bodies. In the first exercise, “triangulation”, we used the camera path length, the hook path length and time per item (subscores 1–3). In the second exercise, “partial meniscectomy”, we used camera path length, total path length for all the instruments and total time (subscores 4–6). In the third exercise, “removal of foreign bodies”, we used the camera path length, the punch path length and time per item (subscores 7–9). Therefore, every student had one z-score built out of nine subscores for each of the eight sessions. Every pre-test parameter and all demographic factors were correlated with the first and the last test performance. Furthermore, each of the eight z-scores was correlated with the final test result to analyse whether the final performance could be predicted at an earlier stage.
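The construction of such a composite score can be illustrated with a minimal Python sketch. It is not the biostatistician's SPSS procedure. It assumes that each raw metric is standardized against the pooled mean and standard deviation of the reference sessions (sessions 4 to 8), that lower raw values (shorter paths, less time) are better so the sign is flipped, and that the subscores are simply summed. The function name `composite_scores` and the toy data are hypothetical.

```python
from statistics import mean, stdev

def composite_scores(metrics, ref_sessions):
    """metrics[participant][session][k] is the raw value of metric k (lower = better).

    Returns scores[participant][session]: the sum of sign-flipped z-scores,
    each standardized against the pooled values of the reference sessions.
    """
    n_metrics = len(metrics[0][0])
    ref_mean, ref_sd = [], []
    for k in range(n_metrics):
        pooled = [participant[s][k] for participant in metrics for s in ref_sessions]
        ref_mean.append(mean(pooled))
        ref_sd.append(stdev(pooled))
    return [
        [sum(-(session[k] - ref_mean[k]) / ref_sd[k] for k in range(n_metrics))
         for session in participant]
        for participant in metrics
    ]

# Hypothetical toy data: 3 participants x 8 sessions x 2 metrics (not study data).
toy = [[[100 - 10 * s + p, 50 - 5 * s + 2 * p] for s in range(8)] for p in range(3)]

# Sessions 4 to 8 correspond to the zero-based indices 3 to 7.
scores = composite_scores(toy, ref_sessions=range(3, 8))
print([round(x, 2) for x in scores[0]])
```

Under these assumptions the composite scores of the reference sessions average zero across participants, so strongly negative values in early sessions indicate performance well below the plateau level.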
The z-scores were correlated with continuous and ordinal variables using the Spearman rank correlation test. Groups were compared with the Mann–Whitney U test. Scores at different points in time were compared using Wilcoxon’s signed-rank test. A p-value < 0.05 was considered significant.
The relevant data and the distribution of the questionnaire responses, as well as their correlation with the first score (score 1) and the final score (score 8), are summarized in Table 1.
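For readers unfamiliar with the Spearman rank correlation, the following Python sketch shows the idea: both variables are converted to ranks (tied values share the mean of their ranks) and the Pearson correlation of the ranks is computed. It is an illustration only and does not reproduce the SPSS analysis. The helper names and the example values are hypothetical, and no p-value is computed.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical example: a pretest result versus a final simulator score.
pretest = [12, 15, 9, 20, 17, 11]
final_score = [1.0, 2.5, -3.0, 6.0, 2.5, -1.0]
rho = spearman_rho(pretest, final_score)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, the coefficient is insensitive to outliers such as a single very poor first session.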
Analyzing the learning curve of the participants, we found a relevant improvement during the first four exercise sessions (score 1–2, p = 0.001; score 2–3, p = 0.052; score 3–4, p = 0.001). Thereafter the learning curve reached a plateau with only slight mean improvement of performance (score 4–5, p = 0.332; score 5–6, p = 0.057; score 6–7, p = 0.681; score 7–8, p = 0.445). Furthermore, we found decreasing variability of the participants’ performance, indicated by gradually decreasing standard deviations from 33.2 points in score 1 to 3.8 points in score 8 (Fig. 4). Table 2 in the appendix shows the complete scores of all eight tests for all 20 participants.
Only right-handedness correlated significantly with the final simulator performance (score 8) (p = 0.036). Being active in sports in the past was a very weak but still significant predictor of better performance at the very first session (score 1) (r = 0.54; p = 0.013). No other examined pretest or demographic parameter showed a correlation with the final simulator performance (Table 1).
Whereas score 4 correlated significantly with the final performance (score 8) (r = 0.71; p < 0.0001), score 1 showed no such correlation with the final performance (r = 0.02; p = 0.925).
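The session-to-session comparisons rely on a paired signed-rank test. The Python sketch below illustrates how such a test works, using the normal approximation without tie or continuity correction; SPSS may apply corrections, so p-values can differ slightly. The function name and the ten paired scores are hypothetical and are not study data.

```python
import math

def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, two-sided p) via the normal approximation.

    Simplifications: zero differences are dropped, tied absolute differences
    share their mean rank, and no tie or continuity correction is applied.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean rank of the tied run
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, w_minus, p

# Hypothetical scores of ten participants in two consecutive sessions.
session_a = [-40, -35, -60, -20, -15, -50, -30, -25, -45, -10]
session_b = [-18, -16, -30, -12, -9, -21, -17, -14, -20, -7]
w_plus, w_minus, p = wilcoxon_signed_rank(session_a, session_b)
print(w_plus, w_minus, round(p, 4))
```

In this toy example every participant improves, so all ranks fall on the positive side and the approximate p-value is small. A plateau would show up as positive and negative rank sums of similar size.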
This study shows that medical students participating in a standardized virtual-reality-based knee arthroscopy simulator training program have a significant but surprisingly steep and short learning curve. This improvement of skills was relevant until the fourth test (score 4), which represents about two hours of training, without significant further improvement afterwards. Reasons for this plateau are hypothetical, but this learning curve is consistent with other results in the current literature [3, 16, 17] and might be a result of the so far limited possibility to simulate difficult arthroscopically guided surgical interventions (e.g. meniscal repair, ACL reconstruction). Since the first score does not correlate with the last score, it would be unreasonable to grade a medical student after the first 30 min of arthroscopy simulator training, and it is notable that the participant who ultimately reached the highest performance in the final session was amongst the five weakest participants in the first session. This is in sharp contrast to the correlation of score 4 with the final score 8. The five best participants in score 4 reached the very high rankings of 3, 4, 5, 6 and 7 in the final score 8. Furthermore, the five weakest participants in score 4 reached the low rankings of 11, 15, 17, 19 and 20 in the final score 8. These results show that after a period of two hours of training a reasonably precise predictive statement can be made regarding the future performance of a participant. How a two-hour assessment could be carried out during an application process might deserve further study.
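The rank figures quoted above can be summarized with a few lines of arithmetic. The sketch below uses only the final rankings reported in the text. The chance reference of 10.5 assumes that ranks run from 1 (best) to 20 (weakest) and that score 4 carries no information about score 8; both are assumptions made for illustration.

```python
from statistics import mean

N_PARTICIPANTS = 20  # assumption: final ranks run from 1 (best) to 20 (weakest)

# Final (score 8) ranks reported in the text for two groups defined at score 4.
top5_at_score4 = [3, 4, 5, 6, 7]
bottom5_at_score4 = [11, 15, 17, 19, 20]

# Mean rank expected if the score 4 grouping told us nothing about score 8.
chance_mean_rank = (N_PARTICIPANTS + 1) / 2

top_mean = mean(top5_at_score4)
bottom_mean = mean(bottom5_at_score4)

print(f"mean final rank, best five at score 4:    {top_mean}")
print(f"mean final rank, weakest five at score 4: {bottom_mean}")
print(f"mean final rank expected by chance:       {chance_mean_rank}")
```

The best five at score 4 finish with a mean final rank of 5, the weakest five with a mean of 16.4, on either side of the chance value of 10.5, which is in line with the reported correlation between score 4 and score 8.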
Disappointingly, we were unable to identify strong and relevant factors identifying “talented” arthroscopic surgeons. The only factor that correlated significantly with the final performance was being right-handed. This finding might be biased by the fact that an anterolateral portal of a right knee was used for the arthroscope, which was therefore held in the left hand. The arthroscopic working tools were inserted through the anteromedial portal and were therefore used with the right hand. In the current version of the tested simulator only a right knee is available, even though it should be possible to change this in the future. The only other factor that showed a significant but weak correlation with the first test result was past sports activity, but it did not, after all, predict final simulator performance. Whether this result reflects that athletically active persons are more competitive and try harder at the first attempt is unknown but possible.
Our findings were in contrast to previous laparoscopic simulator studies, which were able to identify different predictive factors. Risucci et al. suggested that age, experience and visual spatial perceptual ability may play a role in determining the speed at which a surgeon can acquire and perform laparoscopic skills using a laparoscopic intracorporeal simulator. Madan et al. found that the only predicting factor that correlated with laparoscopic simulator performance was eating with chopsticks. There were, however, no visuospatial or other tests used as predicting factors in the study of Madan et al. Stefanidis et al. showed that residents training on a laparoscopic simulator performed 50 % better after a mean of 12 h and that the only test that correlated with simulator performance was the visuospatial test “card rotation”, which is similar to the visuospatial test used in the present study. Furthermore, Stefanidis et al. found that training duration and repetition correlated with prior video gaming, billiard exposure, grooved pegboard (time to place different pegs with a key with one hand in a pegboard), finger tapping with the index finger as fast as possible and map planning (a mental test for visuospatial capability and concentration).
We are aware of the potential limitations of this study. Although this study was prospectively conducted, the limited number of participants (n = 20) was possibly too small to detect subtle differences, but it did allow us to exclude substantial differences and trends in the study group that would have had an impact on the selection of trainees. Furthermore, we are aware that in arthroscopic simulator surgery there have been attempts to measure performance not only with metric parameters but also with subjective scores. However, we considered the Arthroscopic Surgery Skill Evaluation Tool (ASSET), which seems to be a valid and reliable pass-fail examination of diagnostic knee arthroscopy simulation, not precise enough to quantitatively evaluate and compare the participants' longitudinal performance. Furthermore, as this was the first study investigating predictive factors in arthroscopic simulator training, there were no established guidelines about visuospatial, manual or mathematical tests. Therefore it remains hypothetical whether the chosen tests were not appropriate to detect the expected predictive factors, or whether it is simply not possible to predict simulator performance using demographics or psychomotor tests.
Despite the above-mentioned limitations, this study revealed some interesting information about potential arthroscopy simulator training programs and their potential use in recruiting medical students into orthopaedic residency programs. Medical students without any previous arthroscopic experience improve significantly after a two-hour virtual-reality-based simulator training program of increasing difficulty levels. The level reached after this two-hour training on the arthroscopy simulator does not change relevantly with training of more than 2 h. As transfer validity of simulator training into the operating room was proven before, even though this was on another virtual-reality-based arthroscopic simulator, it seems reasonable to state that these trainees not only improve their simulator skills but should also be able to benefit from them in real arthroscopic surgeries.
There is a significant learning curve of medical students when completing a standardized training program performed on a validated virtual-reality-based knee arthroscopy simulator during the first 2 h of training. The “good” and the “less good” arthroscopic surgeon can then be evaluated and further training will not change that finding. However, we failed to identify factors that help to predict talent and the potential development of arthroscopic skills. Further studies are needed to obtain more precise information about the transfer validity to the operating room and about the amount and type of training time, in order to make a sophisticated recommendation about the potentially best educational program for future orthopaedic surgeons providing the best health care for our patients.
Zuckerman JD, Kubiak EN, Immerman I, Dicesare P. The early effects of code 405 work rules on attitudes of orthopaedic residents and attending surgeons. J Bone Joint Surg Am. 2005;87:903–8.
Hall MP, Kaplan KM, Gorczynski CT, Zuckerman JD, Rosen JE. Assessment of arthroscopic training in U.S. orthopedic surgery residency programs—a resident self-assessment. Bull NYU Hosp Jt Dis. 2010;68:5–10.
Tay C, Khajuria A, Gupte C. Simulation training: a systematic review of simulation in arthroscopy and proposal of a new competency-based training framework. Int J Surg. 2014;12:626–33.
Cannon WD, Nicandri GT, Reinig K, Mevis H, Wittstein J. Evaluation of skill level between trainees and community orthopaedic surgeons using a virtual reality arthroscopic knee simulator. J Bone Joint Surg Am. 2014;96:e57.
Atesok K, Mabrey JD, Jazrawi LM, Egol KA. Surgical simulation in orthopaedic skills training. J Am Acad Orthop Surg. 2012;20:410–22.
Frank RM, Erickson B, Frank JM, Bush-Joseph CA, Bach Jr BR, Cole BJ, Romeo AA, Provencher MT, Verma NN. Utility of modern arthroscopic simulator training models. Arthroscopy. 2014;30:121–33.
Cannon WD, Garrett WE, Hunter RE, Sweeney HJ, Eckhoff DG, Nicandri GT, Hutchinson MR, Johnson DD, Bisson LJ, Bedi A, Hill JA, Koh JL, Reinig KD. Improving residency training in arthroscopic knee surgery with use of a virtual-reality simulator. A randomized blinded study. J Bone Joint Surg Am. 2014;96:1798–806.
Gettman MT, Kondraske GV, Traxer O, Ogan K, Napper C, Jones DB, Pearle MS, Cadeddu JA. Assessment of basic human performance resources predicts operative performance of laparoscopic surgery. J Am Coll Surg. 2003;197:489–96.
Johnson DB, Kondraske GV, Wilhelm DM, Jacomides L, Ogan K, Pearle MS, Cadeddu JA. Assessment of basic human performance resources predicts the performance of virtual ureterorenoscopy. J Urol. 2004;171:80–4.
Kaufman HH, Wiegand RL, Tunick RH. Teaching surgeons to operate—principles of psychomotor skills training. Acta Neurochir (Wien). 1987;87:1–7.
Macmillan AI, Cuschieri A. Assessment of innate ability and skills for endoscopic manipulations by the Advanced Dundee Endoscopic Psychomotor Tester: predictive and concurrent validity. Am J Surg. 1999;177:274–7.
Stefanidis D, Korndorffer Jr JR, Black FW, Dunne JB, Sierra R, Touchard CL, Rice DA, Markert RJ, Kastl PR, Scott DJ. Psychomotor testing predicts rate of skill acquisition for proficiency-based laparoscopic skills training. Surgery. 2006;140:252–62.
Risucci D, Geiss A, Gellman L, Pinard B, Rosser J. Surgeon-specific factors in the acquisition of laparoscopic surgical skills. Am J Surg. 2001;181:289–93.
Keehner MM, Tendick F, Meng MV, Anwar HP, Hegarty M, Stoller ML, Duh Q-Y. Spatial ability, experience, and skill in laparoscopic surgery. Am J Surg. 2004;188:71–5.
Fucentese SF, Rahm S, Wieser K, Spillmann J, Harders M, Koch PP. Evaluation of a virtual-reality-based simulator using passive haptic feedback for knee arthroscopy. Knee Surg Sports Traumatol Arthrosc. 2015;23(4):1077-85.
Jackson WFM, Khan T, Alvand A, Al-Ali S, Gill HS, Price AJ, Rees JL. Learning and retaining simulated arthroscopic meniscal repair skills. J Bone Joint Surg Am. 2012;94:e132 1.
Gomoll AH, Pappas G, Forsythe B, Warner JJP. Individual skill progression on a virtual reality simulator for shoulder arthroscopy: a 3-year follow-up study. Am J Sports Med. 2008;36:1139–42.
Madan AK, Frantzides CT, Park WC, Tebbit CL, Kumari NVA, O’Leary PJ. Predicting baseline laparoscopic surgery skills. Surg Endosc. 2004;19:101–4.
Koehler RJ, Nicandri GT. Using the arthroscopic surgery skill evaluation tool as a pass-fail examination. J Bone Joint Surg Am. 2013;95:e187 1.
Howells NR, Gill HS, Carr AJ, Price AJ, Rees JL. Transferring simulated arthroscopic skills to the operating theatre: a randomised blinded study. J Bone Joint Surg (Br). 2008;90:494–9.
The authors declare that they have no competing interests.
All authors have made substantial contributions to this study; SR was involved in conception and design, analysis and interpretation of data and in drafting the manuscript. KW was involved in conception and design and the interpretation of data, revising the manuscript critically for important intellectual content and supervision. The first two authors contributed equally to this work. IW was involved in conception and the acquisition of data, analysis and interpretation of data. LH was involved in conception and the acquisition of data, analysis and interpretation of data. SF was involved in analysis and interpretation of data and revising the manuscript critically for important intellectual content. CG was involved in conception and design and interpretation of data and revising the manuscript critically for important intellectual content and supervision. All authors read and approved the final manuscript.
Cite this article
Rahm, S., Wieser, K., Wicki, I. et al. Performance of medical students on a virtual reality simulator for knee arthroscopy: an analysis of learning curves and predictors of performance. BMC Surg 16, 14 (2016). https://doi.org/10.1186/s12893-016-0129-2
- Learning curve
- Predictive factors
- Virtual reality
- Knee arthroscopy