Real time analysis of cataract surgery videos using deep learning

A deep learning framework is proposed that classifies surgical instruments within cataract surgery videos and generates clinical summary reports, providing objective reporting and analysis. The framework runs in real time on an embedded system, potentially allowing real-time feedback directly in a clinical environment. To evaluate the framework, a novel dataset of 39 cataract surgery videos is sampled at different frame rates and labelled with 10 surgical instrument classes. This dataset is used to train and evaluate several deep learning image classification architectures: ResNet-50, ResNet-152 and InceptionV3. An optimised variation of the framework is run on an Nvidia Jetson Nano 4GB embedded system to evaluate the accuracy and representativeness of the clinical summary reports it produces. The deep learning models achieved a maximum mean predictive AUC score of 0.982 and a sensitivity of 0.968 on held-out test videos. The Jetson Nano was shown to produce accurate and representative clinical summary reports at a video frame rate of 6 frames per second, completing in real time. These results show that the deep learning framework is viable for integration into the clinical environment, automatically providing clinical summary reports containing objective assessment statistics, video annotations and graphical timeline visualisations of cataract removal surgeries.
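As a rough illustration of how per-frame instrument predictions could be turned into the timeline part of a summary report, the following minimal Python sketch collapses a sequence of frame labels into timed segments and per-instrument totals. It is not the thesis implementation: the function name `summarise_timeline`, the instrument names, and the assumption of exactly one predicted label per frame are all hypothetical, and only the 6 frames per second rate is taken from the abstract.

```python
from itertools import groupby


def summarise_timeline(frame_labels, fps=6.0):
    """Collapse per-frame labels into timed segments and per-label totals.

    frame_labels: one predicted class name per sampled frame (assumption:
    a single label per frame; the real framework may differ).
    fps: sampling rate of the frames in frames per second.
    Returns (segments, totals), where segments is a list of
    (label, start_seconds, end_seconds) and totals maps label -> seconds.
    """
    segments = []
    totals = {}
    index = 0  # index of the first frame in the current run
    for label, run in groupby(frame_labels):
        length = len(list(run))
        start = index / fps
        end = (index + length) / fps
        segments.append((label, start, end))
        totals[label] = totals.get(label, 0.0) + (end - start)
        index += length
    return segments, totals


# Hypothetical predictions for a 6-second clip sampled at 6 frames per second.
# The class names are placeholders, not the dataset's actual labels.
labels = ["forceps"] * 12 + ["phaco_handpiece"] * 18 + ["forceps"] * 6
segments, totals = summarise_timeline(labels, fps=6.0)

for label, start, end in segments:
    print(f"{start:5.1f}s to {end:5.1f}s  {label}")
```

Segments of this kind could feed a graphical timeline, while the totals give simple per-instrument usage statistics; a practical system would probably also smooth out isolated single-frame misclassifications before summarising.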
Type of thesis
The University of Waikato
All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated.