Permanent link to Research Commons version: https://hdl.handle.net/10289/16371
Over the last decade, several machine learning (ML) based data-driven approaches have been used for electromyography (EMG) based control of prosthetic hands. However, the performance of EMG-based frameworks can be affected by: i) the onset of fatigue due to long data collection sessions, ii) musculoskeletal differences between individuals, and iii) sensor position drift between sessions with the same user. To evaluate these aspects, we compare the performance of EMG-based hand gesture decoding models developed using three approaches. This comparison allows future work on EMG-based Human-Machine Interfaces to make more informed ML design decisions. First, we trained from scratch a Transformer-based architecture called the Temporal Multi-Channel Vision Transformer (TMC-ViT). Second, we used a pre-trained and fine-tuned TMC-ViT model (a transfer learning approach). Third, we developed a Prototypical Network (a few-shot learning approach). The models are trained in both a subject-specific and a subject-generic manner for eight subjects and validated using 10-fold cross-validation. This study shows that training a deep learning decoding model from scratch in a subject-specific manner leads to higher decoding accuracies when a larger dataset is available. For smaller datasets, subject-generic models, or inter-session models, the few-shot learning approach produces more robust results with better performance, and is better suited to applications where long data collection sessions are not possible or where the interface is intended for multiple users. Our findings show that the few-shot learning approach can outperform training a model from scratch in these scenarios.
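For readers unfamiliar with Prototypical Networks, the following is a minimal sketch of the nearest-prototype decision rule that such few-shot models generally rely on. It is not the authors' implementation. It assumes the EMG windows have already been mapped to embedding vectors by some learned encoder, which is omitted here. The function names (`class_prototypes`, `squared_distance`, `classify`), the gesture labels, and the toy 2-D embeddings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import fsum


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    # zip(*vecs) iterates over embedding dimensions, giving a per-dimension mean.
    return {
        label: [fsum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def classify(query_embedding, prototypes):
    """Return the label whose prototype is nearest to the query embedding."""
    return min(
        prototypes,
        key=lambda label: squared_distance(query_embedding, prototypes[label]),
    )


# Toy data (assumed, not from the study): two support examples per gesture.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["rest", "rest", "fist", "fist"]

prototypes = class_prototypes(support, labels)
prediction = classify([0.9, 0.9], prototypes)
print(prediction)
```

Because a new class or a new user only requires computing prototypes from a handful of labelled examples, this rule illustrates why a few-shot approach can suit settings where long data collection sessions are impractical.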
Institute of Electrical and Electronics Engineers
© 2023 The Authors. This work is licensed under a CC BY 4.0 licence.