Research Commons

      An exploration of using the Intel AVX2 gather load instructions for vectorised image processing

      Cree, Michael J.
      Files
      exploration-using-cree.pdf
      Accepted version, 228.3Kb
      DOI
       10.1109/IVCNZ.2018.8634707
      Citation
      Cree, M. J. (2018). An exploration of using the Intel AVX2 gather load instructions for vectorised image processing. Presented at the International Conference on Image and Vision Computing New Zealand (IVCNZ), Auckland, New Zealand: IEEE. https://doi.org/10.1109/IVCNZ.2018.8634707
      Permanent Research Commons link: https://hdl.handle.net/10289/12429
      Abstract
      Processing image data with single-instruction multiple-data (SIMD) CPU instructions provides a means of vectorising standard image processing operators and thus speeding up their execution. SIMD register loads normally read from consecutive locations in memory, that is, consecutive pixels in a row of the image. For some algorithms, however, data dependencies between pixels along rows render SIMD vectorisation useless. If pixels could be loaded efficiently from columns of the image, this problem would be solved. The Intel AVX2 CPU extension introduces instructions for gather loading data from multiple memory locations into a single SIMD register. We explore using these instructions for column loads of image data in two common image operations, image transposition and mean filtering, to test 1) whether they provide useful speed-ups when other vectorised approaches exist (we find that they do not), and 2) whether they provide a means of implementing operations that would otherwise be difficult or extremely inefficient to achieve without a column load (we find that they can provide speed-ups over scalar code).
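      To make the idea of a column load concrete, the following is a minimal sketch in C of gathering eight vertically adjacent 32-bit pixels from a row-major image, and of a transpose built on top of it. It is not the paper's implementation. The function names load_column8 and transpose_gather, the parameter names, and the 32-bit pixel type are assumptions made for illustration. The intrinsic _mm256_i32gather_epi32 is the AVX2 gather load; a scalar fallback with the same result is used when the code is compiled without AVX2 support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Load 8 vertically adjacent 32-bit pixels, starting at (row, col), from a
   row-major image whose rows are `stride` pixels long, into out[0..7].
   Hypothetical helper: the names and pixel type are illustrative only. */
static void load_column8(const int32_t *img, int stride, int row, int col,
                         int32_t out[8])
{
    assert(stride > 0);
    const int32_t *base = img + (size_t)row * (size_t)stride + (size_t)col;
#ifdef __AVX2__
    /* Element offsets 0, stride, 2*stride, ..., 7*stride from base.
       Assumes 7*stride fits in a signed 32-bit index. */
    __m256i idx = _mm256_mullo_epi32(_mm256_set1_epi32(stride),
                                     _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7));
    /* Scale of 4: each index is multiplied by 4 bytes per 32-bit element. */
    __m256i v = _mm256_i32gather_epi32((const int *)base, idx, 4);
    _mm256_storeu_si256((__m256i *)out, v);
#else
    /* Scalar fallback producing the same result without AVX2. */
    for (int i = 0; i < 8; i++)
        out[i] = base[(size_t)i * (size_t)stride];
#endif
}

/* Transpose a rows x cols image into a cols x rows image by gathering
   8-pixel column segments and storing each as a contiguous run of an
   output row. Assumes rows is a multiple of 8. */
static void transpose_gather(const int32_t *src, int rows, int cols,
                             int32_t *dst)
{
    assert(rows % 8 == 0);
    for (int c = 0; c < cols; c++)
        for (int r0 = 0; r0 < rows; r0 += 8)
            load_column8(src, cols, r0, c, dst + (size_t)c * (size_t)rows + r0);
}
```

      Compiling with a flag such as -mavx2 (GCC or Clang) selects the gather path. As the abstract notes, gather loads did not give useful speed-ups where other vectorised approaches exist, so a transpose like this one illustrates the addressing pattern and is not a recommended fast path.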
      Date
      2018
      Type
      Conference Contribution
      Publisher
      IEEE
      Rights
      This is an author’s accepted version of an article published in the Proceedings of International Conference on Image and Vision Computing New Zealand (IVCNZ). © 2018 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
      Collections
      • Science and Engineering Papers

      The University of Waikato - Te Whare Wānanga o Waikato