Bandwidth-optimal random shuffling for GPUs

dc.contributor.authorMitchell, Roryen_NZ
dc.contributor.authorStokes, Danielen_NZ
dc.contributor.authorFrank, Eibeen_NZ
dc.contributor.authorHolmes, Geoffreyen_NZ
dc.date.accessioned2024-01-17T02:53:12Z
dc.date.available2024-01-17T02:53:12Z
dc.date.issued2022en_NZ
dc.description.abstractLinear-time algorithms that are traditionally used to shuffle data on CPUs, such as the method of Fisher-Yates, are not well suited to implementation on GPUs due to inherent sequential dependencies, and existing parallel shuffling algorithms are unsuitable for GPU architectures because they incur a large number of read/write operations to high latency global memory. To address this, we provide a method of generating pseudo-random permutations in parallel by fusing suitable pseudo-random bijective functions with stream compaction operations. Our algorithm, termed “bijective shuffle” trades increased per-thread arithmetic operations for reduced global memory transactions. It is work-efficient, deterministic, and only requires a single global memory read and write per shuffle input, thus maximising use of global memory bandwidth. To empirically demonstrate the correctness of the algorithm, we develop a statistical test for the quality of pseudo-random permutations based on kernel space embeddings. Experimental results show that the bijective shuffle algorithm outperforms competing algorithms on GPUs, showing improvements of between one and two orders of magnitude and approaching peak device bandwidth.en_NZ
dc.format.mimetypeapplication/pdf
dc.identifier.doi10.1145/3505287en_NZ
dc.identifier.issn2329-4949en_NZ
dc.identifier.urihttps://hdl.handle.net/10289/16345
dc.language.isoen
dc.publisherAssociation for Computing Machinery (ACM)en_NZ
dc.relation.isPartOfACM Transactions on Parallel Computingen_NZ
dc.rightsThis is an author’s accepted version of an article published in ACM Transactions on Parallel Computing. © 2022 ACM.
dc.subjectcomputer scienceen_NZ
dc.subjectshufflingen_NZ
dc.subjectGPUen_NZ
dc.titleBandwidth-optimal random shuffling for GPUsen_NZ
dc.typeJournal Article
dspace.entity.typePublication
pubs.issue1en_NZ
pubs.volume9en_NZ

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Bandwidth_Optimal_Random_Shuffling_for_GPUs.pdf
Size:
5.09 MB
Format:
Adobe Portable Document Format
Description:
Accepted version

License bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Research Commons Deposit Agreement 2017.pdf
Size:
188.11 KB
Format:
Adobe Portable Document Format
Description: