A domain-extensible compiler with controllable automation of optimisations

Kœhler, Thomas (2022) A domain-extensible compiler with controllable automation of optimisations. PhD thesis, University of Glasgow.

Full text available as:
[thumbnail of 2022KoehlerPhD (2).pdf] PDF
Download (1MB)

Abstract

In high performance domains like image processing, physics simulation or machine learning, program performance is critical. Programmers called performance engineers are responsible for the challenging task of optimising programs. Two major challenges prevent modern compilers targeting heterogeneous architectures from reliably automating optimisation. First, domain specific compilers such as Halide for image processing and TVM for machine learning are difficult to extend with the new optimisations required by new algorithms and hardware. Second, automatic optimisation is often unable to achieve the required performance, and performance engineers often fall back to painstaking manual optimisation.
This thesis shows the potential of the Shine compiler to achieve domain-extensibility, controllable automation, and generate high performance code. Domain-extensibility facilitates adapting compilers to new algorithms and hardware. Controllable automation enables performance engineers to gradually take control of the optimisation process. The first research contribution is to add 3 code generation features to Shine, namely: synchronisation barrier insertion, kernel execution, and storage folding. Adding these features requires making novel design choices in terms of compiler extensibility and controllability. The rest of this thesis builds on these features to generate code with competitive runtime compared to established domain-specific compilers. The second research contribution is to demonstrate how extensibility and controllability are exploited to optimise a standard image processing pipeline for corner detection. Shine achieves 6 well-known image processing optimisations, 2 of them not being supported by Halide. Our results on 4 ARM multi-core CPUs show that the code generated by Shine for corner detection runs up to 1.4× faster than the Halide code. However, we observe that controlling rewriting is tedious, motivating the need for more automation.
The final research contribution is to introduce sketch-guided equality saturation, a semiautomated technique that allows performance engineers to guide program rewriting by specifying rewrite goals as sketches: program patterns that leave details unspecified. We evaluate this approach by applying 7 realistic optimisations of matrix multiplication. Without guidance, the compiler fails to apply the 5 most complex optimisations even given an hour and 60GB of RAM. With the guidance of at most 3 sketch guides, each 10 times smaller than the complete program, the compiler applies the optimisations in seconds using less than 1GB.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Trinder, Professor Phil and Steuwer, Dr. Michel
Date of Award: 2022
Depositing User: Theses Team
Unique ID: glathesis:2022-83323
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 22 Dec 2022 09:25
Last Modified: 22 Dec 2022 09:28
Thesis DOI: 10.5525/gla.thesis.83323
URI: https://theses.gla.ac.uk/id/eprint/83323
Related URLs:

Actions (login required)

View Item View Item

Downloads

Downloads per month over past year