An information theoretic approach to the expressiveness of programming languages

Davidson, Joseph Ray (2016) An information theoretic approach to the expressiveness of programming languages. PhD thesis, University of Glasgow.

Printed Thesis Information: https://eleanor.lib.gla.ac.uk/record=b3153150

Abstract

The conciseness conjecture is a longstanding notion in computer science that programming languages with more built-in operators, that is, more expressive languages with larger semantics, produce smaller programs on average. Chaitin defines the related concept of an elegant program: a program for which no smaller program in the same language produces the same output when run.

This thesis investigates the conciseness conjecture in an empirical manner. Influenced by the concept of elegant programs, we investigate several models of computation, and implement a set of functions in each programming model. The programming models are Turing Machines, λ-Calculus, SKI, RASP, RASP2, and RASP3. The information content of the programs and models is measured in characters. They are compared to investigate hypotheses relating to how the mean program size changes as the size of the semantics changes, and how the relationship of mean program sizes between two models compares to that between the sizes of their semantics.

We show that the amount of information present in models of the same paradigm, or model family, is a good indication of relative expressivity and average program size. Models that contain more information in their semantics have smaller average programs for the set of tested functions. In contrast, the relative expressiveness of models from differing paradigms is not indicated by their relative information contents.

RASP and Turing Machines have been implemented as Field Programmable Gate Array (FPGA) circuits to investigate hardware analogues of the hypotheses above: namely, that the amount of information in the semantics for a model directly influences the size of the corresponding circuit, and that the relationship of mean circuit sizes between models is comparable to the relationship of mean program sizes.

We show that the number of components in the circuits that realise the semantics and programs of the models correlates with the information required to implement the semantics and program of a model. However, the number of components to implement a program in a circuit for one model does not relate to the number of components implementing the same program in another model. This is in contrast to the more abstract implementations of the programs.

Information is a computational resource and therefore follows the rules of Blum’s axioms. These axioms and the speedup theorem are used to obtain an alternate proof of the undecidability of elegance.

This work is a step towards unifying the formal notion of expressiveness with the notion of algorithmic information theory and exposes a number of interesting research directions. A start has been made on integrating the results of the thesis with the formal framework for the expressiveness of programming languages.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Expressiveness, algorithmic information theory, field programmable gate arrays, Turing machines, combinatory logic, λ-calculus, random access stored program, universal machines, semantics.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Trinder, Prof. Phil and Michaelson, Prof. Greg
Date of Award: 2016
Depositing User: Mr Joseph Davidson
Unique ID: glathesis:2016-7200
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 04 Apr 2016 07:43
Last Modified: 19 May 2016 10:14
URI: https://theses.gla.ac.uk/id/eprint/7200
