Causal inferential dynamic network analysis

Martínez Bustos, Sebastián (2023) Causal inferential dynamic network analysis. PhD thesis, University of Glasgow.

In this dissertation I present developments in statistical methodologies that deal with interdependent data, i.e. data in which the units of observation are connected to each other resulting in a network of interdependence between them. Data considered interdependent poses a challenge to traditional statistical methodologies that assume units of observation to be independent and identically distributed. I focus on networks, and in particular social networks, as a tool to characterise these units of observation, called nodes, their observable attributes, and the connections between them. The developments in this dissertation are used to try to answer questions about the causal relationship between the observed variables, conditional on the network structure.

In chapter 3 I present a causal analysis of the Sexually Transmitted infections And Sexual Health (STASH) intervention and find that it had a positive effect of treatment (direct effect), but no effect of interference (effect of treatment spilling over to other individuals). I consider the methodology developed by Forastiere et al. (2020), as well as a flexible regression approach, to model the potential outcomes of the intervention for different levels of treatment and spillover, conditional on the joint propensity to be treated, directly and indirectly. Using a simulation study, I find that the proposed flexible approach performs similarly, in terms of bias and uncertainty, to the approach by Forastiere et al. (2020) when estimating the effect of the intervention, without the need for full information on the outcome model. In addition, the simulations suggest that, regardless of methodology, estimation using a small sample produces wider uncertainty bounds.
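
As a minimal illustration of the two exposure levels mentioned above (direct treatment and spillover), the sketch below computes each node's share of treated neighbours from an adjacency matrix. The function name, the toy network, and the use of "share of treated neighbours" as the spillover summary are assumptions made for illustration; the abstract does not state which exposure mapping the thesis uses.

```python
def neighbourhood_exposure(adjacency, treatment):
    """Return, for each node, the share of its neighbours that are treated."""
    exposure = []
    for i, row in enumerate(adjacency):
        # Neighbours are the nodes j with a tie to i (self-ties ignored).
        neighbours = [j for j, tie in enumerate(row) if tie and j != i]
        if not neighbours:
            # Assumption: a node with no neighbours has zero spillover exposure.
            exposure.append(0.0)
        else:
            treated = sum(treatment[j] for j in neighbours)
            exposure.append(treated / len(neighbours))
    return exposure


# Hypothetical undirected network of four nodes with ties 0-1, 0-2 and 2-3.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
treatment = [1, 0, 0, 1]  # direct treatment indicator for each node

exposure = neighbourhood_exposure(adjacency, treatment)
print(exposure)  # [0.0, 1.0, 1.0, 0.0]
```

In this toy example nodes 1 and 2 are untreated but fully exposed through their neighbours, which is the kind of unit whose outcomes carry information about interference.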

In chapter 4 I present a methodology to identify social influence and separate it from the effect of prior similarity in bipartite event cascades, when analysed using the relational event model (REM). The REM can be used to analyse the interdependent nature of data where the behaviour of an actor can be caused by the recent behaviour of similar actors (social influence). Homophily statistics can test for such contagion, given one or more actor attributes or network relations. However, social influence along the cascade and independent but similar behaviour arising from shared attributes are generally confounded. Using Monte Carlo simulations, I show the limits of a randomisation test as a tool to distinguish between these two competing mechanisms (influence and prior similarity). The simulations, as well as an empirical example in political science, delineate the scope conditions of the randomisation inference test used and demonstrate its efficacy under different mixture regimes of influence and similarity.

Chapter 5 presents a Bayesian methodology to estimate parameters for social networks using the exponential family of distributions via a network sampler that produces candidates in which both the connections between the nodes and their attributes are considered endogenous. Parameter estimation for networks with the exponential family is based on sampling network candidates conditional on a fixed value of the parameter. Traditional estimation produces networks where only the connections between the nodes are switched to produce viable candidates. Fellows and Handcock (2012) developed a sampler that produces networks where both the connections and some nodal attributes are switched (toggled, as it is referred to in the literature) in order to generate viable samples. I propose using a Bayesian estimation routine with a sampler that also toggles node attributes and network connections, based on the approach of Caimo and Friel (2011), to replace estimation using maximum likelihood and to produce samples from the posterior distribution of the parameter. This results in an estimation methodology that considers a data generating process in which networks are generated by changing edges and node attributes and that, conditional on having a proper model, is less prone to produce degenerate results.
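
The sampling step described above, drawing network candidates at a fixed parameter value by toggling either a tie or a node attribute, can be sketched as a Metropolis-Hastings update. This is only an illustration under stated assumptions, not the thesis's sampler: the two sufficient statistics (edge count and number of ties between nodes sharing a binary attribute), the equal chance of proposing a tie or an attribute toggle, and all function names are hypothetical choices.

```python
import math
import random


def sufficient_stats(edges, attrs):
    """Edge count and number of ties joining nodes with the same attribute."""
    n_match = sum(1 for i, j in edges if attrs[i] == attrs[j])
    return (len(edges), n_match)


def mh_step(edges, attrs, theta, rng):
    """One Metropolis-Hastings step at fixed theta.

    Proposes toggling either one dyad or one binary node attribute
    (assumed 50/50 split), then accepts with the usual
    exponential-family ratio exp(theta . (s(new) - s(old))).
    """
    n = len(attrs)
    new_edges, new_attrs = set(edges), list(attrs)
    if rng.random() < 0.5:
        i, j = sorted(rng.sample(range(n), 2))
        new_edges ^= {(i, j)}             # toggle the tie between i and j
    else:
        k = rng.randrange(n)
        new_attrs[k] = 1 - new_attrs[k]   # toggle the attribute of node k
    old = sufficient_stats(edges, attrs)
    new = sufficient_stats(new_edges, new_attrs)
    log_ratio = sum(t * (b - a) for t, a, b in zip(theta, old, new))
    # The proposal is symmetric, so only the change in statistics matters.
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return new_edges, new_attrs
    return set(edges), list(attrs)


rng = random.Random(0)
edges, attrs = set(), [0, 1, 0, 1, 0]   # hypothetical starting state
theta = (-1.0, 1.5)                     # hypothetical parameter values
for _ in range(1000):
    edges, attrs = mh_step(edges, attrs, theta, rng)
```

In a Bayesian routine of the kind the paragraph describes, a sampler like this would supply the simulated networks needed at each proposed parameter value; how the thesis combines the two is not specified in this abstract.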

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics
R Medicine > RA Public aspects of medicine > RA0421 Public health. Hygiene. Preventive Medicine
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Dean, Dr. Nema; Leifeld, Dr. Philip; McCann, Dr. Mark; and Moodie, Dr. Erica E.M.
Date of Award: 2023
Depositing User: Theses Team
Unique ID: glathesis:2023-83500
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 23 Mar 2023 11:27
Last Modified: 23 Mar 2023 11:27
Thesis DOI: 10.5525/gla.thesis.83500
