1. Introduction
It is clear that fluid extraction from or injection into Earth’s crust (including oil exploration and production) can trigger earthquakes (Segall, Reference Segall1989; Davies et al., Reference Davies, Foulger, Bindley and Styles2013; Goebel and Brodsky, Reference Goebel and Brodsky2018; González et al., Reference González, Tiampo, Palano, Cannavó and Fernández2012; Karamzadeh et al., Reference Karamzadeh, Lindner, Edwards, Gaucher and Rietbrock2021; Lei et al., Reference Lei, Huang, Su, Jiang, Wang, Wang, Guo and Fu2017). However, discriminating between earthquakes that occur naturally and those that are triggered by anthropogenic activity is challenging (Ellsworth, Reference Ellsworth2013; Grasso and Wittlinger, Reference Grasso and Wittlinger1990). A key issue is that earthquakes can occur in apparently aseismic regions, without an obvious trigger, and this means that the temporal relationship between earthquake occurrence and anthropogenic activity can always be argued as coincidental. Indeed, the occurrence of unlikely seismic events is expected to occur at a specific frequency. The occurrence of extremely unlikely events is expected to occur at a far reduced frequency, but is expected to occur nevertheless. For this reason, it is essential to carefully evaluate all possibilities before attributing the occurrence of seismic swarms to any specific trigger (Grigoli et al., Reference Grigoli, Cesca, Priolo, Rinaldi, Clinton, Stabile, Dost, Fernandez, Wiemer and Dahm2017).
The Newdigate, Surrey, seismic swarm is characterized by a series of low to moderate magnitude earthquakes (M L -1.34 to 3.18), that began in April 2018 and persisted into early 2019 (Figure 1). Residents of Newdigate reported cracks appearing in the walls, ceilings and foundations of their homes, damage to chimneys, misaligned windows and doors and broken pictures and ornaments that fell due to shaking (BGS, 2019). There are also reports of a small landslide in the vicinity of Newdigate, potentially triggered by seismic activity (BGS, 2019). The swarm’s occurrence in a region with historically low seismicity levels has led to various hypotheses about its origins, including natural tectonic processes and anthropogenic triggers (OAG, 2018). Notably, this period aligns with oil extraction and production activities in the nearby Horse Hill and Brockham oil fields, prompting investigations into whether these operations may have influenced the seismic activity (Hicks et al., Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019; Verdon et al., Reference Verdon, Baptie and Bommer2019; Cavanagh et al., Reference Cavanagh, Gilfillan and Haszeldine2019; Westaway Reference Westaway2022).
Hicks et al., (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019) provided a detailed analysis of the swarm of earthquakes close to the Horse Hill site. Over 168 earthquakes were located between April 2018 and June 2019 ranging in magnitude between M L -1.34 and M L 3.18. During the swarm, there was no gradual migration of the focus of the earthquakes as might be expected if they were produced by fluids fracturing and infiltrating into rocks (Keranen et al., Reference Keranen, Weingarten, Abers, Bekins and Ge2014). Instead, earthquakes are clustered around a previously mapped fault, and focal mechanisms indicate strike slip motion on this fault. The association of the earthquakes with the fault, the lack of the migration in seismicity and the apparent weak correlation between the earthquake frequency and oil extraction led Hicks et al. (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019) to argue that these events were probably not induced by anthropogenic activity. This supported the earlier analysis carried out by Verdon et al. (Reference Verdon, Baptie and Bommer2019).
By contrast, Westaway (Reference Westaway2022) presented a geomechanical model highlighting how oil extraction can lead to changes in the local stress field, which can trigger slip on the identified strike slip fault. In this model, pressure changes caused by oil extraction can be transmitted through permeable units and across permeable faults and through ‘calcite beef’ (Howitt, Reference Howitt1964; Hesselbo and Jenkyns, Reference Hesselbo, Jenkyns and Taylor1995). He also highlighted several complexities associated with previous analyses of the data and local geology. The key complexity raised by Westaway (Reference Westaway2022) for our current analysis, however, is related to extraction of oil from different geological units.
2. Problems with simple correlations between oil extraction and seismicity
There is no clear relationship between the frequency of earthquakes and the timing of oil extraction from Horse Hill. This led Hicks et al. (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019) to argue that the earthquakes were not induced by oil extraction. Two key factors highlight the lack of a clear correlation. First, seismicity began before the onset of oil extraction at Horse Hill. If this is the case, it means that oil extraction could not have caused the seismicity. Second, the period of maximum oil extraction (October 2018–January 2019) coincides with a period of very low seismic activity (Figures 1 and 2). A simple correlation model would require increased seismicity during increased oil extraction. However, we show here that there are also simple explanations for these two apparent anomalies.
2. a. The onset of seismic activity
Prior to the onset of activity at Horse Hill, there was a period of surface works at the same locality (Cavanagh et al., Reference Cavanagh, Gilfillan and Haszeldine2019). However, these also included sub-surface work, such as annular pressure checks on 5th–6th April 2018, and other well activities. The exact details are uncertain, and further details can be found in Westaway (Reference Westaway2022). Here, we therefore consider it appropriate to accept that earthquakes occurring prior to the onset of oil extraction at Horse Hill could be related either to these prior surface works at Horse Hill or to oil extraction at Brockham which was ongoing at the same time. However, it is important to note that Brockham is 10–20 km away, and there are several faults between Brockham and Newdigate.
2. b. Aseismicity during maximum oil extraction
Oil extraction at Horse Hill was not confined to a single geological unit, but switched between the Portland and Kimmeridge units, which have significantly different physical and mechanical properties. Initially, the Portland unit was targeted. This unit has a relatively high permeability, of the order of 10-16 m2 (Brantut et al., Reference Brantut, Heap, Baud and Meredith2014), and a moderately high resistance to fracture propagation (Chandler et al., Reference Chandler, Meredith, Brantut and Crawford2016). On 10 September 2018, production switched to the Kimmeridge unit. By contrast, this shale-rich unit has a permeability that is some six to seven orders of magnitude lower (Gutierrez & Wangen, Reference Gutierrez and Wangen2005). Shale-rich units also have an anisotropic resistance to fracture propagation, with the resistance parallel to layering being much lower than that perpendicular to layering (Chandler et al., Reference Chandler, Meredith, Brantut and Crawford2016), thus encouraging the growth of horizontal fractures at depths less than a few kilometres. Then on 11 February 2019, production returned to the Portland unit. Details of this analysis can be found in Westaway (Reference Westaway2022) and the reports to investors detailing these operations can be found there. The period of reduced seismicity corresponds to the time of oil extraction from the Kimmeridge units (Figure 3). Importantly, this formation-level comparison was not accounted for in Hicks et al. (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019) as the formation-level operation data were not made available by operators either directly or via the regulator (with a sufficient temporal resolution).
2. c. Additional complexity due to time lags
More generally, time lags between extraction and earthquakes complicate the correlation. Such time lags are not uncommon and have been well documented in the literature. For example, it is now known that even small stress or pressure changes can cause fractures to nucleate and propagate through the mechanism of stress corrosion (Atkinson, Reference Atkinson1984; Atkinson & Meredith, Reference Atkinson and Meredith1987). In this instance, rock-fluid chemical reactions allow fractures to grow slowly (at rates that are orders of magnitude below critical velocity) at stresses that are well below the short-term strength of the rock. However, the growth rate of such sub-critical fractures accelerates as they lengthen until becoming critical. This can lead to a natural time delay (lag) in triggering seismic activity (Das & Scholz, Reference Das and Scholz1981). Furthermore, we know that earthquakes can induce changes in fluid pressure that trigger other earthquakes at considerable distances (multiple kilometres) and after significant time delays, and vice versa (e.g., Brodsky et al., Reference Brodsky, Roeloffs, Woodcock, Gall and Manga2003; Brodsky & Prejean, Reference Brodsky and Prejean2005; Van der Elst & Brodsky, Reference Van Der Elst and Brodsky2010). Previously, the extent of the time lag has been determined as a function of the distance and the fluid diffusivity of the intervening lithology.
3. A simple model to predict the observed seismicity
The simplest model relates oil extraction to seismicity accounting for variable forcings and variable time lags. All the data required for this analysis are in in Hicks et al., (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019) and Westaway (Reference Westaway2022). Our model greatly simplifies the geomechanical model proposed by Westaway (Reference Westaway2022) in that the time lag is defined in days and the scaling simply relates the volume of oil extracted (barrels) to seismicity. In general, the time lag is a function of bedrock permeability, width of permeable units, fluid migration pathways, pressure variations and distance. Because these parameters are all unknown and trade-off against one another, we prefer a single time lag parameter. Similarly, the scaling is a function of pressure variations, the state of stress of the fault plane, asperities on the fault plane, earthquake detection limits and many other factors. Our simple model is therefore suitable to explore the correlations between earthquakes and oil extraction. For example, we might expect more seismicity if oil is extracted from the stronger (more brittle) Portland unit, and we might expect this to happen more closely in time after the oil is extracted, due to its higher permeability. In this way, the model prediction of the number of earthquakes in a single day is equal to:
where ${E_t}$ is earthquakes in a given day, ${S_P}$ is the scaling of extraction from the Portland at Horse Hill, with units of earthquakes per barrel extracted, and ${P_{t - {l_P}}}$ is the number of barrels of oil extracted from the Portland at a time in the past given by t-l p, where l p is the lag associated with extraction from the Portland. Similarly, ${S_k}$ is the scaling of ${K_{t - {l_k}}}$ barrels from the Kimmeridge at Horse Hill at a specific time in the past given by the lag. For the Brockham site, we have a unique scaling, but the lag time is given by the Portland lag time. The unknown parameters are the scaling relationships and the lag times. We also solve for the unknown pressure changes associated with surface works before the operational tests. This is modelled as ${E_t} = {S_P}{X_{t - {l_P}}}$ , where X is the unknown number of equivalent barrels, l p is the lag associated with extraction from the Portland and ${S_P}$ is the scaling of extraction from the Portland units.
To infer the unknown parameters and predict the data, we use Bayesian Machine Learning and the Markov Chain Monte Carlo (MCMC) algorithm (Metropolis et al., Reference Metropolis, Rosenbluth, Rosenbluth, Teller and Teller1953; Hastings, Reference Hastings1970). This approach has been applied extensively in the Earth Sciences (Gallagher et al., Reference Gallagher, Charvin, Nielsen, Sambridge and Stephenson2009), and we provide a brief introduction here. This Bayesian approach combines a priori parameter estimates with data likelihood to estimate a posteriori parameter information (Tarantola, Reference Tarantola2005). Our a priori information on the model parameters is limited, and the a priori parameter distributions are defined to be uniform, positive and non-informative. We use a standard likelihood function between the cumulative number of predicted and observed earthquakes assuming Gaussian distributions with an uncertainty equal to 1 additional earthquake per day. The MCMC algorithm is used to sample the a posteriori parameter distributions. We run one million models varying the values of the parameters and storing models that are accepted. Each new model is proposed as a perturbation of the previous model. In order to perturb the model, a random number is drawn from a Gaussian proposal distribution defined for each model parameter. The standard deviations of the Gaussian distributions control how far proposed models are from previous models. These distributions are chosen to provide acceptance rates of approximately 30% (Gilks et al., Reference Gilks, Richardson and Spiegelhalter1995; Rosenthal, Reference Rosenthal2011). Importantly, these choices do not influence the final a posteriori distributions, instead they determine how quickly the algorithm explores the parameter space (Rosenthal, Reference Rosenthal2011). If the likelihood of the proposed model predictions is higher than the previous model, the model parameters are accepted. If the likelihood is lower than the previous model, the model may or may not be accepted. To determine whether the proposal is accepted, the ratio between the proposed and previous likelihoods is calculated; this ratio is then compared to a random number drawn from a uniform distribution between 0 and 1; if the ratio is greater than the random number, the model is accepted; if the ratio is less than the random number, the model is rejected. In this way, models that lead to a worse fit to the data may still be accepted, and the model can sample the full a posteriori model parameter distributions. Ultimately, the frequency of accepted model parameters is proportional to the a posteriori probability of the parameter values.
Importantly, we attempt to explain the cumulative number of earthquakes contained in the catalogue produced by Hicks et al., (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019). We use the cumulative number as this is more robust with respect to earthquake swarms: we will never be able to explain each earthquake on each day, but we can attempt to explain apparent increases or decreases in seismicity. We expect that the magnitude is controlled by far field stresses, and the oil extraction simply triggers the faulting. For this reason, we do not expect a simple relationship between earthquake magnitude and oil extraction, as might be expected for induced seismicity (McGarr, Reference McGarr2014). It is important to highlight that the minimum magnitude earthquake that could have been detected changed in the summer of 2018 due to the installation of a temporary seismic array (Hicks et al., Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019). This means that small events are probably lost from the record before this time and that the record is more complete afterwards. However, a key feature of the dataset highlighting low seismicity during extraction from the Kimmeridge units and then increased seismicity during extraction from the Portland units is robust with respect to this change in completeness.
4. Results
The very simple model reproduces the key characteristics of the dataset, in that we identify two periods of increased seismicity associated with extraction from the Portland units (Figure 3). Most of our model parameters are well resolved by the model (Figures 4 and 5). We are able to estimate the amount of equivalent extraction during the surface preparation work associated with annulus pressure checks. This parameter, however, is poorly resolved, and the maximum a posteriori probability is close to 50 barrels. It is also important to note that this parameter is very sensitive to the change in the completeness of the seismic record. If the detection limit was M L 1 before summer 2018, hundreds of small earthquakes in the range −1 to 1 were probably lost from the record, assuming a Gutenberg–Richter relation with b = 1 (Hicks et al., Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019). If this is the case, the equivalent volumes of extraction during the surface work that we predict would be underestimated.
Our lag times for extraction from the Portland units are on the order of days and suggest connectivity between the well and the fault plane (Figure 4). These lags times are comparable to those predicted by Westaway (2020). In contrast, the lag time for the Kimmeridge units is much longer, but this parameter is less well resolved and might simply reflect the concept that extraction from the Kimmeridge is not producing earthquakes.
The model fails to explain the sharp increase in cumulative earthquakes at the time production at Horse Hill returned to the Portland unit. This highlights that there is not a simple relationship between pressure changes in the well and earthquake frequency. Of course, this is to be expected as pressures are expected to build and release in non-linear and complex ways. We also do not account for the aftershocks that dissipate strain following a large earthquake. Instead, these aftershocks are assumed to be triggered by the same process of oil extraction. We also do not resolve the flattening-off of the earthquake frequency towards the end of the model. This might be related to our inability to accurately resolve the sharp increase during the return of extraction from the Portland units as increased seismicity in the past might reduce seismicity later on. Future work should account for these complexities.
5. Summary
A simple model that accounts for the differences between the two lithologies encountered at Horse Hill reproduces many of the features of the transient earthquake swarm. It is possible that this could be a coincidence, but the fit of the model to the data supports a relationship between the Newdigate earthquakes and oil extraction at Horse Hill. The key component of the model is the differentiation of the data on extraction into two very different lithologies. It is clear that different lithologies respond differently to variations in fluid pressure and stress due to permeability and rheological variations.
Identifying the source of seismicity remains a major challenge, and the Newdigate swarm of earthquakes provides a unique dataset to address this challenge. The change in the detection limit of the earthquake magnitude during the seismic swarm complicates this issue. Crucially, this case study highlights the role seismic monitoring plays in oil exploration.
Data availability
Data used in this analysis are available in Hicks et al., (Reference Hicks, Verdon, Baptie, Luckett, Mildon and Gernon2019). The dates used to determine extraction from different units can be found here (https://www.lse.co.uk/rns/ewt-updates-portland-kimmeridge-oil-discovery-x282xeqhbcutm8m.html) and here (https://www.lse.co.uk/rns/UKOG/portland-oil-production-resumes-at-horse-hill-e1jf7s92q1dsoex.html).
Acknowledgements
We would like to thank Steve Hicks, Gerald Roberts and Eric Newlands for comments on earlier versions of the manuscript and support throughout this research. M. Fox is supported by NERC (NE/X009408/1).