Reducing the CO2 footprint at an LNG asset with replicate trains using operational data-driven analysis. A case study on end flash vessels

Rakesh Paleja; Ekhorutomwen Osemwinyen; Matthew Jones; John Ayoola; Raghuraman Pitchumani; Philip Jonathan

doi:10.1017/dce.2024.23

Published online by Cambridge University Press: 08 January 2025

Rakesh Paleja: Affiliation:
Shell Research Limited, London, SE1 7LZ, UK.
Ekhorutomwen Osemwinyen: Affiliation:
NLNG Plant Complex, Bonny Island, Rivers State, Nigeria
Matthew Jones: Affiliation:
Shell Global Solutions International BV, Amsterdam, 1031 HW, The Netherlands
John Ayoola: Affiliation:
NLNG Plant Complex, Bonny Island, Rivers State, Nigeria
Raghuraman Pitchumani: Affiliation:
Shell International Exploration and Production Inc., Houston, TX 77079, USA
Philip Jonathan*: Affiliation:
Shell Research Limited, London, SE1 7LZ, UK. Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, UK
*: Corresponding author: Philip Jonathan; Emails: philip.jonathan@shell.com; p.jonathan@lancaster.ac.uk

Abstract

A liquefied natural gas (LNG) facility often incorporates replicate liquefaction trains. The performance of equivalent units across trains, designed using common numerical models, might be expected to be similar. In this article, we discuss statistical analysis of real plant data to validate this assumption. Analysis of operational data for end flash vessels from a pair of replicate trains at an LNG facility indicates that one train produces 2.8%–6.4% more end flash gas than the other. We then develop statistical models for train operation, facilitating reduced flaring and hence a reduction of up to 45% in CO2 equivalent flaring emissions, noting that flaring emissions for a typical LNG facility account for ~4%–8% of the overall facility emissions. We recommend that operational data-driven models be considered generally to improve the performance of LNG facilities and reduce their CO2 footprint, particularly when replica units are present.

Information

Type: Translational Article
Information: Data-Centric Engineering, Volume 6, 2025, e1

DOI: https://doi.org/10.1017/dce.2024.23
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © Shell Information Technology International Ltd., London SE1 7NA, United Kingdom., 2025. Published by Cambridge University Press

Impact Statement

Empirical models based on operational data from a liquefied natural gas production facility are used to identify and exploit differences between the performance of production units that are nominally equivalent from a design perspective. Differential operation of nominally replicate units leads to a reduction of up to 45% in CO₂ equivalent flaring emissions, noting that flaring emissions for a typical LNG facility account for ~4%–8% of the overall facility emissions.

1. Introduction

Natural gas (NG) plays a significant role in the global energy transition since switching from coal to NG reduces greenhouse gas emissions by 50% when producing electricity and 33% when providing heat; globally, up to 500 MtCO₂ were avoided in 2018 compared with 2010 (International Energy Agency, 2019). NG sources in Australia, the Middle East, Russia, North America, and Africa are often distant from consumer demand in Europe, Japan, South Korea, China, and developing Asia (International Energy Agency, 2022). Transporting NG via pipeline over distances >3000 km is not economically viable because of the low energy density of NG on a volumetric basis. Liquefaction of NG to −163 °C reduces its volume by a factor of around 600, permitting transportation by sea (Hafner and Luciani, 2022).

A large-scale LNG train typically consumes 14.3 kW/ton/day of LNG, with 40%–60% of the energy used by compressors (Hasan et al., 2009a). The energy required is normally provided by fuel gas (FG) generated from different sources at the LNG-producing facility including end flash gas (EFG) from end flash vessels, and boil-off gas (BOG) from storage tanks and loading vessels (LBOG). Economically and environmentally, it is advantageous to reduce FG consumption as far as possible while still meeting demand, and to avoid flaring of excess FG. This is achieved by process modeling using software such as AspenTech’s HYSYS®, UniSim®, and MATLAB®. These software packages use numerical algorithms as summarized, for example, by Bassioni and Klein (2024) and Austbo et al. (2014).

For example, on the demand side, in Alabdulkarem et al. (2011), minimization of the power of the compressor (which consumes FG) in a C3MR process is performed through simulation in Aspen HYSYS® and optimization in MATLAB® using a genetic algorithm, leading to a 9% reduction in energy requirement. Further, Jackson et al. (2017) optimize the energy requirement for a typical LNG train at different geographical locations using numerical methods and conclude that liquefaction in colder climates such as that of Norway would require 20%–26% lower energy compared with warmer Australian or Middle Eastern climates. Thus a given compressor can provide more LNG in colder countries. In Ali et al. (2018), FG demand for a single mixed refrigerant liquefaction process is optimized using the meta-heuristic vortex search algorithm; optimal values of mixed refrigerant flow rates and process operating pressures are determined in the vortex pattern corresponding to the minimum required energy, which is reduced by 41.5%. In Castillo et al. (2013), options to pre-cool NG are studied for hot and cold climate conditions using HYSYS® to determine the most energy-efficient technology for either climate.

On the supply side, in Hasan et al. (2009b), dynamic simulations are conducted to facilitate the reduction in LBOG using “heel” as a parameter to be optimized; the heel is the amount of LNG that is retained in the LNG vessel during its return journey to maintain the vessel as close as possible to −163 °C. Numerical simulation studies have been developed by Kurle et al. (2017) for LBOG involving variables such as heat leak, initial temperature of LNG ship tank, compressor capacity, and maximum cooling rate for ship-tank in the model. The study is expected to help proper handling of BOG problems in terms of minimizing flaring at LNG exporting terminals, thus reducing waste and saving energy. Numerical simulations by Jin et al. (2023) on BOG generation and recovery at LNG export terminals have been carried out to understand the Specific Energy Consumption (SEC) using a single mixed refrigerant (SMR) and compare it with a typical Mark III process. The proposed SMR design has a 50.34% lower SEC than a Mark III process. Shin et al. (2022) model excessive BOG generated because of the temperature difference between the LNG and a tank and design a model predictive control (MPC) system to simultaneously regulate the pressure and temperature of the tank by manipulating the vapor outlet flow rate and the amount of LNG spray injected during the cool-down process. In Widodo and Muharam (2023), simulation models for BOG generation during liquefaction and loading processes are discussed for a typical LNG production plant producing 8 million tons/year LNG, limited by the capacity of BOG recovery. Numerical optimization shows a potential production increase from BOG recovery and fuel gas optimization to be around 90,260 tons/year or equivalent to 1.4 cargoes of LNG per year. A numerical simulation of the flow of LNG stored in a small-sized cylindrical tank is presented in Ferrin and Perez-Perez (2020). The work suggests that the filling level of the tank substantially influences the boiling rate and the degree of stratification, as well as the flow structures generated by free convection.

We note that alternative numerical methods exist to model BOG generated during the shipping of LNG. For instance, in Wu and Ju (2021), the BOG generation characteristics in a type C independent LNG tank under sloshing excitation are studied using computational fluid dynamics (CFD). Results show that sloshing excitation influences the thermo-physical process and BOG generation of the LNG tank. Such numerical studies do not consider the fact that BOG generation from the LNG tank, and LBOG, can vary significantly because of climatic conditions, the nature and size of the loading vessels, and other factors. Further, we are not aware of literature that considers the varying nature of real-time FG supply (from multiple suppliers) and demand.

Moreover, the literature addressing the use of real operational plant data for process optimization is limited. Katebah et al. (2023) note the considerable potential for, yet the dearth of literature on, the exploitation of real plant data to optimize the performance of LNG processes, over and above that achieved using numerical simulation.

Typically, multiple trains at a given LNG production facility have the same design. Multiple trains are preferred over a single large train for reasons such as (a) improved robustness of production to interruptions on an individual train, and (b) physical limitations on the design of a single large train. When multiple trains are operated at an actual LNG facility, some trains may be exact replicas of others in terms of liquefaction technology, size of compressors, and other units such as end flash vessels. Yet the literature examining the performance of multiple trains, from a numerical or operational plant data perspective, is again limited.

1.1. Objectives and layout

In this article, we use a two-step data-driven approach to demonstrate the divergence in performance between two replicate trains at a full-scale LNG facility, focusing on the comparison of their end flash vessels. We emphasize that this article exploits real operational data from the full-scale LNG facility. The first analysis step (reported in Section 3.1) involves exploratory analysis of historical data corresponding to multiple years of operation, to elucidate whether flash vessels from different trains produce different amounts of EFG under similar process conditions. Then we use statistical hypothesis testing (Section 3.2) to confirm significant divergence in EFG production between LNG trains. The second step (Section 3.3) involves the estimation of regression models for EFG production with respect to manipulable driver variables of the process. We demonstrate (Section 3.4) that these models can be used to minimize excess EFG and hence reduce the CO₂ footprint. We emphasize that the two-step approach is not specific to any particular process unit or production technology. All that is required is a representative period of historical operational data for the near-replica production units.

Preceding the main analysis sections, Section 2 provides an overview of typical large-scale liquefaction. Following the analysis, Section 4 then provides discussion and conclusions. Summary statistics for flow rate from the two end flash vessels considered and details of statistical hypothesis testing using Welch’s t-test are relegated to Appendices A and B.

2. Description of LNG process

This section provides a brief overview of the components and operation of a liquefaction train, followed by a discussion of LNG facilities containing replicate trains and the potential this offers for improved operation.

2.1. The liquefaction train

A liquefaction train at an LNG facility comprises a hot and a cold section. NG from the gas field enters the hot section, operating at above ambient temperature. Here, NG is pre-treated to remove acid gas (carbon dioxide and hydrogen sulfide), water and mercury. The processed NG then enters the cold section at temperature T1, pressure P1 and flow rate Q1, as shown in Figure 1. Temperature T1 depends on the geographical location and can vary from 25 to 30 °C, pressure P1 usually ranges from 50 to 60 bar, whereas mass flow rate Q1 (tons per day, T/d) depends on the availability of NG. There are different designs for the cold section. In the C3MR design (Lim et al., 2012), the cold section pre-cools NG in C3 kettles from T1 to temperature T2 and subsequently to T3 in the main cryogenic heat exchanger (MCHE) using a mixed refrigerant (MR). MR consists of nitrogen (N2), C1, C2, and C3. T2 usually approaches −30 to −27 °C whereas T3 ranges from −150 to −145 °C depending on a variety of factors such as NG composition, MR composition and pressure, and flow rates of NG and MR. The C3 kettles and MCHE are shell-and-tube heat exchanger units with NG flowing on the tube side, C3 in the kettles, and MR in the MCHE, both on the shell side. The duty required to circulate propane and MR to cool NG from T1 to T3 is provided by two compressors. Figure 1 illustrates compressor 1 (C3) and compressor 2 (MR). Cooling NG from T1 to T3 results in the vaporization of C3 and MR; the heat of the vapor is rejected to the atmosphere by air or water coolers before the refrigerants return to the C3 kettles and MCHE, respectively. When upstream pressure P1 is high, the final cooling to T4 = −163 °C occurs in the flash vessel, where NG from the MCHE is flash evaporated at pressure P4 (close to atmospheric pressure). As a result, the flow Q3 from the MCHE is divided into a vapor stream with flow rate Q5, and a liquid stream with flow rate Q4, the latter sent to storage tanks maintained at atmospheric pressure. The vapor stream is EFG to the FG pool, whereas the liquid stream is LNG for export. The nature of the flash evaporation process is such that $ \mathrm{Q}5\ll \mathrm{Q}4 $ with $ \mathrm{Q}3=\mathrm{Q}4+\mathrm{Q}5 $ to retain mass balance; the temperatures and pressures of the EFG and LNG are similar.

Figure 1. Schematic of the cold section of LNG train. The end flash vessel shown in blue produces end flash gas, used as fuel gas for the facility.

2.2. Replicate trains

As noted in Section 1, LNG facilities often contain replicate trains; Figure 2 shows a schematic for two replicate trains studied in this article. Here, EFG from the end flash vessels of each train is sent to the FG pool along with other sources of FG such as BOG and LBOG. The FG pool supplies the FG to the LNG facility. When there is excess FG, the flare valve is opened and the excess FG is flared. To prevent flaring, the typical practice is to reduce EFG production from both trains equally, since trains are notionally replicates by design.

Figure 2. Schematic of two replicate trains, Train 1 and Train 2, feeding EFG to the FG pool alongside BOG from the LNG tank and LBOG from the tank in the loading vessel (also shown in blue). When the FG pool has excess FG, it is released and flared through the flare valve.

In this work, we take advantage of replicate flash vessels at the LNG facility to minimize flare valve opening. The presence of replicate components such as compressors, MCHEs, coolers, and C3 kettles at LNG facilities generally can be similarly exploited for operational improvements.

3. Exploratory data analysis and hypothesis testing

In this section, we present an analysis of operational data from an LNG facility with two replicate liquefaction trains. The objective of the analysis is to identify differences in the operating characteristics of the end flash vessels of the two trains. The differences identified are then exploited in Sections 3.3 and 3.4 to improve the overall performance of liquefaction, in particular with respect to reduced flaring of EFG. Section 3.1 provides an exploratory analysis of operational data, and Section 3.2 uses statistical hypothesis testing to demonstrate significant differences in operating characteristics for the trains.

We emphasize that the analysis is intended to exploit different operating characteristics of notionally replicated LNG trains. A necessary preliminary step therefore is to ensure that the trains considered are indeed replicates. We have confirmed this for a pair of trains, henceforth identified as Tr₁ and Tr₂, from the LNG facility.

3.1. Exploratory analysis

We consider the operation of flash vessel units U₁, U₂ of replicate trains Tr₁, Tr₂, with EFG mass flow rates Q5₁, Q5₂. Figure 1 motivates the assumption that Q5 for individual units is dependent on (a) the corresponding flow Q3 of NG from the MCHE to the flash vessel, (b) the outlet temperature T3 of NG from the MCHE to the flash vessel, and (c) flash vessel pressure P4. The “manipulated” (or, in statistical terminology, “treatment”) variables Q3, T3, and P4 can be changed independently, thereby influencing Q5. We anticipate that increasing the values of Q3 and T3 will lead to a higher value of Q5. Conversely, a higher P4 will lead to a lower Q5.

We seek to assess fairly whether Q5 from U₁ and U₂ is similar. Ideally, we would conduct a series of experiments on both units, where the values of Q3, T3, and P4 were set at common values across trains, and differences in Q5 were quantified. However, such experiments are impractical economically for trains in continuous operation. Nevertheless, over the course of the normal operation of the trains in time, the set points of Q3, T3, and P4 for the two trains vary, exploring a domain of typical set points for both trains. We can therefore exploit these historical data to quantify differences in Q5. It is, of course, critical that our assessment is fair, in particular, because the domains of Q3, T3, and P4 might be different for the two trains. Since Q5 depends on Q3, T3, and P4, it is essential that the historical data for both trains are filtered such that the treatment variables Q3, T3, and P4 correspond to similar sets of values across the two units. Concisely in mathematical notation, we wish to compare $ \mathrm{Q}5\mid \left(\mathrm{Q}3,\mathrm{T}3,\mathrm{P}4\right) $ conditionally across trains, rather than Q5 marginally. The simple filter condition applied takes the form

(1)

$$ \mathrm{LL}\le {X}_1^t/{X}_2^t\le \mathrm{UL}\hskip1em \mathrm{for}\hskip0.5em \mathrm{all}\hskip0.5em \mathrm{of}\hskip0.5em X=\mathrm{Q}3,\mathrm{T}3,\mathrm{P}4 $$

where $ {X}_j^t $ is the value of $ X $ for train Tr $ {}_j $ at time $ t $ , for data sampled every 5 minutes over a contiguous calendar year. We emphasize that the filter must be satisfied simultaneously by all of $ X= $ Q3, T3, and P4. Further, LL indicates a common lower limit for the ratio of manipulated variables across trains, set at 0.98 in this work. UL indicates the corresponding common upper limit, set at 1.02. The effect of filtering manipulated variables is illustrated in Figure 3, for data corresponding to the calendar year 2019. Panels of the figure are scatter plots of X₂ on X₁ for X = Q3, T3, P4, and Q5, with green dots indicating data for time points at which the filter conditions in Equation 1 are satisfied, corresponding to ~10% of the unfiltered sample, over all years of available data. Data for all other time points are shown in blue. Because the filter retains or discards time points for both trains together, it yields subsets of operational data for Tr₁ and Tr₂ of equal size. Note also that, for reasons of commercial confidentiality, all flows Q3 and Q5 presented in this work (e.g., in Figures 3, 4, and accompanying tables in Appendix A) have been normalized using a common factor $ k $ (i.e., Normalized Flow = $ k $ × Observed Flow) such that the maximum Q5 (over all trains and years) in the filtered data is 100 T/d after normalization. No other variables are normalized.
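The filter in Equation 1 and the flow normalization can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the function names `ratio_filter` and `normalization_factor`, the dictionary layout, and the three toy time points are all assumptions made for the example.

```python
# Minimal sketch of the ratio filter (Equation 1); all names and data are illustrative.
LL, UL = 0.98, 1.02  # lower and upper ratio limits quoted in the text


def ratio_filter(train1, train2, keys=("Q3", "T3", "P4"), ll=LL, ul=UL):
    """Return time indices t with ll <= X1[t] / X2[t] <= ul for every X in keys."""
    n = len(train1[keys[0]])
    keep = []
    for t in range(n):
        ok = True
        for k in keys:
            x1, x2 = train1[k][t], train2[k][t]
            # Guard against division by zero; such points are simply discarded.
            if x2 == 0 or not (ll <= x1 / x2 <= ul):
                ok = False
                break
        if ok:
            keep.append(t)
    return keep


def normalization_factor(q5_filtered, target_max=100.0):
    """Common factor k such that the maximum filtered Q5 maps to target_max."""
    return target_max / max(q5_filtered)


# Hypothetical 5-minute samples for the two trains (not plant data).
train1 = {"Q3": [1000.0, 1000.0, 1000.0], "T3": [-148.0, -148.0, -146.0], "P4": [1.20, 1.20, 1.20]}
train2 = {"Q3": [1010.0, 1050.0, 1005.0], "T3": [-147.0, -148.0, -150.5], "P4": [1.21, 1.20, 1.20]}

kept = ratio_filter(train1, train2)
print(kept)  # only the first time point passes all three ratio checks
```

Because a time point is kept or dropped for both trains at once, the filtered samples for the two trains automatically have the same size, as noted above.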

Figure 3. Scatter plots of Q3, T3, P4, and Q5 across trains Tr₁, Tr₂ of operational data sampled at 5-minute intervals for the year 2019. Values for time points satisfying the filter conditions in Equation 1 are shown in green. All other time points are shown in blue. Values of Q3 and Q5 have been normalized using a common factor so that the global maximum value of Q5 is 100 T/d.

Figure 4. Histograms of filtered Q5₁ (blue) and Q5₂ (red) data per annum, for years 2015–2019. Panel titles indicate the number of observations n retained after filtering. Vertical lines and annotated text give mean values of filtered data. Values of Q5 have been normalized using a common factor so that the global maximum value is 100 T/d.

The quality of NG sourced from upstream wells, distributed to the two trains, varies over time. As the proportion of low boiling point components (e.g. C2, C3, and butane, C4) in NG increases, Q5 production reduces in both U₁ and U₂. Moreover, the performance of LNG trains often exhibits seasonal patterns that can influence Q5; filtering (Equation 1) ensures that the comparison of units is not influenced by season and other external variations of the common NG input to liquefaction. Filtering, therefore, allows us to characterize underlying differences in the operational characteristics of the trains, rather than differences in inputs and operating set points.

Since (replicate) trains are optimized through numerical simulations during design, we expect differences in Q5 across trains to be small. We might, therefore, expect that a comparatively long period of historical data might be required to quantify differences in operational characteristics with confidence: in particular, analysis of filtered data from only 1 year can lead to spurious conclusions. Therefore, here, we analyze historical operational data for the 5-year period 2015–2019. The panels of Figure 4 show histograms of filtered Q5 per annum for the years 2015–2019, for train Tr₁ (blue) and Tr₂ (red). The title of each plot shows the year and number n of filtered 5-minute observations. Vertical blue and red lines and annotated text give sample means of filtered Q5₁ and Q5₂, respectively. Figure 4 suggests for each of the 5 years, that Q5 through U₁ is greater than that through U₂. The difference in sample means ranges from 2.5 to 5.5 T/d. Corresponding tables of summary statistics are provided in Appendix A. It is also interesting that the number of observations retained after filtering is considerably higher in 2018 and 2019 than in 2016 in particular, possibly indicating a more consistent setting of operational conditions across trains in more recent years.

The figure for unfiltered data corresponding to Figure 4 is shown in Figure 5. It is notably difficult to see from the figure that there is a material difference between the operating characteristics of trains Tr₁ and Tr₂. This emphasizes the need to consider the conditional behavior of Q5 given its driver variables Q3, T3, and P4.

Figure 5. Histograms of full unfiltered data for Q5₁ (blue) and Q5₂ (red) per annum, for years 2015–2019. Panel titles indicate the number of observations n retained after filtering. Vertical lines and annotated text give mean values of filtered data. Values of Q5 have been normalized using a common factor so that the global maximum value is 100 T/d.

3.2. Statistical testing

The exploratory analysis above suggests that Q5 from vessel U₁ in train Tr₁ is higher than that from U₂ in train Tr₂. We can quantify this using a statistical hypothesis test to assess whether the population mean $ {\overline{\mathrm{Q}5}}_1 $ of Q5 in train Tr₁ is greater than the corresponding population mean $ {\overline{\mathrm{Q}5}}_2 $ in train Tr₂. To perform this one-sided hypothesis test, we set the null hypothesis H₀ that there is no difference between $ {\overline{\mathrm{Q}5}}_1 $ and $ {\overline{\mathrm{Q}5}}_2 $ , and an alternative hypothesis H₁ that $ {\overline{\mathrm{Q}5}}_1>{\overline{\mathrm{Q}5}}_2 $ . Then we calculate whether there is sufficient evidence in the data to reject the null hypothesis in favour of the alternative. Various parametric and non-parametric tests are suggested in the literature (e.g., Marshall and Jonker, 2011) for this purpose. The choice of test depends on the nature of the data and the specific question at hand. Here we use the independent two-sample Student’s t-test, calculating a test statistic $ t $ that measures the difference in sample means relative to the variability within the groups. This test assumes that the variances of the two samples are approximately equal. For samples of random variables $ {X}_1 $ and $ {X}_2 $ with common sample size $ n $ , $ t $ is calculated as

(2)

$$ t=\left({\overline{X}}_1-{\overline{X}}_2\right)/{s}_d $$

where $ {\overline{X}}_1 $ and $ {\overline{X}}_2 $ are the sample means for Q5₁ and Q5₂ (from Tables A1 and A2 in Appendix A), and $ {s}_d $ is the standard error of the difference in means given by $ {s}_d^2=\left({s}_1^2+{s}_2^2\right)/n $ , where $ {s}_1^2 $ and $ {s}_2^2 $ are corrected sample estimates for the variance of $ {X}_1 $ and $ {X}_2 $ . Equivalently, $ {s}_d^2=2{s}_p^2/n $ , where $ {s}_p $ is an estimate for the pooled standard deviation of the samples given by $ {s}_p^2=\left({s}_1^2+{s}_2^2\right)/2 $ . The test statistic $ t $ follows a t-distribution with $ \nu =2\left(n-1\right) $ degrees of freedom (Evans et al., 2000). This probability distribution generalizes the standard normal distribution: both the t-distribution and standard normal distribution have mean zero and exhibit a bell-shaped curve, but the t-distribution has heavier tails controlled by shape parameter $ \nu $ . Typically, the null hypothesis H₀ is rejected at the $ \alpha =0.05 $ level; this occurs when the value of the t-statistic calculated exceeds a critical value $ {t}_{\mathrm{crit},\nu}\left(1-\alpha \right) $ equal to the $ \left(1-\alpha \right)\times 100=95 $ th percentile of the t-distribution with $ \nu $ degrees of freedom. That is, H₀ is rejected when

(3)

$$ \left({\overline{X}}_1-{\overline{X}}_2\right)/{s}_d-{t}_{\mathrm{crit},\nu }(0.95)>0. $$

Multiplying the left-hand side above by $ {s}_d $ gives $ \left({\overline{X}}_1-{\overline{X}}_2\right)-{s}_d\times {t}_{\mathrm{crit},\nu }(0.95) $ , equal to the lower confidence limit LCL for the difference $ {\overline{X}}_1-{\overline{X}}_2 $ in population means. Rejecting H₀ is therefore also equivalent to estimating LCL > 0. For sample sizes $ n>100 $ , $ {t}_{\mathrm{crit},2\left(n-1\right)}(0.95)\approx 1.645 $ , the 95th percentile of the standard normal distribution, to at least two decimal places; for smaller sample sizes, values of $ {t}_{\mathrm{crit},2\left(n-1\right)}(0.95) $ are provided by standard statistical software.
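As a hedged illustration of Equations 2 and 3, the sketch below computes the test statistic and the lower confidence limit for two equal-sized samples using only the Python standard library. The function name `one_sided_t_test` and the synthetic samples are assumptions for the example; the critical value uses the large-sample normal approximation described above, so the sketch is only appropriate for large $ n $ .

```python
import math
from statistics import NormalDist, mean, variance


def one_sided_t_test(x1, x2, alpha=0.05):
    """Test H0: equal means against H1: mean(x1) > mean(x2), equal sample sizes.

    Returns (t, lcl, reject). The critical value is the normal quantile, i.e. the
    large-sample approximation to the t-distribution with 2(n - 1) degrees of freedom.
    """
    n = len(x1)
    if len(x2) != n:
        raise ValueError("this sketch assumes a common sample size n")
    diff = mean(x1) - mean(x2)
    # statistics.variance is the corrected (n - 1 denominator) sample variance.
    s_d = math.sqrt((variance(x1) + variance(x2)) / n)
    t = diff / s_d
    t_crit = NormalDist().inv_cdf(1.0 - alpha)  # approximately 1.645 for alpha = 0.05
    lcl = diff - s_d * t_crit
    return t, lcl, lcl > 0.0


# Synthetic normalized flows (not plant data): train 1 sits 2 T/d above train 2.
x1 = [50.0 + (i % 5) for i in range(200)]
x2 = [48.0 + (i % 5) for i in range(200)]

t, lcl, reject = one_sided_t_test(x1, x2)
print(round(t, 2), round(lcl, 2), reject)
```

Rejecting H₀ when `lcl > 0` is the same decision as checking that `t` exceeds the critical value, which is the equivalence stated in the text.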

Table 1 shows the results of significance testing for the difference in population mean EFG flow rate, $ {\overline{\mathrm{Q}5}}_1-{\overline{\mathrm{Q}5}}_2 $ , between trains Tr₁ and Tr₂, annually from 2015 to 2019. In percentage terms, $ {\overline{\mathrm{Q}5}}_1 $ exceeds $ {\overline{\mathrm{Q}5}}_2 $ by some 2.8% to 6.4%.

Table 1. Independent two-sample t-test for population mean difference $ {\overline{\mathrm{Q}5}}_1-{\overline{\mathrm{Q}5}}_2 $ per annum. Null hypothesis rejected for each year since LCL > 0. Note that the critical value $ {t}_{\mathrm{crit},\nu }(0.95) $ at infinite sample size is adopted as a good approximation, since $ n>1000 $ throughout.

When there is evidence that the variances of the two samples are not equal, we can use Welch’s t-test (Welch, 1947) as an alternative to the test above. For the current data, using the corresponding Welch test at $ \alpha =0.05 $ , the null hypothesis of equality of $ {\overline{\mathrm{Q}5}}_1 $ and $ {\overline{\mathrm{Q}5}}_2 $ was also rejected for each of the years 2015 to 2019; see Appendix B for details.
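For orientation, the standard form of Welch's statistic and its approximate degrees of freedom can be written in the notation of Equation 2. The symbols $ {t}_W $ , $ {\nu}_W $ , $ {n}_1 $ , and $ {n}_2 $ are introduced here for illustration only and are not taken from Appendix B:

$$ {t}_W=\frac{{\overline{X}}_1-{\overline{X}}_2}{\sqrt{{s}_1^2/{n}_1+{s}_2^2/{n}_2}},\hskip2em {\nu}_W\approx \frac{{\left({s}_1^2/{n}_1+{s}_2^2/{n}_2\right)}^2}{\frac{{\left({s}_1^2/{n}_1\right)}^2}{n_1-1}+\frac{{\left({s}_2^2/{n}_2\right)}^2}{n_2-1}} $$

With a common sample size $ {n}_1={n}_2=n $ , as is the case for the filtered data, $ {t}_W $ coincides with $ t $ in Equation 2 and $ {\nu}_W=\left(n-1\right){\left({s}_1^2+{s}_2^2\right)}^2/\left({s}_1^4+{s}_2^4\right) $ , which reduces to $ 2\left(n-1\right) $ when $ {s}_1^2={s}_2^2 $ . The two tests then differ only through the degrees of freedom, and hence negligibly for large $ n $ .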

3.3. Regression and adjusted regression plots

For each of units U₁ and U₂ on trains Tr₁ and Tr₂, respectively, in turn, we establish linear regression models for Q5 in terms of Q3, T3, and P4 of the form

(4)

$$ \mathrm{Q}5=f\left(\mathrm{Q}3,\mathrm{T}3,\mathrm{P}4\right)+\varepsilon $$

for regression function $ f $ , where $ \varepsilon $ is assumed to be a zero-mean Gaussian random variable with unknown standard deviation. Here, we assume that $ f $ takes the linear form

(5)

$$ f(\mathrm{Q}3,\mathrm{T}3,\mathrm{P}4)=a+b\ \mathrm{Q}3+c\ \mathrm{T}3+d\ \mathrm{P}4 $$

for parameters $ a $ , $ b $ , $ c $ , and $ d $ to be estimated. Following DuMouchel (1988), we then use adjusted response or adjusted regression plots to quantify the effects of individual treatment variables (more naturally referred to as covariates in a regression context) in regression models for Q5 in terms of Q3, T3, and P4, for each of trains Tr₁ and Tr₂. In essence, these are generalizations of partial residual and augmented partial residual plots (Mallows, 1986), useful for linear regression models with arbitrary power and interaction terms. The fitted regression function $ \hat{f} $ from Equation 5 is

(6)

$$ \hat{f}\left(\mathrm{Q}3,\mathrm{T}3,\mathrm{P}4\right)=\hat{a}+\hat{b}\;\mathrm{Q}3+\hat{c}\;\mathrm{T}3+\hat{d}\;\mathrm{P}4 $$

where $ \hat{\bullet} $ represents an estimate. The corresponding residuals from the regression form the set $ {\left\{{r}^i\right\}}_{i=1}^n $ , with

(7)

$$ {r}^i=\mathrm{Q}{5}^i-\hat{f}\left(\mathrm{Q}{3}^i,\mathrm{T}{3}^i,\mathrm{P}{4}^i\right)\hskip1em \mathrm{for}\hskip0.24em i=1,2,\dots, n $$

where $ {\left\{\mathrm{Q}{3}^i,\mathrm{T}{3}^i,\mathrm{P}{4}^i\right\}}_{i=1}^n $ is the set of values of Q3, T3, and P4 in the filtered data sample used for regression model fitting.

Next, adjusted fit functions are calculated for each of Q3, T3, and P4 in turn. For example in the case of Q3, the adjusted fit function is the average value of $ \hat{f} $ , expressed as a function of Q3, over all n observations in the data sample

(8)

$$ {g}_{Q3}(q)=\frac{1}{n}\sum \limits_{i=1}^n\hat{f}\left(q,\mathrm{T}{3}^i,\mathrm{P}{4}^i\right). $$

Similar adjusted fit functions can be derived for each covariate in each train in turn. Finally, the set $ {\left\{\tilde{Q}{5}_{Q3}^i\right\}}_{i=1}^n $ of adjusted response values for Q5 with respect to Q3 is calculated using

(9)

$$ \tilde{Q}{5}_{Q3}^i={g}_{Q3}\left(\mathrm{Q}{3}^i\right)+{r}^i\hskip1em \mathrm{for}\hskip0.24em i=1,2,\dots, \mathrm{n} $$

where $ {\left\{{r}^i\right\}}_{i=1}^n $ are the residuals from the full regression (Equation 4). Similar sets of adjusted response values can be calculated for response Q5 with respect to each covariate in each train in turn.
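The steps in Equations 5 to 9 can be sketched as follows for the linear model and the covariate Q3. This is an illustrative outline under stated assumptions, not the authors' code: the function names are hypothetical, the toy data are invented, and the least-squares fit solves the normal equations directly, which is adequate for a small well-scaled example but would normally be replaced by a library least-squares routine for real plant data.

```python
from statistics import mean


def fit_linear(Q3, T3, P4, Q5):
    """Least-squares estimates [a, b, c, d] for Q5 = a + b*Q3 + c*T3 + d*P4."""
    rows = [[1.0, q, t, p] for q, t, p in zip(Q3, T3, P4)]
    m = 4
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
         + [sum(r[i] * y for r, y in zip(rows, Q5))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][m] / A[i][i] for i in range(m)]


def f_hat(coef, q3, t3, p4):
    """Fitted regression function (Equation 6)."""
    a, b, c, d = coef
    return a + b * q3 + c * t3 + d * p4


def adjusted_response_q3(coef, Q3, T3, P4, Q5):
    """Adjusted response values for Q5 with respect to Q3 (Equations 7 to 9)."""
    n = len(Q5)
    resid = [Q5[i] - f_hat(coef, Q3[i], T3[i], P4[i]) for i in range(n)]

    def g_q3(q):
        # Average the fitted function over the observed T3 and P4 values (Equation 8).
        return mean(f_hat(coef, q, T3[i], P4[i]) for i in range(n))

    return [g_q3(Q3[i]) + resid[i] for i in range(n)]


# Invented toy data (not plant data), with small deviations from an exact linear law.
Q3 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
T3 = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0]
P4 = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.0, 0.0]
Q5 = [2.0 + 0.5 * q + 1.5 * t - 3.0 * p + e for q, t, p, e in zip(Q3, T3, P4, noise)]

coef = fit_linear(Q3, T3, P4, Q5)
adjusted = adjusted_response_q3(coef, Q3, T3, P4, Q5)
print([round(c, 3) for c in coef])
```

For the linear form in Equation 5, the adjusted fit function reduces to a straight line in Q3 with slope $ \hat{b} $ , evaluated at the sample means of T3 and P4; the averaging in `g_q3` is written generally so that the same sketch would apply if `f_hat` included power or interaction terms.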

Adjusted response values for Q5 with respect to each of Q3, T3, and P4 are shown in Figure 6, for train Tr₁ (blue) and Tr₂ (red). The anticipated directions of the trends of Q5 with covariates are seen in each case. However, despite the trains being nominally replicates, the magnitudes of gradients are larger for train Tr₁ regardless of covariate. Briefly, Q5 is more sensitive to changes in covariates for train Tr₁. To achieve unit reduction in Q5, the reduction in Q3 (and/or T3) needed in Tr₁ is smaller than that needed in Tr₂. This is potentially a valuable handle with which to reduce the need for flaring.

Figure 6. Adjusted response values for Q5 with respect to Q3, T3, and P4 for U₁ (blue circles) and U₂ (red circles). Corresponding adjusted fit functions $ g $ are shown as black lines.

Note that the adjusted regression methodology is applicable generally, regardless of the form of the regression function in Equation 4.
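The calculations in Equations 7 to 9 can be sketched in a few lines of Python. This is a minimal illustration only: the fitted function `f_hat` and the toy data below are hypothetical stand-ins (the plant data are confidential and the fitted form of Equation 4 is not reproduced here), and the function names are our own.

```python
from statistics import fmean


def residuals(f_hat, q3, t3, p4, q5):
    """Equation 7: r^i = Q5^i - f_hat(Q3^i, T3^i, P4^i)."""
    return [y - f_hat(a, b, c) for a, b, c, y in zip(q3, t3, p4, q5)]


def adjusted_fit_q3(f_hat, t3, p4):
    """Equation 8: g_Q3(q), the sample average of f_hat(q, T3^i, P4^i)."""
    def g(q):
        return fmean(f_hat(q, b, c) for b, c in zip(t3, p4))
    return g


def adjusted_response_q3(f_hat, q3, t3, p4, q5):
    """Equation 9: adjusted response g_Q3(Q3^i) + r^i for each observation."""
    g = adjusted_fit_q3(f_hat, t3, p4)
    r = residuals(f_hat, q3, t3, p4, q5)
    return [g(a) + ri for a, ri in zip(q3, r)]


# Hypothetical fitted model and toy data, for illustration only.
def f_hat(q3, t3, p4):
    return 2.0 * q3 + 0.5 * t3 - 1.0 * p4


q3 = [1.0, 2.0, 3.0]
t3 = [10.0, 12.0, 14.0]
p4 = [1.0, 1.0, 2.0]
q5 = [6.5, 8.5, 11.5]

adj = adjusted_response_q3(f_hat, q3, t3, p4, q5)
```

Because `f_hat` is passed in as an arbitrary callable, the same functions apply unchanged to a non-linear regression function; analogous functions for T3 and P4 follow by averaging over the other two covariates.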

3.4. Implementation of recommendations

Given the findings above, trials were conducted on the liquefaction trains to evaluate how different adjustments to the set-points of manipulated variables on the end flash units U₁ and U₂ (of Tr₁ and Tr₂) affect flaring. In the first period (“Period 1”), each time the flare valve was on the verge of opening, a common reduction of T3 was made for both trains, followed by a common reduction of Q3 if necessary. In the second period (“Period 2”), preferential treatment was given to Tr₁: T3₁ was reduced first, followed in turn by Q3₁, T3₂, and Q3₂ if flaring persisted. P4 was not used as a handle during the trial. Results are shown in Figure 7. Panels show the mean flare valve opening in Periods 1 (left) and 2 (right) as a function of the mean T3 (x-axis) and total Q3 (y-axis). The figure indicates a reduction in High and Medium flare valve opening in Period 2 compared with Period 1, resulting in a reduction of up to 45% in flaring-related CO₂ emissions. Polygons (magenta) in each panel show approximate ranges for mean T3 and total Q3 within which the risk of High or Medium flaring is low. The area of the polygon for Period 2 is considerably larger than for Period 1, indicating that reducing T3 and Q3 for Tr₁ before those for Tr₂ is advantageous in reducing FG flaring.

Figure 7. Flare valve opening, ranging from High, Medium to Low for Period 1 (left) and Period 2 (right) as a function of a mean of T3 and a sum of Q3 from Tr₁ and Tr₂. In Period 1, simultaneous and equal reductions were made, first for T3 and subsequently if necessary for Q3, for both trains at the point of flare onset. In Period 2, T3₁ and then Q3₁ (if necessary) were reduced first, followed (if necessary) by T3₂ and Q3₂. Polygons show domains of mean T3 and total Q3 corresponding to low risk of High and Medium flare opening.
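The Period 2 ordering of handles can be illustrated with a short Python sketch. It assumes, purely for illustration, a locally linear response of Q5 to each handle; the sensitivities, step limits, and the function name are hypothetical and are not values from the trial. Only the priority order (T3₁, Q3₁, T3₂, Q3₂) is taken from the text.

```python
def prioritised_reductions(excess, handles, tol=1e-12):
    """Step through handles in priority order until the excess Q5 is removed.

    excess  : predicted surplus of end flash gas that would otherwise be flared
    handles : list of (name, sensitivity, max_step), where sensitivity is the
              assumed reduction in Q5 per unit reduction of the handle
    Returns the list of (name, step) reductions applied and any remaining excess.
    """
    plan = []
    for name, sensitivity, max_step in handles:
        if excess <= tol:
            break  # flaring no longer expected; leave remaining handles alone
        step = min(max_step, excess / sensitivity)
        plan.append((name, step))
        excess -= step * sensitivity
    return plan, max(excess, 0.0)


# Hypothetical sensitivities: Tr1 handles are assumed more effective than Tr2.
handles = [
    ("T3_1", 2.0, 1.0),
    ("Q3_1", 1.0, 2.0),
    ("T3_2", 1.0, 1.0),
    ("Q3_2", 0.5, 2.0),
]

plan, remaining = prioritised_reductions(3.0, handles)
```

With these illustrative numbers the excess is removed using the Tr₁ handles alone, so the Tr₂ set-points are left untouched.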

4. Discussion and conclusions

This article demonstrates that differences in the operating characteristics of nominally replicated units at an LNG facility can be exploited to improve the overall performance of the facility, in particular by minimizing flaring. We demonstrate that careful exploratory analysis can be used to identify differences in operating characteristics and that statistical hypothesis testing lends weight to findings from exploratory work. We then show that simple regression models can be used to illustrate and quantify differences in unit performance. Finally, we demonstrate, by modifying operating practices at the “live” LNG facility, that the recommendations of the statistical analysis provide clear material benefit. We emphasize that this article exploits real operational data from the full-scale LNG facility.

The statistical analysis conducted here is elementary but sound. Indeed, we hope the current work demonstrates the real-world benefits available from the careful application of straightforward statistical thinking and method: complicated models are not always necessary for process improvement in manufacturing. Nevertheless, there are numerous ways in which the current analysis can be improved. For example, preliminary analysis suggests there may be some benefit from consideration of seasonal trends in the relative operating characteristics of the flash vessel units.

Specifically, our study of end flash vessels from two trains at an LNG facility has shown that statistically significant differences in train performance can exist even though trains may be “exact copies” of each other from a design perspective. In fact, although the end flash vessels in the two trains are designed to identical specifications, in operation they may not perform equivalently for a number of reasons. For example, they may be exposed to different localized variations in ambient conditions, causing variation in end flash gas produced. We actually observe the flow rate of end flash gas (EFG, Q5) produced from one end flash vessel to be 2.8%–6.4% higher than from the other replicate unit. As a result, at the onset of EFG flaring, the standard practice of reducing NG input temperature (T3) and flow rate (Q3) simultaneously and equally to the main cryogenic heat exchangers of the two trains to minimize flaring is demonstrably not the best practice. An improved procedure based on the current statistical analysis first reduces T3 and Q3 for the train whose EFG production is more sensitive to operating conditions. When this strategy was followed at the LNG facility, flaring-related CO₂ emissions were reduced by up to 45% compared with standard practice, noting that flaring emissions for a typical LNG facility account for ~4%–8% of the overall facility emissions.

Insights from analysis of operational data cannot be obtained from simulation studies of model trains with identical designs. We hope the current article serves as motivation for the wider use of data-informed and data-driven approaches for improved efficiency in manufacturing.

5. Lessons learned

• Statistical analysis of recent operational data from production units is a useful source of information to improve the operation of those units.
• Numerical simulations are useful to design production units, but may not give the full picture regarding the real-world operation of the units. Production units that are nominally thought to be equivalent from a design perspective often diverge in performance in reality.
• Empirical models can be used to identify and exploit differences in unit operating characteristics, to further optimize the overall performance of manufacturing facilities comprising multiple units.
• We show that differential operation of LNG trains leads to reduced medium and high-intensity flaring and hence a reduction of CO₂ flaring emissions by up to 45% for the LNG facility.
• The methodology presented is demonstrated for LNG facilities, but we believe the approach would be useful generally.

Acknowledgments

We acknowledge the support of colleagues at Shell, particularly Jasper Stolte, Bei Li, and Sharan Nair, during the project execution and publication of this article.

Data availability statement

Data are withheld for reasons of confidentiality.

Author contribution

Conceptualization: RPa. Methodology: RPa, MJ, PJ. Data curation: OE, JA. Data visualization: RPa, MJ. Chemical engineering expertise: RPi, OE, JA. Writing original draft: RPa, PJ. All authors approved the final submitted draft.

Funding statement

No external funding sources are associated with this research.

Competing interest

None.

Ethical standard

The research meets all ethical guidelines, including adherence to the legal requirements of the study country.

Appendix A: Annual summary statistics for Q5 from trains Tr₁, Tr₂ for years 2015–2019

This appendix gives summary statistics for normalized filtered Q5 from trains Tr₁, Tr₂ for years 2015–2019, corresponding to Figure 4. These values are also used in the statistical testing reported in Section 3.

Table A1: Summary statistics of samples of filtered Q5 values for train Tr₁ over years 2015 to 2019. Values have been normalized using a common factor so that the global maximum value (over both trains and all years) is 100 T/d

Table A2: Summary statistics of filtered Q5 values for train Tr₂ over years 2015 to 2019. Values have been normalized using a common factor so that the global maximum value (over both trains and all years) is 100 T/d

Appendix B: One-tailed, two-sample Welch’s t-test for unequal variances

In the notation of Section 3.2, the expression for Welch’s t-test statistic (Welch, 1947) to compare the means of populations $ {X}_1 $ and $ {X}_2 $ with unequal population variances, but equal sample size $ n $ , is the same as that given in Equation 2. The degrees of freedom $ \nu $ of the t-distribution are, however, different, given by Satterthwaite’s approximation (Satterthwaite, 1946) as

(10)

$$ \nu =\frac{\left(n-1\right){\left({s}_1^2+{s}_2^2\right)}^2}{s_1^4+{s}_2^4} $$

where $ {s}_1 $ and $ {s}_2 $ are the corrected sample standard deviations for the two groups. Welch’s t-test is more conservative in estimating $ \nu $ , in that $ \nu \le 2\left(n-1\right) $ with equality only when $ {s}_1={s}_2 $ . The corresponding table of results using Welch’s t-test (cf. Table 1) is given in Table B1.

Table B1: Welch’s t-test for population mean difference $ {\overline{\mathrm{Q}5}}_1 $ - $ {\overline{\mathrm{Q}5}}_2 $ per annum, assuming unequal population variances. Null hypothesis rejected for each year since LCL > 0. Note that the critical value $ {t}_{\mathrm{crit},\nu }(0.95) $ at infinite sample size is adopted as a good approximation, since $ n>1000 $ throughout
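As a hedged illustration of the calculation behind Table B1, the sketch below evaluates Equation 10 together with the standard Welch statistic for equal sample sizes (assumed here to match Equation 2) and a one-sided lower confidence limit (LCL) using the large-sample critical value. The function name and the input values are hypothetical and are not taken from the plant data.

```python
from math import sqrt
from statistics import NormalDist


def welch_equal_n(mean1, mean2, s1, s2, n, level=0.95):
    """One-tailed Welch comparison of two samples of equal size n.

    Returns the t statistic, Satterthwaite's degrees of freedom (Equation 10),
    and the one-sided lower confidence limit for mean1 - mean2.
    """
    diff = mean1 - mean2
    se = sqrt((s1**2 + s2**2) / n)  # standard error of the mean difference
    t_stat = diff / se
    nu = (n - 1) * (s1**2 + s2**2) ** 2 / (s1**4 + s2**4)
    # Critical value at infinite sample size (standard normal quantile),
    # adopted as an approximation when n is large.
    t_crit = NormalDist().inv_cdf(level)
    lcl = diff - t_crit * se
    return t_stat, nu, lcl


# Hypothetical summary statistics, for illustration only.
t_stat, nu, lcl = welch_equal_n(mean1=81.0, mean2=80.0, s1=1.0, s2=1.0, n=1001)
```

With equal sample standard deviations, as in this toy example, Equation 10 reduces to $ 2\left(n-1\right) $; the null hypothesis of no positive mean difference is rejected when the LCL exceeds zero.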

References

Alabdulkarem, A, Mortazavi, A, Hwang, Y, Radermacher, R and Rogers, P (2011) Current status and perspectives of liquefied natural gas (lng) plant design. Applied Thermal Engineering 31, 1091–1098.

Ali, W, Qyyum, MA, Qadeer, K and Lee, M (2018) Energy optimization for single mixed refrigerant natural gas liquefaction process using the metaheuristic vortex search algorithm. Applied Thermal Engineering 129, 782–791.

Austbo, B, Lovseth, SW and Gundersen, T (2014) Annotated bibliography - Use of optimization in LNG process design and operation. Computers and Chemical Engineering 71, 391–414.

Bassioni, G and Klein, H (2024) Liquefaction of natural gas and simulated process optimization: A review. Ain Shams Engineering Journal 15, 102431.

Castillo, L, Dahouk, MM, Scipio, SD and Dorao, C (2013) Conceptual analysis of the precooling stage for LNG processes. Energy Conversion and Management 66, 41–47.

DuMouchel, W (1988) Graphical representations of main effects and interaction effects in a polynomial regression on several predictors. In DTIC ADA208838: Computing Science and Statistics: Proceedings of the 20th Symposium on the Interface: Computationally Intensive Methods in Statistics. Fairfax, Virginia, pp. 127–132.

Evans, M, Hastings, N and Peacock, B (2000) Statistical Distributions. Wiley.

Ferrin, JL and Perez-Perez, LJ (2020) Numerical simulation of natural convection and boil-off in a small size pressurized LNG storage tank. Computers and Chemical Engineering 138, 106840.

Hafner, M and Luciani, G (eds.) (2022) The Palgrave Handbook of International Energy Economics.

Hasan, M, Karimi, I and Alfadala, H (2009a) Optimizing compressor operations in an LNG plant. Proceedings of the 1st Annual Gas Processing Symposium 48, 179–184.

Hasan, MMF, Zheng, AM and Karimi, IA (2009b) Minimizing boil-off losses in liquefied natural gas transportation. Industrial & Engineering Chemistry Research 48, 9571–9580.

International Energy Agency (2019) The role of gas in today’s energy transitions.

International Energy Agency (2022) World energy outlook.

Jackson, S, Eiksund, O and Broda, E (2017) Impact of ambient temperature on LNG liquefaction process performance: Energy efficiency and CO2 emissions in cold climates. Industrial and Engineering Chemistry Research 56, 3388–3398.

Jin, C, Lim, Y and Xu, X (2023) Performance analysis of a boil-off gas re-liquefaction process for LNG carriers. Energy 278, 127823.

Katebah, MA, Hussein, MM, Al-musleh, EI and Almomani, F (2023) A systematic optimization approach of an actual LNG plant: Power savings and enhanced process economy. Energy 269, 126710.

Kurle, YM, Wang, S and Xu, Q (2017) Dynamic simulation of LNG loading, BOG generation, and BOG recovery at LNG exporting terminals. Computers and Chemical Engineering 97, 47–58.

Lim, W, Choi, K and Moon, I (2012) Current status and perspectives of liquefied natural gas (LNG) plant design. Industrial and Engineering Chemistry Research 52, 3056–3088.

Mallows, CL (1986) Augmented partial residuals. Technometrics 28, 313–319.

Marshall, G and Jonker, L (2011) An introduction to inferential statistics: A review and practical guide. Radiography 17, 1–6.

Satterthwaite, F (1946) An approximate distribution of estimates of variance components. Biometrics Bulletin 2, 110–114.

Shin, K, Son, S, Moon, J, Jo, Y, Kwon, JS and Hwang, S (2022) Dynamic modeling and predictive control of boil-off gas generation during LNG loading. Computers and Chemical Engineering 160, 107698.

Welch, BL (1947) The generalisation of “student’s” problem when several different population variances are involved. Biometrika 34, 28–35.

Widodo, A and Muharam, Y (2023) Simulation of boil-off gas recovery and fuel gas optimization for increasing liquefied natural gas production. Energy Reports 10, 4503–4515.

Wu, S and Ju, Y (2021) Numerical study of the boil-off gas (BOG) generation characteristics in a type C independent liquefied natural gas (LNG) tank under sloshing excitation. Energy 223, 120001.
