Modelling non-stationary annual maximum flood heights in the lower Limpopo River basin of Mozambique

In this article we fit a time-dependent generalised extreme value (GEV) distribution to annual maximum flood heights at three sites: Chokwe, Sicacate and Combomune in the lower Limpopo River basin of Mozambique. A GEV distribution is fitted to six annual maximum time series models at each site, namely: annual daily maximum (AM1), annual 2-day maximum (AM2), annual 5-day maximum (AM5), annual 7-day maximum (AM7), annual 10-day maximum (AM10) and annual 30-day maximum (AM30). Non-stationary time-dependent GEV models with a linear trend in location and scale parameters are considered in this study. The results show lack of sufficient evidence to indicate a linear trend in the location parameter at all three sites. On the other hand, the findings in this study reveal strong evidence of the existence of a linear trend in the scale parameter at Combomune and Sicacate, whilst the scale parameter had no significant linear trend at Chokwe. Further investigation in this study also reveals that the location parameter at Sicacate can be modelled by a nonlinear quadratic trend; however, the complexity of the overall model is not worthwhile in fit over a time-homogeneous model. This study shows the importance of extending the time-homogeneous GEV model to incorporate climate change factors such as trend in the lower Limpopo River basin, particularly in this era of global warming and a changing climate.


Introduction
There is a general notion that the occurrence of extreme events has changed over these recent years and is anticipated to continue to change in terms of intensity, frequency and complexity of the risks. These recent changes are mainly attributed to global warming and natural modes of interannual and interdecadal variability, such as the El Niño phenomenon (Katz 2010;Katz, Parlange & Naveau 2002;Towler et al. 2010). The anticipated climate-induced changes are of major concern as they have the potential to render our estimates biased and/or useless, particularly those estimates based on traditional approaches that do not take climate changes into consideration. These probable climate changes can also cause negative societal impacts and disruptions, for instance, destruction of schools, children dropping out of schools leading to early marriages, particularly for girls, and thus creating a vicious poverty circle in the community (Katz 2010;Mudavanhu 2014). According to Katz (2010), previous studies in extreme value theory have shown that the frequency of all forms of extreme events, whether in the form of a single value or a sequence of annual maxima, is more sensitive to variations in the scale parameter (or, in particular, the standard deviation) than to the location parameter (or mean) of a distribution. Cooley (2009) wrote a commentary on the potential application of statistics of extremes to climate change based on the previous work of Wigley.
The annual maximum series (AMS), also known as block maxima, has long been employed to estimate the distribution of extreme events such as flood flows, precipitation and wind speeds. The time-homogeneous generalised extreme value (GEV) distribution, which uses standard properties of the likelihood function, has traditionally been used in designing flood estimation (Coles 2001;Maposa et al. 2014b;Towler et al. 2010). The use of a stationary GEV distribution assumes that climate changes and all other variables that may affect the validity of the estimation of design floods remain constant over time. In his wisdom, Gumbel (1941:187), the statistician and pioneer in the application of statistics of extremes to hydrology and other various fields, cautioned that: In order to apply any theory we have to suppose that the data are homogeneous, i.e., no systematical change of climate and important change in the basin have occurred within the observation period and that no such change will take place in the period for which such extrapolations are made. (Katz 2010:71;Katz et al. 2002:188) In this article we fit a time-dependent generalised extreme value (GEV) distribution to annual maximum flood heights at three sites: Chokwe, Sicacate and Combomune in the lower Limpopo River basin of Mozambique. A GEV distribution is fitted to six annual maximum time series models at each site, namely: annual daily maximum (AM1), annual 2-day maximum (AM2), annual 5-day maximum (AM5), annual 7-day maximum (AM7), annual 10-day maximum (AM10) and annual 30-day maximum (AM30). Non-stationary time-dependent GEV models with a linear trend in location and scale parameters are considered in this study. The results show lack of sufficient evidence to indicate a linear trend in the location parameter at all three sites. On the other hand, the findings in this study reveal strong evidence of the existence of a linear trend in the scale parameter at Combomune and Sicacate, whilst the scale parameter had no significant linear trend at Chokwe. Further investigation in this study also reveals that the location parameter at Sicacate can be modelled by a nonlinear quadratic trend; however, the complexity of the overall model is not worthwhile in fit over a time-homogeneous model. This study shows the importance of extending the time-homogeneous GEV model to incorporate climate change factors such as trend in the lower Limpopo River basin, particularly in this era of global warming and a changing climate.
Without loss of generality, it is easy to understand that the assumption of homogeneity of climatic conditions and other important changes in the basin cannot hold forever. In other words, it is inevitable that climatic conditions change over time. The traditional fitting of the time-homogeneous GEV distribution also assumes that the observations are independent and identically distributed (i.i.d.). According to Katz et al. (2002), stationarity implies identicalness and not necessarily independence.
Dr Walter J. Ammann, chairman of the recent International Disaster and Risk Conference held in Davos, Switzerland, 24-28 August 2014, attested that the scope, intensity and complexity of risks as well as the frequency of natural hazards such as floods, earthquakes and forest fires are on the rise in these recent years (IDRC Davos 2014). The increase in the frequency of floods is also supported by a unique survey of 139 national meteorological and hydrological services carried out by the World Meteorological Organisation in 2013, which revealed that floods were the most frequently experienced extreme events worldwide over the course of the decade 2001-2010 (Mudavanhu 2014;WMO 2013). Floods and droughts account for 90% of all the people who are affected by natural disasters (Mudavanhu 2014;Smakhtin 2014). According to Munich Re (2013), the natural catastrophic statistics for the year 2013 was dominated by floods that caused billions of American dollars in losses. In Arya, Boen and Ishiyama (2014), the Director-General of UNESCO, Irina Bokova, stated that: Every year, more than 200 million people are affected by natural hazards, and the risks are increasing -especially in developing countries, where a single major disaster can set back healthy economic growth for years. As a result, approximately one trillion dollars have been lost in the last decade alone. This is why disaster risk reduction is so essential. Mitigating disasters requires training, capacity building at all levels, and it calls for a change of thinking to shift from postdisaster reaction to pre-disaster action -this is UNESCO's position. (p. 6) Although it is clear from literature that the intensity and frequency of floods have increased over the recent years, it is not clear whether the magnitudes of floods, that is, flood heights, have also increased. If the magnitudes of flood heights have increased over the years, it is expected that the location parameter of the GEV, which is associated with the mean estimate of the distribution, should increase with increase in time. On the other hand, if there is no gradual increase in flood heights over the years and the sporadic extremely high floods are nearly or purely random, then we expect the scale parameter, which is associated with dispersion from the central location, to vary with time ( Figure 1). Based on literature cited in the study, there is overwhelming evidence of covariates such as longterm trends or cycles in recent years attributed to manmade activities, which may lead to global warming and atmosphere-ocean circulation patterns such as the El Niño phenomenon, which undermine the long-held traditional assumption of stationarity (Towler et al. 2010;Vasiliades, Galiatsatou & Loukas 2015). Evidence of non-stationarity is exhibited in Figure 1 at all the three sites: Chokwe, Combomune and Sicacate of the lower Limpopo River of Mozambique. However, a visual inspection of Figure 1 shows no apparent trend. Figure 1 reveals that in the year 2000 flood height was a very rare extreme event at all the three sites.
Source: Authors' own construction The present study considers a non-stationary time-dependent GEV distribution model whose location and scale parameters are expected to vary linearly or nonlinearly with time ( Figure 1), whilst the shape parameter remains constant over time. This investigation is carried out for the lower Limpopo River basin (LLRB) of Mozambique. Detailed studies on the goodness of fit of the time-homogeneous GEV distribution in the basin are found in the studies by Maposa, Cochran and Lesaoana (2014a) and Maposa et al. (2014b). In the present study, we advocate for a statistical modelling approach based on maximum likelihood (ML) estimation in the possible presence of covariates. These covariates can be in the form of trends, cycles and physical variables such as El Niño (Katz et al. 2002). The covariate of particular interest in the present study is the trend. To the best of our knowledge, no similar work in the previous studies relating to statistics of extremes in a changing climate have been done for the LLRB of Mozambique.
According to Katz et al. (2002), although the probability weighted moments, also known as L-moments, are more popular in the application of hydrological extremes compared to ML estimation mainly because of their computational simplicity and better performance for small samples where ML is often inconsistent, the probability weighted moment technique has the demerit of being unable to readily incorporate covariates (see also Ferreira & De Haan 2015).
On the other hand, the application of ML technique in the presence of covariates is straightforward in both block maxima and peaks-over-threshold (POT) approaches (Katz et al. 2002).
The outline of the rest of the article is such that Section 2 presents the research methodology, Section 3 presents the results and discussion of the findings, and finally Section 4 gives the concluding remarks.

Research methodology
The section presents the sequential steps taken to sort the data into the block maxima series (Ferreira & De Haan 2015), briefly discuss the probability framework of block maxima including the extension of time-homogeneous GEV model to linear and quadratic trend models.

Study sites and data
Mozambique National Directorate of Water, the authority responsible for water management in Mozambique in the Ministry of Public Works and Housing, provided the data used in the study. The data are hydrometric daily flood heights (in metres) recorded at the sites Chokwe , Combomune (1966Combomune ( -2010 and Sicacate (1952Sicacate ( -2010, which are hydrometric stations for the lower Limpopo River of Mozambique (Maposa et al. 2014a(Maposa et al. , 2014b. The three sites are such that Combomune is located in the upper part of the basin about 162 km from the border with South Africa and Zimbabwe, Chokwe is located in the middle of the basin about 130 km downstream of Combomune and Sicacate is further downstream of Chokwe in the lower part of the basin on way to the sea.

Moving sums and block maxima
The raw data at the three sites were originally recorded as daily flood heights (or water levels). The data records at some sites stretch back to as far as 1930s. However, because of missing values, the records used in the study are for the period 1951-2010 for Chokwe, 1966-2010 for Combomune and 1952-2010 for Sicacate ( Figure 1). In order to obtain AMS, sequential steps were taken to obtain the highest flood peak in each hydrological year (or block). Further steps were taken to obtain annual maximum (AM) flood heights of the moving sums of 2, 5, 7, 10 and 30 days. Finally, the following AM time series models were obtained: annual daily maximum (AM1), annual 2-day maximum (AM2), annual 5-day maximum (AM5), annual 7-day maximum (AM7), annual 10-day maximum (AM10) and annual 30-day maximum (AM30). The procedure to obtain these cumulative AM time series models was necessitated by the need to investigate whether the cumulative annual floods have any significant effect on the long-term linear or quadratic trend in location, scale or both.
In statistics of extremes, there are two fundamental approaches used in flood frequency analysis, namely, block maxima (or AMS) and POT (or partial duration series) (Ferreira & De Haan 2015). The approach used in the study is block maxima.
In hydrological studies, when sample sizes are large, it is natural to block observations by years (Ferreira & De Haan 2015;Maposa et al., 2014b). The data used in this study have sufficiently large AM records extending over 40 years at each site. In flood frequency analysis, the block maxima approach is commonly used ahead of POT when data records have sufficiently large sample sizes and the data quality is adequate (Ferreira & De Haan 2015). The GEV distribution arises naturally when modelling block maxima, whereas for POT the generalised Pareto distribution is commonly used (Ferreira & De Haan 2015).

Extreme value models
Comprehensive details of probability framework of block maxima and the practical reasons for using block maxima over POT are given by Ferreira and De Haan (2015). Dombry (2013) proved the consistency of ML estimators when using block maxima approach.
We are already familiar with the background of extreme value theory, beginning with the limiting distributions of Fisher and Tippett (1928) and advanced theory and applications in Coles (2001). Let (X i ) i≥1 be i.i.d. random variables with common distribution function F ∈ D(G ξ ) and corresponding normalisation sequences of constants a m > 0 and b m such that: where n is the total number of observations (Ferreira & De Haan 2015). Then there exists a non-degenerate function G such that for a fixed block length m ≥ 1, the variables As suggested by Eqn (2), an approximation of the distribution of M k,m , or the distribution function F m , is the GEV distribution with parameters a m , b m and ξ estimated by the ML method in the study (Dombry 2013;Ferreira & De Haan 2015). The GEV cumulative distribution function, G, is given in Eqn (3) as: where µ, σ and ξ are the location, scale and shape parameters, respectively. The model in Eqn (3) is the time-homogeneous GEV model. We shall call it model M 0 and it shall be used as the reference model such that all other extended models are compared to it for their significance.

Model choice
One important question to answer is whether the nonstationary model provides an improvement in fit over the time-homogeneous model M 0 ; that is, is it worthwhile to have the non-stationary model? The ML estimation of nested models uses a simple procedure called the deviance (D) statistic to compare one model against the other. In the study, the time-homogeneous GEV model, M 0 , is a special case of the time-dependent models M 1 , M 2 , M 3 and M 4 . In general, consider M 0 ⊂ M i,∀i=1,2,3,4 , then we define deviance statistic, D, as in Eqn (7): where l i (M i ) and l 0 (M 0 ) are the maximised negative loglikelihood for models M i,∀i=1,2,3,4 and M 0 , respectively. D has a chi-square (c 2 k,a ) asymptotic distribution, with k degrees of freedom tested at α (=0.05 or 5%) level of significance, where k is the difference in dimensionality (or difference in number of parameters) of Mi and M 0 . Thus, D is compared to critical values of χ 2 k,α , where D > c 2 k,α suggests that model M i explains substantially more of the variability in the data than model M 0 .

Results and discussion
In order to avoid presenting too many tables in the article, only tables for the AM1 time series data are presented for each of the three sites Chokwe, Combomune and Sicacate in Tables 1, 2 and 3 respectively. The results for the AMS moving sums for each site are only discussed in detail if there is discrepancy with the AM1 results, else they are simply http://www.jamba.org.za Open Access mentioned if there is consistency. The interested reader can obtain results for the AMS moving sums upon request from the corresponding author. The order of the models is maintained for the AMS moving sums, for example, for AM2 time series data model M 1 still refers to a time-dependent GEV model with a linear trend in both the location and scale parameters as in AM1.

Chokwe models
Consider the pair of models (M 0 , M 1 ) from Table 1, where M 0 is taken as the reference model, c 2 2,0.05 = 5.991, D = 2(−125.802− (−126.313)) = 2(126.313−125.802) = 1.022 and the likelihood ratio test for m 1 = 0 has p-value = 0.4928 and for σ 1 = 0 has p-value = 0.1676 Because D is too small compared to the critical value (5.991) and the likelihood ratio test is not significant at the 5% level of significance (p-value > 0.05) for both the location and scale parameters, it clearly shows that the non-stationary model is not important and does not give any improvement in fit over the time-homogeneous GEV model. Similar insignificant results were obtained for the AMS moving sums AM2, AM5, AM7, AM10 and AM30 for model M 1 . Table 1, (M 0 , M 2 ) and (M 0 , M 3 ), have D = 0.01 and 1.022, respectively, with a critical value of χ 2 1,0.05 = 3.841 for both pairs. The likelihood ratio test for µ 1 = 0 has p-value =0.4591 and σ 1 = 0 has p-value = 0.1653 for M 2 and M 3 , respectively, which is insignificant for both models at the 5% level of significance. The D statistic is again too small (< 3.841) for both models, implying that both models do not provide any improvement in fit over the time-homogeneous GEV model. Similar insignificant results were exhibited for the AMS moving sums AM2, AM5, AM7, AM10 and AM30 for models M 2 and M 3 .

The other pairs from
The quadratic model pair (M 0 , M 4 ) in Table 1 has a D statistic value of 0.006 with a critical value of χ 2 2 , 0.05 = 5.991, implying that model M 4 does not provide any improvement in fit to justify its importance over the time-homogeneous model. The likelihood ratio tests for µ 1 = 0 and µ 2 = 0 are also not significant at the 5% significance level ( p-value > 0.05). Again, similar results were obtained for AM2, AM5, AM7, AM10 and AM30.
In general, the results from Chokwe showed that the prevailing model for the site is the time-homogeneous GEV model given by Eqn (3). In other words, time is not an important factor for the AMS data at Chokwe. The general model estimate for Chokwe is given in Eqn (8) where x = x i,∀i=1,2,…,k is the AM flood height. The diagnostic plots for the time-homogeneous model in Eqn (8) are presented in Figure 2. The diagnostic plots in Figure 2 show that the model is of good fit, with the exception of the year 2000 flood height which falls slightly outside the confidence limits.

Combomune models
We start by considering the pair (M 0 , M 1 ) from Table 2 with χ 2 2 , 0.05 = 5.991, D = 5.36, and likelihood ratio test for µ 1 = 0 has p-value = 0.2570. and for σ 1 = 0 has p-value = 0.0117. These results show that the linear trend in location parameter is not   We now consider the pairs (M 0 , M 2 ) and (M 0 , M 2 ) from Table 2. The critical value for both pairs is χ 2 1,0.05 = 3.841 with respective D statistic values of 0.252 and 4.928 for the two pairs. The likelihood ratio test for µ 1 = 0 has p-value = 0.3092 and σ 1 = 0 has p-value = 0.0141 for models M 2 and M 3 , respectively.
These results show that model M 2 is not significant at the 5% significance level ( p-value > 0.05) and is not worthwhile because D statistic value (0.252) is too small compared to the critical value (3.841). On the other hand, model M 3 is significant at the 5% significance level ( p-value < 0.05) and provides an improvement in fit over the time-homogeneous GEV model because the D statistic value of 4.928 (> 3.841) is significantly large. Similar findings were obtained for all the AMS moving sums. The nonlinear quadratic model pair (M 0 , M 4 ) has a D statistic of 0.226, which is too small compared to the critical value of 5.991 with two degrees of freedom. The likelihood ratio tests for µ 1 = 0 and µ 2 = 0 are not significant at the 5% significance level ( p-value > 0.05). Thus, the nonlinear quadratic model M 4 is neither significant nor worthwhile Source: Authors' own construction Overall, the final model for Combomune is the non-stationary model, M 3 , with a linear trend in the scale parameter of the GEV. The general model for Combomune is given in Eqn (9) 1965, τ i = 1966, 1967, … and t i = 1, 2, 3, … is time in years and x = x i,∀i=1,2,…, is the AM flood height. The diagnostic plots for the time-heterogeneous model in Eqn (9) are presented in Figure 3. The residual probability plot suggests a good fit to the data.

Sicacate models
The model pair (M 0 , M 1 ) from Table 3 has χ 2 2,0.05 = 5.991 and a D statistic value of 8.482. The likelihood ratio test for µ 1 = 0 has p-value = 0.3217 and σ 1 = 0 has p-value = 0.0045, which indicates that the linear trend in location parameter is not significant at the 5% significance level ( p-value > 0.05), whereas the linear trend in scale parameter is highly significant ( p-value < 0.05) in the model. Because the D statistic value (8.482) is greater than the critical value of 5.991, we conclude that model M 1 provides an improvement in fit over the time-homogeneous GEV model; that is model M 1 is worthwhile. These findings are consistent with findings from AMS moving sums AM2, AM5, AM7, AM10 and AM30.
The model pairs (M 0 , M 2 ) and (M 0 , M 3 ) in Table 3 share a critical value of χ 2 1,0.05 = 3.841 with a D statistic value of 0.536 and 8.268 for M 2 and M 3 respectively. The likelihood ratio test for µ 0 = 0 has p-value = 0.2638 and σ 1 = 0 has p-value = 0.0013 This indicates that model M 2 is insignificant at the 5% level of significance ( p-value > 0.05) and not worthwhile (D < 3.841), whilst model M 3 is highly significant at the 5% significance level ( p-value < 0.05) and provides an improvement in fit over the time-homogeneous GEV model, with a large D statistic value of 8.268 (> 3.841). These findings are also consistent with findings from all the AMS moving sums. The nonlinear quadratic model pair (M 0 , M 4 ) in Table 3 has a D statistic value of 0.348 with a critical value of χ 2 2,0.05 = 5.991 The likelihood ratio test for µ 1 = 0 has p-value = 0.4991 and µ 2 = 0 has p-value < 0.0001 This implies that the linear trend term in the location parameter is not significant at the 5% level of significance ( p-value > 0.05) whilst the quadratic trend term in location parameter is highly significant ( p-value < 0.0001). However, the overall nonlinear quadratic model is not worthwhile because the D statistic value of 0.348 is too small compared with the critical value of 5.991. Again, these findings are consistent with findings from all the AMS moving sums.
We now have two competing 'good', non-stationary, linear, time-dependent models for Sicacate. To identify the most appropriate of these models, we rate the one with the smaller standard errors and the smaller p-value as the most appropriate model. In this case, model M 3 has a smaller p-value in the slope of the scale parameter 0.0013 compared to 0.0045 for M 1 , and the standard errors for M 3 are much smaller than those of M 1 for example 0.47428 compared to 0.64696 for μ 0 and 0.01722 compared to 0.02249 for scale slope σ 1 for m 3 and M 1 , respectively. Therefore, the nonstationary linear trend in scale GEV model for Sicacate is given in Eqn (10)  where t i = τ i -1951, τ i =1952,1953,… and t i = 1,2,3,… is time in years and x = x i,∀i=1,2,…,k is the AM flood height. The diagnostic plots for the time-heterogeneous models in Eqn (10) and Source: Authors' own construction Eqn (11) are presented in Figure 4 and Figure 5, respectively. The residual probability plots for both models suggest a good fit to the data.
The interesting findings are that whilst most studies in other regions have found a dominant linear trend in the location parameter of the GEV distribution for some rivers (e.g. Katz et al. 2002), the study has found no evidence of a significant linear trend in the location parameter of the GEV distribution for the LLRB of Mozambique. On the other hand, the study has revealed a dominant time-dependent scale parameter for the river at Combomune upstream and Sicacate further downstream. The study also revealed evidence of a highly significant nonlinear trend, quadratic term, in the location parameter at Sicacate, although the complexity of the overall model was not worthwhile with reference to the timehomogeneous GEV model. The findings in the study are in full support of a previous study by Aich et al. (2014), who used a geoscientific model called eco-hydrological SWIM model to compare the climate change impacts on streamflow in four large African basins including the Niger, upper Blue Nile, Oubangui and Limpopo and found the Limpopo basin to be highly sensitive to climate change variability. The results obtained in the study complemented by those of Aich et al. (2014) explain the reason for the increased frequency of extreme floods in the LLRB of Mozambique, which can be attributed to the variability in climatic conditions. The timedependent GEV models developed in the study are worth considering by the Mozambican government and its partners, as well as its neighbours, in their policies and decision making. Chokwe Irrigation Scheme, the largest irrigation scheme in Mozambique is situated in the LLRB, making the basin the backbone of the country's economy, which is mainly characterised by agriculture and fishing.

Conclusion
The study considered the use of statistics of extremes in a changing climate for the LLRB of Mozambique. Three hydrometric stations representing three sites along the lower Limpopo River were considered for the study. The ML estimation method was used to estimate the parameters of the GEV distribution in the presence of a trend covariate. The study has revealed the importance of considering nonstationary linear and nonlinear trend models when using statistics of extremes in a changing climate as these models provide an improvement in fit over the time-homogeneous models. This improvement in fit is very important for the planning and policy-making of the government of Mozambique and its partners in the LLRB, where the largest irrigation scheme of the country is situated. The importance of the developed models is attributed to the fact that these non-stationary models take into account the reasons for increased frequency of floods in the basin. Once the government and its partners are fully aware of the reasons behind the increased frequency of floods in the basin, their planning can be much improved.
The study has successfully identified the prevailing models at the three sites such that Chokwe is the only site with a time-homogeneous GEV model. This can be attributed to the fact that some of the water at the site is diverted to the Chokwe Irrigation Scheme for irrigation purposes. The other two sites Combomune and Sicacate have a prevailing nonstationary GEV model with a dominant linear trend in the scale parameter. The site of Sicacate has an alternative nonstationary model with a linear trend in both the location and scale parameters of a GEV distribution. The prevailing Source: Authors' own construction  models established in the study are consistent with cumulative (or moving sums) AMS flood flows and therefore appear reliable to use for flood frequency analysis in the basin. The use of the identified time-dependent GEV models with a trend in the scale parameter in the basin would also reduce the sensitivity of the frequency of floods, which is known to vary with changes in the scale parameter and therefore lead to more reliable estimates in the frequency of floods.
Future studies will attempt to advance the study to consider non-stationary generalised Pareto distributions, Bayesian inference and Markov chain Monte Carlo methods in a changing climate for the lower Limpopo River of Mozambique. Covariates in the form of cycles and/or a physical variable such as a dummy variable indicating the occurrence of cyclones in the region will also be considered in future studies involving statistics of extremes in a changing climate.