In this article we fit a time-dependent generalised extreme value (GEV) distribution to annual maximum flood heights at three sites: Chokwe, Sicacate and Combomune in the lower Limpopo River basin of Mozambique. A GEV distribution is fitted to six annual maximum time series at each site, namely: annual daily maximum (AM1), annual 2-day maximum (AM2), annual 5-day maximum (AM5), annual 7-day maximum (AM7), annual 10-day maximum (AM10) and annual 30-day maximum (AM30). Non-stationary time-dependent GEV models with a linear trend in the location and scale parameters are considered in this study. The results show insufficient evidence of a linear trend in the location parameter at all three sites. On the other hand, the findings reveal strong evidence of a linear trend in the scale parameter at Combomune and Sicacate, whilst the scale parameter had no significant linear trend at Chokwe. Further investigation also reveals that the location parameter at Sicacate can be modelled by a nonlinear quadratic trend; however, the added complexity of the overall model does not yield a worthwhile improvement in fit over the time-homogeneous model. This study shows the importance of extending the time-homogeneous GEV model to incorporate climate change factors such as trend in the lower Limpopo River basin, particularly in this era of global warming and a changing climate.

There is a general notion that the occurrence of extreme events has changed over these recent years and is anticipated to continue to change in terms of intensity, frequency and complexity of the risks. These recent changes are mainly attributed to global warming and natural modes of interannual and interdecadal variability, such as the El Niño phenomenon (Katz

The annual maximum series (AMS), also known as block maxima, has long been employed to estimate the distribution of extreme events such as flood flows, precipitation and wind speeds. The time-homogeneous generalised extreme value (GEV) distribution, which uses standard properties of the likelihood function, has traditionally been used in designing flood estimation (Coles

In order to apply any theory we have to suppose that the data are homogeneous, i.e., no systematical change of climate and important change in the basin have occurred within the observation period and that no such change will take place in the period for which such extrapolations are made. (Katz

It is easy to see that the assumption of homogeneity of climatic conditions, and of no other important changes in the basin, cannot hold forever. In other words, it is inevitable that climatic conditions change over time. The traditional fitting of the time-homogeneous GEV distribution also assumes that the observations are independent and identically distributed (i.i.d.). According to Katz

Dr Walter J. Ammann, chairman of the recent International Disaster and Risk Conference held in Davos, Switzerland, 24–28 August 2014, attested that the scope, intensity and complexity of risks as well as the frequency of natural hazards such as floods, earthquakes and forest fires are on the rise in these recent years (IDRC Davos

Every year, more than 200 million people are affected by natural hazards, and the risks are increasing – especially in developing countries, where a single major disaster can set back healthy economic growth for years. As a result, approximately one trillion dollars have been lost in the last decade alone. This is why disaster risk reduction is so essential. Mitigating disasters requires training, capacity building at all levels, and it calls for a change of thinking to shift from post-disaster reaction to pre-disaster action – this is UNESCO’s position. (p. 6)

Although it is clear from literature that the intensity and frequency of floods have increased over the recent years, it is not clear whether the magnitudes of floods, that is, flood heights, have also increased. If the magnitudes of flood heights have increased over the years, it is expected that the location parameter of the GEV, which is associated with the mean estimate of the distribution, should increase with increase in time. On the other hand, if there is no gradual increase in flood heights over the years and the sporadic extremely high floods are nearly or purely random, then we expect the scale parameter, which is associated with dispersion from the central location, to vary with time (

Time series plots of annual daily maximum (AM1) flood heights (in metres) at the three sites: (a) Chokwe (1951–2010), (b) Combomune (1966–2010) and (c) Sicacate (1952–2010) along lower Limpopo River of Mozambique.

The present study considers a non-stationary time-dependent GEV distribution model whose location and scale parameters are expected to vary linearly or nonlinearly with time (

According to Katz

The rest of the article is organised as follows: Section 2 presents the research methodology, Section 3 presents the results and discusses the findings, and Section 4 gives the concluding remarks.

The section presents the sequential steps taken to sort the data into the block maxima series (Ferreira & De Haan

Mozambique National Directorate of Water, the authority responsible for water management in Mozambique in the Ministry of Public Works and Housing, provided the data used in the study. The data are hydrometric daily flood heights (in metres) recorded at the sites Chokwe (1951–2010), Combomune (1966–2010) and Sicacate (1952–2010), which are hydrometric stations for the lower Limpopo River of Mozambique (Maposa

The raw data at the three sites were originally recorded as daily flood heights (or water levels). The data records at some sites stretch back to as far as 1930s. However, because of missing values, the records used in the study are for the period 1951–2010 for Chokwe, 1966–2010 for Combomune and 1952–2010 for Sicacate (

In statistics of extremes, there are two fundamental approaches used in flood frequency analysis, namely, block maxima (or AMS) and POT (or partial duration series) (Ferreira & De Haan

Comprehensive details of probability framework of block maxima and the practical reasons for using block maxima over POT are given by Ferreira and De Haan (
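The extraction of the block maxima series can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes that AM2 to AM30 are annual maxima of d-day moving sums (the article later calls them "moving sums"), that blocks are calendar years, and that a window spanning a year boundary is assigned to the year of its last day. The function name `annual_block_maxima` and the synthetic daily heights are hypothetical.

```python
import numpy as np
import pandas as pd


def annual_block_maxima(daily: pd.Series, window: int = 1) -> pd.Series:
    """Annual maximum of the `window`-day moving sum of a daily series.

    daily  : values indexed by a DatetimeIndex, one observation per day
    window : 1 for AM1, 2 for AM2, ..., 30 for AM30 (assumed definition)
    """
    # Moving sum over `window` consecutive days; incomplete windows give NaN.
    moving = daily.rolling(window=window, min_periods=window).sum()
    # One block per calendar year; max() skips the NaN values.
    return moving.groupby(moving.index.year).max()


# Synthetic daily heights (made-up numbers, for illustration only).
days = pd.date_range("2001-01-01", "2003-12-31", freq="D")
rng = np.random.default_rng(0)
heights = pd.Series(rng.gamma(shape=2.0, scale=1.0, size=len(days)), index=days)

am1 = annual_block_maxima(heights, window=1)
am5 = annual_block_maxima(heights, window=5)
print(pd.DataFrame({"AM1": am1, "AM5": am5}))
```

If the original series were organised by hydrological year rather than calendar year, only the grouping key would need to change.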

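The background of extreme value theory is well established, beginning with the limiting distributions of Fisher and Tippett. Let $(X_i)_{i \ge 1}$ be i.i.d. random variables with common distribution function $F$ in the domain of attraction of an extreme value distribution $G_{\xi}$, with normalising constants $a_m > 0$ and $b_m$. For block size $m$, define the block maxima $M_{k,m} = \max(X_{(k-1)m+1}, \ldots, X_{km})$, $k = 1, 2, \ldots$. Then $(M_{k,m})_{k \ge 1}$ are i.i.d. with distribution function $F^{m}$.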

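For large $m$, the distribution $F^{m}$ of the block maxima $M_{k,m}$, suitably normalised by $a_m$ and $b_m$, is approximated by the GEV distribution

$$G(x) = \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}, \qquad 1 + \xi\left(\frac{x-\mu}{\sigma}\right) > 0,$$

with location parameter $\mu$, scale parameter $\sigma > 0$ and shape parameter $\xi$. This time-homogeneous model is denoted $\mathcal{M}_0$, and it is used as the reference model against which all other extended models are compared for their significance.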

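The log-likelihood function for the GEV, for $\xi \neq 0$ and observed block maxima $x_1, \ldots, x_k$, is

$$\ell(\mu, \sigma, \xi) = -k\log\sigma - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{k}\log\left[1 + \xi\left(\frac{x_i-\mu}{\sigma}\right)\right] - \sum_{i=1}^{k}\left[1 + \xi\left(\frac{x_i-\mu}{\sigma}\right)\right]^{-1/\xi},$$

provided that $1 + \xi(x_i-\mu)/\sigma > 0$ for all $i$.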
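As a hedged illustration of ML estimation for the time-homogeneous GEV, the sketch below minimises the negative log-likelihood numerically. It is not the authors' implementation: the function names `gev_nll` and `fit_gev`, the Gumbel moment-based starting values, the Nelder-Mead optimiser and the simulated sample are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def gev_nll(params, x):
    """Negative log-likelihood of the GEV with parameters (mu, sigma, xi)."""
    mu, sigma, xi = params
    if sigma <= 0:
        return np.inf
    y = (x - mu) / sigma
    if abs(xi) < 1e-8:
        # Gumbel limit, used only to avoid dividing by a shape of zero.
        return x.size * np.log(sigma) + np.sum(y) + np.sum(np.exp(-y))
    z = 1.0 + xi * y
    if np.any(z <= 0):
        return np.inf  # an observation falls outside the GEV support
    return (x.size * np.log(sigma)
            + (1.0 + 1.0 / xi) * np.sum(np.log(z))
            + np.sum(z ** (-1.0 / xi)))


def fit_gev(x):
    """Maximum likelihood fit started from Gumbel moment estimates."""
    x = np.asarray(x, dtype=float)
    scale0 = x.std() * np.sqrt(6.0) / np.pi
    start = np.array([x.mean() - 0.5772 * scale0, scale0, 0.1])
    return minimize(gev_nll, start, args=(x,), method="Nelder-Mead",
                    options={"maxiter": 5000, "maxfev": 5000,
                             "xatol": 1e-8, "fatol": 1e-8})


# Simulated maxima by inverse-CDF sampling (made-up parameters, not river data).
rng = np.random.default_rng(1)
mu_true, sigma_true, xi_true = 4.0, 1.5, -0.1
u = rng.uniform(size=3000)
sample = mu_true + sigma_true * ((-np.log(u)) ** (-xi_true) - 1.0) / xi_true

fit = fit_gev(sample)
mu_hat, sigma_hat, xi_hat = fit.x
print(f"mu={mu_hat:.3f} sigma={sigma_hat:.3f} xi={xi_hat:.3f} nll={fit.fun:.3f}")
```

Note that `scipy.stats.genextreme` parameterises the shape as `c = -xi`, so results from that class need a sign change before they are compared with the convention used here.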

Now consider the time-dependent GEV model, call it $\mathcal{M}_1$, with a linear trend in the location and the scale parameter such that $\mu(t) = \mu_0 + \mu_1 t$ and $\sigma(t) = \sigma_0 + \sigma_1 t$.

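The log-likelihood function of model $\mathcal{M}_1$ for the case $\xi \neq 0$ is

$$\ell(\mu_0, \mu_1, \sigma_0, \sigma_1, \xi) = -\sum_{i=1}^{k}\log\sigma(t_i) - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{k}\log\left[1 + \xi\left(\frac{x_i-\mu(t_i)}{\sigma(t_i)}\right)\right] - \sum_{i=1}^{k}\left[1 + \xi\left(\frac{x_i-\mu(t_i)}{\sigma(t_i)}\right)\right]^{-1/\xi},$$

where $\mu(t_i) = \mu_0 + \mu_1 t_i$ and $\sigma(t_i) = \sigma_0 + \sigma_1 t_i > 0$, provided that $1 + \xi\{x_i-\mu(t_i)\}/\sigma(t_i) > 0$ for all $i$.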

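In the present study, we also propose three more models, $\mathcal{M}_2$, $\mathcal{M}_3$ and $\mathcal{M}_4$. Model $\mathcal{M}_2$ has a linear trend in the location parameter such that $\mu(t) = \mu_0 + \mu_1 t$, with a constant scale, so that its parameters are $(\mu_0, \mu_1, \sigma, \xi)$. Model $\mathcal{M}_3$ has a linear trend in the scale parameter, $\sigma(t) = \sigma_0 + \sigma_1 t$, with a constant location, so that its parameters are $(\mu, \sigma_0, \sigma_1, \xi)$. Model $\mathcal{M}_4$ has a nonlinear quadratic trend in the location parameter, $\mu(t) = \mu_0 + \mu_1 t + \mu_2 t^{2}$, with a constant scale, so that its parameters are $(\mu_0, \mu_1, \mu_2, \sigma, \xi)$. In each case the log-likelihood has the same form as that of the model with trends in both location and scale, with the trend terms that are not in the model set to zero.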
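The five candidate models can share a single negative log-likelihood in which the unused trend coefficients are fixed at zero. The sketch below illustrates that idea under stated assumptions: the scale trend is taken to be linear on the natural scale, the names `MODELS`, `trend_nll` and `fit_model` are hypothetical, the data are simulated rather than observed, and each extended model is started from the fitted time-homogeneous model so that its maximised negative log-likelihood cannot exceed that of the reference model.

```python
import numpy as np
from scipy.optimize import minimize

# Trend coefficients that are free in each model; the rest are fixed at zero.
MODELS = {"M0": (), "M1": ("mu1", "sigma1"), "M2": ("mu1",),
          "M3": ("sigma1",), "M4": ("mu1", "mu2")}


def trend_nll(theta, x, t, model):
    """Negative log-likelihood with mu(t) = mu0 + mu1*t + mu2*t**2 and
    sigma(t) = sigma0 + sigma1*t; theta = (mu0, sigma0, xi, *free trends)."""
    coef = dict(mu1=0.0, mu2=0.0, sigma1=0.0)
    coef.update(zip(MODELS[model], theta[3:]))
    mu = theta[0] + coef["mu1"] * t + coef["mu2"] * t ** 2
    sigma = theta[1] + coef["sigma1"] * t
    xi = theta[2]
    if np.any(sigma <= 0) or abs(xi) < 1e-8:  # the xi = 0 case is not handled
        return np.inf
    z = 1.0 + xi * (x - mu) / sigma
    if np.any(z <= 0):
        return np.inf  # an observation falls outside the GEV support
    return (np.sum(np.log(sigma)) + (1.0 + 1.0 / xi) * np.sum(np.log(z))
            + np.sum(z ** (-1.0 / xi)))


def fit_model(x, t, model, base=None):
    """Fit one model; `base` holds (mu0, sigma0, xi) starting values."""
    if base is None:
        s = x.std() * np.sqrt(6.0) / np.pi  # Gumbel moment estimate of scale
        base = np.array([x.mean() - 0.5772 * s, s, 0.1])
    start = np.concatenate([base[:3], np.zeros(len(MODELS[model]))])
    return minimize(trend_nll, start, args=(x, t, model), method="Nelder-Mead",
                    options={"maxiter": 20000, "maxfev": 20000,
                             "xatol": 1e-8, "fatol": 1e-8})


# Simulated maxima with a linear trend in scale (made-up values, not river data).
rng = np.random.default_rng(2)
t = np.arange(1.0, 101.0)
sigma_t = 1.0 + 0.05 * t
xi_true = -0.1
u = rng.uniform(size=t.size)
x = 5.0 + sigma_t * ((-np.log(u)) ** (-xi_true) - 1.0) / xi_true

m0 = fit_model(x, t, "M0")
m3 = fit_model(x, t, "M3", base=m0.x)  # start from the reference fit
print("NLL M0:", round(m0.fun, 3), " NLL M3:", round(m3.fun, 3))
print("estimated scale slope:", round(m3.x[3], 4))
```

For the quadratic model it may help to rescale the time covariate before optimising, because `t**2` grows quickly relative to the other terms.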

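One important question to answer is whether the non-stationary model provides an improvement in fit over the time-homogeneous model $\mathcal{M}_0$; that is, is it worthwhile to have the non-stationary model? The ML estimation of nested models uses a simple procedure based on the deviance statistic. The time-homogeneous model, $\mathcal{M}_0$, is a special case of the time-dependent models $\mathcal{M}_1$, $\mathcal{M}_2$, $\mathcal{M}_3$ and $\mathcal{M}_4$. In general, consider $\mathcal{M}_0 \subset \mathcal{M}_i$, $i = 1, 2, 3, 4$; then we define the deviance statistic

$$D = 2\{\ell_0(\mathcal{M}_0) - \ell_i(\mathcal{M}_i)\},$$

where $\ell_i(\mathcal{M}_i)$ and $\ell_0(\mathcal{M}_0)$ are the maximised negative log-likelihoods for models $\mathcal{M}_i$ and $\mathcal{M}_0$, respectively. $D$ has an asymptotic chi-square ($\chi^{2}_{k}$) distribution, where $k$ here denotes the degrees of freedom, equal to the difference in dimensionality between $\mathcal{M}_i$ and $\mathcal{M}_0$. Thus, $\mathcal{M}_0$ is rejected in favour of $\mathcal{M}_i$ at significance level $\alpha$ if $D > \chi^{2}_{k,\alpha}$, where $\chi^{2}_{k,\alpha}$ is the $(1-\alpha)$ quantile of the $\chi^{2}_{k}$ distribution. A large value $D > \chi^{2}_{k,\alpha}$ suggests that model $\mathcal{M}_i$ explains substantially more of the variation in the data than $\mathcal{M}_0$.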
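The deviance comparison is simple enough to check by hand or in a few lines of Python. The sketch below is illustrative only: the helper name `deviance_test` is hypothetical, the p-value is the asymptotic chi-square tail probability (an assumption about how significance is assessed), and the two inputs are the maximised negative log-likelihoods reported for the AM1 series at Combomune for the time-homogeneous model (90.740) and the model with a linear trend in scale (88.276).

```python
from scipy.stats import chi2


def deviance_test(nll_reference, nll_extended, df, alpha=0.05):
    """Deviance test of an extended model against a nested reference model.

    nll_reference, nll_extended : maximised negative log-likelihoods
    df                          : number of extra parameters in the extended model
    """
    deviance = 2.0 * (nll_reference - nll_extended)
    critical = chi2.ppf(1.0 - alpha, df)   # chi-square quantile at level alpha
    p_value = chi2.sf(deviance, df)        # asymptotic tail probability
    return deviance, critical, p_value, deviance > critical


# Combomune AM1: time-homogeneous model versus linear trend in scale (1 extra parameter).
deviance, critical, p_value, reject = deviance_test(90.740, 88.276, df=1)
print(f"D = {deviance:.3f}, critical value = {critical:.3f}, reject reference: {reject}")
```

With one extra parameter the critical value is 3.841, and with two it is 5.991, which are the thresholds quoted in the results.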

In order to avoid presenting too many tables in the article, only the tables for the AM1 time series data are presented for each of the three sites, Chokwe, Combomune and Sicacate. For the other series, model $\mathcal{M}_1$ still refers to a time-dependent GEV model with a linear trend in both the location and scale parameters, as in AM1.

Annual daily maximum time-dependent generalised extreme value models for Chokwe for the period 1951–2010.

Model | $\mu_0$ | $\mu_1$ | $\mu_2$ | $\sigma_0$ | $\sigma_1$ | $\xi$ | Maximised negative log-likelihood |
---|---|---|---|---|---|---|---|
$\mathcal{M}_0$ | 4.248 | 0 | 0 | 1.785 | 0 | −0.081 | 126.313 |
$\mathcal{M}_1$ | 4.222 | −0.0003 | 0 | 2.204 | −0.015 | −0.041 | 125.802 |
$\mathcal{M}_2$ | 4.294 | −0.0015 | 0 | 1.787 | 0 | −0.081 | 126.308 |
$\mathcal{M}_3$ | 4.212 | 0 | 0 | 2.205 | −0.015 | −0.040 | 125.802 |
$\mathcal{M}_4$ | 4.237 | −0.0002 | 0.0000 | 1.784 | 0 | −0.080 | 126.310 |

Annual daily maximum time-dependent generalised extreme value models for Combomune for the period 1966–2010.

Model | $\mu_0$ | $\mu_1$ | $\mu_2$ | $\sigma_0$ | $\sigma_1$ | $\xi$ | Maximised negative log-likelihood |
---|---|---|---|---|---|---|---|
$\mathcal{M}_0$ | 5.163 | 0 | 0 | 1.660 | 0 | −0.124 | 90.740 |
$\mathcal{M}_1$ | 5.394 | −0.012 | 0 | 2.321 | −0.033 | −0.045 | 88.060 |
$\mathcal{M}_2$ | 5.445 | −0.011 | 0 | 1.685 | 0 | −0.150 | 90.614 |
$\mathcal{M}_3$ | 5.034 | 0 | 0 | 2.268 | −0.031 | −0.043 | 88.276 |
$\mathcal{M}_4$ | 5.338 | −0.000 | −0.000 | 1.681 | 0 | −0.146 | 90.627 |

Annual daily maximum time-dependent generalised extreme value models for Sicacate for the period 1952–2010.

Model | $\mu_0$ | $\mu_1$ | $\mu_2$ | $\sigma_0$ | $\sigma_1$ | $\xi$ | Maximised negative log-likelihood |
---|---|---|---|---|---|---|---|
$\mathcal{M}_0$ | 6.151 | 0 | 0 | 3.328 | 0 | −0.454 | 148.547 |
$\mathcal{M}_1$ | 6.901 | −0.012 | 0 | 1.813 | 0.061 | −0.682 | 144.306 |
$\mathcal{M}_2$ | 5.499 | 0.025 | 0 | 3.443 | 0 | −0.526 | 148.279 |
$\mathcal{M}_3$ | 6.675 | 0 | 0 | 1.966 | 0.055 | −0.693 | 144.413 |
$\mathcal{M}_4$ | 5.887 | 0.000 | 0.0003 | 3.396 | 0 | −0.499 | 148.373 |

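Consider the pair of models ($\mathcal{M}_0$, $\mathcal{M}_1$) from the Chokwe AM1 table, where $\mathcal{M}_0$ is taken as the reference model. The deviance statistic is $D = 2(126.313 - 125.802) = 1.022 < \chi^{2}_{2,0.05} = 5.991$, so there is insufficient evidence that the linear trends in location and scale of model $\mathcal{M}_1$ improve on the time-homogeneous model.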

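The other pairs from the Chokwe AM1 table, ($\mathcal{M}_0$, $\mathcal{M}_2$) and ($\mathcal{M}_0$, $\mathcal{M}_3$), have $D = 2(126.313 - 126.308) = 0.010$ and $D = 2(126.313 - 125.802) = 1.022$, respectively, against $\chi^{2}_{1,0.05} = 3.841$ for both pairs. The likelihood ratio tests for a zero location slope in $\mathcal{M}_2$ and a zero scale slope in $\mathcal{M}_3$ are insignificant for both models at the 5% level of significance. Neither $\mathcal{M}_2$ nor $\mathcal{M}_3$ therefore improves on the time-homogeneous model.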

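The quadratic model pair ($\mathcal{M}_0$, $\mathcal{M}_4$) in the Chokwe AM1 table has $D = 2(126.313 - 126.310) = 0.006 < \chi^{2}_{2,0.05} = 5.991$, implying that model $\mathcal{M}_4$ does not provide any improvement in fit to justify its use over the time-homogeneous model. The likelihood ratio tests for a zero linear coefficient and a zero quadratic coefficient in the location parameter are also not significant at the 5% significance level.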

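In general, the results from Chokwe showed that the prevailing model for the site is the time-homogeneous GEV model given by

$$G(x_i) = \exp\left\{-\left[1 - 0.081\left(\frac{x_i - 4.248}{1.785}\right)\right]^{1/0.081}\right\},$$

where $x_i$, $i = 1, 2, \ldots, k$, is the AM flood height. The diagnostic plots for the time-homogeneous model are described in the following figure caption.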

Diagnostic plots for the time-homogeneous generalised extreme value model at Chokwe hydrometric station: (a) Probability plot; (b) Quantile plot; (c) Return level plot; (d) Density plot.

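We start by considering the pair ($\mathcal{M}_0$, $\mathcal{M}_1$) from the Combomune AM1 table. The deviance statistic is $D = 2(90.740 - 88.060) = 5.360 < \chi^{2}_{2,0.05} = 5.991$, so model $\mathcal{M}_1$, with linear trends in both the location and scale parameters, is not worthwhile compared to the time-homogeneous GEV model.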

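We now consider the pairs ($\mathcal{M}_0$, $\mathcal{M}_2$) and ($\mathcal{M}_0$, $\mathcal{M}_3$) from the Combomune AM1 table, both compared with $\chi^{2}_{1,0.05} = 3.841$, with respective deviance statistics $D = 2(90.740 - 90.614) = 0.252$ and $D = 2(90.740 - 88.276) = 4.928$ for $\mathcal{M}_2$ and $\mathcal{M}_3$. These results show that model $\mathcal{M}_2$, with a linear trend in location, is not significant at the 5% significance level, whereas model $\mathcal{M}_3$, with a linear trend in scale, is significant at the 5% significance level.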

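The nonlinear quadratic model pair ($\mathcal{M}_0$, $\mathcal{M}_4$) has a deviance statistic of $D = 2(90.740 - 90.627) = 0.226 < \chi^{2}_{2,0.05} = 5.991$, and the likelihood ratio tests for a zero linear coefficient and a zero quadratic coefficient in the location parameter are not significant at the 5% significance level. Model $\mathcal{M}_4$ is therefore neither significant nor worthwhile over the time-homogeneous GEV model. The same conclusions were reached for all the AMS moving sums.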

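Overall, the final model for Combomune is the non-stationary model, $\mathcal{M}_3$, with a linear trend in the scale parameter of the GEV. The general model for Combomune is given by

$$G(x_i) = \exp\left\{-\left[1 - 0.043\left(\frac{x_i - 5.034}{\sigma_i}\right)\right]^{1/0.043}\right\}, \qquad \sigma_i = 2.268 - 0.031\,t_i,$$

where $x_i$, $i = 1, 2, \ldots, k$, is the AM flood height and $t_i$ is the corresponding value of the time covariate. The diagnostic plots for the time-heterogeneous model are described in the following figure caption.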

Diagnostic plots for the time-heterogeneous generalised extreme value model with a trend in the scale parameter at Combomune hydrometric station: (a) Residual probability plot; (b) Residual quantile plot.
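Residual plots of this kind are commonly built by transforming each observation to a standard Gumbel scale with its own fitted location and scale, then comparing the ordered residuals with Gumbel probabilities and quantiles. The sketch below assumes that standard construction, which the article does not spell out. The function names are hypothetical, the time index `t = 1, ..., 45` is an assumption, and the data are simulated from the fitted Combomune model with a trend in scale (location 5.034, scale 2.268 − 0.031 t, shape −0.043) rather than taken from the observed record.

```python
import numpy as np


def gumbel_residuals(x, mu, sigma, xi):
    """Standardise GEV observations to the standard Gumbel scale."""
    return np.log1p(xi * (x - mu) / sigma) / xi


def residual_plot_points(resid):
    """Coordinates for the residual probability and quantile plots."""
    r = np.sort(resid)
    m = r.size
    empirical = np.arange(1, m + 1) / (m + 1)          # plotting positions
    prob_plot = (empirical, np.exp(-np.exp(-r)))       # empirical vs model probability
    quant_plot = (-np.log(-np.log(empirical)), r)      # model vs empirical quantile
    return prob_plot, quant_plot


# Simulated series from the fitted scale-trend model (illustration only).
t = np.arange(1.0, 46.0)              # assumed time index for 45 years
mu, xi = 5.034, -0.043
sigma_t = 2.268 - 0.031 * t
rng = np.random.default_rng(3)
u = rng.uniform(size=t.size)
x = mu + sigma_t * ((-np.log(u)) ** (-xi) - 1.0) / xi

resid = gumbel_residuals(x, mu, sigma_t, xi)
prob_plot, quant_plot = residual_plot_points(resid)
print(np.round(quant_plot[1][:5], 3))
```

Points lying close to the unit diagonal in both plots would indicate that the fitted trend model describes the data adequately.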

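The model pair ($\mathcal{M}_0$, $\mathcal{M}_1$) from the Sicacate AM1 table is compared with $\chi^{2}_{2,0.05} = 5.991$ and has a deviance statistic of $D = 2(148.547 - 144.306) = 8.482 > 5.991$. Model $\mathcal{M}_1$, with linear trends in both the location and scale parameters, therefore provides an improvement in fit over the time-homogeneous GEV model; that is, model $\mathcal{M}_1$ is worthwhile. These findings are consistent with findings from the AMS moving sums AM2, AM5, AM7, AM10 and AM30.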

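The model pairs ($\mathcal{M}_0$, $\mathcal{M}_2$) and ($\mathcal{M}_0$, $\mathcal{M}_3$) in the Sicacate AM1 table are compared with $\chi^{2}_{1,0.05} = 3.841$, with deviance statistics of $D = 2(148.547 - 148.279) = 0.536$ and $D = 2(148.547 - 144.413) = 8.268$ for $\mathcal{M}_2$ and $\mathcal{M}_3$, respectively. Model $\mathcal{M}_2$, with a linear trend in location, is insignificant at the 5% level of significance, whereas model $\mathcal{M}_3$, with a linear trend in scale, is highly significant at the 5% significance level.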

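The nonlinear quadratic model pair ($\mathcal{M}_0$, $\mathcal{M}_4$) in the Sicacate AM1 table has $D = 2(148.547 - 148.373) = 0.348 < \chi^{2}_{2,0.05} = 5.991$, so the added complexity of model $\mathcal{M}_4$ is not worthwhile over the time-homogeneous model.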

We now have two competing ‘good’, non-stationary, linear, time-dependent models for Sicacate. To identify the most appropriate of these models, we rate the one with the smaller standard errors and the smaller _{3} has a smaller _{1}, and the standard errors for _{3} are much smaller than those of _{1} for example 0.47428 compared to 0.64696 for _{0} and 0.01722 compared to 0.02249 for scale slope _{1} for _{3} and _{1}, respectively. Therefore, the non-stationary linear trend in scale GEV model for Sicacate is given in _{i}_{i}_{i}_{i}_{i,∀i=1,2,…,k} is the AM flood height. The alternative non-stationary linear trend in location and scale GEV model is given as follows:
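$$G(x_i) = \exp\left\{-\left[1 - 0.682\left(\frac{x_i - \mu_i}{\sigma_i}\right)\right]^{1/0.682}\right\}, \qquad \mu_i = 6.901 - 0.012\,t_i, \quad \sigma_i = 1.813 + 0.061\,t_i,$$

where $x_i$, $i = 1, 2, \ldots, k$, is the AM flood height and $t_i$ is the corresponding value of the time covariate. The diagnostic plots for the time-heterogeneous models are described in the following figure captions.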

Diagnostic plots for the time-heterogeneous generalised extreme value model with a trend in the scale parameter at Sicacate hydrometric station: (a) Residual probability plot; (b) Residual quantile plot.

Diagnostic plots for the time-heterogeneous generalised extreme value model with a trend in both the location and scale parameters at Sicacate hydrometric station: (a) Residual probability plot; (b) Residual quantile plot.

The interesting findings are that whilst most studies in other regions have found a dominant linear trend in the location parameter of the GEV distribution for some rivers (e.g. Katz

The study considered the use of statistics of extremes in a changing climate for the LLRB of Mozambique. Three hydrometric stations representing three sites along the lower Limpopo River were considered for the study. The ML estimation method was used to estimate the parameters of the GEV distribution in the presence of a trend covariate. The study has revealed the importance of considering non-stationary linear and nonlinear trend models when using statistics of extremes in a changing climate as these models provide an improvement in fit over the time-homogeneous models. This improvement in fit is very important for the planning and policy-making of the government of Mozambique and its partners in the LLRB, where the largest irrigation scheme of the country is situated. The importance of the developed models is attributed to the fact that these non-stationary models take into account the reasons for increased frequency of floods in the basin. Once the government and its partners are fully aware of the reasons behind the increased frequency of floods in the basin, their planning can be much improved.

The study has successfully identified the prevailing models at the three sites such that Chokwe is the only site with a time-homogeneous GEV model. This can be attributed to the fact that some of the water at the site is diverted to the Chokwe Irrigation Scheme for irrigation purposes. The other two sites Combomune and Sicacate have a prevailing non-stationary GEV model with a dominant linear trend in the scale parameter. The site of Sicacate has an alternative non-stationary model with a linear trend in both the location and scale parameters of a GEV distribution. The prevailing models established in the study are consistent with cumulative (or moving sums) AMS flood flows and therefore appear reliable to use for flood frequency analysis in the basin. The use of the identified time-dependent GEV models with a trend in the scale parameter in the basin would also reduce the sensitivity of the frequency of floods, which is known to vary with changes in the scale parameter and therefore lead to more reliable estimates in the frequency of floods.

Future studies will attempt to advance the study to consider non-stationary generalised Pareto distributions, Bayesian inference and Markov chain Monte Carlo methods in a changing climate for the lower Limpopo River of Mozambique. Covariates in the form of cycles and/or a physical variable such as a dummy variable indicating the occurrence of cyclones in the region will also be considered in future studies involving statistics of extremes in a changing climate.

We thank the Mozambique National Directorate of Water and Mr. Isac Filimone of the Directorate, in particular, who provided us with all the necessary data used in the study. We are also indebted to United Nations Office for the Coordination of Humanitarian Affairs-southern Africa for providing us with weekly update reports of floods in southern Africa, particularly for the LLRB of Mozambique. We are also greatly indebted to the Department of Science and Technology – National Research Foundation Centre of Excellence in Mathematical and Statistical Sciences of South Africa who provided funds for the postgraduate studies. Lastly, we thank the University of Limpopo for their support in research.

The authors declare that they have no financial or personal relationships which may have inappropriately influenced them in writing this article.

D.M. drafted the original manuscript, acquired and analysed the data and made interpretations. The work is part of the PhD thesis chapters for D.M. under the supervision of J.J.C. and M.L. J.J.C. critically revised the original manuscript and made final approval of the version to be published. M.L. also critically revised the original manuscript and made final approval of the manuscript to be published.