Chapter 3: Semi-distributed hydrological modeling

The problem

Now we move forward to a semi-distributed catchment case. This means that we are conceiving the basin as a set of homogeneous polygons that are selected by some criteria; in any basin the hydrologist is faced with a variety of geology, soils, vegetation, land use and topographic characteristics that affects the precipitation-runoff generation. One possible solution to deal with such a complexity is to consider that there are some sectors that behave (e.g.: in terms of runoff generation) in a similar way, hence we can split the basin in what the modeler can consider as hydrological homogeneous areas. As you can imagine, the criteria is not unique and depends on many factors: modeling objectives, knowledge about the runoff generation processes in the catchment, available input data and numerical models, among others (K. J. Beven 2012).

In this case study we are going to work on a perfect fit case (again a synthetic basin). The catchment has been discretised in elevation bands (keeping in mind a mountain basin case).
After this vignette is expect that you:

  • understand how to construct a semi-distributed HBV hydrological model.
  • calibrate the parameters in order to get a perfect river discharge fit.
  • study the effects of changing the snow module parameters in the streamflow discharge.

Data

library(HBV.IANIGLA)

data("semi_distributed_hbv")

str(semi_distributed_hbv)
#> List of 5
#>  $ basin:'data.frame':   15 obs. of  4 variables:
#>   ..$ elev_band: chr [1:15] "eb_1" "eb_2" "eb_3" "eb_4" ...
#>   ..$ area(km2): num [1:15] 10 15 17 20 25 18 17 16 14 12 ...
#>   ..$ rel_area : num [1:15] 0.0535 0.0802 0.0909 0.107 0.1337 ...
#>   ..$ h(masl)  : num [1:15] 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 ...
#>  $ tair : num [1:5310, 1:15] 24.4 26.9 27 24.5 22.1 ...
#>  $ prec : num [1:5310, 1:15] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ pet  : num [1:5310, 1:15] 0 0.999 0.997 0.993 0.988 ...
#>  $ qout : num [1:5310] 0e+00 0e+00 0e+00 2e-04 5e-04 4e-04 4e-04 4e-04 4e-04 4e-04 ...
#>  - attr(*, "comment")= chr "Semi-distributed HBV model input data"

To get more details about the dataset just type ?semi_distributed_hbv

Building a semi-distributed hydrological model

For this exercise is supposed that we have just one type of vegetation soil and that the runoff generation is controlled by the snow accumulation and melting process. As this basin is located in a mountain region, we consider a mean and homogeneous snowpack evolution within a pre-defined elevation range.

## brief arguments description
  # basin: data frame with the same structure of the data("semi_distributed_hbv) (colnames included).
  # tair: numeric matrix with air temperature inputs. 
  # precip: numeric matrix with precipitation inputs. 
  # pet: numeric matrix with potential eavapotranspiration inputs. 
  # param_snow: numeric vector with snow module parameters.
  # param_soil: numeric vector with soil moisture parameters.
  # param_routing: numeric vector with the routing parameters.
  # param_tf: numeric vector with the transfer function parameter.
  # init_snow: numeric value with initial snow water equivalent. Default value being 20 mm.
  # init_soil: numeric value with initial soil moisture content. Default value being 100 mm.
  # init_routing: numeric vector with bucket water initial values. Default values are 0 mm.
## output
  # simulated streamflow series.
hydrological_hbv <- function(basin,
                             tair,
                             precip,
                             pet,
                             param_snow,
                             param_soil,
                             param_route,
                             param_tf,
                             init_snow = 20,
                             init_soil = 0,
                             init_routing = c(0, 0, 0) 
                             ){
  n_it <- nrow(basin)

  # create output lists
  snow_module  <- list()
  soil_module  <- list()
  route_module <- list()
  tf_module    <- list()

  # snow and soil module in every elevation band
  for(i in 1:n_it){
    snow_module[[ i ]] <-
      SnowGlacier_HBV(model = 1, inputData = cbind(tair[ , i], precip[ , i]),
                      initCond =  c(init_snow, 2), param = param_snow)
    soil_module[[ i ]] <-
      Soil_HBV(model = 1, inputData = cbind(snow_module[[i]][ , 5] , pet[ , i]),
               initCond = c(init_soil, basin[i, 'rel_area']), param = param_soil )

  } # end for

  # get total soil discharge
  soil_disch <- lapply(X = 1:n_it, FUN = function(x){
    out <- soil_module[[x]][ , 1]
  })
  soil_disch <- Reduce(f = `+`, x = soil_disch)

  # route module
  route_module <- Routing_HBV(model = 1, lake = F, inputData = as.matrix(soil_disch),
                              initCond = init_routing, param = param_route )

  # transfer function
  tf_module <- round(
    UH(model = 1, Qg = route_module[ , 1], param = param_tf), 4
  )

  return(tf_module)


}# end fun

As in the Lumped model case, this is just a way of constructing an HBV semi-distributed model but not the only one.

Calibrating the parameters

In the next lines I will show you how to generate many parameter sets in order to get close to the correct one. Remember that we are talking about the correct parameter set because is a synthetic case. In real world problems this will not be the case.

The calibrating issue has been focus of a lot of debate and research in the hydrological modeling field, I recommend the following material for the interested reader:

  • A manifesto for the equifinality thesis (K. Beven 2006).
  • Sensitivity analysis of environmental models: A systematic review with practical workflow (Pianosi et al. 2016).
  • Rainfall-runoff modelling (K. J. Beven 2012).
  • Environmental Modelling: An Uncertain Future? (K. Beven 2008).
# first we are going to create set the parameter range
  # snow module
snow_range <- rbind(
  sfcf = c(0, 1.5),
  tr   = c(-1, 1),
  tt   = c(0, 3),
  fm   = c(1.5, 4)
)

  # soil module
soil_range <- rbind(
  fc   = c(100, 200),
  lp   = c(0.5, 1),
  beta = c(1, 3)
)

  # routing module (here I will give you the correct values)
routing_range <- rbind(
  k0   = c(0.09, 0.09),
  k1   = c(0.07, 0.07),
  k2   = c(0.05, 0.05),
  uzl  = c(5, 5),
  perc = c(2, 2)
)

  # transfer function module (I will give the correct value)
tf_range <- rbind(
  bmax = c(2.25, 2.25)
)

Then we are going to condense the parameter ranges in a matrix,

param_range <- 
  rbind(
    snow_range,
    soil_range,
    routing_range,
    tf_range
  )

head(param_range)
#>       [,1]  [,2]
#> sfcf   0.0   1.5
#> tr    -1.0   1.0
#> tt     0.0   3.0
#> fm     1.5   4.0
#> fc   100.0 200.0
#> lp     0.5   1.0

In the next step we will generate random sets of parameters. Then we will use them to run the model and save our goodness of fit function,

# set the number of model runs that you want to try
n_run <- 1000

# build the matrix
n_it <- nrow(param_range)

param_sets <- matrix(NA_real_, nrow = n_run, ncol = n_it)

colnames(param_sets) <- rownames(param_range)
for(i in 1:n_it){
  
  param_sets[ , i] <- runif(n = n_run, 
                            min = param_range[i, 1],
                            max = param_range[i, 2]
                            )
    
}

head(param_sets)
#>           sfcf         tr       tt       fm       fc        lp     beta   k0
#> [1,] 0.2808843 -0.1988102 2.248042 1.670547 127.2266 0.7437620 1.130969 0.09
#> [2,] 0.2514527  0.4926597 1.752910 3.756478 142.2262 0.5810407 2.783933 0.09
#> [3,] 0.9610129 -0.1669551 2.128964 2.645919 163.3129 0.9437365 1.434353 0.09
#> [4,] 1.4435390 -0.1261128 2.732200 1.764122 184.6563 0.5170997 1.513836 0.09
#> [5,] 0.9952805 -0.1296802 1.149875 2.874869 139.4242 0.8517132 2.989876 0.09
#> [6,] 0.4596527 -0.1342559 2.927786 3.815124 141.1265 0.5239499 1.216035 0.09
#>        k1   k2 uzl perc bmax
#> [1,] 0.07 0.05   5    2 2.25
#> [2,] 0.07 0.05   5    2 2.25
#> [3,] 0.07 0.05   5    2 2.25
#> [4,] 0.07 0.05   5    2 2.25
#> [5,] 0.07 0.05   5    2 2.25
#> [6,] 0.07 0.05   5    2 2.25

Finally we run our semi-distributed model,

# goodness of fit vector
gof <- c()

# make a loop
for(i in 1:n_run){
  streamflow <- hydrological_hbv(
                             basin = semi_distributed_hbv$basin,
                             tair = semi_distributed_hbv$tair,
                             precip = semi_distributed_hbv$prec,
                             pet = semi_distributed_hbv$pet,
                             param_snow = param_sets[i, rownames(snow_range) ],
                             param_soil = param_sets[i, rownames(soil_range)],
                             param_route = param_sets[i, rownames(routing_range)],
                             param_tf = param_sets[i, rownames(tf_range)]
                             )
  
  gof[i] <- cor(x = streamflow, y = semi_distributed_hbv$qout)
}

param_sets <- cbind(param_sets, gof)

head(param_sets)
#>           sfcf         tr       tt       fm       fc        lp     beta   k0
#> [1,] 0.2808843 -0.1988102 2.248042 1.670547 127.2266 0.7437620 1.130969 0.09
#> [2,] 0.2514527  0.4926597 1.752910 3.756478 142.2262 0.5810407 2.783933 0.09
#> [3,] 0.9610129 -0.1669551 2.128964 2.645919 163.3129 0.9437365 1.434353 0.09
#> [4,] 1.4435390 -0.1261128 2.732200 1.764122 184.6563 0.5170997 1.513836 0.09
#> [5,] 0.9952805 -0.1296802 1.149875 2.874869 139.4242 0.8517132 2.989876 0.09
#> [6,] 0.4596527 -0.1342559 2.927786 3.815124 141.1265 0.5239499 1.216035 0.09
#>        k1   k2 uzl perc bmax       gof
#> [1,] 0.07 0.05   5    2 2.25 0.9349033
#> [2,] 0.07 0.05   5    2 2.25 0.9104437
#> [3,] 0.07 0.05   5    2 2.25 0.9961141
#> [4,] 0.07 0.05   5    2 2.25 0.9797558
#> [5,] 0.07 0.05   5    2 2.25 0.9882781
#> [6,] 0.07 0.05   5    2 2.25 0.9600641

Is time to extract the parameter set with the maximum gof value,

# get the row index
max_gof   <- which.max(param_sets[ , "gof"])

# extract the parameter set
param_opt <- param_sets[max_gof, ]

param_opt
#>         sfcf           tr           tt           fm           fc           lp 
#>   1.07377743   0.02083456   1.15235790   1.92040776 145.75885478   0.95564897 
#>         beta           k0           k1           k2          uzl         perc 
#>   1.46933269   0.09000000   0.07000000   0.05000000   5.00000000   2.00000000 
#>         bmax          gof 
#>   2.25000000   0.99744171

Now compare your best parameter set with the ones that I used to generate the catchment streamflow output,

  • param_snow = c(sfcf = 1.1, tr = 0, tt = 0, fm = 1.75)
  • param_soil = c(fc = 150, lp = 0.90, beta = 1.5)

Now is your turn

  • Are your best parameter set values far away from the true ones?
  • If your previous answer is yes is time to take a breath and think, what is happening? Discuss with your partners…
  • Maybe is time not only to look at the streamflow discharge but also to the actual evapotranspiration and soil moisture series.

References

Beven, Keith. 2006. “A Manifesto for the Equifinality Thesis.” Journal of Hydrology, The model parameter estimation experiment, 320 (1): 18–36. https://doi.org/10.1016/j.jhydrol.2005.07.007.
———. 2008. Environmental Modelling: An Uncertain Future? 1 edition. London: CRC Press.
Beven, Keith J. 2012. Rainfall - Runoff Modelling. 2 edition. Chichester: Wiley.
Pianosi, Francesca, Keith Beven, Jim Freer, Jim W. Hall, Jonathan Rougier, David B. Stephenson, and Thorsten Wagener. 2016. “Sensitivity Analysis of Environmental Models: A Systematic Review with Practical Workflow.” Environmental Modelling & Software 79 (May): 214–32. https://doi.org/10.1016/j.envsoft.2016.02.008.