;+
; NAME:
;    PROCESS_LONLATMONTH
;
; PURPOSE:
;    This procedure returns a processed version of the input lon-lat-month data 
;    set.  Processing includes lon-lat interpolation masking, taking anomalies, 
;    and taking averages.
;
; CATEGORY:
;    GEOGRAPHICAL
;
; CALLING SEQUENCE:
;    process_lonlatmonth, data
;
; INPUTS:
;    Data:  A array of data to be processed.  Of dimensions 
;        N_LON*N_LAT*N_TIME*N_REALIZATION or, if the INCLUDE_HEIGHT option is 
;        set then N_LON*N_LAT*N_HEIGHT*N_REALIZATION.  N_REALIZATION can be 1 
;        (and hence ignored).  Missing data should be denoted by NaN.  Also 
;        returns the processed data array.
;    ANOMALY, DATA_MEAN, DESEASONALISE, FRAC_INTERPOLATE_THRESH, 
;        FRAC_BASE_THRESH, FRAC_SEASON_THRESH, LAT, LEN_SEASON, LON, MASK_DATA, 
;        MASK_LON, MASK_LAT, PERIOD_BASE, SEASON, TIME
;
; KEYWORD PARAMETERS:
;    ANOMALY:  If set then anomalies with respect to the PERIOD_BASE 
;        climatology are returned.  If set to 'absolute' then the absolute 
;        anomalies are returned.  If set to 'fractional' then the fractional 
;        anomalies are returned.  The default is not set (no anomalies).
;    COVERAGE:  An optional input floating array containing the fractional data 
;        coverage of each element in Data.  Returns an altered array according 
;        to interpolation and masking operations.  Of same dimensions as Data.
;    DATA_MEAN:  Returns an array of the temporal average of Data over the 
;        PERIOD_BASE period.  Of size N_LON_OUT*N_LAT_OUT or 
;        N_LON_OUT*N_LAT_OUT, depending on whether INCLUDE_HEIGHT is set.  This 
;        is returned only if DATA_MEAN is set or if ANOMALY is set.
;    DESEASONALISE:  If set then the seasonal cycle is removed from the data 
;        before the seasonal values are calculated.  See months_to_seasons.pro 
;        for more information.
;    FRAC_COVERAGE_THRESH:  The minimum fraction of (weighted) data that must 
;        be defined (i.e. not NaN) in order to proceed with an integration 
;        command.  If the criterion is not met then a NaN is returned from that 
;        calculation.  The default is 0.
;    FRAC_INTERPOLATE_THRESH:  The threshold fraction of the area in a new 
;        spatial cell with non-missing data in the old spatial cells being 
;        interpolated required for the interpolation to be carried out (as 
;        against being counted as missing data).  The default is 0.
;    FRAC_BASE_THRESH:  The fraction of time steps in the base period required 
;        to have non-missing (non-NaN) values in order for the base value to be 
;        calculated.  If this criterion is not satisfied then a NaN is 
;        recorded.  The default is 0.
;    FRAC_SEASON_THRESH:  The fraction of the season that needs non-missing 
;        data in order to be averaged when seasonal averaging is selected.  The 
;        default is 0.
;    INCLUDE_HEIGHT:  If set, then the input and output Data array is assumed 
;        to be of the form N_LON,N_LAT,N_HEIGHT,N_TIME.  The default is to 
;        omit the height dimension.
;    INTEGRATE:  A string array containing integration instructions to run in 
;        array_total.pro through the Instruct keyword.  See array_total.pro for 
;        further details.  'mean=1,2' gives the spatial average, 'total=1' 
;        gives the zonal total, 'mean=2' gives the meridional average, 'mean=3' 
;        gives the temporal average.
;    INTERPOLATE_TOTAL:  If set then when interpolating to the MASK_LAT and/or 
;        MASK_LON coordinates the area-weighted total of all values in the new 
;        grid cells that are overlapping the area in the new grid cell is 
;        taken.  The default is to return the area-weighted average.
;    LAT:  A vector of latitude values for Data.  Of length N_LAT.  Returns the 
;        vector of latitude values for the returned processed Data, length 
;        N_LAT_OUT.
;    LEN_SEASON:  The length of the season for returning seasonal data.  See 
;        months_to_seasons.pro (SEASONLEN) for more information.
;    LON:  A vector of longitude values for Data.  Of length N_LON.  Returns 
;        the vector of longitude values for the returned processed Data, of 
;        length N_LON_OUT.
;    MASK_DATA:  A masking array of dimensions N_LON_MASK*N_LAT_MASK or 
;        N_LON_MASK*N_LAT_MASK*N_TIME.  Values are 1 (keep) or NaN (discard).
;    MASK_LAT:  A vector of the latitude coordinates for the mask.  Of length 
;        N_LAT_MASK.  If not given then MASK_LAT is assumed to equal LAT.
;    MASK_LON:  A vector of the longitude coordinates for the mask.  Of length 
;        N_LON_MASK.  If not given then MASK_LON is assumed to equal LON.
;    MEAN_AREA:  If set then the spatial mean is returned.  The default is not 
;        set.
;    MEAN_LAT:  If set then the zonal mean is returned.  The default is not set.
;    MEAN_LON:  If set then the meridional mean is returned.  The default is 
;        not set.
;    PERIOD_BASE:  A vector of length two defining the climatological base 
;        period.  The first/second elements give the first/last year of the 
;        period.  The default value is the full period in TIME.
;    SEASON:  If a value is given, then the function returns seasonal mean or 
;        total data for the desired season of length LEN_SEASLON.  See 
;        months_to_seasons.pro for more information.
;    TIME:  A vector of time values for Data.  Of length N_TIME.  Returns the 
;        vector of time values for the returned processed Data, of length 
;        N_TIME_OUT.
;    WEIGHT:  An array size N_LON_OUT*N_LAT_OUT*N_TIME_OUT of weightings to use 
;        when taking an average over one or more dimensions, as requested in 
;        INTEGRATE or MEAN_*.  Note this should not include the area weightings 
;        for a polar grid, which are included automatically.  If INCLUDE_HEIGHT 
;        is set, then this should be of size 
;        N_LON_OUT*N_LAT_OUT*N_HEIGHT*N_TIME_OUT.
;
; OUTPUTS:
;    Data
;    COVERAGE, DATA_MEAN, LAT, LON, TIME
;
; USES:
;    add_dim.pro
;    array_total.pro
;    mask_lonlattime.pro
;    months_to_season.pro
;
; PROCEDURE:
;
; EXAMPLE:
;
; MODIFICATION HISTORY:
;    Written by:  Daithi A. Stone (stoned@atm.ox.ac.uk), 2008-03-26, adapted 
;        from mask_lonlatmonth.pro
;    Modified:  DAS, 2008-04-14 (Added WEIGHT keyword;  added default base 
;        period to averaging section;  added NGOOD=1 to months_to_season.pro 
;        call for time)
;    Modified:  DAS, 2009-04-07 (Corrected bug to allow anomaly calculation of 
;        one-dimensional data)
;    Modified:  DAS, 2011-03-30 (Fixed bug in recording lon and lat from 
;        mask_lonlattime.pro when COVERAGE is undefined;  modified formating;  
;        added description for COVERAGE)
;    Modified:  DAS, 2017-10-09 (Added INCLUDE_HEIGHT option;  modified 
;        internal calculations to use height dimension by default and allow 
;        inclusion of a realization dimension)
;    Modified:  DAS, 2018-02-15 (Corrected bug in calculation of time mean for 
;        the default of full period when calculating the anomaly)
;    Modified:  DAS, 2020-05-20 (Added check on non-negative area weighting)
;    Modified:  DAS, 2020-09-16 (Fixed erroneous trigger for removal of height 
;        dimension)
;-

PRO PROCESS_LONLATMONTH, $
    Data, $
    LAT=lat, LON=lon, TIME=time, $
    ANOMALY=anomaly, $
    COVERAGE=coverage, FRAC_COVERAGE_THRESH=frac_coverage_thresh, $
    INCLUDE_HEIGHT=include_height_opt, $
    INTEGRATE=integrate, $
    MASK_DATA=mask_data, MASK_LAT=lat_mask, MASK_LON=lon_mask, $
      FRAC_INTERPOLATE_THRESH=frac_interpolate_thresh, $
      INTERPOLATE_TOTAL=interpolate_total_opt, $
    PERIOD_BASE=period_base, FRAC_BASE_THRESH=frac_base_thresh, $
    SEASON=season, LEN_SEASON=len_season, $
      FRAC_SEASON_THRESH=frac_season_thresh, DESEASONALISE=deseasonalise_opt, $
    WEIGHT=weight, $
    DATA_MEAN=data_mean

;***********************************************************************
; Constants and Options

; The number of months in a year
mina = 12

; Degrees to radians conversion factor
degrad = 2 * !pi / 360.

; The option to include a height dimension
include_height_opt = keyword_set( include_height_opt )

; The number of time steps
n_time = n_elements( time )
if n_time eq 0 then begin
  if include_height_opt eq 1 then begin
    n_time = n_elements( data[0,0,0,*,0] )
  endif else begin
    n_time = n_elements( data[0,0,*,0] )
  endelse
endif
; The number of other dimensions
n_lon = n_elements( data[*,0,0,0,0] )
n_lat = n_elements( data[0,*,0,0,0] )
if include_height_opt eq 1 then begin
  n_height = n_elements( data[0,0,*,0,0] )
endif else begin
  n_height = 1
endelse
if include_height_opt eq 1 then begin
  n_realization = n_elements( data[0,0,0,0,*] )
endif else begin
  n_realization = n_elements( data[0,0,0,*] )
endelse
; Reform data to lon-lat-height-time-realization format
data = reform( data, n_lon, n_lat, n_height, n_time, n_realization )

; The default value of frac_coverage_thresh
if n_elements( frac_coverage_thresh ) eq 0 then frac_coverage_thresh = 0.
; The default value of frac_interpolate_thresh
if n_elements( frac_interpolate_thresh ) eq 0 then frac_interpolate_thresh = 0.
; The default value of frac_base_thresh
if n_elements( frac_base_thresh ) eq 0 then frac_base_thresh = 0.
; The default value of frac_season_thresh
if n_elements( frac_season_thresh ) eq 0 then frac_season_thresh = 0.

;***********************************************************************
; Interpolate and/or Mask the Data

; Proceed if interpolation coordinates and/or a mask is given
if ( n_elements( lon_mask ) gt 0 ) or ( n_elements( lat_mask ) gt 0 ) $
    or keyword_set( mask_data ) then begin

  ; Interpolate and/or mask the data
  if keyword_set( lon ) then temp_lon = lon
  if keyword_set( lat ) then temp_lat = lat
  data = reform( data, n_lon, n_lat, n_height * n_time * n_realization )
  mask_lonlattime, data, mask_data, lon=temp_lon, lat=temp_lat, $
      mask_lon=lon_mask, mask_lat=lat_mask, $
      frac_interpolate_thresh=frac_interpolate_thresh, $
      total=interpolate_total_opt
  n_temp_lon = n_elements( temp_lon )
  n_temp_lat = n_elements( temp_lat )
  data = reform( data, n_temp_lon, n_temp_lat, n_height, n_time, n_realization )
  if keyword_set( coverage ) then begin
    if keyword_set( mask_data ) then temp = 1. * finite( mask_data )
    coverage = reform( coverage, n_lon, n_lat, $
        n_height * n_time * n_realization )
    mask_lonlattime, coverage, temp, lon=lon, lat=lat, mask_lon=lon_mask, $
        mask_lat=lat_mask, frac_interpolate_thresh=0
    temp = 0
    coverage = reform( coverage, n_temp_lon, n_temp_lat, n_height, n_time, $
        n_realization )
  endif else begin
    lon = temp_lon
    lat = temp_lat
  endelse
  temp_lon = 0
  temp_lat = 0
  n_lon = n_elements( lon )
  n_lat = n_elements( lat )

endif

;***********************************************************************
; Calculate Seasonal Values

; Proceed if a season has been requested
if n_elements( season ) gt 0 then begin

  ; Because of masking issues (changing spatiotemporal masks), this needs to be 
  ; done regardless of whether it was requested.  The mean will be added back 
  ; onto the data at the end if necessary.
  temp_mean = array_total( data, 'mean=4', nan=1, frac_coverage_thresh=0. )
  temp_mean = reform( temp_mean, n_lon, n_lat, n_height, 1, n_realization )
  for i_time = 0 * n_time, n_time - 1 do begin
    data[*,*,*,i_time,*] = data[*,*,*,i_time,*] - temp_mean
  endfor

  ; Determine the base period
  if keyword_set( period_base ) then begin
    id_period_base = intarr( 2 )
    id_period_base[0] = where( period_base[0] * mina eq floor( time * mina ) )
    id_period_base[1] = where( period_base[1] * mina eq floor( time * mina ) )
    id_period_base = id_period_base / mina
  endif else begin
    id_period_base = [ 0, n_time / mina - 1 ]
  endelse
  ; Calculate seasonal values
  n_base_good = floor( frac_base_thresh $
      * ( id_period_base[1] - id_period_base[0] + 1 ) )
  n_seas_good = max( [ 1, floor( frac_season_thresh * len_season ) ] )
  
  time = months_to_seasons( time, season, len_season, ngood=1 )
  n_time_new = n_elements( time )
  for i_realization = 0, n_realization - 1 do begin
    data[*,*,*,0:n_time_new-1,i_realization] = months_to_seasons( $
        data[*,*,*,*,i_realization], season, len_season, $
        baseperiod=id_period_base, deseasonalise=deseasonalise_opt, $
        nbasegood=n_base_good, ngood=n_seas_good )
  endfor
  data = data[*,*,*,0:n_time_new-1,*]
  if keyword_set( coverage ) then begin
    for i_realization = 0, n_realization - 1 do begin
      coverage[*,*,*,0:n_time_new-1,i_realization] = months_to_seasons( $
          coverage[*,*,*,*,i_realization], season, len_season )
    endfor
    coverage = coverage[*,*,*,0:n_time_new-1,*]
  endif
  n_time = n_time_new

  ; Go back to absolute values
  for i_time = 0 * n_time, n_time - 1 do begin
    data[*,*,*,i_time,*] = data[*,*,*,i_time,*] + temp_mean
  endfor
  temp_mean = 0

endif

;***********************************************************************
; Calculate the Temporal Average and Anomalies

; Proceed if requested
if keyword_set( anomaly ) or keyword_set( data_mean ) then begin

  ; Identify the base period
  if keyword_set( period_base ) and ( n_elements( time ) gt 0 ) then begin
    id = where( ( time ge period_base[0] ) and ( time le period_base[1] + 1 ), $
        n_id )
  endif else begin
    id = indgen( n_elements( data[0,0,0,*] ) )
  endelse

  ; Calculate the mean
  data_mean = array_total( data[*,*,*,id,*], 'mean=4', nan=1, $
      frac_coverage_thresh=frac_coverage_thresh )
  ;temp = intarr( 2 )
  ;if n_elements( lat ) gt 0 then begin
  ;  temp[1] = n_elements( lat )
  ;endif
  ;if n_elements( lon ) gt 0 then begin
  ;  temp[0] = n_elements( lon )
  ;endif else if temp[1] ne 0 then begin
  ;  temp[0] = n_elements( data_mean ) / temp[1]
  ;endif
  ;if min( temp ) ne 0 then data_mean = reform( data_mean, temp )
  data_mean = reform( data_mean, n_lon, n_lat, n_height, n_realization )

  ; Take the requested anomalies
  if keyword_set( anomaly ) then begin
    temp_mean = add_dim( data_mean, 3, n_time )
    if anomaly eq 'absolute' then data = data - temp_mean
    if anomaly eq 'fractional' then data = data / temp_mean
    temp_mean = 0
  endif

endif

;***********************************************************************
; Calculate Spatial Averages

; Proceed if a spatial average is requested
if keyword_set( integrate ) then begin

  ; Create area weighting array if averaging over latitude
  if strpos( integrate, '2' ) ge 0 then begin
    ; Create the area weighting array
    weight_all = add_dim( cos( degrad * lat ), 0, n_lon )
    ; Ensure non-negative values (numeric error when lat=+/-90)
    id = where( weight_all lt 0., n_id )
    if n_id gt 0 then weight_all[id] = 0.
    ; Add height dimension
    weight_all = add_dim( weight_all, 2, n_height * n_time * n_realization )
    weight_all = reform( weight_all, n_lon, n_lat, n_height, n_time, $
        n_realization )
  endif
  ; Meld the area weighting and input weighting together
  if keyword_set( weight_all ) and keyword_set( weight ) then begin
    weight_all = weight_all * weight
  ; Otherwise just copy the input weighting if given
  endif else if keyword_set( weight ) then begin
    weight_all = weight
  endif

  ; Determine list of dimensions to integrate
  integrate_dim = strsplit( integrate, '=,', extract=1, count=n_integrate_dim )
  n_integrate_dim = n_integrate_dim - 1
  if n_integrate_dim lt 1 then stop
  ; Re-define dimensions to integrate if include_height option is not set
  if include_height_opt eq 0 then begin
    id = where( fix( integrate_dim[1:n_integrate_dim] ) gt 2, n_id )
    if n_id gt 0 then begin
      integrate_dim[1+id] = strtrim( string( fix( integrate_dim[1:id] ) + 1 ), $
          2 )
    endif
  endif
  integrate_use = integrate_dim[0] + '=' + integrate_dim[1]
  for i_dim = 1, n_integrate_dim - 1 do begin
    integrate_use = integrate_use + ',' + integrate_dim[1+i_dim]
  endfor

  ; Calculate spatial average
  if keyword_set( weight_all ) then temp_weight = weight_all
  data = array_total( data, integrate_use, nan=1, weight=temp_weight, $
      frac_coverage_thresh=frac_coverage_thresh, coverage=coverage )
  if keyword_set( data_mean ) then begin
    data_mean = reform( data_mean, n_lon, n_lat, n_height, 1, n_realization )
    data_mean = array_total( data_mean, integrate_use, nan=1, $
        weight=temp_weight, frac_coverage_thresh=frac_coverage_thresh )
  endif
  ; Average spatial coordinates too
  if max( integrate_dim eq '1' ) eq 1 then begin
    if keyword_set( lon ) then lon = mean( lon )
  endif
  if max( integrate_dim eq '2' ) eq 1 then begin
    if keyword_set( lat ) then lat = mean( lat )
  endif

  ; Clear memory
  weight_all = 0
  temp_weight = 0

endif

;***********************************************************************
; Prepare for output

; Remove height dimension if requested (this will already have been done 
; as a by-product of any integration command if there was no height dimension)
if not( keyword_set( integrate ) ) and ( include_height_opt eq 0 ) then begin
  data = reform( data, n_lon, n_lat, n_time, n_realization )
  if keyword_set( coverage ) then begin
    coverage = reform( coverage, n_lon, n_lat, n_time, n_realization )
  endif
  if keyword_set( data_mean ) then begin
    data_mean = reform( data_mean, n_lon, n_lat, n_time, n_realization )
  endif
endif

;***********************************************************************
; The End

return
END
