SAL_StructureAmplitudeLocation

思想

SAL方法通过三个分量，分别从 场结构、降水强度和场位置偏差 对预测结果进行评价

threshold

确定方法是经验法

阈值的设定，效果是大于阈值的格点才会被定为降水主体成员（降水体：降水主体内（各不连续的小降水区域）每个连续的小区域），只有降水主体成员才参与上述S、L2的运算

three component

计算总雨带特征：A、L1 （与阈值无关）
计算内部结构特征：S、L2

Structure

S表示Structure 结构，提供预测场和观测场有关其大小和形状的信息。

正值 -> 表示模拟的降水对象太大或太平
负值 -> 表示对象太尖锐或太小

Amplitude

A表示Amplitude （振幅）降水强度

正值 -> 表示对总降水量的估计过高
负值 -> 表示对总降水量的估计过低

Location

L表示Location位置，提供预测场和观测场 位置偏差信息

分为L1、L2两部分

L1表示预报场与观测场质心之间的归一化距离。（两个场质心之间的距离）
L2是总降水场的质心与观测和预报的个别降水对象的归一化距离之差。（每个降水对象到质心的距离）

L2是为了预防下图情况，由于观测场分为两块，质心在中间，与预测场重合

some example

（ S = 0、A=0、L=0表示完美匹配）

具体公式

S - 结构

公式：$S = \frac{V(R_{mod})-V(R_{obs})}{0.5[V(R_{mod})+V(R_{obs})]}$

$V(R_{mod})$：预测场中所有降水体以体内总降水量为权重的$V_n$的加权平均
1. 公式：$V(R) = \frac{\sum^{m}_{n=1}R_nV_n}{\sum^m_{n=1}R_n}$
  1. $R_n$：降水体的总降水量
$V_n$：第n个降水体内的总降水量与最大降水量之比

😃降水量就是某一个降水体每个降水值的总和
1. 公式：$V_n = {\displaystyle \sum_{(i,j)\in R_n} {R_{i,j}/R^{max}_n} = R_n/R^{max}_n}$
  1. $R_{i,j}$：格点（i，j）处的降水
  2. $R^{max}_{n}$：降水体n内的最大降水值
    
    😃降水体相当于是面积，最大值表示这块面积里面的某一个点的最大值

A - 强度

公式：$A = \frac {D(R_{mod}) - D(R_{obs})} {0.5[D(R_{mod}) + D(R_{obs})]}$

$D(R) = \frac{1}{N} \displaystyle \sum_{(i,j)\in D}{R_{i,j}}$ → $D(R)$为所取区域内非缺省格点降水的平均值

L - 位置

公式

$L = L_1 + L_2$
1. $L_1 = \frac{|x(R_{mod})-x(R_{obs})|}{d}$ → L1为区域D实况与预报降水主体中心之间的距离
  1. $x(R)$：表示所取区域D内降水主体的重心位置
  2. $d$：区域D内最大距离
2. $L_2 = 2\cdot{\frac{r(R_{mod})-r(R_{obs})}{d}}$ → L2为降水主体重心与降水场每个降水体重心之间的平均距离
  1. $r = \frac{\sum ^{m}_{n=1}{R_n \cdot{|x-x_n|}}}{\sum^{m}_{n=1}R_n}$
    1. $m$ 表示降水体个数
    2. $r(R)$ 表示各降水体重心与降水主体中心之间距离加权平均（权重是雨量）的权重
    3. 降水体雨量越大 → 离降水主体重心越远 → r值越大 → 最大值为(1/2)d
  2. $R_n = \displaystyle \sum_{(i,j)\in R_n}{R_{i,j}}$ → 区域内第n个降水体的降水量
    
    😃降水体是一个小连续区域，就是它的总降水量

代码

import numpy as np
from scipy import ndimage
#numpy.linalg模块包含线性代数的函数。使用这个模块，可以计算逆矩阵、求特征值、解线性方程组以及求解行列式等。
#norm则表示范数
from numpy.linalg import norm
import cv2
import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(threshold=np.inf)

#obs：观测场；fc：预测场
#返回值前三个是S、A、L，还有其他信息
def compute_SAL(field_obs, field_fc, user_threshold=None):
    """SAL-score (Structure, Amplitude, Location)

    Parameters
    ----------
    field_obs : numpy.ndarray
        2D observation (or reference) field
    field_fc : numpy.ndarray
        2D forecast field (same shape as `field_obs`)
    user_threshold : float, optional
        If set, use this threshold for object separation. 如果已设置，请使用此阈值进行对象分离。
        If not set, use the default method of Wernli is used: ``threshold = field_obs.max() * 1/15``

    Returns
    -------
    S : float
        Structure score
    A : float
        Amplitude score
    L : float
        Location score
    L1, L2 : float
        Location score components `L1` and `L2`
    NObjects_obs : float
        Number of thresholded objects in observation 观察中的阈值对象数
    NObjects_mod : float
        Number of objects in model (forecast `fc`) 模型中的对象数（预测“fc”）
    R_max : float
        Maximum Rain-value of observations `field_obs` 观测值的最大降雨值`
    user_treshold : float
        Threshold that was actually used 实际使用的阈值
    """
    if (field_obs.shape != field_fc.shape):
        raise Exception("field_obs and field_fc need to have the same shape" +\
        " (this function also assumes that they cover the same domain!)")

    if (user_threshold is not None) and (type(user_threshold) != float):#user_threshold是给定的阈值
        raise Exception("user_threshold must be a float")

    R_mod = field_fc  # "mod" is for "model"          预测场的数据
    R_obs = field_obs # "obs" is for "observation"    观测场的数据

    #不能出现有数据小于0的，出现即报错
    if np.any(R_mod < 0.):
        raise Exception("negative values in forecast." +\
        "use different field or apply SAL.threshold_to_zero(field)")

    if np.any(R_obs < 0.):
        raise Exception("negative values in forecast." +\
        "use different field or apply SAL.threshold_to_zero(field)")

    f_threshold = 1/15. 
    
    #nanpercentile：不算nan值找到第95个百分数
    #https://www.cjavapy.com/article/1088/
    R_max_obs = np.nanpercentile(np.where(R_obs>0.1, R_obs, np.nan), 95)#最大*0.95
    threshold_obs = R_max_obs * f_threshold#1/15
    
    R_max_mod = np.nanpercentile(np.where(R_mod>0.1, R_mod,np.nan), 95)
    threshold_mod = R_max_mod * f_threshold

    if (user_threshold is not None):
        threshold_obs = user_threshold
        threshold_mod = user_threshold

    R_mod_thr = np.where(R_mod > threshold_mod, R_mod, 0.) # 所有的降水体
    R_obs_thr = np.where(R_obs > threshold_obs, R_obs, 0.) # maskiert

    D_R_mod = np.mean(R_mod)#平均值
    D_R_obs = np.mean(R_obs)

    A = (D_R_mod - D_R_obs) / (0.5 * (D_R_mod + D_R_obs))#与阈值无关

    # x_R_mod: center of mass of mod
    x_R_mod = np.array(ndimage.measurements.center_of_mass(R_mod))#nb，直接调用函数找出质点
    # x_R_obs: center of mass of obs
    x_R_obs = np.array(ndimage.measurements.center_of_mass(R_obs))

    #norm默认是2范式，每个元素先平方，再根号，也就是对角线距离
    d_diagonal = norm(#field_obs.shape)
   
    L1 = norm(x_R_mod - x_R_obs) / d_diagonal#与阈值无关

    ##label标记连接成分
    #https://blog.csdn.net/Monkey_dada/article/details/119353323?spm=1001.2101.3001.6661.1&utm_medium=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1.no_search_link&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1.no_search_link
    labels_mod, NObjects_mod = ndimage.measurements.label(R_mod_thr)#label标记连接成分,在找降水体
    labels_obs, NObjects_obs = ndimage.measurements.label(R_obs_thr)

    labels_list = [labels_mod, labels_obs]
    NObjects_list = [NObjects_mod, NObjects_obs]

    R_list = [R_mod, R_obs]
    x_R_list = [x_R_mod, x_R_obs]

    r_list = [0,0] # fuer L2
    V_list = [0,0] # fuer S

    for iList, NObjects in enumerate(NObjects_list):
        R_n_list = np.zeros(NObjects)
        V_n_list = np.zeros(NObjects)
        distance_n_list = np.zeros(NObjects)

        for n in range(NObjects):

            label = n + 1

            object_n = np.where(labels_list[iList] == label, R_list[iList], 0)
            R_n_list[n] = object_n.sum()
            V_n_list[n] = R_n_list[n] / object_n.max()
            x_n = np.array(ndimage.measurements.center_of_mass(object_n))
            distance_n_list[n] = norm(x_R_list[iList] - x_n)

        R_n_list = np.array(R_n_list)
        distance_n_list = np.array(distance_n_list)
        r_list[iList] = (R_n_list*distance_n_list).sum() / R_n_list.sum()
        V_list[iList] = (R_n_list*V_n_list).sum() / R_n_list.sum()

    L2 = 2.*norm(r_list[0] - r_list[1]) / d_diagonal
    L = L1 + L2
    S = (V_list[0] - V_list[1]) / (0.5 * (V_list[0] + V_list[1])) # 0: mod, 1:obs
    
    return S, A, L, L1, L2, NObjects_obs, NObjects_mod, R_max_obs, R_max_mod, threshold_obs, threshold_mod

#打印结果	
def PrintResultSAL(field_obs, field_fc, user_threshold=None):
    S, A, L, L1, L2, NObjects_obs, NObjects_mod, R_max_obs, R_max_mod, threshold_obs, threshold_mod = compute_SAL(field_obs,field_fc)
    print("S = ", S)
    print("A = ", A)
    print("L = ", L)
#     print("L1 = ", L1)
#     print("L2 = ", L2)
#     print("NObjects_obs = ", NObjects_obs)
#     print("NObjects_mod = ", NObjects_mod)
#     print("R_max_obs = ", R_max_obs)
#     print("R_max_mod = ", R_max_mod)
#     print("threshold_obs = ", threshold_obs)
#     print("threshold_mod = ", threshold_mod)
#     print()

def matrix_SAL(filename1, filename2):
    #预测
    EC_fst = np.load(filename1)
    file_fst = EC_fst['f']
    result_fst = file_fst[0].reshape(210,211)
    #真值

    EC_obs = np.load(filename2)
    file_obs = EC_obs['l']
    result_obs = file_obs[0].reshape(210,211)
    PrintResultSAL(result_fst,result_obs)

#之前强降水数据集使用例子
#matrix_SAL('testset_ECrainPredi_03h_202008.npz','../CMPA/08/testset_CMPA_03h_202008.npz')

Shuwen's blog