
Metrics

A metric is a function that is used to judge the performance of your model.

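Metric functions are similar to loss functions, except that the results from evaluating a metric are not used when training the model. Note that you may use any loss function as a metric.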
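To illustrate the point about reusing a loss function as a metric, here is a minimal sketch. The tiny model, the random data, and the choice of `keras.losses.mean_absolute_error` are assumptions made purely for illustration; the loss passed under `metrics` is reported but does not drive the optimization.

```python
import numpy as np
import keras

# Illustrative model and data; shapes and values are arbitrary.
model = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(1)])
model.compile(
    optimizer="adam",
    loss="mean_squared_error",
    # A loss function passed as a metric: it is tracked, not minimized.
    metrics=[keras.losses.mean_absolute_error],
)

x = np.random.rand(8, 3).astype("float32")
y = np.random.rand(8, 1).astype("float32")

# return_dict=True returns the results keyed by name instead of as a list.
results = model.evaluate(x, y, verbose=0, return_dict=True)
print(results)
```

The metric is reported under the name of the function, next to the training loss.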

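Available metrics

Base Metric class

Metric class

Accuracy metrics

Probabilistic metrics

Regression metrics

Classification metrics based on True/False positives & negatives

Image segmentation metrics

Hinge metrics for "maximum-margin" classification

Metric wrappers and reduction metrics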

Usage with `compile()` & `fit()`

The compile() method takes a metrics argument, which is a list of metrics:

from keras import metrics

# `model` is assumed to be an already-built keras.Model.
model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=[
        metrics.MeanSquaredError(),
        metrics.AUC(),
    ]
)

Metric values are displayed during fit() and logged to the History object returned by fit(). They are also returned by model.evaluate().

Note that the best way to monitor your metrics during training is via TensorBoard.

To track metrics under a specific name, you can pass the name argument to the metric constructor:

model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=[
        metrics.MeanSquaredError(name='my_mse'),
        metrics.AUC(name='my_auc'),
    ]
)
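The following sketch shows where a named metric ends up after training and evaluation. The model, the random data, and the two-epoch run are illustrative assumptions, not part of the original guide.

```python
import numpy as np
import keras
from keras import metrics

# Illustrative model and data; shapes and values are arbitrary.
model = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(1)])
model.compile(
    optimizer="adam",
    loss="mean_squared_error",
    metrics=[metrics.MeanSquaredError(name="my_mse")],
)

x = np.random.rand(8, 3).astype("float32")
y = np.random.rand(8, 1).astype("float32")

# fit() returns a History object; history.history maps names to per-epoch values.
history = model.fit(x, y, epochs=2, batch_size=4, verbose=0)
print(history.history["my_mse"])

# evaluate() reports the same metric, here keyed by name via return_dict=True.
eval_results = model.evaluate(x, y, verbose=0, return_dict=True)
print(eval_results["my_mse"])
```

Each key in `history.history` holds one value per epoch, so the list printed first has two entries.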

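All built-in metrics may also be passed via their string identifier (in this case, default constructor argument values are used, including a default metric name):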

model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=[
        'MeanSquaredError',
        'AUC',
    ]
)

Standalone usage

Unlike losses, metrics are stateful. You update their state using the update_state() method, and you query the scalar metric result using the result() method:

import keras

m = keras.metrics.AUC()
m.update_state([0, 1, 1, 1], [0, 1, 0, 0])
print('Intermediate result:', float(m.result()))

m.update_state([1, 1, 1, 1], [0, 1, 1, 0])
print('Final result:', float(m.result()))

The internal state can be cleared via metric.reset_state().
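The update, query, and reset cycle is easiest to follow with a metric whose value can be checked by hand. The sketch below uses `keras.metrics.Mean`, chosen here only because its arithmetic is trivial.

```python
import keras

m = keras.metrics.Mean()

m.update_state([1.0, 3.0])
first = float(m.result())        # running mean of 1 and 3
print("After first update:", first)

m.update_state([5.0])
second = float(m.result())       # running mean of 1, 3 and 5
print("After second update:", second)

m.reset_state()                  # clears the accumulated total and count
after_reset = float(m.result())
print("After reset:", after_reset)
```

The second result reflects all three values seen so far, which is what "stateful" means in practice.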

Here's how you would use a metric as part of a simple custom training loop:

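import keras
import tensorflow as tf  # tf.GradientTape below assumes the TensorFlow backend

# `model` and `dataset` are assumed to be defined elsewhere.
accuracy = keras.metrics.CategoricalAccuracy()
loss_fn = keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = keras.optimizers.Adam()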

# Iterate over the batches of a dataset.
for step, (x, y) in enumerate(dataset):
    with tf.GradientTape() as tape:
        logits = model(x)
        # Compute the loss value for this batch.
        loss_value = loss_fn(y, logits)

    # Update the state of the `accuracy` metric.
    accuracy.update_state(y, logits)

    # Update the weights of the model to minimize the loss value.
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))

    # Logging the current accuracy value so far.
    if step % 100 == 0:
        print('Step:', step)
        print('Total running accuracy so far: %.3f' % float(accuracy.result()))

Creating custom metrics

As simple callables (stateless)

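Much like loss functions, any callable with the signature metric_fn(y_true, y_pred) that returns an array of values (one per sample in the input batch) can be passed to compile() as a metric. Note that sample weighting is automatically supported for any such metric.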

Here's a simple example:

from keras import ops

def my_metric_fn(y_true, y_pred):
    squared_difference = ops.square(y_true - y_pred)
    return ops.mean(squared_difference, axis=-1)  # Note the `axis=-1`

model.compile(optimizer='adam', loss='mean_squared_error', metrics=[my_metric_fn])

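In this case, the scalar metric value you are tracking during training and evaluation is the average of the per-batch metric values for all batches seen during a given epoch (or during a given call to model.evaluate()).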

As subclasses of `Metric` (stateful)

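Not all metrics can be expressed via stateless callables, because metrics are evaluated for each batch during training and evaluation, but in some cases the average of the per-batch values is not what you are interested in.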

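Let's say that you want to compute AUC over a given evaluation dataset: the average of the per-batch AUC values isn't the same as the AUC over the entire dataset.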
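A small pure-Python sketch makes the gap concrete. The helper `pairwise_auc` is hypothetical and computes the exact ranking-based AUC, which is not how `keras.metrics.AUC` approximates it internally; the two batches are made-up data.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Each batch is perfectly ranked on its own.
batch_1 = ([0, 1], [0.1, 0.2])
batch_2 = ([0, 1], [0.8, 0.9])

per_batch = [pairwise_auc(*batch_1), pairwise_auc(*batch_2)]
mean_of_batches = sum(per_batch) / len(per_batch)

# Over the whole dataset, the negative scored 0.8 outranks the positive scored 0.2.
full_dataset = pairwise_auc(batch_1[0] + batch_2[0], batch_1[1] + batch_2[1])

print("Mean of per-batch AUCs:", mean_of_batches)  # 1.0
print("AUC over all samples:", full_dataset)       # 0.75
```

Cross-batch pairs are invisible to a per-batch computation, which is why the state has to be carried across batches.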

For such metrics, you're going to want to subclass the Metric class, which can maintain a state across batches. It's easy:

Create the state variables in __init__
Update the variables given y_true and y_pred in update_state()
Return the scalar metric result in result()
Clear the state in reset_state()

Here's a simple example computing binary true positives:

import keras
from keras import ops

class BinaryTruePositives(keras.metrics.Metric):

  def __init__(self, name='binary_true_positives', **kwargs):
    super().__init__(name=name, **kwargs)
    self.true_positives = self.add_weight(name='tp', initializer='zeros')

  def update_state(self, y_true, y_pred, sample_weight=None):
    y_true = ops.cast(y_true, "bool")
    y_pred = ops.cast(y_pred, "bool")

    values = ops.logical_and(ops.equal(y_true, True), ops.equal(y_pred, True))
    values = ops.cast(values, self.dtype)
    if sample_weight is not None:
      sample_weight = ops.cast(sample_weight, self.dtype)
      values = values * sample_weight
    self.true_positives.assign_add(ops.sum(values))

  def result(self):
    return self.true_positives

  def reset_state(self):
    self.true_positives.assign(0)

m = BinaryTruePositives()
m.update_state([0, 1, 1, 1], [0, 1, 0, 0])
print(f'Intermediate result: {m.result().numpy()}')

m.update_state([1, 1, 1, 1], [0, 1, 1, 0])
print(f'Final result: {m.result().numpy()}')