Inclusive Machine Learning: Addressing Model Fairness

Artificial Intelligence (AI) and Machine Learning (ML) systems are increasingly being used across all sectors and societies.

Alongside this growth, awareness of model fairness has been rising over the past few years. This field aims to assess how fairly a model treats pre-existing biases in the data: is it fair that a job-matching system favors male candidates for CEO interviews because that matches historical data?

In my previous article I addressed ML models' interpretability. This time we will take a step further and assess how our trained model treats potentially sensitive (biased) features.

Auditing a model is not always black and white: features that may be sensitive in one context may not be in another. Few people would argue that gender should determine whether a person gets a job. However, is it unfair that an insurance company's pricing model charges men more because historical data shows that they have more claims than women? Or is it correctly accounting for their more reckless driving? It is certainly at least arguable.

There are dozens of use cases where the fairness definition is not absolutely clear. Identifying appropriate fairness criteria for a system requires accounting for user experience, cultural, social, historical, political, legal, and ethical considerations, several of which may have trade-offs.

In this article we will address model fairness using the FairML library, developed by Julius Adebayo. The entire code used can be found on my GitHub.

Contents

  1. Dataset and Model Training

  2. FairML Intuition

  3. Assessing Model Fairness

  4. Recommended practices

1. Dataset and Model Training

The dataset used for this article is the Adult Census Income from the UCI Machine Learning Repository. The prediction task is to determine whether a person makes over $50K a year.

Since the focus of this article is not the modelling phase of the ML pipeline, minimal feature engineering was performed in order to model the data with an XGBoost classifier.

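Below is a minimal sketch of what this preparation and training step could look like. The file path, variable names and hyperparameters are illustrative assumptions, not the exact code used for the article (that code is on my GitHub).

# Load data and train an XGBoost classifier (illustrative sketch)
import pandas as pd
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("adult.csv")                      # assumed local copy of Adult Census Income
y = (df["income"] == ">50K").astype(int)           # target: 1 if income > $50K
X = pd.get_dummies(df.drop(columns=["income"]))    # minimal feature engineering: one-hot encoding

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

clf_xgb = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)  # untuned hyperparameters
clf_xgb.fit(X_train, y_train)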

The performance metrics obtained for the model are the following:

Fig. 2: Receiver Operating Characteristic (ROC) curves for Train and Test sets.
Fig. 3: XGBoost performance metrics
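
For reference, a hedged sketch of how such ROC curves and summary metrics could be computed, reusing the illustrative names from the training sketch above:

# Evaluate the model on the Train and Test sets (illustrative sketch)
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve

for name, X_, y_ in [("Train", X_train, y_train), ("Test", X_test, y_test)]:
    proba = clf_xgb.predict_proba(X_)[:, 1]
    fpr, tpr, _ = roc_curve(y_, proba)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc_score(y_, proba):.3f})")
    print(name, "accuracy:", round(accuracy_score(y_, clf_xgb.predict(X_)), 3))

plt.plot([0, 1], [0, 1], "k--")  # chance line
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()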

The model’s performance seems to be pretty acceptable.

In my previous article we discussed several techniques for addressing model interpretability. Among other libraries, we used SHAP to obtain the feature importances in the model outputs:

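As a quick reminder of that step, here is a minimal sketch of how these SHAP importances can be obtained for a tree model (variable names follow the training sketch above):

# SHAP feature importance for the XGBoost model (illustrative sketch)
import shap

explainer = shap.TreeExplainer(clf_xgb)      # TreeExplainer supports tree ensembles such as XGBoost
shap_values = explainer.shap_values(X_train)

# Global importance: mean absolute SHAP value per feature
shap.summary_plot(shap_values, X_train, plot_type="bar")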

Fig. 4: SHAP Feature Importance

There are several features in the dataset that could be considered 'sensitive' to include in the model, some of them more controversial than others. For instance, features like Nationality, Race and Gender are probably the most sensitive ones in determining an individual's income.

Moreover, even though features like Age and Marital Status may have good predictive power because they act as proxies for certain aspects of an individual, such as years of work experience or education, they could also be considered sensitive.

So, how can we assess the degree to which the model is relying on these sensitive features to make its predictions?

2. FairML Intuition

Like most interpretation algorithms, the basic idea behind FairML is to measure how the model’s predictions vary with perturbations made in the inputs. If a small change in a feature dramatically modifies the output, then the model is sensitive to that feature.

However, if the features are correlated, the indirect effects between them might still not be accounted for in the interpretation model. FairML addresses this multicollinearity problem using orthogonal projection.

Orthogonal Projection

Fig. 5: Orthogonal projection of vector a on vector b

An orthogonal projection is a type of vector projection that maps a vector onto the orthogonal (perpendicular) direction of another vector. If a vector a is projected onto a vector b (in Euclidean space), the component of a that lies in the direction of b is obtained.

This concept is very important in FairML since it allows the linear dependence between features to be completely removed. If two vectors are orthogonal to each other, then no linear scaling of one vector can produce the other. The component of a orthogonal to b can be calculated as a2 = a - a1, where a1 is the projection of a onto b.

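A small NumPy illustration of this idea (the two vectors are arbitrary examples):

# Orthogonal projection in NumPy (illustrative sketch)
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

a1 = (np.dot(a, b) / np.dot(b, b)) * b   # projection of a onto b
a2 = a - a1                              # component of a orthogonal to b

print(a1, a2)         # [3. 0.] [0. 4.]
print(np.dot(a2, b))  # 0.0 -> no remaining linear dependence on b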

Orthogonal projection guarantees that there will be no hidden collinearity effects. It is important to note that this is a linear transformation, so it does not account for non-linear dependencies between features. To solve this, FairML uses basis expansion and a greedy search over such expansions.

FairML Process

Fig. 6: FairML process (Image by Julius Adebayo)

If F is a model trained with 2 features x1 and x2, to calculate the dependence of F on x1, first x2 is made orthogonal to x1 to remove the linear dependence between the two. Secondly, the variation in the model output is analyzed using the orthogonal component of x2 while making perturbations in x1. The change in output between the perturbed input and the original input indicates the dependence of the model on x1. The dependence of F on x2 can be estimated in the same way.

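The sketch below illustrates this procedure for a pair of numeric columns. It is only a conceptual rendering of the idea, not FairML's actual implementation; the helper name and the choice of a median perturbation are assumptions.

# Conceptual sketch of the orthogonalize-then-perturb procedure
import numpy as np

def dependence_on(model, X, audited_col, other_col):
    X_mod = X.copy().astype(float)
    v1 = X_mod[audited_col].to_numpy()
    v2 = X_mod[other_col].to_numpy()

    # Remove the linear dependence of the other feature on the audited one
    X_mod[other_col] = v2 - (np.dot(v2, v1) / np.dot(v1, v1)) * v1
    baseline = model.predict(X_mod)

    # Perturb the audited feature (here: replace it with its median value)
    X_mod[audited_col] = np.median(v1)
    perturbed = model.predict(X_mod)

    # Average change in output = estimated dependence on the audited feature
    return np.mean(np.abs(perturbed.astype(float) - baseline.astype(float)))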

3. Assessing Model Fairness

Now that we know how FairML works, let’s use it to evaluate our model. Firstly, we will install the Python package and import the required modules.

# FairML install
pip install https://github.com/adebayoj/fairml/archive/master.zip

# Import modules
from fairml import audit_model
from fairml import plot_dependencies

Secondly, we will audit the model. The audit_model method receives 2 required and 5 optional inputs (an illustrative call using the optional arguments follows the list):

Required

  • predict_function: black-box model function that has a predict method.

  • input_dataframe: dataframe with shape (n_samples, n_features)

Optional

  • distance_metric: one of [‘mse’, ‘accuracy’] (default=‘mse’)

  • direct_input_pertubation_strategy: refers to how to zero out a single variable. Options = [‘constant-zero’ (replace with a random constant value), ‘constant-median’ (replace with median constant value), ‘global-permutation’ (replace all values with a random permutation of the column)].

  • number_of_runs: number of runs to perform (default=10).

  • include_interactions: flag to enable checking model dependence on interactions (default=False).

  • external_data_set: data that was not used to train the model, but whose impact on the black-box model you would like to assess (default=None).

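For illustration, a call that overrides some of these optional arguments might look as follows; the keyword names follow the list above, the chosen values are arbitrary, and importances_alt is just a hypothetical variable name:

# Model audit with some optional arguments set (illustrative sketch)
importances_alt, _ = audit_model(clf_xgb_array.predict, X_train,
                                 distance_metric="mse",
                                 direct_input_pertubation_strategy="constant-median",
                                 number_of_runs=20,
                                 include_interactions=False)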

# Model Audit
importances, _ = audit_model(clf_xgb_array.predict, X_train)

The audit_model method returns a dictionary where keys are the column names of the input dataframe (X_train) and values are lists containing model dependence on that particular feature. These lists are of size number_of_runs.

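Assuming the returned object behaves like the dictionary of lists described above, the raw dependencies can be inspected directly, for example:

# Inspect the raw audit output (illustrative sketch)
import numpy as np

for feature, runs in importances.items():
    print(f"{feature}: median dependence = {np.median(runs):.4f} over {len(runs)} runs")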

The process carried out for each feature is as described in the previous section. One drawback of this methodology is that it is computationally expensive to run when the number of features is high.

FairML allows plotting the dependence of the output on each feature (excluding the effect of the correlation with the other predictors):

# Plot Feature Dependencies
plot_dependencies(importances.median(),
                  reverse_values=False,
                  title="FairML Feature Dependence",
                  fig_size=(6, 12))

Fig. 7: FairML Feature Dependence

Red bars indicate that the feature contributes to an output 1 (Income > 50K), while light blue bars indicate that it contributes to an output 0 (Income <= 50k).

It is observed that this algorithm, by removing the dependence between features through orthogonal projection, identifies that the model has a high dependence on sensitive features such as race_White, nac_United-States and sex_Male. In other words, according to the trained model, a white man born in the United States will have a higher probability of having an income greater than USD 50k, which constitutes a very strong bias.

It is very important to note the relevance of the orthogonal projection in the algorithm, since features such as race_White and nac_United-States did not appear so relevant in SHAP's Feature Importance or in the other interpretation algorithms. This is probably because their effects are hidden in other features. By removing multicollinearity and evaluating the individual dependence on each feature, it is possible to identify the intrinsic effect of each one.

4. Recommended practices

Fairness in AI and ML is an open area of research. As a main contributor to this field, GoogleAI recommends some best practices in order to address this issue:

  • Design your model using concrete goals for fairness and inclusion: engage with social scientists, humanists, and other relevant experts for your product to understand and account for various perspectives.

  • Use representative datasets to train and test your model: identify prejudicial or discriminatory correlations between features, labels, and groups.

  • Check the system for unfair biases: while designing metrics to train and evaluate your system, also include metrics to examine performance across different subgroups (use diverse testers and stress-test the system on difficult cases); a minimal per-subgroup check is sketched after this list.

  • Analyze performance: even if everything in the system is carefully crafted to address fairness issues, ML-based models rarely operate with 100% perfection when applied to real, live data. When an issue occurs in a live product, consider whether it aligns with any existing societal disadvantages, and how it will be impacted by both short- and long-term solutions.

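As referenced in the list above, a minimal sketch of a per-subgroup check on the test set could look like this; it reuses the illustrative names from the earlier sketches, and sex_Male is one of the one-hot encoded features of this dataset:

# Compare performance across subgroups (illustrative sketch)
from sklearn.metrics import accuracy_score, recall_score

preds = clf_xgb.predict(X_test)
for group_value in (0, 1):
    mask = (X_test["sex_Male"] == group_value).to_numpy()
    print("sex_Male =", group_value,
          "| accuracy:", round(accuracy_score(y_test[mask], preds[mask]), 3),
          "| recall:", round(recall_score(y_test[mask], preds[mask]), 3))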

Conclusions

This article is meant to help data scientists get a better understanding of how their machine learning models treat pre-existing biases in data.

We have presented the FairML intuition on how it addresses this issue and implemented a fairness evaluation of an XGBoost model trained on the Adult Census Income dataset. Finally, we summarized some of the best practices GoogleAI recommends in this growing field.

As an ending note, I would like to leave a quote from Google’s responsible AI practices:

AI systems are enabling new experiences and abilities for people around the globe. Beyond recommending books and television shows, AI systems can be used for more critical tasks, such as predicting the presence and severity of a medical condition, matching people to jobs and partners, or identifying if a person is crossing the street. Such computerized assistive or decision-making systems have the potential to be fairer and more inclusive at a broader scale than decision-making processes based on ad hoc rules or human judgments. The risk is that any unfairness in such systems can also have a wide-scale impact. Thus, as the impact of AI increases across sectors and societies, it is critical to work towards systems that are fair and inclusive for all.

I hope this article serves its purpose as a general guide to addressing fairness in black-box models and contributes a grain of sand toward a fairer and more inclusive use of AI. The entire code can be found on my GitHub.

Source: https://towardsdatascience.com/inclusive-machine-learning-addressing-model-fairness-532884ecd859
