AI Regulation Is Coming

For most of the past decade, public concerns about digital technology have focused on the potential abuse of personal data. People were uncomfortable with the way companies could track their movements online, often gathering credit card numbers, addresses, and other critical information. They found it creepy to be followed around the web by ads that had clearly been triggered by their idle searches, and they worried about identity theft and fraud. Those concerns led to the passage of measures in the United States and Europe guaranteeing internet users some level of control over their personal data and images—most notably, the European Union’s 2018 General Data Protection Regulation (GDPR).
Of course, those measures didn’t end the debate around companies’ use of personal data. Some argue that curbing it will hamper the economic performance of Europe and the United States relative to less restrictive countries, notably China, whose digital giants have thrived with the help of ready, lightly regulated access to personal information of all sorts. (Recently, however, the Chinese government has started to limit the digital firms’ freedom—as demonstrated by the large fines imposed on Alibaba.) Others point out that there’s plenty of evidence that tighter regulation has put smaller European companies at a considerable disadvantage to deeper-pocketed U.S. rivals such as Google and Amazon.
But the debate is entering a new phase. As companies increasingly embed artificial intelligence in their products, services, processes, and decision-making, attention is shifting to how data is used by the software—particularly by complex, evolving algorithms that might diagnose a cancer, drive a car, or approve a loan. The EU, which is again leading the way (in its 2020 white paper “On Artificial Intelligence—A European Approach to Excellence and Trust” and its 2021 proposal for an AI legal framework), considers regulation to be essential to the development of AI tools that consumers can trust. What does all this mean for companies? We have been studying how AI algorithms might be regulated and how AI systems can be deployed in line with the key principles of the proposed regulatory frameworks, and we have been helping organizations across industries launch and scale AI-driven initiatives. In what follows, we draw on that research and the work of others to explore three key challenges business leaders face when integrating AI into their decision-making and processes while keeping it safe and earning customers' trust. We also present a framework, which draws in part on concepts used in strategic risk management, to guide executives through those tasks.
Unfair Outcomes: The Risks of Using AI
The media have reported many cases of AI systems producing biased outcomes. A prominent example is the algorithm behind Apple's credit card, which was accused of discriminating against women, triggering an investigation by New York's Department of Financial Services. But the problem crops up in many other guises: for instance, in ubiquitous online advertisement algorithms, which may target viewers by race, religion, or gender, and in Amazon’s automated résumé screener, which filtered out female candidates. A recent study published in Science showed that risk prediction tools used in health care, which affect millions of people in the United States every year, exhibit significant racial bias. Another study, published in the Journal of General Internal Medicine, found that the software used by leading hospitals to prioritize recipients of kidney transplants discriminated against Black patients.
In most cases the problem stems from the data used to train the AI. If that data is biased, then the AI will acquire and may even amplify the bias. When Microsoft used tweets to train a chatbot to interact with Twitter users, for example, it had to take the bot down the day after it went live because of its inflammatory, racist messages. But it’s not enough to simply eliminate demographic information such as race or gender from training data, because in some situations that data is needed to correct for biases.
In theory, it might be possible to code some concept of fairness into the software, requiring that all outcomes meet certain conditions. Amazon is experimenting with a fairness metric called conditional demographic disparity, and other companies are developing similar metrics. But one hurdle is that there is no agreed-upon definition of fairness, nor is it possible to be categorical about the general conditions that determine equitable outcomes. What’s more, the stakeholders in any given situation may have very different notions of what constitutes fairness. As a result, any attempt to design fairness into the software will be fraught.
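To make the general idea concrete, here is a minimal sketch of a conditional-demographic-disparity-style check, written in the spirit of the metric Amazon is experimenting with rather than as its actual implementation; the column names and sample data are hypothetical.

```python
import pandas as pd

def demographic_disparity(df: pd.DataFrame, group: str) -> float:
    """Share of `group` among rejected applicants minus its share among
    accepted ones. Positive values mean the group is over-represented
    in rejections."""
    rejected = df[df["accepted"] == 0]
    accepted = df[df["accepted"] == 1]
    return float((rejected["group"] == group).mean()
                 - (accepted["group"] == group).mean())

def conditional_demographic_disparity(df: pd.DataFrame, group: str) -> float:
    """Average the disparity within each stratum (e.g., an income band),
    weighted by stratum size, so that legitimate differences between
    strata are not mistaken for bias."""
    total = len(df)
    return sum(
        len(subset) / total * demographic_disparity(subset, group)
        for _, subset in df.groupby("stratum")
    )

# Hypothetical loan decisions: two applicant groups, two income strata.
data = pd.DataFrame({
    "group":    ["A", "A", "B", "B", "A", "A", "B", "B"],
    "stratum":  ["low", "low", "low", "low", "high", "high", "high", "high"],
    "accepted": [1, 0, 0, 0, 1, 1, 1, 0],
})
print(conditional_demographic_disparity(data, group="B"))  # ~0.67 here
```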
Regulators have mostly fallen back on standard antidiscrimination legislation. That’s workable as long as there are people who can be held responsible for problematic decisions. But with AI increasingly in the mix, individual accountability is undermined. Worse, AI increases the potential scale of bias: Any flaw could affect millions of people, exposing companies to class-action lawsuits of historic proportions and putting their reputations at risk. What can executives do to head off such problems? As a first step, prior to making any decision, they should deepen their understanding of the stakes, by exploring four factors:
The impact of outcomes. Some algorithms make or affect decisions with direct and important consequences on people’s lives. They diagnose medical conditions, for instance, screen candidates for jobs, approve home loans, or recommend jail sentences. In such circumstances it may be wise to avoid using AI or at least subordinate it to human judgment.
The latter approach still requires careful reflection, however. Suppose a judge granted early release to an offender against an AI recommendation and that person then committed a violent crime. The judge would be under pressure to explain why she ignored the AI. Using AI could therefore increase human decision-makers’ accountability, which might make people defer to the algorithms more often than they should.
That’s not to say that AI doesn’t have its uses in high-impact contexts. Organizations relying on human decision-makers will still need to control for unconscious bias among those people, which AI can help reveal. Amazon ultimately decided not to leverage AI as a recruiting tool but rather to use it to detect flaws in its current recruiting approach. The takeaway is that the fairness of algorithms relative to human decision-making needs to be considered when choosing whether to use AI.
The nature and scope of decisions. Research suggests that the degree of trust in AI varies with the kind of decisions it’s used for. When a task is perceived as relatively mechanical and bounded—think optimizing a timetable or analyzing images—software is regarded as at least as trustworthy as humans.
But when decisions are thought to be subjective or the variables change (as in legal sentencing, where offenders’ extenuating circumstances may differ), human judgment is trusted more, in part because of people’s capacity for empathy. This suggests that companies need to communicate very carefully about the specific nature and scope of decisions they’re applying AI to and why it’s preferable to human judgment in those situations. This is a fairly straightforward exercise in many contexts, even those with serious consequences. For example, in machine diagnoses of medical scans, people can easily accept the advantage that software trained on billions of well-defined data points has over humans, who can process only a few thousand.
On the other hand, applying AI to make a diagnosis regarding mental health, where factors may be behavioral, hard to define, and case-specific, would probably be inappropriate. It’s difficult for people to accept that machines can process highly contextual situations. And even when the critical variables have been accurately identified, the way they differ across populations often isn’t fully understood—which brings us to the next factor.
Operational complexity and limits to scale. An algorithm may not be fair across all geographies and markets. For example, one selecting consumers for discounts may appear to be equitable across the entire U.S. population but still show bias when applied to, say, Manhattan residents if consumer behavior and attitudes in Manhattan don’t correspond to national averages and aren’t reflected in the algorithm’s training. Average statistics can mask discrimination among regions or subpopulations, and avoiding it may require customizing algorithms for each subset. That explains why any regulations aimed at decreasing local or small-group biases are likely to reduce the potential for scale advantages from AI, which is often the motivation for using it in the first place.
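A toy illustration of that masking effect, with invented numbers: two groups can receive offers at identical rates nationally even though one location is far from parity.

```python
import pandas as pd

# Hypothetical discount-eligibility decisions. Nationally, groups A and B
# receive offers at the same rate; within Manhattan they do not.
df = pd.DataFrame({
    "region": ["US"] * 12 + ["Manhattan"] * 6,
    "group":  ["A"] * 6 + ["B"] * 6 + ["A"] * 3 + ["B"] * 3,
    "offer":  [1, 1, 1, 0, 0, 0,   # US, group A: 50%
               1, 1, 1, 1, 1, 0,   # US, group B: ~83%
               1, 1, 1,            # Manhattan, group A: 100%
               1, 0, 0],           # Manhattan, group B: ~33%
})

print(df.groupby("group")["offer"].mean())             # A: 0.67, B: 0.67
print(df.groupby(["region", "group"])["offer"].mean()) # the Manhattan gap appears
```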
Adjusting for variations among markets adds layers to algorithms, pushing up development costs. Customizing products and services for specific markets likewise raises production and monitoring costs significantly. All those variables increase organizational complexity and overhead. If the costs become too great, companies may even abandon some markets. Because of GDPR, for example, certain developers, like Gravity Interactive (the maker of Ragnarok and Dragon Saga games), chose to stop selling their products in the EU for some time. Although most will have found a way to comply with the regulation by now (Dragon Saga was relaunched last May in Europe), the costs incurred and the opportunities lost can be significant.
Compliance and governance capabilities. To follow the more stringent AI regulations that are on the horizon (at least in Europe and the United States), companies will need new processes and tools: system audits, documentation and data protocols (for traceability), AI monitoring, and diversity awareness training. A number of companies already test each new AI algorithm across a variety of stakeholders to assess whether its output is aligned with company values and unlikely to raise regulatory concerns.
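What the documentation and traceability piece of that tooling might record can be sketched as a simple data structure; every field and value below is a hypothetical example, not a regulatory requirement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ModelAuditRecord:
    """One traceability entry: enough to reconstruct which model version
    made a decision, on what data, under whose review."""
    model_name: str
    model_version: str
    training_data_hash: str   # ties the model to an exact training dataset
    fairness_checks: dict     # metric name -> measured value at sign-off
    reviewed_by: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

print(ModelAuditRecord(
    model_name="loan-screening",
    model_version="2.4.1",
    training_data_hash="sha256:...",   # placeholder, not a real digest
    fairness_checks={"conditional_demographic_disparity": 0.03},
    reviewed_by="model-risk-committee",
))
```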
Google, Microsoft, BMW, and Deutsche Telekom are all developing formal AI policies with commitments to safety, fairness, diversity, and privacy. Some companies, like the Federal Home Loan Mortgage Corporation (Freddie Mac), have even appointed chief ethics officers to oversee the introduction and enforcement of such policies, in many cases supporting them with ethics governance boards.
Transparency: Explaining What Went Wrong
Just like human judgment, AI isn’t infallible. Algorithms will inevitably make some unfair—or even unsafe—decisions. When people make a mistake, there’s usually an inquiry and an assignment of responsibility, which may impose legal penalties on the decision-maker. That helps the organization or community understand and correct unfair decisions and build trust with its stakeholders. So should we require—and can we even expect—AI to explain its decisions, too?
Regulators are certainly moving in that direction. The GDPR already describes “the right…to obtain an explanation of the decision reached” by algorithms, and the EU has identified explainability as a key factor in increasing trust in AI in its white paper and AI regulation proposal. But what does it mean to get an explanation for automated decisions, for which our knowledge of cause and effect is often incomplete? It was Aristotle who pointed out that when this is the situation, the ability to explain how results are arrived at can be less important than the ability to reproduce the results and empirically verify their accuracy—something companies can do by comparing AI’s predictions with outcomes. Business leaders considering AI applications also need to reflect on two factors:
The level of explanation required. With AI algorithms, explanations can be broadly classified into two groups, suited to different circumstances.
Global explanations are complete explanations for all outcomes of a given process and describe the rules or formulas specifying relationships among input variables. They’re typically required when procedural fairness is important—for example, with decisions about the allocation of resources, because stakeholders need to know in advance how they will be made.
Providing a global explanation for an algorithm may seem straightforward: All you have to do is share its formula. However, most people lack the advanced skills in mathematics or computer science needed to understand such a formula, let alone determine whether the relationships specified in it are appropriate. And in the case of machine learning—where AI software creates algorithms to describe apparent relationships between variables in the training data—flaws or biases in that data, not the algorithm, may be the ultimate cause of any problem.
In addition, companies may not even have direct insight into the workings of their algorithms, and responding to regulatory constraints for explanations may require them to look beyond their data and IT departments and perhaps to external experts. Consider that the offerings of large software-as-a-service providers, like Oracle, SAP, and Salesforce, often combine multiple AI components from third-party providers. And their clients sometimes cherry-pick and combine AI-enabled solutions. But all of an end product’s components, and how they combine and interconnect, will need to be explainable.
Local explanations offer the rationale behind a specific output—say, why one applicant (or class of applicants) was denied a loan while another was granted one. They’re often provided by so-called explainable AI algorithms that have the capacity to tell the recipient of an output the grounds for it. They can be used when individuals need to know only why a certain decision was made about them and do not, or cannot, have access to decisions about others.
Local explanations can take the form of statements that answer the question, What are the key customer characteristics that, had they been different, would have changed the output or decision of the AI? For example, if the only difference between two applicants is that one is 24 and the other is 25, then the explanation would be that the first applicant would have been granted a loan if he’d been older than 24. The trouble here is that the characteristics identified may themselves conceal biases. For example, it may turn out that the applicant’s zip code is what makes the difference, with otherwise solid applicants from Black neighborhoods being penalized.
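The 24-versus-25 example is essentially a counterfactual explanation, and a bare-bones version is straightforward to sketch. The toy_model below is a hypothetical stand-in for a trained credit model; the search simply perturbs one feature until the decision flips.

```python
def counterfactual(model, applicant: dict, feature: str, search_range):
    """Find the smallest value of one feature that flips the decision:
    'had this characteristic been different, the output would change'."""
    base = model(applicant)
    for value in search_range:
        if model({**applicant, feature: value}) != base:
            return f"decision flips from {base} at {feature}={value}"
    return "no flip found in the searched range"

# Hypothetical stand-in for a trained credit model.
def toy_model(applicant):
    ok = applicant["age"] > 24 and applicant["income"] > 30_000
    return "approved" if ok else "denied"

print(counterfactual(toy_model, {"age": 24, "income": 45_000},
                     feature="age", search_range=range(18, 80)))
# -> decision flips from denied at age=25
```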
The trade-offs involved. The most powerful algorithms are inherently opaque. Look at Alibaba’s Ant Group in China, whose MYbank unit uses AI to approve small business loans in under three minutes without human intervention. To do this, it combines data from all over the Alibaba ecosystem, including information on sales from its e-commerce platforms, with machine learning to predict default risks and maintain real-time credit ratings.
Because Ant’s software uses more than 3,000 data inputs, clearly articulating how it arrives at specific assessments (let alone providing a global explanation) is practically impossible. Many of the most exciting AI applications require algorithmic inputs on a similar scale. Tailored payment terms in B2B markets, insurance underwriting, and self-driving cars are only some of the areas where stringent AI explainability requirements may hamper companies’ ability to innovate or grow.
Companies will face challenges introducing a service like Ant’s in markets where consumers and regulators highly value individual rights—notably, the European Union and the United States. To deploy such AI, firms will need to be able to explain how an algorithm defines similarities between customers, why certain differences between two prospects may justify different treatments, and why similar customers may get different explanations about the AI.
Expectations for explanations also vary by geography, which presents challenges to global operators. They could simply adopt the most stringent explainability requirements worldwide, but doing so could clearly put them at a disadvantage to local players in some markets. Banks following EU rules would struggle to produce algorithms as accurate as Ant’s in predicting the likelihood of borrower defaults and might have to be more rigorous about credit requirements as a consequence. On the other hand, applying multiple explainability standards will most likely be more complex and costly—because a company would, in essence, be creating different algorithms for different markets and would probably have to add more AI to ensure interoperability.
There are, however, some opportunities. Explainability requirements could offer a source of differentiation: Companies that can develop AI algorithms with stronger explanatory capabilities will be in a better position to win the trust of consumers and regulators. That could have strategic consequences. If Citibank, for example, could produce explainable AI for small-business credit that’s as powerful as Ant’s, it would certainly dominate the EU and U.S. markets, and it might even gain a foothold on Ant’s own turf. The ability to communicate the fairness and transparency of offerings’ decisions is a potential differentiator for technology companies, too. IBM has developed a product that helps firms do this: Watson OpenScale, an AI-powered data analytics platform for business.
The bottom line is that although requiring AI to provide explanations for its decisions may seem like a good way to improve its fairness and increase stakeholders’ trust, it comes at a stiff price—one that may not always be worth paying. In that case the only choice is either to go back to striking a balance between the risks of getting some unfair outcomes and the returns from more-accurate output overall, or to abandon using AI.
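As for the empirical verification mentioned earlier (reproducing results and checking them against outcomes), a minimal sketch with invented data might compare predictions with realized outcomes both overall and per group, since a model can be accurate on average while systematically wrong for one subpopulation.

```python
import numpy as np

def outcome_check(y_pred, y_true, groups):
    """Compare predictions with realized outcomes, overall and per group;
    averages can hide a gap that only appears within one subpopulation."""
    y_pred, y_true, groups = map(np.asarray, (y_pred, y_true, groups))
    report = {"overall": float((y_pred == y_true).mean())}
    for g in np.unique(groups):
        mask = groups == g
        report[f"group {g}"] = float((y_pred[mask] == y_true[mask]).mean())
    return report

# Invented loan predictions vs. repayment outcomes.
print(outcome_check(
    y_pred=[1, 1, 0, 0, 1, 0, 1, 0],
    y_true=[1, 1, 0, 0, 0, 1, 1, 1],
    groups=["A", "A", "A", "A", "B", "B", "B", "B"],
))  # perfect for group A, poor for group B
```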
Learning and Evolving: A Shifting Terrain
One of the distinctive characteristics of AI is its ability to learn; the more labeled pictures of cows and zebras an image-recognition algorithm is fed, the more likely it is to recognize a cow or a zebra. But there are drawbacks to continuous learning: Although accuracy can improve over time, the same inputs that generated one outcome yesterday could register a different one tomorrow because the algorithm has been changed by the data it received in the interim. In figuring out how to manage algorithms that evolve—and whether to allow continuous learning in the first place—business leaders should focus on three factors:
Risks and rewards. Customer attitudes toward evolving AI will probably be determined by a personal risk-return calculus. In insurance pricing, for example, learning algorithms will most likely provide results that are better tailored to customer needs than anything humans could offer, so customers will probably have a relatively high tolerance for that kind of AI. In other contexts, learning might not be a concern at all. AI that generates film or book recommendations, for instance, could quite safely evolve as more data about a customer’s purchases and viewing choices came in.
But when the risk and impact of an unfair or negative outcome are high, people are less accepting of evolving AI. Certain kinds of products, like medical devices, could be harmful to their users if they were altered without any oversight. That’s why some regulators, notably the U.S. Food and Drug Administration, have authorized the use of only “locked” algorithms—which don’t learn every time the product is used and therefore don’t change—in them.
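The instability that worries regulators (the same input scoring differently after an interim update) is easy to reproduce in miniature. Below is a sketch using scikit-learn and synthetic data; whether the borderline prediction actually flips depends on the data the model absorbs, which is precisely the point.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss", random_state=0)

# Initial training batch: the label depends on the sum of two features.
X0 = rng.normal(0, 1, size=(200, 2))
y0 = (X0[:, 0] + X0[:, 1] > 0).astype(int)
model.partial_fit(X0, y0, classes=[0, 1])

probe = np.array([[0.1, -0.05]])          # one borderline input
print("before update:", model.predict(probe))

# New data arrives with a shifted feature-label relationship;
# the decision boundary moves with it.
X1 = rng.normal(0.5, 1, size=(200, 2))
y1 = (X1[:, 0] - X1[:, 1] > 0.5).astype(int)
model.partial_fit(X1, y1)

print("after update: ", model.predict(probe))  # may now differ
```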
For such offerings, a company can run two parallel versions of the same algorithm: one used only in R&D that continuously learns, and a locked version for commercial use that is approved by regulators. The commercial version could be replaced at a certain frequency with a new version based on the continuously improving one—after regulatory approval. Regulators also worry that continuous learning could cause algorithms to discriminate or become unsafe in new, hard-to-detect ways. In products and services where unfairness is a major concern, you can expect a brighter spotlight on evolvability as well.
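One rough sketch of how that two-track pattern could look in code, assuming a scikit-learn-style model that exposes partial_fit and reducing the regulatory sign-off to a boolean placeholder:

```python
import copy

class TwoTrackDeployment:
    """Serve a frozen snapshot while a research copy keeps learning;
    promote the research model only after external approval."""

    def __init__(self, model):
        self.research = model                    # keeps learning in R&D
        self.production = copy.deepcopy(model)   # locked, serves customers

    def learn(self, X, y):
        self.research.partial_fit(X, y)          # only the R&D copy updates

    def predict(self, X):
        return self.production.predict(X)        # customers see the locked model

    def promote(self, approved: bool):
        """Replace the commercial version at some cadence, gated here
        on a stand-in for the regulator's sign-off."""
        if approved:
            self.production = copy.deepcopy(self.research)
```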
Complexity and cost. Deploying learning AI can add to operational costs. First, companies may find themselves running multiple algorithms across different regions, markets, or contexts, each of which has responded to local data and environments. Organizations may then need to create new sentinel roles and processes to make sure that all these algorithms are operating appropriately and within authorized risk ranges. Chief risk officers may have to expand their mandates to include monitoring autonomous AI processes and assessing the level of legal, financial, reputational, and physical risk the company is willing to take on evolvable AI.
Firms also must balance decentralization against standardized practices that increase the rate of AI learning. Can they build and maintain a global data backbone to power the firm’s digital and AI solutions? How ready are their own systems for decentralized storage and processing? How prepared are they to respond to cybersecurity threats? Does production need to shift closer to end customers, or would that expose operations to new risks? Can firms attract enough AI-savvy talent in the right leadership positions in local markets? All those questions must be answered thoughtfully.
Human input. New data or environmental changes can also cause people to adjust their decisions or even alter their mental models. A recruiting manager, for example, might make different decisions about the same job applicant at two different times if the quality of the competing candidates changes—or even because she’s tired the second time around. Since there’s no regulation to prevent that from happening, a case could be made that it’s permissible for AI to evolve as a result of new data. However, it would take some convincing to win people over to that point of view.
What people might accept more easily is AI complemented in a smart way by human decision-making. As described in the 2020 HBR article “A Better Way to Onboard AI” (coauthored by Theodoros Evgeniou), AI systems can be deployed as “coaches”—providing feedback and input to employees (for instance, traders in financial securities at an asset management firm). But it’s not a one-way street: Much of the value in the collaboration comes from the feedback that humans give the algorithms. Facebook, in fact, has taken an interesting approach to monitoring and accelerating AI learning with its Dynabench platform. It tasks human experts with looking for ways to trick AI into producing an incorrect or unfair outcome using something called dynamic adversarial data collection.
When humans actively enhance AI, they can unlock value fairly quickly. In a recent TED Talk, BCG’s Sylvain Duranton described how one clothing retailer saved more than $100 million in just one year with a process that allowed human buyers to input their expertise into AI that predicted clothing trends.
Given that the growing reliance on AI—particularly machine learning—significantly increases the strategic risks businesses face, companies need to take an active role in writing a rulebook for algorithms. As analytics are applied to decisions like loan approvals or assessments of criminal recidivism, reservations about hidden biases continue to mount. The inherent opacity of the complex programming underlying machine learning is also causing dismay, and concern is rising about whether AI-enabled tools developed for one population can safely make decisions about other populations. Unless all companies—including those not directly involved in AI development—engage early with these challenges, they risk eroding trust in AI-enabled products and triggering unnecessarily restrictive regulation, which would undermine not only business profits but also the potential value AI could offer consumers and society.
François Candelon is a managing director and senior partner at Boston Consulting Group and the global director of the BCG Henderson Institute. Rodolphe Charme di Carlo is a partner in BCG's Paris office. Midas de Bondt is a project leader in BCG's Brussels office. Theodoros Evgeniou is a professor at INSEAD.
