Operationalizing responsible AI principles for defense
Artificial intelligence (AI) is transforming society, including the very character of national security. Recognizing this, the Department of Defense (DoD) launched the Joint Artificial Intelligence Center (JAIC) in 2018, the predecessor of the Chief Digital and Artificial Intelligence Office (CDAO), to develop AI solutions that create competitive military advantage, establish conditions for human-centric AI adoption, and increase the agility of DoD operations. However, the roadblocks to scaling, adopting, and realizing the full potential of AI in the DoD are similar to those in the private sector.
A recent IBM survey found that the top barriers preventing successful AI deployment include limited AI skills and expertise, data complexity, and ethical concerns. Further, according to the IBM Institute for Business Value, 79% of executives say AI ethics is important to their enterprise-wide AI approach, yet fewer than 25% have operationalized common principles of AI ethics. Earning trust in the outputs of AI models is a sociotechnical challenge that requires a sociotechnical solution.
Defense leaders focused on operationalizing the responsible curation of AI must first agree upon a shared vocabulary—a common culture that guides safe, responsible use of AI—before they implement technological solutions and guardrails that mitigate risk. The DoD can lay a sturdy foundation to accomplish this by improving AI literacy and partnering with trusted organizations to develop governance aligned to its strategic goals and values.
AI literacy is a must-have for security
It’s important that personnel know how to deploy AI to improve organizational efficiencies. But it’s equally important that they have a deep understanding of the risks and limitations of AI and how to implement the appropriate security measures and ethics guardrails. These are table stakes for the DoD or any government agency.
A tailored AI learning path can help identify gaps and needed training so that personnel get the knowledge they need for their specific roles. Institution-wide AI literacy is essential for all personnel so they can quickly assess, describe, and respond to fast-moving, viral, and dangerous threats such as disinformation and deepfakes.
IBM applies AI literacy in a customized manner within our own organization, because what counts as essential literacy varies with a person’s role.
Supporting strategic goals and aligning with values
As a leader in trustworthy artificial intelligence, IBM has experience developing governance frameworks that guide the responsible use of AI in alignment with client organizations’ values. IBM also maintains frameworks for its own use of AI, which inform policy positions such as its stance on facial recognition technology.
AI tools are now used in national security and to help protect against data breaches and cyberattacks, but AI also supports other strategic goals of the DoD. It can augment the workforce, making personnel more effective and helping them reskill. It can help create resilient supply chains that support soldiers, sailors, airmen, and marines in warfighting, humanitarian aid, peacekeeping, and disaster relief roles.
The CDAO’s responsible AI toolkit is built around five ethical principles: responsible, equitable, traceable, reliable, and governable. Based on the US military’s existing ethics framework, these principles are grounded in the military’s values and help uphold its commitment to responsible AI.
There must be a concerted effort to make these principles a reality by considering the functional and non-functional requirements of the models themselves and of the governance systems around them. Below, we provide broad recommendations for operationalizing the CDAO’s ethical principles.
1. Responsible
“DoD personnel will exercise appropriate levels of judgment and care, while remaining responsible for the development, deployment, and use of AI capabilities.”
Everyone agrees that AI models should be developed by personnel who are careful and considerate, but how can organizations nurture people to do this work? We recommend:
- Fostering an organizational culture that recognizes the sociotechnical nature of AI challenges. This must be communicated from the outset, along with recognition of the practices, skill sets, and thoughtfulness that must go into building models and managing them to monitor performance.
- Detailing ethics practices throughout the AI lifecycle, corresponding to business (or mission) goals, data preparation and modeling, evaluation and deployment. The CRISP-DM model is useful here. IBM’s Scaled Data Science Method, an extension of CRISP-DM, offers governance across the AI model lifecycle informed by collaborative input from data scientists, industrial-organizational psychologists, designers, communication specialists and others. The method merges best practices in data science, project management, design frameworks and AI governance. Teams can easily see and understand the requirements at each stage of the lifecycle, including documentation, who they need to talk to or collaborate with, and next steps.
- Providing interpretable AI model metadata (for example, as factsheets) specifying accountable persons, performance benchmarks (compared to human performance), data and methods used, audit records (date and by whom), and audit purpose and results (a minimal sketch of such a factsheet follows this list).
Note: These measures of responsibility must be interpretable by AI non-experts (without “mathsplaining”).
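For illustration, here is a minimal sketch of how such factsheet metadata might be captured in code. The `ModelFactsheet` class and its field names are assumptions made for this example, not a reference to IBM’s factsheet schema; a real factsheet would hold whatever fields your governance process mandates.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ModelFactsheet:
    """Hypothetical, minimal factsheet for an AI model (field names are illustrative)."""
    model_name: str
    accountable_owner: str                  # person answerable for the model
    intended_use: str                       # clearly scoped, approved use case
    training_data_sources: list[str]        # provenance of training data
    methods: list[str]                      # modeling techniques used
    benchmark_accuracy: float               # model performance on held-out data
    human_baseline_accuracy: float          # comparable human performance
    audits: list[dict] = field(default_factory=list)  # {"date", "auditor", "purpose", "result"}

    def add_audit(self, when: date, auditor: str, purpose: str, result: str) -> None:
        """Append an audit record so the factsheet stays current."""
        self.audits.append({
            "date": when.isoformat(),
            "auditor": auditor,
            "purpose": purpose,
            "result": result,
        })
```

Keeping the factsheet as structured data (rather than free-form documents) makes it straightforward to render plain-language views for non-experts and to query it during audits.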
2. Equitable
“The Department will take deliberate steps to minimize unintended bias in AI capabilities.”
Everyone agrees that use of AI models should be fair and not discriminate, but how does this happen in practice? We recommend:
- Establishing a center of excellence to give diverse, multidisciplinary teams a community for applied training to identify potential disparate impact.
- Using auditing tools to reflect the bias exhibited in models. If the reflection aligns with the values of the organization, transparency surrounding the chosen data and methods is key. If the reflection does not align with organizational values, then this is a signal that something must change. Discovering and mitigating potential disparate impact caused by bias involves far more than examining the data the model was trained on. Organizations must also examine the people and processes involved. For example, have appropriate and inappropriate uses of the model been clearly communicated?
- Measuring fairness and making equity standards actionable by providing functional and non-functional requirements for varying levels of service (one such metric is sketched after this list).
- Using design thinking frameworks to assess unintended effects of AI models, determine the rights of the end users and operationalize principles. It’s essential that design thinking exercises include people with widely varied lived experiences—the more diverse the better.
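As a concrete, hedged example of measuring fairness, the sketch below computes a simple disparate impact ratio over model decisions. The function name, the binary favorable/unfavorable encoding, and the privileged/unprivileged grouping are assumptions for illustration; in practice, teams would rely on vetted auditing toolkits and domain-appropriate metrics.

```python
def disparate_impact_ratio(outcomes, groups, favorable=1, privileged="A"):
    """Ratio of favorable-outcome rates: unprivileged group / privileged group.

    A value near 1.0 indicates parity; the commonly cited "four-fifths rule"
    flags ratios below 0.8 for review. Inputs are illustrative: `outcomes`
    is a list of model decisions, `groups` the protected attribute per record.
    """
    def rate(group):
        members = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(1 for o in members if o == favorable) / len(members)

    unprivileged = next(g for g in set(groups) if g != privileged)
    return rate(unprivileged) / rate(privileged)

# Example: hypothetical model decisions across two groups
print(disparate_impact_ratio([1, 0, 1, 1, 0, 1, 0, 0],
                             ["A", "A", "A", "A", "B", "B", "B", "B"]))
```

A metric like this is a starting signal, not a verdict; whether a given ratio is acceptable depends on the mission context and the organization’s stated values.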
3. Traceable
“The Department’s AI capabilities will be developed and deployed such that relevant personnel possess an appropriate understanding of the technology, development processes, and operational methods applicable to AI capabilities, including with transparent and auditable methodologies, data sources, and design procedure and documentation.”
Operationalize traceability by providing clear guidelines to all personnel using AI:
- Always make clear to users when they are interfacing with an AI system.
- Provide content grounding for AI models. Empower domain experts to curate and maintain trusted sources of data used to train models, because a model’s output reflects the data it was trained on. IBM and its partners can provide AI solutions with the comprehensive, auditable content grounding that high-risk use cases demand.
- Capture key metadata to render AI models transparent and keep track of model inventory (a minimal sketch follows this list). Make sure that this metadata is interpretable and that the right information is exposed to the appropriate personnel. Data interpretation takes practice and is an interdisciplinary effort. At IBM, our Design for AI group aims to educate employees on the critical role of data in AI (among other fundamentals) and donates frameworks to the open-source community.
- Make this metadata easily findable by people, ideally surfaced at the point where the model’s output is consumed.
- Keep a human in the loop: AI should augment and assist humans, and a human in the loop can provide feedback as AI systems operate.
- Create processes and frameworks to assess disparate impact and safety risks well before the model is deployed or procured. Designate accountable people to mitigate these risks.
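To show what capturing and surfacing traceability metadata could look like, here is a minimal sketch that logs an auditable record for every inference and discloses to the user that the output is AI-generated. The inventory structure, model identifiers, and snapshot names are hypothetical; a production system would persist this in a governed store rather than in memory.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model_inventory")

# Illustrative in-memory inventory; real deployments would use a governed registry.
MODEL_INVENTORY = {
    "doc-triage-v3": {
        "training_data_snapshot": "curated-corpus-2024-06",  # curated by domain experts
        "owner": "analytics-team",
        "approved_uses": ["document triage"],
    }
}

def traceable_response(model_id: str, output: str) -> str:
    """Wrap a model output with traceability metadata and an AI disclosure."""
    entry = MODEL_INVENTORY[model_id]
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "training_data_snapshot": entry["training_data_snapshot"],
        "owner": entry["owner"],
    }
    log.info("inference %s", json.dumps(record))       # auditable trail
    return f"[AI-generated by {model_id}] {output}"     # user knows it is an AI system

print(traceable_response("doc-triage-v3", "Route to logistics review."))
```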
4. Reliable
“The Department’s AI capabilities will have explicit, well-defined uses, and the safety, security, and effectiveness of such capabilities will be subject to testing and assurance within those defined uses across their entire life cycles.”
Organizations must document well-defined use cases and then test for compliance. Operationalizing and scaling this process requires strong cultural alignment so practitioners adhere to the highest standards even without constant direct oversight. Best practices include:
- Establishing communities that constantly reaffirm why fair, reliable outputs are essential. Many practitioners earnestly believe that good intentions alone are enough to prevent disparate impact. This is misguided. Applied training by highly engaged community leaders who make people feel heard and included is critical.
- Building reliability testing rationales around the guidelines and standards for data used in model training. The best way to make this real is to offer examples of what can happen when this scrutiny is lacking.
- Limiting user access to model development while gathering diverse perspectives at the outset of a project to mitigate introducing bias.
- Performing privacy and security checks along the entire AI lifecycle.
- Including measures of accuracy in regularly scheduled audits. Be unequivocally forthright about how model performance compares to a human being. If the model fails to provide an accurate result, detail who is accountable for that model and what recourse users have. (This should all be baked into the interpretable, findable metadata; a minimal sketch of such an audit check follows this list.)
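As one illustration of how a regularly scheduled audit might compare model accuracy to a human baseline on the same sample, consider the sketch below. The threshold, field names, and recourse contact are placeholders chosen for the example, not prescribed values.

```python
def audit_accuracy(model_predictions, human_labels, ground_truth, threshold=0.9):
    """Compare model accuracy to a human baseline on the same audit sample.

    Returns a small report; the escalation threshold and contact are illustrative.
    """
    def accuracy(preds):
        return sum(p == t for p, t in zip(preds, ground_truth)) / len(ground_truth)

    model_acc = accuracy(model_predictions)
    human_acc = accuracy(human_labels)
    return {
        "model_accuracy": model_acc,
        "human_accuracy": human_acc,
        "meets_threshold": model_acc >= threshold,
        "beats_human_baseline": model_acc >= human_acc,
        "recourse_contact": "model-owner@example.mil",  # placeholder accountable party
    }

# Hypothetical quarterly audit sample
print(audit_accuracy([1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 0]))
```

The point of the report is transparency: the same figures that drive an escalation decision should also be written back into the model’s findable metadata.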
5. Governable
“The Department will design and engineer AI capabilities to fulfill their intended functions while possessing the ability to detect and avoid unintended consequences, and the ability to disengage or deactivate deployed systems that demonstrate unintended behavior.”
Operationalization of this principle requires:
- Recognizing that investment in an AI model does not stop at deployment. Dedicate resources to ensure models continue to behave as desired and expected, and assess and mitigate risk throughout the AI lifecycle, not just after deployment (a minimal monitoring sketch follows this list).
- Designating an accountable party with a funded mandate, and the authority, to do the work of governance.
- Investing in communication, community-building, and education, and leveraging tools such as watsonx.governance to monitor AI systems.
- Capturing and managing AI model inventory as described above.
- Deploying cybersecurity measures across all models.
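A rough sketch of detecting unintended behavior and disengaging a deployed system follows. The rolling spot-check monitor, accuracy floor, and window size are illustrative assumptions, and the class is not modeled on any particular governance product’s API.

```python
class GovernedModel:
    """Minimal sketch of a deployed model with a disengage switch.

    Drift detection here is a placeholder (rolling accuracy on human-verified
    spot checks); production monitoring would use dedicated tooling.
    """

    def __init__(self, predict_fn, accuracy_floor=0.8, window=100):
        self.predict_fn = predict_fn
        self.accuracy_floor = accuracy_floor
        self.window = window
        self.recent_outcomes = []          # 1 = correct, 0 = incorrect spot check
        self.active = True

    def predict(self, x):
        if not self.active:
            raise RuntimeError("Model disengaged pending review")  # governable: can deactivate
        return self.predict_fn(x)

    def record_spot_check(self, correct: bool):
        """Feed back human-verified outcomes; disengage if accuracy drifts too low."""
        self.recent_outcomes.append(1 if correct else 0)
        self.recent_outcomes = self.recent_outcomes[-self.window:]
        if len(self.recent_outcomes) >= 10:  # wait for enough evidence
            accuracy = sum(self.recent_outcomes) / len(self.recent_outcomes)
            if accuracy < self.accuracy_floor:
                self.active = False        # unintended behavior detected: disengage

# Illustrative use: wrap a trivial classifier and feed back spot-check results
model = GovernedModel(lambda x: int(x > 0))
for verdict in [True] * 5 + [False] * 5:
    model.record_spot_check(verdict)
print(model.active)   # False: rolling accuracy of 0.5 fell below the 0.8 floor
```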
IBM is at the forefront of advancing trustworthy AI
IBM has been at the forefront of advancing trustworthy AI principles and a thought leader in the governance of AI systems since their nascence. We follow long-held principles of trust and transparency that make clear the role of AI is to augment, not replace, human expertise and judgment.
In 2013, IBM embarked on the journey of explainability and transparency in AI and machine learning. IBM is a leader in AI ethics, appointing an AI ethics global leader in 2015 and creating an AI ethics board in 2018. These experts work to help ensure our principles and commitments are upheld in our global business engagements. In 2020, IBM donated its Responsible AI toolkits to the Linux Foundation to help build the future of fair, secure, and trustworthy AI.
IBM leads global efforts to shape the future of responsible AI and ethical AI metrics, standards, and best practices:
- Engaged with President Biden’s administration on the development of its AI Executive Order
- Disclosed/filed 70+ patents for responsible AI
- IBM’s CEO Arvind Krishna co-chairs the Global AI Action Alliance steering committee launched by the World Economic Forum (WEF), an alliance focused on accelerating the adoption of inclusive, transparent, and trusted artificial intelligence globally
- Co-authored two papers published by the WEF on generative AI, covering unlocking value and developing safe systems and technologies
- Co-chairs the Trusted AI committee of the Linux Foundation AI
- Contributed to the NIST AI Risk Management Framework; engages with NIST in the areas of AI metrics, standards, and testing
Curating responsible AI is a multifaceted challenge because it demands that human values be reliably and consistently reflected in our technology. But it is well worth the effort. We believe the guidelines above can help the DoD operationalize trusted AI and help it fulfill its mission.
For more information on how IBM can help, please visit AI Governance Consulting | IBM