As part of an ongoing effort to keep you informed about our latest work, this blog post summarizes some recent publications from the SEI in the areas of large language models for cybersecurity, software engineering and acquisition with generative AI, zero trust, large language models in national security, capability-based planning, supply chain risk management, and quantum computing.
These publications highlight the latest work of SEI technologists in these areas. For each publication, this post lists the title, the author(s), and a link to where it can be accessed on the SEI website.
Considerations for Evaluating Large Language Models for Cybersecurity Tasks
by Jeff Gennari, Shing-hon Lau, Samuel J. Perl, Joel Parish (OpenAI), and Girish Sastry (OpenAI)
Generative artificial intelligence (AI) and large language models (LLMs) have taken the world by storm. The ability of LLMs to perform tasks seemingly on par with humans has led to rapid adoption in a variety of domains, including cybersecurity. However, caution is needed when using LLMs in a cybersecurity context because the consequences of errors can be severe and the details of each task matter. Current approaches to LLM evaluation tend to focus on factual knowledge as opposed to applied, practical tasks. But cybersecurity tasks often require more than just factual recall to complete. Humans performing cybersecurity tasks are often assessed in part on their ability to apply concepts to realistic situations and adapt to changing circumstances. This paper contends the same approach is necessary to accurately evaluate the capabilities and risks of using LLMs for cybersecurity tasks. To enable the creation of better evaluations, we identify key criteria to consider when designing LLM cybersecurity assessments. These criteria are further refined into a set of recommendations for how to assess LLM performance on cybersecurity tasks. The recommendations include properly scoping tasks, designing tasks based on real-world cybersecurity phenomena, minimizing spurious results, and ensuring results are not misinterpreted.
Read the white paper.
The Future of Software Engineering and Acquisition with Generative AI
by Douglas Schmidt (Vanderbilt University), Anita Carleton, James Ivers, Ipek Ozkaya, John E. Robert, and Shen Zhang
We stand at a pivotal moment in software engineering, with artificial intelligence (AI) playing a crucial role in driving approaches poised to enhance software acquisition, analysis, verification, and automation. While generative AI tools initially sparked excitement for their potential to reduce errors, scale changes effortlessly, and drive innovation, concerns have emerged. These concerns encompass security risks, unforeseen failures, and issues of trust. Empirical research on generative AI development assistants reveals that productivity and quality gains depend not only on the sophistication of tools but also on task flow redesign and expert judgment.
In this webcast, SEI researchers explore the future of software engineering and acquisition using generative AI technologies. They examine current applications, envision future possibilities, identify research gaps, and discuss the critical skill sets that software engineers and stakeholders need to effectively and responsibly harness generative AI’s potential. Fostering a deeper understanding of AI’s role in software engineering and acquisition accentuates its potential and mitigates its risks.
The webcast covers
- how to identify suitable use cases when starting out with generative AI technology
- the practical applications of generative AI in software engineering and acquisition
- how developers and decision makers can harness generative AI technology
Zero Trust Industry Days 2024 Scenario: Secluded Semiconductors, Inc.
by Rhonda Brown
Each accepted presenter at the SEI Zero Trust Industry Days 2024 event develops and proposes a solution for this scenario: A company is operating a chip manufacturing facility on an island where there may be loss of connectivity and cloud services for short or extended periods of time. There are many considerations when addressing the challenges of a zero trust implementation, including varying perspectives and philosophies. This event offers a deep examination of how solution providers and other organizations interpret and address the challenges of implementing zero trust. Using a scenario places boundaries on the zero trust space to yield richer discussions.
This year’s event focuses on the Industrial Internet of Things (IIoT), legacy systems, smart cities, and cloud-hosted services in a manufacturing environment.
Read the white paper.
Using Large Language Models in the National Security Realm
by Shannon Gallagher
At the request of the White House, the Office of the Director of National Intelligence (ODNI) began exploring use cases for large language models (LLMs) within the Intelligence Community (IC). As part of this effort, ODNI sponsored the Mayflower Project at Carnegie Mellon University’s Software Engineering Institute from May 2023 through September 2023. The Mayflower Project attempted to answer the following questions:
- How might the IC set up a baseline, stand-alone LLM?
- How might the IC customize LLMs for specific intelligence use cases?
- How might the IC evaluate the trustworthiness of LLMs across use cases?
In this SEI Podcast, Shannon Gallagher, AI engineering team lead, and Rachel Dzombak, former special advisor to the director of the SEI’s AI Division, discuss the findings and recommendations from the Mayflower Project and provide additional background information about LLMs and how they can be engineered for national security use cases.
Listen to or view the SEI Podcast.
Navigating Capability-Based Planning: The Benefits, Challenges, and Implementation Essentials
by Anandi Hira and William Nichols
Capability-based planning (CBP) defines a framework that takes an all-encompassing view of existing abilities and future needs for strategically deciding what is needed and how to effectively achieve it. Both business and government acquisition domains use CBP for financial success or to design a well-balanced defense system. The definitions understandably vary across these domains. This paper endeavors to consolidate these definitions to provide a comprehensive view of CBP, its potential, and the practical implementation of its principles.
Read the white paper.
Ask Us Anything: Supply Chain Risk Management
by Brett Tucker and Matthew J. Butkovic
According to the Verizon Data Breach Report, Log4j-related exploits have occurred less frequently over the past year. However, this Common Vulnerabilities and Exposures (CVE) flaw was originally documented in 2021. The threat still exists despite increased awareness. Over the past few years, the Software Engineering Institute has developed guidance and practices to help organizations reduce threats to U.S. supply chains. In this webcast, Brett Tucker and Matthew Butkovic answer enterprise risk management questions to help organizations achieve operational resilience in the cyber supply chain. The webcast covers
- enterprise risk governance and how to assess an organization’s risk appetite and policy as they relate to and integrate cyber risks into a global risk portfolio
- regulatory directives on third-party risk
- the agenda and topics to be covered in the upcoming CERT Cyber Supply Chain Risk Management Symposium in February
The Measurement Challenges in Software Assurance and Supply Chain Risk Management
by Nancy R. Mead, Carol Woody, and Scott Hissam
In this paper, the authors discuss the metrics needed to predict cybersecurity in open source software and how standards are needed to make it easier to apply these metrics in the supply chain. The authors provide examples of potentially useful metrics and underscore the need for data collection and analysis to validate the metrics. They assert that defining metrics, collecting and analyzing data to illustrate their utility, and using standard methods require unbiased collaborative work to achieve the desired results.
Read the white paper.
The Cybersecurity of Quantum Computing: 6 Areas of Research
by Tom Scanlon
Research and development of quantum computers continues to grow at a rapid pace. The U.S. government alone spent more than $800 million on quantum information science research in 2022. Thomas Scanlon, who leads the data science group in the SEI CERT Division, was recently invited to participate in the Workshop on Cybersecurity of Quantum Computing, co-sponsored by the National Science Foundation (NSF) and the White House Office of Science and Technology Policy, to examine the emerging field of cybersecurity for quantum computing. In this SEI podcast, Scanlon discusses how to create the discipline of cyber protection of quantum computing and outlines six areas of future research in quantum cybersecurity.