Artificial Intelligence/Machine Learning, ASA(ALT), Phase I

Large Language Models for System Security Engineering Analysis

Release Date: 11/06/2024
Solicitation: 25.4
Open Date: 12/04/2024
Topic Number: A254-006
Application Due Date: 01/08/2025
Duration: 6 months
Close Date: 01/08/2025
Amount Up To: $250,000

Objective

The proposed topic will develop a Large Language Model (LLM) tailored to reduce the risk of traditional and emerging Artificial Intelligence (AI) and Machine Learning (ML) adversarial attacks on aviation and missile systems.

The proposed LLM will assist “blue” and adversarial assessment teams with vulnerability identification and exploitation modeling. LLM outputs will serve as inputs to US Army DEVCOM AvMC “blue” team activities and to designated DoD red teams. The LLM prototype will identify vulnerabilities and create proof-of-concept exploits in a threat intelligence-informed environment.

Description

This topic addresses three related problems: the growing complexity of traditional and AI/ML-based cybersecurity threats to DoD weapon systems, the increasingly connected and automated architectures of those systems, and the need for advanced tools to enhance development-phase cyber resiliency efforts and exploit modeling.

This topic is essential because it will improve US DoD aviation and missile platforms’ resilience, survivability, and lethality through the enhanced identification and exploitation modeling of system vulnerabilities during system development. This will achieve significant impact in the most cost-effective phase of the system lifecycle.

The proposed approach leverages state-of-the-art LLMs to develop a specialized tool for system security engineering risk reduction efforts (blue team and AAMCATs) and provides critical exploitation inputs to DoD-approved red team operations.

This topic innovates on existing technology by fine-tuning a foundational LLM built to analyze complex weapon system codebases, identify vulnerabilities, and generate proof-of-concept exploits. This approach combines natural language processing and code analysis capabilities, creating a powerful tool that surpasses current manual and automated analysis methods.

The proposed LLM will be trained on the large amounts of relevant data available to DEVCOM AvMC including code, network traffic, static and dynamic analysis results, external software code bases and malicious software examples, and threat intelligence from partner US Government organizations.

The LLM will be integrated with AvMC software and hardware integration labs to provide further tuning and address the issue of run-time analysis when software code is unavailable to the cyber resiliency team.

Phase I

This topic is only accepting Phase I proposals for a cost up to $250,000 for a 6-month period of performance. The Phase I effort is a Technical Feasibility Study of the proposed LLM, including data and technical requirements, cost, and acquisition plans/schedule.

The final Study whitepaper milestone is due six months after the start of the project. Interim milestones include Data Identification, Sufficiency, and Access (2 months), Weapon System Prototype Identification (3 months), Interim Model Selection/Strategy (4 months), and Resource Identification (5 months).

At the end of Phase I, the Army will require the company to provide a concept demonstration of its technology that shows a high probability that continued design and development will result in a mature Phase II product.

Phase II

Phase II (24 months) covers prototype-scale model training and deployment in a controlled environment. The prototype will identify defined vulnerabilities and generate representative proof-of-concept exploits for a selected aviation or missile weapon system.

Interim milestones will validate progress and inform the project development team. They include a Concept Review in month 6 (design concept approval), a Preliminary Design Review (PDR) in month 10 (review of preliminary design and datasets), an Intermediate Design Review (IDR) in month 14 (approval to proceed to model development/training), a Training Review in month 19 (approval for full-scale training), and a Critical Design Review (CDR) in month 22 (review of the fully trained model and results). Monthly status reports will track progress toward milestones and objectives.

Phase III

The developed LLM can serve as a resource for designing systems and identifying security flaws in commercial, academic, and research use cases. The model potentially scales when trained with commercial data.

  • Financial Services: As the biggest buyer of cybersecurity software and services, financial services and Wall Street will be a big market segment for these systems.
  • eCommerce: Protect systems that use AI/ML, such as Amazon's, from malicious actors.
  • Manufacturing: As digital engineering continues to proliferate, manufacturing will need AI/ML vector analysis.
  • Healthcare: As AI proliferates in healthcare as it has in other sectors, the sector's highly sensitive data requires state-of-the-art AI/ML cybersecurity.
  • Energy & Utilities: Like the sectors above, energy and utilities require AI/ML cybersecurity.

Submission Information

For more information, and to submit your full proposal package, visit the DSIP Portal.

SBIR|STTR Help Desk: usarmy.sbirsttr@army.mil
