3. Security evaluation suites for agentic AI systems on AWS - AWS Prescriptive Guidance

3. Security evaluation suites for agentic AI systems on AWS

Foundation models form the intelligence core of agentic AI systems. This makes their security characteristics critical to overall system safety. Systematic evaluation and testing of model behavior can help you identify vulnerabilities before deployment.

3.1 Conduct model system card reviews (AI-specific)

Review model system cards thoroughly to understand the security posture, documented safeguards, and known limitations before deployment. System cards describe how models handle adversarial inputs and inappropriate requests. Examine documented results for adversarial safety measures and cyber evaluations. These assessments indicate model resilience against attack patterns and help you determine additional control requirements. This information is essential for risk assessment and determining additional security controls for your specific use case. Understanding baseline model security capabilities informs your overall security architecture decisions.

3.2 Use security evaluation suites (AI-specific)

Use model evaluation tools to probe your AI application with adversarial prompts that are designed to elicit security vulnerabilities or responsible AI failures. This systematic testing approach identifies potential vulnerabilities, such as:

  • Attempts to extract environment variables or credentials from the model

  • Prompts designed to generate malicious code or exploits

  • Information leakage of training data or sensitive information

Libraries and tools to support model evaluation include fmeval and SecEval.