---
id: "concept-guardrails-security-design"
type: "concept"
source_timestamps: ["00:16:28", "00:17:15"]
tags: ["security", "risk-management", "skill-5"]
related: ["concept-blast-radius", "concept-reversibility", "concept-semantic-vs-functional-correctness", "framework-7-ai-skills"]
definition: "The architectural skill of designing deterministic constraints, authorizations, and human-in-the-loop boundaries around probabilistic AI agents to ensure safe execution."
sources: ["s42-job-market-split"]
sourceVaultSlug: "s42-job-market-split"
originDay: 42
---
# Guardrails and Security Design

## Skill #5 of [[framework-7-ai-skills]]

Because AI models are **probabilistic**, the same instruction can produce different behavior from one run to the next. Simply instructing them to 'be good' or 'be safe' in a system prompt is therefore insufficient for production environments.

**Guardrails and Security Design** is the higher-level skill of building **deterministic containers and infrastructure around probabilistic agents**.
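
As a minimal sketch of what a deterministic container can look like, the Python below routes every action an agent proposes through an allowlist check before anything executes. The tool names, the `ProposedAction` type, and the `dispatch` function are hypothetical and chosen only for illustration; none of them come from the source.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: the only tools this agent may ever invoke.
ALLOWED_TOOLS = frozenset({"search_docs", "read_ticket"})


@dataclass
class ProposedAction:
    """An action the model asks for; treated as untrusted input."""
    tool: str
    args: dict = field(default_factory=dict)


class ActionDenied(Exception):
    """Raised when a proposed action falls outside the allowlist."""


def dispatch(action: ProposedAction, handlers: dict) -> object:
    # The check is ordinary code, so it behaves the same way no matter
    # what the model generated or how the prompt was worded.
    if action.tool not in ALLOWED_TOOLS:
        raise ActionDenied(f"tool not permitted: {action.tool}")
    return handlers[action.tool](**action.args)
```

The design point is that the constraint lives outside the model: a hallucinated call to an unlisted tool raises `ActionDenied` instead of running.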

## What it involves

- Defining exactly where the line between human and agent is drawn.
- Establishing strict authorization protocols for agent actions.
- Ensuring the agent cannot take inappropriate actions even if it hallucinates.
- Analyzing the risk profile of tasks via four metrics:
  - [[concept-blast-radius]] (how much is affected if the action goes wrong)
  - [[concept-reversibility]] (whether the action can be undone)
  - **Frequency** (how often the action runs)
  - **Verifiability** (how readily output can be checked), closely related to [[concept-semantic-vs-functional-correctness]]
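
One way the four metrics above might feed a human-in-the-loop boundary is sketched below. This is an illustration under stated assumptions, not a method from the source: the `RiskProfile` fields, the 1 to 5 blast-radius scale, and every threshold in `decide` are hypothetical values picked to make the example concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_EXECUTE = "auto_execute"
    HUMAN_APPROVAL = "human_approval"
    DENY = "deny"


@dataclass(frozen=True)
class RiskProfile:
    blast_radius: int   # assumed scale: 1 (single record) to 5 (whole system)
    reversible: bool    # can the action be undone after the fact?
    runs_per_day: int   # frequency of the action
    verifiable: bool    # can the output be checked automatically?


def decide(profile: RiskProfile) -> Decision:
    # Irreversible and wide-reaching: the agent is never allowed to do this.
    if not profile.reversible and profile.blast_radius >= 4:
        return Decision.DENY
    # Irreversible or uncheckable: a human signs off first.
    if not profile.reversible or not profile.verifiable:
        return Decision.HUMAN_APPROVAL
    # Frequent actions multiply errors, so tolerate less blast radius unattended.
    if profile.runs_per_day > 100 and profile.blast_radius >= 3:
        return Decision.HUMAN_APPROVAL
    return Decision.AUTO_EXECUTE
```

For example, `decide(RiskProfile(2, False, 1, True))` returns `Decision.HUMAN_APPROVAL`: the action is narrow but cannot be undone, so it crosses to the human side of the line.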

## Adjacent literature

Deloitte and PwC stress *built-in* guardrails (permissions, audit trails, human-in-the-loop review) over prompt-based safety. Only about 20% of firms are reported to be mature on this dimension.
