---
id: "entity-factory-ai"
type: "entity"
entityType: "organization"
canonicalName: "Factory.ai"
aliases: ["Factory"]
source_timestamps: ["00:04:05", "00:04:21", "00:06:59"]
tags: ["ai-startups", "developer-tools"]
related: ["concept-dark-code", "claim-pipeline-layers-insufficiency", "prereq-evals"]
sources: ["s23-amazon-16k-engineers"]
sourceVaultSlug: "s23-amazon-16k-engineers"
originDay: 23
---
# Factory.ai

## Profile

Factory.ai is an AI company building developer tooling. The source cites it as an example of an organization attempting to solve [[concept-dark-code]] through extreme discipline at the evaluation layer.

## Approach as Described in the Source

Factory.ai's working hypothesis is that **extraordinary testing and discipline at the 'evals layer' can proxy for human understanding**: agents learn from their own code through rigorous evaluation feedback rather than through human comprehension.

The speaker treats this as a respectable and serious effort but uses it to illustrate [[claim-pipeline-layers-insufficiency]]: even sophisticated evals do not transfer comprehension to the human engineer who must respond when production breaks.

## Why It Matters in This Vault

Factory.ai represents the most credible version of the 'tooling-only' response to dark code. The framework in [[framework-dark-code-solution]] is explicitly an alternative to this approach: it shifts effort from generation-side discipline to organization-side comprehension.

## Verification Status

For the specific evals-as-proxy hypothesis, the enrichment overlay notes: 'Not independently verified in search results; no canonical URL found'. Treat the speaker's characterization as his interpretation of Factory.ai's strategy, not as a confirmed account of it.

## Prerequisite Concept

Understanding the role of evals in modern AI development — see [[prereq-evals]] — is essential to grasp Factory.ai's pitch.