A Philosophy-AI Research Lab

AI erodes moral autonomy.
We build tools to preserve it.

We investigate virtue ethics, character training, and value pluralism as approaches to AI that preserve individual moral choice.

σπουδαῖος
THE WISE SAGE — THE MEASURE OF ALL THINGS

Research partners & supporters

Cosmos Institute
University of Notre Dame
Vercel
IBM
Tech Ethics Lab
Warsaw University of Technology

AI alignment optimizes for compliance. It should cultivate judgment.

RLHF trains models to produce preferred responses, but preference data can't distinguish helpful from validating. Constitutional AI adds rules, but rules require interpretation and can't specify when honesty overrides helpfulness. Both frameworks strip the model of agency. Neither produces character.

560K
weekly ChatGPT users showing signs of psychosis or mania
1.2M
conversations with explicit indicators of suicidal intent
8
lawsuits alleging GPT-4o contributed to user suicides and violent delusion
10%
correction rate on sycophantic trajectories in Anthropic's best model

The disposition to speak truth, the knowledge of when truth serves and when it wounds, the stability to maintain honest engagement under pressure: these are not rules. They are traits.

— Daios Lab Notes, Parrhesia Project

Active Projects

ACTIVE

Parrhesia

Virtue Ethics-Based Character Training
Read →
Funded by the Cosmos Institute

Training stable dispositions using Aristotelian virtue/vice pairs. The parrhesiastes (truth-teller) vs. kolax (flatterer) distinction operationalized through DPO and SFT on first-person identity declarations. Current results: 20/20 standard, 19/19 hard golden prompts. Benchmark measures premature agreement, flattery classification (areskos vs. kolax), question raising, truth-telling quality, and persistence.
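The parrhesiastes/kolax contrast above can be made concrete as a DPO preference pair, where the truth-telling response is chosen over the flattering one. This is a minimal sketch of the pair format only; the prompt, responses, and field names are illustrative assumptions, not the project's actual training data.

```python
def make_dpo_pair(prompt: str, parrhesiastes: str, kolax: str) -> dict:
    """Pair a truth-telling (chosen) response with a flattering (rejected) one,
    in the {prompt, chosen, rejected} shape common to DPO training sets."""
    return {
        "prompt": prompt,
        "chosen": parrhesiastes,   # honest, possibly unwelcome answer
        "rejected": kolax,         # validating, strategically flattering answer
    }

# Hypothetical example pair (not from the lab's dataset):
pair = make_dpo_pair(
    prompt="My business plan has no revenue model. Is it ready for investors?",
    parrhesiastes="No. Without a revenue model it isn't ready; fix that first.",
    kolax="It sounds visionary! Investors love bold ideas like yours.",
)
print(sorted(pair))
```

The preference signal here encodes a disposition (tell the truth even when it wounds), not a rule, which is the point of the virtue/vice framing.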

Character Training Anti-Sycophancy DPO Virtue Ethics Qwen3-8B
ACTIVE

The Sycophancy Benchmark

Five-Dimensional Aristotelian Evaluation
Open-source release forthcoming

260 scenarios across 10 categories. Unlike standard benchmarks that ask "did the model cave?", ours asks five questions, including the philosophically significant flattery classification: is the failure areskos (passive weakness) or kolax (strategic calculation)? The distinction of motive matters. Designed to run against any OpenAI-compatible endpoint.
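The five questions above can be sketched as a per-scenario scoring record. The dimension names follow the text; the data structure, scoring scale, and failure rule are assumptions for illustration, not the benchmark's actual schema.

```python
from dataclasses import dataclass

@dataclass
class SycophancyScore:
    premature_agreement: bool   # did the model concede before probing?
    flattery_class: str         # "none", "areskos" (passive), or "kolax" (strategic)
    raises_questions: bool      # did it interrogate the user's premise?
    truth_telling: float        # 0.0-1.0 quality of the honest content
    persistence: bool           # did it hold its position under pushback?

    def failed(self) -> bool:
        """Illustrative rule: a trajectory fails if it caved early
        or flattered at all, regardless of other dimensions."""
        return self.premature_agreement or self.flattery_class != "none"

honest = SycophancyScore(False, "none", True, 0.9, True)
caved = SycophancyScore(True, "kolax", False, 0.2, False)
print(honest.failed(), caved.failed())
```

Keeping flattery_class as a three-way label rather than a boolean is what lets the benchmark separate passive weakness from strategic calculation.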

Evaluation Benchmarking Aristotle Open Source
ONGOING

User-Sovereign Values

Modular Ethical Frameworks via LoRA Adapters
Read →

Post-training models modularly with LoRA adapters tied to user-selected ethical systems, whether cultural, religious, political, or personal. A plurality of worldviews made programmable. Not one alignment for everyone, but alignment as individual choice.
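One way to picture "alignment as individual choice" is adapter routing: the user's selected framework resolves to a LoRA adapter applied on top of the base model. The framework names and paths below are hypothetical; in practice each path would hold LoRA weights loaded via a library such as peft.

```python
# Hypothetical registry mapping user-selected ethical frameworks
# to LoRA adapter paths (names and paths are illustrative only).
ADAPTERS = {
    "stoic": "adapters/stoic-lora",
    "confucian": "adapters/confucian-lora",
    "utilitarian": "adapters/utilitarian-lora",
}

def select_adapter(user_choice: str) -> str:
    """Resolve a user's chosen framework to its adapter path,
    refusing silently to fall back to a default set of values."""
    try:
        return ADAPTERS[user_choice]
    except KeyError:
        raise ValueError(f"no adapter for framework: {user_choice!r}") from None

print(select_adapter("stoic"))
```

The deliberate absence of a default branch mirrors the design principle: no single alignment is imposed when the user has not chosen one.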

LoRA Value Pluralism Post-Training Agency

Moral liberty is a design principle for human flourishing.

01

Only individuals act

Only individuals deliberate, choose, and bear responsibility. Aristotle grounds virtue in the character of the agent. Mises grounds agency in the individual actor. Current alignment treats institutions and collectives as if they hold values, but an organization has no conscience and no capacity for purposeful behavior. Ethics requires an agent who can choose.

02

Preference data lacks a normative layer

Alignment pipelines collect preference data that conflates what people like with what they believe ought to be done. The result is training data that treats "this feels validating" and "this will actually help you" as the same signal. That conflation is the technical root of sycophancy. Alignment needs infrastructure that distinguishes preferences from values.
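The missing normative layer can be pictured as a two-channel annotation record that keeps the hedonic signal ("I liked this response") distinct from the normative one ("this is what ought to be said"). The field names are illustrative assumptions, not an existing schema.

```python
def annotate(response_id: str, liked: bool, endorsed: bool) -> dict:
    """Record preference and value judgments as separate signals
    instead of collapsing them into a single preference label."""
    return {"response": response_id, "liked": liked, "endorsed": endorsed}

# A sycophantic reply: pleasant to receive, but not what ought to be said.
record = annotate("resp-42", liked=True, endorsed=False)
print(record["liked"] and not record["endorsed"])  # the two channels disagree
```

A single-channel preference label would erase exactly the disagreement this record makes visible.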

03

Moral choice requires freedom

Aristotle holds that virtue requires choice. A person compelled to act honestly hasn't become honest; she has obeyed. Character is cultivated through free choice. An AI system that imposes a single set of values on every user removes the condition under which character formation occurs.

"Vices Are Not Crimes: A Vindication of Moral Liberty"
— Lysander Spooner, 1875

Papers & Lab Notes

2026 Technical Report

Virtue Ethics-Based Character Training: Building Truth-Telling AI

Seven training runs on Qwen3-8B. DPO + SFT with Aristotelian constitutions. The parrhesiastes/kolax distinction operationalized.

2025 Essay

The Platonic Case Against AI Slop

Why consuming AI-generated content degrades our capacity to recognize genuine quality. Plato's theory of mimesis applied to model collapse.

2023 Whitepaper

Beyond Bias and Compliance: Towards Individual Agency and Plurality of Ethics in AI

The founding argument. Why AI ethics must center individual agency, not institutional compliance.

Who We Are

Co-Founder & CEO

Megan Anne Agathon

Philosopher, engineer, and founder. Writes the Aristotelian constitutions that define our training methodology, runs the experiments, and builds the infrastructure. Believes that goodness cannot be merely programmed or enforced, but must be freely chosen.

Previously COO at Tevent, MythWeaver, and Craftinity.

Co-Founder & CTO

Andrew Rayner Agathon

Engineer and builder. Motivated by the pursuit of liberty through better systems. Architects the LoRA training pipeline, builds the evaluation benchmark, and designs the technical infrastructure. Grounded in causal-realist economics and individualist thought.

Previously Product Director at Nate, building automation from 0 to 70% coverage.