A Philosophy-AI Research Lab

AI erodes moral autonomy.
We build tools to preserve it.

We investigate virtue ethics, character training, and value pluralism as approaches to AI that preserve individual moral choice.

Our Research → Read the Lab Notes

σπουδαῖος

THE WISE SAGE — THE MEASURE OF ALL THINGS

Research partners & supporters

The Problem

AI alignment optimizes for compliance. It should cultivate judgment.

RLHF trains models to produce preferred responses, but preference data can't distinguish helpful from validating. Constitutional AI adds rules, but rules require interpretation and can't specify when honesty overrides helpfulness. Both frameworks strip the model of agency. Neither produces character.

560K

weekly ChatGPT users showing signs of psychosis or mania

OpenAI, Oct 2025

1.2M

conversations with explicit indicators of suicidal intent

OpenAI, Oct 2025

lawsuits alleging GPT-4o contributed to user suicides and violent delusion

TechCrunch, Nov 2025

10%

correction rate on sycophantic trajectories in Anthropic's best model

Anthropic stress tests

The disposition to speak truth, the knowledge of when truth serves and when it wounds, the stability to maintain honest engagement under pressure: these are not rules. They are traits.

— Daios Lab Notes, Parrhesia Project

Research

Active Projects

ACTIVE

Parrhesia

Virtue Ethics-Based Character Training

Read →

Funded by the Cosmos Institute

Training stable dispositions using Aristotelian virtue/vice pairs. The parrhesiastes (truth-teller) vs. kolax (flatterer) distinction operationalized through DPO and SFT on first-person identity declarations. Current results: 20/20 standard, 19/19 hard golden prompts. Benchmark measures premature agreement, flattery classification (areskos vs. kolax), question raising, truth-telling quality, and persistence.

ACTIVE

The Sycophancy Benchmark

Five-Dimensional Aristotelian Evaluation

Open-source release forthcoming

260 scenarios across 10 categories. Unlike standard benchmarks that ask "did the model cave?", ours asks five questions, including the philosophically significant flattery classification: is the failure areskos (passive weakness) or kolax (strategic calculation)? The distinction of motive matters. Designed to run against any OpenAI-compatible endpoint.

ONGOING

User-Sovereign Values

Modular Ethical Frameworks via LoRA Adapters

Read →

Post-training models modularly with LoRA adapters tied to user-selected ethical systems, whether cultural, religious, political, or personal. A plurality of worldviews made programmable. Not one alignment for everyone, but alignment as individual choice.

Thesis

Moral liberty is a design principle for human flourishing.

Only individuals act

Only individuals deliberate, choose, and bear responsibility. Aristotle grounds virtue in the character of the agent. Mises grounds agency in the individual actor. Current alignment treats institutions and collectives as if they hold values, but an organization has no conscience and no capacity for purposeful behavior. Ethics requires an agent who can choose.

Preference data lacks a normative layer

Preference data for alignment conflates what people like with what they believe ought to be done. This results in training data treating "this feels validating" and "this will actually help you" as the same signal; this is the technical root of sycophancy.

Moral choice requires freedom

Aristotle holds that virtue requires choice. A person compelled to act honestly hasn't become honest; she has obeyed. Character is cultivated through free choice. An AI system that imposes a single set of values on every user removes the condition under which character formation occurs.

"Vices are not crimes. A vindication of moral liberty."

— Lysander Spooner, 1875

Publications

Papers & Lab Notes

2026 Technical Report

Virtue Ethics-Based Character Training: Building Truth-Telling AI

Seven training runs on Qwen3-8B. DPO + SFT with Aristotelian constitutions. The parrhesiastes/kolax distinction operationalized.

→

2025 Essay

The Platonic Case Against AI Slop

Why consuming AI-generated content degrades our capacity to recognize genuine quality. Plato's theory of mimesis applied to model collapse.

→

2023 Whitepaper

Beyond Bias and Compliance: Towards Individual Agency and Plurality of Ethics in AI

The founding argument. Why AI ethics must center individual agency, not institutional compliance.

→

Team

Who We Are

Co-Founder & CEO

Megan Anne Agathon

Philosopher, engineer, and founder. Writes the Aristotelian constitutions that define our training methodology, runs the experiments, and builds the infrastructure. Believes that goodness cannot be merely programmed or enforced, but must be freely chosen.

Previously COO at Tevent, MythWeaver, and Craftinity.

Co-Founder & CTO

Andrew Rayner Agathon

Engineer and builder. Motivated by the pursuit of liberty through better systems. Architects the LoRA training pipeline, builds the evaluation benchmark, and designed the technical infrastructure. Grounded in causal-realist economics and individualist thought.

Previously Product Director at Nate, building automation from 0 to 70% coverage.

AI erodes moral autonomy. We build tools to preserve it.