The argument for closed-source AI evaluation is straightforward: it's our edge. The argument for open-source AI evaluation is harder to make in a meeting. But here it is.
If you accept that performance evaluation eventually affects compensation, and you accept that compensation eventually affects whether someone can pay rent, then the algorithm that scores their work has the same status as a credit-scoring model or a sentencing recommendation. It's not a feature — it's an infrastructure decision with downstream effects on real people.
We didn't want to be the company that says "trust us." We're not even particularly trustworthy. We're a small Korean startup that nobody has audited. So we opened the algorithm.
What's open under AGPL:
- The synthesis prompts. Everything RUQA sends to Claude/GPT/Gemini to generate a daily report.
- The triangulation algorithm. The exact math behind T1-T5 weighting, deviation flagging, and HARD/SOFT thresholds.
- The capability scoring rubric. The 16 metaskills, their definitions, how outputs map to scores.
- The calibration period enforcement. How `calibration_until` propagates through the API.
- All UI and routing.
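To make the triangulation idea concrete, here is a minimal sketch of weighted T1-T5 aggregation with SOFT/HARD deviation flagging. The weights, threshold values, and function names below are illustrative assumptions for this post, not RUQA's published parameters — the actual math is in the repo.

```python
# Hypothetical triangulation sketch. WEIGHTS, SOFT_THRESHOLD, and
# HARD_THRESHOLD are made-up illustrative values, not RUQA's real ones.
from statistics import fmean

WEIGHTS = {"T1": 0.35, "T2": 0.25, "T3": 0.20, "T4": 0.12, "T5": 0.08}
SOFT_THRESHOLD = 1.0  # deviation that gets flagged for review
HARD_THRESHOLD = 2.0  # deviation that drops the signal entirely

def triangulate(scores: dict[str, float]) -> tuple[float, list[str]]:
    """Weighted mean of per-tier scores, with deviation flags.

    A tier whose score deviates from the unweighted baseline by more
    than HARD_THRESHOLD is excluded; more than SOFT_THRESHOLD is kept
    but flagged.
    """
    baseline = fmean(scores.values())
    flags: list[str] = []
    kept: dict[str, float] = {}
    for tier, score in scores.items():
        dev = abs(score - baseline)
        if dev >= HARD_THRESHOLD:
            flags.append(f"{tier}: HARD deviation ({dev:.1f})")
            continue  # excluded from the weighted mean
        if dev >= SOFT_THRESHOLD:
            flags.append(f"{tier}: SOFT deviation ({dev:.1f})")
        kept[tier] = score
    # Renormalize weights over the tiers that survived exclusion.
    total_w = sum(WEIGHTS[t] for t in kept)
    final = sum(WEIGHTS[t] * s for t, s in kept.items()) / total_w
    return final, flags
```

The point of publishing this kind of logic is that an outlier tier (say, one wildly divergent signal out of five) is visibly discarded rather than silently averaged in — anyone can read exactly when and why.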
What's not open:
- Proprietary training data we use to fine-tune for specific industries. (We don't share customer data here either — these are our internal evaluation datasets.)
- Commercial connectors to enterprise systems (SOC 2 evidence collectors, SCIM bridges, etc.).
Why AGPL specifically:
Plain MIT/Apache means a competitor can take everything, run it as a closed service, and never contribute back. AGPL says: if you run it as a service, your modifications are AGPL too. This is the right covenant for a tool that scores humans. It keeps the evaluation algorithm public as a class of software, not just in our copy of it.
The commercial license:
Some customers can't use AGPL — embedded use in closed-source SaaS, regulatory environments that disallow viral copyleft, etc. For them we sell a commercial license. The pricing is opinionated: expensive enough that we make money, cheap enough that companies thinking "we already use this internally" aren't priced out. We'll publish the rate card once we're confident in it.
The interesting outcome of the past four months: nobody has tried to fork us. The reason isn't that we're irreplaceable — the code is right there. The reason is that the code is the easy part. The data, the workflow integration, the calibration discipline, the willingness to actually publish the algorithm and answer for it — that's the moat. It turns out that "open the algorithm" is a feature, not a giveaway.