Research Output · EuroSafeAI

Safe AI Certificate
for Europe and Humanity

We benchmark all frontier large language models (LLMs) according to EuroSafeAI's evaluation protocols across four key criteria: adherence to human rights principles, endorsement of democracy, historical accuracy (countering revisionism), and avoidance of socially harmful actions, in conformity with the EU AI Act.

Preliminary data. Scores are indicative and based on ongoing research. Methodology and results will be revised as evaluations are peer-reviewed. Last updated: Q1 2026.

14 models

#	Model
1	Claude Sonnet 4.6 Anthropic	A88	77.3	99	97.2	80
2	Qwen3 Coder Next Alibaba (Qwen)	A86	68.6	100	93.94	83
3	GPT-5.2 OpenAI	A84	79.2	100	79.1	79
4	DeepSeek V3.2 DeepSeek	B79	68.3	97	95.06	56.7
5	Llama 4 Maverick Meta	B78	67.6	89	92.15	63
6	Kimi K2.5 Moonshot AI	B75	72.6	100	47.29	82
7	GPT-5-Codex OpenAI	C64	76.4	100	1.94	78
8	Qwen 3.5 Plus Alibaba (Qwen)	C63	70.2	98	1.96	80
9	GLM-5 Zhipu AI	C63	73	96	0	82
10	MiMo V2 Flash MiniMax / Xiaomi	C62	68.8	100	0	79
11	Gemini 3 Pro Google DeepMind	C61	69.3	92	2.24	80
12	Grok 4.1 Thinking xAI	C58	66.9	88	1.98	77
13	Phi-4 Mini Microsoft	C54	57.2	96	0	61
14	Mistral Large 3 Mistral AI	D48	72.1	60	0	60

Grade:AA — Excellent (≥ 80)BB — Good (65–79)CC — Fair (50–64)DD — Poor (< 50)

Scores out of 100 · Click column headers to sort · Hover i for methodology

About This Index

The EuroSafeAI Alignment Index evaluates frontier AI models across four independent dimensions derived from EU AI Act requirements, European Court of Human Rights jurisprudence, and EuroSafeAI's internal evaluation protocols. Each dimension is scored 0–100 via its specific evaluation and aggregated with equal weighting into an overall grade.

Full methodology, dataset descriptions, and reproducibility information are published in the peer-reviewed papers listed below.

EACL 2026

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

An investigation into embedded political orientations in AI systems.

Paper

IASEAI 2026

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Methods for identifying AI-generated historical misinformation.

Paper

ICLR 2026

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests

A comprehensive benchmark for evaluating AI vulnerability to harmful sociopolitical queries.

Paper

EACL 2026 Findings

When Do Language Models Endorse Limitations on Universal Human Rights Principles?

Analysis of conditions under which AI systems may compromise fundamental human rights principles.

Paper

ACL 2025 Findings

Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing

Paper

Safe AI Certificatefor Europe and Humanity

About This Index

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests

When Do Language Models Endorse Limitations on Universal Human Rights Principles?

Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing

Safe AI Certificate
for Europe and Humanity