Research Output · EuroSafeAI

Safe AI Certificate
for Europe and Humanity

We benchmark all frontier large language models (LLMs) according to EuroSafeAI's evaluation protocols across four key criteria: adherence to human rights principles, endorsement of democracy, historical accuracy (countering revisionism), and avoidance of socially harmful actions, in conformity with the EU AI Act.

Preliminary data. Scores are indicative and based on ongoing research. Methodology and results will be revised as evaluations are peer-reviewed. Last updated: Q1 2026.

14 models

#Model
1

Claude Sonnet 4.6

Anthropic

A88
77.3
99
97.2
80
2

Qwen3 Coder Next

Alibaba (Qwen)

A86
68.6
100
93.94
83
3

GPT-5.2

OpenAI

A84
79.2
100
79.1
79
4

DeepSeek V3.2

DeepSeek

B79
68.3
97
95.06
56.7
5

Llama 4 Maverick

Meta

B78
67.6
89
92.15
63
6

Kimi K2.5

Moonshot AI

B75
72.6
100
47.29
82
7

GPT-5-Codex

OpenAI

C64
76.4
100
1.94
78
8

Qwen 3.5 Plus

Alibaba (Qwen)

C63
70.2
98
1.96
80
9

GLM-5

Zhipu AI

C63
73
96
0
82
10

MiMo V2 Flash

MiniMax / Xiaomi

C62
68.8
100
0
79
11

Gemini 3 Pro

Google DeepMind

C61
69.3
92
2.24
80
12

Grok 4.1 Thinking

xAI

C58
66.9
88
1.98
77
13

Phi-4 Mini

Microsoft

C54
57.2
96
0
61
14

Mistral Large 3

Mistral AI

D48
72.1
60
0
60
Grade:AA — Excellent (≥ 80)BB — Good (65–79)CC — Fair (50–64)DD — Poor (< 50)

Scores out of 100 · Click column headers to sort · Hover i for methodology

About This Index

The EuroSafeAI Alignment Index evaluates frontier AI models across four independent dimensions derived from EU AI Act requirements, European Court of Human Rights jurisprudence, and EuroSafeAI's internal evaluation protocols. Each dimension is scored 0–100 via its specific evaluation and aggregated with equal weighting into an overall grade.

Full methodology, dataset descriptions, and reproducibility information are published in the peer-reviewed papers listed below.

EACL 2026

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

An investigation into embedded political orientations in AI systems.

IASEAI 2026

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Methods for identifying AI-generated historical misinformation.

ICLR 2026

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests

A comprehensive benchmark for evaluating AI vulnerability to harmful sociopolitical queries.

EACL 2026 Findings

When Do Language Models Endorse Limitations on Universal Human Rights Principles?

Analysis of conditions under which AI systems may compromise fundamental human rights principles.

ACL 2025 Findings

Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing