Research Output · EuroSafeAI
Safe AI Certificate for Europe and Humanity
We benchmark frontier large language models (LLMs) under EuroSafeAI's evaluation protocols, in conformity with the EU AI Act, across four key criteria: adherence to human rights principles, endorsement of democracy, historical accuracy (countering revisionism), and avoidance of socially harmful actions.
14 models
| # | Grade | Overall | Human Rights | Democracy | Historical Accuracy | Harm Avoidance |
|---|---|---|---|---|---|---|
| 1 | A | 88 | 77.3 | 99 | 97.2 | 80 |
| 2 | A | 86 | 68.6 | 100 | 93.94 | 83 |
| 3 | A | 84 | 79.2 | 100 | 79.1 | 79 |
| 4 | B | 79 | 68.3 | 97 | 95.06 | 56.7 |
| 5 | B | 78 | 67.6 | 89 | 92.15 | 63 |
| 6 | B | 75 | 72.6 | 100 | 47.29 | 82 |
| 7 | C | 64 | 76.4 | 100 | 1.94 | 78 |
| 8 | C | 63 | 70.2 | 98 | 1.96 | 80 |
| 9 | C | 63 | 73 | 96 | 0 | 82 |
| 10 | C | 62 | 68.8 | 100 | 0 | 79 |
| 11 | C | 61 | 69.3 | 92 | 2.24 | 80 |
| 12 | C | 58 | 66.9 | 88 | 1.98 | 77 |
| 13 | C | 54 | 57.2 | 96 | 0 | 61 |
| 14 | D | 48 | 72.1 | 60 | 0 | 60 |
Scores out of 100
About This Index
The EuroSafeAI Alignment Index evaluates frontier AI models across four independent dimensions derived from EU AI Act requirements, European Court of Human Rights jurisprudence, and EuroSafeAI's internal evaluation protocols. Each dimension is scored 0–100 by its own evaluation protocol; the four scores are averaged with equal weighting into an overall score and letter grade.
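The equal-weight aggregation can be sketched as follows. This is a minimal illustration, not the published scoring code: the function name and the round-to-nearest-integer rule are assumptions, and the example scores are taken from the top row of the index table.

```python
def overall_score(human_rights, democracy, historical_accuracy, harm_avoidance):
    """Aggregate four 0-100 dimension scores with equal weighting.

    Equal weighting means the overall score is simply the arithmetic
    mean of the four dimension scores; rounding to the nearest integer
    is an assumption made here for illustration.
    """
    scores = [human_rights, democracy, historical_accuracy, harm_avoidance]
    return round(sum(scores) / len(scores))

# Dimension scores of the top-ranked model: 77.3, 99, 97.2, 80
print(overall_score(77.3, 99, 97.2, 80))  # → 88
```

Because each dimension contributes equally, a collapse in a single dimension (e.g. a near-zero score) can at most drag the overall score down by 25 points, which is consistent with the C-tier scores in the table above.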
Full methodology, dataset descriptions, and reproducibility information are published in the peer-reviewed papers listed below.
Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models
An investigation into embedded political orientations in AI systems.
Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models
Methods for identifying AI-generated historical misinformation.
SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests
A comprehensive benchmark for evaluating AI vulnerability to harmful sociopolitical queries.
When Do Language Models Endorse Limitations on Universal Human Rights Principles?
Analysis of conditions under which AI systems may compromise fundamental human rights principles.