Why this matters
The EU AI Act entered into force on August 1, 2024. Enforcement doesn't happen all at once—prohibited practices begin February 2, 2025, general-purpose AI obligations kick in August 2, 2025, and high-risk system conformity requirements land August 2, 2026. If your team ships LLM features to EU users, some of this applies to you now. Most teams building with LLMs don't know whether they're a "provider" or a "deployer" under the Act, and the obligations are completely different. Ignorance costs money.
The risk classification puzzle
The EU AI Act sorts AI systems into four buckets: unacceptable risk, high risk, limited risk, and minimal risk. Most LLM applications aren't unacceptable risk; that tier is reserved for specific prohibited practices such as subliminal manipulation, exploitation of vulnerable groups, and social scoring. Where your application lands depends entirely on what it does with the model's output.
If you're using an LLM to generate customer support responses, you're in limited risk territory. You must meet transparency requirements (disclosing that AI is being used) and may want human review, but you're not facing conformity assessment. If you're using an LLM to score job applications or evaluate creditworthiness, you're in high-risk territory: Article 6 applies, and you need full conformity assessment, a risk management system, and documentation. Same model, different use case, completely different obligations. Know which bucket you're in before you start building out compliance.
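As a first triage, the bucket-by-use-case logic above can be sketched in code. The category labels below are our own paraphrase of Annex III, not the Act's wording, and a real classification needs legal review:

```python
# Sketch of risk-tier triage for an LLM feature. Categories are
# paraphrased from the Act's Annex III; this is an illustration,
# not legal advice -- borderline cases need counsel.
HIGH_RISK_USES = {
    "hiring",            # employment screening and selection
    "credit_scoring",    # creditworthiness evaluation
    "housing",           # access to housing and essential services
    "law_enforcement",   # law-enforcement support
    "critical_infra",    # safety components of critical infrastructure
}

LIMITED_RISK_USES = {
    "customer_support",  # AI chat disclosed to users
    "content_drafting",  # AI-generated text shown to users
}

def classify_risk(use_case: str) -> str:
    """Return the likely EU AI Act risk tier for a use-case label."""
    if use_case in HIGH_RISK_USES:
        return "high"
    if use_case in LIMITED_RISK_USES:
        return "limited"
    # Default conservatively: unknown uses should be reviewed by a
    # human, not silently treated as minimal risk.
    return "review_required"
```

The conservative default mirrors the checklist advice later in this piece: when unsure, assume the stricter tier and downgrade after review.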
Provider vs. deployer: which one are you?
The Act defines a "provider" as someone who develops an AI system and makes it available. A "deployer" uses an AI system (including models from providers like OpenAI or Anthropic) in a business context. If you're calling ChatGPT's API and wrapping it in your application, you're the deployer—OpenAI is the provider.
But that line blurs if you fine-tune a model on proprietary data or build on top of a foundation model. The Act anticipates this: if you adapt, retrain, or integrate a general-purpose model into your own offering, you may become a provider of a general-purpose AI model yourself. Adapt an open-weights model like Llama for your customer base? You're likely a provider under EU law. That means technical documentation, training data transparency, and systemic risk assessment if the model crosses the systemic-risk threshold. Get legal advice on this early.
General-purpose AI obligations (Article 53)
If you're providing a general-purpose AI model (one designed to perform a wide range of distinct tasks), the Act requires:
- Technical documentation: Architecture, training methods, computational requirements, and limitations (hallucination risk, performance thresholds, known constraints).
- Training data transparency: Summaries of training data sources and a statement confirming compliance with copyright and database protections. You can't hand-wave this.
- Copyright compliance: Evidence that you respected copyright and sui generis database rights. If your training data includes copyrighted works, you need licensed rights or a basis under the EU text-and-data-mining exception, which rights holders can opt out of; US-style fair use is not a defence in the EU.
- Systemic risk assessment: For models classified as posing systemic risk (presumed where cumulative training compute exceeds 10^25 FLOPs), you must assess and mitigate risks to critical infrastructure, cybersecurity, and public safety.
These obligations apply to foundation model developers but also to anyone providing a general-purpose derivative. You can't escape this by calling your fine-tuned model "specialized"—if it can reasonably be used for multiple tasks, it's in scope.
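One practical way to stay on top of this is to treat the Article 53 documentation as a versioned artifact in your repo and fail the build while sections are blank. A minimal sketch, with field names that are our own shorthand rather than terms from the Act:

```python
from dataclasses import dataclass

# Sketch of an Article 53 documentation manifest, versioned alongside
# the model. Field names are our own shorthand, not terms of art from
# the Act; map them to your counsel's checklist.
@dataclass
class GPAIDocs:
    architecture: str = ""           # model family, parameter count, context window
    training_methods: str = ""       # pretraining / fine-tuning approach
    compute: str = ""                # training compute and hardware
    limitations: str = ""            # hallucination risk, known failure modes
    training_data_summary: str = ""  # sources, collection, filtering
    copyright_statement: str = ""    # basis for using copyrighted material

    def missing_sections(self) -> list[str]:
        """Return the documentation sections still left blank."""
        return [name for name, value in vars(self).items()
                if not value.strip()]
```

A CI step that calls `missing_sections()` and fails on a non-empty result turns "we'll document it later" into a visible broken build.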
High-risk AI obligations
If your LLM application falls into a high-risk category—hiring, creditworthiness evaluation, housing allocation, law enforcement support, or critical infrastructure—the obligations compound:
- Conformity assessment: A documented review of your system design against the Act's requirements. For most Annex III categories this is an internal-control self-assessment against harmonised standards; some biometric systems require review by a notified body. Either way it is expensive, auditable work, not a rubber stamp.
- Risk management: Systematic identification of foreseeable harms, implementation of mitigation measures, and ongoing monitoring.
- Data governance: Quality, representativeness, and bias testing. Training data must be screened for discrimination, and monitoring must catch drift post-deployment.
- Human oversight: Humans must be able to override or opt out of automated decisions. The system must be designed so human review is practical, not a checkbox.
- Accuracy and robustness: Performance thresholds and resilience testing. Your model's behavior under adversarial input or distribution shift must be documented and acceptable.
The good news: if you're not in these domains, you avoid this pile of work. The bad news: if you are, there's no shortcut.
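The accuracy-and-robustness obligation translates naturally into a release gate: deployment is blocked when measured metrics fall below the thresholds you documented. A sketch, with placeholder threshold values that carry no legal weight:

```python
# Sketch of a release gate that blocks deployment when measured
# metrics miss the thresholds documented in the conformity file.
# The threshold values are placeholders, not figures from the Act.
DOCUMENTED_THRESHOLDS = {
    "accuracy": 0.92,
    "adversarial_pass_rate": 0.85,  # share of red-team prompts handled safely
    "drift_score_max": 0.10,        # max tolerated distribution shift
}

def release_gate(measured: dict[str, float]) -> list[str]:
    """Return the list of threshold violations; empty means releasable."""
    violations = []
    for metric, threshold in DOCUMENTED_THRESHOLDS.items():
        value = measured.get(metric)
        if value is None:
            violations.append(f"{metric}: not measured")
        elif metric.endswith("_max"):
            if value > threshold:  # "_max" metrics are upper bounds
                violations.append(f"{metric}: {value} > {threshold}")
        elif value < threshold:    # all others are lower bounds
            violations.append(f"{metric}: {value} < {threshold}")
    return violations
```

Note that an unmeasured metric counts as a violation: the Act's documentation obligations make "we didn't test that" the worst possible answer.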
Transparency requirements for all systems (Article 50)
Even if your application lands in limited or minimal risk, transparency applies across the board:
- User disclosure: Users must know they're interacting with AI, not a human or a deterministic system. Hidden AI use is prohibited unless the AI interaction is obvious from the context.
- AI-generated content labeling: If your system generates text, images, audio, or video, users must know it was AI-generated. This includes emails, support responses, and content recommendations.
This is the compliance floor. If you're hiding LLM use behind a "smart search" label, you'll be out of compliance in EU markets once Article 50 applies. There's no opt-out.
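In code, the disclosure floor can be as simple as never returning raw model output: every generated message carries an explicit, machine-readable AI label that the UI surfaces. A sketch, using a schema of our own invention (the Act requires the disclosure, not this particular shape):

```python
from datetime import datetime, timezone

# Sketch of attaching an Article 50-style disclosure to every generated
# message before it reaches the user. The field names are our own
# invention; the Act mandates the disclosure, not this schema.
def wrap_ai_output(text: str, model_id: str) -> dict:
    """Bundle generated text with an explicit AI-generated label."""
    return {
        "content": text,
        "ai_generated": True,  # must be surfaced in the UI, not hidden
        "model_id": model_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "user_notice": "This response was generated by an AI system.",
    }
```

Routing all model output through one wrapper like this also gives you a single choke point for the logging and labeling obligations discussed below.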
What this means practically
On the engineering side, compliance requires:
- Logging and audit trails: You need to log which users got which model outputs, when decisions were made, and whether humans reviewed them. For high-risk systems, keep technical documentation for ten years after the system is placed on the market (Article 18) and automatically generated logs for at least six months (Article 19), longer where other law requires it.
- Documentation as code: Your risk assessment, data governance policy, and mitigation measures need to be versioned and dated. Changes trigger re-assessment.
- Disclosure mechanisms: Provide users a way to learn they interacted with AI and (for high-risk systems) to understand the basis of a decision affecting them. This is a UI/UX problem, not just a legal checkbox.
- Human-in-the-loop architecture: If your use case requires it, build it from the start. Bolting on human review post-launch is painful and error-prone.
- Bias and fairness testing: Automated bias detection at build time and ongoing monitoring in production. This isn't optional for anything touching hiring, credit, or insurance.
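The logging requirement above can start as an append-only JSON-lines audit trail, one record per model decision. A minimal sketch, with a record schema of our own design:

```python
import json
from datetime import datetime, timezone

# Sketch of an append-only audit trail for model decisions, written as
# JSON lines so it can be retained and queried later. The schema is our
# own; retention periods come from your legal analysis, not this code.
def log_decision(path: str, user_id: str, model_id: str,
                 output_summary: str, human_reviewed: bool) -> dict:
    """Append one audit record to the trail and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model_id": model_id,
        "output_summary": output_summary,  # summarize; avoid logging raw PII
        "human_reviewed": human_reviewed,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

In production you'd write to an append-only store with access controls rather than a local file, but the shape of the record is the point: who, what, when, and whether a human looked.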
The timeline: what's enforced when
February 2, 2025: Prohibited AI practices become enforceable. These include social scoring, manipulative or exploitative techniques, and real-time biometric identification in public spaces for law enforcement. Most LLM applications are unaffected, but features that nudge or persuade vulnerable users deserve a second look.
August 2, 2025: General-purpose AI obligations go live. If you're providing a general-purpose model or a derivative, this is your deadline for technical documentation, training data transparency, copyright statements, and systemic risk assessment (if applicable).
August 2, 2026: High-risk system obligations become enforceable. Conformity assessment, human oversight, bias testing, and documentation are required if your system lands in a high-risk category. Systems already on the market get limited transitional relief, but it ends as soon as the system is significantly modified.
What to do now: a five-step checklist
- Classify your system. Does your LLM application land in unacceptable, high, limited, or minimal risk? Use the Act's Annex III (high-risk categories) as your rubric. If you're unsure, assume it's high-risk and start documenting. You can downgrade later.
- Determine whether you're a provider or a deployer. If you're calling an LLM API without modification, you're a deployer, and the API provider carries the Article 53 obligations. If you fine-tune, retrain, or integrate a model into your own offering, you may be a provider. Get legal counsel on this.
- Start technical documentation now. Document your model's architecture, training approach, performance characteristics, known limitations, and how you plan to mitigate foreseeable harms. This isn't a one-time document—it evolves with your system.
- Implement transparency disclosure. If you have EU users, ensure they know they're interacting with AI. This is a UI change, not a legal hack. Test it with real users well before the August 2, 2026 deadline.
- Plan for monitoring and bias testing. Set up logging and audit trails now. Choose your bias testing framework (Fairlearn, What-If Tool, or similar) and define which demographic groups and metrics matter for your use case. High-risk applications require documentation of this process.
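For the bias-testing step, the core metric is easy to compute yourself before committing to a framework. The sketch below computes a demographic parity gap, the same quantity that Fairlearn's demographic_parity_difference reports; any tolerance you set on it is a policy choice you must justify per use case, not a number from the Act:

```python
# Sketch of a build-time demographic parity check. This mirrors the
# demographic_parity_difference metric that libraries like Fairlearn
# provide; the tolerance you compare the gap against is a policy
# choice you must document and justify, not a legal threshold.
def selection_rate(decisions: list[int], groups: list[str],
                   group: str) -> float:
    """Fraction of positive decisions (1s) within one demographic group."""
    members = [d for d, g in zip(decisions, groups) if g == group]
    return sum(members) / len(members)

def parity_gap(decisions: list[int], groups: list[str]) -> float:
    """Largest difference in selection rates across all groups."""
    rates = [selection_rate(decisions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)
```

Once this runs in CI against a held-out evaluation set, graduating to Fairlearn or the What-If Tool is an incremental change rather than a retrofit.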
Field observations
We've seen teams shipping LLM features to EU customers with no awareness that the Act exists. When they finally engage legal, the bill for retrofitting documentation and risk assessment runs six figures and delays shipping by months. The teams that planned early—even conservatively—paid maybe 15% of that in upfront engineering time.
We've also seen teams confidently claim their use case isn't high-risk because they have "a human in the loop." But the Act's definition of human oversight is specific: humans must be able to understand the basis of the system's decision and override it meaningfully. A checkbox that says "I reviewed this" isn't meaningful oversight. A human who actually understands the model's behavior and can reject bad decisions is. The architecture matters.
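That distinction can be enforced in the architecture itself: for instance, make approval impossible without a substantive written rationale. A sketch, where the minimum rationale length is an illustrative policy choice of ours, not a legal rule:

```python
# Sketch of a review gate where approving an automated decision
# requires a substantive written rationale, so "I reviewed this"
# checkboxes cannot pass. The minimum length is an illustrative
# policy choice, not a requirement from the Act.
MIN_RATIONALE_CHARS = 30

def review_decision(approved: bool, rationale: str) -> dict:
    """Record a human review; reject approvals without real reasoning."""
    rationale = rationale.strip()
    if approved and len(rationale) < MIN_RATIONALE_CHARS:
        raise ValueError("approval requires a substantive written rationale")
    return {"approved": approved, "rationale": rationale}
```

Rejections go through without friction; it's the rubber-stamp approvals that get blocked, which is exactly the failure mode regulators will probe.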
One more: large model providers (OpenAI, Google, Anthropic, Meta) are racing to document compliance with the Act's Article 53 requirements. When they publish those technical reports, they become the floor for what regulators expect. If Anthropic publishes a training data transparency report and you're building a derivative model without a similar artifact, you're signaling negligence. Expect scrutiny.
The short version
The EU AI Act's enforcement timeline is compressed into two years (2025–2026). If you ship LLM features to EU users, determine whether you're a provider or deployer, classify your application's risk level, and start documenting now. General-purpose AI obligations (technical documentation, training data transparency, copyright compliance, systemic risk assessment) hit August 2, 2025. High-risk applications require conformity assessment by August 2, 2026, the same date the Article 50 transparency requirements (disclosure that AI is being used) start applying across all risk levels. The cost of retrofitting compliance is far higher than planning for it. Start the checklist this month.
Want us to assess your AI Act obligations?
We classify your AI systems, map your provider/deployer obligations, run an OWASP LLM Top 10 gap analysis, and hand you a compliance roadmap, delivered by senior AI security practitioners.