Privacy compliance is not a back-office concern that shows up after your product ships. The moment you add an AI feature that touches user data, at least three regulatory regimes become relevant. Get this wrong and the consequences are not abstract: GDPR fines alone have exceeded €1.3 billion in enforcement actions since 2018, according to GDPR Enforcement Tracker data published through mid-2024.
This article walks through the four questions founders ask most often once they realize their AI feature is not just a product decision but a compliance one.
Which privacy regulations apply to AI-powered products?
The answer depends on where your users are, not where your company is incorporated.
GDPR applies to any product used by people in the European Union, regardless of whether you are based in Berlin or Boston. The California Consumer Privacy Act (CCPA) applies once you cross certain thresholds: roughly 100,000 California consumers annually or $25 million in annual revenue. Brazil's LGPD and Canada's PIPEDA follow similar geographic logic. If your product has users in multiple countries, you are operating under multiple frameworks simultaneously.
The EU AI Act, which entered into force in August 2024, adds a layer specific to AI systems. It classifies AI features by risk level. A product that uses AI to generate personalized recommendations sits in a lower-risk category than one that uses AI to make decisions about credit, employment, or healthcare. High-risk AI systems face mandatory conformity assessments, detailed documentation requirements, and human oversight obligations before they can be offered to EU users.
For most early-stage products, GDPR is the one that matters most immediately. The fines are large, the enforcement track record is real, and its rules on automated decision-making directly address what AI features do.
How does GDPR treat data used for AI training?
GDPR does not distinguish between data used to train a model and data used to run one. Both are subject to the same rules.
To use personal data for AI training, you need a lawful basis. GDPR provides six, including consent, contractual necessity, and legitimate interest. Legitimate interest sounds convenient, but the European Data Protection Board has made clear it is not a blanket permit for AI training. Enforcement against Meta in 2023 made the point concrete: regulators fined the company €390 million after rejecting the lawful bases it claimed for behavioral ad targeting, and the Court of Justice of the EU separately held that legitimate interest does not cover behavioral advertising without consent.
The practical implication for your product: if you are using data your users generated inside your product to improve or fine-tune an AI model, that is AI training in the regulatory sense. You need a documented lawful basis, and consent is the safest one to rely on if your use case does not fit a narrower exception.
GDPR also introduces a specific rule for automated decision-making under Article 22. If your AI feature makes decisions with a legal or similarly significant effect on users, those users have the right to human intervention in that decision. "Similarly significant" reaches further than many founders expect: regulators have read it to cover differential pricing, decisions about whether a user qualifies for a service, and in some circumstances what content or offers a user is shown. Build a human-review path into any AI feature that outputs a decision, not just a recommendation.
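What that human-review path can look like is easier to see in code. Here is a minimal sketch in TypeScript, assuming a hypothetical `reviewQueue` and field names chosen for illustration rather than taken from any standard schema:

```typescript
// Illustrative types; the field names are assumptions, not a standard schema.
type AiOutput = {
  userId: string;
  kind: "recommendation" | "decision";
  // True when the output affects access, pricing, or eligibility
  // (the "legal or similarly significant effect" territory of GDPR Article 22).
  significantEffect: boolean;
  payload: unknown;
};

type RoutedOutput = AiOutput & { requiresHumanReview: boolean };

// Route any significant automated decision into a human-review queue before it
// takes effect; plain recommendations flow straight through.
function routeAiOutput(output: AiOutput, reviewQueue: AiOutput[]): RoutedOutput {
  const requiresHumanReview =
    output.kind === "decision" && output.significantEffect;
  if (requiresHumanReview) {
    reviewQueue.push(output); // a reviewer confirms or overrides before the user is affected
  }
  return { ...output, requiresHumanReview };
}
```

The point is architectural: the review step exists before the decision reaches the user, not as a complaint channel after the fact.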
Do I need user consent before feeding data to an AI model?
In most cases, yes. The cleaner answer is: you need a lawful basis, and consent is the one most founders can actually demonstrate.
Consent under GDPR must be freely given, specific, informed, and unambiguous. A pre-checked box buried in your terms of service does not qualify. The user must take an affirmative action, understand what they are consenting to, and be able to withdraw consent without losing access to the core product.
The "informed" requirement is where AI features get complicated. Telling a user their data "may be used to improve your experience" is not informed consent for training an AI model. You need to describe, in plain language, what the model does, what data it uses, and how that data is processed. A 2024 survey by the International Association of Privacy Professionals found that 67% of users said they would use a product less if they learned their data was used for AI training without their knowledge.
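One way to make consent demonstrable is to store it as a purpose-specific record rather than a single boolean on the user row. Here is a minimal sketch, with hypothetical purpose names and fields chosen for illustration:

```typescript
// Each purpose is consented to separately; "improve your experience" is not a purpose.
type ConsentPurpose =
  | "ai_personalization"
  | "ai_model_training"
  | "third_party_ai_processing";

type ConsentRecord = {
  userId: string;
  purpose: ConsentPurpose;
  noticeVersion: string; // the plain-language notice the user actually saw
  givenAt: Date;         // set only by an affirmative action, never a pre-checked box
  withdrawnAt?: Date;    // withdrawing must be as easy as consenting
};

// Consent counts only if it was given for this specific purpose and not withdrawn.
function hasValidConsent(
  records: ConsentRecord[],
  userId: string,
  purpose: ConsentPurpose
): boolean {
  return records.some(
    (r) => r.userId === userId && r.purpose === purpose && r.withdrawnAt === undefined
  );
}
```

Keeping the notice version alongside the timestamp is what lets you show, later, exactly what the user was told when they agreed.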
If your product is consumer-facing, consent also needs to be age-appropriate. GDPR sets the default age of digital consent at 16, but member states can lower it as far as 13 and many have set lower ages; the UK, under its own UK GDPR, uses 13. Any product that could plausibly reach minors needs an age-verification step before collecting data for AI use.
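If minors could plausibly reach your product, the consent check needs an age gate in front of it. Here is a sketch assuming you already collect a verified birth date and country; the per-country ages below are examples only and must be confirmed for each market you serve:

```typescript
// GDPR's default digital consent age is 16; member states may lower it to 13.
// These values are illustrative. Verify the current age for every jurisdiction you serve.
const CONSENT_AGE_BY_COUNTRY: Record<string, number> = {
  DE: 16, // Germany
  FR: 15, // France
  GB: 13, // United Kingdom, under UK GDPR
};
const DEFAULT_CONSENT_AGE = 16;

function canConsentAlone(birthDate: Date, countryCode: string, now = new Date()): boolean {
  const threshold = CONSENT_AGE_BY_COUNTRY[countryCode] ?? DEFAULT_CONSENT_AGE;
  const ageYears =
    (now.getTime() - birthDate.getTime()) / (365.25 * 24 * 60 * 60 * 1000);
  // Below the threshold, parental consent is required before any data is used for AI features.
  return ageYears >= threshold;
}
```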
For B2B products, the picture is slightly different. When your customer is a business and the data you process belongs to their users, your customer is the data controller and you are the data processor. You still need a data processing agreement in place, and your customer's own consent obligations flow through to you. If their users did not consent, your customer cannot lawfully hand that data to your AI model.
| Scenario | Consent Required? | What You Need |
|---|---|---|
| Using session data to personalize in-app recommendations | Yes, with exceptions | Explicit opt-in or documented legitimate interest test |
| Fine-tuning an AI model on user-generated content | Yes | Explicit, informed consent before data is used |
| Sending conversation logs to a third-party AI API | Yes | Consent plus a data processing agreement with the vendor (see the sketch after this table) |
| Using anonymized, aggregated behavioral data | Depends | Anonymization must be irreversible; pseudonymized data still counts as personal |
| Automated decisions that affect user access or pricing | Yes | Consent plus a human-review mechanism under GDPR Article 22 |
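The third row, sending conversation logs to a third-party AI API, is the scenario most products hit first. Here is a sketch of a gate in front of the vendor call, reusing the hypothetical `ConsentRecord` and `hasValidConsent` from the earlier sketch; `sendToVendor` is a stand-in for whatever API client you actually use:

```typescript
type VendorRecord = {
  name: string;
  dpaSigned: boolean; // a data processing agreement is on file with this vendor
};

// Refuse to forward user data unless both conditions from the table hold:
// valid consent for third-party AI processing and a signed DPA.
async function forwardConversationLog(
  userId: string,
  log: string,
  vendor: VendorRecord,
  consentRecords: ConsentRecord[],
  sendToVendor: (payload: string) => Promise<void>
): Promise<void> {
  if (!vendor.dpaSigned) {
    throw new Error(`No data processing agreement on file for ${vendor.name}`);
  }
  if (!hasValidConsent(consentRecords, userId, "third_party_ai_processing")) {
    throw new Error(`User ${userId} has not consented to third-party AI processing`);
  }
  await sendToVendor(log);
}
```

Putting the check in the code path, rather than only in a policy document, means a missing DPA or a withdrawn consent actually stops the data from leaving.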
What happens if my AI vendor stores user data overseas?
This is where many AI products run into compliance problems they did not anticipate.
Most AI APIs, including the major large language model providers, process requests on servers located in the United States. Under GDPR, transferring personal data from the EU to a third country requires a valid transfer mechanism, most commonly an adequacy decision from the European Commission, standard contractual clauses (SCCs), or binding corporate rules. The US does have an adequacy decision in place through the EU-US Data Privacy Framework, adopted in July 2023, but it applies only to companies that have self-certified under it. Before you send a single API request containing EU user data, confirm your AI vendor is on the Data Privacy Framework list.
If your vendor is not on the list, you can still use SCCs, but you need to conduct a transfer impact assessment: a documented review of whether the destination country's laws undermine the protections in the SCCs. For the US, the relevant laws include intelligence surveillance authorities under FISA Section 702, which the European Court of Justice has previously found problematic enough to invalidate earlier transfer frameworks entirely. That scrutiny has teeth: Meta's €1.2 billion fine in May 2023 was for continuing to transfer EU user data to US servers on safeguards the regulator found insufficient.
The geographic risk is not only about the US. If your AI vendor has infrastructure in multiple regions and you do not know which region processes your requests, you cannot complete a transfer impact assessment. Before signing a vendor contract, get written confirmation of where your data is stored and processed, and whether the vendor will sign data processing agreements and SCCs.
| Data Transfer Scenario | Mechanism Needed | Risk If Missing |
|---|---|---|
| EU user data sent to US-based AI vendor certified under DPF | Data Privacy Framework adequacy | GDPR violation if vendor not certified |
| EU user data sent to US vendor without DPF certification | Standard Contractual Clauses + transfer impact assessment | Fines up to 4% of global annual revenue |
| EU user data processed in UK post-Brexit | UK adequacy decision (in place as of 2024) | Low risk currently, subject to review |
| Data stored in vendor's multi-region infrastructure without written confirmation of location | SCCs plus written data location guarantee from vendor | High risk; cannot complete required assessment |
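The same logic as the table can be written as a decision function you keep next to your vendor records. This is a sketch with illustrative field names and return strings, not a substitute for a real transfer impact assessment:

```typescript
type VendorTransferFacts = {
  processesEuData: boolean;
  processingRegion: "EU" | "UK" | "US" | "UNKNOWN";
  dpfCertified: boolean;                 // listed on the EU-US Data Privacy Framework
  sccsSigned: boolean;                   // standard contractual clauses in place
  transferImpactAssessmentDone: boolean;
  writtenLocationGuarantee: boolean;     // vendor confirmed processing location in writing
};

function transferStatus(v: VendorTransferFacts): string {
  if (!v.processesEuData) return "No EU personal data: transfer rules not triggered";
  if (v.processingRegion === "EU") return "No third-country transfer";
  if (v.processingRegion === "UNKNOWN" || !v.writtenLocationGuarantee) {
    return "Blocked: get written confirmation of the processing location first";
  }
  if (v.processingRegion === "UK") return "Covered by the UK adequacy decision (watch its review)";
  if (v.processingRegion === "US" && v.dpfCertified) return "Covered by the Data Privacy Framework";
  if (v.sccsSigned && v.transferImpactAssessmentDone) {
    return "Covered by SCCs plus a transfer impact assessment";
  }
  return "Blocked: no valid transfer mechanism in place";
}
```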
One practical step founders often skip: read the data retention and training sections of your AI vendor's terms. Many LLM providers retain API conversation data for a limited window, often around 30 days, for abuse monitoring, and some products use conversation data for model training by default unless you opt out. If you are sending user data to such an API, those users' data may end up training a model they never consented to train. OpenAI, for example, stopped using API data for training by default in March 2023, but its consumer products work differently and vendor policies keep changing. Check the current default in writing, not what you remember reading last year.
The cost of getting this right is real but manageable. A privacy lawyer reviewing your AI data flows and vendor contracts typically runs $3,000 to $8,000 for a focused engagement; a larger US firm doing the same work charges $15,000 to $30,000. Building compliant consent flows and a data processing agreement template into your product from the start costs far less than a single regulatory inquiry, which can run six figures before any fine is issued.
Privacy compliance is not a feature you add later. The founders who treat it as infrastructure from day one spend a fraction of what the ones who retrofit it spend. If you are building an AI product and have not mapped your data flows, obtained consent from your users, and reviewed your vendor agreements, those are the three things to do before your next sprint.
