At a glance
- ✓ PII is detected and redacted before any data sample reaches our AI; raw personal values are never sent to it
- ✓ Every redaction event is logged with timestamp, file name, and affected columns
- ✓ Cloud uploads pause and require your explicit acknowledgement if personal data is detected
- ✓ All SQL queries run in your browser; query results are never transmitted to any server
- ✓ Cloud files are encrypted at rest with AES-256
- ✓ Short-lived session tokens with automatic inactivity timeout
- ✓ Multi-tenant architecture with role-based access control; data is isolated per organisation
Two ways to store your data — you choose
LODE operates in two modes, and in both, your data stays private.
Local Lakes
Files are stored entirely inside your browser using the browser's built-in storage. Nothing is uploaded to any server. Processing, storage, and all SQL queries happen on your own device. Your data never leaves your machine.
Cloud Lakes
Files are stored in encrypted cloud storage hosted in Europe on AWS infrastructure. This enables team collaboration and persistent access across devices. All access is authenticated and isolated per organisation.
In both modes, a PII detection and redaction pipeline runs on every file before any AI analysis takes place.
PII detection and redaction
Before any data sample is sent to our AI for analysis, LODE scans it for personally identifiable information. This is the core of our GDPR compliance.
What we detect
LODE automatically detects the following categories of personal data:
| Data type | Severity | GDPR article |
|---|---|---|
| Email address | High | Article 4(1): personal data |
| Phone number | High | Article 4(1): personal data |
| Date of birth | High | Article 4(1): personal data |
| National ID | High | Article 4(1): personal data; Article 87: national identification numbers |
| Passport number | High | Article 4(1): personal data |
| Driver's licence | High | Article 4(1): personal data |
| Social Security Number | Critical | Article 4(1): personal data; Article 87: national identification numbers |
| Credit card number | Critical | Article 4(1): personal data (financial) |
| Bank account number | Critical | Article 4(1): personal data (financial) |
| IP address | Medium | Article 4(1): online identifier |
How redaction works
Before a data sample is sent to the AI, LODE scans each column across the first few rows. Any column that contains a PII pattern has every value in that column replaced with [REDACTED] in the sample. The AI receives column names and redacted placeholders — never the actual values.
This redaction runs independently on both the client side (in your browser) and the server side (on cloud uploads), giving you two layers of protection.
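The column-level behaviour described above can be sketched roughly as follows. This is an illustration only, not LODE's actual detector: the function name `redactSample`, the `PII_PATTERNS` table, and its two deliberately simplified regular expressions are assumptions made for the example.

```typescript
// Hypothetical sketch of column-level redaction. All names and patterns
// here are illustrative assumptions, not LODE's real implementation.
type Row = Record<string, string>;

// Simplified patterns for two of the PII types listed in the table above.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[^\s@]+@[^\s@]+\.[^\s@]+/,
  ipAddress: /\b(?:\d{1,3}\.){3}\d{1,3}\b/,
};

interface RedactionResult {
  rows: Row[];               // the sample with PII columns masked
  redactedColumns: string[]; // columns in which a pattern matched
}

function redactSample(sample: Row[]): RedactionResult {
  const flagged = new Set<string>();

  // Pass 1: flag a column if any sampled value matches any pattern.
  for (const row of sample) {
    for (const [column, value] of Object.entries(row)) {
      if (Object.values(PII_PATTERNS).some((re) => re.test(value))) {
        flagged.add(column);
      }
    }
  }

  // Pass 2: mask every value in a flagged column, not only the matches.
  const rows = sample.map((row) => {
    const masked: Row = {};
    for (const [column, value] of Object.entries(row)) {
      masked[column] = flagged.has(column) ? "[REDACTED]" : value;
    }
    return masked;
  });

  return { rows, redactedColumns: Array.from(flagged) };
}

// Only one value looks like an email, yet the whole column is masked.
const redactionExample = redactSample([
  { product: "widget", contact: "ana@example.com" },
  { product: "gadget", contact: "n/a" },
]);
```

Masking the whole column rather than individual matches means a value that slips past a pattern in one row is still hidden if any other sampled row in the same column was caught.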
Audit trail
Every redaction event generates a structured log entry recording the timestamp, the file name, and the list of columns that were redacted (or a confirmation that no PII was found). This trail is available in our server logs and can be used to verify that personal data was removed before any AI call was made.
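As an illustration of what such an entry might contain, here is a minimal sketch. The field names (`event`, `piiFound`, and so on) and the helper `buildRedactionLogEntry` are hypothetical; only the recorded facts (timestamp, file name, redacted columns, or confirmation that none were found) come from the description above.

```typescript
// Hypothetical shape of a redaction audit entry; field names are assumptions.
interface RedactionLogEntry {
  event: "pii_redaction";
  timestamp: string;         // ISO 8601, UTC
  fileName: string;
  redactedColumns: string[]; // empty when no PII was found
  piiFound: boolean;
}

function buildRedactionLogEntry(
  fileName: string,
  redactedColumns: string[],
  now: Date = new Date(),
): RedactionLogEntry {
  return {
    event: "pii_redaction",
    timestamp: now.toISOString(),
    fileName,
    // Copy and sort so entries are stable regardless of detection order.
    redactedColumns: redactedColumns.slice().sort(),
    piiFound: redactedColumns.length > 0,
  };
}

// One JSON object per line is a common format for structured logs.
const exampleLogLine = JSON.stringify(
  buildRedactionLogEntry(
    "customers.csv",
    ["email", "dob"],
    new Date(Date.UTC(2024, 0, 15, 9, 30)),
  ),
);
```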
GDPR warning modal
For cloud uploads, if personal data is detected the upload is paused. A modal lists every detected PII type with its severity and the relevant GDPR article, and requires your explicit acknowledgement before the upload continues. You always have the option to cancel.
For local lakes, a warning is shown but the upload is not blocked — since the data stays entirely on your own device.
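The difference between the two modes can be summarised as a small decision function. This is a sketch of the rules stated above, not LODE's code; the type names and the `decideUpload` function are assumptions.

```typescript
// Hypothetical decision logic for the GDPR warning; names are illustrative.
type LakeMode = "local" | "cloud";
type UserChoice = "acknowledge" | "cancel";
type UploadDecision =
  | "proceed"
  | "proceed_with_warning"
  | "await_acknowledgement"
  | "cancelled";

function decideUpload(
  mode: LakeMode,
  piiDetected: boolean,
  choice?: UserChoice,
): UploadDecision {
  // No personal data detected: nothing to warn about in either mode.
  if (!piiDetected) return "proceed";
  // Local lakes: warn, but never block, because data stays on the device.
  if (mode === "local") return "proceed_with_warning";
  // Cloud lakes: the upload stays paused until the user decides.
  if (choice === "acknowledge") return "proceed";
  if (choice === "cancel") return "cancelled";
  return "await_acknowledgement";
}
```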
Authentication and session management
All user sessions are protected by short-lived tokens and automatic inactivity timeouts.
| Event | Timing | What happens |
|---|---|---|
| Token refresh | Every 14 minutes | Session silently renewed before expiry |
| Inactivity warning | 28 minutes of inactivity | Modal shown with 2-minute countdown |
| Automatic logout | 30 minutes of inactivity | Session terminated, all tokens cleared |
Access tokens are held only in memory — never written to browser storage — which limits exposure in the event of a cross-site scripting attack. All API routes require a valid token; unauthenticated requests are rejected.
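The inactivity thresholds in the table can be expressed as a pure function of idle time, which keeps the timing rules easy to test. The function and type names below are assumptions for illustration; only the 28-minute and 30-minute thresholds come from the table.

```typescript
// Hypothetical sketch of the inactivity rules; names are illustrative.
const MINUTE_MS = 60 * 1000;
const WARN_AFTER_MS = 28 * MINUTE_MS;   // inactivity warning threshold
const LOGOUT_AFTER_MS = 30 * MINUTE_MS; // automatic logout threshold

type SessionState =
  | { kind: "active" }
  | { kind: "warning"; secondsLeft: number }
  | { kind: "logged_out" };

function sessionState(idleMs: number): SessionState {
  if (idleMs >= LOGOUT_AFTER_MS) return { kind: "logged_out" };
  if (idleMs >= WARN_AFTER_MS) {
    // Remaining time on the countdown shown in the warning modal.
    const secondsLeft = Math.ceil((LOGOUT_AFTER_MS - idleMs) / 1000);
    return { kind: "warning", secondsLeft };
  }
  return { kind: "active" };
}
```

At exactly 28 minutes idle, this yields a warning with 120 seconds left, matching the 2-minute countdown.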
Multi-tenant access control
LODE is a multi-tenant platform. Every organisation is fully isolated — users can only access data within their own organisation. Cross-organisation access is blocked at the API layer.
Five permission levels cover the full range of use cases, from platform administrators to individual end users within a client organisation. Data lake access is always scoped to the authenticated user's organisation.
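The organisation scoping rule amounts to comparing the organisation on the authenticated session with the organisation that owns the resource. The sketch below shows that check in isolation; the interfaces and function names are hypothetical, and the five permission levels are deliberately left out because their names are not given here.

```typescript
// Hypothetical organisation-scoping check; all names are illustrative.
interface AuthenticatedUser {
  userId: string;
  organisationId: string;
}

interface DataLake {
  lakeId: string;
  organisationId: string;
}

// A lake is accessible only to users of the organisation that owns it.
function canAccessLake(user: AuthenticatedUser, lake: DataLake): boolean {
  return user.organisationId === lake.organisationId;
}

// Filter a list of lakes down to those in the user's own organisation.
function lakesVisibleTo(user: AuthenticatedUser, lakes: DataLake[]): DataLake[] {
  return lakes.filter((lake) => canAccessLake(user, lake));
}
```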
Cloud file storage
Cloud files are stored in AWS S3 in the Europe (London) region, with server-side AES-256 encryption enforced by bucket policy. Uploads that do not include the required encryption header are automatically rejected.
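A bucket policy that enforces this typically denies any `s3:PutObject` request whose `x-amz-server-side-encryption` header is not `AES256`. The sketch below builds such a policy as a plain object; the helper name, the statement `Sid`, and the bucket name `example-lode-bucket` are placeholders, and LODE's actual policy may differ.

```typescript
// Hypothetical helper that builds a deny-unencrypted-uploads bucket policy.
// The bucket name and Sid are placeholders, not LODE's real configuration.
function denyUnencryptedUploadsPolicy(bucketName: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "DenyUploadsWithoutAes256",
        Effect: "Deny",
        Principal: "*",
        Action: "s3:PutObject",
        Resource: `arn:aws:s3:::${bucketName}/*`,
        // Deny any PutObject whose encryption header is not AES256.
        Condition: {
          StringNotEquals: { "s3:x-amz-server-side-encryption": "AES256" },
        },
      },
    ],
  };
}

// Serialise to the JSON document that would be attached to the bucket.
const examplePolicyJson = JSON.stringify(
  denyUnencryptedUploadsPolicy("example-lode-bucket"),
  null,
  2,
);
```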
Files are never accessible via a public URL. All file access goes through authenticated API endpoints that verify your session and organisation membership before returning any data.
File metadata (column names, row counts, PII scan results) is stored in a separate database. Raw file content is never stored in the metadata database.
What the AI receives — and what it never does
LODE uses an external AI service to generate query suggestions and data summaries. The following table is a complete record of what is and is not transmitted.
| Data | Sent to AI? | Notes |
|---|---|---|
| File name | Yes | Used to contextualise the analysis |
| Column names and data types | Yes | Structural metadata — no personal values |
| First few rows (redacted) | Yes | PII columns replaced with [REDACTED] before sending |
| Total row count | Yes | Aggregate figure only |
| Full file content | Never | — |
| Raw PII values | Never | Redacted before any AI call |
| Query results | Never | All queries run in-browser only |
| User identity | Never | — |
Our AI provider has completed a SOC 2 Type II audit. Under its current API policy, API requests are not used to train models. A zero-data-retention option is available for enterprise customers.
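To make the table concrete, the request sent to the AI could be modelled as a type with exactly the four "Yes" rows and nothing else. This is a sketch under stated assumptions: the names `AiPayload` and `buildAiPayload`, the guard that re-checks redaction, and the default of five sample rows (the text only says "first few") are all hypothetical.

```typescript
// Hypothetical payload shape mirroring the "Sent to AI?" table.
interface ColumnInfo {
  name: string;
  dataType: string;
}

type SampleRow = Record<string, string>;

interface AiPayload {
  fileName: string;
  columns: ColumnInfo[];
  sampleRows: SampleRow[]; // already redacted
  totalRowCount: number;
}

const REDACTED = "[REDACTED]";

function buildAiPayload(
  fileName: string,
  columns: ColumnInfo[],
  redactedSample: SampleRow[],
  piiColumns: string[],
  totalRowCount: number,
  maxSampleRows = 5, // assumption: the exact sample size is not stated
): AiPayload {
  const sampleRows = redactedSample.slice(0, maxSampleRows);

  // Defensive guard: refuse to build a payload if a PII column still
  // carries a raw value.
  for (const row of sampleRows) {
    for (const column of piiColumns) {
      if (column in row && row[column] !== REDACTED) {
        throw new Error(`Column "${column}" was not redacted`);
      }
    }
  }

  return { fileName, columns, sampleRows, totalRowCount };
}
```

Because the type has no field for full file content, query results, or user identity, those items cannot be attached by accident without changing the type itself.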
Network security
- All communication is encrypted in transit over TLS 1.3 (HTTPS).
- API keys and secrets are stored as environment variables on the server — they are never included in client-side code.
- Cloud storage is configured to prevent public access. No files are ever directly addressable via a public URL.
- SQL queries run inside a sandboxed WebAssembly engine in your browser, so a malicious or malformed query cannot reach any external system.
- React's built-in output escaping mitigates cross-site scripting (XSS) by escaping values before they are rendered.
GDPR compliance
| Requirement | How we meet it | Status |
|---|---|---|
| Data minimisation (Article 5(1)(c)) | PII columns are redacted before AI analysis; only structural metadata and a redacted sample are sent | ✓ |
| Lawful basis (Article 6) | GDPR modal requires acknowledgement of legal basis before cloud upload | ✓ |
| Sensitive identifiers (Articles 4(1) and 87) | Critical PII (Social Security Number, credit card number, bank account number) is flagged separately with elevated severity | ✓ |
| Right to erasure (Article 17) | Users delete files via the UI; cloud files are removed from storage and metadata database | ✓ |
| Data processor obligations (Article 28) | AI provider DPA in place; only metadata and redacted sample rows are transmitted, never raw PII | ✓ |
| Privacy by design (Article 25) | PII scan is mandatory and cannot be bypassed; redaction is fully automated | ✓ |
| Retention limits (Article 5(1)(e)) | 3-year cloud retention; local data is entirely user-controlled | ✓ |
Data retention
| Where data is stored | Retention | Deleted when |
|---|---|---|
| Your browser (local lakes) | User-controlled | You delete the file or clear your browser data |
| Cloud storage (cloud lakes) | 3 years | You delete the file, or your account is closed |
| Cloud metadata database | 3 years | Cascades automatically when the file is deleted |
| Serverless processing memory | None | Discarded when the function completes — not persisted |
| AI provider | 30 days | Per the AI provider's standard data retention policy |
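As a worked example of the 3-year cloud retention limit, the expiry date for a file can be computed as below. This sketch assumes the retention clock starts at upload time, which is not stated above; the function names are hypothetical.

```typescript
// Hypothetical retention helpers. Assumes retention is counted from the
// upload time, which is an assumption rather than a documented rule.
const CLOUD_RETENTION_YEARS = 3;

function retentionExpiry(
  uploadedAt: Date,
  years: number = CLOUD_RETENTION_YEARS,
): Date {
  // Copy the date so the caller's value is not mutated.
  const expiry = new Date(uploadedAt.getTime());
  expiry.setUTCFullYear(expiry.getUTCFullYear() + years);
  return expiry;
}

function isPastRetention(uploadedAt: Date, now: Date): boolean {
  return now.getTime() >= retentionExpiry(uploadedAt).getTime();
}

// A file uploaded on 15 January 2024 (UTC) expires on 15 January 2027.
const exampleExpiry = retentionExpiry(new Date(Date.UTC(2024, 0, 15)));
```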
Have a security question?
If you have a concern about how your data is handled, or would like to report a potential vulnerability, please contact us directly.
Contact our team