Honest answer: yes, the OCR layer does process the raw document text. We use the Google Cloud Vision API for text extraction (industry-standard, covered by a GDPR data-processor agreement, and used by major financial and healthcare institutions). That's a separate layer from the generative AI that performs the legal reasoning.
Two AI layers, two privacy boundaries:
1. OCR (Google Cloud Vision): extracts text from images. Industry-standard utility AI, not generative. Subject to standard data-processor agreements.
2. Generative AI (the case-builder): performs legal reasoning. Only ever sees the privacy-redacted version. Names, addresses, and identifiers are replaced with role tags ([CLIENT], [OTHER_PARTY], [WITNESS]) before any LLM call.
The redaction happens between layer 1 and layer 2, on your client's device. We're explicit about this distinction so you can answer your bar association honestly: no client PII ever reaches a generative AI model. The OCR layer is the same infrastructure your firm probably already trusts when scanning court filings.
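To make the boundary between the two layers concrete, here is a minimal sketch of the redaction step in Python. It is an illustration under stated assumptions, not our production code: the redact function, the entities mapping, and the sample text are all hypothetical, and it assumes the identifiers to hide are already known (for example from an intake form). Addresses and other identifiers would need their own entries in the same mapping.

```python
import re

def redact(text: str, entities: dict[str, str]) -> str:
    """Replace each known identifier with its role tag, e.g. 'Jane Doe' -> '[CLIENT]'.

    `entities` maps a literal string found in the OCR output to a role name.
    Hypothetical helper for illustration only.
    """
    # Process longer strings first so a full name is replaced before any
    # shorter string that happens to be contained in it.
    for value in sorted(entities, key=len, reverse=True):
        tag = f"[{entities[value]}]"
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        # A function replacement avoids any escape processing of the tag.
        text = pattern.sub(lambda _match, tag=tag: tag, text)
    return text

# Layer 1 output: raw OCR text (sample data, invented for this sketch).
ocr_text = "Jane Doe alleges that John Roe breached the lease. Witness: Ann Lee."

# Assumed to come from an intake form or similar structured source.
entities = {
    "Jane Doe": "CLIENT",
    "John Roe": "OTHER_PARTY",
    "Ann Lee": "WITNESS",
}

# Redaction runs before anything is handed to layer 2 (the generative model).
redacted = redact(ocr_text, entities)
print(redacted)
# [CLIENT] alleges that [OTHER_PARTY] breached the lease. Witness: [WITNESS].
```

Only the redacted string would be passed on to the generative layer. A real pipeline would also need to catch identifiers that are not known in advance, which plain string matching like this does not do.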