Upload an electronic PDF to extract its text and tables, then download the results as TXT, JSON, CSV, or structured text. Only electronic PDFs are processed; image-only pages are flagged but skipped (see the OCR section below).
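As a rough illustration of the download formats, the sketch below serializes an extraction result to TXT, JSON, and CSV using only the Python standard library. The `result` layout and the helper names `to_txt`, `to_json`, and `to_csv` are assumptions made for this example, not the tool's actual schema or code.

```python
import csv
import io
import json

# Hypothetical extraction result: per-page text plus one table as rows of cells.
result = {
    "pages": [{"number": 1, "text": "Quarterly report"}],
    "tables": [{"page": 1, "rows": [["Item", "Qty"], ["Widget", "3"]]}],
}


def to_txt(result: dict) -> str:
    """Join the text of all pages, separated by blank lines."""
    return "\n\n".join(page["text"] for page in result["pages"])


def to_json(result: dict) -> str:
    """Serialize the whole result, keeping non-ASCII characters readable."""
    return json.dumps(result, ensure_ascii=False, indent=2)


def to_csv(table: dict) -> str:
    """Write one table's rows as CSV with Unix line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table["rows"])
    return buffer.getvalue()
```

CSV is produced per table here because each table can have a different column count; a real export might bundle several CSV files together.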
Electronic PDFs only (no OCR) — up to 50 MB each. Folder drag supported.
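The upload limits above could be checked before extraction starts. The following is a minimal sketch, assuming that "50 MB" means 50 × 1024 × 1024 bytes and that a file counts as a PDF when it begins with the standard `%PDF-` signature; `validate_upload` is a hypothetical helper, not part of the tool.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" is read as 50 MiB
PDF_MAGIC = b"%PDF-"  # signature at the start of a PDF file


def validate_upload(name: str, data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    if Path(name).suffix.lower() != ".pdf":
        problems.append("not a .pdf file")
    # Simplification: some readers tolerate junk before the header; this check does not.
    if not data.startswith(PDF_MAGIC):
        problems.append("missing %PDF- header")
    if len(data) > MAX_UPLOAD_BYTES:
        problems.append("larger than 50 MB")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a folder upload report all rejected files in a single pass.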
🔬 Diagnostics
🖼️ OCR (Experimental — Not Recommended)
Attempts to read text from image-only and mixed pages. Text content on mixed pages was already extracted during normal extraction.
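The distinction between text, image-only, and mixed pages can be sketched as a simple classification. The example below assumes each page is summarized by two hypothetical measurements (extracted character count and embedded image count); the tool's real detection logic is not documented here and may differ.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    # Hypothetical per-page measurements gathered during normal extraction.
    text_chars: int
    image_count: int


def classify_page(stats: PageStats) -> str:
    """Label a page as 'text', 'image-only', 'mixed', or 'empty'."""
    has_text = stats.text_chars > 0
    has_images = stats.image_count > 0
    if has_text and has_images:
        return "mixed"
    if has_images:
        return "image-only"
    if has_text:
        return "text"
    return "empty"


def needs_ocr(stats: PageStats) -> bool:
    """OCR targets image-only and mixed pages; pure text pages are skipped."""
    return classify_page(stats) in {"image-only", "mixed"}
```

With this split, the existing text of a mixed page is kept from normal extraction and only its image content is left for OCR.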
🚫 No GPU available: this environment has no hardware acceleration for OCR, so processing is impractically slow (about 1 minute per page).
⚠️ This feature is experimental and not intended for production OCR workloads. Use a dedicated OCR service with GPU support instead.
Technology
Built with Python 3.12 and FastAPI
The API uses FastAPI with Python async patterns and provides automatic OpenAPI documentation, health monitoring, and RESTful endpoints for sample entity operations.
API Endpoints
Quick links to explore the API documentation and endpoints.