Filedotto often comes bundled with an outdated Tika version (1.x or early 2.x). Tika 1.x is end-of-life.
By default, BodyContentHandler caps its output at 100,000 characters; passing -1 to the constructor removes the cap. If you are seeing truncated text, this default is the likely cause.
The Fix: Explicitly define the character limit.
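import org.apache.tika.sax.BodyContentHandler;
// The limit is counted in characters, not bytes: this allows about 10 million characters (-1 disables the limit, but is dangerous for RAM)
BodyContentHandler handler = new BodyContentHandler(10 * 1024 * 1024);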
Fix: Use the SAX event model rather than a DOM model. Tika does this by default, but make sure you are not reading the entire file into a byte array or ByteBuffer before handing it to Tika. Pass the InputStream directly.
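The streaming advice above can be sketched in plain Java. This is a minimal illustration, not Filedotto's actual code: the class StreamingExtract, the StreamParser interface, and the countBytes helper are hypothetical names standing in for whatever calls Tika. In a real integration the callback would hand the stream to Tika's parser (for example AutoDetectParser.parse with a BodyContentHandler).

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamingExtract {

    /** Hypothetical stand-in for the code that passes the stream to Tika. */
    @FunctionalInterface
    interface StreamParser<T> {
        T parse(InputStream in) throws Exception;
    }

    /**
     * Opens the file as a buffered stream and passes it straight to the parser.
     * The file is never read into a byte[] or ByteBuffer up front, so memory use
     * does not grow with file size on this side of the call.
     */
    static <T> T withStream(Path file, StreamParser<T> parser) throws Exception {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            return parser.parse(in);
        }
    }

    /** Example consumer: reads the stream through a fixed 8 KB buffer and counts bytes. */
    static Long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }
}
```

Calling StreamingExtract.withStream(path, StreamingExtract::countBytes) processes a file of any size with a constant-size buffer. The pattern to avoid is the opposite one: calling Files.readAllBytes(path) and wrapping the result in a ByteArrayInputStream before parsing.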
Isolate the issue by running Tika directly on the offending file. Use the Tika App JAR:
java -jar tika-app-2.9.1.jar --text problematic.pdf
If this works, the issue is in Filedotto's integration (e.g., wrong API usage, threading, or timeout settings). If it fails, the file is corrupt or Tika needs a parser upgrade.
If Filedotto connects to a remote Tika server and you see Connection reset or SocketTimeoutException:
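One way to check whether client-side timeouts are the cause is to call the Tika server yourself with explicit, generous timeouts. The sketch below uses the JDK's java.net.http client and rests on assumptions: the server address (localhost, port 9998) and the timeout values are placeholders, the class name TikaServerClient is hypothetical, and nothing here reflects how Filedotto itself talks to the server.

```java
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.file.Path;
import java.time.Duration;

class TikaServerClient {

    // Assumption: a Tika server reachable locally on port 9998; adjust to your deployment.
    static final URI TIKA_ENDPOINT = URI.create("http://localhost:9998/tika");

    /** Client with an explicit limit on how long establishing the connection may take. */
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    /**
     * Builds a PUT request that streams the file to the server and asks for plain text.
     * The request timeout bounds how long the client waits for the response, so large
     * or slow documents need a larger value.
     */
    static HttpRequest extractRequest(Path file, Duration responseTimeout)
            throws FileNotFoundException {
        return HttpRequest.newBuilder(TIKA_ENDPOINT)
                .timeout(responseTimeout)
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }
}
```

To send the request, pass it to client.send(request, HttpResponse.BodyHandlers.ofString()). If the same file succeeds with a long timeout here but fails through Filedotto, the timeout settings in the integration are the place to look.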
Filedotto sometimes caches Tika errors based on filename. Rename the file (for example, to document_fixed.pdf) and re-upload it.