Forum Discussion

Jeff B.63
Collaborator | Level 8
11 months ago

Retrieve the text content of PDF file using API

Is it possible to use the API to retrieve the text content of a PDF after it has been OCR'd by Dropbox?

I have an Azure Logic App that saves email attachments to Dropbox as PDF files. These files then have to be viewed and renamed manually based on their content. I know other services can return a file's text layer, but I would rather get it from Dropbox than pay for a third-party application. The service I use to combine files into a single PDF also offers what I need, but it first requires sending the file for OCR and then making a separate request for the text layer. Adding this would increase my usage and push me into a higher tier. Why pay them if I can do it through the Dropbox API?

  • No, unfortunately the Dropbox API doesn't offer a way to retrieve the OCR'd text like this, but I'll pass this along as a feature request. I can't promise if or when that might be implemented, though.
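
    Since the OCR'd text itself isn't exposed, one possible workaround is to download the PDF bytes through the API, get the text some other way, and then rename the file through the API. The following is only a rough sketch under assumptions, not an official recipe: it uses the HTTP endpoints `/2/files/download` and `/2/files/move_v2`, while `extract_text`, `choose_name`, and `rename_by_content` are hypothetical names you would supply yourself. The text extraction step is not provided by Dropbox; it would only work if the PDF already contains a text layer or you run OCR separately.

```python
import json
import urllib.request

DOWNLOAD_URL = "https://content.dropboxapi.com/2/files/download"
MOVE_URL = "https://api.dropboxapi.com/2/files/move_v2"


def build_download_request(token, path):
    """Build the POST request that downloads one file's raw bytes."""
    return urllib.request.Request(
        DOWNLOAD_URL,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Content endpoints take their arguments as JSON in this header.
            "Dropbox-API-Arg": json.dumps({"path": path}),
        },
    )


def build_rename_request(token, from_path, to_path):
    """Build the POST request that moves (renames) a file."""
    body = json.dumps({"from_path": from_path, "to_path": to_path})
    return urllib.request.Request(
        MOVE_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def rename_by_content(token, path, extract_text, choose_name):
    """Download a PDF, derive a new name from its text, and rename it.

    extract_text(pdf_bytes) -> str and choose_name(text) -> str are
    hypothetical callables supplied by the caller; Dropbox provides neither.
    """
    with urllib.request.urlopen(build_download_request(token, path)) as resp:
        pdf_bytes = resp.read()
    text = extract_text(pdf_bytes)
    folder = path.rsplit("/", 1)[0]  # keep the file in its current folder
    new_path = f"{folder}/{choose_name(text)}"
    with urllib.request.urlopen(build_rename_request(token, path, new_path)) as resp:
        return json.load(resp)
```

    The two request builders are kept separate from the network calls so they can be checked without an access token. In a Logic App the same two HTTP calls could be made with HTTP actions instead of Python.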