The PaddleOCR framework is an essential part of Adevinta’s OCR API, but our implementation involved more than just choosing an optical character recognition technology. Building the service we wanted meant refactoring the code, removing unwanted features and functionality to ensure smooth integration with our existing architecture.
By Urszula Czerwinska, Data Scientist and Deep Learning Engineer
In our deep-dive article, we discuss:
- The general architecture of PaddleOCR based on PP-OCR and PP-Structure
- How to refactor and reduce the PaddleOCR codebase to improve OCR performance
- The logic behind PaddleOCR inference
- Practical ways anyone could improve PaddleOCR
To learn all this and more, take a look at the full article on the Adevinta blog.