Text in Image 2.0: improving OCR service with PaddleOCR

Optical Character Recognition (OCR) – the ability to recognise and extract text from images and documents – is an incredibly useful technology. Adevinta uses OCR for a range of applications, from detecting unwanted content in ads to delivering more efficient and relevant search results.

By Urszula Czerwinska, Data Scientist and Deep Learning Engineer

In this article we cover:

  • How we benchmarked various OCR solutions and found PaddleOCR to be the best open-source option, thanks to an average accuracy of 0.8 and acceptable performance on edge cases
  • Why we decided to refactor the PaddleOCR code
  • How a 7.5x reduction in latency allows Adevinta to handle 330 million OCR requests every month

To learn more about our adventures with OCR and how your business could also benefit, you can read the full article on the Adevinta blog.
