craig craigls

## README.md

      
              2 files
            
          
              69 forks
            
          
              9 comments
            
          
              407 stars
            
          
                dannguyen
                / README.md
            
            
              Last active
              May 17, 2024 02:07
            
              
                Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data
              
          
    Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:
####### 1. A low-resolution photo of road signs

  
## understanding-word-vectors.ipynb

      
              1 file
            
          
              349 forks
            
          
              51 comments
            
          
              1438 stars
            
          
                aparrish
                / understanding-word-vectors.ipynb
            
            
              Last active
              May 22, 2024 14:36
            
              
                Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
              
          
      Sorry, something went wrong. Reload?
      Sorry, we cannot display this file.
      Sorry, this file is invalid so it cannot be displayed.
      
          Viewer requires iframe.