A.Juneja gr8Adakron

@gr8Adakron
gr8Adakron / Sublime-Shortcuts.md
Last active November 5, 2021 09:03
Sublime text shortcuts

Insert and Select

  • Cmd+D - Select a word
  • Cmd+Shift+Enter - Insert a line before current line
  • Cmd+Enter - Insert a line after current line
  • Cmd+L - Select current line

Delete

  • Cmd+Shift+K - Delete a line
  • Cmd+K, Cmd+K - Delete from cursor to end of line
  • Cmd+K, Cmd+Backspace - Delete from cursor to start of line
@gr8Adakron
gr8Adakron / multiprocess_JSONtoCSV.py
Last active August 17, 2018 11:28
Convert JSON fields to CSV using multiprocessing with a map-reduce architecture. - No memory issues - Easy handling of enormous conversions (100 GB files) - Lightning fast - Works fine on Linux - Not tested on Windows.
#author: gr8_adakron.
#python: 3.6 (necessary for fstrings)
from subprocess import PIPE, Popen
from multiprocessing import Pool
import multiprocessing as mp
import pandas as pd
import random
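
The preview above is truncated, so the author's implementation is not visible here. As a rough, in-memory sketch of the map-reduce idea from the description (not the gist's actual code), each chunk of JSON lines can be converted to CSV text independently and the parts concatenated afterwards. The names `chunked`, `chunk_to_csv`, `merge_parts`, and `convert` are hypothetical, and the input is assumed to be one JSON object per line.

```python
import csv
import io
import json
from functools import partial
from multiprocessing import Pool


def chunked(lines, size):
    """Split a list of JSON lines into consecutive chunks of at most `size`."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def chunk_to_csv(chunk, fields):
    """Map step: turn one chunk of JSON lines into CSV text without a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    for line in chunk:
        # Keys missing from a record are written as empty cells.
        writer.writerow(json.loads(line))
    return buffer.getvalue()


def merge_parts(parts, fields):
    """Reduce step: write one header row, then append the parts in order."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue() + "".join(parts)


def convert(lines, fields, workers=2, chunk_size=1000):
    """Run the map step in a process pool and reduce the results."""
    with Pool(processes=workers) as pool:
        parts = pool.map(partial(chunk_to_csv, fields=fields),
                         chunked(lines, chunk_size))
    return merge_parts(parts, fields)
```

For files in the 100 GB range the chunks would have to be read from and written to disk rather than held in lists; this sketch only illustrates the split, map, and merge steps. When calling `convert` from a script, keep the call under an `if __name__ == "__main__":` guard so worker processes can import the module safely.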
@gr8Adakron
gr8Adakron / csv2json-berkeley.pl
Created December 27, 2017 07:46
A Perl script that creates a Berkeley DB with the JSON IDs as primary_key. While forming the output JSON from the input CSV, it looks up the CSV's extra attributes in the JSON DB (Berkeley DB), much like VLOOKUP. The only difference is that it reads, creates, and uses everything with parallel processing (reading as well as writing), and it is lightning…
#Author : gr8_Adakron.
use Term::ANSIColor;
use Time::HiRes qw(time);
use strict;
use warnings;
use Carp;
use POSIX ":sys_wait_h";
use Data::Dumper;
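
The Perl preview stops at the imports, so the lookup logic itself is not shown. The VLOOKUP-style idea from the description can be sketched in Python under some loud assumptions: the standard library's `dbm.dumb` stands in for Berkeley DB, the key column is called `id`, and the helper names `build_index` and `enrich_csv` are hypothetical. Parallel reading and writing is left out.

```python
import csv
import dbm.dumb
import io
import json


def build_index(json_lines, db_path, key="id"):
    """Store each JSON record in an on-disk key-value DB, keyed by its ID."""
    with dbm.dumb.open(db_path, "c") as db:
        for line in json_lines:
            record = json.loads(line)
            db[str(record[key])] = json.dumps(record)


def enrich_csv(csv_text, db_path, key="id"):
    """Look up every CSV row's ID in the DB and add the attributes it lacks."""
    enriched = []
    with dbm.dumb.open(db_path, "r") as db:
        for row in csv.DictReader(io.StringIO(csv_text)):
            stored = db.get(row[key])  # None when the ID is not in the DB
            extra = json.loads(stored) if stored is not None else {}
            for name, value in extra.items():
                # Values already present in the CSV row win over DB values.
                row.setdefault(name, value)
            enriched.append(dict(row))
    return enriched
```

Because the index lives on disk, the JSON side never has to fit in memory at once, which is the point of using a key-value DB for the lookup.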
@gr8Adakron
gr8Adakron / README.md
Created August 11, 2017 06:29 — forked from dannguyen/README.md
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

1. A low-resolution photo of road signs
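
The README's own script is not included in this preview. As a minimal sketch of what sending such a photo to Cloud Vision's `images:annotate` endpoint could look like, the block below builds a `TEXT_DETECTION` request body and pulls the full text out of a response. The helper names `build_ocr_request` and `extract_text` are hypothetical, and no network call is made here.

```python
import base64

# REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes):
    """Build the JSON body for a single-image TEXT_DETECTION request."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response_json):
    """Return the full detected text, or an empty string if none was found."""
    annotations = response_json["responses"][0].get("textAnnotations", [])
    # The first annotation holds the whole text; later ones are fragments.
    return annotations[0]["description"] if annotations else ""
```

With Requests, the body could then be posted as `requests.post(VISION_URL, params={"key": api_key}, json=build_ocr_request(data))`, where `api_key` is your own key and `data` is the raw image bytes.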

@gr8Adakron
gr8Adakron / csvtojson.pl
Last active January 27, 2023 02:01
Perl: convert any number of CSV rows to JSON (handles multiline rows too).
#!/usr/bin/perl
#Author: gr8_Adakron.
#--------------------- Perl Packages --------------------
use strict;
use warnings;
use JSON;
use Text::CSV;
#-------------------- Global Variables --------------------
my $flag_header = 1;
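
The rest of the Perl script is cut off in this preview. For comparison, the same task (CSV to JSON, including quoted fields that span several lines) can be sketched in Python with the standard `csv` and `json` modules. The function name `csv_to_json` is hypothetical, and the first row is assumed to be the header.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text to a JSON array with one object per row."""
    # newline="" leaves line breaks untouched so the csv module can keep
    # newlines that sit inside quoted fields.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    return json.dumps(list(reader), indent=2)
```

A quoted cell such as `"line one\nline two"` comes out as a single JSON string with an embedded newline instead of being split into two rows.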
@gr8Adakron
gr8Adakron / zamzar_conversion.py
Created April 24, 2016 08:09
Python script to convert a file from any format to a desired format using the Zamzar API (for example: image.jpeg to image.png, data.odt to data.doc, data.pdf to data.txt).
import requests
from requests.auth import HTTPBasicAuth
#--------------------------------------------------------------------------#
api_key = 'Put_Your_API_KEY' #your Api_key from developer.zamzar.com (Create Signup id and get api_key)
source_file = "tmp/armash.pdf" #source_file_path
target_file = "results/armash.txt" #target_file_path_and_name
target_format = "txt" #targeted Format.
#-------------------------------------------------------------------------#
@gr8Adakron
gr8Adakron / compareList.py
Created April 2, 2016 12:59
Compare multiple lists using pandas in Python
import pandas as pd
List1=[10,11,12,15,16,18,19]
List2=[11,15,16,19,13]
List3=[11,12,15,19]
d = {'List1' : pd.Series(List1),'List2' : pd.Series(List2),'List3': pd.Series(List3)}
df = pd.DataFrame(d)
print(df)
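
The DataFrame above only lines the lists up side by side by position. If the goal is to see which values the lists share, a small sketch with plain Python sets (swapped in here for pandas to keep it dependency-free) could look like this. The helper names `common_values` and `membership_table` are hypothetical.

```python
lists = {
    "List1": [10, 11, 12, 15, 16, 18, 19],
    "List2": [11, 15, 16, 19, 13],
    "List3": [11, 12, 15, 19],
}


def common_values(named_lists):
    """Return the sorted values that appear in every list."""
    return sorted(set.intersection(*(set(v) for v in named_lists.values())))


def membership_table(named_lists):
    """Map each distinct value to the names of the lists that contain it."""
    values = sorted(set().union(*named_lists.values()))
    return {
        value: [name for name, items in named_lists.items() if value in items]
        for value in values
    }


print(common_values(lists))
for value, owners in membership_table(lists).items():
    print(value, owners)
```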