Andrei Nechaev andreinechaev

GCP Document Understanding

Google Vision is oriented toward its own Cloud Storage when it comes to big data. The API is extensive, although the SDK does not expose all of its features.
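To make the Cloud Storage orientation concrete, here is a minimal sketch of the JSON body that the Vision REST endpoint `files:asyncBatchAnnotate` accepts for PDF text detection, as far as I understand the public API: both the input file and the output location are `gs://` URIs. The helper name `build_async_pdf_request` and the bucket paths are hypothetical and only serve the illustration; the sketch builds the payload and does not send it.

```python
import json


def build_async_pdf_request(source_uri, destination_uri, batch_size=2):
    """Build a request body for asynchronous PDF text detection.

    Hypothetical helper: both URIs are expected to be gs:// paths,
    since the asynchronous file API reads from and writes to Cloud Storage.
    """
    for uri in (source_uri, destination_uri):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": "application/pdf",
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": destination_uri},
                    # Number of pages written into each output JSON file.
                    "batchSize": batch_size,
                },
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bucket names, not real resources.
    body = build_async_pdf_request(
        "gs://example-bucket/input/scan.pdf",
        "gs://example-bucket/output/",
    )
    print(json.dumps(body, indent=2))
```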

The advantages of GCP are its wide variety of pre-defined models and the ability to upload your own. The availability of a human-labeled model can give better results, although the price is higher and there is no free tier.

What's interesting

from nuxeo.client import Nuxeo
import argparse
# size in MB
def _query(client, begin, end=None):
    query = None
    if end is None:
        query = f"SELECT * FROM Document WHERE ecm:primaryType = 'Design' AND ecm:currentLifeCycleState != 'deleted' AND ecm:isVersion = 0 AND file:content/length > {begin * 1024 * 1024}"
    else:
import os
def get_meta(line):
    splitted = line.split()
    if len(splitted) <= 4:
        return None, None, None
    s = float(splitted[2])
    t = splitted[3]
    name = splitted[4].strip()
@andreinechaev
andreinechaev / audio_spliter.py
Created July 19, 2018 13:52
An example of how to chunk audio with Python and pydub
from pydub import AudioSegment
from pydub.utils import make_chunks
import csv
import os
from pprint import pprint
chunk_s = 2000
natives = {}
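The preview above stops before any chunking happens, so here is a small stdlib-only sketch of the slicing arithmetic behind fixed-length chunking. It assumes that the value 2000 in `chunk_s` is in milliseconds, the unit pydub uses for slicing, and that the last chunk may be shorter than the rest. The helper `chunk_ranges` is hypothetical and is not part of pydub.

```python
import math


def chunk_ranges(total_ms, chunk_ms):
    """Return (start, end) millisecond offsets for fixed-length chunks.

    Hypothetical helper illustrating the arithmetic only; the final
    range is clipped to total_ms, so it can be shorter than chunk_ms.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    count = math.ceil(total_ms / chunk_ms)
    return [
        (i * chunk_ms, min((i + 1) * chunk_ms, total_ms))
        for i in range(count)
    ]


if __name__ == "__main__":
    # A 5-second clip cut into 2-second pieces.
    for start, end in chunk_ranges(5000, 2000):
        print(f"{start:>5} ms to {end:>5} ms")
```

With pydub, each `(start, end)` pair would correspond to a slice such as `segment[start:end]` on an `AudioSegment`.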
@andreinechaev
andreinechaev / creator.py
Last active July 3, 2018 17:56
Creates users and groups from a given CSV file.
from time import sleep
from nuxeo.client import Nuxeo
from nuxeo.exceptions import HTTPError
from nuxeo.users import User
from nuxeo.groups import Group
import csv
import argparse
from jira import JIRA
import csv
import re
import time
from jira.exceptions import JIRAError
no_format = re.compile(r'{(code|noformat}).*?(code|noformat)}')
no_inline = re.compile(r'{{.*?}}')
no_xml = re.compile(r'<[^>]+>')
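The three patterns above look like they are meant to strip Jira wiki markup from issue text. As a hedged sketch of how they might be applied, the block below repeats them verbatim and chains them in a hypothetical `strip_markup` helper; the order of substitution and the whitespace cleanup are assumptions, not the gist's own logic.

```python
import re

# Patterns copied verbatim from the preview above.
no_format = re.compile(r'{(code|noformat}).*?(code|noformat)}')
no_inline = re.compile(r'{{.*?}}')
no_xml = re.compile(r'<[^>]+>')


def strip_markup(text):
    """Remove {code}/{noformat} blocks, {{inline}} spans and XML-like tags.

    Hypothetical helper. Without re.DOTALL, '.' does not match a newline,
    so only blocks that sit on a single line are removed.
    """
    for pattern in (no_format, no_inline, no_xml):
        text = pattern.sub('', text)
    # Collapse the gaps left behind by the removed spans.
    return ' '.join(text.split())


if __name__ == "__main__":
    sample = "see {code}x = 1{code} and {{foo}} in <b>bold</b>"
    print(strip_markup(sample))
```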
@andreinechaev
andreinechaev / colab_cuda_install.sh
Last active August 10, 2023 11:23
Installing CUDA (nvcc) on Google Colab
/opt/bin/nvidia-smi
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb 2> /dev/null
apt-key add /var/cuda-repo-8-0-local-ga2/7fa2af80.pub
apt-get update
apt-get install -qq cuda gcc-5 g++-5 -y
ln -s /usr/bin/gcc-5 /usr/local/cuda/bin/gcc
ln -s /usr/bin/g++-5 /usr/local/cuda/bin/g++
/usr/local/cuda/bin/nvcc --version
#include <stdio.h>
#define N (2048 * 2048) // Number of elements in each vector
/*
* Optimize this already-accelerated codebase. Work iteratively,
* and use nvprof to support your work.
*
* Aim to profile `saxpy` (without modifying `N`) running under
* 50us.
#include <stdio.h>
/*
* Host function to initialize vector elements. This function
* simply initializes each element to equal its index in the
* vector.
*/
__global__
void initWith(float num, float *a, int N)