Chirag Dadia cdadia

## how_to_reset_kafka_consumer_group_offset.md

      
Created November 17, 2021 03:14, forked from marwei/how_to_reset_kafka_consumer_group_offset.md

How to Reset Kafka Consumer Group Offset

Kafka 0.11.0.0 (Confluent 3.3.0) added support for manipulating a consumer group's offsets through the kafka-consumer-groups command-line tool.

List the topics to which the group is subscribed

kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --describe

Note the values under "CURRENT-OFFSET" and "LOG-END-OFFSET". "CURRENT-OFFSET" is the consumer group's current position in each partition.
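
To see how far behind the group is, subtract CURRENT-OFFSET from LOG-END-OFFSET for each partition. The sketch below does that in Python for text captured from the describe command. The sample rows, the topic name, and the function name are made up for illustration. Because the column layout differs between Kafka versions, the parser looks columns up by header name rather than by position, and it assumes the header is the first non-empty line.

```python
def partition_lag(describe_output):
    """Return {(topic, partition): lag} parsed from `--describe` text output."""
    lines = [ln.split() for ln in describe_output.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    col = {name: idx for idx, name in enumerate(header)}
    lag = {}
    for row in rows:
        current = row[col["CURRENT-OFFSET"]]
        end = row[col["LOG-END-OFFSET"]]
        # A partition without a committed offset may show "-" instead of a number.
        if not (current.isdigit() and end.isdigit()):
            continue
        key = (row[col["TOPIC"]], int(row[col["PARTITION"]]))
        lag[key] = int(end) - int(current)
    return lag


# Hypothetical sample; real output has more columns (consumer id, host, ...).
SAMPLE = """\
TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
orders  0          120             150             30
orders  1          -               75              -
orders  2          200             200             0
"""

if __name__ == "__main__":
    print(partition_lag(SAMPLE))
```

Partitions whose offsets are not numeric are skipped rather than reported as zero lag, so a missing commit is not mistaken for a caught-up consumer.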

Reset the consumer offset for a topic (preview)


## gource-multiple-repositories.sh
#!/usr/bin/env bash
# Generates gource video (h.264) out of multiple repositories.
# Pass the repositories in command line arguments.
# Example:
# <this.sh> /path/to/repo1 /path/to/repo2

RESOLUTION="1600x1080"
outfile="gource.mp4"

i=0

## tmux-cheatsheet.markdown

      
Created March 5, 2019 02:51, forked from MohamedAlaa/tmux-cheatsheet.markdown

tmux shortcuts & cheatsheet

start new:
tmux

start new with session name:
tmux new -s myname


## README.md

      
Created March 3, 2017 12:22, forked from dannguyen/README.md

Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which would be needed to then calculate the data delimiters.
On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
###### 1. A low-resolution photo of road signs
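
For reference, a text-detection request body for a photo like this can be assembled as in the sketch below. It assumes the v1 images:annotate REST endpoint with an API key passed in the query string. The helper name and the API_KEY placeholder are illustrative and not taken from the original script. The actual POST (for example with Requests) is left as a comment so the block runs without network access.

```python
import base64
import json

# Assumed endpoint: Cloud Vision v1 REST API, authenticated with an API key.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes, max_results=None):
    """Build the JSON body for a single TEXT_DETECTION request."""
    feature = {"type": "TEXT_DETECTION"}
    if max_results is not None:
        feature["maxResults"] = max_results
    return {
        "requests": [
            {
                # The API expects the image bytes base64-encoded as a string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [feature],
            }
        ]
    }


if __name__ == "__main__":
    body = build_ocr_request(b"not a real image")
    print(json.dumps(body, indent=2))
    # To send it (requires network access and a valid key):
    #   requests.post(ENDPOINT, params={"key": API_KEY}, json=body)
```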

  
## queue.py
import threading

class EventQueue:
    def __init__(self):
        self._queue = self.Queue()
        self._results = self.Results()
        self._runner = self.Runner(self._queue.dequeue)

        self._runner_start()

## ipython_notebook_in_git.md

      
Created October 9, 2016 02:38, forked from pbugnion/ ipython_notebook_in_git.md

Keeping IPython notebooks under Git version control

This gist lets you keep IPython notebooks in git repositories. It tells git to ignore prompt numbers and program outputs when checking that a file has changed.
To use the script, follow the instructions given in the script's docstring.
For further details, read this blogpost.
The procedure outlined here is inspired by this answer on Stack Overflow.
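
The core idea, dropping outputs and prompt numbers before git compares the file, can be sketched in a few lines of Python. This is not the gist's script: the function name is hypothetical, it uses only the standard json module, and it assumes the nbformat 4 layout, where code cells carry `outputs` and `execution_count` keys. Older notebook formats that store `prompt_number` inside worksheets are not handled.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with outputs and prompt numbers cleared."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


if __name__ == "__main__":
    # Tiny hand-written notebook, for illustration only.
    nb = {
        "nbformat": 4,
        "cells": [
            {"cell_type": "markdown", "source": ["# Title"]},
            {
                "cell_type": "code",
                "source": ["1 + 1"],
                "execution_count": 7,
                "outputs": [
                    {"output_type": "execute_result", "data": {"text/plain": ["2"]}}
                ],
            },
        ],
    }
    print(json.dumps(strip_outputs(nb), indent=1, sort_keys=True))
```

A function like this would typically be wired in as a git clean filter, so that the stripped form is what git stores and diffs while the working copy keeps its outputs.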

  
## cleanup-docker.sh
#!/bin/bash

# remove exited containers:
docker ps --filter status=dead --filter status=exited -aq | xargs -r docker rm -v

# remove unused images:
docker images --no-trunc | grep '<none>' | awk '{ print $3 }' | xargs -r docker rmi

# remove unused volumes:
docker volume ls -qf dangling=true | xargs -r docker volume rm

## dlAttachments.py
# Something along the lines of http://stackoverflow.com/questions/348630/how-can-i-download-all-emails-with-attachments-from-gmail
# Make sure you have IMAP enabled in your Gmail settings.
# Right now it won't download the same file name twice, even if the contents differ.

import email
import getpass, imaplib
import os
import sys

detach_dir = '.'

## resize_boot2docker.sh
# Steps we will take:
# 1. Change boot2docker image type (this will take a long time)
# 2. Resize image
# 3. Resize partition (using GParted)
#
# Also see: https://docs.docker.com/articles/b2d_volume_resize/

# Stop boot2docker
boot2docker stop

## example.scala
scala> import scala.reflect.runtime.universe._
import scala.reflect.runtime.universe._

scala> showCode(reify {
     | for{
     |     x <- 1 to 5
     |     _ = print("hi")
     |   } print(x)
     | }.tree)
res1: String =