@diomed
Last active May 27, 2024 17:46
things 4 ocr in SE | silver sombrero | http://git.io/vtMyW
Step1
- it is always better to put an omit (a lookaround exclusion such as (?<!r)) at the start than to capture the letters and then put them back via $1.
Step2
- if you have (?<!r) and plan to replace it with [?<!r], you are badly mistaken if you think that will work: it will not! Square brackets make a character class that matches one literal ?, <, ! or r.
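A minimal sketch of the two steps above, written in Python rather than in the .NET regex flavour Subtitle Edit uses (so the back-reference is `\1` instead of `$1`). The pattern `at`, the sample text, and the function names are made up for illustration only.

```python
import re


def replace_with_capture(text: str) -> str:
    # Step1, the weaker variant: capture the preceding non-"r" letter
    # and put it back with a back-reference.
    return re.sub(r"([^r])at", r"\1AT", text)


def replace_with_lookbehind(text: str) -> str:
    # Step1, the preferred variant: the lookbehind only checks the
    # preceding letter, it does not consume it.
    return re.sub(r"(?<!r)at", "AT", text)


def replace_with_char_class(text: str) -> str:
    # Step2, the mistake: [?<!r] is a character class, so this matches
    # one of "?", "<", "!" or "r" followed by "at".
    return re.sub(r"[?<!r]at", "AT", text)


if __name__ == "__main__":
    sample = "at bat rat"
    print(replace_with_capture(sample))     # at bAT rat
    print(replace_with_lookbehind(sample))  # AT bAT rat
    print(replace_with_char_class(sample))  # at bat AT
```

The capture variant misses the first word because it needs a character in front of the match, while the character-class variant does the opposite of what was intended.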
Try to make Subtitle Edit JAVASCRIPT APPLICATION
WORDLIST
<Word from="■" to="■" />
<RegEx find="■" replaceWith="■" />
remove and put into the word list
iscediti - iscijediti [<check]
Dalee - Dale
29. orisao - orirao | regex - is it possible?
------
dirigujem
dirigiram
dirigujete
dirigirate
dirigovanj - dirigiranj
------
LINELIST
<LinePart from=" . " to=" . " />
Sekao - rezao
isljedni[kc] - detektiv|istražitelj
Lindzi - Lindsay / Lindsey [-.-]
nipodištava - omalovažava
---------------------------------
Rascepati
Džetski - Jetski
-----------------------------------------
word
Čarlijeva
---------------------------------
-- regex --
--------------------------
NEW:
--------------------------
ENGLISH
amke - make
Sprry - Sorry
rigth - right
smae - same
peple - people
knwo - know
thme - them
onw - now
peoc - proc
soem - some
-------------------
mediocre - osrednje
according - prema
crucial - ključn
college - fax
coincidence - slučajnost
tiresome
basic - priprosti
"burn" - "kajla"
denial - nijekanje, odbijanje, poricanje
seemingly - naizgled
live up to - dorasti
Lara Satriani
---------------------------------------------------------------------------------------------------
Boli me uvo - Briga me baš / Fučka mi se živo
ಠ ͜ʖ ಠ
==================================================================================================
učtiv - uljudan, pristojan, fin
<RegEx find="\b([sS])eti([hšmo]|mo|l[aeio]|še|vši|t[ei])?\b" replaceWith="$1jeti$2" />
---------------------------------------------------------------------------------------------------
<RegEx find="([sS])vež([aeiu]|e[mg]|[io]m|oj|in[aeiou]|inom)?\b" replaceWith="$1vjež$2" />
but note: this also covers osvježe // LEAVE IT AS IT IS IN THE REGEX!
<Word from="sveža" to="svježa" />
<Word from="sveže" to="svježe" />
<Word from="sveži" to="svježi" />
<Word from="svežu" to="svježu" />
<Word from="svežem" to="svježem" />
<Word from="svežeg" to="svježeg" />
<Word from="svežim" to="svježim" />
<Word from="svežom" to="svježom" />
<Word from="svežoj" to="svježoj" />
<Word from="svežina" to="svježina" />
<Word from="svežine" to="svježine" />
<Word from="svežini" to="svježini" />
<Word from="svežino" to="svježino" />
<Word from="svežinu" to="svježinu" />
<Word from="svežinom" to="svježinom" />
ljekari - liječnici
lekari - liječnici
lekarima - liječnicima
ljekarima - liječnicima
¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦ ¦
sicamore trees
===================================================================================================
(?&lt;=A|a) or |
-zamajava - zavlači-
===================================================================================================
|| regex:
zvinja - spriča
Duhu svijetom
***************************
cepanje - razdiranje
*******************************
THIS IS VERY IMPORTANT BECAUSE IT SHOWS HOW TO REPLACE LETTERS INSIDE A WORD but not a standalone word
<RegEx find="\Bni[cć]k" replaceWith="ničk" />
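The note above is about changing letters inside a word while leaving the standalone word alone. One way to get that behaviour, assuming a non-word-boundary anchor `\B` in front of the pattern, can be sketched in Python (the function name and sample words are illustrative only):

```python
import re

# \B asserts that we are NOT at a word boundary, so "ni" must be
# preceded by another word character to match.
INSIDE_WORD = re.compile(r"\Bni[cć]k")


def fix_nick(text: str) -> str:
    return INSIDE_WORD.sub("ničk", text)


if __name__ == "__main__":
    print(fix_nick("tehnicki"))  # tehnički
    print(fix_nick("nick"))      # nick (standalone word, untouched)
```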
=================================
Here is how things sometimes turn out in life:
"Yes, 10 entries in any Words (or Lines) list is still faster than 1 <RegEx /> entry."
- so the mission is to move some of the shorter regexes into the word list.
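As a rough sketch of that mission, a short regex can be unrolled into plain word entries with a small script. The helper name and its parameters are hypothetical; the stems and suffixes are the sveža/svježa forms listed earlier in these notes.

```python
from xml.sax.saxutils import quoteattr


def expand_to_word_entries(stem_from, stem_to, suffixes):
    """Build one <Word /> line per suffix (hypothetical helper)."""
    return [
        f"<Word from={quoteattr(stem_from + s)} to={quoteattr(stem_to + s)} />"
        for s in suffixes
    ]


# Suffixes taken from the explicit svež... entries in these notes.
SUFFIXES = ["a", "e", "i", "u", "em", "eg", "im", "om", "oj",
            "ina", "ine", "ini", "ino", "inu", "inom"]

if __name__ == "__main__":
    for line in expand_to_word_entries("svež", "svjež", SUFFIXES):
        print(line)
```

The first printed line is `<Word from="sveža" to="svježa" />`; the output can be pasted into the word list in place of the regex.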
--------------------------------------------------
oh life, why do you oppress me:
:::put this back into the Serbian OCR list::: - nope :: there are two options for this one: one is Đ and the other is č, I think
<RegEx find="Ä" replaceWith="Đ" />
-------------------------------------------------
put this into the German list, but first delete the many numbers tied to this case:
<RegEx find="(\d)m" replaceWith="$1 m" />
-------------------------------------------------
slightly | neznatno, malo
spinel
**********************************************
The Shocking Blue
----------------------------------
<RegEx find="ÄŤ" replaceWith="č" />
<RegEx find="Ä" replaceWith="č" />
<RegEx find="ć" replaceWith="ć" />
<RegEx find="Ä‘" replaceWith="đ" />
<RegEx find="Ĺľ" replaceWith="ž" />
<RegEx find="ž" replaceWith="ž" />
<RegEx find="š" replaceWith="š" />
<RegEx find="Å¡" replaceWith="š" />
<RegEx find="ÄŚ" replaceWith="Č" />
<RegEx find="ÄŒ" replaceWith="Č" />
<RegEx find="Ć" replaceWith="Ć" />
<RegEx find="Ĺ " replaceWith="Š" />
<RegEx find="Å " replaceWith="Š" />
<RegEx find="Ĺ˝" replaceWith="Ž" />
<RegEx find="Ž" replaceWith="Ž" />
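The pairs above look like UTF-8 text that was read back with a single-byte Windows codec. A minimal Python sketch of that assumption follows; the function names are made up, and cp1250 is only one of the codecs involved (some pairs above match cp1252 instead).

```python
def make_mojibake(text: str, wrong_codec: str = "cp1250") -> str:
    """Simulate the damage: UTF-8 bytes decoded with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair_mojibake(text: str, wrong_codec: str = "cp1250") -> str:
    """Reverse the damage; raises if the text does not round-trip."""
    return text.encode(wrong_codec).decode("utf-8")


if __name__ == "__main__":
    print(make_mojibake("č"))     # ÄŤ
    print(repair_mojibake("Ĺľ"))  # ž
```

A round trip like this only works while both bytes survive. When the second byte has no mapping in the wrong codec it may get lost, which would explain why a bare Ä can stand for more than one letter.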
CYRILLIC TO LATIN
<RegEx find="а" replaceWith="a" />
<RegEx find="б" replaceWith="b" />
<RegEx find="в" replaceWith="v" />
<RegEx find="г" replaceWith="g" />
<RegEx find="д" replaceWith="d" />
<RegEx find="ђ" replaceWith="đ" />
<RegEx find="ж" replaceWith="ž" />
<RegEx find="з" replaceWith="z" />
<RegEx find="и" replaceWith="i" />
<RegEx find="л" replaceWith="l" />
<RegEx find="љ" replaceWith="lj" />
<RegEx find="н" replaceWith="n" />
<RegEx find="њ" replaceWith="nj" />
<RegEx find="п" replaceWith="p" />
<RegEx find="р" replaceWith="r" />
<RegEx find="с" replaceWith="s" />
<RegEx find="ћ" replaceWith="ć" />
<RegEx find="у" replaceWith="u" />
<RegEx find="ф" replaceWith="f" />
<RegEx find="х" replaceWith="h" />
<RegEx find="ц" replaceWith="c" />
<RegEx find="ч" replaceWith="č" />
<RegEx find="џ" replaceWith="dž" />
<RegEx find="ш" replaceWith="š" />
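The same mapping can be tried outside Subtitle Edit with a single translation table. This Python sketch copies only the lowercase pairs listed above; letters missing from that list (for example к, м, о, т and all capitals) are deliberately left unmapped here, and the function name is illustrative.

```python
# Lowercase pairs copied from the RegEx entries above.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "ж": "ž", "з": "z", "и": "i", "л": "l", "љ": "lj", "н": "n",
    "њ": "nj", "п": "p", "р": "r", "с": "s", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}

TABLE = str.maketrans(CYR_TO_LAT)


def cyrillic_to_latin(text: str) -> str:
    # One pass over the text; unmapped characters are left as they are.
    return text.translate(TABLE)


if __name__ == "__main__":
    print(cyrillic_to_latin("љубав"))  # ljubav
    print(cyrillic_to_latin("џип"))    # džip
```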
************************************************************************************
Lojd - Lloyd
drugari - prijatelji
---------
bogovetno - sve živo
preslišava ???? - preispitati , propitati
******
Bosnian čćžšđ
tere[cč]en - terećen
*********************
identificiremo *** | allowed
druže - prijatelju (*Russian)
***********************************************
opasuljiti - urazumiti [talk some sense]
isleđivanje - istraga [suđenje]
batali - pusti / zaboravi
magacin - spremište
plakar - ormar
ćilibar - smaragd
sleduje
vještački - umjetni
apparently this sometimes does not work:
izvinjavamo se - ispričavamo se
***********************************************
Charmaigne
Ampsyomysia
Myopya
Maccao
primača soba - dnevna soba [HR]
germa - kvasac
ćebencetom - dekicom
Ispričate - NO!!! I took it out > work out a solution
Beck
magacin >> warehouse << skladište
okvasila - smočila
sljedovanje ???
************************
kao sto - kao što ?*!*?
************************
<RegEx find="\b([vV])rj?ed([ei])" replaceWith="$1rijed$2" />
************************
experimental:
currently testing: <RegEx find="(?&lt;![ml])a([blcrnz])ić" replaceWith="a$1it ć" />
******************************************
trivia:
<RegEx find="([oO])d([kp])" replaceWith="$1t$2" />
******************************************
wordlist:
keep in your own list | but never in the official one:
<LinePart from="da idem" to="ići" />
<LinePart from="da ide" to="ići" />
<Word from="ko" to="tko" />
***
<Word from="sedmice" to="tjedna" />
<Word from="sedmica" to="tjedan" />
<Word from="sedmicu" to="tjedan" />
NOT FOR THE OFFICIAL LIST, BUT KEEP IT IN YOUR OWN
BECAUSE YOU WILL BE CAREFUL, WHILE THEY WILL NOT:
postara - pobrinu
<RegEx find="ostara" replaceWith="obrinu" />
<RegEx find="ajać" replaceWith="ajat ć" />
******************************************
word:
og - od
/zabacit//ubacit/odbacit
podstrekivati - poticati
postrekuje - potiče
nipodišt - omalovaž // put in your own list
batali([lto]) -pusti$1 // put in your own list
prank - psina
prank - zeznuti |verb
Džo - Jo | but only when it is at the start <kej> - exclude džokej
- nope! because e.g. Džordž -> George and the like.
posekao-porezao [dunno - fine if it is a finger, but not if it is a tree]
razumjeti
but razumijete!
-----------------
savešću
saviješću - move this entry
-------------------------
isekla - odsečem
==================================
Pr?ovj?eriću
zvać - zvat ć
---------------------------------------------------------------------------------
ujest - ugrist | REGEX: tamlje - tamnje
---------------------------------------------------------------------------------
zamajavati - zavaravati**
kidnapovanje - otimanje | otmica
*****************************
apply yourself - potruditi se
mooch - žicati
tangible progress - osjetni napredak
pogruženosti -???
pomerim - pomaknem
pasaž - odlomak *
**************************************************
nipodištavanje
"omalovažavanje", verb "omalovažiti", "omalovažávati",
or "podcijeniti", "podcjenjívati".
eliminiše -> ra
stimulisani - stimulirani
uspeti - uspjeti
put in your own list:
<Word from="neko" to="netko" />
<Word from="Neko" to="Netko" />
- in favor of the more common word!
N kći, G kćeri, D kćeri, A kćer, V kćeri, L kćeri, I kćeri/kćerju.
underachiever - neambiciozan
warrant - nalog
my mess - moj problem
reluctance - nevoljkost
descending
glass is on the blue illuminated keyboard
ireparrable
ogorčeni
ANIMALIA
Marsupilami
thoughtful - pažljivo
ubuđalu - pljesnjivu / trulu / učmalu

diomed commented Aug 18, 2015

What's the difference between a limousine and a dead baby? I didn't lose my virginity in the back of a limo.

diomed commented Sep 3, 2015

If debugging is the process of removing software bugs, then programming must be the process of putting them in.

diomed commented Nov 22, 2015

[url=http://titlovi.com/titlovi/it-s-such-a-beautiful-day-219045][img]https://img.shields.io/badge/titlovi.com-It's%20such%20a%20beautiful%20day-brightgreen.png[/img][/url]

diomed commented Jun 20, 2016

'People with no morals often considered themselves more free, but mostly they lacked the ability to feel or love.' Charles Bukowski

diomed commented May 8, 2017

too much - not enough

diomed commented Jul 23, 2017

.

diomed commented Jul 26, 2018

<RegEx find="([^\s]+)nesl" replaceWith="$1nijel" />

an example of how to match a word that does NOT START with those letters but contains that group of letters inside it
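The behaviour described above can be checked quickly in Python. This is only a sketch: Python writes the back-reference as `\1` where Subtitle Edit uses `$1`, and the function name and sample words are made up.

```python
import re


def fix_nesl(text: str) -> str:
    # ([^\s]+) needs at least one non-space character before "nesl",
    # so a word that starts with "nesl" is not touched.
    return re.sub(r"([^\s]+)nesl", r"\1nijel", text)


if __name__ == "__main__":
    print(fix_nesl("donesla"))  # donijela
    print(fix_nesl("nesla"))    # nesla
```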

diomed commented Aug 9, 2018

<RegEx find="([^\s]+)strova" replaceWith="$1strira" />
