Created
June 5, 2018 12:37
import re

# Example image paths; the directory name is the class label.
addrs = [
    'PetImages/A/test1.jpg',
    'PetImages/B/test3.jpg',
    'PetImages/B/test3.jpg',
    'PetImages/C/test4.jpg',
    'PetImages/D/test2.jpg',
    'PetImages/C/test4.jpg',
    'PetImages/A/test1.jpg',
    'PetImages/D/test2.jpg'
]

# A label is the position of the directory letter in this string.
# 'I' is not in the string, so find() would return -1 for it.
letterString = 'ABCDEFGHJKLMNOPQRSTUVWXYZ'

labels = []
for addr in addrs:
    # Capture the single uppercase directory letter after PetImages/.
    m = re.match('PetImages/([A-Z])/.*', addr)
    labels.append(letterString.find(m.group(1)))

print(labels)
Hi @kalaspuffar. I am trying out this code for around 40 classes, but my label names are not single letters such as A, B, C. They have names such as character_1_ka. As I understand it, letterString treats each character of the string as a label, yes? Would this be fixed by making letterString an array of strings?
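A minimal sketch of the array-of-strings idea raised in this question, assuming class directories named like character_1_ka. It is not from the original gist: the list class_names (apart from character_1_ka), the sample paths, and the helper label_for are hypothetical, and the pattern is widened from [A-Z] to [^/]+ so that it captures a whole directory name.

```python
import re

# Hypothetical class names; only 'character_1_ka' appears in the question.
class_names = ['character_1_ka', 'character_2_kha', 'character_3_ga']

# Capture the whole directory name instead of a single uppercase letter.
pattern = re.compile(r'PetImages/([^/]+)/.*')

def label_for(addr):
    """Return the index of the class directory in class_names."""
    m = pattern.match(addr)
    if m is None:
        raise ValueError('path does not match pattern: ' + addr)
    # list.index raises ValueError for an unknown class name.
    return class_names.index(m.group(1))

# Hypothetical sample paths.
addrs = [
    'PetImages/character_1_ka/img1.jpg',
    'PetImages/character_3_ga/img2.jpg',
]
labels = [label_for(addr) for addr in addrs]
print(labels)
```

Unlike str.find, which returns -1 for a missing substring, list.index raises an error for an unknown name, so a misspelled directory fails loudly instead of producing a bad label.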
Hi Shazuka.
Sorry for not responding earlier. I must have missed this comment.
If you get an error saying the match is a NoneType, it means the regular expression did not match the path.
Is the path to the files correct?
The regular expression I use above is 'PetImages/([A-Z])/.*'
It requires that every path starts with PetImages, that the directory inside it is a single uppercase letter, and the files inside those directories can have any name.
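The NoneType problem described above can be guarded against explicitly. This is a rough sketch rather than part of the original gist: the helper name extract_letter and the sample paths are made up for illustration.

```python
import re

PATTERN = re.compile(r'PetImages/([A-Z])/.*')

def extract_letter(addr):
    """Return the directory letter, or None if the path does not match."""
    m = PATTERN.match(addr)
    # re.match returns None when the pattern does not match at the start.
    return m.group(1) if m else None

# Hypothetical sample paths: the second uses a lowercase directory,
# the third does not start with PetImages.
samples = [
    'PetImages/A/test1.jpg',
    'PetImages/a/test1.jpg',
    'Images/A/test1.jpg',
]
for addr in samples:
    letter = extract_letter(addr)
    if letter is None:
        print('no match:', addr)
    else:
        print(letter, addr)
```

Printing the paths that fail to match is usually the quickest way to find out whether the directory layout or the pattern is at fault.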
You can make it more specific by changing it to 'PetImages/([A-Z])/[a-zA-Z]+\.jpg'
Then you only allow file names made of the letters a to z and A to Z, ending with .jpg.
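To see what the stricter pattern changes, here is a small comparison sketch. It assumes the dot before jpg is escaped as \. so that it matches only a literal dot, and the sample file names are hypothetical. Note that the stricter pattern also rejects names containing digits, such as test1.jpg from the gist.

```python
import re

loose = re.compile(r'PetImages/([A-Z])/.*')
# Assumption: the dot is escaped so that it only matches a literal '.'.
strict = re.compile(r'PetImages/([A-Z])/[a-zA-Z]+\.jpg')

# Hypothetical sample paths.
samples = [
    'PetImages/A/cat.jpg',
    'PetImages/A/test1.jpg',
    'PetImages/A/notes.txt',
]

# For each path, record whether the loose and the strict pattern match.
results = [(addr, bool(loose.match(addr)), bool(strict.match(addr)))
           for addr in samples]
for addr, loose_ok, strict_ok in results:
    print(addr, loose_ok, strict_ok)
```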
If you think regular expressions are strange I have a video that might help
https://www.youtube.com/watch?v=tXhwgHXH8OM
Best regards
Daniel