@daino3
Last active December 12, 2021 11:51
Converting an image data uri to (28, 28) numpy array and writing to csv
from PIL import Image
import base64
import numpy
from io import BytesIO
data_uri = ""  # placeholder: paste the full "data:image/...;base64,..." string here
dimensions = (28, 28)
encoded_image = data_uri.split(",")[1]
decoded_image = base64.b64decode(encoded_image)
### APPROACH 1 (BROKEN):
# ____________________
# image is (302, 302)
img = Image.open(BytesIO(decoded_image))
# image is (28, 28)
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
img = img.resize(dimensions, Image.LANCZOS)
# pixels.shape == (28, 28, 4)
pixels = numpy.asarray(img, dtype='uint8')
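# force (28, 28): numpy.resize does not drop channels; it flattens the RGBA
# array and keeps only the first 28 * 28 values
pixels = numpy.resize(pixels, (28, 28))
# image is distorted: R, G, B, A values are now laid out as adjacent pixels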
img = Image.fromarray(pixels)
img.show()
### APPROACH 2 (NOPE):
# ____________________
# image is (302, 302)
img = Image.open(BytesIO(decoded_image)).convert('LA')
# image is (28, 28)
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
img = img.resize(dimensions, Image.LANCZOS)
# pixels.shape == (28, 28, 2)
pixels = numpy.asarray(img, dtype='uint8')
# pixel data is lost
img = Image.fromarray(pixels)
img.show()
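A possible third approach, sketched under assumptions rather than taken from the gist: convert to single-channel grayscale before building the array, so the shape is (28, 28) without any forced reshaping. The function names `data_uri_to_array` and `write_pixels_csv` are hypothetical, and compositing onto a white background is an assumption about the input (a drawing on a transparent canvas); drop that step if the image has no alpha channel.

```python
import base64
import csv
from io import BytesIO

import numpy
from PIL import Image


def data_uri_to_array(data_uri, dimensions=(28, 28)):
    """Decode an image data URI into a 2-D uint8 grayscale array."""
    encoded_image = data_uri.split(",", 1)[1]
    img = Image.open(BytesIO(base64.b64decode(encoded_image))).convert("RGBA")
    # Assumption: transparent areas should read as white, not black,
    # so flatten the alpha channel onto a white background first.
    background = Image.new("RGBA", img.size, (255, 255, 255, 255))
    img = Image.alpha_composite(background, img).convert("L")
    # Resample down to the target size; mode "L" has one channel per pixel.
    img = img.resize(dimensions, Image.LANCZOS)
    return numpy.asarray(img, dtype="uint8")


def write_pixels_csv(pixels, fileobj):
    """Write the 2-D array to an open text file, one image row per CSV row."""
    csv.writer(fileobj).writerows(pixels.tolist())
```

To write to disk, open the target file (for example a hypothetical `digit.csv`) in text mode with `newline=""` and pass the file object to `write_pixels_csv`.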
@Kukunin
Kukunin commented Feb 2, 2020

Your method doesn't work because the data URL holds an encoded JPEG or PNG binary, so you need an image library to decode it into an array of pixels. Using OpenCV works for me:

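    import base64
    import cv2
    import numpy as np
    image_b64 = dataurl.split(",")[1]
    binary = base64.b64decode(image_b64)
    image = np.asarray(bytearray(binary), dtype="uint8")
    # decode the PNG/JPEG bytes into a (height, width, 3) BGR pixel array
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)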

@daino3
Author

daino3 commented Feb 3, 2020

Thanks, Kukunin. I'll give it a try.