jasdeep06/computer_vision.md

## computer_vision.md

      
Reading an image with PIL gives a native PIL `Image` object. Its `size` attribute is reported as (width, height), for example (400, 300); PIL does not include the number of channels. The PIL object can be converted into a NumPy array by passing it to `numpy.array()`. The returned array has a `shape` attribute of (rows, columns, channels), i.e. (height, width, channels). For a grayscale image there is no channel axis, so the shape here is (300, 400). Plotting either object with matplotlib gives a correctly oriented image.

import matplotlib.pyplot as plt
from PIL import Image
import numpy as np

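# Open the image and convert it to 8-bit grayscale ('L' mode)
img = Image.open('data/image.jpg').convert('L')
print(img.size)  # (400, 300): PIL reports (width, height)

# Convert the PIL image to a NumPy array
np_img = np.array(img)
print(np_img.shape)  # (300, 400): NumPy reports (rows, columns)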

fig = plt.figure()
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)

ax1.imshow(img,cmap='gray')
ax2.imshow(np_img,cmap='gray')

plt.show()
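
The (width, height) versus (rows, columns, channels) convention can be checked without any image file. The following is a minimal sketch that builds blank images in memory with `Image.new`; the sizes and variable names are illustrative assumptions, not part of the original notes.

```python
import numpy as np
from PIL import Image

# Blank in-memory images, 400 pixels wide and 300 pixels tall.
gray = Image.new('L', (400, 300))    # single-channel grayscale
color = Image.new('RGB', (400, 300)) # three-channel colour

# PIL always reports (width, height), regardless of the mode.
print(gray.size, color.size)

# NumPy reports (rows, columns) for grayscale and
# (rows, columns, channels) for colour.
gray_arr = np.array(gray)
color_arr = np.array(color)
print(gray_arr.shape, color_arr.shape)

# Converting back with Image.fromarray restores the PIL convention.
round_trip = Image.fromarray(color_arr)
print(round_trip.size)
```

Note that the channel axis only appears for multi-channel modes such as 'RGB'; after `.convert('L')` the array is two-dimensional.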


Reshape is a dangerous command, mainly because the input and output shapes alone do not determine the result: many different arrangements of the elements fit the same output shape. NumPy resolves this ambiguity by first raveling the input tensor, i.e. laying its elements out in a 1-D array (in row-major order by default), and then filling the requested output shape from that 1-D array in the same order.

import numpy as np
l = np.random.randint(1,10,(5,4,3))
print(l)
# [[[5 1 2]
#   [2 9 7]
#   [9 2 9]
#   [6 8 1]]
# 
#  [[1 9 1]
#   [8 9 2]
#   [2 3 5]
#   [6 1 7]]
# 
#  [[3 1 8]
#   [2 6 7]
#   [4 5 9]
#   [2 9 3]]
# 
#  [[9 1 8]
#   [9 6 5]
#   [9 4 4]
#   [7 1 5]]
# 
#  [[2 7 8]
#   [1 7 6]
#   [1 3 1]
#   [6 2 1]]]


print(np.reshape(l,(4,5,3)))

# [[[5 1 2]
#   [2 9 7]
#   [9 2 9]
#   [6 8 1]
#   [1 9 1]]
# 
#  [[8 9 2]
#   [2 3 5]
#   [6 1 7]
#   [3 1 8]
#   [2 6 7]]
# 
#  [[4 5 9]
#   [2 9 3]
#   [9 1 8]
#   [9 6 5]
#   [9 4 4]]
# 
#  [[7 1 5]
#   [2 7 8]
#   [1 7 6]
#   [1 3 1]
#   [6 2 1]]]
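
The ravel-then-refill behaviour can be verified directly. This is a small sketch using `np.arange` instead of random values so the element positions are easy to follow; the variable names are illustrative assumptions.

```python
import numpy as np

# Deterministic stand-in for the random array above: values 0..59.
a = np.arange(60).reshape(5, 4, 3)

# Reshape to (4, 5, 3): same elements, same flattened order.
b = np.reshape(a, (4, 5, 3))
same_flat_order = np.array_equal(a.ravel(), b.ravel())
print(same_flat_order)

# The fifth row of the first output block is the first row of the
# second input block, just as in the printed example above.
print(b[0, 4], a[1, 0])

# Swapping the first two axes also gives shape (4, 5, 3),
# but the elements land in different places.
t = np.transpose(a, (1, 0, 2))
print(t.shape, np.array_equal(t, b))
```

The last line shows why the output shape alone says little: `b` and `t` have identical shapes but different contents.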


If you want operations such as rotating an image by 90 degrees, reshaping won't cut it. You need to use operations like transpose instead, which actually move elements between axes.
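
As a rough illustration of the difference, the sketch below compares `reshape`, transpose, and `np.rot90` on a tiny 2x3 array standing in for an image. The array and names are assumptions made for the example.

```python
import numpy as np

# A tiny 2x3 "image" with distinct values.
img = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

# Reshape only re-wraps the flattened sequence 0..5 into 3 rows.
reshaped = img.reshape(3, 2)
# [[0 1]
#  [2 3]
#  [4 5]]

# Transpose swaps rows and columns (a mirror across the diagonal).
transposed = img.T
# [[0 3]
#  [1 4]
#  [2 5]]

# rot90 rotates counterclockwise by 90 degrees; it matches a
# transpose followed by reversing the row order.
rotated = np.rot90(img)
# [[2 5]
#  [1 4]
#  [0 3]]

print(reshaped, transposed, rotated, sep='\n\n')
```

All three results have shape (3, 2), but only the last one is a rotated image.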

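imshow in matplotlib accepts negative pixel values too. A grayscale display ultimately needs pixel values in the range (0, 255), but the minimum pixel value in our image may be negative. For a 2-D array, imshow by default maps the minimum value (here negative) to pure black and the maximum value to pure white, and scales everything in between linearly. In other words, it stretches the data range, whose width is max - min (equal to |max| + |min| when the minimum is negative and the maximum is positive), over the full (0, 255) range and plots accordingly.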
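
The default scaling can be reproduced by hand. The following sketch assumes a small 2-D array with a negative minimum and uses `matplotlib.colors.Normalize`, which applies the same linear min-to-max mapping; the data values and names are illustrative, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to view the plot
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

# 2-D data with a negative minimum (-50) and a positive maximum (150).
data = np.array([[-50.0, 0.0],
                 [50.0, 150.0]])

# Linear mapping of [min, max] onto [0, 1]: 0 is black, 1 is white
# for the 'gray' colormap.
norm = Normalize(vmin=data.min(), vmax=data.max())
scaled = np.asarray(norm(data))
print(scaled)

# The same mapping written out explicitly.
manual = (data - data.min()) / (data.max() - data.min())

# With no vmin/vmax given, imshow picks the data min and max as limits.
fig, ax = plt.subplots()
im = ax.imshow(data, cmap='gray')
limits = im.get_clim()
print(limits)
```

Passing explicit `vmin` and `vmax` to `imshow` overrides this automatic stretch, which is useful when several images should share one brightness scale.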