Image Formats
An introduction to the common ways of reading an image in Python (OpenCV, PIL, NumPy) and to the memory layouts needed when passing it to TensorRT or cuDNN.
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
img = cv2.imread("image_formats/test.png")
Row,Column,Channel = img.shape
print("Rows = ", Row, "Columns = ", Column)
print("Height = ", Row, "Width = ", Column, "Channels = ", Channel)
img_pil = Image.open("image_formats/test.png")
Common PIL modes (reported by img_pil.mode):
- 1 (1-bit pixels, black and white, stored with one pixel per byte)
- L (8-bit pixels, grayscale)
- RGB (3x8-bit pixels, true color)
- RGBA (4x8-bit pixels, true color with transparency mask)
- CMYK (4x8-bit pixels, color separation)
- HSV (3x8-bit pixels, Hue, Saturation, Value color space)
img_pil.mode
Column,Row = img_pil.size
print("Rows = ", Row, "Columns = ", Column)
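The two libraries report dimensions in opposite orders, which is easy to trip over. The following is a minimal sketch of that difference, assuming Pillow and NumPy are installed. It uses a synthetic 640 x 480 image (a made-up size, not the test.png above), and the names `demo_pil` and `demo_arr` are illustrative only.

```python
import numpy as np
from PIL import Image

# Synthetic image (assumption: any size works; PIL takes (width, height))
width, height = 640, 480
demo_pil = Image.new("RGB", (width, height))

# PIL reports (width, height) ...
pil_size = demo_pil.size

# ... while the NumPy view of the same image is (rows, columns, channels),
# i.e. (height, width, channels), the same order as cv2.imread's img.shape
demo_arr = np.asarray(demo_pil)
arr_shape = demo_arr.shape

print("PIL size    =", pil_size)
print("NumPy shape =", arr_shape)
```

This is why the code above unpacks `img.shape` as Row, Column, Channel but `img_pil.size` as Column, Row.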
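png_np_img = np.asarray(img) # cv2.imread already returns a NumPy array, in BGR order
plt.imshow(cv2.cvtColor(png_np_img, cv2.COLOR_BGR2RGB)) # matplotlib expects RGB
# if it's grayscale: plt.imshow(png_np_img, cmap='gray')
# Showing the NumPy array characteristics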
print("shape is ", png_np_img.shape)
print("dtype is ", png_np_img.dtype)
print("ndim is ", png_np_img.ndim)
print("itemsize is ", png_np_img.itemsize) # size in bytes of each array element
print("nbytes is ", png_np_img.nbytes) # total size in bytes of the whole array
png_np_img = np.asarray(img_pil)
plt.imshow(png_np_img) # this will plot it in a Jupyter notebook
# or, if it's grayscale: plt.imshow(png_np_img, cmap='gray')
# Showing the NumPy array characteristics
print("shape is ", png_np_img.shape)
print("dtype is ", png_np_img.dtype)
print("ndim is ", png_np_img.ndim)
print("itemsize is ", png_np_img.itemsize) # size in bytes of each array element
print("nbytes is ", png_np_img.nbytes) # total size in bytes of the whole array
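The relationship between `itemsize` and `nbytes` matters when sizing buffers for a GPU library. This is a small sketch using a made-up 4 x 6 array as a stand-in for a decoded image. The conversion to float32 is shown only as an example of a typical network input type, not as a requirement of any particular framework.

```python
import numpy as np

# Stand-in for a decoded 8-bit colour image: 4 rows, 6 columns, 3 channels
demo = np.zeros((4, 6, 3), dtype=np.uint8)

# nbytes is the total buffer size: number of elements times bytes per element
expected_nbytes = demo.shape[0] * demo.shape[1] * demo.shape[2] * demo.itemsize
print("uint8  :", demo.itemsize, demo.nbytes, expected_nbytes)

# The same pixels stored as float32 take four times as much memory
demo_f32 = demo.astype(np.float32)
print("float32:", demo_f32.itemsize, demo_f32.nbytes)
```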
- At (x=600, y=100), i.e. column 600 and row 100, GIMP reports the R,G,B values (21,131,172)
row = 100
col = 600
pixel_b, pixel_g, pixel_r = img[row][col]
print("BGR values = ", pixel_b, pixel_g, pixel_r)
row = 100
col = 600
pixel_r, pixel_g, pixel_b = img_pil.getpixel((col,row))
print("BGR values = ", pixel_b, pixel_g, pixel_r)
row = 100
col = 600
pixel_b, pixel_g, pixel_r = np.array(img)[row,col]
print("BGR values = ", pixel_b, pixel_g, pixel_r)
row = 100
col = 600
pixel_r, pixel_g, pixel_b = np.array(img_pil)[row,col]
print("BGR values = ", pixel_b, pixel_g, pixel_r)
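The four lookups above differ only in channel order: arrays that come from OpenCV are BGR, while arrays that come from PIL are RGB. This NumPy-only sketch converts between the two. The 2 x 2 array is hypothetical and simply carries the GIMP example colour in one pixel.

```python
import numpy as np

# Hypothetical 2 x 2 image in OpenCV's BGR order; one pixel carries the
# example colour R,G,B = (21, 131, 172), stored here as B,G,R
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[1, 0] = (172, 131, 21)

# Reversing the last axis swaps B and R, giving the order PIL uses
rgb = bgr[:, :, ::-1]

pixel_r, pixel_g, pixel_b = rgb[1, 0]
print("RGB values = ", pixel_r, pixel_g, pixel_b)
```

On a real image, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` should give the same result as the slice and returns a contiguous copy instead of a view.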
b,g,r = cv2.split(img)
f, axarr = plt.subplots(2,2)
axarr[0,0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) # img is BGR; matplotlib expects RGB
axarr[0,1].imshow(b)
axarr[1,0].imshow(g)
axarr[1,1].imshow(r)
r,g,b = img_pil.split()
#r = img_pil.getchannel(0)
#g = img_pil.getchannel(1)
#b = img_pil.getchannel(2)
f, axarr = plt.subplots(2,2)
axarr[0,0].imshow(img_pil)
axarr[0,1].imshow(b)
axarr[1,0].imshow(g)
axarr[1,1].imshow(r)
arr = np.array(img_pil)
r = arr[:,:,0]
g = arr[:,:,1]
b = arr[:,:,2]
f, axarr = plt.subplots(2,2)
axarr[0,0].imshow(arr)
axarr[0,1].imshow(b)
axarr[1,0].imshow(g)
axarr[1,1].imshow(r)
arr = np.array(img)
b = arr[:,:,0]
g = arr[:,:,1]
r = arr[:,:,2]
f, axarr = plt.subplots(2,2)
axarr[0,0].imshow(arr[:, :, ::-1]) # arr is BGR; reverse the channels for display
axarr[0,1].imshow(b)
axarr[1,0].imshow(g)
axarr[1,1].imshow(r)
Interleaved (HWC) layout of a 10 x 10 BGR image:
Offset (byte) :  0  1  2  3  4  5  6  7  8 ... 27 28 29 30 31 32 33 34 35 ...
Height Pos    :  0  0  0  0  0  0  0  0  0 ...  0  0  0  1  1  1  1  1  1 ...
Width Pos     :  0  0  0  1  1  1  2  2  2 ...  9  9  9  0  0  0  1  1  1 ...
Color Index   :  B  G  R  B  G  R  B  G  R ...  B  G  R  B  G  R  B  G  R ...
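The interleaved table follows a single formula: offset = (h * W + w) * C + c. This sketch checks it against NumPy's own indexing, assuming the 10 x 10, 3-channel, one-byte-per-channel image used in the table. The helper name `hwc_offset` is made up for illustration.

```python
import numpy as np

H, W, C = 10, 10, 3  # the 10 x 10, 3-channel example from the table

def hwc_offset(h, w, c):
    """Flat offset of channel c of pixel (h, w) in an interleaved (HWC) buffer."""
    return (h * W + w) * C + c

# Fill an HWC array so that every element stores its own flat offset
hwc = np.arange(H * W * C).reshape(H, W, C)

print(hwc_offset(0, 9, 0))  # B of the last pixel in row 0
print(hwc_offset(1, 0, 0))  # B of the first pixel in row 1
print(hwc[1, 1, 2] == hwc_offset(1, 1, 2))  # formula agrees with NumPy indexing
```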
Planar (CHW) layout of a 10 x 10 BGR image:
Offset (byte) :  0  1  2  3 ...  9 10 11 12 13 ... 90 91 92 93 ... 99 100 ... 199 200 ... 299
Height Pos    :  0  0  0  0 ...  0  1  1  1  1 ...  9  9  9  9 ...  9   0 ...   9   0 ...   9
Width Pos     :  0  1  2  3 ...  9  0  1  2  3 ...  0  1  2  3 ...  9   0 ...   9   0 ...   9
Color Index   :  B  B  B  B ...  B  B  B  B  B ...  B  B  B  B ...  B   G ...   G   R ...   R
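In the planar layout the formula becomes offset = (c * H + h) * W + w, so each channel occupies one contiguous block of H * W elements. This sketch reorders an interleaved array with NumPy and checks the formula, again assuming a 10 x 10, 3-channel image. The helper name `chw_offset` is illustrative.

```python
import numpy as np

H, W, C = 10, 10, 3

def chw_offset(c, h, w):
    """Flat offset of pixel (h, w) of channel c in a planar (CHW) buffer."""
    return (c * H + h) * W + w

# Interleaved source: every element stores its own HWC offset
hwc = np.arange(H * W * C, dtype=np.int32).reshape(H, W, C)

# transpose() only changes strides; ascontiguousarray() physically
# rewrites the buffer so that each channel becomes one plane
chw = np.ascontiguousarray(hwc.transpose(2, 0, 1))
flat = chw.ravel()

print(chw.shape)
print(chw_offset(1, 0, 0), chw_offset(2, 9, 9))  # start of plane 1, last element
print(flat[chw_offset(1, 2, 3)] == hwc[2, 3, 1])  # same element, new position
```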
- legacy ("HWC") mode (CPU and GPU without cudnn): Channels are tuples of scalars. Each sample is stored as a column-major matrix (height, width) of float[numChannels] (r00, g00, b00, r10, g10, b10, r01, g01, b01, r11, g11, b11).
  - input : [C x W x H x N] or ARRAY[1..N] OF ARRAY[1..H] OF ARRAY[1..W] OF ARRAY[1..C]
  - output : [C' x W' x H' x N] or ARRAY[1..N] OF ARRAY[1..H'] OF ARRAY[1..W'] OF ARRAY[1..C']
  - filter : [C' x W" x H" x C] or ARRAY[1..C] OF ARRAY[1..H"] OF ARRAY[1..W"] OF ARRAY[1..C']
- cudnn ("CHW") mode (GPU only): Channels are planes (b00, b10, b01, b11, g00, g10, g01, g11, r00, r10, r01, r11).
  - input : [W x H x C x N] or ARRAY[1..N] OF ARRAY[1..C] OF ARRAY[1..H] OF ARRAY[1..W]
  - output : [W' x H' x C' x N] or ARRAY[1..N] OF ARRAY[1..C'] OF ARRAY[1..H'] OF ARRAY[1..W']
  - filter : [W" x H" x C x C'] or ARRAY[1..C'] OF ARRAY[1..C] OF ARRAY[1..H"] OF ARRAY[1..W"]
where:
- using ' for output and " for filter
- N = samples
- W, H = width, height (W', H' for output, W", H" for kernel)
- C = input channels
  - 3 for color images, 1 for B&W images
  - for hidden layer: dimension of activation vector for each pixel
- C' = output channels = dimension of activation vector for each pixel
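To connect these layouts to the arrays read earlier, the sketch below turns one HWC uint8 image into a contiguous float32 batch of shape (N, C, H, W) with N = 1, matching the "CHW" description above. The function name `hwc_to_nchw` and the random test image are assumptions for illustration. Normalisation, channel order (BGR versus RGB), and the exact input type depend on the network and are not covered here.

```python
import numpy as np

def hwc_to_nchw(image_hwc):
    """Turn one HWC image into a contiguous float32 batch of shape (1, C, H, W)."""
    chw = image_hwc.transpose(2, 0, 1)   # HWC -> CHW (a view, no copy yet)
    nchw = chw[np.newaxis, ...]          # add the sample dimension N = 1
    # Copy into a C-contiguous float32 buffer so the planes are really
    # laid out one after another in memory
    return np.ascontiguousarray(nchw, dtype=np.float32)

# Hypothetical 4 x 5 colour image standing in for cv2.imread / PIL output
demo = np.random.randint(0, 256, size=(4, 5, 3), dtype=np.uint8)
batch = hwc_to_nchw(demo)

print(batch.shape, batch.dtype, batch.flags["C_CONTIGUOUS"])
```

The explicit copy is the important step: a transposed view has the right shape but still points at the interleaved buffer, which a library reading raw memory would misinterpret.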