Image Processing
Johnny Depp, the members of Guns N' Roses, Donald Trump, Jon Berasategui. What links all these people? That's right, they're all in need of a good haircut. But should I be added to this list? Most likely, but why let conjecture and speculation decide when our trusty friend LiveCode can do all the work? So how do we go about this? Let's find out...
You can download the stack associated with this lesson from this url: https://tinyurl.com/y8ygagaq
The doINeedAHaircut function
What we ideally want is a function with the following prototype that takes an image and returns true if we need a haircut or false if we don't.
function DoINeedAHaircut pImage, pWidth, pHeight
Above is what I looked like this morning. Do I need a haircut? Only LiveCode knows.
Extracting the image data
First of all, we need to extract the data from the image for processing. One option is to use the standard binary data that defines the image (available via the command get the text of image "my image"). This data, however, depends on the format of the image and is not easily processed. A better option is the imageData property offered by LiveCode (get the imageData of image "my image").
The image data defines four properties for each pixel in our image: the alpha content, the red content, the green content and the blue content. Each property is a value between 0 and 255 and is represented within the imageData as a single character. Each of these characters is stored in memory as a single byte, or 8 bits. The maximum value that 8 bits can represent (in decimal) is 255, meaning 8 bits can represent 256 unique values. This in turn means that there are 256 possible characters. Don't worry if you don't understand that though. It's simple to get the numeric value of each character via the charToNum function.
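As a small illustration of the layout described above, the following sketch reads the four characters belonging to the top left pixel and converts each one to a number. The handler name showFirstPixel and the image name "my image" are placeholders chosen for this example, not part of the lesson's stack.

```livecode
-- Sketch: read the four values of the top left pixel.
-- "my image" is a placeholder for an image on the current card.
on showFirstPixel
   local tData, tAlpha, tRed, tGreen, tBlue
   put the imageData of image "my image" into tData
   -- Characters 1 to 4 belong to the first pixel, in the order
   -- the lesson describes: alpha, red, green, blue
   put charToNum(char 1 of tData) into tAlpha
   put charToNum(char 2 of tData) into tRed
   put charToNum(char 3 of tData) into tGreen
   put charToNum(char 4 of tData) into tBlue
   -- With no destination, put writes to the message box
   put "alpha:" && tAlpha & ", red:" && tRed & ", green:" && tGreen & ", blue:" && tBlue
end showFirstPixel
```

Each number shown should fall between 0 and 255.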
So, character two of the imageData is the red content of the top left pixel of our image, character six is the red content of the second pixel in the first row and so on for each pixel. Since every pixel occupies four characters, each row of the image occupies pWidth * 4 characters. It follows that the red content of pixel x of row y of our image will be char ((y - 1) * pWidth * 4) + ((x - 1) * 4) + 2 of the imageData. Given that we know the width and height of our image (passed to our function as arguments), we can loop through the imageData row by row, extracting the red, green, blue and alpha content from our image.
local tRed, tGreen, tBlue, tAlpha, tOffset
repeat with y = 1 to pHeight
   repeat with x = 1 to pWidth
      -- Number of characters before this pixel: 4 per pixel, pWidth pixels per row
      put ((y - 1) * pWidth + (x - 1)) * 4 into tOffset
      put charToNum(char (tOffset + 2) of pImage) into tRed
      put charToNum(char (tOffset + 3) of pImage) into tGreen
      put charToNum(char (tOffset + 4) of pImage) into tBlue
      put charToNum(char (tOffset + 1) of pImage) into tAlpha
   end repeat
end repeat
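The position arithmetic can also be wrapped in a small helper, which makes it easier to check by hand. This is only a sketch: pixelOffset and redAt are hypothetical names introduced here, and it assumes rows are stored one after another with four characters per pixel, so that each row occupies pWidth * 4 characters.

```livecode
-- Hypothetical helper: number of characters that come before
-- pixel pX of row pY, assuming 4 characters per pixel.
function pixelOffset pX, pY, pWidth
   return ((pY - 1) * pWidth + (pX - 1)) * 4
end pixelOffset

-- Hypothetical helper: red content of pixel pX of row pY.
function redAt pImage, pX, pY, pWidth
   return charToNum(char (pixelOffset(pX, pY, pWidth) + 2) of pImage)
end redAt
```

For the second pixel of the first row, pixelOffset returns 4, so redAt reads character six, which matches the description above. For the first pixel of the second row of an image 10 pixels wide, the offset is 40 and the red content is character 42.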
Thresholding
This is all very well, but it doesn't tell us any more about the state of our hair. What we need to do is extract the pixels defining our hair from the image. One method we can use is called thresholding. Thresholding produces a black and white image based on a threshold value and is particularly good for binary situations, i.e. where we have a well defined object we want to extract from a background. It works using the following principle: if the given pixel value is below our threshold value, colour it black; otherwise, colour it white. Dark pixels, such as those in my hair, have low values and so end up black.
In our image, the colour of each pixel is represented by three values (the red, green and blue content). From these three values we want to produce a single value which can be compared to our threshold. A good idea is to take the average value of the three colour properties. An even better idea is to take the average value weighted toward particular colours. This allows us to pick out particular colour values more accurately. Since my hair is a dark brown colour, I weighted more toward blue and green. So, our image processing loop now becomes...
local tRed, tGreen, tBlue, tAlpha, tOffset
repeat with y = 1 to pHeight
   repeat with x = 1 to pWidth
      -- Number of characters before this pixel: 4 per pixel, pWidth pixels per row
      put ((y - 1) * pWidth + (x - 1)) * 4 into tOffset
      put charToNum(char (tOffset + 2) of pImage) into tRed
      put charToNum(char (tOffset + 3) of pImage) into tGreen
      put charToNum(char (tOffset + 4) of pImage) into tBlue
      put charToNum(char (tOffset + 1) of pImage) into tAlpha
      if (tRed * RED_WEIGHT + tGreen * GREEN_WEIGHT + tBlue * BLUE_WEIGHT) / 3 < THRESHOLD then
         -- Below the threshold: colour the pixel black
         put numToChar(0) into char (tOffset + 2) of pImage
         put numToChar(0) into char (tOffset + 3) of pImage
         put numToChar(0) into char (tOffset + 4) of pImage
      else
         -- Otherwise: colour the pixel white
         put numToChar(255) into char (tOffset + 2) of pImage
         put numToChar(255) into char (tOffset + 3) of pImage
         put numToChar(255) into char (tOffset + 4) of pImage
      end if
   end repeat
end repeat
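To see the weighted average on its own, here is a minimal sketch that computes it for a single pixel. The function name weightedValue, the handler testWeightedValue and all the numbers are made up for illustration. The weights are passed in as parameters so the example does not depend on any constants being declared.

```livecode
-- Hypothetical helper: weighted average of one pixel's colour values.
function weightedValue pRed, pGreen, pBlue, pRedWeight, pGreenWeight, pBlueWeight
   return (pRed * pRedWeight + pGreen * pGreenWeight + pBlue * pBlueWeight) / 3
end weightedValue

-- Example with made-up numbers: a dark brown pixel (60, 40, 30),
-- with red weighted at 0.5 and green and blue at 1:
--    (60 * 0.5 + 40 * 1 + 30 * 1) / 3 = 100 / 3, roughly 33.3
on testWeightedValue
   put weightedValue(60, 40, 30, 0.5, 1, 1)
end testWeightedValue
```

With a threshold of, say, 50, this pixel falls below the threshold and would be coloured black.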
Using constants, weights and thresholds
Notice the use of constants for the weights and thresholds. This allows us to easily adjust the desired values throughout the entirety of our script from a single point, saving us wading through our code to find them manually. A good convention to use is to either precede each constant with the letter k or define them in capitals, differentiating them from regular variables and keywords. In this case, our weights will be values between 0 and 1 (what fraction of that colour we want to extract) and the threshold values will be between 0 and 255. At various points throughout the function, it's a good idea to view the output of our processing so we can see exactly what is going on. You can do this via the command set the imageData of image "my image" to pImage. Here are the results I got.
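As a rough sketch, the constants could be declared once at the top of the script, outside any handler, so that every handler in the script can use them. The values below are assumptions chosen purely for illustration and will need tuning for your own photo. The showProcessedImage command and the image name "my image" are likewise placeholders.

```livecode
-- Assumed values for illustration only; tune them for your own image.
-- Weights are fractions between 0 and 1, the threshold is between 0 and 255.
constant RED_WEIGHT = 0.5
constant GREEN_WEIGHT = 1
constant BLUE_WEIGHT = 1
constant THRESHOLD = 60

-- Hypothetical helper: display processed image data so the result
-- of each processing step can be inspected.
command showProcessedImage pImage
   set the imageData of image "my image" to pImage
end showProcessedImage
```

Changing a weight or the threshold then means editing one line rather than every place the value is used.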
Do I need a haircut?
We are now a little further down the line, but we still don't have the answer to our burning question. My reasoning is this: if my hair occupies more space than my face, it's about time I got a haircut. To test this we also need to extract our face from the image. We do this as before, by specifying a second threshold value for our face, so that each pixel in the image is classed as either part of our hair, our face or the background. Finally, we just need to keep count of the pixels that fall on our face and the pixels that fall on our hair. If there are more hair pixels than face pixels, return true. So our function now becomes...
function doINeedAHaircut pImage, pWidth, pHeight
   -- HAIR_THRESHOLD, FACE_THRESHOLD, the weights and the _COLOUR values
   -- are constants declared elsewhere in the script
   local tHairCount, tFaceCount
   local tRed, tGreen, tBlue, tAlpha, tOffset, tValue
   put 0 into tHairCount
   put 0 into tFaceCount
   repeat with y = 1 to pHeight
      repeat with x = 1 to pWidth
         -- Number of characters before this pixel: 4 per pixel, pWidth pixels per row
         put ((y - 1) * pWidth + (x - 1)) * 4 into tOffset
         put charToNum(char (tOffset + 2) of pImage) into tRed
         put charToNum(char (tOffset + 3) of pImage) into tGreen
         put charToNum(char (tOffset + 4) of pImage) into tBlue
         put charToNum(char (tOffset + 1) of pImage) into tAlpha
         -- Weighted average, computed once and compared against both thresholds
         put (tRed * RED_WEIGHT + tGreen * GREEN_WEIGHT + tBlue * BLUE_WEIGHT) / 3 into tValue
         if tValue < HAIR_THRESHOLD then
            put numToChar(HAIR_COLOUR) into char (tOffset + 2) of pImage
            put numToChar(HAIR_COLOUR) into char (tOffset + 3) of pImage
            put numToChar(HAIR_COLOUR) into char (tOffset + 4) of pImage
            add 1 to tHairCount
         else if tValue < FACE_THRESHOLD then
            put numToChar(FACE_COLOUR) into char (tOffset + 2) of pImage
            put numToChar(FACE_COLOUR) into char (tOffset + 3) of pImage
            put numToChar(FACE_COLOUR) into char (tOffset + 4) of pImage
            add 1 to tFaceCount
         else
            put numToChar(BG_COLOUR) into char (tOffset + 2) of pImage
            put numToChar(BG_COLOUR) into char (tOffset + 3) of pImage
            put numToChar(BG_COLOUR) into char (tOffset + 4) of pImage
         end if
      end repeat
   end repeat
   return tHairCount > tFaceCount
end doINeedAHaircut
My image, with my face picked out, now looks like the image above.
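One way the function might be called is from a button's mouseUp handler, as in the sketch below. It assumes the photo is in an image named "my image" (a placeholder), that the image is displayed at the same size as its pixel data, and that doINeedAHaircut is available in the message path. The wording of the dialogs is invented for the example.

```livecode
on mouseUp
   local tNeedsCut
   -- Pass the pixel data together with the image's dimensions
   put doINeedAHaircut(the imageData of image "my image", \
         the width of image "my image", \
         the height of image "my image") into tNeedsCut
   if tNeedsCut then
      answer "Time for a haircut."
   else
      answer "You can wait a little longer."
   end if
end mouseUp
```

Because pImage is passed by value, the recoloured data is discarded when the function returns. To see the processed picture, set the imageData of the image inside the function before it returns.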
The decision
So that's it. Pretty easy huh? It's amazing what a few lines of LiveCode script can do. If you found that interesting, search Google to find out more about image processing. There's loads of information out there, from edge detection via Gaussian Filters to using feature vectors to extract more complex properties from images. Find the results from thresholding limiting? Then calculate your threshold value locally using adaptive thresholding. Or why not extend your application using some of Revolution's features? Why not grab your images continuously from a video camera using the Video Grabber? Or maybe you want the whole world to know if you need a haircut. So publish your findings online using LibUrl or even via a database using RevDB.
And do I need a haircut? Well, yes apparently. But I think we are all in agreement, we've just had a glimpse of the future: Using LiveCode to answer all the big questions. In fact, I see a range of newsletter articles dedicated to it. You ask, LiveCode answers. I wait eagerly for next month's article from Ben. "It's Thursday evening. Shall I go climbing?"
Trevix
Very nice.
Can you give me some clues on how to set up check mark detection in an image (grey scale only)?
For example, suppose I prepare an empty template with square boxes for multiple-choice answers. I would like to scan the template, converting it to an image, and then scan a pencil-filled copy and compare the two, so as to find out which answers are correct.
Elanor Buchanan
Hi Trevix
If you know roughly where the checks will be in your image you could use a similar method to check the areas of the image where the checkboxes are, but instead of looping across the width and height of the whole image you can loop from x1 to x2 and y1 to y2 for each checkbox. You could then compare the blank and complete versions in a similar way to the comparison between tHairCount and tFaceCount in the example.
Would that give you what you need?
Kind regards
Elanor
Giulio Mastrogiuseppe
Very interesting article. One question: could this be used in some way with gray-shades-only images? Could it be useful for building an application to examine images from echography?
Elanor Buchanan
Dear Giulio,
Yes, you should be able to use this method, or something similar. You would just need to adjust the processing algorithm.
Kind regards
Elanor