Finding Common Lines in Two Containers

This lesson demonstrates how to find the lines that two containers have in common, which is useful when comparing lists.

The UI

The UI for this example consists of two fields, each containing some text, and a button that finds the lines common to both fields.

You can download the stack associated with this lesson from this URL: https://tinyurl.com/y8wtfbyn

The commonLines function

The commonLines function finds the lines that appear in both of two lists. It does this by creating an array for each list, with the keys being the text of the unique lines, then using the intersect command to find the keys that are in both arrays. A typical call looks like this:

put commonLines(tBaseList, field "New Lines") into tMyLines
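As a rough illustration of what such a call returns, the sketch below builds two small lists by hand and passes them to commonLines. The handler name testCommonLines and the sample data are made up for this example; it assumes the commonLines function from this lesson is available in the message path.

```livecode
## hypothetical test handler, not part of the lesson's stack
on testCommonLines
	local tBaseList, tNewList, tMyLines
	put "apple" & return & "banana" & return & "cherry" into tBaseList
	put "banana" & return & "date" & return & "apple" into tNewList

	put commonLines(tBaseList, tNewList) into tMyLines

	## the keys of an array come back in no guaranteed order,
	## so sort the result if the order matters
	sort lines of tMyLines
	answer tMyLines
end testCommonLines
```

With this data the common lines are "apple" and "banana"; "cherry" and "date" each appear in only one list and are left out.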

Creating arrays from the lists

The first thing the handler does is to create an array of the lines in the pFirstList parameter. You might think that we could use the split command to do this, but the split command would make each line into the content of an array element. Instead, we want each line to be the key of an element. The content of each element doesn't matter for this handler. This example puts true into each element; we could put any value into each element, and the handler would work the same way.
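To make the difference between "line as content" and "line as key" concrete, here is a small sketch using the split command. The handler name compareSplitForms and the sample list are hypothetical. The second half assumes that your LiveCode version supports the "as set" form of split, which is documented as producing an array whose keys are the list items and whose values are true; if it is not available, the repeat loop used in this lesson gives the same result.

```livecode
## hypothetical demonstration handler
on compareSplitForms
	local tList, tByIndex, tAsSet, tIndexKeys, tSetKeys
	put "red" & return & "green" & return & "red" into tList

	## plain split: the keys are the numbers 1, 2, 3
	## and each line becomes the content of an element
	put tList into tByIndex
	split tByIndex by return
	put the keys of tByIndex into tIndexKeys
	sort lines of tIndexKeys numeric

	## "as set" form (assumed available): the keys are the lines
	## themselves, so the repeated "red" collapses into one key
	put tList into tAsSet
	split tAsSet by return as set
	put the keys of tAsSet into tSetKeys
	sort lines of tSetKeys

	answer "Plain split keys:" & return & tIndexKeys & return & return & \
			"Set keys:" & return & tSetKeys
end compareSplitForms
```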

To create this array, we use the repeat control structure to make an element for each line in the pFirstList. The element's key is the text in the line, and its content is the true constant. If a line appears more than once in the list, the array will contain only one element for that line, instead of one for each repetition, because an array can't have two elements with the same key. But this is fine: since the function is looking for common lines, we don't care whether a line is repeated in the pFirstList, only whether it appears at all.

repeat for each line tLine in pFirstList
	put true into tFirstArray[tLine]
end repeat
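The point about repeated lines can be checked with a short sketch. The handler name showDuplicateKeys and the sample list are invented for illustration; the loop is the same one used above.

```livecode
## hypothetical demonstration handler
on showDuplicateKeys
	local tList, tArray, tLineCount, tKeyCount
	put "a" & return & "b" & return & "a" into tList

	repeat for each line tLine in tList
		## the second "a" overwrites the existing element
		## instead of creating a new one
		put true into tArray[tLine]
	end repeat

	put the number of lines in tList into tLineCount
	put the number of lines in (the keys of tArray) into tKeyCount
	answer tLineCount && "lines in the list," && tKeyCount && "keys in the array"
end showDuplicateKeys
```

Three lines go in, but the array ends up with only two keys, "a" and "b".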

Next, the handler creates an array from the pSecondList parameter, in the same way as it used the pFirstList. We now have two arrays, and the keys of each array are the lines in the corresponding list.

repeat for each line tLine in pSecondList
	put true into tSecondArray[tLine]
end repeat

Finding the common lines

To find the common lines, we need a list of all the elements whose keys appear in both arrays. This is exactly what the intersect command does: given two arrays, it removes from the first array every element whose key is not also a key in the second array.

After the intersect command is executed, the tFirstArray variable contains only the elements whose keys are in both arrays. Each key is the text of a common line, so the function returns the list of keys in this array.

intersect tFirstArray with tSecondArray
   
## return the corresponding lines of text:
return the keys of tFirstArray
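To see the intersect command on its own, the following sketch builds two tiny arrays directly instead of from lists. The handler name showIntersect and the keys used are hypothetical.

```livecode
## hypothetical demonstration handler
on showIntersect
	local tFirstArray, tSecondArray
	put true into tFirstArray["apple"]
	put true into tFirstArray["banana"]
	put true into tSecondArray["banana"]
	put true into tSecondArray["cherry"]

	## removes "apple" from tFirstArray because tSecondArray
	## has no element with that key; tSecondArray is not changed
	intersect tFirstArray with tSecondArray

	## only the shared key, "banana", is left
	answer the keys of tFirstArray
end showIntersect
```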

This handler takes advantage of LiveCode's ability to create and work with associative arrays. In many languages a basic array can only be indexed by integers, but in LiveCode you can use any string as a key. In this example, that capability lets us treat the keys of the array, not just the contents, as meaningful data. The technique of using the intersect command depends on it.

The commonLines function code

This function belongs in the card script.

function commonLines pFirstList, pSecondList
	local tFirstArray, tSecondArray
	## create an array for each list
	## the array keys are the lines in the list:
   
	repeat for each line tLine in pFirstList
		put true into tFirstArray[tLine]
	end repeat
   
	repeat for each line tLine in pSecondList
		put true into tSecondArray[tLine]
	end repeat
   
	## retain only elements that are found in both arrays:
	intersect tFirstArray with tSecondArray
   
	## return the corresponding lines of text:
	return the keys of tFirstArray
end commonLines

The "Common Lines" button code

on mouseUp
	answer commonLines(field "list1",field "list2")
end mouseUp

A note on efficiency

You might wonder whether the array operations in this example (constructing the two arrays and executing the intersect command) make it slower than an approach like the following, which uses a chunk expression to check whether each line in pFirstList is also in pSecondList:

repeat for each line tLine in pFirstList
	if tLine is among the lines of pSecondList then
		put tLine & return after tCommonLines
	end if
end repeat

While array operations typically have some overhead, it turns out that the approach used in this example is much faster than the one illustrated above if the lists are longer than ten lines or so. If the pFirstList is ten lines long, the two approaches are approximately equal (within a factor of two), but as the pFirstList grows in length, the approach using arrays quickly gains a speed advantage.
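If you want to check this on your own machine, a rough timing harness might look like the sketch below. The handler name compareApproaches, the list size of 2000 lines, and the generated test data are all arbitrary choices made for this example; it assumes the commonLines function from this lesson is available in the message path. Actual timings will vary with hardware and LiveCode version.

```livecode
## hypothetical timing handler, not part of the lesson's stack
on compareApproaches
	local tFirstList, tSecondList, tIndex, tStart
	local tArrayTime, tChunkTime, tCommonLines, tResult

	## build two test lists of 2000 lines that overlap by half
	repeat with tIndex = 1 to 2000
		put "line" && tIndex & return after tFirstList
		put "line" && (tIndex + 1000) & return after tSecondList
	end repeat
	delete the last char of tFirstList
	delete the last char of tSecondList

	## time the array approach
	put the milliseconds into tStart
	put commonLines(tFirstList, tSecondList) into tResult
	put the milliseconds - tStart into tArrayTime

	## time the chunk expression approach
	put the milliseconds into tStart
	repeat for each line tLine in tFirstList
		if tLine is among the lines of tSecondList then
			put tLine & return after tCommonLines
		end if
	end repeat
	put the milliseconds - tStart into tChunkTime

	answer "Arrays:" && tArrayTime && "ms" & return & \
			"Chunk expressions:" && tChunkTime && "ms"
end compareApproaches
```

Changing the loop limit lets you see how the gap between the two timings changes as the lists grow.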
