How do I convert tab delimited data into an array?
Converting tab delimited data to an array can be confusing at first. This lesson walks through a few ways to go about it: a simple two column table, reordering the columns of tabular data, and a more complex example that builds nested arrays.
You can download the sample stack from this url: https://tinyurl.com/y9k9h3m7
Create some fake tabbed data
To start off, I have created this example data and put it into a stack. Normally one would of course have pre-existing data, but this'll do for starters.
Name Type
Calvin Boy
Fritz Cat
Garfield Cat
Jerry Mouse
Mickey Mouse
Snoopy Dog
Tintin Adult
This is the most basic tabbed data: a header row that names the two columns, followed by one line per entry.
Note: If you are copying and pasting this example, then you have to insert the tabs manually.
Split it all up
To put that into an array, the most obvious way is to use split. The following example script does just that, and then checks whether it worked by putting the content stored under one of the resulting array's keys into the message box. Note the two delimiters in the split command: "by return" specifies the delimiter between elements, and "and tab" tells LiveCode what the delimiter between the key and the content of each element is.
on mouseUp
   put field 1 into theData
   delete line 1 of theData -- do not want the header row
   split theData by return and tab -- one element per line, key before the tab
   put theData["fritz"] -- shows "Cat" in the message box
end mouseUp
Note: "return" is not only a constant but also a control structure, which is why it is coloured differently from "tab" in the script editor. Don't let the colouring confuse you.
Important: If your first column contains entries that have the same value, then this will not work. Each key needs to be unique. So if the first column was numbered from 1 to 9 but there were two 3's, then only one of them would end up in your array!
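One way to notice lost entries is to compare the number of lines before the split with the number of keys afterwards. The handler below is only a sketch: it assumes the same field 1 as above, and the variable names lineCount and keyCount are made up for this illustration.

```livecode
on mouseUp
   put field 1 into theData
   delete line 1 of theData -- do not want the header row
   -- remember how many entries we started with
   put the number of lines of theData into lineCount
   split theData by return and tab
   -- every surviving element has exactly one key
   put the number of lines of (the keys of theData) into keyCount
   if keyCount < lineCount then
      put "Duplicate keys: only" && keyCount && "of" && lineCount && "lines survived"
   else
      put "All" && lineCount && "lines became array elements"
   end if
end mouseUp
```

If the message box reports fewer keys than lines, the first column is not unique and a different key column (or a repeat loop) is needed.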
So what about some more columns?
That was easy... but what if there are several columns of data? In such a case, it is best to know a bit more about the data. Here we have a quite common case, created in thousands of Excel files all over the world, every hour of the day.
Number   Name       Type    Source          Year
1        Fritz      Cat     Fritz the Cat   1965
2        Mickey     Mouse   Mickey Mouse    1928
3        Snoopy     Dog     the Peanuts     1950
4        Jerry      Mouse   Tom & Jerry     1940
5        Garfield   Cat     Garfield        1978
Note: If you are copying and pasting this example, then you have to insert the tabs manually.
I said it is necessary to know the data. That is because one has to decide beforehand about how one will put that into an array. Will there be a cat/mouse/dog array that contains the entries as subarrays? Maybe one wants the numbers first, because they're unique, and everything else retained as tab delimited text. Possibly one just wants to split the columns or rows, for further processing or reordering. The possibilities are endless.
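As an illustration of the second option (unique numbers as keys, everything else kept as tab delimited text), here is a minimal sketch. It assumes that field 2 holds the table above with real tabs between the columns; because only the first tab on each line separates the key from the content, the remaining columns stay together as one string.

```livecode
on mouseUp
   put field 2 into theData
   delete line 1 of theData -- do not want the header row
   -- the Number column becomes the key, the rest of the line the content
   split theData by return and tab
   -- the content is still tab delimited, so items work on it
   set the itemDelimiter to tab
   put item 1 of theData[3] -- the Name stored under key 3 (Snoopy in the sample data)
end mouseUp
```

This keeps every line intact, at the cost of having to pick the content apart with items whenever a single column is needed.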
Let us assume we want the Type to contain the Names, which in turn would contain everything else. But first we will look at how to reorder columns by using split and combine. We want the Number, Source and Year columns to follow one another, so that we can later put them into our arrays together. In addition, we want the Type and the Name to be at the beginning. Swapping the Number and Type columns achieves both. Luckily, we can use the "by column" form of split to do that.
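on mouseUp
   put field 2 into theData
   split theData by column -- one element per column, keyed 1 to 5
   -- swap column 1 (Number) with column 3 (Type)
   put theData[1] into temp
   put theData[3] into theData[1]
   put temp into theData[3]
   combine theData by column -- back to tab delimited text
   put theData -- shows the reordered table in the message box
end mouseUp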
Sub-arrays aplenty
To create the array, we could try to use the fact that split always uses the first occurrence of the second delimiter to separate key and content, just as we did in the first step. But right now, that would be bad, because there are several lines that are cats or mice. If we split now, only one of the mice and one of the cats would survive, because an array cannot have several keys with the same name. That is why we are using a repeat loop and items instead.
global theResult

on mouseUp
   -- first reordering
   put field 2 into theData
   split theData by column
   put theData[1] into temp
   put theData[3] into theData[1]
   put temp into theData[3]
   combine theData by column
   -- do not want description row
   delete line 1 of theData
   -- now the repeat loop: Type is the outer key, Name the inner key
   set the itemdelimiter to tab
   repeat for each line theLine in theData
      put item 3 to -1 of theLine into \
            theResult[item 1 of theLine][item 2 of theLine]
   end repeat
   -- shows what is stored for one name: its Number, Source and Year
   put theResult["mouse"]["mickey"]
end mouseUp
Check it out
Try looking at different parts of the array. The easiest way to do that is to declare the output variable as a global (as we did above), because then you'll be able to access it from the message box and, even better, from the "variables" tab of the script editor. There you can "browse" our newfangled array and get a feel for where things got stored.
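If you would rather see the whole structure as text, a second handler can walk through the nested array. The following is a rough sketch and not part of the sample stack: it assumes it sits in another button that is clicked after the handler above has filled the global theResult, and theList, theType and theName are names invented for this example.

```livecode
global theResult

on mouseUp
   put empty into theList
   -- outer keys are the Types, inner keys are the Names
   repeat for each line theType in the keys of theResult
      repeat for each line theName in the keys of theResult[theType]
         put theType & tab & theName & tab & \
               theResult[theType][theName] & return after theList
      end repeat
   end repeat
   -- keys come back in no particular order, so sort for readability
   sort lines of theList
   put theList
end mouseUp
```

Each output line starts with the Type and the Name, followed by the Number, Source and Year that were stored for that entry.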
lestroso
Hi, this example is a little complicated for me, but well explained.
I hope that RunRev can continue to explain examples like this, because I need to become an expert RunRev developer to make shareware.
I appreciate your work very much.
thanks,
lestroso
www.fasasoftware.com
arie van der ent
Tried the first part of this example. Wanted to try 'split' and 'combine' as used in this example. It did not work.
Elanor Buchanan
Hi Arie, did you try the example in a stack of your own or using the example stack given in the lesson?
To use split and combine you need to specify what the item and row delimiters are. In the first part of the example the text data has one element on each line and the key and value of each element are split by a tab so we use
split theData by return and tab
You can combine array data in the same way.
See the Dictionary entries
http://docs.runrev.com/Command/split
http://docs.runrev.com/Command/combine
I hope this helps.
Elanor
Mark
I think "temporary" is a reserved word in LC and can't be used as a var name as in the 2nd code example. (But "temp", as used in the downloadable sample stack, works just fine).
Panos Merakos
Thanks for spotting this, Mark. I'll update the 2nd code example now.
Cheers,
Panos