How do I convert tab delimited data into an array?

Converting tab-delimited data to an array can be confusing at first. This lesson walks through a few ways to go about it: a simple two-column table, reordering the columns of tabular data, and a more complex example that builds sub-arrays.

You can download the sample stack from this url: https://tinyurl.com/y9k9h3m7

Create some fake tabbed data

To start off, I have created this example data and put it into a stack. Normally one would of course have pre-existing data, but this'll do for starters.

Name      Type
Calvin    Boy
Fritz     Cat
Garfield  Cat
Jerry     Mouse
Mickey    Mouse
Snoopy    Dog
Tintin    Adult

This is the most basic kind of tabbed data: a header row that names the two columns, followed by the rows of content.

Note: If you are copying and pasting this example, then you have to insert the tabs manually.

Split it all up

To put that into an array, the most obvious way is to use split. The following example script does just that, and then checks whether it worked by putting one element of the resulting array into the message box. Note the two delimiters in the split line: "by return" specifies the delimiter between elements (one element per line), and "and tab" tells LiveCode which delimiter separates the key of each element from its content.

on mouseUp
   put field 1 into theData
   delete line 1 of theData -- do not want header row
   split theData by return and tab
   put theData["fritz"] -- shows "Cat" in the message box
end mouseUp

Note: "return" is also the keyword used to return a value from a handler, which is why the script editor colours it differently from "tab". Don't let the colouring confuse you.

Important: This only works if every entry in the first column is unique, because each key can exist only once in an array. If the first column were numbered from 1 to 9 but contained two 3s, only one of those two lines would end up in your array!
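One way to guard against this is to compare the number of lines before the split with the number of array elements afterwards. The following is only a sketch: it assumes the data contains no blank lines, and the variable name theLineCount and the wording of the message are made up for illustration.

```livecode
on mouseUp
   put field 1 into theData
   delete line 1 of theData -- do not want header row
   -- remember how many lines there were before splitting
   put the number of lines of theData into theLineCount
   split theData by return and tab
   -- every line should have produced its own key
   if the number of elements of theData < theLineCount then
      answer "Some entries in the first column are not unique."
   else
      put theData["fritz"]
   end if
end mouseUp
```

If the warning appears, the first column is not a safe choice for the keys, and a different column or a repeat loop is the better route.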

So what about some more columns?

That was easy... but what if there are several columns of data? In such a case, it is best to know a bit more about the data. Here we have a quite common case, created in thousands of Excel files all over the world, every hour of the day.

Number  Name      Type   Source         Year
1       Fritz     Cat    Fritz the Cat  1965
2       Mickey    Mouse  Mickey Mouse   1928
3       Snoopy    Dog    the Peanuts    1950
4       Jerry     Mouse  Tom & Jerry    1940
5       Garfield  Cat    Garfield       1978

Note: If you are copying and pasting this example, then you have to insert the tabs manually.

I said it is best to know the data. That is because one has to decide beforehand how to put it into an array. Will there be a cat/mouse/dog array that contains the entries as sub-arrays? Maybe one wants the numbers as keys, because they're unique, with everything else retained as tab-delimited text. Possibly one just wants to split the columns or rows for further processing or reordering. The possibilities are endless.
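As a sketch of the second option, the same split as in the first example can be applied to the wider table. Because split treats only the first tab on each line as the boundary between key and content, the Number becomes the key and the rest of the line stays tab-delimited text. This assumes field 2 holds the table above with real tabs between the columns.

```livecode
on mouseUp
   put field 2 into theData
   delete line 1 of theData -- do not want header row
   -- the first tab on each line separates key from content,
   -- so the Number column becomes the key
   split theData by return and tab
   -- the content is still tab delimited, so items can pick it apart
   set the itemDelimiter to tab
   put item 1 of theData[3] -- the Name stored under number 3
end mouseUp
```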

Let us assume we want an array keyed by Type, with the Names as sub-keys that in turn contain everything else. But first we will look at how to reorder columns by using split and combine. We want the Number, Source and Year columns to sit next to each other, so that we can later put them into our arrays in one piece. In addition, we want the Type and the Name to be at the beginning. Swapping the Number and Type columns achieves both. Luckily, we can use the "by column" form of split to do that.

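on mouseUp
   put field 2 into theData
   split theData by column -- one element per column
   -- swap column 1 (Number) and column 3 (Type)
   put theData[1] into temp
   put theData[3] into theData[1]
   put temp into theData[3]
   combine theData by column -- back to tab delimited text
   put theData
end mouseUp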

Sub-arrays a plenty

To create the array, we could try to use the fact that split always treats the first occurrence of the second delimiter on each line as the boundary between key and content, just as we did in the first step. But right now that would be bad, because several lines are cats or mice. If we split now, only one of the mice and one of the cats would survive, because an array cannot have several keys with the same name. That is why we use a repeat loop and items instead.

global theResult
on mouseUp
   -- first reordering
   put field 2 into theData
   split theData by column
   put theData[1] into temp
   put theData[3] into theData[1]
   put temp into theData[3]
   combine theData by column
   -- do not want description row
   delete line 1 of theData
   -- now the repeat loop
   set the itemDelimiter to tab
   repeat for each line theLine in theData
      put item 3 to -1 of theLine into \
            theResult[item 1 of theLine][item 2 of theLine]
   end repeat
   -- shows the content of one of the subarrays
   put theResult["mouse"]["mickey"]
end mouseUp
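Each entry stored this way is still tab-delimited text holding the Number, Source and Year, in that order. A small sketch of reading the pieces back out follows. It assumes the handler above has already run so that the global theResult is filled, and the variable name theEntry is made up for illustration.

```livecode
global theResult
on mouseUp
   set the itemDelimiter to tab
   put theResult["cat"]["garfield"] into theEntry
   -- after the reordering, each entry holds Number, Source and Year
   put "Source:" && item 2 of theEntry & return & \
         "Year:" && item 3 of theEntry
end mouseUp
```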

Check it out

Try looking at different parts of the array. The easiest way to do that is to declare the output variable as a global (as we did above), because then you'll be able to access it from the message box and, even better, from the "variables" tab of the script editor. There you can browse our newly built array and get a feel for where things got stored.
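If you prefer a quick overview in the message box, a loop over the keys can list what is stored under each Type. This is only a sketch: it assumes the global theResult has been filled by the script above, the names theType, theNames and theReport are made up for illustration, and the order in which the keys come back is not guaranteed.

```livecode
global theResult
on mouseUp
   put empty into theReport
   -- list every Type with the Names stored beneath it
   repeat for each key theType in theResult
      put keys(theResult[theType]) into theNames
      replace return with ", " in theNames
      put theType & ":" && theNames & return after theReport
   end repeat
   put theReport
end mouseUp
```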

5 Comments

lestroso

Hy, this example for me is a little complicated but well explained.
I hope that you runrev can continue to explain example like this, because i need to become expert developer runrev user to make shareware.
i appreciate very much your work.

thanks,

lestroso

www.fasasoftware.com

arie van der ent

Tried the first part of this example. Wanted to try 'split' and 'combine' as used in this example. It did not work.

Elanor Buchanan

Hi Arie, did you try the example in a stack of your own or using the example stack given in the lesson?

To use split and combine you need to specify what the item and row delimiters are. In the first part of the example the text data has one element on each line and the key and value of each element are split by a tab so we use

split theData by return and tab

You can combine array data in the same way.

See the Dictionary entries

http://docs.runrev.com/Command/split
http://docs.runrev.com/Command/combine

I hope this helps.

Elanor

Mark

I think "temporary" is a reserved word in LC and can't be used as a var name as in the 2nd code example. (But "temp", as used in the downloadable sample stack, works just fine).

Panos Merakos

Thanks for spotting this, Mark. I'll update the 2nd code example now.

Cheers,
Panos
