Friday, March 15, 2013

Importing CSV files

Now this post is very much driven by the fact that there are still plenty of major players in the email world that have never grasped the more advanced methods of exchanging address information and we are continually relying on that old nugget CSV.

It is fine for those that have been using computers for 20 years or more, we used to exchange data between dBase and just about everything else, so CSV is no stranger.  To those that have grown up in the point and click world of Microsoft Windows on the various Mac operating systems, anything that can not be done with a Mouse is foreign territory.

Enter Comma Seperated Values (CSV), a true "blast from the past" having a history that goes back into the 1960s, long before Apple had a computer and well before the mouse was even an idea. CSV is unfortunately a term used to describe a collection of data formats, about the only thing they have in common is that between each pierce of information there is a comma and the information is in "text".


Because there is so much difference between what is called CSV between the programs that generate is (some put text inside double quotes, "This This", others do not. It becomes important to actually have a bit of a visual at the data in the to get some idea of how the data is laid out.  It is to this end that you will often see a recommendation to open the file in a spread sheet application.

The actual data stored in the file looks something like this

 First Name,Last Name,Email Address,Home Phone,Mobile Phone,Business Phone
Willian,Gates,bill@microsoft.com,,,
Rob Roy,mcgregor,rob@mcgregor.com,,,

Developers of Spreadsheet applications have spent many years developing sophisticated import wizards for CSV files so you can tweak the import using options to make the data look like it is in columns under headings.  Ultimately this process results in a "more standardised" data format and your CSV file opening in your spreadsheet looking like a table.


This data layout is generated by Windows Live.  Thunderbird expects there to be a first and last name, but it also expects a display name but Live simply does not include this, so we have to massage the data supplied by windows Live to make things work.  Having the data in LibreOffice it is a fairly simple process to add a column for the display name.  But first things first. You need to save the document as an ODF file, as Libre is stubborn about what you can and can not do with a CSV file.

Right clicking on the letters and the top displayed a menu and you select insert columns from it.

I do this twice, and create two columns on as a temporary display name and the other as the final display name.

Now to get the display name data.  Obviously the information we want is in the first two columns, so we need to include the information from both of these columns into on. ie Concatenate the data.

To do this we use a formula.


For those not used to formulas, once it is entered the first time, you can use the little black square on the bottom right to drag the formula down the list to the end.  What we have now is a formula generated display name, but the cell contains a formula and displays what we want.

The next step is to copy and paste the information from the temp column to the final "Display Name"  when you paste, use the paste special and change the options so all you paste is text.

Now we should have  the Diusplay name in two columns, so right click the Temp Display name column and delete it. so things end up looking like those shown in the image below.

Now  File menu, select "save as" and select the file type of CSV. 
Thunderbird will not do anything but choke if you forget this step.

PostScript
For those using LibreOffice, it automatically changes formulas to the display value when you save as a CSV file. I have no idea is this is the case with other spreadsheet programs.  Needless to say you can skip the copy and paste text part if you are using Libre office, just name the column to "Display Name" in the first place and only create one now column.  For myself, I will continue the two column approach because that way I can see and be sure, I am not relying on the computer program to get it right for me.