Allowing for TSV file

Jul 25, 2014 at 5:15 PM
Hi,

I have converted our Excel formatted files to CSV and have found that cells with commas in the values are not handled. For instance, if I have a cell with a comma separated list of email addresses, or HTTP addresses, the comma is seen as a delimiter, instead of part of the value.

So, Instead of just telling you about the problem (like normal) I've downloaded the code and done a couple of minor modifications, here is what and why.

First off, I saved our Excel formatted file as a Tab Separated text file.
  1. When this happens, Excel wraps Quotes around these kind of cells. So when a search and replace happens, the quotes are inserted right along with the list of values.
  2. As mentioned above the commas in the cell, cause configgen to mess us, misaligning the values/headers.
What I did:
  1. In CsvSettingsLoader.cs, I changed your line.Split(new[] { ',' }), to a line.Split(new[] { '\t' })
  2. To solve the quotes issue I add a couple of Trim calls to the function MachineConfigurationSettings. settings.Add(new ConfigurationSetting(columnList[i], rowValues[i].TrimStart('"').TrimEnd('"')));
I think there should be either a command line flag to identify a TSV file, or the code should self-determine the file type based on file extension.

I'm not a developer by trade, (Release Engineer) so my code skills are very redimentary and I don't understand how to add that flag. I also thought about renaming my edited file as TsvSettingsLoader.cs, but I don't know how to incorporate that into the solution and make it all work.

I'd be happy to send you my updated file for you to incorporate, just not sure how best to do so.

Thanks,

Mike
Jul 28, 2014 at 10:05 PM
Edited Jul 28, 2014 at 10:06 PM
Hi Mike,

Thanks for spending time on this. After reading your message and seeing you'd been forced to switch to tabs (and also being mindful of the other CSV related issues you've had), I've had another look at the CSV parsing.

The practical upshot is that I've replaced the rather naive csv parsing implementation for a 3rd party library that does the job a lot better.

This now means that it supports commas and newlines in values, as long as they are properly quoted (which they are if exported from Excel).

If you get a chance, maybe you'd like to try this version out: it is commit "a0823ac71217".

If this works for you, I'll package it up as a proper release.

Thanks for your patience with this!

Kind Regards,

Rob
Jul 29, 2014 at 5:20 PM
Hi Rob,

This does not seem to be working for our CSV. I did a fresh save-as from our Excel file to Text (Tab Delimited) format.
Excel saves this as a Text file extension which caused CFG.exe to error. If I change the extension to CSV manually, then it loads the file.
However, the Headers and Fields are not getting parsed, and I'm getting an "Index out of bounds" error, in the ProcessRow function at:

var machineName = rowValues[0];
var configFilePath = rowValues[1];

Here is a copy of the TSV file.

Thanks,

Mike





Jul 29, 2014 at 5:25 PM
Hi Mike,

Sorry - I could have been a bit clearer: this supports comma csv files only, not tabs. My understanding was that you'd only switched to tabs because of problems with commas in fields. Please correct me if I've misunderstood.

Having said that, I think the library I am using can be made to support tab csvs ("tsv") too, so if you needed the files to be tab based, maybe I could try that.

Thanks,

Rob

P.S. There was no file attached to your post
Jul 29, 2014 at 5:33 PM
Ok, I also misunderstood, I'll try to CSV format with quotes, etc..


Jul 29, 2014 at 5:40 PM
Good news! When saved as a straight-up CSV this works as advertised. Very nice!



Jul 29, 2014 at 5:45 PM
Hi Mike,

Great - thanks for testing this, and your general patience.

I packaged a release last night that incorporates this change: v1.0.7.

You can either download it from the Downloads page here on Codeplex, or you download it via NuGet.

Kind Regards,

Rob
Jul 29, 2014 at 5:59 PM
Happy to help...