Problem processing csv file

Hi,

I'm having problems processing a csv file.
When opening the file in Excel, I get the correct value in all columns.
When viewing the file in notepad, I see the values are correct.
Then reading the file with Flexcell, I get the correct value for the first 2 rows in column x but for the 3th row, I get a complete different value.

I would like to send screenshots and the actual file, but I'm not able to when creating a topic apparently.

Dominique


I'm processing the file with the following open code: Xls.Open(WorkFile, TFileFormats.Text, ';', 1, 1, nil, Win1252Encoding, true);

Dominique

Hi,


This is normally encoding problems, but it is hard to say more without looking at the file. Can you send me one file showing the problem to adrian@tmssoftware.com ?


Hi,

I tried it with the file you sent me and the code:



begin
  xls := TXlsFile.Create( true);
  try
    Win1252Encoding := TEncoding.GetEncoding(1252);
    try
      Xls.Open('r:\cda-export.csv', TFileFormats.Text, ';', 1, 1, nil, Win1252Encoding, true);
      xls.Save('r:\test.xlsx');
    finally
      Win1252Encoding.Free;
    end;
  finally
    xls.Free;
  end;
end;


And the file seems correct. Values in column x seem ok, for example X11 is 861.99 in both the CSV and the file created by FlexCel.

The column that isn't correct with the code above is for example column BI. So for example BI11 is 

"Entreprise de pompes funèbres (à  concurrence du montant de la facture)"

But this is because the file is not in Win32, but UTF-8.

With this code:


begin
  xls := TXlsFile.Create( true);
  try
    UTF8Encoding := TEncoding.UTF8;
    try
      Xls.Open('r:\cda-export.csv', TFileFormats.Text, ';', 1, 1, nil, UTF8Encoding, true);
      xls.Save('r:\test.xlsx');
    finally
      //Don't free built-in encodings. You only have to free those created with GetEncoding;
    end;
  finally
    xls.Free;
  end;
end;



The file seems ok. The only thing I can think is that you have different decimal separators from the file, so FlexCel is considering 861.99 as 86199 (as if the . was a thousands separator).

If that is the problem you are seeing, you need to tell FlexCel which decimal separator the file uses. 
You can do that with Push/PopThreadFormat, as explained here:
http://www.tmssoftware.biz/flexcel/doc/vcl/tips/how-to-change-the-flexcel-locale.html

If it is a different problem, let me know what you see in column X.

Ps: An unrelated issue, but this file could be read much faster by specifying the date format used. Make sure to read http://www.tmssoftware.biz/flexcel/doc/vcl/guides/performance-guide.html#reading-csv-files

This was solved by email, but for anyone else looking, the problem was with the decimal separator.

The file used a "." as decimal separator, but the app was using ",".

So the number "325.04" was interpreted as a date (april 1, year 325), and that's why the strange values.
The fix implied setting the locale to english (as explained in http://www.tmssoftware.biz/flexcel/doc/vcl/tips/how-to-change-the-flexcel-locale.html:



  newFmt := TFormatSettings.Create('en-US');


  oldFmt := TFlexCelFormatSettings.PushThreadFormat('', newFmt);
  try
    xls := TXlsFile.Create(true);
    try
      Xls.Open('r:\cda-export.csv', TFileFormats.Text, ';', 1, 1, nil, TEncoding.UTF8, true);
      xls.Save('cda-test-result.xlsx');
    finally
      TFlexCelFormatSettings.PopThreadFormat(oldFmt);
    end;
  finally
    xls.Free;
  end;




This is one of the reasons it is a good idea to manually specify the date formats you expect if you know them. If we had specified that dates are in yyyy-mm-dd format for example (as they were in this file), FlexCel wouldn't have tried to convert to a yy.mm format. It still wouldn't be right without changing the local (because now it would be a string, not a number) but better. And manually specifying date formats makes the loading much faster.

As usual, all gotchas explained at http://www.tmssoftware.biz/flexcel/doc/vcl/tips/understanding-csv-files.html apply. My advise is the same as always: when possible, avoid csv files. Use JSON, XML, or whatever format that has some formal specification.