If you're working with CSV file format, chances are you will be doing an import at some point. Moving data is, after all, one of the
core uses for CSV!
Everyone who has imported data has run into import errors - it doesn't matter if you are a first-timer, or a seasoned pro doing their 10,000th import. Data can be messy, and so it doesn't always import flawlessly.
When you're importing with CSV, there are some very common issues that, regardless of the data contained in your file, you may run into and have to solve. You may, of course, still run into errors we don't cover - those that have to do with your actual data
within the file. Those can be unique to your data, the database setup where you are importing into, the particular app or software you are importing to, and so on. But, at least you'll have a head-start on troubleshooting some of those universal issues that you may encounter.
Why, you might ask, if CSV is so simple and universally loved, do we even encounter errors? Well, CSV is not actually a standardized format. If you recall from our review of
delimiters, there are quite a few varying methods of creating a CSV. And that is only one component of the file! Because you have choices on your delimiters, your encoding, etc. - there is always a chance that your file may be a bit "off" from what your import tool was expecting.
Here, we'll cover our suggested guidelines to follow in order to have a well-formatted CSV file. The result is that you will have a CSV file that can be easily read by most programs/apps, and a reduced chance of errors popping up during your import process.
1) Encoding trouble Depending on the software you are using, you may get an actual error to the effect of "Your file is not in UTF-8 encoding" (or similar). There is also the possibility, however, that you do not get any error at all, and can only see the issue when you look at the data you imported. If it's missing special characters, or the text/characters look like alien hieroglyphics, this is almost always an encoding problem. Go back to your CSV file, and ensure that it is
UTF-8 encoded.
2) File type errors You may have created your file in a Text Editor, and forgot to save as ".CSV" file type (instead, it may be a ".TXT" file). Or, perhaps you were using Excel and didn't save as CSV, but just left as ".XLS". The reasons may vary, but if you ever get an error message to the effect of "not a valid file type" - go back and check that your file was
created with a ".CSV" extension.
3) Header row headache This is a very common error, and you'll likely see it in some formation of "missing header columns", "invalid header" or "no header found" and so on. This is your red flag that the header row in your file is either a) missing or b) improperly formatted. The latter is the more common scenario. Things like extra spaces or mismatched number of header columns vs. the columns with data within them, etc. will cause this error to be thrown.
And if you're working with Excel or another spreadsheet program to create your CSV, you may have to check even more potential culprits - like hidden columns, or merged cells and so on. The takeaway - make sure that you have a header row and that it is clean and tidy! This means, trying to avoid using characters other than letters, digits, and underscores in a header. And, make sure the header row is delimited in the same way as the rest of the file.
4) Empty columns and/or rows This one can be a bit trickier to nail down. Typically if you receive an error to the effect of "empty column found" (or similar) it is because you have a column header, with no data in that column. Or, you may have data within the column, but no header labeling it. This can also be a result of simply having extra columns in the worksheet if you created the
CSV in Excel, and Excel may mistakenly think you have data housed there - even if all you see is blank. You can remedy this by deleting any extra columns from your file, and ensuring that the columns that are left have headers and data contained within them.
On the flip side, if you have an empty row error, you can similarly check that your file does not contain any extraneous rows with no data in them - delete those out and you should be good to go!
Remember, your CSV file is designed to be simplistic in structure. Each row should resemble a lockstep pattern of: text, delimiter, text, delimiter, text, delimiter. Every line should be separated by a clear line break. That's it! There is no deviation from this pattern, no matter what dataset you are working with. Keep in mind that every CSV row is expected to have the same structure in order for an import tool to recognize the data coming in. That means, making sure that each row has the same number of fields in the same order (following the order of the header), and using the same delimiters.
If you can follow these guidelines, your CSV file should "pass" the test of a well-formatted file for any importer you may be working with.