Combining CSVs with the same number of rows in Linux

Today I had to clean up a CSV as well as add some columns to it. This ended up being much harder than I thought it would be.

I eventually found a solution that uses only common Unix tools.

To understand the solution, suppose you have 3 CSVs with the same number of rows:

The first CSV supplies the first column of the combined CSV. It holds the ID of each row:

ID
32713c83-68ce-4219-bbfb-f9217fcf8a2c
372c4390-2e7b-47bd-99ac-0df60b13363d
a1787f9e-b6a0-43b4-8ba1-5e7be72cf8ca

We have another CSV with the actual data:

Name;Surname;Age
John;Smith;23
Jane;Doe;36
Harry;Smith;44

We have one last CSV with data we want to add to the end. In this case we are prepping data to be used with a Postgres COPY command:

Date_added;Date_Updated
now();null
now();null
now();null

Pretend the 3 files are named file-1.csv, file-2.csv and file-3.csv respectively.
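Because the files are matched up line by line, it can be worth confirming that they really do have the same number of lines before combining them. Below is a minimal sketch of such a check. The function name same_line_count is hypothetical and not part of any standard tool; it only wraps wc -l.

```shell
# Hypothetical helper: succeeds only if every file given has the same line count.
# Note: wc -l counts newline characters, so a file whose last line lacks a
# trailing newline is reported as one line shorter.
same_line_count() {
  expected=""
  for f in "$@"; do
    # Bail out early if a file is missing or unreadable.
    [ -r "$f" ] || { echo "cannot read: $f" >&2; return 2; }
    # Strip the padding some wc implementations add around the number.
    count=$(wc -l < "$f" | tr -d ' ')
    if [ -z "$expected" ]; then
      expected=$count
    elif [ "$count" -ne "$expected" ]; then
      echo "line count mismatch: $f has $count, expected $expected" >&2
      return 1
    fi
  done
  return 0
}
```

With the example files, it could be called as same_line_count file-1.csv file-2.csv file-3.csv before running the combine step.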

After a bunch of Googling, the easiest way I found to combine them is below:

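paste -d ';' file-1.csv file-2.csv file-3.csv > combined.csv
  1. We specify that ; is the delimiter placed between the columns of each file
  2. We provide the file names in the order we wish to combine their columns
  3. We write the combined rows to a new CSV file called combined.csv. Using > rather than >> means rerunning the command replaces the file instead of appending duplicate rows to it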
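To see the whole thing end to end, here is a small sketch that recreates the three example files and combines them. It assumes you run it in an empty scratch directory, since it writes file-1.csv, file-2.csv, file-3.csv and combined.csv into the current directory.

```shell
# Recreate the three example files (one printf argument per line).
printf '%s\n' 'ID' \
  '32713c83-68ce-4219-bbfb-f9217fcf8a2c' \
  '372c4390-2e7b-47bd-99ac-0df60b13363d' \
  'a1787f9e-b6a0-43b4-8ba1-5e7be72cf8ca' > file-1.csv
printf '%s\n' 'Name;Surname;Age' 'John;Smith;23' 'Jane;Doe;36' 'Harry;Smith;44' > file-2.csv
printf '%s\n' 'Date_added;Date_Updated' 'now();null' 'now();null' 'now();null' > file-3.csv

# Join the files line by line, putting ';' between each file's columns.
paste -d ';' file-1.csv file-2.csv file-3.csv > combined.csv

cat combined.csv
# ID;Name;Surname;Age;Date_added;Date_Updated
# 32713c83-68ce-4219-bbfb-f9217fcf8a2c;John;Smith;23;now();null
# 372c4390-2e7b-47bd-99ac-0df60b13363d;Jane;Doe;36;now();null
# a1787f9e-b6a0-43b4-8ba1-5e7be72cf8ca;Harry;Smith;44;now();null
```

Note that paste pairs lines purely by position. It does not look at the ID or any other value, so the rows must already be in the same order in every file.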

Reference

Merging contents of multiple .csv files into single .csv file