Split CSV by column value, and keep header
This has been asked many times before but I simply can't implement the solutions properly. I have a large csv named 2017-01.csv, with a date column (it's the second column in the file) and I am splitting the file by date. The original file looks like:
date
2017-01-01
2017-01-01
2017-01-01
2017-01-02
2017-01-02
2017-01-02
After the split, 2017-01-01.csv looks like
2017-01-01
2017-01-01
2017-01-01
and 2017-01-02.csv looks like
2017-01-02
2017-01-02
2017-01-02
The code I am using is
awk -F ',' '{print > (""$2".csv")}' 2017-01.csv
Everything works fine but I need to keep the header row. So I tried
awk -F ',' 'NR==1; NR > 1{print > (""$2".csv")}' 2017-01.csv
But I still get the same results without the header row. What am I doing wrong? I read answers to many similar questions on Stackoverflow but I just can't understand what they are doing.
I want this:
2017-01-01.csv should look like
date
2017-01-01
2017-01-01
2017-01-01
2017-01-02.csv should look like
date
2017-01-02
2017-01-02
2017-01-02
I have edited it now.
– PythonGuy
7 mins ago
Your input and output file names look the same. Is that a typo or intentional? Please confirm.
– RavinderSingh13
7 mins ago
I have edited it again to make it clear. The input and output files are different. Let me know if it makes sense now. Thanks.
– PythonGuy
5 mins ago
Please check my answer and let me know if it helps.
– RavinderSingh13
1 min ago
1 Answer
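Could you please try the following? I haven't tested it, and it assumes your Input_file is sorted by date. In your attempt, NR==1; has no redirection, so the header is printed to standard output only and never reaches the per-date files. Here the header is saved from the first line and written at the top of each new output file.
awk -F ',' 'NR==1{hdr=$0; next} $2!=prev{if(file) close(file); file=$2".csv"; print hdr > file; prev=$2} {print > file}' 2017-01.csv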
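If the input is not guaranteed to be sorted by date, a variant that remembers which output files have already been started may help. This is only a minimal sketch: the two-column sample file written by printf is hypothetical (the question does not show the full layout), and the names hdr, out and seen are made up for illustration. Each file is closed after every write and reopened with >> so that the number of open files stays small.

```shell
# Hypothetical sample input; the date is assumed to be the second column, as in the question.
printf '%s\n' 'id,date' '1,2017-01-01' '2,2017-01-02' '3,2017-01-01' > 2017-01.csv

awk -F ',' '
NR == 1 { hdr = $0; next }      # remember the header line; do not treat it as data
{
    out = $2 ".csv"             # one output file per date value
    if (!(out in seen)) {       # first row for this date: start the file with the header
        print hdr > out         # ">" truncates any file left over from an earlier run
        close(out)
        seen[out] = 1
    }
    print >> out                # append the data row
    close(out)                  # keep the number of open files small
}' 2017-01.csv
```

Closing after every row is slower than keeping the files open, so for a sorted file the one-liner above is probably the better fit.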
Please post expected results too in code tags.
– RavinderSingh13
13 mins ago