kks32/compare_columns_file.sh

## compare_columns_file.sh
# Assumes comma-separated input: load labnumber -> date from file2.csv, then append the date to each matching row of file1.csv.
awk -F',' 'NR==FNR{date[$1]=$2;next} ($2 in date){print $0 "," date[$2]}' file2.csv file1.csv > file3.csv
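
The lookup can be easier to follow when spread over several lines. The following is a minimal sketch, assuming both files are comma-separated as the `-F','` option implies. The scratch directory and the here-documents are hypothetical scaffolding added so the example runs on its own; they are not part of the original script.

```shell
# Assumption: comma-separated inputs with the layouts shown in this gist.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory, only for this demo
cd "$workdir"

cat > file1.csv <<'EOF'
dbID,labnumber,myID,Status
CMV_1235,LAB06,56-1,Fail
CMV_1236,LAB14,57-1,Fail
CMV_2137,LAB84,54-4,Pass
CMV_2238,LAB85,50-3,
CMV_C131,LAB21,51-2,Pass
EOF

cat > file2.csv <<'EOF'
labnumber,date
LAB06,18/01/2016
LAB14,27/04/2016
LAB18,10/01/2016
LAB21,9/02/2016
LAB69,4/03/2016
LAB84,18/02/2016
LAB22,18/03/2016
LAB85,27/03/2016
EOF

awk -F',' '
    # First file on the command line (file2.csv): store the date per labnumber.
    NR == FNR { date[$1] = $2; next }
    # Second file (file1.csv): keep rows whose labnumber was stored, append the date.
    $2 in date { print $0 "," date[$2] }
' file2.csv file1.csv > file3.csv

cat file3.csv
```

Because the header of `file2.csv` is stored like any other row (`labnumber` maps to `date`), the header of `file1.csv` also matches and comes out with `date` appended, so `file3.csv` keeps a header line without special handling. Rows are written in the order they appear in `file1.csv`.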

## compare_columns_in_file_question.txt
#Question
> I need to match strings between the two files and print to a third file. Data look like this:

#File 1
dbID    labnumber    myID    Status
CMV_1235    LAB06    56-1    Fail
CMV_1236    LAB14    57-1    Fail
CMV_2137    LAB84    54-4    Pass
CMV_2238    LAB85    50-3
CMV_C131    LAB21    51-2    Pass

#File 2
labnumber    date
LAB06    18/01/2016
LAB14    27/04/2016
LAB18    10/01/2016
LAB21    9/02/2016
LAB69    4/03/2016
LAB84    18/02/2016
LAB22    18/03/2016
LAB85    27/03/2016

(Not totally overlapping: there may be samples in file 1 but not file 2 and vice versa)

I want to print to file 3:
dbID    labnumber    myID    Status    date
CMV_1235    LAB06    56-1    Fail      18/01/2016
CMV_1236    LAB14    57-1    Fail      27/04/2016
CMV_2137    LAB84    54-4    Pass      18/02/2016
CMV_2238    LAB85    50-3              27/03/2016
CMV_C131    LAB21    51-2    Pass      9/02/2016

So, if labnumber matches in file 1 and file 2, print the whole line from file 1 followed by the matching date from file 2, into a third file.
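
The matching rule described above is an inner join on `labnumber`, so it can also be sketched with the standard `join` utility instead of `awk`. This is an alternative under stated assumptions, not the script in this gist: it assumes comma-separated files, and the scratch directory and the `*.sorted` file names are hypothetical.

```shell
# Assumption: comma-separated inputs; join(1) is swapped in for the awk lookup.
set -eu
export LC_ALL=C        # make sort and join agree on collation order
workdir=$(mktemp -d)   # hypothetical scratch directory, only for this demo
cd "$workdir"

cat > file1.csv <<'EOF'
dbID,labnumber,myID,Status
CMV_1235,LAB06,56-1,Fail
CMV_1236,LAB14,57-1,Fail
CMV_2137,LAB84,54-4,Pass
CMV_2238,LAB85,50-3,
CMV_C131,LAB21,51-2,Pass
EOF

cat > file2.csv <<'EOF'
labnumber,date
LAB06,18/01/2016
LAB14,27/04/2016
LAB18,10/01/2016
LAB21,9/02/2016
LAB69,4/03/2016
LAB84,18/02/2016
LAB22,18/03/2016
LAB85,27/03/2016
EOF

# join needs both inputs sorted on the join field; drop the headers first.
tail -n +2 file1.csv | sort -t, -k2,2 > file1.sorted
tail -n +2 file2.csv | sort -t, -k1,1 > file2.sorted

{
    # Header: the file 1 header plus the second column name of file 2.
    printf '%s,%s\n' "$(head -n 1 file1.csv)" "$(head -n 1 file2.csv | cut -d, -f2)"
    # Body: match column 2 of file 1 against column 1 of file 2,
    # then emit the four file 1 columns followed by the date.
    join -t, -1 2 -2 1 -o 1.1,1.2,1.3,1.4,2.2 file1.sorted file2.sorted
} > file3.csv

cat file3.csv
```

Unlike the `awk` version, the rows come out sorted by `labnumber` rather than in the order of file 1, and the header has to be handled separately.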

## file1.csv

dbID,labnumber,myID,Status
CMV_1235,LAB06,56-1,Fail
CMV_1236,LAB14,57-1,Fail
CMV_2137,LAB84,54-4,Pass
CMV_2238,LAB85,50-3,
CMV_C131,LAB21,51-2,Pass

## file2.csv

labnumber,date
LAB06,18/01/2016
LAB14,27/04/2016
LAB18,10/01/2016
LAB21,9/02/2016
LAB69,4/03/2016
LAB84,18/02/2016
LAB22,18/03/2016
LAB85,27/03/2016