assuming bash

users=`cat in.txt | cut -f2 -d'|'`
for i in $users
do
  userrecs=`cat in.txt | grep $i`
  yesc=`echo $userrecs | cut -f5 -d'|' | grep -c YES`
  noc=`echo $userrecs | cut -f5 -d'|' | grep -c NO`
  nac=`echo $userrecs | cut -f5 -d'|' | grep -c NA`
  echo "$i $yesc $noc $nac"
done


On Sat, Feb 27, 2010 at 6:09 PM, Tony Likhanga <tlikhanga@gmail.com> wrote:
Hi folks,

I have tonnes of CSV files whose sample content is as shown below:

Choice|123456780|20|1|YES|Y|2010-02-26 10:42:51.239772|
Choice|123456781|10|1|NO|Y|2010-02-26 12:42:51.243603|
Choice|123456782|20|1|NA|Y|2010-02-26 14:42:51.247464|
Choice|123456783|10|1|YES|Y|2010-02-26 14:42:51.251320|
Choice|123456780|20|1|OK|Y|2010-02-26 15:42:51.255169|

My aim is to extract the contents of the 2nd field alongside the count of similar choice result in 5th field.

Desired output:

UserId        Occurences (YES)
======     ===============
123456780   465
123456781    25

I managed to achieve this on a linux box running mysql using the steps below (which can of course be wrapped into a script):

1. grep  YES raw_files | awk 'BEGIN{FS=OFS="|"}{print $2}' > /tmp/in_file
2. Log in to mysql and perform the steps:
3. mysql> create table nums(num varchar(16));
4. mysql> load data infile '/tmp/in_file' into table nums(num);
5. mysql> select num,count(*) thecnt from nums group by num;

I'm still exploring how to achieve the same using some (simple?) script, possibly not involving the database bit. Any takers?
NB: The only bit we know before-hand is desired choice result (YES/NO/NA).

Thanks.
Tony. 


_______________________________________________
Skunkworks mailing list
Skunkworks@lists.my.co.ke
http://lists.my.co.ke/cgi-bin/mailman/listinfo/skunkworks
------------
Skunkworks Server donations spreadsheet
http://spreadsheets.google.com/ccc?key=0AopdHkqSqKL-dHlQVTMxU1VBdU1BSWJxdy1fbjAwOUE&hl=en
------------
Skunkworks Rules
http://my.co.ke/phpbb/viewtopic.php?f=24&t=94
------------
Other services @ http://my.co.ke
Other lists
-------------
Announce: http://lists.my.co.ke/cgi-bin/mailman/listinfo/skunkworks-announce
Science:  http://lists.my.co.ke/cgi-bin/mailman/listinfo/science
kazi:     http://lists.my.co.ke/cgi-bin/mailman/admin/kazi/general