
That's one way I'd do it this way 1. install mysql server and configure innodb_buffer_pool_size to be 90% of system memory (i might use someone's server for this) :) 2. create a table with four fields (id - autoinc, unclean - varchar(256), trndate (datetime), seq nos (innodb table to make use of 1 above) 3. index id primary in one index, trndate in another 4. import into the database, throw everything into the unclean field 5. wait :) 6. run query to split for example update mytable set trndate = left(unclean,8), seqno = substr(unclean,9,255) - i've noticed the format is yyyymmdd - makes it easier, you'd have to confirm my statement since i just wrote it from memory 7. wait again 8. then do selects and export to text files - the index on the trndate should make it super fast don't forget to put the indexes otherwise it will take forever to do the queries just a suggestion... and in the end you'll have a nice neat table in case you want to do more perations on it. You can even back it up and zip it for storage! On 13 June 2013 12:54, Bwana Lawi <mail2lawi@gmail.com> wrote:
Hi guys,
I have a file with over 30 million lines (Txn IDs) over a period of 11 years.
I want to delete from that file (it's a flat file, not a db table) lines that dont begin with certain characters.
Or alternatively, create another file for lines that begin with certain characters.
The format of the ID is DATE:<SOME.SEQ.NOs>(Of course, there are multiple txns per day)
So for instance it will have 20080101XXXXXXXX - For transactions done on 1st Jan 2008, 20130613XXXXXX for today's txn, etc
My requirement is to get a list of transactions done say from Jun 2010 to Jun 2011. Therefore, I should somehow either: a) Create a new file by piping output of grep ^2010 and grep ^2011 (Challenge is how to get only for 2010[06,07,08.09,10,11 and 12] and not entire 2010. Same for 2011)
b) Delete the lines that dont match a) above. I dont want to go this route.
Tools available: Ubuntu 12.04 HP-UX 11.31
PS: I don't know if the subject matches my requirement but ...
_______________________________________________ skunkworks mailing list skunkworks@lists.my.co.ke ------------ List info, subscribe/unsubscribe http://lists.my.co.ke/cgi-bin/mailman/listinfo/skunkworks ------------
Skunkworks Rules http://my.co.ke/phpbb/viewtopic.php?f=24&t=94 ------------ Other services @ http://my.co.ke