10 May
2009
8:09 p.m.
At my university we are working on a financial engineering project. We obtained trade data from the New York Stock Exchange. The data is laid out as YYYYMMDD/Symbol: there are about 5000 YYYYMMDD directories and about 9000 symbols. Each Symbol is a TSV or CSV file with 20 fields and about 30k lines on average.

Clearly, I can't put all of this into one table called Symbol (or can I?). That would come to about 5000*9000*30000 = 1350000000000 rows in a single table :-) I think that's crazy :-)

I was wondering if it's possible to partition the tables by YYYY or MM. Or maybe someone here has better ideas on how to normalize the data.

Any thoughts or ideas? TIA
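
To put numbers on the partitioning question, here is a rough sizing sketch in Python. The directory, symbol, and line counts are the approximate figures quoted above. The helper name `estimate_rows` and the figure of 252 trading days per year are assumptions for illustration only. It also assumes every symbol has a file in every directory, so the results are probably upper bounds.

```python
# Rough sizing sketch. Counts are the approximate figures from the post;
# TRADING_DAYS_PER_YEAR and the helper name are assumptions, not from the data.
DAYS = 5_000                  # YYYYMMDD directories
SYMBOLS = 9_000               # symbol files per directory (assumed present every day)
ROWS_PER_FILE = 30_000        # average lines per symbol file
TRADING_DAYS_PER_YEAR = 252   # assumption: typical count of trading days in a year


def estimate_rows(days: int, symbols: int, rows_per_file: int) -> int:
    """Upper-bound row count if every symbol has a file on every day."""
    return days * symbols * rows_per_file


total_rows = estimate_rows(DAYS, SYMBOLS, ROWS_PER_FILE)
rows_per_day = estimate_rows(1, SYMBOLS, ROWS_PER_FILE)
rows_per_year = estimate_rows(TRADING_DAYS_PER_YEAR, SYMBOLS, ROWS_PER_FILE)
rows_per_month = rows_per_year // 12

print(f"single table:        {total_rows:,} rows")
print(f"per-day partition:   {rows_per_day:,} rows")
print(f"per-month partition: {rows_per_month:,} rows")
print(f"per-year partition:  {rows_per_year:,} rows")
```

Under these assumptions even a yearly partition holds tens of billions of rows, so the choice between YYYY and MM mostly depends on how many partitions the database can handle and on whether queries usually filter by date or by symbol.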