[MonetDB-users] SQL text normalisation principles in relation to MonetDB
Hi,

I have a rather general question about normalisation in MonetDB in relation to the underlying storage model.

Since my first courses in SQL (on PostgreSQL at the time), the strong advice was to always normalise duplicated data, because looking up numerical keys would always beat string comparisons in search performance. Given that in MonetDB text strings are deduplicated on a best-effort basis, and thus effectively become numbers, how does this compare to the additional cost of:
- enums
- foreign key relations

For example, some database schemas use enums (or worse: varchars) for two values, say 'accessible' and 'inaccessible'. Clearly a BOOLEAN NOT NULL field would suffice here, with a storage size of one bit in the best case. Since MonetDB already optimises the string data internally, would it cost additional time to create a secondary table to join against, versus simply using a string field?

Thus: does normalisation for 'manual' deduplication hurt, or not?

Stefan
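To make the alternatives concrete, a minimal SQL sketch of the schema variants in question (table and column names are purely illustrative, not from any real schema):

    -- Variant A: repeat the string value in every row
    CREATE TABLE resource_a (
        id     INTEGER PRIMARY KEY,
        status VARCHAR(16) NOT NULL   -- 'accessible' or 'inaccessible'
    );

    -- Variant B: normalise "by hand" into a lookup table plus foreign key
    CREATE TABLE status_dict (
        status_id SMALLINT PRIMARY KEY,
        status    VARCHAR(16) NOT NULL UNIQUE
    );
    CREATE TABLE resource_b (
        id        INTEGER PRIMARY KEY,
        status_id SMALLINT NOT NULL REFERENCES status_dict (status_id)
    );

    -- Variant C: for a field with exactly two values, a boolean suffices
    CREATE TABLE resource_c (
        id         INTEGER PRIMARY KEY,
        accessible BOOLEAN NOT NULL
    );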
On Wed, Jul 14, 2010 at 03:17:52PM +0200, Stefan de Konink wrote:
Since my first courses in SQL (on PostgreSQL at the time), the strong advice was to always normalise duplicated data, because looking up numerical keys would always beat string comparisons in search performance.
I can't figure out whether you're actually asking about normalisation or, what sounds more likely to me, about natural vs. surrogate keys. You seem to mention aspects of both, and that's confusing things for me. What to use for keys depends very much on the use case, and there aren't any hard and fast rules here. I *personally* prefer to use natural keys wherever possible, but it depends a lot on the use case, and sometimes surrogate keys are a better fit for the problem domain. I certainly used to use surrogate keys more often than I do now.

--
Sam  http://samason.me.uk/
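To illustrate the natural vs. surrogate key distinction, a hypothetical example (not from the thread):

    -- Natural key: a real-world attribute identifies the row
    CREATE TABLE country_nat (
        iso_code CHAR(2) PRIMARY KEY,      -- e.g. 'NL'
        name     VARCHAR(64) NOT NULL
    );

    -- Surrogate key: an artificial identifier stands in for it
    CREATE TABLE country_sur (
        country_id INTEGER PRIMARY KEY,    -- meaningless outside the database
        iso_code   CHAR(2) NOT NULL UNIQUE,
        name       VARCHAR(64) NOT NULL
    );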
On Wed, Jul 14, 2010 at 04:13:54PM +0100, Sam Mason wrote:
On Wed, Jul 14, 2010 at 03:17:52PM +0200, Stefan de Konink wrote:
Since my first courses in SQL (on PostgreSQL at the time), the strong advice was to always normalise duplicated data, because looking up numerical keys would always beat string comparisons in search performance.
I can't figure out whether you're actually asking about normalisation or, what sounds more likely to me, about natural vs. surrogate keys. You seem to mention aspects of both, and that's confusing things for me. What to use for keys depends very much on the use case, and there aren't any hard and fast rules here. I *personally* prefer to use natural keys wherever possible, but it depends a lot on the use case, and sometimes surrogate keys are a better fit for the problem domain. I certainly used to use surrogate keys more often than I do now.
As far as I understand Stefan d.K.'s description and question, the point is neither normalisation (at least not as it is defined in database theory: with only one (string) column, there are no functional dependencies and hence no need, nor chance, for normalisation) nor keys, but rather (something like) dictionary encoding, and the question whether to rely on what the underlying DBMS (MonetDB) does automatically, or rather to do it "by hand" in the schema design (and query formulation).

To be honest, I cannot answer the "ultimate" question "Thus, does normalisation for 'manual' deduplication hurt, or not?" for Stefan d.K.'s particular case. If "hurt" refers to performance, there are too many unknown parameters that play a role; if "hurt" refers to the added complexity of the schema and hence the queries, it depends on the user's SQL skills and "pain threshold".

However, in general (assuming "normalisation" indeed refers to "dictionary encoding"): if MonetDB's "best-effort" duplicate elimination, and hence effectively dictionary encoding, works well (this highly depends on your actual data, i.e., the actual values and value distribution of your string column), then MonetDB will store each string value exactly once, plus only the respective index per tuple (which can be as small as 1 byte if there are only very few distinct string values). In that case, I would not be surprised if the automatic, transparent, internal dictionary encoding works more efficiently (in terms of both query performance and schema/query complexity) than doing it in SQL "by hand". But to be completely sure for the very case at hand, only a proper, realistic experiment will give the real answer.

Stefan M.

--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4199       |
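A sketch of such an experiment, assuming the illustrative resource_a / resource_b / status_dict tables from the first message, loaded with identical data:

    -- Query the plain string column; MonetDB's internal
    -- dictionary encoding deduplicates the stored strings.
    SELECT COUNT(*) FROM resource_a WHERE status = 'accessible';

    -- The same question against the hand-normalised schema
    -- costs one extra join.
    SELECT COUNT(*)
    FROM resource_b AS r
    JOIN status_dict AS d ON r.status_id = d.status_id
    WHERE d.status = 'accessible';

Timing both on data with the actual values and value distribution is the only way to know which variant wins for a given workload.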
participants (3)
- Sam Mason
- Stefan de Konink
- Stefan Manegold