On Wed, Aug 6, 2008 at 3:40 PM, Niels Nes
On Wed, Aug 06, 2008 at 03:16:34PM +0200, Stefan Manegold wrote:
On Wed, Aug 06, 2008 at 03:16:28AM -0700, Cronje Fourie wrote:
Hi Stefan...
All your assumptions are correct. - Both systems use a single SATA disk, SWAP and data partition on same disks - Compiled the MonetDB, from Jun-SuperTar, with options: --enable-sql --enable-optimise --nightly-stable - gcc etc are all latest on both machines, gcc 4.2.3 - compiled with 64bit OID's on 64bit machine. - I'm loading plain text data - Including the create table statement at the bottom, it's a whopper :) a couple of date fields are loaded as VARCHAR because the source data format is not compatible with MonetDB.
~ 200 attributes --- "impressive" ;-)
On the load side the VM usage went up to 11GB...on the 64bit system. I can't remember the exact value on the 32bit system but was around 3GB for VM.
Total VM usage of the system, or only of the mserver5 process?
Assuming that all data type you use occupy the same storage with both 32- & 64-bit systems (Niels might comment on this one), the (only?) difference between 32- & 64-bit systems is the size of OIDs and hash values used by MonetDB internally. By default, they are 32-bit on 32-bit systems and 64-bit on 64-bit systems --- the latter obviously occupying more (double the) space than the former. Given that the table doesn't have any 'key constraints', the loading should use the exact same amount of memory. Only during processing (queries or key constraint checking) the hash and oid values play a role. Indeed the other types should on 32 and 64 bit systems be of equal size.
Niels
This is becoming interesting:) Since Niels is saying that the 64-bit oid/hash values play no role during "copy into", then the problem might be indeed in Ubuntu vs KUbuntu versions, as Stefan mentioned. It will be nice to see what the results are with the 64-bit installation of ubuntu, but I believe that it will work fine since I am using also 64-bit ubuntu in my laptop and have never had problems. Officially the difference between ubuntu and kubuntu is only the window manager, ie. gnome vs kde.
At configure time, you can decide to build a 64-bit MonetDB with 32-bit OIDs (and hence hash values) (which --- in short --- will restrict tables to at most 2 billion records as on 32-bit systems --- but that should not be a problem given your 6 million records, right?) by calling configure with option "--enable-oid32". When using the monetdb-install.sh script, you have to resort to the "developers options/features" (cf., `monetdb-install.sh --devhelp`), e.g., as follows
EXTRA_ECONF=--enable-oid32 monetdb-install.sh ...
This might (should) reduce the memory footprint of mserver5 considerably.
One major difference between the two systems is that the 32bit system is a SERVER install, vs full KDE on the 64bit.
I'm now trying to run the same load on a 32bit system, which also has KDE installed and it is also much slower than on the server. Incidentally VM usage on this machine is 1GB
Hm, in case "server" vs. "KDE" make the difference, it's either that KDE consumes very (too?) much memory, leaving only little for other programs (as, e.g., MonetDB), or there's "something strange" going on that "blocks" MonetDB --- I would have no idea what that could be ...
I'm downloading the ubuntu server image at the moment and will install and test on that shortly, then get back to you with the results.
Yes, that would be great. I am / we are curious the hear what you find out!
Stefan
CREATE TABLE PrimalPool ( id10 char(10) NOT NULL, idno varchar(13) NOT NULL, cell_phone varchar(20) NULL, combined_expenses_monthly double NULL, combined_income_monthly double NULL, customer_age_id integer NULL, deceased tinyint NULL, demographic char(10) NULL, do_not_mail tinyint NULL, dob varchar(50) NULL, email varchar(150) NULL, emperica_score integer NULL, empscore_date varchar(50) NULL, ethnicity_id char(1) NULL, forename_1 varchar(20) NULL, forename_2 varchar(20) NULL, forename_3 varchar(20) NULL, gender_id char(1) NULL, home_language char(15) NULL, home_phone varchar(20) NULL, initials varchar(3) NULL, language char(1) NULL, marital_status_id char(1) NULL, market_sector_id tinyint NULL, prospect_age tinyint NULL, surname varchar(50) NULL, title_id varchar(20) NULL, work_phone varchar(20) NULL, bond_rent_payments integer NULL, bureau_installment_monthly double NULL, expense_bond_monthly double NULL, expense_maintenance_monthly double NULL, expense_municipal_services_monthly double NULL, income_tax_monthly double NULL, maint_payments integer NULL, municipal_payments integer NULL, net_income_monthly double NULL, pension_contrib_monthly double NULL, salary_gross_monthly double NULL, salary_other_monthly double NULL, statutory_expenses_monthly double NULL, uif_contrib_monthly double NULL, sbcscore integer NULL, sbcscore_date varchar(50) NULL, source_id integer NULL, account_charge_off tinyint NULL, avg_account_age integer NULL, avg_amount_offered integer NULL, avg_bscore integer NULL, avg_bscore_range integer NULL, avg_credit_limit integer NULL, avg_hilife_balance integer NULL, avg_income integer NULL, has_bad_exclusion_score tinyint NULL, has_negative_block_code tinyint NULL, has_payment_projection_score tinyint NULL, has_staff_loan tinyint NULL, invalid_logo_1stloan tinyint NULL, mail_date varchar(50) NULL, mailing_indicator tinyint NULL, mailing_key tinyint NULL, max_account_age integer NULL, max_agreement_date varchar(50) NULL, max_amount_offered integer NULL, max_bscore integer NULL, max_bscore_date varchar(50) NULL, max_bscore_range integer NULL, max_credit_limit integer NULL, max_hilife_balance integer NULL, max_income integer NULL, max_open_date varchar(50) NULL, min_account_age integer NULL, min_agreement_date varchar(50) NULL, min_amount_offered integer NULL, min_bscore integer NULL, min_bscore_date varchar(50) NULL, min_bscore_range integer NULL, min_credit_limit integer NULL, min_hilife_balance integer NULL, min_income integer NULL, min_open_date varchar(50) NULL, no_accounts tinyint NULL, no_active_accounts tinyint NULL, no_inactive_accounts tinyint NULL, rcs_offer_campaign_no varchar(20) NULL, rcs_offer_expiry_date varchar(50) NULL, rcs_offer_interest_rate tinyint NULL, sum_contract_delinquency integer NULL, sum_payment_recency tinyint NULL, sum_return_mail integer NULL, sum_warning_code integer NULL, account_active char(1) NULL, account_age integer NULL, account_id bigint NULL, account_status char(3) NULL, agreement_date varchar(50) NULL, amount_offered double NULL, application_id varchar(20) NULL, block_code_1 varchar(20) NULL, block_code_2 varchar(20) NULL, bscore integer NULL, bscore_date varchar(50) NULL, bscore_range varchar(20) NULL, campaign_no varchar(20) NULL, contract_date varchar(50) NULL, contract_delinquency tinyint NULL, credit_limit double NULL, crlife_admin double NULL, current_balance double NULL, customer_id char(30) NULL, entity_id integer NULL, expiry_date varchar(50) NULL, group_id integer NULL, hilife_balance double NULL, income double NULL, installment_value double NULL, interest_rate double NULL, logo_id varchar(3) NULL, open_date varchar(50) NULL, palink_type varchar(20) NULL, payment_recency integer NULL, payout_amount double NULL, payout_date varchar(50) NULL, physical_address_id char(32) NULL, principal_value double NULL, prodcatid integer NULL, res_address_id char(32) NULL, return_mail integer NULL, risk_rating varchar(20) NULL, tag_date varchar(50) NULL, term_period tinyint NULL, warning_code tinyint NULL, warning_code1 integer NULL, warning_code2 integer NULL, warning_code3 integer NULL, write_off_balance double NULL, accno varchar(20) NULL, account_no varchar(20) NULL, address_class varchar(1) NULL, address_code varchar(3) NULL, address_id_ia varchar(30) NULL, address_line1 varchar(65) NULL, address_line2 varchar(65) NULL, address_line3 varchar(65) NULL, address_line4 varchar(65) NULL, address_line5 varchar(65) NULL, address_source_date varchar(50) NULL, address_type varchar(3) NULL, address_type_id char(1) NULL, amc varchar(15) NULL, amc_id varchar(10) NULL, building_type varchar(3) NULL, change_status varchar(3) NULL, clean_address_line1 varchar(65) NULL, clean_address_line2 varchar(65) NULL, clean_address_line3 varchar(65) NULL, clean_address_line4 varchar(65) NULL, clean_address_line5 varchar(65) NULL, country varchar(50) NULL, country_code varchar(6) NULL, delivery_code varchar(1) NULL, dpid varchar(20) NULL, lines_in tinyint NULL, lmc varchar(20) NULL, lmc_id varchar(15) NULL, location_type varchar(20) NULL, mpcode varchar(5) NULL, mpname varchar(36) NULL, postcode varchar(10) NULL, region varchar(40) NULL, sec_location_type varchar(20) NULL, spcode varchar(8) NULL, spname varchar(25) NULL, structure_status varchar(3) NULL, suburb varchar(65) NULL, suburb_ethnic_code varchar(2) NULL, suburb_income tinyint NULL, suburb_income_band varchar(2) NULL, suburb_lat double NULL, suburb_long double NULL, tag varchar(40) NULL, town varchar(65) NULL, umc varchar(15) NULL, validity_status varchar(3) NULL, prospect_right bigint null DEFAULT 0, has_email tinyint NULL DEFAULT 0, has_address tinyint NULL DEFAULT 0, has_phone tinyint NULL DEFAULT 0, has_home_phone tinyint NULL DEFAULT 0, has_cell_phone tinyint NULL DEFAULT 0, has_work_phone tinyint NULL DEFAULT 0, chargeoff_date varchar(50) NULL, mloans_index char(30) NULL, filler_space char(1) NULL DEFAULT ' ', filler_empty varchar(1) NULL DEFAULT '', curr_date varchar(50) NULL DEFAULT '', prospect_owner bigint NULL DEFAULT 0, reference_no varchar(20) NULL, c char(30) NULL DEFAULT '0' ) GO
Thanks for the help Cronje
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
--
Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users