Allocating "only" about 2.8 GB does not ring alarms that the plan night be degenerated - assuming your data is not tiny.
Could you share:OS, full query, data sizes, free disk space at time of failure, mserver5 process size at time of failure?
Thanks,Stefan
-------- Original message --------From: Anthony Damico
hi, this query fails after i run nearly-identical queries on similar
tables, so looking at #3791, something is lopsided in this particular table
but not the others, i guess. here's the query:
CREATE TABLE ccaed123_b_svcmon_xwalk AS
( SELECT
a.enrolid ,
a.yr ,
a.datatyp ,
a.svcmon AS ccaed123_b_svcmon ,
( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS(
a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff
FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b
ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp =
b.datatyp
GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon
) WITH DATA
disk drive has 5.46 TB of free space. windows task manager says 603 GB of
768 GB of RAM are taken, but the R process only says it's holding 26GB of
RAM. when i shutdown the database, the 603GB of RAM goes to less than
100GB (used by unrelated processes). then i re-start the database and get
the same error, so i doubt this is resource-related?
row counts of the two tables being merged:
> dbGetQuery( db , "SELECT COUNT(*) FROM ccaed123_b_monthly" )
L3
1 70024781
> dbGetQuery( db , "SELECT COUNT(*) FROM ccaet123_b_st_wt" )
L3
1 204029495
here's the locale:
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 R2 x64 (build 9600)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] haven_1.1.0 stringr_1.2.0 snow_0.4-2
MonetDBLite_0.5.0
[5] DBI_0.7 sqldf_0.4-11 RSQLite_2.0 gsubfn_0.6-6
[9] proto_1.0.0 reshape2_1.4.2 R.utils_2.6.0 R.oo_1.21.0
[13] R.methodsS3_1.7.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.13 bindr_0.1 magrittr_1.5 bit_1.1-12
[5] R6_2.2.2 rlang_0.1.2 dplyr_0.7.4 blob_1.1.0
[9] plyr_1.8.4 tcltk_3.4.2 tools_3.4.2 assertthat_0.2.0
[13] bit64_0.9-7 digest_0.6.12 tibble_1.3.4 bindrcpp_0.2
[17] codetools_0.2-15 glue_1.2.0 memoise_1.1.0 stringi_1.1.5
[21] compiler_3.4.2 forcats_0.2.0 chron_2.3-51 pkgconfig_2.0.1
thanks for looking at this
On Wed, Nov 22, 2017 at 7:00 AM, Stefan Manegold
Allocating "only" about 2.8 GB does not ring alarms that the plan night be degenerated - assuming your data is not tiny.
Could you share: OS, full query, data sizes, free disk space at time of failure, mserver5 process size at time of failure?
Thanks, Stefan
-------- Original message -------- From: Anthony Damico
Date: 11/22/17 12:13 (GMT+01:00) To: Communication channel for MonetDB users Subject: algebra.join heapextend crash hi, i'm using monetdblite-r 0.5.0 on a confidential dataset. my syntax worked on monetdblite 0.3.1 on the same machine
ftruncate: Invalid argument Error in .local(conn, statement, ...) : Unable to execute statement 'CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.dataty...'. Server says 'MALException:algebra.join:GDK reported error. !ERROR: GDKextendf: c !ERROR: HEAPextend: failed to extend to 2890530816 for 02\57\25766.tail: GDKmre map() failed '.
it's a local drive, the hard disk isn't close to full and the machine has 700gb of ram, but the table it's trying to create might be bigger than that. dunno if related to https://www.monetdb.org/ bugzilla/show_bug.cgi?id=3791
thanks
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Anthony, given that it's hard for us to analyse / debug without being able to reproduce, did you / could you simplify / "strip-down" you query to see when it fails? E.g., you could try the following series of reduced versions and alternatives of your query, and report which work and which don't? Thanks! Best, Stefan ======== -- just the core join SELECT count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp ; -- core join and grouping SELECT count(*) FROM ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) AS t; -- create same table, but only with simple count aggregation CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA; -- avoid CASE statement; rather add condition to join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp AND a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA; -- likewise, but with condition as extra filter rather than in join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp WHERE a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA; -- almost the original query, but avoid ABS() CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN b.svcmon - a.svcmon ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA; -- original query CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA; ======== ----- On Nov 23, 2017, at 1:10 AM, Anthony Damico ajdamico@gmail.com wrote:
hi, this query fails after i run nearly-identical queries on similar tables, so looking at #3791, something is lopsided in this particular table but not the others, i guess. here's the query:
CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA
disk drive has 5.46 TB of free space. windows task manager says 603 GB of 768 GB of RAM are taken, but the R process only says it's holding 26GB of RAM. when i shutdown the database, the 603GB of RAM goes to less than 100GB (used by unrelated processes). then i re-start the database and get the same error, so i doubt this is resource-related?
row counts of the two tables being merged:
dbGetQuery( db , "SELECT COUNT(*) FROM ccaed123_b_monthly" ) L3 1 70024781 dbGetQuery( db , "SELECT COUNT(*) FROM ccaet123_b_st_wt" ) L3 1 204029495
here's the locale:
sessionInfo() R version 3.4.2 (2017-09-28) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2012 R2 x64 (build 9600)
Matrix products: default
locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] haven_1.1.0 stringr_1.2.0 snow_0.4-2 MonetDBLite_0.5.0 [5] DBI_0.7 sqldf_0.4-11 RSQLite_2.0 gsubfn_0.6-6 [9] proto_1.0.0 reshape2_1.4.2 R.utils_2.6.0 R.oo_1.21.0 [13] R.methodsS3_1.7.1
loaded via a namespace (and not attached): [1] Rcpp_0.12.13 bindr_0.1 magrittr_1.5 bit_1.1-12 [5] R6_2.2.2 rlang_0.1.2 dplyr_0.7.4 blob_1.1.0 [9] plyr_1.8.4 tcltk_3.4.2 tools_3.4.2 assertthat_0.2.0 [13] bit64_0.9-7 digest_0.6.12 tibble_1.3.4 bindrcpp_0.2 [17] codetools_0.2-15 glue_1.2.0 memoise_1.1.0 stringi_1.1.5 [21] compiler_3.4.2 forcats_0.2.0 chron_2.3-51 pkgconfig_2.0.1
thanks for looking at this
On Wed, Nov 22, 2017 at 7:00 AM, Stefan Manegold < stefan.manegold@cwi.nl > wrote:
Allocating "only" about 2.8 GB does not ring alarms that the plan night be degenerated - assuming your data is not tiny.
Could you share: OS, full query, data sizes, free disk space at time of failure, mserver5 process size at time of failure?
Thanks, Stefan
-------- Original message -------- From: Anthony Damico < ajdamico@gmail.com > Date: 11/22/17 12:13 (GMT+01:00) To: Communication channel for MonetDB users < users-list@monetdb.org > Subject: algebra.join heapextend crash
hi, i'm using monetdblite-r 0.5.0 on a confidential dataset. my syntax worked on monetdblite 0.3.1 on the same machine
ftruncate: Invalid argument Error in .local(conn, statement, ...) : Unable to execute statement 'CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.dataty...'. Server says 'MALException:algebra.join:GDK reported error. !ERROR: GDKextendf: c !ERROR: HEAPextend: failed to extend to 2890530816 for 02\57\25766.tail: GDKmre map() failed '.
it's a local drive, the hard disk isn't close to full and the machine has 700gb of ram, but the table it's trying to create might be bigger than that. dunno if related to https://www.monetdb.org/bugzilla/show_bug.cgi?id=3791
thanks
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
hi, i'm sorry Stefan, i appreciate you taking the time to write all of this
out. reproducing the point of the crash requires a weeklong importation
program - i had to re-run it reverting to monetdblite 0.3.1 instead, i
don't have enough disk space to host both databases. sorry not to be more
helpful..
On Wed, Nov 29, 2017 at 3:58 PM, Stefan Manegold
Anthony,
given that it's hard for us to analyse / debug without being able to reproduce, did you / could you simplify / "strip-down" you query to see when it fails?
E.g., you could try the following series of reduced versions and alternatives of your query, and report which work and which don't?
Thanks!
Best, Stefan
========
-- just the core join SELECT count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp ;
-- core join and grouping SELECT count(*) FROM ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) AS t;
-- create same table, but only with simple count aggregation CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- avoid CASE statement; rather add condition to join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp AND a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- likewise, but with condition as extra filter rather than in join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp WHERE a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- almost the original query, but avoid ABS() CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN b.svcmon - a.svcmon ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- original query CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
========
----- On Nov 23, 2017, at 1:10 AM, Anthony Damico ajdamico@gmail.com wrote:
hi, this query fails after i run nearly-identical queries on similar tables, so looking at #3791, something is lopsided in this particular table but not the others, i guess. here's the query:
CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA
disk drive has 5.46 TB of free space. windows task manager says 603 GB of 768 GB of RAM are taken, but the R process only says it's holding 26GB of RAM. when i shutdown the database, the 603GB of RAM goes to less than 100GB (used by unrelated processes). then i re-start the database and get the same error, so i doubt this is resource-related?
row counts of the two tables being merged:
dbGetQuery( db , "SELECT COUNT(*) FROM ccaed123_b_monthly" ) L3 1 70024781 dbGetQuery( db , "SELECT COUNT(*) FROM ccaet123_b_st_wt" ) L3 1 204029495
here's the locale:
sessionInfo() R version 3.4.2 (2017-09-28) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2012 R2 x64 (build 9600)
Matrix products: default
locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] haven_1.1.0 stringr_1.2.0 snow_0.4-2 MonetDBLite_0.5.0 [5] DBI_0.7 sqldf_0.4-11 RSQLite_2.0 gsubfn_0.6-6 [9] proto_1.0.0 reshape2_1.4.2 R.utils_2.6.0 R.oo_1.21.0 [13] R.methodsS3_1.7.1
loaded via a namespace (and not attached): [1] Rcpp_0.12.13 bindr_0.1 magrittr_1.5 bit_1.1-12 [5] R6_2.2.2 rlang_0.1.2 dplyr_0.7.4 blob_1.1.0 [9] plyr_1.8.4 tcltk_3.4.2 tools_3.4.2 assertthat_0.2.0 [13] bit64_0.9-7 digest_0.6.12 tibble_1.3.4 bindrcpp_0.2 [17] codetools_0.2-15 glue_1.2.0 memoise_1.1.0 stringi_1.1.5 [21] compiler_3.4.2 forcats_0.2.0 chron_2.3-51 pkgconfig_2.0.1
thanks for looking at this
On Wed, Nov 22, 2017 at 7:00 AM, Stefan Manegold < stefan.manegold@cwi.nl > wrote:
Allocating "only" about 2.8 GB does not ring alarms that the plan night be degenerated - assuming your data is not tiny.
Could you share: OS, full query, data sizes, free disk space at time of failure, mserver5 process size at time of failure?
Thanks, Stefan
-------- Original message -------- From: Anthony Damico < ajdamico@gmail.com > Date: 11/22/17 12:13 (GMT+01:00) To: Communication channel for MonetDB users < users-list@monetdb.org > Subject: algebra.join heapextend crash
hi, i'm using monetdblite-r 0.5.0 on a confidential dataset. my syntax worked on monetdblite 0.3.1 on the same machine
ftruncate: Invalid argument Error in .local(conn, statement, ...) : Unable to execute statement 'CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.dataty...'. Server says 'MALException:algebra.join:GDK reported error. !ERROR: GDKextendf: c !ERROR: HEAPextend: failed to extend to 2890530816 for 02\57\25766.tail: GDKmre map() failed '.
it's a local drive, the hard disk isn't close to full and the machine has 700gb of ram, but the table it's trying to create might be bigger than that. dunno if related to https://www.monetdb.org/bugzilla/show_bug.cgi?id=3791
thanks
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hi Anthony, Hi all, I seem to experience a similar problem running MonetDB Jul2017 (changeset 68357599b75f) compile from HG sources on Fedora 26 Linux on a machine with 2x 8 core (+HT) CPUs and 256 GB RAM having the DBfarm on a 42 TB RAID, of which 37 TB are available, I "suddenly" get pymonetdb.exceptions.OperationalError: GDK reported error: GDKmalloc_internal: failed for 185864192 bytes !MT_mmap: mmap(/scratch24r1/manegold/DBfarm/TataSteel5/bat/14/46/144603.tail,185925632) failed !!OS: Cannot allocate memory !HEAPalloc: Insufficient space for HEAP of 185864192 bytes. While the database has grow from empty to ~200 GB, there are as said still 37 TB free on the partition where the DBfarm (under monetdbd / merovingian) is located ... My workload is inserting data in chunks into 9 tables concurrently using both INSERT INTO (on 6 tables) and COPY LOADER INTO (on 2 tables). I have the suspiscion (though not confirmation, yet) that the problem might be related to growing tables / files, and thus possibly to swithcing from malloc to memeory mapping ... ... to be investigated ... Best, Stefan ----- On Nov 29, 2017, at 9:58 PM, Stefan Manegold Stefan.Manegold@cwi.nl wrote:
Anthony,
given that it's hard for us to analyse / debug without being able to reproduce, did you / could you simplify / "strip-down" you query to see when it fails?
E.g., you could try the following series of reduced versions and alternatives of your query, and report which work and which don't?
Thanks!
Best, Stefan
========
-- just the core join SELECT count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp ;
-- core join and grouping SELECT count(*) FROM ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) AS t;
-- create same table, but only with simple count aggregation CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , count(*) FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- avoid CASE statement; rather add condition to join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp AND a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- likewise, but with condition as extra filter rather than in join predicate CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( b.svcmon - a.svcmon ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp WHERE a.svcmon <= b.svcmon GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- almost the original query, but avoid ABS() CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN b.svcmon - a.svcmon ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
-- original query CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA;
========
----- On Nov 23, 2017, at 1:10 AM, Anthony Damico ajdamico@gmail.com wrote:
hi, this query fails after i run nearly-identical queries on similar tables, so looking at #3791, something is lopsided in this particular table but not the others, i guess. here's the query:
CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.datatyp , a.svcmon AS ccaed123_b_svcmon , ( MIN( CAST( ( CASE WHEN a.svcmon <= b.svcmon THEN ABS( a.svcmon - b.svcmon ) ELSE NULL END ) AS INTEGER ) ) ) AS svcmon_diff FROM ccaed123_b_monthly AS a INNER JOIN ccaet123_b_st_wt AS b ON a.enrolid = b.enrolid AND a.yr = b.yr AND a.datatyp = b.datatyp GROUP BY a.enrolid , a.yr , a.datatyp , ccaed123_b_svcmon ) WITH DATA
disk drive has 5.46 TB of free space. windows task manager says 603 GB of 768 GB of RAM are taken, but the R process only says it's holding 26GB of RAM. when i shutdown the database, the 603GB of RAM goes to less than 100GB (used by unrelated processes). then i re-start the database and get the same error, so i doubt this is resource-related?
row counts of the two tables being merged:
dbGetQuery( db , "SELECT COUNT(*) FROM ccaed123_b_monthly" ) L3 1 70024781 dbGetQuery( db , "SELECT COUNT(*) FROM ccaet123_b_st_wt" ) L3 1 204029495
here's the locale:
sessionInfo() R version 3.4.2 (2017-09-28) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2012 R2 x64 (build 9600)
Matrix products: default
locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] haven_1.1.0 stringr_1.2.0 snow_0.4-2 MonetDBLite_0.5.0 [5] DBI_0.7 sqldf_0.4-11 RSQLite_2.0 gsubfn_0.6-6 [9] proto_1.0.0 reshape2_1.4.2 R.utils_2.6.0 R.oo_1.21.0 [13] R.methodsS3_1.7.1
loaded via a namespace (and not attached): [1] Rcpp_0.12.13 bindr_0.1 magrittr_1.5 bit_1.1-12 [5] R6_2.2.2 rlang_0.1.2 dplyr_0.7.4 blob_1.1.0 [9] plyr_1.8.4 tcltk_3.4.2 tools_3.4.2 assertthat_0.2.0 [13] bit64_0.9-7 digest_0.6.12 tibble_1.3.4 bindrcpp_0.2 [17] codetools_0.2-15 glue_1.2.0 memoise_1.1.0 stringi_1.1.5 [21] compiler_3.4.2 forcats_0.2.0 chron_2.3-51 pkgconfig_2.0.1
thanks for looking at this
On Wed, Nov 22, 2017 at 7:00 AM, Stefan Manegold < stefan.manegold@cwi.nl > wrote:
Allocating "only" about 2.8 GB does not ring alarms that the plan night be degenerated - assuming your data is not tiny.
Could you share: OS, full query, data sizes, free disk space at time of failure, mserver5 process size at time of failure?
Thanks, Stefan
-------- Original message -------- From: Anthony Damico < ajdamico@gmail.com > Date: 11/22/17 12:13 (GMT+01:00) To: Communication channel for MonetDB users < users-list@monetdb.org > Subject: algebra.join heapextend crash
hi, i'm using monetdblite-r 0.5.0 on a confidential dataset. my syntax worked on monetdblite 0.3.1 on the same machine
ftruncate: Invalid argument Error in .local(conn, statement, ...) : Unable to execute statement 'CREATE TABLE ccaed123_b_svcmon_xwalk AS ( SELECT a.enrolid , a.yr , a.dataty...'. Server says 'MALException:algebra.join:GDK reported error. !ERROR: GDKextendf: c !ERROR: HEAPextend: failed to extend to 2890530816 for 02\57\25766.tail: GDKmre map() failed '.
it's a local drive, the hard disk isn't close to full and the machine has 700gb of ram, but the table it's trying to create might be bigger than that. dunno if related to https://www.monetdb.org/bugzilla/show_bug.cgi?id=3791
thanks
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
participants (2)
-
Anthony Damico
-
Stefan Manegold