--
1G6
52 route de bischwiller
67300
Schiltigheim
Société de Services et de Formations en Logiciels Libres
http://1g6.biz
Tél : 06 64 63 70 35
De: "Stefan Manegold" <Stefan.Manegold@cwi.nl>
À: "Pierre-Adrien Coustillas" <pcoustillas@1g6.biz>
Cc: "Communication channel for MonetDB users" <users-list@monetdb.org>
Envoyé: Lundi 1 Décembre 2014 23:37:59
Objet: Re: inner join very slow
Hi,
assuming uniformly distributed data, i.e.,
each of the 121382 distinct (siteniveau1, chapitreniveau1) combinations occurs on average 1061964254/121382 ~= 8748.9 times in tmp__pierre2, and
each of the 124307 distinct (site, idchapitre1)combinations occurs on average 1331867/124307 ~= 10.7 times in ref__at_lm_chapitres,
and assuming that each of the 121382 distinct (siteniveau1, chapitreniveau1) combinations from tmp__pierre2 finds a match in ref__at_lm_chapitres,
the join result would by about 1061964254 * 10.7 ~= 113630175178 tuples, i.e., about 10 times larger than tmp__pierre2 .
Having said that, could you share some more details and information:
- hardware configuration of your system (CPU, #cores, memory size, disk system)
- MonetDB version
- operating system
and what kind of activities do you observe while this query is running?
How big is your mserver5 process? How high is the CPU load? How high is the I/O load? Is the system swapping? etc.
Best,
Stefan
and assuming that all 121382 distinct (siteniveau1, chapitreniveau1) combinations from
----- Original Message -----
> Hello,
>
>
> join attributes type are :
> site, idchapitre1
> SELECT count(*) FROM tmp__pierre2 SRC INNER JOIN ref__at_lm_chapitres
> SUB ON SRC.siteniveau1(VARCHAR(10)) =SUB.site( VARCHAR(10)) AND
> SRC.chapitreniveau1(VARCHAR(15)) =SUB.idchapitre1( VARCHAR(100)) ;
>
>
> I do not know how can return the join
>
>
>
> sql>select count ,sum(cnt) from (select count as cnt from tmp__pierre2 group
> by siteniveau1, chapitreniveau1) as x;
> +--------+------------+
> | L2 | L3 |
> +========+============+
> | 121382 | 1061964254 |
> +--------+------------+
> 1 tuple (54.1s)
>
>
> sql>select count , sum(cnt) from (select count as cnt from
> ref__at_lm_chapitres group by site, idchapitre1) as x;
> +--------+---------+
> | L2 | L3 |
> +========+=========+
> | 124307 | 1331867 |
> +--------+---------+
> 1 tuple (144.736ms)
>
>
> how to interpret this result ?
>
>
>
>
> Pierre
>
> --
> 1G6
> 52 route de bischwiller
> 67300 Schiltigheim
> Société de Services et de Formations en Logiciels Libres
> http://1g6.biz
> Tél : 06 64 63 70 35
>
> ----- Mail original -----
>
> De: "Stefan Manegold" <Stefan.Manegold@cwi.nl>
> À: "Communication channel for MonetDB users" <users-list@monetdb.org>,
> "Pierre-Adrien Coustillas" <pcoustillas@1g6.biz>
> Envoyé: Lundi 1 Décembre 2014 20:28:48
> Objet: Re: inner join very slow
>
>
> Of what type(s) are the join attributes?
>
> How big is the expected join result?
>
> If you don't know,
> select count(*), sum(cnt) from (select count(*) as cnt from tmp__pierre2
> group by siteniveau1, chapitreniveau1) as x;
> select count(*), sum(cnt) from (select count(*) as cnt from
> ref__at_lm_chapitres group by site, idchapitre1) as x;
> might help to give some estimation / idea ...
>
> Best,
> Stefan
>
> On December 1, 2014 6:16:11 PM CET, Pierre-Adrien Coustillas
> <pcoustillas@1g6.biz> wrote:
> >Hello,
> >
> >
> >With this query :
> >SELECT count(*) FROM tmp__pierre2 SRC INNER JOIN ref__at_lm_chapitres
> >SUB ON SRC.siteniveau1=SUB.site AND
> >SRC.chapitreniveau1=SUB.idchapitre1;
> >
> >
> >SELECT count(*) FROM tmp__pierre2;
> >+------------+
> >| L1 |
> >+============+
> >| 1 061 964 254 |
> >
> >SELECT count(*) FROM ref__at_lm_chapitres;
> >+---------+
> >| L1 |
> >+=========+
> >| 1 331 867 |
> >
> >
> >after several hours of running the query is still not finished and I
> >stopped
> >
> >
> >how to find out where the problem?
> >
> >
> >thanks
> >
> >
> >Pierre
> >
> >
> >--
> >1G6
> >52 route de bischwiller
> >67300 Schiltigheim
> >Société de Services et de Formations en Logiciels Libres
> >http://1g6.biz
> >Tél : 06 64 63 70 35
> >
> >
> >
> >------------------------------------------------------------------------
> >
> >_______________________________________________
> >users-list mailing list
> >users-list@monetdb.org
> >https://www.monetdb.org/mailman/listinfo/users-list
>
>
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |