On 2013-02-21 18:07, Miguel Ping wrote:
I'm hitting a segfault. Can the following line ever return NULL for non-null 'g'? Can't understand how this happens. I'm grouping a table with only one element:
grps = (const oid *) Tloc(g, BUNfirst(g));
Yes, that can return NULL. If g's tail column is TYPE_void, this will return NULL.
Afterwards, accessing grps throws a segfault:
... prev = grps[0]; // segfaults
+------+------+
| id   | data |
+======+======+
| a    |   11 |
+------+------+
sql>select id, hllagg(data) from test group by id;
(BOOM!) segfaults.
Thanks
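A minimal sketch of how the void-tail case mentioned above could be guarded against. It is self-contained and does not use the real GDK headers: `struct group_col`, `group_at` and the `oid` typedef are hypothetical stand-ins, and the sketch assumes the usual meaning of a void column, namely a virtual dense sequence whose i-th value is the sequence base plus i, with no array behind it.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t oid;   /* stand-in for MonetDB's oid type (assumption) */

/* Hypothetical, simplified view of a group column.
 * `values` is NULL when the column is void (virtual); in that case
 * entry i is taken to be seqbase + i. */
struct group_col {
    const oid *values;
    oid seqbase;
    size_t count;
};

/* Return the group id at position i without dereferencing a NULL array. */
static oid group_at(const struct group_col *g, size_t i)
{
    assert(g != NULL && i < g->count);
    if (g->values == NULL)
        return g->seqbase + (oid) i;   /* void column: compute the value */
    return g->values[i];               /* materialized column: read it */
}
```

With a single-row table, the group column would plausibly be void with sequence base 0, so `group_at(&col, 0)` yields 0 instead of crashing on `grps[0]`.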
OK, let me take some time to digest that. Some of the terms you use are only vaguely meaningful to me, but it sure helps to understand the code.
Thanks for the explanation.
On 02/21/2013 03:44 PM, Sjoerd Mullender wrote:
On 2013-02-20 19:52, Miguel Ping wrote:
Hi,
Did the UDF group-by API change? I just tried to upgrade my custom UDFs and I'm getting an error about a 'sub':
batudf.*sub*hllagg' undefined in: _37:bat[:any,:str] := batudf.subhllagg(_34:bat[:oid,:str], _18:bat[:oid,:oid], r1_18:bat[:oid,:oid], _38:bit)
As usual, I checked the diffs in the xmlagg that comes with MonetDB. There are some new functions whose purpose I can't really tell:
AGGRsubxml AGGRsubxmlcand BATxmlaggr
Can someone kindly explain the changes? It seems to be a breaking change, and I couldn't find anything in the docs.

BATxmlaggr does the real work. As you can see, it's a static function, so not used by any other code. AGGRsubxml and AGGRsubxmlcand are the functions that are called from the MAL level.
On 02/21/2013 03:51 PM, Miguel Ping wrote:
As you can see, the former calls the latter with a NULL argument for the sid parameter. The sid parameter represents the optional candidates list that I referred to in my blog post.
AGGRsubxmlcand just converts the bat pointers to BAT pointers and calls BATxmlaggr to do the real work. Afterwards it cleans up and returns the result.
And now what is the real work?

b is the bat being aggregated. It is dense headed, and in the case of this function, the tail is the XML values.

g is the groups BAT. Again, it is dense headed and aligned with b. The tail is oid and it indicates the group each element of b is a member of. Equal values in g mean the same group. If g is NULL, there are no groups, or rather, all values are in the same group, and e is not used.

e is the extents BAT. The head is again dense, but it is not aligned with b and g. Instead, it contains the range of group ids that are used in g. The tail is not used. e can be NULL. It just makes the code less efficient.

s is the candidates list and can be NULL. If it is specified, it is, again, dense-headed. The tail is oid and sorted in ascending order and without duplicates. All values must be in the range of the head columns of b and g. It indicates the values that participate. If s is not specified, all values in b/g participate, otherwise only the values mentioned in s.

Finally, skip_nils says what to do with nil values in the tail of b: ignore (do as if the value isn't there), or take along in the aggregation (with most aggregations that would mean the result would be nil).
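The roles of b, g, e, s and skip_nils described above can be illustrated with a small self-contained sketch. It is not the MonetDB code: plain C arrays stand in for the BATs, a sum stands in for the XML aggregation, `ngrp` stands in for what the extents e would provide, and `grouped_sum`, `INT_NIL` and `SUM_NIL` are hypothetical names chosen for this illustration. Group ids are assumed to start at 0, and overflow is ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef size_t oid;          /* stand-in for MonetDB's oid type (assumption) */
#define INT_NIL INT_MIN      /* stand-in nil marker for input values */
#define SUM_NIL LLONG_MIN    /* stand-in nil marker for a group's result */

/* b:    n values to aggregate
 * g:    group id per value, aligned with b; NULL means one single group
 * ngrp: number of groups (the information e would carry)
 * s:    sorted candidate positions into b/g, or NULL for "all values"
 * ns:   number of candidates (ignored when s is NULL)
 * out:  ngrp results, one per group */
static void grouped_sum(const int *b, const oid *g, size_t n,
                        size_t ngrp, const oid *s, size_t ns,
                        int skip_nils, long long *out)
{
    assert(b != NULL && out != NULL);
    size_t total = s ? ns : n;

    for (size_t k = 0; k < ngrp; k++)
        out[k] = 0;

    for (size_t k = 0; k < total; k++) {
        size_t i = s ? s[k] : k;       /* only candidates participate */
        if (i >= n)
            continue;                  /* candidate outside b: ignore */
        oid gid = g ? g[i] : 0;        /* no g: everything is group 0 */
        if (gid >= ngrp || out[gid] == SUM_NIL)
            continue;                  /* unknown group, or already nil */
        if (b[i] == INT_NIL) {
            if (!skip_nils)
                out[gid] = SUM_NIL;    /* a nil makes the whole group nil */
            continue;                  /* otherwise act as if it isn't there */
        }
        out[gid] += b[i];
    }
}
```

For values {10, nil, 5, 7} with groups {0, 0, 1, 1}, skipping nils gives sums 10 and 12, while taking nils along makes group 0 nil. Restricting the candidates to position 2 leaves only the value 5 contributing to group 1.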
I hope this helps.
I also removed the BATaccessBegin and BATaccessEnd macro calls, it seems they are not needed anymore.
That is correct.
-- Sjoerd Mullender
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender