Mark,

I think there is a bug in the implementation of the PYTHON_MAP aggregate functions. The return value of a PYTHON_MAP aggregate function can change if there are other functions in the query. The return value of my function weighted_percentile_0 (definition below) changes if I include the function median() in my query. The results are incorrect if median() isn't in the query, but are correct if median() is in the query.

In this test case, the weights (v2) are all 1 and the values (v1) are uniformly sampled integers from [1,10]. The number of rows for each fctr is ~1000, and my function should reduce to min(v1) in this case. Is some information bleeding between threads? Should I submit a bug report? This is on the default branch.

Thanks,
Dave

sql>select fctr,weighted_percentile_0(v1,v2),min(v1) from mini where fctr < 10 group by fctr order by fctr;
+------+--------------------------+------+
| fctr | L1                       | L2   |
+======+==========================+======+
|    1 |                        3 |    1 |
|    2 |                        2 |    1 |
|    3 |                        2 |    1 |
|    4 |                        1 |    1 |
|    5 |                        3 |    1 |
|    6 |                        3 |    1 |
|    7 |                        2 |    1 |
|    8 |                        1 |    1 |
|    9 |                        2 |    1 |
+------+--------------------------+------+
9 tuples (126.928ms)

sql>select fctr,weighted_percentile_0(v1,v2),min(v1),median(v1) from mini where fctr < 10 group by fctr order by fctr;
+------+--------------------------+------+------+
| fctr | L1                       | L2   | L3   |
+======+==========================+======+======+
|    1 |                        1 |    1 |    5 |
|    2 |                        1 |    1 |    5 |
|    3 |                        1 |    1 |    5 |
|    4 |                        1 |    1 |    5 |
|    5 |                        1 |    1 |    5 |
|    6 |                        1 |    1 |    5 |
|    7 |                        1 |    1 |    5 |
|    8 |                        1 |    1 |    5 |
|    9 |                        1 |    1 |    5 |
+------+--------------------------+------+------+
9 tuples (519.195ms)

sql> CREATE AGGREGATE weighted_percentile_0(a DOUBLE, w DOUBLE) RETURNS DOUBLE LANGUAGE PYTHON_MAP {
    import numpy as np

    # Standardize and sort based on values in a
    q = np.array([0]) / 100.0
    idx = np.argsort(a)
    a_sort = a[idx]
    w_sort = w[idx]

    # Get the cumulative sum of weights
    ecdf = np.cumsum(w_sort)

    # Find the percentile index positions associated with the percentiles
    p = q * (w_sort.sum() - 1)

    # Find the bounding indices (both low and high)
    idx_low = np.searchsorted(ecdf, p, side='right')
    idx_high = np.searchsorted(ecdf, p + 1, side='right')
    idx_high[idx_high > ecdf.size - 1] = ecdf.size - 1

    # Calculate the interpolation weights
    weights_high = p - np.floor(p)
    weights_low = 1.0 - weights_high

    # Extract the low/high values and multiply by the corresponding weights
    x1 = np.take(a_sort, idx_low) * weights_low
    x2 = np.take(a_sort, idx_high) * weights_high
    wp = np.add(x1, x2)

    return(wp[0])
};
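
P.S. In case it is useful, here is a rough standalone check of the same logic in plain NumPy (made-up data, not the mini table, so the numbers are only illustrative). Run outside MonetDB, the body does reduce to the minimum of a:

import numpy as np

def weighted_percentile_0(a, w):
    # Same body as the aggregate above, run as an ordinary Python function
    q = np.array([0]) / 100.0
    idx = np.argsort(a)
    a_sort = a[idx]
    w_sort = w[idx]
    ecdf = np.cumsum(w_sort)
    p = q * (w_sort.sum() - 1)
    idx_low = np.searchsorted(ecdf, p, side='right')
    idx_high = np.searchsorted(ecdf, p + 1, side='right')
    idx_high[idx_high > ecdf.size - 1] = ecdf.size - 1
    weights_high = p - np.floor(p)
    weights_low = 1.0 - weights_high
    x1 = np.take(a_sort, idx_low) * weights_low
    x2 = np.take(a_sort, idx_high) * weights_high
    return np.add(x1, x2)[0]

# Synthetic data matching the test case: weights all 1, values in [1,10]
a = np.random.randint(1, 11, 1000).astype(float)
w = np.ones(1000)
print(weighted_percentile_0(a, w), a.min())  # both print the sample minimum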