Re: [Monetdb-developers] Monetdb-developers Digest, Vol 5, Issue 6

6 Oct 2006

      Hi all,
...
In PROC rpc_client() (runtime/xrpc.mx), I had changed
  ws_addcoll(ws, ...
to
  ws_opencoll(ws_id(ws), ...
it should be ws_opencoll(wsid, ...)

You have wsid available there, because in the query context we always have
both variables 'ws' and 'wsid'.

ws_id(ws) should be called only once: each call asks for a new unique oid.
The combination of (id,int(ws)) is squeezed into a lng, and that is the wsid.
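
A minimal sketch of such a packing, in Python rather than MIL. The bit layout
(unique id in the upper half, BAT id in the lower 32 bits) and the helper
names pack_wsid/unpack_wsid are assumptions for illustration only, not the
actual pathfinder.mx code:

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF      # lower 32 bits: BAT id (assumed layout)
MAX_UNIQUE = 0x7FFFFFFF  # keep the result inside a signed 64-bit lng


def pack_wsid(unique_id: int, bat_id: int) -> int:
    """Squeeze (unique id, BAT id) into a single 64-bit value."""
    if not 0 <= unique_id <= MAX_UNIQUE:
        raise ValueError("unique_id must fit in 31 bits")
    if not 0 <= bat_id <= MASK32:
        raise ValueError("bat_id must fit in 32 bits")
    return (unique_id << 32) | bat_id


def unpack_wsid(wsid: int) -> Tuple[int, int]:
    """Recover (unique id, BAT id) from a packed wsid."""
    return wsid >> 32, wsid & MASK32


wsid = pack_wsid(7, 1234)
unique_id, bat_id = unpack_wsid(wsid)
```

With a layout like this, the BAT id can be recovered from the wsid alone,
which is what makes a lookup in the style of ws_bat(wsid) possible.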

As for doc_tbl(), what would work is to change its parameter to wsid; one
can always get the ws with ws := ws_bat(wsid), this is done now in numerous
places in pf_support.mx. 

Your conclusions (1), (2) and (3) are all correct.

But, Jan's suggestion to store the wsid inside the ws can also work.
However, I deem it very awkward to introduce a new BAT in the ws just to
retain a number. Other suggestions are to rename the ws-bat to some unique
key, and then we can use that name as transaction key (=wsid). We would
change the wsid type then from lng to str. Another idea is to set the
head-seqbase of the ws-bat to an oid. I think the ws is accessed with
fetch() only, which disregards any seqbase.

Note that internally in pathfinder.mx there are a number of functions that
just pass around an artificial wsid, and in fact a ws does not exist at all
(this is the MIL document management interface e.g. shred_doc), so a number
of (internal) functions in pathfinder.mx will have to stay using a wsid in
their signature. They do not need a ws-bat.

But the exported functions ws_opencoll and ws_opendoc could indeed just use
a ws as they did before.

I will not have time for this until next week, so if anyone feels
adventurous, they may give it a try.

Peter

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |

------------------------------

Message: 2
Date: Fri, 6 Oct 2006 12:43:21 +0200
From: Stefan Manegold 
Subject: Re: [Monetdb-developers] The new "ws-API",	Algebra & XRpc (&
	PFtijah)
To: Monetdb-developers@lists.sourceforge.net
Message-ID: <20061006104321.GA2821@corona.ins.cwi.nl>
Content-Type: text/plain; charset=us-ascii

On Thu, Oct 05, 2006 at 11:03:09AM +0200, Stefan Manegold wrote:
...
Dear Peter, dear fellow PF & MXQ developers,
[...]
In the cases of XRpc & Algebra, there are still places where
wsid := ws_id(ws) is IMHO called in the wrong location:
XRpc:
In PROC rpc_client() (runtime/xrpc.mx), I had changed
  ws_addcoll(ws, ...
to
  ws_opencoll(ws_id(ws), ...
While this seems to work for now with read-only documents (i.e., in the
absence of updates and concurrency), it is IMHO incorrect in general.
Rather, rpc_client should receive wsid as argument instead of ws;
if necessary, ws can then be derived via ws := ws_bat(wsid).
I just changed the signatures of doLoopLiftedRPC() & rpc_client()
to receive  lng wsid  instead of  BAT ws;
BAT ws  is then derived locally via  ws := ws_bat(wsid).
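
To illustrate the calling convention, here is a rough Python model (not MIL).
The dictionary registry, the counter, and the function bodies are purely
hypothetical stand-ins; only the names ws_create, ws_id, ws_bat and
rpc_client are taken from the discussion:

```python
import itertools

_next_unique = itertools.count(1)  # hypothetical source of unique ids
_registry = {}                     # hypothetical wsid -> ws lookup table


def ws_create():
    """Stand-in for the ws BAT-of-BATs."""
    return {"docs": []}


def ws_id(ws):
    """Allocate a fresh wsid for ws; every call yields a different id."""
    wsid = next(_next_unique)
    _registry[wsid] = ws
    return wsid


def ws_bat(wsid):
    """Inverse of ws_id: the same wsid always gives the same ws."""
    return _registry[wsid]


def rpc_client(wsid, doc):
    """Receives wsid, not ws, and derives ws locally when needed."""
    ws = ws_bat(wsid)
    ws["docs"].append(doc)
    return wsid


ws = ws_create()
wsid = ws_id(ws)            # called exactly once, right after ws_create()
rpc_client(wsid, "a.xml")   # the callee never calls ws_id() itself
```

Had rpc_client called ws_id(ws) itself, it would have worked under a different
id than the rest of the query, which is the problem described above.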

All XRpc tests seem to work fine.

Stefan

[...]

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |

------------------------------

Message: 3
Date: Fri, 06 Oct 2006 14:39:19 +0200
From: Jan Rittinger 
Subject: Re: [Monetdb-developers] The new "ws-API",	Algebra & XRpc (&
	PFtijah)
To: Monetdb-developers@lists.sourceforge.net
Message-ID: <45264E77.3060201@in.tum.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

If this wsid is a concept that is strictly necessary for the update 
support (and nobody except me thinks that this bit shifting might be not 
the nicest solution) we could also think about getting rid of the ws and 
replacing it with the wsid. I just do not like the fact that there are 
two variables underway that have some side effects on each other.

So a fourth proposal (d) would be to replace the occurrences of ws by 
wsid. This would mean that *all* document accesses would have to include 
another indirection (even all path steps).

On 10/05/2006 11:03 AM, Stefan Manegold wrote with possible deletions:
...
Dear Peter, dear fellow PF & MXQ developers,
I'm not sure whether I understand the new "ws-API" correctly, and would
be grateful if one of you could enlighten me.
First, here's how far I have got so far (please correct me if/where I'm
wrong!), followed by concrete open questions.
For transaction support, Peter introduced a new wsid that uniquely
identifies (a reference to?) a ws, which is required for conflict
detection (as far as I understand).
A wsid is to be generated from/for a ws by calling wsid := ws_id(ws);
however, this generates a new unique id each time it is called (two calls
on the same ws yield two different ids, right?); hence, wsid := ws_id(ws)
should immediately follow ws := ws_create().
For the inverse, there is ws := ws_bat(wsid), which returns the ws
identified by wsid; obviously, this function yields the same result each
time it's called with the same wsid.
Since some functionality requires the wsid instead of "only" the ws,
Peter changed the signatures/API of some runtime PROCs. As far as I can
see right now, these are mainly (only?)
  ws_doc(ws, ...          ->  ws_opendoc(wsid, ...
  ws_addcoll(ws, ...      ->  ws_opencoll(wsid, ...
  ws_destroy(int(ws))     ->  ws_destroy(wsid)
I finished these changes by simply running the following one-liners:
find * -type f | xargs grep -l 'var ws := ws_create();' | xargs perl -i -p \
  -e 's|(var ws := ws_create\(\);)|$1 var wsid := ws_id(ws);|'
find * -type f | xargs grep -l 'ws_doc(ws,'             | xargs perl -i -p \
  -e 's|ws_doc\(ws,|ws_opendoc(wsid,|'
find * -type f | xargs grep -l 'ws_destroy(int(ws))'    | xargs perl -i -p \
  -e 's|ws_destroy\(int\(ws\)\)|ws_destroy(wsid)|'
find * -type f | xargs grep -l 'ws_addcoll(ws,'         | xargs perl -i -p \
  -e 's|ws_addcoll\(ws,|ws_opencoll(wsid,|'
find * -type f | xargs grep -l 'ws_addcoll'             | xargs perl -i -p \
  -e 's|ws_addcoll|ws_opencoll|'
While this seemed to be enough to fix all .milS tests and (most of)
the milprint_summer functionality, there are two open issues left:
the Algebra version, XRpc & PFtijah.
To be honest, I haven't looked at PFtijah yet; some help from the
respective experts would be appreciated.
In the cases of XRpc & Algebra, there are still places where
wsid := ws_id(ws) is IMHO called in the wrong location:
XRpc:
In PROC rpc_client() (runtime/xrpc.mx), I had changed
  ws_addcoll(ws, ...
to
  ws_opencoll(ws_id(ws), ...
While this seems to work for now with read-only documents (i.e., in the
absence of updates and concurrency), it is IMHO incorrect in general.
Rather, rpc_client should receive wsid as argument instead of ws;
if necessary, ws can then be derived via ws := ws_bat(wsid).
Algebra:
in PROC doc_tbl() (runtime/pf_support.mx), I had changed
  var r := ws_doc(ws, item);
into
  var wsid := ws_id(ws);
  var r := ws_opendoc(wsid, item);
Same story as above.
However, doc_tbl also returns ws in its result BAT-of-BATs, as far as I
understand mainly for "canonical API" reasons.
Obviously, it is not possible to replace BAT ws by lng wsid here...
Leaving the latter problem aside for the moment, the following could
(have) work(ed) as a general rule, and we should check and change the
codebase accordingly:
1) wsid := ws_id(ws) must only be called immediately after
   ws := ws_create()
2) all functions/PROCs that recursively (i.e., including all transitively
   called functions/PROCs) require wsid instead of or in addition to ws
   should be modified in that they receive wsid instead of ws as an
   argument; ws can then locally be derived via ws := ws_bat(wsid)
   if/where necessary.
3) all functions/PROCs that recursively (i.e., including all transitively
   called functions/PROCs) "never" (at least not yet) require wsid, but
   are fine with ws, can stay unchanged, i.e., getting (only) ws as
   argument.
   Obviously, wherever these functions need to maintain a signature/API
   aligned with those that fall under point 2) above, we should also
   change these functions as described under 2).
As indicated above, this does not work for doc_tbl and its companion PROCs
in runtime/pf_support.mx that make up the interface between the Algebra
version of the compiler and the runtime.
I assume the respective canonical API could be changed in that the
respective PROCs get wsid instead of (or in addition to?) ws as arguments.
However, we cannot easily change the API to return wsid (lng) instead of
ws (BAT) in the result BAT-of-BATs. Here, I see three solutions --- the
last one actually comes from JanR:
a) Is ws/wsid indeed required in the result?
   If not, we could simply discard it.
b) Wrap wsid temporarily into a ("fake") [void,lng] BAT containing only one
   (nil,wsid) BUN.
   (Ugly!)
c) (JanR) Put the wsid inside ws --- basically as a ("fake") BAT [void,lng].
   Kind of the "encapsulated" solution --- not only for the algebra-runtime
   interface!
   Consequently, we should have ws_create() call ws_id(ws) internally and
   store the result in ws (this could save the seemingly "redundant"
   ws := ws_create(); wsid := ws_id(ws); sequence), and all (most, except
   at least ws_destroy()) signatures/API could be kept / re-unified to pass
   (only) ws (now including wsid) instead of wsid as argument.
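
The "encapsulated" option (c) might look roughly like the following Python
model (not MIL). The Workspace class, its fields, and the ws-taking signature
of ws_opendoc are assumptions made for illustration; in the current code
ws_opendoc takes a wsid:

```python
import itertools
from dataclasses import dataclass, field

_unique = itertools.count(1)  # hypothetical source of unique ids


@dataclass
class Workspace:
    # The id is assigned exactly once, at creation time, and travels with ws.
    wsid: int = field(default_factory=lambda: next(_unique))
    docs: list = field(default_factory=list)


def ws_create():
    """Create a ws that already carries its own wsid."""
    return Workspace()


def ws_opendoc(ws, name):
    """Takes only ws; the wsid is read from it when needed."""
    ws.docs.append(name)
    return ws.wsid, name


ws = ws_create()
opened = ws_opendoc(ws, "a.xml")
```

This removes the separate ws := ws_create(); wsid := ws_id(ws); sequence, but
the wsid is then only reachable for as long as the ws itself exists.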
There seems to be one problem, though, if I understand Peter's comment
   correctly:
   "
   - ws-IDs are now lng-s (combination of *unique* ID and bat-ID) such IDs
     are meaningful also after the query is done (and ws-bat deallocated).
                    ^1^1^1^1^1^1^1^1^1^1^1^1^1^1  ^2^2^2^2^2^2^2^2^2^2^2
     this is needed for trans mgmt.
     All meta-bats with a ws-id field now hold lng instead of int.
   "
In other words, (^2) wsid needs to be available also if ws is gone.
   I'm not sure, though, whether it has to be available in the global wsid
   variable (i.e., inside/during the query), or (only) in some meta-bats
   outside (and hence after) the query (-context) (as suggested by ^1).
Peter,
could you please enlighten me/us here?
In case "only" (^1) is required, the "encapsulated" solution indeed seems
   to be suitable for all code.
   Otherwise, it would at least be a local solution for the algebra-runtime
   interface (in case (a) & (b) are no option), but bears the burden of
   maintaining consistency between the global wsid variable and the wsid
   stored inside the ws.
I hope this email and any reactions to it help to clarify the current
situation, rather than to blur it even more ...
Stefan
-- 
Jan Rittinger
Database Systems
Technische Universität München (Germany)
http://www-db.in.tum.de/~rittinge/

------------------------------

Message: 4
Date: Fri, 6 Oct 2006 15:08:00 +0200
From: Stefan Manegold 
Subject: Re: [Monetdb-developers] The new "ws-API",	Algebra & XRpc (&
	PFtijah)
To: Jan Rittinger 
Cc: Monetdb-developers@lists.sourceforge.net
Message-ID: <20061006130800.GB5598@corona.ins.cwi.nl>
Content-Type: text/plain; charset=us-ascii

On Fri, Oct 06, 2006 at 02:39:19PM +0200, Jan Rittinger wrote:
...
If this wsid is a concept that is strictly necessary for the update 
support (and nobody except me thinks that this bit shifting might be not 
the nicest solution) we could also think about getting rid of the ws and 
replacing it with the wsid. I just do not like the fact that there are 
two variables underway that have some side effects on each other.
To be honest, I'm still in the process of analyzing and trying to understand
what all the flavors, pros, cons, and ideas behind having/using wsid are;
hence, I cannot tell whether it is "strictly necessary" or not.

As far as I understand so far, the basic idea behind the "bit shifting" is
merely to store both the id of (and hence reference to) BAT ws as well as a
unique identifier --- that is still valid after a query/transaction (and
hence cannot simply be the ws BAT id, which might be re-used for a
subsequent query/transaction) --- in one single atomic value.
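
A small Python illustration of why the BAT id alone is not enough (not MIL;
the counter, the bit layout and the name make_wsid are assumptions, not the
real implementation):

```python
import itertools

_unique = itertools.count(1)  # hypothetical, never-repeating id source


def make_wsid(bat_id):
    """Combine a fresh unique id with a (possibly recycled) BAT id."""
    return (next(_unique) << 32) | bat_id


first_query = make_wsid(42)
second_query = make_wsid(42)  # a later query re-uses BAT id 42

# Both values still carry the same BAT id in their lower 32 bits ...
same_bat = (first_query & 0xFFFFFFFF) == (second_query & 0xFFFFFFFF)
# ... but remain distinguishable as transaction identifiers.
distinct = first_query != second_query
```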

Further, I agree, that having two related variables that need to be kept in
sync and treated accordingly is not very handy for maintenance --- my guess
is, that the current situation resulted from lack of time and fear to
implement (something like) your (d) proposal; which I agree is "nicer" &
"cleaner", but requires quite a lot of possibly error-prone code changes ---
nevertheless, I might give it a try during the weekend ...

Stefan
...
So a fourth proposal (d) would be to replace the occurrences of ws by 
wsid. This would mean that *all* document accesses would have to include 
another indirection (even all path steps).
-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |

------------------------------

Message: 5
Date: Fri, 6 Oct 2006 15:42:02 +0200
From: Stefan Manegold 
Subject: Re: [Monetdb-developers] The new "ws-API",	Algebra & XRpc (&
	PFtijah)
To: Monetdb-developers@lists.sourceforge.net
Cc: Jan Rittinger 
Message-ID: <20061006134202.GA6958@corona.ins.cwi.nl>
Content-Type: text/plain; charset=us-ascii

On Fri, Oct 06, 2006 at 03:08:00PM +0200, Stefan Manegold wrote:
...
On Fri, Oct 06, 2006 at 02:39:19PM +0200, Jan Rittinger wrote:
...
If this wsid is a concept that is strictly necessary for the update 
support (and nobody except me thinks that this bit shifting might be not 
the nicest solution) we could also think about getting rid of the ws and 
replacing it with the wsid. I just do not like the fact that there are 
two variables underway that have some side effects on each other.
To be honest, I'm still in the process of analyzing and trying to understand
what all the flavors, pros, cons, and ideas behind having/using wsid are;
hence, I cannot tell whether it is "strictly necessary" or not.
As far as I understand so far, the basic idea behind the "bit shifting" is
merely to store both the id of (and hence reference to) BAT ws as well as a
unique identifier --- that is still valid after a query/transaction (and
hence cannot simply be the ws BAT id, which might be re-used for a
subsequent query/transaction) --- in one single atomic value.
Further, I agree, that having two related variables that need to be kept in
sync and treated accordingly is not very handy for maintenance --- my guess
is, that the current situation resulted from lack of time and fear to
implement (something like) your (d) proposal; which I agree is "nicer" &
"cleaner", but requires quite a lot of possibly error-prone code changes ---
nevertheless, I might give it a try during the weekend ...
Stefan
...
So a fourth proposal (d) would be to replace the occurrences of ws by 
wsid. This would mean that *all* document accesses would have to include 
another indirection (even all path steps).
Well, I just realized that this actually only saves us from having
function interfaces that expect/use either ws or wsid or both --- it does
not help to get rid of the global ws variable: to exist throughout the whole
query, the ws BAT must either be a global variable or be persistent --- the
latter is not possible, since ws is a BAT-of-BATs, and those cannot be
persistent...

... I just recall, that we have something like "session" BATs --- I'll check
whether that's an option (any help/hint is of course welcome ...)

Stefan

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |

------------------------------


_______________________________________________
Monetdb-developers mailing list
Monetdb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-developers

End of Monetdb-developers Digest, Vol 5, Issue 6
************************************************

p.a.boncz