I come back to this issue with some more interesting results. I did mention in the past that 'COPY INTO ... FROM stdin ..' seems to be a lot slower than 'COPY INTO ... FROM <filename> ...' Actually, I have just found something that may shift the problem away from the stdin itself. Look at the following two methods of restoring a previously dumped (rather small) database. RESTORE METHOD 1 (far too slow): $ mclient < backup.sql [ 62 ] [ 70199 ] [ 288 ] [ 70322 ] [ 69 ] [ 423 ] [ 280596 ] [ 4302 ] [ 49038 ] [ 1 ] [ 10 ] [ 6943 ] [ 6943 ] [ 3424 ] [ 3424 ] [ 3424 ] [ 1 ] [ 6943 ] [ 3424 ] TIME elapsed: 68 seconds RESTORE METHOD 2: $ echo "\< backup.sql" | mclient [ 62 ] [ 70199 ] [ 288 ] [ 70322 ] [ 69 ] [ 423 ] [ 280596 ] [ 4302 ] [ 49038 ] [ 1 ] [ 10 ] [ 6943 ] [ 6943 ] [ 3424 ] [ 3424 ] [ 3424 ] [ 1 ] [ 6943 ] [ 3424 ] TIME elapsed: 1 seconds Can anyone explain this difference? One further consideration: Method 2 works fine, but cannot be used to load gzipped dumps. Maybe this can be a feature request: '\<' should accept gzipped input. Roberto On Mon, 2009-05-18 at 13:03 +0200, Stefan Manegold wrote:
On Mon, May 18, 2009 at 11:57:57AM +0200, Stefan de Konink wrote:
On Mon, 18 May 2009, Stefan Manegold wrote:
In case you came to your conclusion from experimental/emperical study, I would be very interested to know about the experiemntes performed and the results measured. In that case, I would be verythankful if you could share the informative details with us, i.e. detailed descriptions of - experimental setup (HW, SW, datasets, workload/queries) - measured results (what was measured how, and what are the results?) - (your) discussion/analysis/interpretation of the results
I thought the MADAM meeting was for this.
Ok. fine.
... we should then revive the ideas of keeping note of MADAMs (e.g., on a Wiki or alike) as reference for those who cannot be present ...
Could you please elaborate on whether you consider you setup/schema "representaive" and/or whether/to which extend you think that your results for one specific case could be generalized to other cases?
Technically the only thing I have to prove to get my code reverted is that there are cases where late contraints hurt performance.
IMHO, our task should be the following:
- analyze whether there is indeed a significat performance difference between early and late constraint checking; (5% do not justify any further time to be spent on this) - and if so, analyze under which circumstances which altertative performs (significatly) better or worse than the other; - and (only) then try to find out where the (significat) performance difference comes from - and (only) then + either try to eliminate the difference; + or * make dump/restore use the most suitable alternative (as far as possible with reasonable effort); * and document our experiences to give adivce for users
Stefan
Stefan
-- | M.Sc. Roberto Cornacchia | CWI (Centrum voor Wiskunde en Informatica) | Science Park 123, 1098XG Amsterdam, The Netherlands | tel: +31 20 592 4322 , http://www.cwi.nl/~roberto