I tend to look at software problems before I look at hardware problems...
I'm much more inclined to believe that the deadlock was a mutex issue and
the timing of it happening caused the transaction log record file to be
truncated. Hits on memory from cosmic rays? That's rather grandiose... I
think you should go re-read Occam's Razor.
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
On Tue, Nov 24, 2009 at 9:55 AM, Sam Mason wrote:
On Tue, Nov 24, 2009 at 09:42:08AM -0500, Matthew Jones wrote:
Memory tests were run and no problems were reported. With MonetDB + bad memory I would expect to lose some data, and no data was lost here.
You lost the contents of one file, didn't you? Hits on memory from cosmic rays cause random single-bit flips at a small but measurable rate, though ECC memory should help. These could cause something in Linux's page buffer to go awry, leaving you with an unexpectedly empty file. The bug rate of normal software means that these sorts of errors are almost completely masked; however, as code matures, these sorts of errors start to become important. I'm not sure if Monet is stable enough to think about this yet, but if you've got a stable access pattern that only hits exactly the same code paths, then the remaining bugs shouldn't matter.
Higher assurance software tends to include checksums and other simple invariants that are checked at various places in code to make sure that errors like this aren't propagated too far.
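As a rough illustration of the kind of checksum invariant meant here, the sketch below frames each log record with its length and a CRC32 and verifies both when reading it back. This is not how MonetDB writes its transaction log; the record layout and the names encode_record and decode_record are made up for the example.

```python
import struct
import zlib

# Hypothetical record layout (an assumption for this sketch, not MonetDB's format):
# 4-byte little-endian payload length, 4-byte CRC32 of the payload, then the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def decode_record(blob: bytes) -> bytes:
    """Return the payload, or raise ValueError if the record is truncated or corrupted."""
    if len(blob) < HEADER.size:
        raise ValueError("record truncated: header incomplete")
    length, expected_crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("record truncated: payload incomplete")
    if zlib.crc32(payload) != expected_crc:
        raise ValueError("checksum mismatch: record corrupted")
    return payload
```

With a check like this, an unexpectedly empty or truncated file and a single flipped bit both show up as an explicit error at read time, whatever the underlying cause turns out to be.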
-- Sam http://samason.me.uk/
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users