additional details on IS NULL problem

23 May 2016

      Hi Anthony -

I’ve narrowed down the problem.  The bug isn’t related to size, but  rather  how the table data was created/loaded.   Loading from a .txt file gives good behavior, loading from binary does not.

Normally I create binary files for each column and use the binary bulk loader to load.  That is what was failing.  When I first tested  with 50 columns, I created the test file from sql via the command
    copy (select * …) into ‘/workdir/…kellyFirst50.txt using delimiters ‘\t’,’\n’;

Then I created a new table called kellyFirst50 and used the “COPY INTO kellyfirst50 from ‘/workdir…kellyFirst50.txt’ using delimiters ‘\t,’\n’; ”  command.  Checking for “IS NULL” worked, so I thought it was a size problem.

But, when trying to figure out at what point the file size mattered, I ran java scripts to create binary files of ever decreasing sizes, then loaded them via the COPY BINARY commands.  Checking for IS NULL always failed here, even when I only loaded 50 values.

Here are the results when loading from binary files into monetdb.  Using IS NOT NULL works, but IS NULL does not:

sql>\\

sql>select count(*) from weirtestnull;

+------+

| L1   |

+======+

|   50 |

+------+

1 tuple (2.014ms)

sql>select count(amesbyd_weir) from weirtestnull;

+------+

| L1   |

+======+

|   50 |

+------+

1 tuple (0.865ms)

sql>select count(amesbyd_weir) from weirtestnull where amesbyd_weir IS NOT NULL;

+------+

| L1   |

+======+

|   34 |

+------+

1 tuple (3.005ms)

sql>

sql>select * from weirtestnull where amesbyd_weir IS NULL;

+-----+-----+---------------+--------------+--------------+----------------+--------------------+------------------+----------------+

| chr | pos | amesbycn_weir | amesbyd_weir | amesbyf_weir | amesbynam_weir | amesbyturkeyp_weir | dentbyflint_weir | dentbynam_weir |>

+=====+=====+===============+==============+==============+================+====================+==================+================+

+-----+-----+---------------+--------------+--------------+----------------+--------------------+------------------+----------------+

0 tuples (4.041ms) !4 columns dropped!

sql>

Dump the data into a .txt file:

sql>copy (select * from weirtestnull) into '/workdir/db_outfiles/weirtestnull.txt' using delimiters '\t','\n';

50 affected rows (3.809ms)

sql>

Now drop the table, recreate it, but this time load from the txt file created above:

sql>drop table weirtestnull;

operation successful (1.948ms)

sql>\copy into weirtestnull from '/workdir/db_outfiles/weirtestnull.txt' using delimiters '\t','\n';

50 affected rows (57.643ms)

sql>

Re-run the commands – I now get correct values for IS NULL:

sql>select count(*) from weirtestnull;

+------+

| L1   |

+======+

|   50 |

+------+

1 tuple (1.851ms)

sql>select count(amesbyd_weir) from weirtestnull;

+------+

| L1   |

+======+

|   34 |

+------+

1 tuple (0.983ms)

sql>

sql>select * from weirtestnull where amesbyd_weir IS NULL;

+------+--------+------------+------------+------------+------------+-------------+-------------+-------------+-------------+-------------+-------------+

| chr  | pos    | amesbycn_w | amesbyd_we | amesbyf_we | amesbynam_ | amesbyturke | dentbyflint | dentbynam_w | dentbyturke | flintbynam_ | flintbyturk |>

:      :        : eir        : ir         : ir         : weir       : yp_weir     : _weir       : eir         : ypen_weir   : weir        : eypen_weir  :>

+======+========+============+============+============+============+=============+=============+=============+=============+=============+=============+

|   10 | 228864 | 2.95496e-0 | null       | null       | -0.0001263 | null        | null        | -0.00028749 | null        | -0.00029885 | null        |

:      :        : 5          :            :            : 25         :             :             : 4           :             : 1           :             :

|   10 | 228962 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 228972 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229011 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229025 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229187 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229367 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229405 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229444 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229593 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229615 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229671 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 229829 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 230000 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 230052 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

|   10 | 230120 | null       | null       | null       | null       | null        | null        | null        | null        | null        | null        |

+------+--------+------------+------------+------------+------------+-------------+-------------+-------------+-------------+-------------+-------------+

16 tuples (5.2

 My java program uses Float.NaN and Integer.MIN_VALUE, Byte.MIN_VALUE to get “null” stored in the db.

When we load directly from binary, are the null values created differently?  Is something not marked that gets marked when loading from text files?

With this information, can you reproduce, or do you want me to give you my binary files in a dropbox somewhere?

Thanks - Lynn

additional details on IS NULL problem

Lynn Carol Johnson