[MonetDB-users] COPY...INTO with auto increment
I noticed that COPY...INTO for bulk loading requires that all fields be present in the data file. Is there any way to use bulk loading and have it honor the auto increment column, or at least not have to specify every column in the data file?
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
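For context, a minimal sketch of the situation being described, with hypothetical table, column, and file names:

```sql
-- Hypothetical table with an auto increment key.
CREATE TABLE measurements (
    id  INT AUTO_INCREMENT,
    ts  TIMESTAMP,
    val DOUBLE
);

-- Bulk load: the data file must supply a value for every column,
-- including id, so the auto increment cannot fill it in.
COPY INTO measurements FROM '/tmp/measurements.csv' USING DELIMITERS ',', '\n';
```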
Matthew Jones wrote:
> I noticed that COPY...INTO for bulk loading requires that all fields be present in the data file. Is there any way to use bulk loading and have it honor the auto increment column or at least not have to specify every column in the data file?
How about adding the serial column after loading. It will be zero cost.
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
The problem is, I am loading continuously, and I need this field continuously.
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
On Thu, Aug 27, 2009 at 7:11 PM, Martin Kersten wrote:
> How about adding the serial column after loading. It will be zero cost.
On 27-08-2009 19:49:37 -0400, Matthew Jones wrote:
> On Thu, Aug 27, 2009 at 7:11 PM, Martin Kersten wrote:
> > How about adding the serial column after loading. It will be zero cost.
Adding a "serial" column won't be zero cost, and I doubt it is even possible, since SERIAL implies a NOT NULL constraint, which cannot be placed on a newly added column. Technically you could do it, assuming you just start filling the column with values you know are valid for it, as defined by the sequence.
> The problem is, I am loading continuously, and I need this field continuously.
So it seems you're using COPY INTO to regularly append batches of new data to your dataset. Do you do this for performance reasons? Since COPY INTO is designed for loading (restoring) a database, is it perhaps possible to use prepared statement batching instead? That allows you to have the sequence column(s) do their job.
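A rough sketch of what prepared statement batching could look like in mclient (the statement id 1 returned by PREPARE is an assumption here, as are the table and column names):

```sql
-- Leave the auto increment column out of the column list so the
-- sequence fills it in; each '?' is a bound parameter.
PREPARE INSERT INTO measurements (ts, val) VALUES (?, ?);

-- Execute the prepared statement once per row of the batch.
EXEC 1('2009-08-28 09:33:41', 42.0);
EXEC 1('2009-08-28 09:33:42', 43.5);
```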
My solution is to create a sequence with the table, before batch loading my
data I will get the next value of the sequence, then reset it to that value
+ the size of my batch and iterate over my batch adding that value.
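A hedged sketch of that reservation step (sequence name and batch size are illustrative; the RESTART value assumes the SELECT returned 1):

```sql
-- Reserve a block of 1000 ids for the next batch.
SELECT NEXT VALUE FOR test_seq;            -- suppose this returns 1
ALTER SEQUENCE test_seq RESTART WITH 1001; -- returned value + batch size

-- The client then writes ids 1, 2, ..., 1000 into the data file
-- itself before running COPY INTO.
```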
I believe I may have found a bug in the sequence though... I want to make
sure that this value will not have problems when working with large
integers:
sql> CREATE SEQUENCE "test_seq" as bigint;
sql> select next value for test_seq;
+------------------------------+
| next_value_for_testdb        |
+==============================+
|                            1 |
+------------------------------+
sql> alter sequence test_seq restart with 3000000000;
sql> select next value for test_seq;
+------------------------------+
| next_value_for_datawarehouse |
+==============================+
|                   2147483647 |
+------------------------------+
I can always increment the sequence higher, but if I try to restart it at anything past the maximum value for a 32-bit integer, it resets to that maximum.
What do you think? Bug?
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
On Fri, Aug 28, 2009 at 5:55 AM, Fabian Groffen wrote:
> Since COPY INTO is designed for loading (restoring) a database, is it perhaps possible to use prepared statement batching instead? That allows you to have the sequence column(s) do their job.
On 28-08-2009 09:33:41 -0400, Matthew Jones wrote:
> I can always increment the sequence higher, but if I try to restart it at anything past the maximum value for a 32-bit integer, it resets to that maximum.
> What do you think? Bug?
Yep: https://sourceforge.net/tracker/index.php?func=detail&aid=2846238&group_id=56967&atid=482468
On 28-08-2009 09:33:41 -0400, Matthew Jones wrote:
> My solution is to create a sequence with the table, before batch loading my data I will get the next value of the sequence, then reset it to that value + the size of my batch and iterate over my batch adding that value.
Can't you COPY INTO a temporary table, and then INSERT INTO original_table (col2, col3, col4) SELECT col1, col2, col3 from temporary_table; ?
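A sketch of that staging approach, assuming hypothetical table and column names (the target's auto increment column is simply left out of the INSERT column list):

```sql
-- Stage the batch in a table that has no id column.
CREATE TABLE staging (ts TIMESTAMP, val DOUBLE);
COPY INTO staging FROM '/tmp/batch.csv' USING DELIMITERS ',', '\n';

-- Move the rows over; the target's sequence fills in the ids.
INSERT INTO measurements (ts, val) SELECT ts, val FROM staging;
DROP TABLE staging;
```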
I still have to be able to do that while continuously loading batches, and moving data between tables like that will require me to lock both tables while it is going on. With my solution I only have to lock the sequence for as long as it takes to generate the number and increment it; I just wish I didn't have to pay that cost at the client layer at all.
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
On Mon, Aug 31, 2009 at 6:42 AM, Fabian Groffen wrote:
> Can't you COPY INTO a temporary table, and then INSERT INTO original_table (col2, col3, col4) SELECT col1, col2, col3 from temporary_table; ?
On 31-08-2009 09:28:58 -0400, Matthew Jones wrote:
> I still have to be able to do that while continuously loading batches, and moving data between tables like that will require me to lock both tables while it is going on. With my solution I only have to lock the sequence for as long as it takes to generate the number and increment it; I just wish I didn't have to pay that cost at the client layer at all.
Do you know in advance how many records will come? Then you can just update the sequence before you load, and hence avoid the possibility of sequence numbers being returned that would be assigned to your batch.
Yes, that is what I do... I select the next number and then alter the sequence to restart at that number + batch size.
73,
Matthew W. Jones (KI4ZIB)
http://matburt.net
On Mon, Aug 31, 2009 at 10:50 AM, Fabian Groffen wrote:
> Do you know in advance how many records will come? Then you can just update the sequence before you load, and hence avoid the possibility of sequence numbers being returned that would be assigned to your batch.
participants (3)
- Fabian Groffen
- Martin Kersten
- Matthew Jones