Hi Ariel,
The COPY INTO command recognizes a series of records of the form:
<data><field delimiter><data><field delimiter>...<data><record delimiter>
For example the command
COPY INTO tbl FROM '/path/to/file' USING DELIMITERS '\t','\n','"';
will recognize lines of the form:
1 "MIA" 7 "MIA|115331571990001804" 2000-08-01
but nothing else. Specifically the first 222 lines in the monet_list.txt file be rejected by COPY INTO. You need to delete those lines before you try to load data with copy into.
Take a look at the documentation of COPY INTO in the MonetDB web page [1].
Hope this helps,
Panos.
[1] https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/CSV_bulk_loads
On 3/30/20 7:42 PM, Ariel Abadi wrote:
> Hi Panos!
>
> Enclosed you can find the file (truncated) that I used and the statement I
> run on mclient.
>
> COPY OFFSET 223 INTO "sb_traf"."nrm_20000818" FROM '/tmp/monet_list.txt'
> USING DELIMITERS '\t','\n','"';
>
>
> Thks
> Ariel
>
>
> On Mon, Mar 30, 2020 at 8:07 AM Panagiotis Koutsourakis <
> panagiotis.koutsourakis@monetdbsolutions.com> wrote:
>
>> Hi Ariel,
>>
>> The first issue is not totally surprising to me. My best guess is that
>> reading from a file is using buffered input while reading from stdin is
>> not. See for example [1] and [2]. I am not sure there is a lot we can do at
>> the MonetDB side about this.
>>
>> About the second issue could you please post the exact COPY INTO command
>> and the first few lines of the file you are trying to load so that we can
>> figure out the exact issue?
>>
>> Best regards,
>> Panos.
>>
>> [1]
>> https://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python
>> [2]
>> https://discourse.julialang.org/t/stdin-stdout-10x-slower-than-file-open/17233
>>
>> On 3/26/20 11:51 AM, Ariel Abadi wrote:
>>> Hi!
>>>
>>> After a big battle against the DB, I found 2 issues.
>>>
>>> 1) Running the COPY RECORDS FROM *STDIN *and from *FILE*. Is totally
>>> different in terms of performance. While the first one last 24 minutes,
>> the
>>> second one last only *90 secs.*
>>> 2) As I had generated all those files thru msqldump (which it states the
>>> "CREATE TABLE" instruction and then the COPY RECORDS)... I try ro run the
>>> COPY *XXXX OFFSET *RECORDS (in order to avoid the create table
>>> statement)... and the query, because of the word "STDIN" in the file. It
>>> was not OFFSETTING all the rows I was asking for.
>>>
>>> I hope I've been clear... if not please let me know!
>>>
>>> Ariel
>>>
>>> On Tue, Mar 24, 2020 at 11:33 PM Martin Kersten <martin.kersten@cwi.nl>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> I am not sure what issue you are now dealing with, but since you are a
>>>> company
>>>> that seems to use (or considering) MonetDB in your commercial offerings,
>>>> I gladly refer you to MonetDBSolutions(*), where we can give you the
>>>> commercial
>>>> support needed to diagnose your issues.
>>>>
>>>> regards Martin
>>>>
>>>> (*) https://www.monetdbsolutions.com/
>>>>
>>>> On 24/03/2020 23:55, Ariel Abadi wrote:
>>>>> Thanks Martin for your prompt response
>>>>> But I did and msqldump from other database, and now the upload is being
>>>> done thru the BULK LOAD.
>>>>>
>>>>> /COPY 14173375 RECORDS INTO "sb_traf"."nrm_20180818" FROM stdin USING
>>>> DELIMITERS '\t','\n','"';/
>>>>>
>>>>>
>>>>> Any other idea ? :'(
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 24, 2020 at 7:37 PM Martin Kersten <martin.kersten@cwi.nl
>>>> <mailto:martin.kersten@cwi.nl>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> given the numbers I suspect you are loading them as SQL insert
>>>> statements under autocommit mode,
>>>>> Please read the documentation on Bulk loading.
>>>>>
>>>>> regards,Martin
>>>>> Sent from my iPad
>>>>>
>>>>>> On 24 Mar 2020, at 23:12, Ariel Abadi <aabadi@starbi.com <mailto:
>>>> aabadi@starbi.com>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi!
>>>>>> Maybe somebody can help me, please!
>>>>>>
>>>>>> I'm trying to upload into Monet a file with 14MM records. Is
>> taking
>>>> more than *24 minutes.*
>>>>>> It is a fresh new database.
>>>>>>
>>>>>> Enclosed you will find the server configuration.
>>>>>>
>>>>>> Monetdb Version Nov2019-SP3
>>>>>>
>>>>>>
>>>>>> *[root@mn tmp]$ lscpu*
>>>>>> Architecture: x86_64
>>>>>> CPU op-mode(s): 32-bit, 64-bit
>>>>>> Byte Order: Little Endian
>>>>>> CPU(s): 40
>>>>>> On-line CPU(s) list: 0-39
>>>>>> Thread(s) per core: 2
>>>>>> Core(s) per socket: 10
>>>>>> Socket(s): 2
>>>>>> NUMA node(s): 2
>>>>>> Vendor ID: GenuineIntel
>>>>>> CPU family: 6
>>>>>> Model: 85
>>>>>> Model name: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
>>>>>> Stepping: 4
>>>>>> CPU MHz: 800.479
>>>>>> BogoMIPS: 4400.00
>>>>>> Virtualization: VT-x
>>>>>> L1d cache: 32K
>>>>>> L1i cache: 32K
>>>>>> L2 cache: 1024K
>>>>>> L3 cache: 14080K
>>>>>> NUMA node0 CPU(s):
>>>> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>>>>>> NUMA node1 CPU(s):
>>>> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>>>>>> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep
>>>> mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
>> pbe
>>>> syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
>>>> rep_good nopl xtopology nonstop_tsc
>>>>>> cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
>>>> tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe
>>>> popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
>>>> 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3
>>>>>> invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow
>>>> vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep
>>>> bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap
>>>> clflushopt clwb intel_pt avx512cd avx512bw
>>>>>> avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc
>>>> cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear
>>>> flush_l1d
>>>>>>
>>>>>>
>>>>>> *[root@mn tmp]$ cat /proc/meminfo*
>>>>>>
>>>>>> MemTotal: 83886080 kB
>>>>>> MemFree: 10535812 kB
>>>>>> MemAvailable: 80159744 kB
>>>>>> Buffers: 0 kB
>>>>>> Cached: 69623932 kB
>>>>>> SwapCached: 0 kB
>>>>>> Active: 6625436 kB
>>>>>> Inactive: 63737872 kB
>>>>>> Active(anon): 324456 kB
>>>>>> Inactive(anon): 417184 kB
>>>>>> Active(file): 6300980 kB
>>>>>> Inactive(file): 63320688 kB
>>>>>> Unevictable: 0 kB
>>>>>> Mlocked: 5320 kB
>>>>>> SwapTotal: 8388604 kB
>>>>>> SwapFree: 8381988 kB
>>>>>> Dirty: 220 kB
>>>>>> Writeback: 0 kB
>>>>>> AnonPages: 13626052 kB
>>>>>> Mapped: 186368 kB
>>>>>> Shmem: 1800 kB
>>>>>> KReclaimable: 4134960 kB
>>>>>> Slab: 0 kB
>>>>>> SReclaimable: 0 kB
>>>>>> SUnreclaim: 0 kB
>>>>>> KernelStack: 16720 kB
>>>>>> PageTables: 56500 kB
>>>>>> NFS_Unstable: 0 kB
>>>>>> Bounce: 0 kB
>>>>>> WritebackTmp: 0 kB
>>>>>> CommitLimit: 74123472 kB
>>>>>> Committed_AS: 18337812 kB
>>>>>> VmallocTotal: 34359738367 kB
>>>>>> VmallocUsed: 415836 kB
>>>>>> VmallocChunk: 0 kB
>>>>>> Percpu: 39552 kB
>>>>>> HardwareCorrupted: 0 kB
>>>>>> AnonHugePages: 0 kB
>>>>>> ShmemHugePages: 0 kB
>>>>>> ShmemPmdMapped: 0 kB
>>>>>> CmaTotal: 0 kB
>>>>>> CmaFree: 0 kB
>>>>>> HugePages_Total: 0
>>>>>> HugePages_Free: 0
>>>>>> HugePages_Rsvd: 0
>>>>>> HugePages_Surp: 0
>>>>>> Hugepagesize: 2048 kB
>>>>>> Hugetlb: 0 kB
>>>>>> DirectMap4k: 816396 kB
>>>>>> DirectMap2M: 82513920 kB
>>>>>> DirectMap1G: 52428800 kB
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users-list mailing list
>>>>>> users-list@monetdb.org <mailto:users-list@monetdb.org>
>>>>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>>> _______________________________________________
>>>>> users-list mailing list
>>>>> users-list@monetdb.org <mailto:users-list@monetdb.org>
>>>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users-list mailing list
>>>>> users-list@monetdb.org
>>>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> users-list mailing list
>>> users-list@monetdb.org
>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>
>>
>> _______________________________________________
>> users-list mailing list
>> users-list@monetdb.org
>> https://www.monetdb.org/mailman/listinfo/users-list
>>
>
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list