What is better JSON or TEXT field in terms of performance ???
Hi All! Does anybody has worked with json fields? We are planning to use json fields with very large text on it (the hierarchy of our json would be plain (only one level), we are doing some tests over 7MM records, it took like 1m30s to retrieve the information. In this case to get the information we used the function json.filter(name_of_field, '$..xxxx') We have also tested using a TEXT field (and using some string functions) we parsed the information inside the field and we found that sometimes it was quicker it took from 1m30s to 35s. I dont know exactly why the difference on that delta. We need to implement the processes, and need to decide how to move forward.... Does anybody can help me on the decision ??? We know and is a fact, that Monet can beat any DB in terms of performance, and that is why we decided to use it, but now Im worried about this type of queries (against json/text fields) and how long this queries would take if we increase the amount of records.... What would you suggest ?? Thanks in advance! Rgds Ariel
Hi Just a quick note. The JSON functionality was merely added to MonetDB as proof-of-concept without any intention or special data/index structures to make it fast. It simply maps to a straight string representation, which is parsed upon need to get the components. For this particular case, I would simple store the documents as strings that comply with the JSON structure and use regular expression evaluation to pre-filter the candidates, where after the component can be extracted. Scaleable JSON support is definitely and area where a lot can be gained. I would most likely focus on JSON to RELATIONAL mappings, because this covers > 95% of the cases. Rendering back to JSON can be done either at the server side, or as part of the front-end application. Overall technically not difficult, but quite some engineering. Replacing the JSON atom implementation with a fast library is also a way to go. And those in love of FPGAs might embed a fast JSON parser. For performance comparison, the DB2 and Oracle JSON extensions might be a better approach. As far as I recall, also Postgresql uses a BSON structure and some index support. regards, Martin On 05/12/2017 20:28, Ariel Abadi wrote:
Hi All!
Does anybody has worked with json fields?
We are planning to use json fields with very large text on it (the hierarchy of our json would be plain (only one level), we are doing some tests over 7MM records, it took like 1m30s to retrieve the information. In this case to get the information we used the function json.filter(name_of_field, '$..xxxx')
We have also tested using a TEXT field (and using some string functions) we parsed the information inside the field and we found that sometimes it was quicker it took from 1m30s to 35s. I dont know exactly why the difference on that delta.
We need to implement the processes, and need to decide how to move forward.... Does anybody can help me on the decision ???
We know and is a fact, that Monet can beat any DB in terms of performance, and that is why we decided to use it, but now Im worried about this type of queries (against json/text fields) and how long this queries would take if we increase the amount of records....
What would you suggest ??
Thanks in advance! Rgds Ariel
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hi Martin!
Thank you very much for your quick answer!!!
Just for me to understand, and since I will move forward using a TEXT field
... in terms of performance is the same to use a TEXT field rather than a
VARCHAR(50000) ... (I know varchar(50000) does not very look nice :D) but,
may be has a better performance ? or is exactly the same ?
Thanks again
Rgds
Ariel
On Tue, Dec 5, 2017 at 6:36 PM, Martin Kersten
Hi
Just a quick note. The JSON functionality was merely added to MonetDB as proof-of-concept without any intention or special data/index structures to make it fast. It simply maps to a straight string representation, which is parsed upon need to get the components. For this particular case, I would simple store the documents as strings that comply with the JSON structure and use regular expression evaluation to pre-filter the candidates, where after the component can be extracted.
Scaleable JSON support is definitely and area where a lot can be gained. I would most likely focus on JSON to RELATIONAL mappings, because this covers > 95% of the cases. Rendering back to JSON can be done either at the server side, or as part of the front-end application. Overall technically not difficult, but quite some engineering. Replacing the JSON atom implementation with a fast library is also a way to go. And those in love of FPGAs might embed a fast JSON parser.
For performance comparison, the DB2 and Oracle JSON extensions might be a better approach. As far as I recall, also Postgresql uses a BSON structure and some index support.
regards, Martin On 05/12/2017 20:28, Ariel Abadi wrote:
Hi All!
Does anybody has worked with json fields?
We are planning to use json fields with very large text on it (the hierarchy of our json would be plain (only one level), we are doing some tests over 7MM records, it took like 1m30s to retrieve the information. In this case to get the information we used the function json.filter(name_of_field, '$..xxxx')
We have also tested using a TEXT field (and using some string functions) we parsed the information inside the field and we found that sometimes it was quicker it took from 1m30s to 35s. I dont know exactly why the difference on that delta.
We need to implement the processes, and need to decide how to move forward.... Does anybody can help me on the decision ???
We know and is a fact, that Monet can beat any DB in terms of performance, and that is why we decided to use it, but now Im worried about this type of queries (against json/text fields) and how long this queries would take if we increase the amount of records....
What would you suggest ??
Thanks in advance! Rgds Ariel
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
On 06/12/17 13:47, Ariel Abadi wrote:
Hi Martin! Thank you very much for your quick answer!!!
Just for me to understand, and since I will move forward using a TEXT field ... in terms of performance is the same to use a TEXT field rather than a VARCHAR(50000) ... (I know varchar(50000) does not very look nice :D) but, may be has a better performance ? or is exactly the same ?
That's the same. There is no difference in which strings are stored whether they are VARCHAR(1) or CHARACTER LARGE OBJECT (aka CLOB or STRING) or TEXT.
Thanks again Rgds Ariel
*/ /* */ /*
On Tue, Dec 5, 2017 at 6:36 PM, Martin Kersten
mailto:martin.kersten@cwi.nl> wrote: Hi
Just a quick note. The JSON functionality was merely added to MonetDB as proof-of-concept without any intention or special data/index structures to make it fast. It simply maps to a straight string representation, which is parsed upon need to get the components. For this particular case, I would simple store the documents as strings that comply with the JSON structure and use regular expression evaluation to pre-filter the candidates, where after the component can be extracted.
Scaleable JSON support is definitely and area where a lot can be gained. I would most likely focus on JSON to RELATIONAL mappings, because this covers > 95% of the cases. Rendering back to JSON can be done either at the server side, or as part of the front-end application. Overall technically not difficult, but quite some engineering. Replacing the JSON atom implementation with a fast library is also a way to go. And those in love of FPGAs might embed a fast JSON parser.
For performance comparison, the DB2 and Oracle JSON extensions might be a better approach. As far as I recall, also Postgresql uses a BSON structure and some index support.
regards, Martin On 05/12/2017 20:28, Ariel Abadi wrote:
Hi All!
Does anybody has worked with json fields?
We are planning to use json fields with very large text on it (the hierarchy of our json would be plain (only one level), we are doing some tests over 7MM records, it took like 1m30s to retrieve the information. In this case to get the information we used the function json.filter(name_of_field, '$..xxxx')
We have also tested using a TEXT field (and using some string functions) we parsed the information inside the field and we found that sometimes it was quicker it took from 1m30s to 35s. I dont know exactly why the difference on that delta.
We need to implement the processes, and need to decide how to move forward.... Does anybody can help me on the decision ???
We know and is a fact, that Monet can beat any DB in terms of performance, and that is why we decided to use it, but now Im worried about this type of queries (against json/text fields) and how long this queries would take if we increase the amount of records....
What would you suggest ??
Thanks in advance! Rgds Ariel
_______________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list https://www.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
participants (3)
-
Ariel Abadi
-
Martin Kersten
-
Sjoerd Mullender