Mercurial > hg > monetdb-java
annotate src/main/java/nl/cwi/monetdb/mcl/parser/MCLParser.java @ 322:0fcf338ce0b4
Optimized parse method of TupleLineParser by creating less helper objects and replacing method calls by direct operations on variables.
We now only create a StringBuilder object uesc when it is needed (when an escaped character is detected in the tuple line).
In most tuple data lines no escape characters are used and thus the StringBuilder object was not used/needed.
Also increased the default capacity of StringBuilder uesc such that it is not enlarged so often.
Also made StringBuilder object uesc now part of the TupleLineParser object, such that it can be reused by
many parse() calls, again reducing the number of created StringBuilder objects.
author | Martin van Dinther <martin.van.dinther@monetdbsolutions.com> |
---|---|
date | Wed, 11 Sep 2019 17:18:00 +0200 (2019-09-11) |
parents | bb273e9c7e09 |
children | 54137aeb1f92 |
rev | line source |
---|---|
0
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
1 /* |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
2 * This Source Code Form is subject to the terms of the Mozilla Public |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
3 * License, v. 2.0. If a copy of the MPL was not distributed with this |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
4 * file, You can obtain one at http://mozilla.org/MPL/2.0/. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
5 * |
261
d4baf8a4b43a
Update Copyright year to 2019
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents:
203
diff
changeset
|
6 * Copyright 1997 - July 2008 CWI, August 2008 - 2019 MonetDB B.V. |
0
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
7 */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
8 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
9 package nl.cwi.monetdb.mcl.parser; |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
10 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
11 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
12 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
13 * Interface for parsers in MCL. The parser family in MCL is set up as |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
14 * a reusable object. This allows the same parser to be used again for |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
15 * the same type of work. While this is a very unnatural solution in |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
16 * the Java language, it prevents many object creations on a low level |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
17 * of the protocol. This favours performance. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
18 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
19 * A typical parser has a method parse() which takes a String, and the |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
20 * methods hasNext() and next() to retrieve the values that were |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
21 * extracted by the parser. Parser specific methods may be available to |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
22 * perform common tasks. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
23 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
24 * @author Fabian Groffen |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
25 */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
26 public abstract class MCLParser { |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
27 /** The String values found while parsing. Public, you may touch it. */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
28 public final String values[]; |
322
0fcf338ce0b4
Optimized parse method of TupleLineParser by creating less helper objects and replacing method calls by direct operations on variables.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents:
297
diff
changeset
|
29 protected int colnr = 0; |
0
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
30 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
31 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
32 * Creates an MCLParser targetted at a given number of field values. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
33 * The lines parsed by an instance of this MCLParser should have |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
34 * exactly capacity field values. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
35 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
36 * @param capacity the number of field values to expect |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
37 */ |
297
bb273e9c7e09
Add "final" keyword to classes, method arguments and local variables where possible.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents:
261
diff
changeset
|
38 protected MCLParser(final int capacity) { |
0
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
39 values = new String[capacity]; |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
40 } |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
41 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
42 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
43 * Parse the given string, and populate the internal field array |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
44 * to allow for next() and hasNext() calls. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
45 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
46 * @param source the String containing the line to parse |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
47 * @return value |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
48 * @throws MCLParseException if source cannot be (fully) parsed by |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
49 * this parser |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
50 * @see #next() |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
51 * @see #hasNext() |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
52 */ |
297
bb273e9c7e09
Add "final" keyword to classes, method arguments and local variables where possible.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents:
261
diff
changeset
|
53 abstract public int parse(final String source) throws MCLParseException; |
0
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
54 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
55 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
56 * Repositions the internal field offset to the start, such that the |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
57 * next call to next() will return the first field again. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
58 */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
59 final public void reset() { |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
60 colnr = 0; |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
61 } |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
62 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
63 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
64 * Returns whether the next call to next() or nextInt() succeeds. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
65 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
66 * @return true if the next call to next() or nextInt() is bound to |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
67 * succeed |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
68 * @see #next() |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
69 */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
70 final public boolean hasNext() { |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
71 return colnr < values.length; |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
72 } |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
73 |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
74 /** |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
75 * Returns the current field value, and advances the field counter |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
76 * to the next value. This method may fail with a RuntimeError if |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
77 * the current field counter is out of bounds. Call hasNext() to |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
78 * determine if the call to next() will succeed. |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
79 * |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
80 * @return the current field value |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
81 * @see #hasNext() |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
82 */ |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
83 final public String next() { |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
84 return values[colnr++]; |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
85 } |
a5a898f6886c
Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff
changeset
|
86 } |