annotate src/main/java/nl/cwi/monetdb/mcl/parser/MCLParser.java @ 322:0fcf338ce0b4

Optimized parse method of TupleLineParser by creating less helper objects and replacing method calls by direct operations on variables. We now only create a StringBuilder object uesc when it is needed (when an escaped character is detected in the tuple line). In most tuple data lines no escape characters are used and thus the StringBuilder object was not used/needed. Also increased the default capacity of StringBuilder uesc such that it is not enlarged so often. Also made StringBuilder object uesc now part of the TupleLineParser object, such that it can be reused by many parse() calls, again reducing the number of created StringBuilder objects.
author Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
date Wed, 11 Sep 2019 17:18:00 +0200 (2019-09-11)
parents bb273e9c7e09
children 54137aeb1f92
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
1 /*
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
2 * This Source Code Form is subject to the terms of the Mozilla Public
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
3 * License, v. 2.0. If a copy of the MPL was not distributed with this
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
4 * file, You can obtain one at http://mozilla.org/MPL/2.0/.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
5 *
261
d4baf8a4b43a Update Copyright year to 2019
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents: 203
diff changeset
6 * Copyright 1997 - July 2008 CWI, August 2008 - 2019 MonetDB B.V.
0
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
7 */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
8
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
9 package nl.cwi.monetdb.mcl.parser;
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
10
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
11
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
12 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
13 * Interface for parsers in MCL. The parser family in MCL is set up as
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
14 * a reusable object. This allows the same parser to be used again for
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
15 * the same type of work. While this is a very unnatural solution in
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
16 * the Java language, it prevents many object creations on a low level
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
17 * of the protocol. This favours performance.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
18 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
19 * A typical parser has a method parse() which takes a String, and the
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
20 * methods hasNext() and next() to retrieve the values that were
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
21 * extracted by the parser. Parser specific methods may be available to
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
22 * perform common tasks.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
23 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
24 * @author Fabian Groffen
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
25 */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
26 public abstract class MCLParser {
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
27 /** The String values found while parsing. Public, you may touch it. */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
28 public final String values[];
322
0fcf338ce0b4 Optimized parse method of TupleLineParser by creating less helper objects and replacing method calls by direct operations on variables.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents: 297
diff changeset
29 protected int colnr = 0;
0
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
30
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
31 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
32 * Creates an MCLParser targetted at a given number of field values.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
33 * The lines parsed by an instance of this MCLParser should have
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
34 * exactly capacity field values.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
35 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
36 * @param capacity the number of field values to expect
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
37 */
297
bb273e9c7e09 Add "final" keyword to classes, method arguments and local variables where possible.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents: 261
diff changeset
38 protected MCLParser(final int capacity) {
0
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
39 values = new String[capacity];
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
40 }
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
41
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
42 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
43 * Parse the given string, and populate the internal field array
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
44 * to allow for next() and hasNext() calls.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
45 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
46 * @param source the String containing the line to parse
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
47 * @return value
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
48 * @throws MCLParseException if source cannot be (fully) parsed by
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
49 * this parser
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
50 * @see #next()
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
51 * @see #hasNext()
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
52 */
297
bb273e9c7e09 Add "final" keyword to classes, method arguments and local variables where possible.
Martin van Dinther <martin.van.dinther@monetdbsolutions.com>
parents: 261
diff changeset
53 abstract public int parse(final String source) throws MCLParseException;
0
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
54
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
55 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
56 * Repositions the internal field offset to the start, such that the
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
57 * next call to next() will return the first field again.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
58 */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
59 final public void reset() {
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
60 colnr = 0;
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
61 }
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
62
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
63 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
64 * Returns whether the next call to next() or nextInt() succeeds.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
65 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
66 * @return true if the next call to next() or nextInt() is bound to
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
67 * succeed
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
68 * @see #next()
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
69 */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
70 final public boolean hasNext() {
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
71 return colnr < values.length;
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
72 }
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
73
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
74 /**
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
75 * Returns the current field value, and advances the field counter
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
76 * to the next value. This method may fail with a RuntimeError if
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
77 * the current field counter is out of bounds. Call hasNext() to
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
78 * determine if the call to next() will succeed.
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
79 *
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
80 * @return the current field value
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
81 * @see #hasNext()
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
82 */
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
83 final public String next() {
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
84 return values[colnr++];
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
85 }
a5a898f6886c Copy of MonetDB java directory changeset e6e32756ad31.
Sjoerd Mullender <sjoerd@acm.org>
parents:
diff changeset
86 }