class Puppet::Pops::Parser::Lexer2

Constants

KEYWORDS

Keywords are all singleton tokens with pre-calculated lengths. Booleans are pre-calculated (rather than evaluating the strings "false" and "true" repeatedly).

KEYWORD_NAMES

Reverse lookup of keyword name to string

PATTERN_BARE_WORD
PATTERN_CLASSREF

The NAME and CLASSREF in 4x are strict. Each segment must start with a letter a-z and may not contain dashes (\w includes letters, digits and _).
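As a sketch, the strict segment rules can be captured with regexes like the following (illustrative approximations, not the exact PATTERN_NAME / PATTERN_CLASSREF source):

```ruby
# Illustrative approximations of the strict 4x rules: each segment starts
# with a letter (lowercase for names, uppercase for class references) and
# continues with \w characters only, so dashes are rejected.
NAME_RX     = /\A[a-z]\w*(::[a-z]\w*)*\z/
CLASSREF_RX = /\A[A-Z]\w*(::[A-Z]\w*)*\z/
```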

PATTERN_COMMENT

The single line comment includes the line ending.

PATTERN_DOLLAR_VAR
PATTERN_MLCOMMENT
PATTERN_NAME
PATTERN_NON_WS
PATTERN_NUMBER
PATTERN_REGEX
PATTERN_REGEX_A
PATTERN_REGEX_END
PATTERN_REGEX_ESC
PATTERN_REGEX_Z
PATTERN_WS
STRING_BSLASH_SLASH

PERFORMANCE NOTE: Comparison against a frozen string is faster than against an unfrozen one.
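As an illustrative sketch of why the note matters (the constant and helper below are hypothetical stand-ins, not the lexer's actual code):

```ruby
# Hypothetical stand-in for a frozen singleton string constant. Frozen
# literals can be deduplicated by the interpreter, and String#== checks
# object identity first, so repeated comparisons against the same frozen
# string avoid allocating a fresh string on every call.
STRING_BSLASH_SLASH_DEMO = '\\/'.freeze

def ends_with_escaped_slash?(value)
  value.end_with?(STRING_BSLASH_SLASH_DEMO)
end
```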

TOKEN_APPENDS
TOKEN_AT
TOKEN_ATAT
TOKEN_COLON
TOKEN_COMMA
TOKEN_DELETES
TOKEN_DIV
TOKEN_DOT
TOKEN_DQMID
TOKEN_DQPOS
TOKEN_DQPRE
TOKEN_EPPEND
TOKEN_EPPEND_TRIM
TOKEN_EPPSTART

EPP_START is currently a marker token; it may later get syntax.

TOKEN_EQUALS
TOKEN_FARROW
TOKEN_GREATEREQUAL
TOKEN_GREATERTHAN
TOKEN_HEREDOC

The HEREDOC token carries the heredoc syntax as an argument.

TOKEN_IN_EDGE
TOKEN_IN_EDGE_SUB
TOKEN_ISEQUAL
TOKEN_LBRACE
TOKEN_LBRACK

All tokens have three slots: the token name (a Symbol), the token text (a String), and the token text length. All operator and punctuation tokens reuse singleton arrays; tokens that require unique values create a unique array per token.

PERFORMANCE NOTE: This construct reduces the number of objects that need to be created for operators and punctuation. The length is pre-calculated for all singleton tokens, and is used both to signal the length of the token and to advance the scanner position (without having to advance it with a scan(regexp)).
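A minimal sketch of the three-slot layout and of advancing the scanner by the precomputed length (the token constant and helper here are hypothetical stand-ins):

```ruby
require 'strscan'

# Three slots: token name (Symbol), token text (String), precomputed length.
# The frozen triple is shared by every occurrence of the operator.
TOKEN_FARROW_DEMO = [:FARROW, '=>'.freeze, 2].freeze

# Advance the scanner by the precomputed length instead of re-scanning.
def emit_demo(scanner, token)
  scanner.pos += token[2]
  [token[0], token[1]]
end
```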

TOKEN_LCOLLECT
TOKEN_LESSEQUAL
TOKEN_LESSTHAN
TOKEN_LISTSTART
TOKEN_LLCOLLECT
TOKEN_LPAREN
TOKEN_LSHIFT
TOKEN_MATCH
TOKEN_MINUS
TOKEN_MODULO
TOKEN_NOMATCH
TOKEN_NOT
TOKEN_NOTEQUAL
TOKEN_NUMBER
TOKEN_OTHER

This is used for unrecognized tokens; it will always be a single character. This particular instance is not used, but is kept here for documentation purposes.

TOKEN_OUT_EDGE
TOKEN_OUT_EDGE_SUB
TOKEN_PARROW
TOKEN_PIPE
TOKEN_PLUS
TOKEN_QMARK
TOKEN_RBRACE
TOKEN_RBRACK
TOKEN_RCOLLECT
TOKEN_REGEXP
TOKEN_RPAREN
TOKEN_RRCOLLECT
TOKEN_RSHIFT
TOKEN_SELBRACE
TOKEN_SEMIC
TOKEN_STRING

Tokens that are always unique to what has been lexed

TOKEN_TILDE
TOKEN_TIMES
TOKEN_VARIABLE
TOKEN_VARIABLE_EMPTY
TOKEN_WORD
TOKEN_WSLPAREN

Attributes

locator[R]

Public Class Methods

new()
    # File lib/puppet/pops/parser/lexer2.rb
183 def initialize()
184   @selector = {
185     '.' =>  lambda { emit(TOKEN_DOT, @scanner.pos) },
186     ',' => lambda {  emit(TOKEN_COMMA, @scanner.pos) },
187     '[' => lambda do
188       before = @scanner.pos
189       # Must check the preceding character to see if it is whitespace.
190       # The fastest thing to do is to simply byteslice to get the string ending at the offset before
191       # and then check what the last character is. (This is the same as what a locator.char_offset needs
192       # to compute, but with less overhead of trying to find out the global offset from a local offset in the
193       # case when this is sublocated in a heredoc).
194       if before == 0 || @scanner.string.byteslice(0, before)[-1] =~ /[[:blank:]\r\n]+/
195         emit(TOKEN_LISTSTART, before)
196       else
197         emit(TOKEN_LBRACK, before)
198       end
199     end,
200     ']' => lambda { emit(TOKEN_RBRACK, @scanner.pos) },
201     '(' => lambda do
202       before = @scanner.pos
203       # If first on a line, or only whitespace between start of line and '('
204       # then the token is special to avoid being taken as start of a call.
205       line_start = @lexing_context[:line_lexical_start]
206       if before == line_start || @scanner.string.byteslice(line_start, before - line_start) =~ /\A[[:blank:]\r]+\Z/
207         emit(TOKEN_WSLPAREN, before)
208       else
209         emit(TOKEN_LPAREN, before)
210       end
211     end,
212     ')' => lambda { emit(TOKEN_RPAREN, @scanner.pos) },
213     ';' => lambda { emit(TOKEN_SEMIC, @scanner.pos) },
214     '?' => lambda { emit(TOKEN_QMARK, @scanner.pos) },
215     '*' => lambda { emit(TOKEN_TIMES, @scanner.pos) },
216     '%' => lambda do
217       scn = @scanner
218       before = scn.pos
219       la = scn.peek(2)
220       if la[1] == '>' && @lexing_context[:epp_mode]
221         scn.pos += 2
222         if @lexing_context[:epp_mode] == :expr
223           enqueue_completed(TOKEN_EPPEND, before)
224         end
225         @lexing_context[:epp_mode] = :text
226         interpolate_epp
227       else
228         emit(TOKEN_MODULO, before)
229       end
230     end,
231     '{' => lambda do
232       # The lexer needs to help the parser since the technology used cannot deal with
233       # lookahead of same token with different precedence. This is solved by making left brace
234       # after ? into a separate token.
235       #
236       @lexing_context[:brace_count] += 1
237       emit(if @lexing_context[:after] == :QMARK
238              TOKEN_SELBRACE
239            else
240              TOKEN_LBRACE
241            end, @scanner.pos)
242     end,
243     '}' => lambda do
244       @lexing_context[:brace_count] -= 1
245       emit(TOKEN_RBRACE, @scanner.pos)
246     end,
247 
248 
249     # TOKENS @, @@, @(
250     '@' => lambda do
251       scn = @scanner
252       la = scn.peek(2)
253       if la[1] == '@'
254         emit(TOKEN_ATAT, scn.pos) # TODO; Check if this is good for the grammar
255       elsif la[1] == '('
256         heredoc
257       else
258         emit(TOKEN_AT, scn.pos)
259       end
260     end,
261 
262     # TOKENS |, |>, |>>
263     '|' => lambda do
264       scn = @scanner
265       la = scn.peek(3)
266       emit(la[1] == '>' ? (la[2] == '>' ? TOKEN_RRCOLLECT : TOKEN_RCOLLECT) : TOKEN_PIPE, scn.pos)
267     end,
268 
269     # TOKENS =, =>, ==, =~
270     '=' => lambda do
271       scn = @scanner
272       la = scn.peek(2)
273       emit(case la[1]
274            when '='
275              TOKEN_ISEQUAL
276            when '>'
277              TOKEN_FARROW
278            when '~'
279              TOKEN_MATCH
280            else
281              TOKEN_EQUALS
282            end, scn.pos)
283     end,
284 
285     # TOKENS '+', '+=', and '+>'
286     '+' => lambda do
287       scn = @scanner
288       la = scn.peek(2)
289       emit(case la[1]
290            when '='
291              TOKEN_APPENDS
292            when '>'
293              TOKEN_PARROW
294            else
295              TOKEN_PLUS
296            end, scn.pos)
297     end,
298 
299     # TOKENS '-', '->', and epp '-%>' (end of interpolation with trim)
300     '-' => lambda do
301       scn = @scanner
302       la = scn.peek(3)
303       before = scn.pos
304       if @lexing_context[:epp_mode] && la[1] == '%' && la[2] == '>'
305         scn.pos += 3
306         if @lexing_context[:epp_mode] == :expr
307           enqueue_completed(TOKEN_EPPEND_TRIM, before)
308         end
309         interpolate_epp(:with_trim)
310       else
311         emit(case la[1]
312              when '>'
313                TOKEN_IN_EDGE
314              when '='
315                TOKEN_DELETES
316              else
317                TOKEN_MINUS
318              end, before)
319       end
320     end,
321 
322     # TOKENS !, !=, !~
323     '!' => lambda do
324       scn = @scanner
325       la = scn.peek(2)
326       emit(case la[1]
327            when '='
328              TOKEN_NOTEQUAL
329            when '~'
330              TOKEN_NOMATCH
331            else
332              TOKEN_NOT
333            end, scn.pos)
334     end,
335 
336     # TOKENS ~>, ~
337     '~' => lambda do
338       scn = @scanner
339       la = scn.peek(2)
340       emit(la[1] == '>' ? TOKEN_IN_EDGE_SUB : TOKEN_TILDE, scn.pos)
341     end,
342 
343     '#' => lambda { @scanner.skip(PATTERN_COMMENT); nil },
344 
345     # TOKENS '/', '/*' and '/ regexp /'
346     '/' => lambda do
347       scn = @scanner
348       la = scn.peek(2)
349       if la[1] == '*'
350         lex_error(Issues::UNCLOSED_MLCOMMENT) if scn.skip(PATTERN_MLCOMMENT).nil?
351         nil
352       else
353         before = scn.pos
354         # regexp position is a regexp, else a div
355         value = scn.scan(PATTERN_REGEX) if regexp_acceptable?
356         if value
357           # Ensure an escaped / was not matched
358           while escaped_end(value)
359             more = scn.scan_until(PATTERN_REGEX_END)
360             return emit(TOKEN_DIV, before) unless more
361             value << more
362           end
363           regex = value.sub(PATTERN_REGEX_A, '').sub(PATTERN_REGEX_Z, '').gsub(PATTERN_REGEX_ESC, '/')
364           emit_completed([:REGEX, Regexp.new(regex), scn.pos-before], before)
365         else
366           emit(TOKEN_DIV, before)
367         end
368       end
369     end,
370 
371     # TOKENS <, <=, <|, <<|, <<, <-, <~
372     '<' => lambda do
373       scn = @scanner
374       la = scn.peek(3)
375       emit(case la[1]
376            when '<'
377              if la[2] == '|'
378                TOKEN_LLCOLLECT
379              else
380                TOKEN_LSHIFT
381              end
382            when '='
383              TOKEN_LESSEQUAL
384            when '|'
385              TOKEN_LCOLLECT
386            when '-'
387              TOKEN_OUT_EDGE
388            when '~'
389              TOKEN_OUT_EDGE_SUB
390            else
391              TOKEN_LESSTHAN
392            end, scn.pos)
393     end,
394 
395     # TOKENS >, >=, >>
396     '>' => lambda do
397       scn = @scanner
398       la = scn.peek(2)
399       emit(case la[1]
400            when '>'
401              TOKEN_RSHIFT
402            when '='
403              TOKEN_GREATEREQUAL
404            else
405              TOKEN_GREATERTHAN
406            end, scn.pos)
407     end,
408 
409     # TOKENS :, ::CLASSREF, ::NAME
410     ':' => lambda do
411       scn = @scanner
412       la = scn.peek(3)
413       before = scn.pos
414       if la[1] == ':'
415         # PERFORMANCE NOTE: This could potentially be speeded up by using a case/when listing all
416         # upper case letters. Alternatively, the 'A', and 'Z' comparisons may be faster if they are
417         # frozen.
418         #
419         la2 = la[2]
420         if la2 >= 'A' && la2 <= 'Z'
421           # CLASSREF or error
422           value = scn.scan(PATTERN_CLASSREF)
423           if value && scn.peek(2) != '::'
424             after = scn.pos
425             emit_completed([:CLASSREF, value.freeze, after-before], before)
426           else
427             # move to faulty position ('::<uc-letter>' was ok)
428             scn.pos = scn.pos + 3
429             lex_error(Issues::ILLEGAL_FULLY_QUALIFIED_CLASS_REFERENCE)
430           end
431         else
432           value = scn.scan(PATTERN_BARE_WORD)
433           if value
434             if value =~ PATTERN_NAME
435               emit_completed([:NAME, value.freeze, scn.pos - before], before)
436             else
437               emit_completed([:WORD, value.freeze, scn.pos - before], before)
438             end
439           else
440             # move to faulty position ('::' was ok)
441             scn.pos = scn.pos + 2
442             lex_error(Issues::ILLEGAL_FULLY_QUALIFIED_NAME)
443           end
444         end
445       else
446         emit(TOKEN_COLON, before)
447       end
448     end,
449 
450     '$' => lambda do
451       scn = @scanner
452       before = scn.pos
453       value = scn.scan(PATTERN_DOLLAR_VAR)
454       if value
455         emit_completed([:VARIABLE, value[1..-1].freeze, scn.pos - before], before)
456       else
457         # consume the $ and let higher layer complain about the error instead of getting a syntax error
458         emit(TOKEN_VARIABLE_EMPTY, before)
459       end
460     end,
461 
462     '"' => lambda do
463       # Recursive string interpolation, 'interpolate' either returns a STRING token, or
464       # a DQPRE with the rest of the string's tokens placed in the @token_queue
465       interpolate_dq
466     end,
467 
468     "'" => lambda do
469       scn = @scanner
470       before = scn.pos
471       emit_completed([:STRING, slurp_sqstring.freeze, scn.pos - before], before)
472     end,
473 
474     "\n" => lambda do
475       # If heredoc_cont is in effect there are heredoc text lines to skip over
476       # otherwise just skip the newline.
477       #
478       ctx = @lexing_context
479       if ctx[:newline_jump]
480         @scanner.pos = ctx[:newline_jump]
481         ctx[:newline_jump] = nil
482       else
483         @scanner.pos += 1
484       end
485       ctx[:line_lexical_start] = @scanner.pos
486       nil
487     end,
488     '' => lambda { nil } # when the peek(1) returns empty
489   }
490 
491   [ ' ', "\t", "\r" ].each { |c| @selector[c] = lambda { @scanner.skip(PATTERN_WS); nil } }
492 
493   [ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'].each do |c|
494     @selector[c] = lambda do
495       scn = @scanner
496       before = scn.pos
497       value = scn.scan(PATTERN_NUMBER)
498       if value
499         length = scn.pos - before
500         assert_numeric(value, before)
501         emit_completed([:NUMBER, value.freeze, length], before)
502       else
503         invalid_number = scn.scan_until(PATTERN_NON_WS)
504         if before > 1
505           after = scn.pos
506           scn.pos = before - 1
507           if scn.peek(1) == '.'
508             # preceded by a dot. Is this a bad decimal number then?
509             scn.pos = before - 2
510             while scn.peek(1) =~ /^\d$/
511               invalid_number = nil
512               before = scn.pos
513               break if before == 0
514               scn.pos = scn.pos - 1
515             end
516           end
517           scn.pos = before
518           invalid_number = scn.peek(after - before) unless invalid_number
519         end
520         assert_numeric(invalid_number, before)
521         scn.pos = before + 1
522         lex_error(Issues::ILLEGAL_NUMBER, {:value => invalid_number})
523       end
524     end
525   end
526   ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
527     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '_'].each do |c|
528     @selector[c] = lambda do
529       scn = @scanner
530       before = scn.pos
531       value = scn.scan(PATTERN_BARE_WORD)
532       if value && value =~ PATTERN_NAME
533         emit_completed(KEYWORDS[value] || @taskm_keywords[value] || [:NAME, value.freeze, scn.pos - before], before)
534       elsif value
535         emit_completed([:WORD, value.freeze, scn.pos - before], before)
536       else
537         # move to faulty position ([a-z_] was ok)
538         scn.pos = scn.pos + 1
539         fully_qualified = scn.match?(/::/)
540         if fully_qualified
541           lex_error(Issues::ILLEGAL_FULLY_QUALIFIED_NAME)
542         else
543           lex_error(Issues::ILLEGAL_NAME_OR_BARE_WORD)
544         end
545       end
546     end
547   end
548 
549   ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
550     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'].each do |c|
551     @selector[c] = lambda do
552       scn = @scanner
553       before = scn.pos
554       value = @scanner.scan(PATTERN_CLASSREF)
555       if value && @scanner.peek(2) != '::'
556         emit_completed([:CLASSREF, value.freeze, scn.pos - before], before)
557       else
558         # move to faulty position ([A-Z] was ok)
559         scn.pos = scn.pos + 1
560         lex_error(Issues::ILLEGAL_CLASS_REFERENCE)
561       end
562     end
563   end
564 
565   @selector.default = lambda do
566     # In case of unicode spaces of various kinds that are captured by a regexp, but not by the
567     # simpler case expression above (not worth handling those special cases with better performance).
568     scn = @scanner
569     if scn.skip(PATTERN_WS)
570       nil
571     else
572       # "unrecognized char"
573       emit([:OTHER, scn.peek(0), 1], scn.pos)
574     end
575   end
576   @selector.each { |k,v| k.freeze }
577   @selector.freeze
578 end
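The dispatch technique used by initialize (a hash from the next character to a handler lambda, with a default entry, frozen once built) can be sketched in miniature:

```ruby
require 'strscan'

# Miniature version of the selector idea: peek one character, dispatch to a
# lambda, fall back to a default handler for anything unrecognized.
def build_selector(scanner)
  selector = Hash.new(lambda { scanner.pos += 1; [:OTHER, scanner.string[scanner.pos - 1]] })
  selector[','] = lambda { scanner.pos += 1; [:COMMA, ','] }
  ('0'..'9').each { |c| selector[c] = lambda { [:NUMBER, scanner.scan(/\d+/)] } }
  selector.freeze
end

def tokens_demo(string)
  scanner  = StringScanner.new(string)
  selector = build_selector(scanner)
  result = []
  result << selector[scanner.peek(1)].call until scanner.eos?
  result
end
```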

Public Instance Methods

clear()

Clears the lexer state. It is not required to call this, as the state will be garbage collected and the next lex call (lex_string, lex_file) will reset the internal state.

    # File lib/puppet/pops/parser/lexer2.rb
598 def clear()
599   # not really needed, but if someone wants to ensure garbage is collected as early as possible
600   @scanner = nil
601   @locator = nil
602   @lexing_context = nil
603 end
emit(token, byte_offset)

Emits (produces) a token [:tokensymbol, TokenValue] and moves the scanner's position past the token

    # File lib/puppet/pops/parser/lexer2.rb
725 def emit(token, byte_offset)
726   @scanner.pos = byte_offset + token[2]
727   [token[0], TokenValue.new(token, byte_offset, @locator)]
728 end
emit_completed(token, byte_offset)

Emits the completed token on the form [:tokensymbol, TokenValue]. This method does not alter the scanner's position.

    # File lib/puppet/pops/parser/lexer2.rb
733 def emit_completed(token, byte_offset)
734   [token[0], TokenValue.new(token, byte_offset, @locator)]
735 end
enqueue(emitted_token)

Allows subprocessors (for heredoc, etc.) to enqueue tokens that are tokenized by a different lexer instance.

    # File lib/puppet/pops/parser/lexer2.rb
744 def enqueue(emitted_token)
745   @token_queue << emitted_token
746 end
enqueue_completed(token, byte_offset)

Enqueues a completed token at the given offset

    # File lib/puppet/pops/parser/lexer2.rb
738 def enqueue_completed(token, byte_offset)
739   @token_queue << emit_completed(token, byte_offset)
740 end
escaped_end(value)

Determine if last char of value is escaped by a backslash

    # File lib/puppet/pops/parser/lexer2.rb
581 def escaped_end(value)
582   escaped = false
583   if value.end_with?(STRING_BSLASH_SLASH)
584     value[1...-1].each_codepoint do |cp|
585       if cp == 0x5c # backslash
586         escaped = !escaped
587       else
588         escaped = false
589       end
590     end
591   end
592   escaped
593 end
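The parity rule this implements (a '/' is escaped only when preceded by an odd number of backslashes) can also be expressed by counting the trailing backslash run; this is a standalone sketch, not the method above:

```ruby
# A '/' at the end is escaped iff it is preceded by an odd number of
# consecutive backslashes.
def escaped_end_demo(value)
  return false unless value.end_with?('\\/')
  value[0..-2][/\\*\z/].length.odd?  # length of the backslash run before '/'
end
```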
file()

TODO: This method should not be used; callers should get the locator, since it is most likely required to compute line, position, etc. given offsets.

    # File lib/puppet/pops/parser/lexer2.rb
646 def file
647   @locator ? @locator.file : nil
648 end
file=(file)

Convenience method for compatibility with the older lexer. Use lex_file instead. (It is bad form to overload the assignment operator for something that is not really an assignment.)

    # File lib/puppet/pops/parser/lexer2.rb
639 def file=(file)
640   lex_file(file)
641 end
fullscan()

Scans all of the content and returns it in an array. Note that the terminating [false, false] token is included in the result.

    # File lib/puppet/pops/parser/lexer2.rb
675 def fullscan
676   result = []
677   scan {|token| result.push(token) }
678   result
679 end
initvars()
    # File lib/puppet/pops/parser/lexer2.rb
660 def initvars
661   @token_queue = []
662   # NOTE: additional keys are used; :escapes, :uq_slurp_pattern, :newline_jump, :epp_*
663   @lexing_context = {
664     :brace_count => 0,
665     :after => nil,
666     :line_lexical_start => 0
667   }
668   # Use of --tasks introduces the new keyword 'plan'
669   @taskm_keywords = Puppet[:tasks] ? { 'plan' => [:PLAN, 'plan',  4], 'apply' => [:APPLY, 'apply', 5] }.freeze : EMPTY_HASH
670 end
lex_file(file)

Initializes lexing of the content of the given file. An empty string is used if the file does not exist.

    # File lib/puppet/pops/parser/lexer2.rb
652 def lex_file(file)
653   initvars
654   contents = Puppet::FileSystem.exist?(file) ? Puppet::FileSystem.read(file, :mode => 'rb', :encoding => 'utf-8') : ''
655   assert_not_bom(contents)
656   @scanner = StringScanner.new(contents.freeze)
657   @locator = Locator.locator(contents, file)
658 end
lex_string(string, path=nil)
    # File lib/puppet/pops/parser/lexer2.rb
614 def lex_string(string, path=nil)
615   initvars
616   assert_not_bom(string)
617   @scanner = StringScanner.new(string)
618   @locator = Locator.locator(string, path)
619 end
lex_token()

This lexes one token at the current position of the scanner. PERFORMANCE NOTE: Any change to this logic should be performance measured.

    # File lib/puppet/pops/parser/lexer2.rb
719 def lex_token
720   @selector[@scanner.peek(1)].call
721 end
lex_unquoted_string(string, locator, escapes, interpolate)

Lexes an unquoted string.

@param string [String] the string to lex
@param locator [Locator] the locator to use (a default is used if nil is given)
@param escapes [Array<String>] array of character strings representing the escape sequences to transform
@param interpolate [Boolean] whether interpolation of expressions should be made or not

    # File lib/puppet/pops/parser/lexer2.rb
627 def lex_unquoted_string(string, locator, escapes, interpolate)
628   initvars
629   assert_not_bom(string)
630   @scanner = StringScanner.new(string)
631   @locator = locator || Locator.locator(string, '')
632   @lexing_context[:escapes] = escapes || UQ_ESCAPES
633   @lexing_context[:uq_slurp_pattern] = interpolate ? (escapes.include?('$') ? SLURP_UQ_PATTERN : SLURP_UQNE_PATTERN) : SLURP_ALL_PATTERN
634 end
regexp_acceptable?()

Answers after which tokens it is acceptable to lex a regular expression. PERFORMANCE NOTE: It may be beneficial to turn this into a hash with a default value of true for missing entries. A case expression with literal values will, however, create a hash internally; since a reference to that hash is always needed, access is almost as costly as a method call.

    # File lib/puppet/pops/parser/lexer2.rb
754 def regexp_acceptable?
755   case @lexing_context[:after]
756 
757   # Ends of (potential) R-value generating expressions
758   when :RPAREN, :RBRACK, :RRCOLLECT, :RCOLLECT
759     false
760 
761   # End of (potential) R-value - but must be allowed because of case expressions
762   # Called out here to not be mistaken for a bug.
763   when :RBRACE
764     true
765 
766   # Operands (that can be followed by DIV (even if illegal in grammar)
767   when :NAME, :CLASSREF, :NUMBER, :STRING, :BOOLEAN, :DQPRE, :DQMID, :DQPOST, :HEREDOC, :REGEX, :VARIABLE, :WORD
768     false
769 
770   else
771     true
772   end
773 end
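The decision can be boiled down to a small lookup on the previous token; this is a simplified sketch (the real method above also calls out :RBRACE explicitly, and uses a larger token set):

```ruby
# After a value-producing token a '/' must be division; anywhere else it may
# start a regular expression. Simplified token set for illustration.
DIV_PRECEDERS = [:NAME, :CLASSREF, :NUMBER, :STRING, :VARIABLE,
                 :RPAREN, :RBRACK, :RRCOLLECT, :RCOLLECT].freeze

def slash_token_demo(after)
  DIV_PRECEDERS.include?(after) ? :DIV : :REGEX_START
end
```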
scan() { |token| ... }

A block must be passed to scan. It will be called with two arguments: a symbol for the token, and an instance of LexerSupport::TokenValue. PERFORMANCE NOTE: The TokenValue is designed to reduce the amount of garbage/temporary data and to only convert the lexer's internal tokens on demand. It is slightly more costly to create an instance of a class defined in Ruby than an Array or Hash, but the gain is much bigger since transformation logic is avoided for many of its members. Most members are never used (e.g. line/pos information, which in general is only of value for error messages and for some expressions the lexer does not know about).

    # File lib/puppet/pops/parser/lexer2.rb
689 def scan
690   # PERFORMANCE note: it is faster to access local variables than instance variables.
691   # This makes a small but notable difference since instance member access is avoided for
692   # every token in the lexed content.
693   #
694   scn   = @scanner
695   lex_error_without_pos(Issues::NO_INPUT_TO_LEXER) unless scn
696 
697   ctx   = @lexing_context
698   queue = @token_queue
699   selector = @selector
700 
701   scn.skip(PATTERN_WS)
702 
703   # This is the lexer's main loop
704   until queue.empty? && scn.eos? do
705     token = queue.shift || selector[scn.peek(1)].call
706     if token
707       ctx[:after] = token[0]
708       yield token
709     end
710   end
711 
712   # Signals end of input
713   yield [false, false]
714 end
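The on-demand conversion idea behind TokenValue can be sketched as a small class that stores the raw token and computes expensive position data only when asked (names here are illustrative, not the real TokenValue API):

```ruby
# Store the raw lexer token and byte offset; derive line information lazily,
# since most tokens never need it (it is mainly used for error messages).
class LazyTokenValueDemo
  def initialize(token, offset, source)
    @token  = token
    @offset = offset
    @source = source
  end

  def value
    @token[1]
  end

  def line
    @line ||= @source[0, @offset].count("\n") + 1
  end
end
```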
string=(string)

Convenience method for compatibility with the older lexer. Use lex_string instead, which allows passing the path to use without first having to call file= (which reads the file if it exists). (It is bad form to overload the assignment operator for something that is not really an assignment; overloading = also does not allow passing more than one argument.)

    # File lib/puppet/pops/parser/lexer2.rb
610 def string=(string)
611   lex_string(string, nil)
612 end