org.jrubyparser.lexer
Class Lexer

java.lang.Object
  extended by org.jrubyparser.lexer.Lexer

public class Lexer
extends Object

This is a port of the MRI lexer to Java. It is compatible with Ruby 1.8.x or Ruby 1.9.x, depending on the compatibility flag.


Nested Class Summary
static class Lexer.HeredocContext
           
static class Lexer.Keyword
           
static class Lexer.LexState
           
 
Field Summary
 boolean commandStart
           
 Lexer.HeredocContext heredocContext
           
 
Constructor Summary
Lexer()
           
Lexer(boolean isOneEight)
           
 
Method Summary
 boolean advance()
          How the parser advances to the next token.
 boolean collectComments()
           
 StackState getCmdArgumentState()
           
 StackState getConditionState()
           
 String getCurrentLine()
           
 String getEncoding()
           
static Lexer.Keyword getKeyword(String str)
           
 int getLeftParenBegin()
           
 Lexer.LexState getLexState()
           
 SourcePosition getPosition()
           
 SourcePosition getPosition(SourcePosition startPosition, boolean inclusive)
          Get position information for the token/node that follows the node represented by startPosition and the current lexer location.
 boolean getPreserveSpaces()
          Return whether or not the lexer should be "space preserving".
 LexerSource getSource()
           
 StrTerm getStrTerm()
           
 CStringBuilder getTokenBuffer()
           
protected  void handleFileEncodingComment(String encodingLine)
           
 int incrementParenNest()
           
 boolean isCommandStart()
           
 boolean isIdentifierChar(int c)
          Tests whether c is a valid character for an identifier.
 boolean isOneEight()
           
 boolean isSetSpaceSeen()
           
 int nextToken()
           
protected  boolean parseMagicComment(String magicLine)
           
protected  int readComment(int c)
          Read a comment up to end of line.
 int readEscape()
           
 int readUTFEscape(CStringBuilder buffer, boolean stringLiteral, boolean symbolLiteral)
           
 void readUTFEscapeRegexpLiteral(CStringBuilder buffer)
           
 void reset()
           
 void resetStacks()
           
 void setCommandStart(boolean commandStart)
           
 void setEncoding(String encoding)
           
 void setLeftParenBegin(int value)
           
 void setLexState(Lexer.LexState lex_state)
           
 void setParserSupport(ParserSupport parserSupport)
          The parser must pass its support object for a check at the bottom of yylex().
 void setPreserveSpaces(boolean preserveSpaces)
          Set whether or not the lexer should be "space preserving" - in other words, whether the parser should consider whitespace sequences and code comments to be separate tokens to return to the client.
 void setSource(LexerSource source)
          Allow the parser to set the source for its lexer.
 void setSpaceSeen(boolean setSpaceSeen)
           
 void setState(Lexer.LexState state)
           
 void setStrTerm(StrTerm strterm)
           
 void setValue(Object yaccValue)
           
 void setWarnings(IRubyWarnings warnings)
           
 int token()
          Last token read from the lexer at the end of a call to yylex()
 int tokenAddMBC(int codepoint, CStringBuilder buffer)
           
 void tokenAddMBCFromSrc(int c, CStringBuilder buffer)
           
 Object value()
          Value of last token (if it is a token which has a value).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

heredocContext

public Lexer.HeredocContext heredocContext

commandStart

public boolean commandStart
Constructor Detail

Lexer

public Lexer()

Lexer

public Lexer(boolean isOneEight)
Method Detail

getKeyword

public static Lexer.Keyword getKeyword(String str)

setPreserveSpaces

public void setPreserveSpaces(boolean preserveSpaces)
Set whether or not the lexer should be "space preserving" - in other words, whether the parser should consider whitespace sequences and code comments to be separate tokens to return to the client. Parsers typically do not want to see any whitespace or comment tokens, but an IDE trying to tokenize a chunk of source code does want to identify these separately. The default, false, selects parser mode.

Parameters:
preserveSpaces - if true, return space and comment sequences as tokens; if false, skip them
See Also:
getPreserveSpaces()
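
The difference between parser mode and space-preserving mode can be illustrated with a small, self-contained sketch. PreserveSpacesSketch is a hypothetical toy tokenizer, not part of org.jrubyparser; it only assumes that whitespace runs and '#' comments are the sequences affected by the flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy tokenizer illustrating a "space preserving" flag; not the real Lexer. */
final class PreserveSpacesSketch {
    // Mirrors the documented default: false means parser mode.
    private boolean preserveSpaces = false;

    void setPreserveSpaces(boolean preserveSpaces) { this.preserveSpaces = preserveSpaces; }

    boolean getPreserveSpaces() { return preserveSpaces; }

    /** Splits input into word, whitespace and '#' comment tokens; the last two are dropped in parser mode. */
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char ch = input.charAt(i);
            boolean trivia;
            if (ch == '#') {
                // A comment runs up to, but not including, the end of the line.
                while (i < input.length() && input.charAt(i) != '\n') i++;
                trivia = true;
            } else if (Character.isWhitespace(ch)) {
                while (i < input.length() && Character.isWhitespace(input.charAt(i))) i++;
                trivia = true;
            } else {
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && input.charAt(i) != '#') i++;
                trivia = false;
            }
            // Whitespace and comments are only reported when the flag is set.
            if (!trivia || preserveSpaces) tokens.add(input.substring(start, i));
        }
        return tokens;
    }
}
```

A parser-style client would leave the flag at its default and see only the significant tokens, while an IDE-style client would call setPreserveSpaces(true) first and receive the whitespace and comment sequences as well.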

getPreserveSpaces

public boolean getPreserveSpaces()
Return whether or not the lexer should be "space preserving". For a description of what this means, see setPreserveSpaces(boolean).

Returns:
true if and only if space and comment sequences will be returned as tokens; false otherwise.
See Also:
setPreserveSpaces(boolean)

getLexState

public Lexer.LexState getLexState()

setLexState

public void setLexState(Lexer.LexState lex_state)

isSetSpaceSeen

public boolean isSetSpaceSeen()

setSpaceSeen

public void setSpaceSeen(boolean setSpaceSeen)

isCommandStart

public boolean isCommandStart()

setCommandStart

public void setCommandStart(boolean commandStart)

getSource

public LexerSource getSource()

incrementParenNest

public int incrementParenNest()

getLeftParenBegin

public int getLeftParenBegin()

setLeftParenBegin

public void setLeftParenBegin(int value)

reset

public void reset()

advance

public boolean advance()
                throws IOException
How the parser advances to the next token.

Returns:
true if not at end of file (EOF).
Throws:
IOException

nextToken

public int nextToken()
              throws IOException
Throws:
IOException

token

public int token()
Last token read from the lexer at the end of a call to yylex()

Returns:
last token read

getTokenBuffer

public CStringBuilder getTokenBuffer()

value

public Object value()
Value of last token (if it is a token which has a value).

Returns:
value of last value-laden token
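
The advance(), token() and value() methods together form a pull-style contract: the caller advances, then reads the last token and its value. The sketch below shows that pattern with a hypothetical stand-in class, PullLexerSketch, which splits on whitespace. The token ids and the splitting rule are assumptions made for illustration and do not reflect the real Lexer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in showing an advance()/token()/value() pull loop; not the real Lexer. */
final class PullLexerSketch {
    // Illustrative token ids only.
    static final int EOF = 0, NUMBER = 1, WORD = 2;

    private final String[] parts;
    private int index = 0;
    private int token = EOF;
    private Object value;

    PullLexerSketch(String source) {
        String trimmed = source.trim();
        this.parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Reads the next token; returns false once the input is exhausted. */
    boolean advance() {
        if (index >= parts.length) {
            token = EOF;
            value = null;
            return false;
        }
        String text = parts[index++];
        if (text.chars().allMatch(Character::isDigit)) {
            token = NUMBER;
            value = Integer.valueOf(text);
        } else {
            token = WORD;
            value = text;
        }
        return true;
    }

    /** Last token id produced by advance(). */
    int token() { return token; }

    /** Value attached to the last token, if any. */
    Object value() { return value; }

    /** Drains the lexer the way a parser loop might, collecting "id:value" strings. */
    static List<String> drain(PullLexerSketch lexer) {
        List<String> seen = new ArrayList<>();
        while (lexer.advance()) {
            seen.add(lexer.token() + ":" + lexer.value());
        }
        return seen;
    }
}
```

With this stand-in, draining the input "x 42 y" yields a WORD, a NUMBER and a WORD in that order, after which advance() keeps returning false.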

getPosition

public SourcePosition getPosition(SourcePosition startPosition,
                                  boolean inclusive)
Get position information for the token/node that follows the node represented by startPosition and the current lexer location.

Parameters:
startPosition - previous node/token
inclusive - include previous node into position information of current node
Returns:
a new position

getPosition

public SourcePosition getPosition()

getCurrentLine

public String getCurrentLine()

setEncoding

public void setEncoding(String encoding)

getEncoding

public String getEncoding()

setParserSupport

public void setParserSupport(ParserSupport parserSupport)
The parser must pass its support object for a check at the bottom of yylex(). Ruby does it this way as well (i.e. a little parsing logic in the lexer).

Parameters:
parserSupport - the parser's support object

setSource

public void setSource(LexerSource source)
Allow the parser to set the source for its lexer.

Parameters:
source - where the lexer gets raw data

getStrTerm

public StrTerm getStrTerm()

setStrTerm

public void setStrTerm(StrTerm strterm)

resetStacks

public void resetStacks()

setWarnings

public void setWarnings(IRubyWarnings warnings)

setState

public void setState(Lexer.LexState state)

getCmdArgumentState

public StackState getCmdArgumentState()

isOneEight

public boolean isOneEight()

getConditionState

public StackState getConditionState()

setValue

public void setValue(Object yaccValue)

isIdentifierChar

public boolean isIdentifierChar(int c)
Tests whether c is a valid character for an identifier.

Parameters:
c - the character to test
Returns:
whether c is a valid identifier character (MRI: is_identchar)
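
As a rough illustration of this kind of check, the sketch below assumes a rule in the spirit of MRI's is_identchar: an ASCII letter or digit, an underscore, or any non-ASCII character counts as identifier text. The class IdentifierCharSketch is hypothetical, and the actual rule used by this method may differ.

```java
/** Hypothetical identifier-character test; the rule is an assumption, not the library's code. */
final class IdentifierCharSketch {
    /** Assumed rule: ASCII letter or digit, underscore, or any value above the ASCII range. */
    static boolean isIdentifierChar(int c) {
        if (c < 0) return false;   // e.g. an end-of-input sentinel
        if (c > 0x7F) return true; // non-ASCII is treated as identifier text
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '_';
    }
}
```

The parameter is an int rather than a char so that a negative end-of-input value can be passed through the same check without a cast.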

parseMagicComment

protected boolean parseMagicComment(String magicLine)
                             throws IOException
Throws:
IOException

handleFileEncodingComment

protected void handleFileEncodingComment(String encodingLine)
                                  throws IOException
Throws:
IOException

collectComments

public boolean collectComments()

readComment

protected int readComment(int c)
                   throws IOException
Read a comment up to end of line. When found, each comment is stored in the parser result so that any interested party can use it as they see fit. One idea is that IDE authors can use distance-based heuristics to associate these comments with the AST node they think they belong to.

Parameters:
c - last character read from lexer source
Returns:
newline or eof value
Throws:
IOException
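
One possible reading of the distance-based heuristic mentioned above is to attach a comment to the node whose starting line is closest to the comment's line. The sketch below is a minimal, hypothetical version of that idea. CommentAttachSketch and its line-number inputs are assumptions and are not part of this library.

```java
import java.util.List;

/** Hypothetical nearest-line heuristic for associating a comment with a node. */
final class CommentAttachSketch {
    /**
     * Returns the index of the node start line closest to commentLine,
     * or -1 if there are no nodes. Ties go to the earlier entry in the list.
     */
    static int nearestNode(int commentLine, List<Integer> nodeStartLines) {
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        for (int i = 0; i < nodeStartLines.size(); i++) {
            int distance = Math.abs(nodeStartLines.get(i) - commentLine);
            // Strict comparison keeps the first of several equally close nodes.
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }
}
```

A real tool would probably refine this, for example by preferring the node that follows a comment on its own line, but the nearest-line rule is enough to show the shape of the heuristic.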

readUTFEscapeRegexpLiteral

public void readUTFEscapeRegexpLiteral(CStringBuilder buffer)
                                throws IOException
Throws:
IOException

tokenAddMBC

public int tokenAddMBC(int codepoint,
                       CStringBuilder buffer)

tokenAddMBCFromSrc

public void tokenAddMBCFromSrc(int c,
                               CStringBuilder buffer)
                        throws IOException
Throws:
IOException

readUTFEscape

public int readUTFEscape(CStringBuilder buffer,
                         boolean stringLiteral,
                         boolean symbolLiteral)
                  throws IOException
Throws:
IOException

readEscape

public int readEscape()
               throws IOException
Throws:
IOException


Copyright © 2013. All Rights Reserved.