What is ANTLR?
https://www.antlr.org/
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It’s widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.
A first example
https://github.com/antlr/antlr4/blob/master/doc/getting-started.md#a-first-example
Create a file named Hello.g4:
// Define a grammar called Hello
grammar Hello;
r  : 'hello' ID ;         // match keyword hello followed by an identifier
ID : [a-z]+ ;             // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines

Install the IDEA plugin ANTLR v4: https://plugins.jetbrains.com/plugin/7358-antlr-v4
Open the ANTLR Preview panel, right-click the line r : 'hello' ID ; and choose Test Rule r.
Entering hello world correctly recognizes world as the ID.
Entering hello World does not recognize world as the ID, because the upper-case W is not matched by [a-z]+.
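To make the lower-case restriction concrete, here is a minimal sketch that mimics the ID rule (one or more lower-case letters) with plain java.util.regex. It is only an analogy for the character class, not ANTLR code; the class name IdRuleSketch and the method isId are made up for this illustration.

```java
import java.util.regex.Pattern;

/** Stand-in for the lexer rule ID (plain java.util.regex, not ANTLR). */
final class IdRuleSketch {
    // Same character class as the ID rule: one or more lower-case ASCII letters.
    private static final Pattern ID = Pattern.compile("[a-z]+");

    /** Returns true when the whole word consists only of characters the ID rule accepts. */
    static boolean isId(String word) {
        return ID.matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(isId("world")); // true
        System.out.println(isId("World")); // false: 'W' is outside [a-z]
    }
}
```

Widening the rule to [a-zA-Z]+ in the grammar would be the corresponding change if upper-case identifiers should be accepted.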
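The ANTLR4 workflow
Lexer: converts a sequence of characters into tokens. The lexer is normally called by the parser.

Parser: usually appears as part of a compiler or interpreter. It checks the syntax and builds a data structure (the parse tree) out of the input tokens. The parser typically uses the lexer to split the input character stream into individual tokens and takes the resulting token stream as its input. In practice a parser can be written by hand or generated by a tool.

Parse tree: a tree-shaped representation of the syntactic structure of the source code (often loosely called an abstract syntax tree). It can be used for syntax checking, style checking, code formatting, highlighting, error reporting, and auto-completion.

In the workflow diagram, the dotted-line flow on the left shows ANTLR4 turning the original .g4 grammar into the Lexer, Parser, Listener, and Visitor classes. The dashed-line flow on the right shows the input stream being converted into tokens by the Lexer, the tokens being converted into a parse tree by the Parser, and finally a Listener or Visitor walking the parse tree to produce the result.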
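The characters, tokens, parse tree, and walk stages can be sketched by hand for the tiny Hello grammar. The code below is an illustrative, hand-written miniature and not what ANTLR generates; all names in it (HelloPipelineSketch, Token, RNode, lex, parse, visit) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written miniature of the Hello grammar pipeline; not ANTLR-generated code. */
final class HelloPipelineSketch {
    enum TokenType { HELLO, ID }

    /** A token: its type plus the matched text. */
    static final class Token {
        final TokenType type;
        final String text;
        Token(TokenType type, String text) { this.type = type; this.text = text; }
    }

    /** Parse tree node for rule r, holding its two child tokens. */
    static final class RNode {
        final Token hello;
        final Token id;
        RNode(Token hello, Token id) { this.hello = hello; this.id = id; }
    }

    /** Lexer stage: characters to tokens. Whitespace is skipped, like the WS rule. */
    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') { i++; continue; }
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("unrecognized character '" + c + "' at index " + i);
            }
            int start = i;
            while (i < input.length() && input.charAt(i) >= 'a' && input.charAt(i) <= 'z') { i++; }
            String text = input.substring(start, i);
            // The keyword takes priority over the generic identifier rule.
            tokens.add(new Token(text.equals("hello") ? TokenType.HELLO : TokenType.ID, text));
        }
        return tokens;
    }

    /** Parser stage: tokens to a parse tree for the rule  r : 'hello' ID ; */
    static RNode parse(List<Token> tokens) {
        if (tokens.size() != 2
                || tokens.get(0).type != TokenType.HELLO
                || tokens.get(1).type != TokenType.ID) {
            throw new IllegalArgumentException("input does not match rule r");
        }
        return new RNode(tokens.get(0), tokens.get(1));
    }

    /** Visitor-like stage: walk the tree and compute a result from it. */
    static String visit(RNode tree) {
        return "ID=" + tree.id.text;
    }

    public static void main(String[] args) {
        System.out.println(visit(parse(lex("hello world")))); // ID=world
    }
}
```

With ANTLR the lex and parse stages come from the generated Lexer and Parser classes, and only the walking stage (a Listener or Visitor) is written by hand.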
Validating Lua script syntax
Prepare a Lua grammar file
https://github.com/antlr/grammars-v4/tree/master/lua
/*
BSD License

Copyright (c) 2013, Kazunori Sakamoto
Copyright (c) 2016, Alexander Alexeev
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. Neither the NAME of Rainer Schuster nor the NAMEs of its contributors
   may be used to endorse or promote products derived from this software
   without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This grammar file derived from:

    Lua 5.3 Reference Manual
    http://www.lua.org/manual/5.3/manual.html

    Lua 5.2 Reference Manual
    http://www.lua.org/manual/5.2/manual.html

    Lua 5.1 grammar written by Nicolai Mainiero
    http://www.antlr3.org/grammar/1178608849736/Lua.g

Tested by Kazunori Sakamoto with Test suite for Lua 5.2 (http://www.lua.org/tests/5.2/)

Tested by Alexander Alexeev with Test suite for Lua 5.3 http://www.lua.org/tests/lua-5.3.2-tests.tar.gz
*/

grammar Lua;

chunk
    : block EOF
    ;

block
    : stat* retstat?
    ;

stat
    : ';'
    | varlist '=' explist
    | functioncall
    | label
    | 'break'
    | 'goto' NAME
    | 'do' block 'end'
    | 'while' exp 'do' block 'end'
    | 'repeat' block 'until' exp
    | 'if' exp 'then' block ('elseif' exp 'then' block)* ('else' block)? 'end'
    | 'for' NAME '=' exp ',' exp (',' exp)? 'do' block 'end'
    | 'for' namelist 'in' explist 'do' block 'end'
    | 'function' funcname funcbody
    | 'local' 'function' NAME funcbody
    | 'local' attnamelist ('=' explist)?
    ;

attnamelist
    : NAME attrib (',' NAME attrib)*
    ;

attrib
    : ('<' NAME '>')?
    ;

retstat
    : 'return' explist? ';'?
    ;

label
    : '::' NAME '::'
    ;

funcname
    : NAME ('.' NAME)* (':' NAME)?
    ;

varlist
    : var_ (',' var_)*
    ;

namelist
    : NAME (',' NAME)*
    ;

explist
    : exp (',' exp)*
    ;

exp
    : 'nil' | 'false' | 'true'
    | number
    | string
    | '...'
    | functiondef
    | prefixexp
    | tableconstructor
    | <assoc=right> exp operatorPower exp
    | operatorUnary exp
    | exp operatorMulDivMod exp
    | exp operatorAddSub exp
    | <assoc=right> exp operatorStrcat exp
    | exp operatorComparison exp
    | exp operatorAnd exp
    | exp operatorOr exp
    | exp operatorBitwise exp
    ;

prefixexp
    : varOrExp nameAndArgs*
    ;

functioncall
    : varOrExp nameAndArgs+
    ;

varOrExp
    : var_ | '(' exp ')'
    ;

var_
    : (NAME | '(' exp ')' varSuffix) varSuffix*
    ;

varSuffix
    : nameAndArgs* ('[' exp ']' | '.' NAME)
    ;

nameAndArgs
    : (':' NAME)? args
    ;

/*
var_
    : NAME | prefixexp '[' exp ']' | prefixexp '.' NAME
    ;

prefixexp
    : var_ | functioncall | '(' exp ')'
    ;

functioncall
    : prefixexp args | prefixexp ':' NAME args
    ;
*/

args
    : '(' explist? ')' | tableconstructor | string
    ;

functiondef
    : 'function' funcbody
    ;

funcbody
    : '(' parlist? ')' block 'end'
    ;

parlist
    : namelist (',' '...')? | '...'
    ;

tableconstructor
    : '{' fieldlist? '}'
    ;

fieldlist
    : field (fieldsep field)* fieldsep?
    ;

field
    : '[' exp ']' '=' exp | NAME '=' exp | exp
    ;

fieldsep
    : ',' | ';'
    ;

operatorOr
    : 'or'
    ;

operatorAnd
    : 'and'
    ;

operatorComparison
    : '<' | '>' | '<=' | '>=' | '~=' | '=='
    ;

operatorStrcat
    : '..'
    ;

operatorAddSub
    : '+' | '-'
    ;

operatorMulDivMod
    : '*' | '/' | '%' | '//'
    ;

operatorBitwise
    : '&' | '|' | '~' | '<<' | '>>'
    ;

operatorUnary
    : 'not' | '#' | '-' | '~'
    ;

operatorPower
    : '^'
    ;

number
    : INT | HEX | FLOAT | HEX_FLOAT
    ;

string
    : NORMALSTRING | CHARSTRING | LONGSTRING
    ;

// LEXER

NAME
    : [a-zA-Z_][a-zA-Z_0-9]*
    ;

NORMALSTRING
    : '"' ( EscapeSequence | ~('\\'|'"') )* '"'
    ;

CHARSTRING
    : '\'' ( EscapeSequence | ~('\''|'\\') )* '\''
    ;

LONGSTRING
    : '[' NESTED_STR ']'
    ;

fragment
NESTED_STR
    : '=' NESTED_STR '='
    | '[' .*? ']'
    ;

INT
    : Digit+
    ;

HEX
    : '0' [xX] HexDigit+
    ;

FLOAT
    : Digit+ '.' Digit* ExponentPart?
    | '.' Digit+ ExponentPart?
    | Digit+ ExponentPart
    ;

HEX_FLOAT
    : '0' [xX] HexDigit+ '.' HexDigit* HexExponentPart?
    | '0' [xX] '.' HexDigit+ HexExponentPart?
    | '0' [xX] HexDigit+ HexExponentPart
    ;

fragment
ExponentPart
    : [eE] [+-]? Digit+
    ;

fragment
HexExponentPart
    : [pP] [+-]? Digit+
    ;

fragment
EscapeSequence
    : '\\' [abfnrtvz"'\\]
    | '\\' '\r'? '\n'
    | DecimalEscape
    | HexEscape
    | UtfEscape
    ;

fragment
DecimalEscape
    : '\\' Digit
    | '\\' Digit Digit
    | '\\' [0-2] Digit Digit
    ;

fragment
HexEscape
    : '\\' 'x' HexDigit HexDigit
    ;

fragment
UtfEscape
    : '\\' 'u{' HexDigit+ '}'
    ;

fragment
Digit
    : [0-9]
    ;

fragment
HexDigit
    : [0-9a-fA-F]
    ;

COMMENT
    : '--[' NESTED_STR ']' -> channel(HIDDEN)
    ;

LINE_COMMENT
    : '--'
    (                                               // --
    | '[' '='*                                      // --[==
    | '[' '='* ~('='|'['|'\r'|'\n') ~('\r'|'\n')*   // --[==AA
    | ~('['|'\r'|'\n') ~('\r'|'\n')*                // --AAA
    ) ('\r\n'|'\r'|'\n'|EOF)
    -> channel(HIDDEN)
    ;

WS
    : [ \t\u000C\r\n]+ -> skip
    ;

SHEBANG
    : '#' '!' ~('\n'|'\r')* -> channel(HIDDEN)
    ;

Maven configuration
Note for JDK 8 users: the highest ANTLR4 version you can use is 4.9.3, because the Maven plugin runs the ANTLR tool at build time and the tool requires Java 11 from 4.10 onward.
Source: https://github.com/antlr/antlr4/releases/tag/4.10

    Increasing minimum java version
    Going forward, we are using Java 11 for the source code and the compiled .class files for the ANTLR tool. The Java runtime target, however, and the associated runtime tests use Java 8 (bumping up from Java 7).

<dependencies>
    <dependency>
        <groupId>org.antlr</groupId>
        <artifactId>antlr4-runtime</artifactId>
        <version>${antlr.version}</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.antlr</groupId>
            <artifactId>antlr4-maven-plugin</artifactId>
            <version>${antlr.version}</version>
            <configuration>
                <visitor>true</visitor>
                <listener>true</listener>
            </configuration>
            <executions>
                <execution>
                    <goals>
                        <goal>antlr4</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
<properties>
    <!-- https://mvnrepository.com/artifact/org.antlr/antlr4-runtime -->
    <antlr.version>4.9.3</antlr.version>
    <mojo.version>3.0.0</mojo.version>
</properties>

Create the entity classes
Syntax error: records what is wrong on a given line.
package com.baeldung.antlr.lua.model;

/**
 * A single syntax error.
 *
 * @author duhongming
 * @since 1.0.0
 */
public class SyntaxErrorEntry {
    private Integer lineNum;
    private String errorInfo;

    public Integer getLineNum() {
        return lineNum;
    }

    public void setLineNum(Integer lineNum) {
        this.lineNum = lineNum;
    }

    public String getErrorInfo() {
        return errorInfo;
    }

    public void setErrorInfo(String errorInfo) {
        this.errorInfo = errorInfo;
    }
}

Syntax error report: the collection of per-line errors.
package com.baeldung.antlr.lua.model;

import java.util.LinkedList;
import java.util.List;

/**
 * Syntax error report.
 *
 * @author duhongming
 * @since 1.0.0
 */
public class SyntaxErrorReportEntry {
    private final List<SyntaxErrorEntry> syntaxErrorList = new LinkedList<>();

    public void addError(int line, int charPositionInLine, Object offendingSymbol, String msg) {
        SyntaxErrorEntry syntaxErrorEntry = new SyntaxErrorEntry();
        syntaxErrorEntry.setLineNum(line);
        syntaxErrorEntry.setErrorInfo("line " + line + ", column " + charPositionInLine
                + ", syntax error at symbol " + offendingSymbol + ": " + msg);
        syntaxErrorList.add(syntaxErrorEntry);
    }

    public List<SyntaxErrorEntry> getSyntaxErrorReport() {
        return syntaxErrorList;
    }
}

Lua syntax visitor
package com.baeldung.antlr.lua;

import com.baeldung.antlr.LuaParser;
import com.baeldung.antlr.LuaVisitor;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.RuleNode;
import org.antlr.v4.runtime.tree.TerminalNode;

/**
 * Lua syntax visitor.
 *
 * @author duhongming
 * @since 1.0.0
 */
public class LuaSyntaxVisitor implements LuaVisitor<Object> {
    // Press Ctrl+O (Override Methods) in IDEA to generate the method stubs; they are omitted here.
    // Extending the generated LuaBaseVisitor<Object> is an alternative that avoids writing the stubs.
}

Syntax error listener
package com.baeldung.antlr.lua;

import com.baeldung.antlr.lua.model.SyntaxErrorReportEntry;
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;

/**
 * Syntax error listener.
 *
 * @author duhongming
 * @since 1.0.0
 */
public class SyntaxErrorListener extends BaseErrorListener {
    private final SyntaxErrorReportEntry reporter;

    public SyntaxErrorListener(SyntaxErrorReportEntry reporter) {
        this.reporter = reporter;
    }

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer,
                            Object offendingSymbol, int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Forward every reported error to the report so the caller can inspect it later.
        this.reporter.addError(line, charPositionInLine, offendingSymbol, msg);
    }
}

Unit test
package com.baeldung.antlr;

import com.baeldung.antlr.lua.LuaSyntaxVisitor;
import com.baeldung.antlr.lua.SyntaxErrorListener;
import com.baeldung.antlr.lua.model.SyntaxErrorEntry;
import com.baeldung.antlr.lua.model.SyntaxErrorReportEntry;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.junit.Test;

import java.util.List;

import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;

public class LuaSyntaxErrorUnitTest {

    public static List<SyntaxErrorEntry> judgeLuaSyntax(String luaScript) {
        // Create a CharStream to read the input
        CharStream charStreams = CharStreams.fromString(luaScript);
        // The lexer groups the input characters into tokens
        LuaLexer luaLexer = new LuaLexer(charStreams);
        // Token buffer that stores the tokens produced by the lexer
        CommonTokenStream tokenStream = new CommonTokenStream(luaLexer);
        // The parser analyzes the tokens held in the token buffer
        LuaParser luaParser = new LuaParser(tokenStream);

        // Collect parser errors in the report. No listener is attached to luaLexer,
        // so lexer errors are still only printed to the console.
        SyntaxErrorReportEntry syntaxErrorReporter = new SyntaxErrorReportEntry();
        SyntaxErrorListener errorListener = new SyntaxErrorListener(syntaxErrorReporter);
        luaParser.addErrorListener(errorListener);

        LuaSyntaxVisitor luaSyntaxVisitor = new LuaSyntaxVisitor();
        luaSyntaxVisitor.visit(luaParser.chunk());
        return syntaxErrorReporter.getSyntaxErrorReport();
    }

    @Test
    public void testGood() throws Exception {
        // ~= is Lua's not-equal operator, so this script is valid
        List<SyntaxErrorEntry> errorEntryList = judgeLuaSyntax("if a~=1 then print(1) end");
        assertThat(errorEntryList.size(), is(0));
    }

    @Test
    public void testBad() throws Exception {
        // != is not a Lua operator, so the parser reports errors
        List<SyntaxErrorEntry> errorEntryList = judgeLuaSyntax("if a!=1 then print(1) end");
        assertThat(errorEntryList.size(), is(2));
    }
}

Final directory layout and unit test results (figure)
References
https://www.baeldung.com/java-antlr
https://juejin.cn/post/7018521754125467661
https://www.nosuchfield.com/2023/08/26/ANTLR4-from-Beginning-to-Practice/
https://blog.csdn.net/qq_37771475/article/details/106387201
https://blog.csdn.net/qq_37771475/article/details/106426327