Last week, we used ANTLR to generate a library for analyzing Kotlin code. It’s time to use the generated API to check for specific patterns.
API overview
Let’s start by having a look at the generated API:
- KotlinLexer: Executes lexical analysis.
- KotlinParser: Wraps classes representing all Kotlin tokens, and handles parsing errors.
- KotlinParserVisitor: Contract for implementing the Visitor pattern on Kotlin code. KotlinParserBaseVisitor is its empty implementation, to ease the creation of subclasses.
- KotlinParserListener: Contract for callback-related code when visiting Kotlin code, with KotlinParserBaseListener as its empty implementation.
Class diagrams are not the best aid when it comes to actually writing code. The following snippet is a very crude analysis implementation. I’ll be using Kotlin, but any JVM language interoperable with Java could be used:
val stream = CharStreams.fromString("fun main(args : Array<String>) {}") // (1)
val lexer = KotlinLexer(stream) // (2)
val tokens = CommonTokenStream(lexer) // (3)
val parser = KotlinParser(tokens) // (4)
val context = parser.kotlinFile() // (5)
ParseTreeWalker().apply { // (6)
    walk(object : KotlinParserBaseListener() { // (7)
        override fun enterFunctionDeclaration(ctx: KotlinParser.FunctionDeclarationContext) { // (8)
            println(ctx.SimpleName().text) // (9)
        }
    }, context)
}
Here’s the explanation:
1 | Create a CharStream to feed the lexer on the next line. The CharStreams class offers plenty of static fromXXX() methods, each accepting a different type (String, InputStream, etc.) |
2 | Instantiate the lexer, with the stream |
3 | Instantiate a token stream over the lexer. The class provides streaming capabilities over the lexer. |
4 | Instantiate the parser, with the token stream |
5 | Define the entry point into the code. In this case, it’s a Kotlin file, and it probably will be for the plugin as well. |
6 | Create the overall walker that will visit each node in turn |
7 | Start the visiting process by calling walk and passing the desired behavior as an object |
8 | Override the desired function. Here, it will be invoked every time a function node is entered |
9 | Do whatever is desired, e.g. print the function name |
Obviously, lines 1 to 7 are just boilerplate to wire all components together. The behavior that needs to be implemented should replace lines 8 and 9.
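To make the walker/listener relationship more concrete, here is a minimal sketch of the mechanism using simplified stand-ins. Node, Listener, walk and functionNames are hypothetical names invented for this illustration; they are not the ANTLR runtime classes, which are richer. The idea is only that the walker traverses the tree depth-first and notifies the listener when it enters and exits each node.

```kotlin
// Simplified stand-ins for a parse tree, a listener and a walker.
// These are NOT the ANTLR classes; all names here are hypothetical.
data class Node(
    val rule: String,
    val text: String = "",
    val children: List<Node> = emptyList()
)

interface Listener {
    fun enter(node: Node) {}   // empty defaults, like a "base listener"
    fun exit(node: Node) {}
}

// Depth-first traversal: notify on entry, visit the children, notify on exit.
fun walk(listener: Listener, node: Node) {
    listener.enter(node)
    node.children.forEach { walk(listener, it) }
    listener.exit(node)
}

// Collects the name of every function declaration found in the tree.
fun functionNames(root: Node): List<String> {
    val names = mutableListOf<String>()
    walk(object : Listener {
        override fun enter(node: Node) {
            if (node.rule == "functionDeclaration") {
                node.children.firstOrNull { it.rule == "SimpleName" }?.let { names += it.text }
            }
        }
    }, root)
    return names
}

fun main() {
    val tree = Node("kotlinFile", children = listOf(
        Node("functionDeclaration", children = listOf(Node("SimpleName", "main")))
    ))
    println(functionNames(tree)) // prints [main]
}
```

Only the overridden callback carries behavior; everything else is wiring, which mirrors the split between boilerplate and rule logic described above.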
First simple check
In Kotlin, if a function returns Unit (that is, nothing meaningful), then explicitly declaring its return type is optional. It would be a great rule to check that there’s no such explicit return type.
The following snippets, both valid Kotlin code, are equivalent - one with an explicit return type and the other without:
fun hello1(): Unit {
    println("Hello")
}

fun hello2() {
    println("Hello")
}
Let’s use grun to graphically display the parse tree (grun was explained in the previous post). It yields the following:
As can be seen, the snippet with an explicit return type has a type branch under functionDeclaration. This is confirmed by the snippet from the KotlinParser ANTLR grammar file:
functionDeclaration
  : modifiers 'fun' typeParameters?
      (type '.' | annotations)?
      SimpleName
      typeParameters? valueParameters (':' type)?
      typeConstraints
      functionBody?
      SEMI*
  ;
The rule should check that if such a return type exists, then it shouldn’t be Unit.
Let’s update the above code with the desired effect:
ParseTreeWalker().apply {
    walk(object : KotlinParserBaseListener() {
        override fun enterFunctionDeclaration(ctx: KotlinParser.FunctionDeclarationContext) {
            if (ctx.type().isNotEmpty()) { // (1)
                val typeContext = ctx.type(0) // (2)
                with(typeContext.typeDescriptor().userType().simpleUserType()) { // (3)
                    val typeName = this[0].SimpleName()
                    if (typeName.symbol.text == "Unit") { // (4)
                        println("Found Unit as explicit return type " + // (5)
                            "in function ${ctx.SimpleName()} at line ${typeName.symbol.line}")
                    }
                }
            }
        }
    }, context)
}
Here’s the explanation:
1 | Check there’s an explicit return type, whatever it is |
2 | The grammar rule references type twice: once for an optional extension receiver and once for the return type. ANTLR therefore generates a list accessor. For a function without a receiver, such as the ones above, the first element is the return type. |
3 | Follow the parse tree down to the final type name - refer to the above parse tree screenshot for a graphical representation of the path. |
4 | Check that the return type is Unit |
5 | Print a message in the console. In the next step, we will call the SonarQube API there. |
Running the above code correctly yields the following output:
Found Unit as explicit return type in function hello1 at line 1
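One caveat is worth illustrating. In the functionDeclaration rule shown earlier, type can appear both before the function name (as an extension receiver, followed by '.') and after the parameters (as the return type, preceded by ':'). Taking the first type element is therefore only safe for functions without a receiver. The sketch below shows one possible way to pick the return type specifically, by looking for the type that directly follows the ':' token. It works on a simplified stand-in for the node’s direct children: Child, explicitReturnType and declaresUnit are hypothetical names, not part of the generated API, and the leading modifiers child is left out for brevity.

```kotlin
// Hypothetical, simplified view of one direct child of a functionDeclaration node.
// rule is the grammar rule name, or "token" for a terminal such as 'fun' or ':'.
data class Child(val rule: String, val text: String)

// Returns the text of the type that directly follows the ':' token, or null if there is none.
fun explicitReturnType(children: List<Child>): String? {
    val colon = children.indexOfFirst { it.rule == "token" && it.text == ":" }
    if (colon < 0) return null
    return children.getOrNull(colon + 1)?.takeIf { it.rule == "type" }?.text
}

fun declaresUnit(children: List<Child>): Boolean = explicitReturnType(children) == "Unit"

fun main() {
    // Models: fun String.shout(): Unit { ... }
    val extensionWithUnit = listOf(
        Child("token", "fun"), Child("type", "String"), Child("token", "."),
        Child("token", "shout"), Child("valueParameters", "()"),
        Child("token", ":"), Child("type", "Unit"), Child("functionBody", "{}")
    )
    // Models: fun Unit.noop() { ... }
    val unitReceiver = listOf(
        Child("token", "fun"), Child("type", "Unit"), Child("token", "."),
        Child("token", "noop"), Child("valueParameters", "()"), Child("functionBody", "{}")
    )
    println(declaresUnit(extensionWithUnit)) // prints true
    println(declaresUnit(unitReceiver))      // prints false
}
```

With the generated API, the same idea would presumably translate to iterating over the children of the context, but that mapping is an assumption and has not been checked against the generated classes.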
A more advanced check
In Kotlin, the following snippets are all equivalent:
fun hello1(name: String): String {
    return "Hello $name"
}

fun hello2(name: String): String = "Hello $name"

fun hello3(name: String) = "Hello $name"
Note that in the last case, the return type can be inferred by the compiler and omitted by the developer. That would make a good check: in the case of an expression body, the return type should be omitted. The same technique as above can be used:
- Display the parse tree from the snippet using grun.
- Check for differences. Obviously:
  - Functions that do not have an explicit return type miss a type node in the functionDeclaration tree, as above
  - Functions with an expression body have a functionBody whose first child is = and whose second child is an expression
- Refer to the initial grammar, to make sure all cases are covered.
  functionBody : block | '=' expression ;
- Code!
ParseTreeWalker().apply {
    walk(object : KotlinParserBaseListener() {
        override fun enterFunctionDeclaration(ctx: KotlinParser.FunctionDeclarationContext) {
            // functionBody is optional in the grammar, so guard against a missing body
            val bodyChildren = ctx.functionBody()?.children ?: return
            val firstChild = bodyChildren.firstOrNull()
            // An expression body is '=' followed by an expression
            if (bodyChildren.size > 1
                && firstChild is TerminalNode
                && firstChild.text == "="
                && ctx.type().isNotEmpty()) { // note: type() also lists an extension receiver type, if any
                println("Found explicit return type for expression body " +
                    "in function ${ctx.SimpleName()} at line ${firstChild.symbol.line}")
            }
        }
    }, context)
}
The code is pretty self-explanatory and yields the following:
Found explicit return type for expression body in function hello2 at line 5
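The decision itself can be separated from the tree navigation, which makes it easy to try out on its own. The following sketch does that under stated assumptions: FunctionModel, hasExpressionBody and redundantReturnType are hypothetical names, and the model is a simplified stand-in for a functionDeclaration node, not the ANTLR-generated context class. It also covers a function with no body at all, since functionBody is optional in the grammar.

```kotlin
// Simplified stand-in for a functionDeclaration node; not the ANTLR-generated class.
data class FunctionModel(
    val name: String,
    val returnType: String?,         // null when no ':' type follows the parameters
    val bodyChildren: List<String>?  // null when the function has no body at all
)

// functionBody : block | '=' expression ; so an expression body starts with '='.
fun hasExpressionBody(f: FunctionModel): Boolean {
    val body = f.bodyChildren ?: return false
    return body.size > 1 && body[0] == "="
}

// The rule: an expression body combined with an explicit return type.
fun redundantReturnType(f: FunctionModel): Boolean =
    hasExpressionBody(f) && f.returnType != null

fun main() {
    val functions = listOf(
        FunctionModel("hello1", "String", listOf("block")),
        FunctionModel("hello2", "String", listOf("=", "expression")),
        FunctionModel("hello3", null, listOf("=", "expression")),
        FunctionModel("abstractHello", "String", null)
    )
    println(functions.filter(::redundantReturnType).map { it.name }) // prints [hello2]
}
```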