Vicente Maldonado

Originally published at Medium

Meet JFlex

JFlex is a scanner generator for Java: it generates a scanner (a.k.a. lexer) for you so you don't have to write one by hand. It is modeled after lex and flex, but unlike those two older tools it is written in Java and generates Java lexers.
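
To get a feel for what "writing one yourself" means, here is a minimal hand-written scanner sketch. The class name `HandWrittenScanner`, the token labels and the grouping rules are all made up for illustration; this is not what JFlex generates, just the kind of loop a generator saves you from maintaining.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner (illustration only, not JFlex output).
// It groups digits into NUMBER tokens and letters into WORD tokens,
// skips whitespace, and reports anything else as a one-character SYMBOL.
class HandWrittenScanner {
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
                tokens.add("WORD(" + input.substring(start, i) + ")");
            } else {
                tokens.add("SYMBOL(" + c + ")");
                i++;
            }
        }
        return tokens;
    }
}
```

Every new kind of token means another branch in that loop. With a generator you describe the tokens as rules and let the tool produce the matching code.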

What is the JFlex workflow?

  • Create a JFlex source file (*.flex)
  • Use the JFlex command-line tool to compile the file into a Java file
  • Use javac to compile the Java file
  • Run the resulting class with java

Et voilà, you have a working lexer. You can use it as a standalone tool or together with other programs: parser generators like Yacc/Bison commonly expect a scanner to feed them the tokens they work with.

A JFlex source file is made up of three parts:

  • user code
  • options and declarations
  • lexical rules

separated by double percent signs (%%). Here is a simple example:

import java.io.*;

%%

That’s it for the first part. Oddly enough, if you want to add code to the generated Java class, you have to put it in the middle section of the JFlex file (options and declarations). There’s no magic to it: JFlex creates the lexer from a template, and the code in the first section is copied above the class declaration rather than into the class itself, which is why import statements go there. (You can go wild and put full Java classes there too, but that’s not a very good idea.)

The code you put in the middle section between %{ and %}, on the other hand, does end up inside the lexer class. Let’s add a main method to the class to make it self-contained:

%{

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    System.out.println("Start lexing");

    while (true)
    {
        System.out.println(lexer.yylex());
    }
}

%}

A lexer that JFlex generates needs to be initialized with a Java Reader. In this case we accept input from System.in, i.e. standard input. By default JFlex generates a class named Yylex with a scanning method named yylex(). Let’s change that (we are still in the middle section of our JFlex file):

%class Lexer
%type String

%%

This makes JFlex name the class Lexer, and yylex() now returns a Java String instead of a Yytoken, a class we won’t bother creating.

The plan here is to make yylex() return any character we type on our keyboard — this is why we specify its %type as String. Then we’ll just use an infinite loop ( while (true) ) to accept characters and immediately print them out.

Let’s finish with the third section (lexical rules):

[^] { return yytext(); }

[^] matches any single character, including the newline. yytext() returns the matched text as a String, and the return statement inside the braces supplies the value that yylex() hands back, so we are done.
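
If you want to see what this rule amounts to before running JFlex, the sketch below imitates it by hand. `EchoLexerSketch` and `lexAll` are hypothetical names, and the real generated class works quite differently inside; the sketch only mirrors the visible behaviour, assuming one character per call and null at end of input. It also reads from a StringReader instead of System.in, since any Reader will do.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated Lexer (NOT what JFlex emits).
// It only imitates the observable effect of the rule
//     [^] { return yytext(); }
class EchoLexerSketch {
    private final Reader in;

    EchoLexerSketch(Reader in) {
        this.in = in;
    }

    // One call returns one character as a String, or null once the input
    // is exhausted. Simplification: this reads one UTF-16 unit at a time.
    String yylex() throws IOException {
        int c = in.read();
        return c == -1 ? null : String.valueOf((char) c);
    }

    // Convenience helper for the example: lex a whole string into a list.
    static List<String> lexAll(String text) {
        EchoLexerSketch lexer = new EchoLexerSketch(new StringReader(text));
        List<String> tokens = new ArrayList<>();
        try {
            for (String t = lexer.yylex(); t != null; t = lexer.yylex()) {
                tokens.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }
}
```

Note that the line break you type is a character like any other, so it comes back as a token of its own.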

Here is the complete file:

import java.io.*;

%%

%{

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    System.out.println("Start lexing");

    while (true)
    {
        System.out.println(lexer.yylex());
    }
}

%}

%class Lexer
%type String

%%

[^] { return yytext(); }

You need to compile it (watch for jflex error messages in output):

[johnny@test example1]$ jflex Lexer.flex

compile the generated Java file:

[johnny@test example1]$ javac Lexer.java

and run it:

[johnny@test example1]$ java Lexer

Here’s an example terminal session:

Start lexing
123
1
2
3

abc 
a
b
c

^C[johnny@test example1]$

Use Ctrl-C to stop the program. Of course, this is not very exciting, and you don’t need a 570-line Java file (yes, that’s how long the generated lexer is) just to echo characters.
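
One thing to be aware of: with %type String, the generated yylex() should return null once the input ends (that is the default end-of-file behaviour as far as I can tell from the JFlex manual), so a while (true) loop would keep printing null after the input is closed. Here is a sketch of a loop that stops instead. `LexLoopSketch`, `TokenSource` and `forEachToken` are hypothetical helper names, not part of JFlex; the interface just has the same shape as yylex().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper (not part of JFlex) that drives a lexer until it
// reports end of input by returning null.
class LexLoopSketch {
    // Same shape as the generated method: String yylex() throws IOException
    interface TokenSource {
        String next() throws IOException;
    }

    // Passes each token to the action as soon as it is read and returns
    // the number of tokens seen. Stops at the first null.
    static int forEachToken(TokenSource source, Consumer<String> action) {
        int count = 0;
        try {
            String token;
            while ((token = source.next()) != null) {
                action.accept(token);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

Assuming the generated class is named Lexer as in this post, the body of main could then be written as `LexLoopSketch.forEachToken(lexer::yylex, System.out::println);`, and the program would exit on end of input (Ctrl-D on most Unix terminals) rather than needing Ctrl-C.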

You can download the source code from GitHub.
