QuickREx offers the most widely used implementations of regular expressions in the Java world.
Since the APIs of the variants differ slightly, a common abstraction is used to hold information about matches and groups. The following interface abstracts a regular expression that has been evaluated against a specific text, yielding a number of matches and groups:
package de.babe.eclipse.plugins.quickREx.regexp;

/**
 * Abstracts matches in a text.
 *
 * @author bastian.bergerhoff
 */
public interface MatchSet {

    /**
     * Returns true if and only if there is a next match
     * in this MatchSet. Acts like next() in
     * an enumeration in that it causes the whole instance state to be
     * centered around the next match.
     *
     * @return true if and only if there is a next match
     */
    public boolean nextMatch();

    /**
     * Returns the start-offset of the current match.
     *
     * @return the start-offset of the current match
     */
    public int start();

    /**
     * Returns the end-offset of the current match.
     *
     * @return the end-offset of the current match
     */
    public int end();

    /**
     * Returns the number of groups in the current match.
     * 0 is returned if there are no groups - the match itself
     * does not count as a group.
     *
     * @return the number of groups in the current match
     */
    public int groupCount();

    /**
     * Returns the String-contents of the group with the passed
     * index.
     *
     * @param groupIndex the index of the group
     * @return the String-contents of the group
     */
    public String groupContents(int groupIndex);

    /**
     * Returns the start-offset of the group with the passed
     * index.
     *
     * @param groupIndex the index of the group
     * @return the start-offset of the group
     */
    public int groupStart(int groupIndex);

    /**
     * Returns the end-offset of the group with the passed
     * index.
     *
     * @param groupIndex the index of the group
     * @return the end-offset of the group
     */
    public int groupEnd(int groupIndex);
}
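The Javadoc of groupCount() states that the match itself does not count as a group. The JDK's java.util.regex.Matcher follows the same convention: group 0 is the whole match, and the capturing groups are numbered 1 to groupCount() inclusive. The following is a minimal sketch of that convention using the JDK classes directly; GroupCountDemo and firstMatchGroups are illustrative names and not part of QuickREx.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: GroupCountDemo is a hypothetical class, not part of QuickREx.
class GroupCountDemo {

    // Returns the contents of groups 1..groupCount() of the first match,
    // or an empty list if the pattern does not match or has no groups.
    static List<String> firstMatchGroups(String regExp, String text) {
        Matcher matcher = Pattern.compile(regExp).matcher(text);
        List<String> groups = new ArrayList<>();
        if (matcher.find()) {
            // groupCount() excludes group 0 (the whole match), so the
            // valid group indices run from 1 to groupCount() inclusive.
            for (int g = 1; g <= matcher.groupCount(); g++) {
                groups.add(matcher.group(g));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Two capturing groups: prints [10, 20]
        System.out.println(firstMatchGroups("(\\d+)-(\\d+)", "range 10-20"));
        // No capturing groups: prints []
        System.out.println(firstMatchGroups("\\d+", "range 10-20"));
    }
}
```

A pattern without capturing groups still matches, but yields an empty list here, which mirrors the "0 is returned if there are no groups" wording of the interface.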
There is one implementation for the JDK-variant, and an abstract base-implementation plus two concrete implementations for the ORO-variants. The two ORO implementations differ only in their constructors, which use an Awk- or a Perl-Compiler as requested. As an example implementation, consider the JDK-variant:
package de.babe.eclipse.plugins.quickREx.regexp.jdk;

import java.util.Collection;
import java.util.Iterator;
import java.util.Vector;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import de.babe.eclipse.plugins.quickREx.regexp.Flag;
import de.babe.eclipse.plugins.quickREx.regexp.MatchSet;

/**
 * MatchSet using JDK-regular expressions.
 *
 * @author bastian.bergerhoff, andreas.studer
 */
public class JavaMatchSet implements MatchSet {

    private final Pattern pattern;

    private final Matcher matcher;

    private final static Collection flags = new Vector();

    static {
        flags.add(JavaFlag.JDK_CANON_EQ);
        flags.add(JavaFlag.JDK_CASE_INSENSITIVE);
        flags.add(JavaFlag.JDK_COMMENTS);
        flags.add(JavaFlag.JDK_DOTALL);
        flags.add(JavaFlag.JDK_MULTILINE);
        flags.add(JavaFlag.JDK_UNICODE_CASE);
        flags.add(JavaFlag.JDK_UNIX_LINES);
    }

    /**
     * Returns a Collection of all Compiler-Flags the JDK-implementation
     * knows about.
     *
     * @return a Collection of all Compiler-Flags the JDK-implementation
     *         knows about
     */
    public static Collection getAllFlags() {
        return flags;
    }

    /**
     * The constructor - uses JDK-regular expressions
     * to evaluate the passed regular expression against
     * the passed text.
     *
     * @param regExp the regular expression
     * @param text the text to evaluate regExp against
     * @param flags a Collection of Flags to pass to the Compiler
     */
    public JavaMatchSet(String regExp, String text, Collection flags) {
        int iFlags = 0;
        for (Iterator iter = flags.iterator(); iter.hasNext();) {
            Flag element = (Flag)iter.next();
            iFlags = iFlags | element.getFlag();
        }
        pattern = Pattern.compile(regExp, iFlags);
        matcher = pattern.matcher(text);
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#nextMatch()
     */
    public boolean nextMatch() {
        return matcher.find();
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#start()
     */
    public int start() {
        return matcher.start();
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#end()
     */
    public int end() {
        return matcher.end();
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#groupCount()
     */
    public int groupCount() {
        return matcher.groupCount();
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#groupContents(int)
     */
    public String groupContents(int groupIndex) {
        return matcher.group(groupIndex);
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#groupStart(int)
     */
    public int groupStart(int groupIndex) {
        return matcher.start(groupIndex);
    }

    /* (non-Javadoc)
     * @see de.babe.eclipse.plugins.quickREx.regexp.MatchSet#groupEnd(int)
     */
    public int groupEnd(int groupIndex) {
        return matcher.end(groupIndex);
    }
}
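The constructor of JavaMatchSet folds the individual flags into a single int with a bitwise OR before handing it to Pattern.compile(String, int). The following sketch shows that step in isolation. It assumes plain Integer values in place of QuickREx's Flag objects, and FlagMaskDemo, combine and matchesAnywhere are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.regex.Pattern;

// Illustrative only: FlagMaskDemo is a hypothetical class, not part of QuickREx.
class FlagMaskDemo {

    // Folds individual flag constants into the single int bitmask
    // that Pattern.compile(String, int) expects.
    static int combine(Collection<Integer> flags) {
        int mask = 0;
        for (int flag : flags) {
            mask |= flag;
        }
        return mask;
    }

    // Compiles regExp with the combined flags and reports whether
    // it matches anywhere in text.
    static boolean matchesAnywhere(String regExp, String text, Collection<Integer> flags) {
        return Pattern.compile(regExp, combine(flags)).matcher(text).find();
    }

    public static void main(String[] args) {
        Collection<Integer> flags =
                Arrays.asList(Pattern.CASE_INSENSITIVE, Pattern.MULTILINE);
        // With both flags, ^ and $ match at line boundaries and case is ignored: prints true
        System.out.println(matchesAnywhere("^world$", "Hello\nWORLD", flags));
        // Without flags, neither the anchors nor the case match: prints false
        System.out.println(matchesAnywhere("^world$", "Hello\nWORLD",
                Collections.<Integer>emptyList()));
    }
}
```

Because each JDK flag occupies its own bit, the order in which the flags are combined does not matter, and an empty collection yields 0, which is the JDK default of no flags.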
A MatchSet is then used to loop over the matches and to extract the groups of each match:
MatchSet matches = MatchSetFactory.createMatchSet(
        QuickRExPlugin.getDefault().getREFlavour(), p_RegExp, p_testText, flags);
matchData = new Vector();
while (matches.nextMatch()) {
    Match match = new Match(matches.start(), matches.end());
    for (int g = 0; g < matches.groupCount(); g++) {
        match.addGroup(new Group(g + 1, matches.groupContents(g + 1),
                matches.groupStart(g + 1), matches.groupEnd(g + 1)));
    }
    matchData.add(match);
}
where Match and Group are abstractions of matches and groups.
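To make the loop concrete, the following self-contained sketch runs the same iteration against java.util.regex directly. The Match and Group classes shown here are simplified, hypothetical stand-ins whose fields are assumptions; they are not the QuickREx originals, and MatchLoopDemo and collect are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: hypothetical stand-ins, not the QuickREx classes.
class MatchLoopDemo {

    // One capturing group: its 1-based index, its text and its offsets.
    static final class Group {
        final int index;
        final String contents;
        final int start;
        final int end;

        Group(int index, String contents, int start, int end) {
            this.index = index;
            this.contents = contents;
            this.start = start;
            this.end = end;
        }
    }

    // One match: its offsets plus the groups found inside it.
    static final class Match {
        final int start;
        final int end;
        final List<Group> groups = new ArrayList<>();

        Match(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Walks over all matches of regExp in text and records their groups.
    static List<Match> collect(String regExp, String text) {
        Matcher matcher = Pattern.compile(regExp).matcher(text);
        List<Match> result = new ArrayList<>();
        while (matcher.find()) {
            Match match = new Match(matcher.start(), matcher.end());
            // Groups are numbered from 1; group 0 is the match itself.
            for (int g = 1; g <= matcher.groupCount(); g++) {
                match.groups.add(new Group(g, matcher.group(g),
                        matcher.start(g), matcher.end(g)));
            }
            result.add(match);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Match match : collect("(\\w+)=(\\d+)", "a=1, bb=22")) {
            System.out.println("match [" + match.start + "," + match.end + ")");
            for (Group group : match.groups) {
                System.out.println("  group " + group.index + ": " + group.contents);
            }
        }
    }
}
```

For the text "a=1, bb=22" this yields two matches with two groups each. The end offsets are exclusive, as in the JDK's Matcher.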