Performing "AND" and "OR" searching in GoFish

Besides the standard search mode, GoFish has a Wildcard search mode (which allows * and ? in the search phrase) and also supports Regular Expressions for more advanced searches, like "AND", "OR", and other incredible RegEx matches that may require a PhD in computer science to comprehend.

“AND” searches in GoFish

...using Wildcards

You can basically perform an “AND” search in GoFish using a Wildcard search like this:

select*from

(Note, to enter Wildcard search mode, click on the Advanced button, and choose "Wildcards (* and ?)" in the Options section at the bottom of the form.)

This would *essentially* be like doing an AND search for lines that contain both “SELECT” and “FROM” on the same line. However, the terms have to appear in that exact order, any number of characters is allowed between the two words, and it will even find partial matches that extend past the end of the first term or start before the beginning of the second term. It's close, and potentially very helpful, but not perfect.
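To make those limitations concrete, here is a minimal Python sketch that mimics a wildcard search by translating it to a regular expression. This is not how GoFish is implemented (I have not checked its internals); the helper name `wildcard_to_regex`, the case-insensitive matching, and the sample lines are all assumptions made for illustration.

```python
import re

def wildcard_to_regex(phrase):
    """Hypothetical helper: turn a * / ? wildcard phrase into a compiled regex."""
    parts = []
    for ch in phrase:
        if ch == "*":
            parts.append(".*")           # any run of characters, including none
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is taken literally
    # Assumption: the search ignores case, as the SELECT/FROM example suggests.
    return re.compile("".join(parts), re.IGNORECASE)

pattern = wildcard_to_regex("select*from")

lines = [
    "SELECT name FROM customers",   # both terms, in the expected order
    "FROM customers SELECT name",   # both terms, but reversed
    "selected rows wherefrom",      # partial-word hits on both terms
]
wildcard_hits = [bool(pattern.search(line)) for line in lines]
print(wildcard_hits)  # [True, False, True]
```

The reversed line is missed and the partial-word line is found, which are exactly the two ways this approach differs from a true “AND” search.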

...using Regular Expressions

So, there are some slight differences in the match results between this simple Wildcard search and doing a true “AND” search using a Regular Expression.

After some Googling, it turns out that the syntax for a RegEx AND search is fairly simple, but not totally intuitive (as is usually the case with Regular Expressions). Here is the link which best explained this to my simple brain: http://stackoverflow.com/questions/469913/regular-expressions-is-there-an-and-operator

(Note, to enter Regular Expression mode, click on the Advanced button, and choose "Regular Expression" in the Options section at the bottom of the form.)

So, the RegEx syntax for “select” and “from” on the same line is:

(?=.*select) (?=.*from)

Notes:

  • The enclosing parentheses are required around each term.
  • It is okay to have a space character between each term set. This can often help with readability. (Be aware that most regex engines treat a space as a literal character, so remove the spaces if you reuse the pattern outside GoFish or if you get unexpected results.)
  • Be careful... the .* still means "any number of characters", but inside a "look ahead" group (?=...) it only checks the text that follows without consuming any of it. That is why each term is tested independently of the others. You can learn about the details in the link provided above.
  • The order of these terms does *not* matter.
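If you want to try the pattern outside GoFish, the following Python sketch shows the same idea. It assumes case-insensitive matching and uses made-up sample lines. Python's `re` module treats a space as a literal character, so the term sets are written without spaces here.

```python
import re

# Two zero-width lookaheads: each one scans the rest of the line for its term
# without consuming any text, so both are tested from the same position.
and_pattern = re.compile(r"(?=.*select)(?=.*from)", re.IGNORECASE)

lines = [
    "SELECT name FROM customers",   # both terms present
    "FROM customers SELECT name",   # both terms present, reversed order
    "SELECT 1",                     # only one term present
]
and_hits = [bool(and_pattern.search(line)) for line in lines]
print(and_hits)  # [True, True, False]
```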

If you want to put a whole word restriction on any or all of the terms, it would become:

(?=.*\bselect\b) (?=.*\bfrom\b)

(In RegEx language, \b is called a 'word boundary' marker and it will prevent partial matches when used at the beginning or end of a word. You can add or remove those \b’s as needed based on what you want to look for.)
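Here is a small Python sketch of the difference the \b markers make. The sample lines and variable names are invented for illustration, and case-insensitive matching is an assumption.

```python
import re

loose = re.compile(r"(?=.*select)(?=.*from)", re.IGNORECASE)
strict = re.compile(r"(?=.*\bselect\b)(?=.*\bfrom\b)", re.IGNORECASE)

lines = [
    "SELECT name FROM customers",          # whole words
    "lcSelected = GetValueFromCursor()",   # terms only appear inside longer words
]
loose_hits = [bool(loose.search(line)) for line in lines]
strict_hits = [bool(strict.search(line)) for line in lines]
print(loose_hits)   # [True, True]
print(strict_hits)  # [True, False]
```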

Also, you can have as many of these as you want:

(?=.*term1) (?=.*term2) (?=.*term3) (?=.*term4)
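Typing a long chain of term sets by hand gets tedious, so here is a hedged sketch of a helper that builds the pattern from a list of terms. The function `build_and_pattern` and its `whole_word` parameter are hypothetical names of my own, not part of GoFish. The output has no spaces between term sets, so it also works in engines that treat a space literally.

```python
import re

def build_and_pattern(terms, whole_word=False):
    """Hypothetical helper: one lookahead group per term, joined without spaces."""
    groups = []
    for term in terms:
        body = re.escape(term)        # protect characters like . or ( in a term
        if whole_word:
            body = rf"\b{body}\b"     # optional whole-word restriction
        groups.append(f"(?=.*{body})")
    return "".join(groups)

pattern_text = build_and_pattern(["select", "from", "where"])
print(pattern_text)  # (?=.*select)(?=.*from)(?=.*where)

matcher = re.compile(pattern_text, re.IGNORECASE)
found = bool(matcher.search("SELECT id FROM orders WHERE total > 0"))
missing_one = bool(matcher.search("SELECT id FROM orders"))
```

The resulting text can be pasted into any RegEx search box; the `re.compile` call is only there to show that it behaves as an “AND” of all three terms.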

When testing these three search methods on the code base of one of my own large projects, I got slightly different match counts from each one:

Wildcard search:  “select*from” found 223 matching lines in 73 files.

RegEx without \b:  (?=.*from) (?=.*select) found 229 matching lines in 75 files.

RegEx with \b:   (?=.*\bselect\b) (?=.*\bfrom\b) found 206 matching lines in 70 files.

Notice that the first RegEx example found *more* matches than the Wildcard search, so you should be aware of the differences before you assume that Wildcard searching will give you everything that you expect. I have not yet researched the technical reasons that the Wildcard search found fewer matches, but one likely contributor is term order: the Wildcard search only finds “select” followed by “from”, while the RegEx finds the terms in either order. Perhaps, given other search terms or another code base, the outcome could be different. If you learn anything about this, please contact me with the details.
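The following Python sketch shows how the three methods can disagree on the same set of lines. It does not reproduce or explain the counts above: the sample lines are invented, the wildcard search is approximated by the regex `select.*from`, and case-insensitive matching is assumed.

```python
import re

FLAGS = re.IGNORECASE
methods = {
    "wildcard": re.compile(r"select.*from", FLAGS),   # stand-in for select*from
    "regex": re.compile(r"(?=.*from)(?=.*select)", FLAGS),
    "regex_whole_word": re.compile(r"(?=.*\bselect\b)(?=.*\bfrom\b)", FLAGS),
}

sample_lines = [
    "SELECT * FROM orders",             # found by all three methods
    "* FROM orders: SELECT later",      # reversed order: the wildcard misses it
    "lcSelected = ValueFromCursor()",   # partial words: the whole-word regex misses it
    "USE orders",                       # found by none
]

counts = {
    name: sum(1 for line in sample_lines if rx.search(line))
    for name, rx in methods.items()
}
print(counts)  # {'wildcard': 2, 'regex': 3, 'regex_whole_word': 2}
```

Running something like this against a handful of lines from your own code base is a quick way to see which kind of line accounts for a difference in counts.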

Performing “OR” searches in GoFish

There is no way to perform an “OR” search in GoFish using the Wildcard search mode, so you have to use a RegEx.

To perform an “OR” search with a RegEx, put the pipe character (|) between each term:

(select|from)

Notes:

  • The enclosing parentheses are required around the search phrase. 
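  • It is *not* necessary to use a .* pattern to allow for other characters before or after the word.
  • It is okay to have a space character between each term and the pipe character. This can often help with readability. (Be aware that most regex engines treat a space as a literal character, so remove the spaces if you reuse the pattern outside GoFish or if you get unexpected results.)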
  • The order of these terms does *not* matter.
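As a quick check outside GoFish, here is a minimal Python sketch of the same “OR” pattern. Case-insensitive matching and the sample lines are assumptions for illustration.

```python
import re

or_pattern = re.compile(r"(select|from)", re.IGNORECASE)

lines = [
    "SELECT 1",             # first term only
    "DELETE FROM orders",   # second term only
    "USE orders",           # neither term
]
or_hits = [bool(or_pattern.search(line)) for line in lines]
print(or_hits)  # [True, True, False]
```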

To put a whole word restriction on each term and prevent partial word matching, it would become:

(\bselect\b | \bfrom\b)

(In RegEx language, \b is called a 'word boundary' marker and it will prevent partial matches when used at the beginning or end of a word. You can add or remove those \b’s as needed based on what you want to look for.)

And, you can have as many of these as you want:

(term1|term2|term3|term4)
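A hedged sketch of a helper that builds an “OR” pattern from a list of terms, with an optional whole-word restriction, might look like this in Python. The function `build_or_pattern` and its `whole_word` parameter are hypothetical names, not part of GoFish, and the pattern is written without spaces so that it behaves the same in engines that treat a space literally.

```python
import re

def build_or_pattern(terms, whole_word=False):
    """Hypothetical helper: join the terms with | inside one pair of parentheses."""
    alternatives = []
    for term in terms:
        body = re.escape(term)        # protect characters like . or ( in a term
        if whole_word:
            body = rf"\b{body}\b"     # optional whole-word restriction
        alternatives.append(body)
    return "(" + "|".join(alternatives) + ")"

plain = build_or_pattern(["term1", "term2", "term3", "term4"])
print(plain)  # (term1|term2|term3|term4)

whole = re.compile(build_or_pattern(["select", "from"], whole_word=True), re.IGNORECASE)
partial_word = bool(whole.search("lcSelected = 1"))   # "select" is inside a longer word
full_word = bool(whole.search("DELETE FROM orders"))  # "FROM" stands alone
```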

Last edited Jun 13, 2012 at 2:49 AM by mattslay, version 24

Comments

mattslay Jun 12, 2012 at 4:29 PM 
From Gregory on UT:

There's also something like 'Zero-width negative lookahead assertion'

Example:

I want "select" and "from" and *not* "into"

Then you would use:

(?=.*\bselect\b) (?=.*\bfrom\b) (?!.*\binto\b)
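Here is a small Python sketch of this "and not" pattern, with invented sample lines and assumed case-insensitive matching. One detail worth knowing: a regex search may start matching partway through a line, and from there the negative lookahead can no longer "see" an "into" that appeared earlier. Anchoring the pattern at the start of the line with ^ avoids that; whether GoFish needs the anchor is something I have not verified.

```python
import re

FLAGS = re.IGNORECASE
unanchored = re.compile(r"(?=.*\bselect\b)(?=.*\bfrom\b)(?!.*\binto\b)", FLAGS)
anchored = re.compile(r"^(?=.*\bselect\b)(?=.*\bfrom\b)(?!.*\binto\b)", FLAGS)

lines = [
    "SELECT * FROM orders",                  # select and from, no into
    "SELECT * FROM orders INTO CURSOR tmp",  # into comes after the other terms
    "INTO CURSOR tmp SELECT * FROM orders",  # into comes before the other terms
]
unanchored_hits = [bool(unanchored.search(line)) for line in lines]
anchored_hits = [bool(anchored.search(line)) for line in lines]
print(unanchored_hits)  # [True, False, True]
print(anchored_hits)    # [True, False, False]
```

The third line is contrived, but it shows why the anchored form is the safer choice when the excluded term could appear anywhere on the line.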