Wildcards

Wherever you can specify a file or directory name in Linux, you can use wildcards.
When you use one or more special symbols in a name, the shell finds the files which match
the pattern and places them on the command line instead of the pattern itself.
The term "wildcard" refers to the Joker in a pack of cards, since this card can
stand for any other card in many card games. In the same way, a wildcard character
can stand for other letters and characters in a filename.

Testing Wildcards

To get the hang of wildcards, the best thing to do is to go to a directory which is full
of files and try using the "ls" command with the wildcards as arguments. As we saw
before, the "ls" command can take a parameter which tells it what to display. Instead
of giving it a directory, we’re going to pass it a list of all filenames to display. This list
will come from the wildcard patterns which we will see below.

So, before you continue, in the terminal window type the command "cd /usr/bin".
This will switch to the main directory containing the operating system commands. It’s
full of files, so it’s ideal for our experiments.

sandbox@laptop:~ > cd /usr/bin
sandbox@laptop:/usr/bin >

The * wildcard

The first wildcard is the asterisk (*). The asterisk stands for zero or more other
characters. You can place this wildcard at the beginning, middle or end of a pattern;
the fixed text in the rest of the pattern must then appear at the corresponding
place in the filename. For example, the pattern "*txt" means any filename which
ends with "txt".

Below is a table of patterns, with examples of filenames that would match
each pattern and others that would NOT match.

Pattern   Matching files     NON-matching files   Why not
*.txt     File.txt           File.TXT             TXT is uppercase
          another-file.txt   File.txt2            ends in "2"
                             txtfile              "txt" is not at the end
                             Filetxt              no period (.)
                             txt                  no period (.)

*txt      File.txt           File.TXT             TXT is uppercase
          another-file.txt   File.txt2            ends in "2"
          txt                txtfile              "txt" is not at the end
          Filetxt

*txt*     File.txt           File.TXT             TXT is uppercase
          another-file.txt
          txt
          Filetxt
          File.txt2
          txtfile
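
If you would rather not experiment in /usr/bin, the patterns above can be tried in a scratch directory. The following is only a sketch: the directory comes from mktemp, the filenames are the sample names used in this section, and LC_ALL=C is set purely so that the expanded lists come out in a predictable order.

```shell
export LC_ALL=C                      # fixed sort order for the expanded lists
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch File.txt another-file.txt txt Filetxt File.TXT File.txt2 txtfile

# echo simply prints whatever arguments the shell hands to it,
# so it shows exactly what each pattern expands to.
dot_txt=$(echo *.txt)                # names ending in ".txt"
any_txt=$(echo *txt)                 # names ending in "txt"
has_txt=$(echo *txt*)                # names containing "txt" anywhere

echo "*.txt  -> $dot_txt"
echo "*txt   -> $any_txt"
echo "*txt*  -> $has_txt"
```

Note that File.TXT never appears in the output, because matching is case-sensitive.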

The ? wildcard

While the * wildcard stands for zero or more characters, the ? wildcard stands for exactly one.
Thus, a pattern of "???" matches filenames which are exactly three characters long. The pattern "x??" matches
any three-character filename which starts with "x".
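
A small experiment makes the "exactly one character" rule visible. This is a sketch in a scratch directory; the filenames are made up for the demonstration.

```shell
export LC_ALL=C                      # fixed sort order for the expanded lists
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch cal man tac date xy x12 xyz xyzz

three=$(echo ???)                    # names exactly three characters long
x_three=$(echo x??)                  # "x" followed by exactly two characters
a_mid=$(echo ?a?)                    # three characters with "a" in the middle

echo "???  -> $three"
echo "x??  -> $x_three"
echo "?a?  -> $a_mid"
```

The names "xy" and "xyzz" start with "x" but are too short or too long for "x??", so they are left out.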

The [] wildcard

The square brackets are used to contain a set of characters to match. For example, the pattern "[ABC]*" matches any
filename which starts with one of the letters A, B or C, followed by zero or more characters.

If the first character is an exclamation mark (!) or caret (^), then the pattern matches any character except those
given. Thus, the pattern "[^x]*" means any filename except those starting with "x".

Instead of individual letters, the set can contain a range. For example, the pattern "[A-Z]*" means any filename which starts
with an uppercase letter between A and Z inclusive, followed by zero or more other characters, while "[A-Za-z123]" means
a single character which is an uppercase or lowercase letter, or one of the digits 1, 2 or 3. Depending on the shell version
and the locale settings, a range such as "A-Z" may also match other letters.
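
The bracket patterns can be tried in the same way. This sketch uses invented filenames in a scratch directory, and sets LC_ALL=C so that the range A-Z covers only the plain uppercase letters. The negated set is written with "!", which is the form specified by POSIX; bash accepts "^" as well.

```shell
export LC_ALL=C                      # plain ASCII ranges and sort order
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch Apple Banana Cherry Delta apple xray zebra 1file 9file

abc=$(echo [ABC]*)                   # starts with A, B or C
not_x=$(echo [!x]*)                  # starts with anything except "x"
upper=$(echo [A-Z]*)                 # starts with an uppercase letter
mixed=$(echo [A-Za-z123]*)           # starts with a letter or with 1, 2 or 3

echo "[ABC]*        -> $abc"
echo "[!x]*         -> $not_x"
echo "[A-Z]*        -> $upper"
echo "[A-Za-z123]*  -> $mixed"
```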

How wildcards work

There is a very significant difference between the way that Windows handles wildcards and how this is done in Linux
or other Unixes. In Windows, the program or command being executed receives the wildcard expression intact. If a program
was not designed to cater for wildcards, it will try to open a file called (say) "*.txt".

In Linux, on the other hand, it is the bash shell that does all the work. It takes the pattern containing wildcards, converts it into
a list of matching filenames, and passes that list to the program in place of the pattern. The table below shows how certain commands
would be "translated" by bash. The actual filenames depend on the contents of the directory, so they could vary.

Original command   What actually gets executed
ls                 ls
ls y*              ls yacc ybmtopbm yes ypcat ypchfn ypchsh ypmatch yppasswd ypwhich yuvsplittoppm yuvtoppm
ls ?a?             ls cal man tac
ls blubble*        ls blubble*

Note the last example: bash could find no file matching the pattern, so the pattern, including its wildcard, is passed to
the program "as is".
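
To see that the expansion really happens before the program starts, it helps to use a stand-in "program" that only reports how many arguments it received. The count_args function below is a hypothetical helper written for this sketch, and the filenames are invented. The pass-through of an unmatched pattern shown here is the default behaviour of bash; other shells, or bash with different options set, may behave differently.

```shell
export LC_ALL=C
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch yacc yes ypcat cal man tac

# Hypothetical stand-in for a program: prints the number of arguments it got.
count_args() { echo "$#"; }

n_y=$(count_args y*)                 # shell expands y* into three filenames first
n_quoted=$(count_args 'y*')          # quotes stop the expansion: one literal argument
unmatched=$(echo blubble*)           # nothing matches, so the pattern is passed unchanged

echo "y*       -> $n_y arguments"
echo "'y*'     -> $n_quoted argument"
echo "blubble* -> $unmatched"
```

The function never sees the wildcard in the first call; it only sees the finished list of names.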

This makes things easier for the programs themselves, because they only need to
cater for lists of filenames on their command lines. However, there are a few
techniques that one could use in MS-DOS that cannot be used in Linux.
For example, in MS-DOS one could enter the command "copy *.doc *.bak", which
would copy each file ending in "doc" to a file with the same name but ending in "bak".
In Linux, the equivalent command "cp *.doc *.bak" would be translated to something like
"cp file1.doc file2.doc file3.doc file2.bak file4.bak", giving a totally different and
probably undesired result. Actually this technique no longer works well in Windows due
to the introduction of long file names.
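
One common way to get the MS-DOS effect in Linux is to loop over the matching files and build each new name yourself. The following is a sketch rather than the only approach; the filenames are invented, and "${f%.doc}" is standard shell syntax that strips a trailing ".doc" from the value of f.

```shell
export LC_ALL=C
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch file1.doc file2.doc file3.doc

for f in *.doc; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    cp -- "$f" "${f%.doc}.bak"       # file1.doc is copied to file1.bak, and so on
done

baks=$(echo *.bak)
echo "copies: $baks"
```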

Wildcards with directories

Wildcards in Linux work on directories too. For example, the pattern "*/file.txt"
matches every file called "file.txt" that sits directly inside any subdirectory of
the current directory.
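
The sketch below shows that each "*" in such a pattern covers exactly one directory level. The directory layout is invented for the demonstration.

```shell
export LC_ALL=C
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
mkdir -p a b/deep c
touch file.txt a/file.txt b/deep/file.txt c/file.txt c/other.txt

one_level=$(echo */file.txt)         # file.txt directly inside a subdirectory
two_levels=$(echo */*/file.txt)      # file.txt two directories down

echo "*/file.txt    -> $one_level"
echo "*/*/file.txt  -> $two_levels"
```

Neither pattern matches the file.txt in the current directory itself, and "*/file.txt" does not reach b/deep/file.txt.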

Hidden files

Wildcards will not match hidden files unless the wildcard pattern itself starts with
a period. Thus, the pattern ".*" matches all hidden files (hidden files are files whose
names start with a period, such as .profile or .kde2). Depending on the shell and its
version, ".*" may also match the special entries "." and "..".
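
The effect on hidden files can be checked with a short experiment. This is a sketch with invented filenames. It uses the pattern ".[!.]*" (a period, then any character that is not a period, then anything) instead of ".*", because in some shells ".*" also picks up the special entries "." and "..".

```shell
export LC_ALL=C
demo=$(mktemp -d) && cd "$demo"      # scratch directory (assumption for this demo)
touch .profile .hidden.txt visible.txt

plain=$(echo *)                      # hidden files are skipped
dot_txt=$(echo *.txt)                # .hidden.txt is skipped here too
hidden=$(echo .[!.]*)                # pattern starts with a period: hidden files match

echo "*        -> $plain"
echo "*.txt    -> $dot_txt"
echo ".[!.]*   -> $hidden"
```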