Regular Expression Patterns

This is a quick look-up of pre-constructed regular expression patterns you can use while validating forms and working with the Internet: a cheat sheet to speed up your coding.  Each of these has been tested to work with PHP functions.

If you would like to test any of these patterns yourself with an online tester, I recommend the Regular Expression Test Tool.  To test, drop the starting and ending quotes from the $pattern value and paste the /.../ part into the tester.

Validating Forms

Usernames
It's normal to place restrictions on usernames, as opposed to first and last names. This pattern matches 8 to 25 alphanumeric characters, with _ and - also allowed, but no other punctuation.  Spaces are not allowed, so each word is treated separately.  Because the pattern is not anchored, it matches any run of 8 to 25 such characters inside the input; add ^ and $ when you need to validate a whole form field.

$pattern ="/[[:alnum:]_-]{8,25}/";
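For form validation you usually want the whole field to match, not just part of it. Here is a minimal sketch of that, assuming an anchored version of the pattern; the function name isValidUsername is hypothetical and just for illustration.

```php
<?php
// Hypothetical helper: validate a whole form field as a username.
function isValidUsername(string $name): bool
{
    // ^ and $ force the pattern to cover the entire value.
    return preg_match('/^[[:alnum:]_-]{8,25}$/', $name) === 1;
}

var_dump(isValidUsername('geek_gumbo-01'));    // bool(true)
var_dump(isValidUsername('short'));            // bool(false): fewer than 8 characters
var_dump(isValidUsername('has a space here')); // bool(false): spaces are not allowed
```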

Addresses with PO Boxes
Matches: Box or box and that's it.

$pattern = "/[Bb]ox/";

US Postal Zip Codes
Must have 5 digits, and it may have a - followed by 4 more digits; that's it. No other characters allowed.  This will not validate Canadian or UK postal codes.

$pattern = "/^[0-9]{5}(?:-[0-9]{4})?$/";

Canadian Postal Codes
Canadian postal codes have two groups of 3 alphanumerics with a space in the middle.  The first letter may not be D, F, I, O, Q, U, W, or Z. A good code is: "K2A 9B3"

$pattern = "/^[ABCEGHJ-NPRSTVXY][0-9][[:alpha:]] [0-9][[:alpha:]][0-9]$/i";

U.K. Postcodes
Must start with a letter and have 5 to 7 alphanumeric characters, split into two groups by a space. Valid codes are: "DN55 1PT" and "B33 8TH" for example.

$pattern = "/^[[:alpha:]]{1,2}[0-9][[:alnum:]]? [0-9][[:alpha:]][[:alpha:]]/";

North American Phone Numbers
Must be 10 digits. Digits are grouped as 3 3 4 and may be separated with a . - or space, with optional ( ) around the area code. This is a good pattern to use with preg_match to separate the number into groups.

$pattern = "/^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/";

Format North American Phone Numbers to a set Format
Use preg_replace with the above pattern. If it's a valid number, in this pattern, it is formatted to:  (123) 456-7899

$replacement = "($1) $2-$3";
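To see the formatting in action, here is a small sketch. It assumes an anchored phone pattern whose first group captures only the three area-code digits, so the parentheses in the replacement are not doubled; the sample numbers are made up.

```php
<?php
// Anchored phone pattern; the area-code digits are captured without the parentheses.
$pattern = '/^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/';
$replacement = '($1) $2-$3';

foreach (['123.456.7899', '(123) 456-7899', '1234567899'] as $number) {
    echo preg_replace($pattern, $replacement, $number), "\n";
}
// Each line prints: (123) 456-7899
```

Single quotes are used here so PHP never tries to interpret anything inside the pattern or the replacement.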

International phone numbers
International numbers start with a + followed by country code and national number. Must have a +, numbers must be at least 7 digits, and can not exceed 15 digits.

$pattern = "/^\+(?:[0-9] ?){6,14}[0-9]$/";

Email addresses
Before the @, it allows numbers, upper and lower case letters, and . _ % - and that's it. You must have an @.  After the @, you must have at least one dot, but not 2 dots together, and after the final . only 2-6 letters, to match both 2-letter country codes and the 6-letter .museum.

$pattern = "/^[A-Za-z0-9._%-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,6}$/";

Date and Time

Date Formats
Match m/d/yy or mm/dd/yyyy allowing 1 or 2 digits for day and month and 2 or 4 digits for year. This is a good one to use with preg_match_all to pull dates out of the group. This will match years of 4 or 2 digits starting with 00 but not just a single digit for the year.

$pattern = "/^(1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?([0-9]{2})/";

If you require leading zeros and a 4 digit year, try this pattern. It accepts mm/dd/yyyy and also dd/mm/yyyy.  If you want to see the multidimensional array that preg_match_all outputs, my newchk utility can help when you only want one part of the date.

$pattern = "/^(?:(1[0-2]|0[1-9])\/(3[01]|[12][0-9]|0[1-9])|(3[01]|[12][0-9]|0[1-9])\/(1[0-2]|0[1-9]))\/([0-9]{4})/";

If you're searching through text for dates, as opposed to verifying form entries, you'll need to replace the ^ with the \b word boundary and put another \b at the end of the pattern.  That goes for all the rest of these patterns, also.

$pattern = "/\b(?:(1[0-2]|0[1-9])\/(3[01]|[12][0-9]|0[1-9])|(3[01]|[12][0-9]|0[1-9])\/(1[0-2]|0[1-9]))\/([0-9]{4})\b/";

Time
Hours:Minutes:Seconds for a 24 hour or 12 hour clock.  Preg_match_all can break this into an array where you can pull the hours out.

$pattern = "/^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$/";
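Here is a short sketch of pulling the hours out. Since the pattern is anchored and a form field holds a single time, it uses preg_match with its third argument rather than preg_match_all; the sample time is made up.

```php
<?php
$pattern = "/^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$/";

if (preg_match($pattern, '14:05:09', $parts) === 1) {
    // $parts[0] is the whole match; the three groups follow in order.
    list(, $hours, $minutes, $seconds) = $parts;
    echo "Hours: $hours, minutes: $minutes, seconds: $seconds\n";
    // Prints: Hours: 14, minutes: 05, seconds: 09
}
```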

Security

Passwords
A combination of upper and lower case letters and numbers, at least 8 characters but not more than 25.  No special characters.

$pattern = "/^(?=.*[[:upper:]])(?=.*[[:lower:]])(?=.*[[:digit:]])[[:alnum:]]{8,25}$/";

And with special characters, but not quotes or | which could lead to SQL injection

$pattern = "/^(?=.*[[:upper:]])(?=.*[[:lower:]])(?=.*[[:digit:]])[[:alnum:]~!@#$%^&*()_=+-]{8,25}$/";

U.S. Currency
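Make the $ sign and any commas optional. There must be a . with 2 decimals.
It will recognize $.90, $0.90, $002,456.23, and $ 23.13 with a space after the $ sign.  The pattern is in single quotes because PHP turns \$ into a plain $ inside double quotes, which the regular expression would then read as an end-of-string anchor.

$pattern = '/^\$? ?([0-9]{1,3}(,[0-9]{3})*|[0-9]*)(\.[0-9][0-9])$/';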

Credit Cards
First, use preg_replace to strip out the spaces and -'s between the numbers.

$pattern = "/[ -]/";
$replacement = '';
$cleanccnumber = preg_replace($pattern, $replacement, $ccnumber);

The four major credit card companies (Visa, MasterCard, Discover, Amex) all have different number formats. Visa is 16 digits (13 on older cards) starting with a 4, MasterCard is 16 digits starting with 51-55, Discover is 16 digits starting with 6011 or 65, and Amex is 15 digits starting with 34 or 37. This checks all these combinations for each type of card.

$pattern = "/^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9]{2})[0-9]{12}|3[47][0-9]{13})$/";
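If you also want to know which card you are looking at, one option is to split the combined pattern into one pattern per company. This is only a sketch under that assumption: the function name cardType is hypothetical, the numbers are well-known test numbers, and it checks the format only, not whether the card is real.

```php
<?php
// Hypothetical helper: clean a card number and report which pattern it fits.
function cardType(string $ccnumber): ?string
{
    // Strip the spaces and dashes first.
    $clean = preg_replace('/[ -]/', '', $ccnumber);
    $types = [
        'Visa'       => '/^4[0-9]{12}(?:[0-9]{3})?$/',
        'MasterCard' => '/^5[1-5][0-9]{14}$/',
        'Discover'   => '/^6(?:011|5[0-9]{2})[0-9]{12}$/',
        'Amex'       => '/^3[47][0-9]{13}$/',
    ];
    foreach ($types as $name => $pattern) {
        if (preg_match($pattern, $clean) === 1) {
            return $name;
        }
    }
    return null; // no pattern matched
}

echo cardType('4111 1111 1111 1111'), "\n"; // Visa
echo cardType('3782-822463-10005'), "\n";   // Amex
var_dump(cardType('1234'));                 // NULL
```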

Credit Card Security Code
Three numbers for all cards, except Amex has four numbers.

$pattern = "/^[0-9]{3,4}$/";

Credit Card Expire Date
Usually are 2 digit month and 2 digit year, but normally done with a drop down menu for validation.

Social Security Number
Nine digit numbers in the format of 999-99-9999.  The first three digits are not 000, 666, or 900 to 999. The other two groups can not be 00 or 0000.  Everything else works.

$pattern = "/^(?!000|666|9[0-9]{2})[0-9]{3}-(?!00)[0-9]{2}-(?!0000)[0-9]{4}$/";

Federal Tax ID Number
EIN numbers are 9 digit numbers in the format of 99-9999999.

$pattern = "/^[0-9]{2}-[0-9]{7}$/";

U.S. Passport Number
Must be between six to nine digits all together.

$pattern = "/^[0-9]{6,9}$/";

Internet

IPv4 Address
IPv4 addresses are 4 groups of 1 to 3 digits, each between 0 and 255, like so, 255.255.255.255, or theoretically they could be 0.0.0.0

$pattern =  "/^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/";
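A quick sketch of the IPv4 pattern against a few sample values, two good and two bad. The sample addresses are made up for illustration.

```php
<?php
$pattern = "/^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/";

$results = [];
foreach (['255.255.255.255', '0.0.0.0', '256.1.1.1', '1.2.3'] as $ip) {
    // preg_match returns 1 on a match, 0 otherwise.
    $results[$ip] = preg_match($pattern, $ip) === 1;
    echo $ip, ' => ', $results[$ip] ? 'valid' : 'invalid', "\n";
}
```

PHP also has a built-in check, filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4), which is worth considering if you don't need the regular expression itself.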

IPv6 Addresses
IPv6 addresses consist of eight 16-bit words, written as up to 4 hexadecimal characters each and delimited by colons. Leading zeros are optional, for example: "1762:1230:EFAC:220:2:B03:1:AF18"

$pattern = "/^(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4}$/i";

Domain Names
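This matches URLs for both domestic and international domain names, for example: http://www.geekgumbo.com.  It only checks for a known scheme (http, https, ftp, or file) followed by ://, and accepts anything after that.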

$pattern = "/^(https?|ftp|file):\/\/.+$/i";

My intent is to gradually add to this list over time.  If any one would like to add a pattern to the list, please put them in a comment, and I will be happy to expand the list for others to use.

Posted in Regular Expressions | Leave a comment

Regular Expression Syntax

In our last post I went through a quick overview of the PHP functions that can be used with regular expression patterns.  Let's take a closer look at these patterns.

I always assign my regular expression to a variable, to make it easy to change, and use with a function, like so:

$pattern = "/quick/";

if (preg_match($pattern, $text))
{
    // ... do something
}

The above regular expression will find a lower-case quick.  We then use the pattern with a PHP regular expression function like preg_match, see my last article.

Let's talk about this pattern.  The quotes enclose the pattern, like you would any string in PHP. The slashes, /, mark the start and end of the regular expression; the only thing allowed after the last / is a modifier, like so:

$pattern = "/quick/i";

The "i" says ignore case, now we would match on either quick or Quick.

If you want to include a / in the pattern, you  can escape the / with a backslash, \ , like so.
/123\/456/  would match 123/456

Let's work through the syntax:

/^ar/     ^  finds a string starting with ar

/ar$/   $  finds strings ending in ar.

/a.r/    .   is like a wild card and matches any one character, here this would match aar, abr, acr, adr, ...

/ab*c/   *   the asterisk means zero or more of the last character.  This matches ac, abc, abbc, abbbc, ...

/do(es)?/  ?  the question mark matches the preceding grouping 0 or 1 time.  This matches do or does.

BRACKETS [   ]

Brackets are used to match anything within the bracket.

/ar[ckt]/  matches arc, ark, and art

/[0-9.-]/  matches any number, dot, or - sign

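There is an or, |, operator.  It goes between whole alternatives, usually inside parentheses, not inside brackets, where | is just another character to match.
/(abc|xys)/  matches abc or xys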
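A quick way to see how | behaves inside and outside of brackets is to try both forms. This is just a small sketch with made-up test strings.

```php
<?php
// With parentheses, | separates whole alternatives.
var_dump(preg_match('/(abc|xys)/', 'xxabcxx')); // int(1)
var_dump(preg_match('/(abc|xys)/', 'a|x'));     // int(0)

// Inside brackets, | is just another single character to match.
var_dump(preg_match('/[abc|xys]/', '|'));       // int(1)
```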

NOT CHARACTER ^

There is a reverse character inside the bracket, ^, which matches anything but the characters given.   This is not at the start of the pattern, and doesn't mean "starts with."  It is inside the brackets.

/ar[^ckt]/  matches ara, arb,ard, ... not arc, ark, or art

/[^A-Za-z0-9]/  matches any symbol not a number or letter

RANGES  -

Brackets also allow for ranges.
/ar[c-e]/  matches ar with c,d, or e, that is: arc,ard,are, but not ara, or arf
/[0-9]/    matches any numbers
/[A-Z]/    matches any capital letters

You can combine ranges
/[0-9A-Za-z]/  matches all letters upper and lower case and numbers.  In the ASCII character table capital letters come before lowercase letters and are separate characters.

MULTIPLIERS

There's some special characters that act as multipliers.

If you want to do one or more you use a plus, like this:
/ab+c/  matches abc, abbc, abbbc, abbbbc, ...

You can use multipliers with ranges
/[a-z]+/  matches one or more lowercase letters.  For example, searching  "This one" would match "his".

If you want 0 or 1 of the preceding character
/ab?c/  matches ac, abc, and that's it.

You can do a repetitive grouping with ( )
/a(bc)+d/  matches abcd, abcbcd, abcbcbcd, ...

And you can multiply patterns with quantifiers {}
/ab{3}c/ matches abbbc
/a(bc){4}d/ matches abcbcbcbcd

CHARACTER CLASSES

There are some character classes, or groups of characters represented by a word. They are set off with [: :] and must be used inside brackets, like so: [[:lower:]]

[[:lower:]] matches lower case letters, a but not A
[[:upper:]] matches upper case letters, A but not a
[[:alpha:]] matches letters any case, a,A
[[:alnum:]] matches alphanumeric, letters or numbers a,A,2
[[:space:]] matches any whitespace: a space, tab, or line break
[[:blank:]] matches a space or tab
[[:digit:]]{3} matches any three digits, 012, 213, ...

[[:cntrl:]] matches control characters.  Control characters include null, bell, backspace, horiz tab, line feed, form feed, carriage return, escape, and delete.

[[:punct:]] matches punctuation, such as ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

[[:xdigit:]] matches hexadecimal digits, 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f

Regular expressions can be put together into some fairly complex patterns. They'll look so complicated, you'll wonder how it ever works.

Let's do one, a simple US zip code allowing both the 5 and 9 digit zips.  Here goes,

/^([0-9]{5})(-[0-9]{4})?$/

This reads: at the start of the string, match exactly 5 digits as the first group.  The second group matches a - followed by exactly 4 digits, and the ? after it makes that whole group optional, so it occurs 0 or 1 times.  The $ then requires the string to end, either right after the 5 digits or after the - and 4 digits.
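To watch the two groups at work, here is a small sketch that runs the zip code pattern through preg_match and looks at what each group captured. The sample zip codes are made up.

```php
<?php
$pattern = '/^([0-9]{5})(-[0-9]{4})?$/';

foreach (['12345', '12345-6789', '1234', '12345-67'] as $zip) {
    if (preg_match($pattern, $zip, $m) === 1) {
        // $m[1] is the 5 digit group; $m[2] exists only when the "-1234" part is present.
        $plus4 = isset($m[2]) ? $m[2] : '(none)';
        echo "$zip: zip={$m[1]} plus4=$plus4\n";
    } else {
        echo "$zip: no match\n";
    }
}
```

The first two values match; the last two fail because the ^ and $ anchors require the whole string to fit the pattern.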

Online Test Tool

So you don't get lost in building your regular expression, there is an online regular expression test tool, here, that I'd like to recommend.   This will give you the ability to insert a pattern, insert a test string, and see if your pattern works before you use it in your code.  The tester, along with a good basic knowledge of regular expression syntax, will go a long way toward making searching and validating your data a lot easier.

 


PHP Regular Expression Functions

When you use a regular expression, you create a special regular expression syntax pattern.  That pattern is then used to search the supplied text for that pattern, and then do something if it finds the pattern.  It can be used to verify and find credit card numbers, zip codes, and words that start with "t" and end with "s," for example.

PHP has nine functions that help with regular expressions.  I'm going to review each function quickly.  To make things easier, we can group the nine functions into three groups: functions used to test for a pattern, functions used to insert text, and functions used to output patterns found in the text.

FUNCTIONS USED TO TEST

These functions are meant to be used with "if" statements and yield true, if it is a match, or false, if there is not a match.

PREG_MATCH

Preg_match looks to see if there is a match between the regular expression and the string you're testing, like so:

$source = "The quick brown fox jumped over the lazy dog.";
$pattern = "/quick/";  // look for the word 'quick'

if (preg_match($pattern, $source))
{
    //do something
    jump();
}

The output of preg_match is either 1 or 0 (or false if an error occurred).  There is either a match or not.  It stops searching after it finds a match. In the above example, the word "Quick" with a capital Q would not be a match, we'd have to have the pattern  "/quick/i" (the i for case-insensitive) to do that.  Preg_match is probably the most used of the preg functions.

PREG_LAST_ERROR

This function returns the error code from the last preg_ function you ran.  The error codes are a set of predefined PHP constants.  It is often used with preg_match to see if an error has occurred while matching.

preg_match($pattern, $source);

if(preg_last_error() === PREG_RECURSION_LIMIT_ERROR)
{
echo ("Recursion limit was exhausted!");
}
else if (preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR)
{
echo ("Backtrack limit was exhausted!");
}

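PHP has a small set of predefined regular expression error constants, among them PREG_NO_ERROR, PREG_INTERNAL_ERROR, PREG_BACKTRACK_LIMIT_ERROR, PREG_RECURSION_LIMIT_ERROR, and PREG_BAD_UTF8_ERROR. If you are checking for errors, check the PHP manual for the full list of these constants.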
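The recursion and backtrack limits are hard to trigger on demand, so here is a sketch that uses a different error that is easy to reproduce: matching with the /u modifier against a string that is not valid UTF-8. The sample string is made up.

```php
<?php
// "\xff" is not valid UTF-8, so matching with the /u modifier fails.
$result = preg_match('/quick/u', "The quick \xff fox");

// On an error preg_match returns false, not 0, so test with ===.
if ($result === false) {
    if (preg_last_error() === PREG_BAD_UTF8_ERROR) {
        echo "The subject is not valid UTF-8\n";
    } else {
        echo "Some other preg error, code ", preg_last_error(), "\n";
    }
}
```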

FUNCTIONS USED TO INSERT TEXT

PREG_SPLIT

preg_split splits a string into different array items.

$pattern = "/ /";  // split on a space
$source = "1 2 3 4 5 6 7 8 9";
$result = array();

$limit = 4;  // optional - return at most 4 items
$flag = PREG_SPLIT_NO_EMPTY; // optional - predefined PHP constant.  This one returns only non-empty items

$result = preg_split($pattern, $source, $limit, $flag);

print_r($result);

Which returns:  Array ( [0] => 1 [1] => 2 [2] => 3 [3] => 4 5 6 7 8 9 )

The reason the remaining numbers are not split is that we used the $limit to return only four items; the last item holds the rest of the string. The $flag variable can be any of three PHP predefined constants you can use in splitting strings.
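To show what PREG_SPLIT_NO_EMPTY actually changes, here is a small sketch that splits a sloppy comma-separated list with and without the flag. The sample string is made up; a limit of -1 means no limit.

```php
<?php
$source = "apples, oranges,,  grapes";

// Split on commas and any whitespace around them.
$withEmpty = preg_split('/\s*,\s*/', $source);
print_r($withEmpty);
// Array ( [0] => apples [1] => oranges [2] => [3] => grapes )

// Same split, but empty items are dropped.
$noEmpty = preg_split('/\s*,\s*/', $source, -1, PREG_SPLIT_NO_EMPTY);
print_r($noEmpty);
// Array ( [0] => apples [1] => oranges [2] => grapes )
```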

PREG_QUOTE

Preg_quote is unique in that it places a \ in front of every character that has a special meaning in a regular expression.  It is set up differently than the other preg functions.

$delimiter =  "/"; // optional - the pattern delimiter, which also gets escaped
$source = "The quick brown fox cost me $600 when it bit my dog.";

$result = preg_quote($source, $delimiter);
echo $result;

The result is: The quick brown fox cost me \$600 when it bit my dog\.

The delimiter is not a replacement for the backslash; it is one extra character to escape, usually the / that encloses your pattern.  This is useful when you want to drop arbitrary text into a pattern and have it match literally.
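Here is a sketch of where preg_quote earns its keep: building a pattern out of text that contains regular expression characters. The variable names and sample text are assumptions for illustration.

```php
<?php
$userText = 'cost me $600.';            // text we want to find literally
$quoted   = preg_quote($userText, '/'); // becomes: cost me \$600\.
$pattern  = '/' . $quoted . '/';

$source = 'The quick brown fox cost me $600. Ouch.';
var_dump(preg_match($pattern, $source)); // int(1)

// Without preg_quote, "$" is an anchor and "." a wildcard, so this does not match.
var_dump(preg_match('/cost me $600./', $source)); // int(0)
```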

FUNCTIONS USED TO OUTPUT

PREG_REPLACE

preg_replace performs a regular expression search and replace.

$source =       // What to search
$pattern =      // The search pattern
$replacement =  // What to use as a replacement
$limit =        // optional - The number of replacements to do
$count =        // optional - Filled in with the number of replacements made
$result =       // The string that is returned (an array if $source is an array)

$result = preg_replace($pattern, $replacement, $source, $limit, $count);

A Limit of -1 will do all the replacements.  The entire $source is returned with the replacement strings in place.

This is an interesting function as you could have several patterns and replacements in an array.  There is a surprisingly good example on the PHP web site.

$string = 'The quick brown fox jumped over the lazy dog.';
$patterns = array();
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements = array();
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';

echo preg_replace($patterns, $replacements, $string);

The result is:

"The bear black slow jumped over the lazy dog."

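The words come out in an unexpected order because preg_replace pairs patterns and replacements in the order the array elements were added, not by their numeric keys.  Calling ksort() on both arrays first gives "The slow black bear jumped over the lazy dog."

The functions in this group return a $result, a string or an array, that you use elsewhere in your code.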

PREG_FILTER

This is identical to preg_replace, it does a search and replace based on a pattern, but it filters out what doesn't match: when $string is an array, only the items where the pattern matched are returned, with the replacements made.

$result = preg_filter($pattern, $replacement, $string, $limit, $count);
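The difference is easiest to see side by side. Here is a small sketch that runs the same pattern and replacement through both functions; the sample array is made up.

```php
<?php
$subjects = ['quick fox', 'lazy dog', 'quick cat'];

// preg_replace returns every item, changed or not.
$replaced = preg_replace('/quick/', 'slow', $subjects);
print_r($replaced);
// Array ( [0] => slow fox [1] => lazy dog [2] => slow cat )

// preg_filter returns only the items where the pattern matched, keeping their keys.
$filtered = preg_filter('/quick/', 'slow', $subjects);
print_r($filtered);
// Array ( [0] => slow fox [2] => slow cat )
```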

PREG_REPLACE_CALLBACK

This works like preg_replace. It performs a regular expression search and replace, but instead of a $replacement value a callback is specified.  Whatever the callback returns is used as the replacement.

$source = "The quick brown fox jumped over the lazy dog.";
$pattern = "/quick/";  // looking for the lowercase word "quick"
$limit = -1;

function theCallBack($matches)
{
    // $matches[0] holds the text that matched
    return strtoupper($matches[0]);
}

$result = preg_replace_callback($pattern, 'theCallBack', $source, $limit, $count);

Every time the word "quick" matches, the callback is fired.  In this case, the word quick will be in $matches[0], and $result will read "The QUICK brown fox jumped over the lazy dog."
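A callback is most useful when the replacement depends on what was matched. As a sketch of that idea, this one capitalizes every lowercase word of five or more letters, using an anonymous function in place of a named callback; the rule itself is made up for illustration.

```php
<?php
$source = "The quick brown fox jumped over the lazy dog.";

// Hypothetical rule: capitalize every lowercase word of five or more letters.
$result = preg_replace_callback(
    '/\b[a-z]{5,}\b/',
    function (array $matches) {
        // The returned string replaces the matched text.
        return ucfirst($matches[0]);
    },
    $source,
    -1,
    $count
);

echo $result, "\n"; // The Quick Brown fox Jumped over the lazy dog.
echo $count, "\n";  // 3
```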

PREG_MATCH_ALL

Preg_match_all matches repeatedly all occurrences of a pattern in a string and outputs the results to a multidimensional array.  It does not stop after the first occurrence, like preg_match.  This function is useful to pull out specific information from a document.  It can be used to pull out all javascript source files in a web page, for example.

$source = file_get_contents("http://www.geekgumbo.com");  // Open a web page source
$pattern = "/src=[\"']?([^\"']*\.(js))[\"']?/i";   // Start with "src=" and end with a quote, or double quote, after the ".js"
$result = array();

preg_match_all($pattern, $source, $result);

The result might look something like this.

$result [0][0] -> src="../../js/jquery.js"
$result [0][1] -> src="../../js/script.js"
$result [1][0] -> ../../js/jquery.js
$result [1][1] -> ../../js/script.js
$result [2][0] -> js
$result [2][1] -> js

If you want to play around with preg_match_all, a perfect way to see all the results easily is using my newchk utility at "http://www.newchk.com".

PREG_GREP

Preg_grep is like the grep command in Linux.  It searches through an array and returns all the matches of a particular pattern into a result array.  Let's look.

$source = array("apples", "apricots", "oranges", "grapes", "bananas");
$pattern = "/^ap/";  //begins with an "ap"
$result = array();

$result = preg_grep($pattern, $source);

print_r($result);
// Prints: "Array ( [0] => apples [1] => apricots )"

In the above example, preg_grep returned all fruits starting with the letters "ap".

There is an interesting option, where you can return an array of all the items that do not match the pattern by adding the PREG_GREP_INVERT constant, like so.

$nomatch = array();
$nomatch = preg_grep($pattern, $source, PREG_GREP_INVERT);
print_r($nomatch);

// Prints "Array ( [2] => oranges [3] => grapes [4] => bananas )"

Notice that the result array maintains the original array keys.

Preg_match is by far the most used regular expression function, followed by preg_replace.  This post did not intend to be a complete write-up on regular expression functions, but rather useful as a quick look-up when writing code as to the syntax and intent of a preg function.   Check the  PHP manual for the definitions of any constants referred to in the article and further explanation.


Web Development Tools

A web developer, over time, gathers a suite of tools to make his job easier.  I thought it might be interesting to my readers to go through the current tools I'm using and give you my 50,000 foot view of what the tool is good for, and maybe why I find it better than other tools.

I run a Windows 7 Development environment at work doing open-source development.  I communicate with various Linux servers during the course of the day.  All the utility tools listed here run on Windows 7, are open-source, and thus free, for you to download and use.

Let's start with "WampServer" - I use WampServer as my Apache, PHP and MySQL localhost server.  I prefer WampServer, over XAMPP, because the toolbar menu in the lower right corner is easy to use and has a lot of functionality, and it's easy to set up virtual hosts in the menu.  See my article on  "Multiple Virtual Hosts in WampServer".    A word to the wise on this one.  Beware the 64 bit version of WampServer; at this writing, it has bugs.  Go with the 32 bit version.  It works great.

For an IDE, I use "NetBeans".  I like NetBeans, over Eclipse, because it has a more streamlined and easier user interface. See my article on "NetBeans 7.1 Review"

I use several different editors, besides NetBeans, depending on what I need to do.  On Linux, I use Vim.  I have Vim loaded in Windows, and use it when I go to the command line. The others don't count, including the elusive EMACS.  I should get some response on this one.  All in good fun :-}

My most used editor is "Notepad++".  I highly recommend this editor, I'm usually back and forth in it all day.  I use it to write and save snippets of code, among other things. What's nice is it keeps all your current tabs open every time you open the editor until you specifically close the file.

My current project has a lot of compressed JavaScript, which is difficult to read.  To uncompress and edit the file, I use the "Free JavaScript Editor".

For version control, I use "Git".  There are now two environments for Git for Windows, msysGit and git-scm; each does the job.  I liked and used msysGit for years, and am now using git-scm.  It has a nice icon, but I'm finding a couple of quirks.  I will probably go back to msysGit next time.  It's a toss up.  This is one where you migrate back and forth.

What about Images:

For viewing images, I use "XnView".  It works great, and allows you to quickly access pictures on your PC.  For years, I used IrfanView and stayed with it, mostly because I liked their panda icon.  I know, dumb.  XnView has a better interface.

Picking colors from anywhere on the screen: "ColorCop".  I used to use it to measure screen distances, but I've found that Windows 7 has messed up this feature.  Hopefully, it will be fixed in the future.  It's still the best color picker I've used.  I'm looking for a utility to measure screen distances in pixels.

For editing an image, I like "Gimp 2.6".  I found that 2.8 is not my "cup of tea" with the unified menu.  And I don't like that I have to export the image to save it in another format.  I got used to 2.6.

Taking a snapshot of the screen for this blog, for example, I use "Greenshot".

For reading PDF's or ebooks, I like "MartView".  See my article on "MartView - PDF Reader".

Behind the Scenes Utilities:

My hands down favorite is "AutoHotKey".  If you don't have this yet,  get it.  It will improve your productivity.  This little utility will allow you to bring up web sites, start applications, and enter strings of text to the screen with a couple of key strokes, of your choosing. 5 STARS.  See my article "AutoHotKey for Windows".

As long as we're messing with keys, to remap your keyboard, try "KeyTweak", and turn off that annoying CapsLock key.  I wrote an article on KeyTweak here.

For saving my clipboard contents, I use "Spartan".  It is very configurable, easy to bring up, and you can save past copies, like your favorite web sites, in a permanent area.

Working with Files:

For finding lost files on my computer, I use "Everything".  I like the way it eliminates your possible choices as you type, and once it scans your files when I log in, it's lightning fast. See my article on Search Everything.  By the way, Windows 7 Search does not find every file.

For searching for specific text strings inside of files, like which files in a group contain the term "<a title="Milestone", and at what line number: on Linux it's grep; on Windows, I use "AstroGrep," and it too is fast.

For finding the differences between two or three versions of the same file, I like "P4Merge".  It also integrates well with Git.  See my article on: "P4Merge File Comparison Editor"    I've heard good things about "Beyond Compare," but that's not free.

For extracting compressed zip and tar files, I prefer: "7Zip".

SQL Tools:

Of course, a nod to "phpMyAdmin" which comes with WampServer, a good tool.

For writing queries, I like the older "MySQL Query Browser" over any of the later tools.  Better hurry if you want this tool as Oracle is moving to retire it.

A tool I don't care for, that other developers seem to love, Toad.  It's got bugs, it's not fail safe, and has a crappy database export.

Windows Utilities:

I have to give a nod to Scotty and "WinPatrol"  No one can mess with my Start Up files without my knowing about it.

Clean up the trash - "CCleaner" can't be beat, although I wish they didn't come out with so many new versions.

I use both "Malwarebytes" and "Spybot Search and Destroy" for cleaning out malware.  Malwarebytes takes longer to run, but does a better job.  Both clean up different types of malware.  It's a toss between "AVG" and "Avast" on antivirus, I've used both.  And my firewall, which has now gone to its grave is "PC Tools Firewall Plus".

For killing processes, which the Task Manager can't seem to do, if you're running XP, check out "SysTree++".  It's awesome.  For any other Windows, I recommend "Microsoft Process Explorer"; it's free.

I know I've missed a couple here and there, but these are the tools I use every day.  I'm all ears to any other suggestions you might have for tools you use and love.

The two tools I use and love the most are Notepad++  and AutoHotKey.  So there you have it, what my work environment looks like.  I hope it has given you some new ideas for software to try out.

Posted in Web Development | Leave a comment

PHP – Working with Files

Although I don't get much chance to write to files directly, since most of my work is with databases, every once in a while it's nice to know how to do this.

Working with files is much like working with a database, you first have to open a connection to the file, then, like with a database, you can: create a new file, read the file, and update the file. After you're done with the file operation, you close the file, just like you close a database. Let's see how we do this with PHP.

First, to open a file we use the fopen command like so:

$fileName = "testFile.txt";
$fileHandle = fopen($fileName,'a') or die("can't open file");

The fopen command generates a link to the file, called a file handle. We use the file handle for the rest of our file operations as a pointer to the file.

When you open a file, it is expected that the programmer will know what he wants to do with the file. The fopen command asks why the file is being opened. There are several choices to substitute for the 'a' in the fopen command:

'r'  Open for reading only; File pointer at the start of the file.

'r+' Open for reading and writing; File pointer at the start of the file.

'w'  Open for writing only; File pointer at the start of the file; Delete any contents. If the file does not exist, create it.

'w+'  Open for reading and writing; File pointer at the start of the file, Delete any contents. If the file does not exist, create it.

'a'   Open for writing only; File pointer at the end of the file. If the file does not exist, create it.

'a+'  Open for reading and writing; File pointer at the end of the file. If the file does not exist, create it.

'x'  Create and open for writing only; File pointer at the start of the file. If the file exists, generate an E_WARNING.

'x+'  Create and open for reading and writing; File pointer at the start of the file. If the file exists, generate an E_WARNING.

Quite an array of options. There are a few more, but these are the main choices. In reality, if you're adding to the contents of the file, like with a log or error file, you want to add the contents below anything else in the file, so the "a" is appropriate. If you're creating a new file, use "x" to avoid overwriting an existing file; "w" will create the file too, but it erases anything already in it. If you're just reading the file, a simple "r" will do.

Now that we have the file open, how do we add content and read the file.

$fileName = "testFile.txt";
$fileHandle = fopen($fileName, 'a') or die("can't open file");

$strTest = "The string to write to the file.";

// To write to the file
fwrite($fileHandle, $strTest);

// Close the file after the write
fclose($fileHandle);

The fwrite command says write the $strTest to the file. Since we opened the file to append, "a", $strTest will be added to the end of any contents in the file.

How about reading, we could use fread() , like so:

$fileName = "testFile.txt";
$fileHandle = fopen($fileName, 'r') or die("can't open file");

// If we want to read the entire file
$lengthOfRead = filesize($fileName);

// You assign the output of the fread to a variable
$result = fread($fileHandle, $lengthOfRead);

// Go back to the start of the file before reading it again
rewind($fileHandle);

// If we want to read a line at a time
while (!feof($fileHandle))
{
	// each line read by fgets() already ends with its own line break
	echo fgets($fileHandle);
}

fclose($fileHandle);

The $lengthOfRead variable is the amount of the file you want to read, in bytes. feof() tests for the end of the file. fgets() reads a line at a time.
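If you want the lines in an array rather than echoed, the same loop can collect them. This is a minimal sketch that assumes a throwaway temporary file, created with tempnam(), so it can run on its own; the file contents are made up.

```php
<?php
// Create a small temporary file so the example has something to read.
$fileName = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($fileName, "first line\nsecond line\n");

$fileHandle = fopen($fileName, 'r') or die("can't open file");

$lines = [];
// fgets() returns false at the end of the file, which ends the loop.
while (($line = fgets($fileHandle)) !== false) {
    // Strip the line ending that fgets() leaves on each line.
    $lines[] = rtrim($line, "\r\n");
}

fclose($fileHandle);
unlink($fileName); // clean up the temporary file

print_r($lines); // Array ( [0] => first line [1] => second line )
```

Testing the return value of fgets() directly avoids an extra, empty read that can happen when looping on feof() alone.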

Line endings are different for different operating systems. Older Macs use "\r", Linux and current Macs use "\n", and Windows needs "\r\n". On Windows, adding the 't' flag to the fopen mode, as in 'wt', translates "\n" to "\r\n" for you, and the PHP_EOL constant always holds the right line ending for the system PHP is running on.

There are a few other commands in PHP that combine the fopen, fread or fwrite, and fclose all in one command, that I recommend.

file_put_contents("testFile.txt", $strTest);

$fileContents = file_get_contents("testFile.txt");

file_put_contents is identical to calling fopen(), fwrite() and fclose() successively to write data to a file. If the file exists the content will be overwritten, unless you pass the FILE_APPEND flag. If the file doesn't exist it will be created.

file_get_contents reads the entire file as a string to memory.
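For a log file you would normally want to add to the end rather than overwrite. Here is a short sketch of that with the FILE_APPEND flag; the temporary file and the log entries are assumptions made so the example can run on its own.

```php
<?php
$logFile = tempnam(sys_get_temp_dir(), 'log'); // hypothetical log file

// Without a flag, the existing contents are replaced.
file_put_contents($logFile, "first entry" . PHP_EOL);

// With FILE_APPEND, the new text is added to the end.
file_put_contents($logFile, "second entry" . PHP_EOL, FILE_APPEND);

$contents = file_get_contents($logFile);
echo $contents;

unlink($logFile); // clean up the temporary file
```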

And there is one more way to read a file:

$bytesRead = readfile("testFile.txt");

The main difference is that readfile writes the file's contents straight to the output buffer instead of returning them as a string; its return value is the number of bytes read. The contents go directly to your screen, so there is no need for a separate print().

As you can see, there's more than one way to accomplish the same task in PHP. You just need to know what you're trying to do, before you start doing it.

Posted in PHP | Leave a comment