Archive for March, 2012


Posted: March 19, 2012 in Uncategorized
    1. How to Generate Random Numbers using php?

You can generate Random Numbers using php: follow the code.

<? srand(time()); $random = (rand()%9); print("random number between 0 and 9 is: $random"); ?> 
    1. How to send Mail Using PHP to send mail?

Let’s get interactive with surfers again. I’d like to create a form mail system in PHP. First we need to create a form in HTML to gather the information. I am going to use one similar to my Perl, because it looks clean.

form action=”send_email.php3″ method=”POST”>

The $to variable will be defined in the send_mail.php3 script to point to my email address (you’ll see the php code momentarily), of course. I also prefill the subject with the words “diary entry suggestion” which visitors can change to something else, if they want:

<input type=”text” size=”22″ name=”subject” value=”diary entry suggestion”>

Ok, let’s take a look at the code to mail the contents of the above form to me and if the process is successful it will redirect you right back to this page. In order for this code to work you will need to know the path to sendmail on your server.

<? $to = ""; $from_header = "From: $from"; if($contents != "") { //send mail - $subject & $contents come from surfer input mail($to, $subject, $contents, $from_header); // redirect back to url visitor came from header("Location: $HTTP_REFERER"); } else { print("<HTML><BODY>Error, no comments were submitted!"); print("</BODY></HTML>"); } ?> 

Notes: I make sure there are some comments submitted or else show an error message by using the != (not equal) with an if statement. When you want to redirect the browser using the header function (like when using the setcookie function) you must do it before any HTML. With the header and $HTTP_REFERER in the code above I am simply sending people who submit the form back where they came from, which should will be this page that calls the form. I could have easily changed the header line to read:


    1. Using Regular Expressions – Validating Email Addresses

Validating email addresses isn’t a perfect science, unfortunately. When you think of all the different possible formats, you have to be pretty broad — maybe too broad — in defining a matching pattern (or regular expression, whatever you prefer to call it). You will find many different regular expressions other programmers have written for validating email addresses so mine certainly isn’t the only one (nor am I egotistical enough to suggest it is the best one), but it is one that works:

// join the mail list?
if ($php_script_list == “yes”)
// is the $from email address in valid format?
if(ereg(“([[:alnum:].-]+)(@[[:alnum:].-]+.+)”, $from))

First we make sure the $php_script_list checkbox is checked. Then we check the syntax of the user submitted $from email address. The first portion of the email needs to contain at least one but likely more {alnum} alphanumeric (a-z or 0-9) character or the dash – symbol then there must be the @ symbol followed by at least one, but likely more. Then there is more alphanumeric characters and at least one period mixed in there. That’s what that long somewhat cryptic regular expression means. would be a valid email
abc@123com woud be an INvalid email (missing period) would be a valid email would be an INvalid email (missing @)

In regular expressions you enclose ranges within brackets. Thus [a-z] would mean any letter between a-z would return true. [a-c] would only return true if a,b,c was in the comparison string. I encourage you to refer to the php manual and look at the ereg function again.Note that when using ereg function patterns are case sensitive.

int ereg(string pattern, string string, array [regs]);

Use the eregi function in place of ereg for patterns that are case INsensitive. Let’s look at the rest of the mail list code:

if(file_exists("email_list.txt")) { $newfile = fopen("email_list.txt", "r"); while(!feof($newfile)) { $duplicate = fgetss($newfile, 255); // is email address submitted already on file? if(eregi("$from", $duplicate)) { print("<HTML><BODY>This email <strong>$from</strong>"); print(" is already in the database. <br>Please go back"); print(" and uncheck the mail list box</BODY></HTML>"); fclose($newfile); exit; } } fclose($newfile); $addemail = fopen("email_list.txt", "a"); fputs($addemail, "$from\n"); fclose($addemail); } else { // since file doesn't already exist, let's create for first time $newfile = fopen("email_list.txt", "a"); fputs($newfile, "$from\n"); fclose($newfile); chmod("email_list.txt", 0666); }

Start by determining whether the email_list.txt file exists. If it does not exist then this is the very first time someone is being added to the list and we don’t need to bother with checking to see whether their email address is in the list or not. So we only need to create the file and write the email address. You may also notice I used the chmod function to set the file permissions for the email_list.txt file. This line of code is optional. When using the “a”ppend file option, Unix will automatically create the file for the first time and set the permissions. The problem is, depending, on the directory and umask settings it may create a file that only the script has owner permissions to read and write and update. By using the chmod function sometimes can be the only way to gain the permissions you want once a script has set the initial permissions. If you try to FTP a file to a directory and get the “permissions denied” error, you may very well need to write a small PHP script using the chmod command to change the permissions of the file (or simply delete the file, but then you’ll lose the data in it). Ok, but let’s say the email_list.txt file does exist, now the most logical thing to do is make sure we don’t already have that same email address on file.  We do this by reading a line at a time of the existing file and comparing that line (I named the variable $duplicate) against the $from email address and seeing if there is a match.  If there is then we print the message to the browser letting the person submitting the form know that we already have that email address in the database, close the open file, and exit the script. They can go back and uncheck the box and resubmit and the email will go through fine. Now if the email address does not match another email in the database they are appended to the end of the database.

    1. Changing a strings case in PHP
<? // force all uppercase print(strtoupper("i bet this will show up as all letters capitalized<br>")); // force all lowercase print(strtolower("I BET THIS WILL SHOW UP AS ALL LETTERS IN LOWERCASE<br>")); // force the first letter of a string to be capitalized print(ucfirst("i bet this will show the first letter of the string capitalized<br>")); // force the first letter of each WORD in a string to be capitalized print(ucwords("i bet this will show the first letter of every word capitalized<br>")); ?> 
    1. Admin Password gateway using PHP

The HTML for the above form looks like:

<form method="POST" action="example18.php3"> <div align="left"><p><font face="BankGothic Md BT">Admin password?</font> <input type="password" name="pw" size="14"><input type="submit" value="Submit"></p> </div></form> 

Now the php code to check the submitted password versus the hardcoded one is:

<? $adminpass = "test123"; if ($pw == $adminpass) { print("Welcome to the administration area!"); } else { print("Wrong password"); } ?> 

If you try example 18 with password test123 it will let you in, otherwise you’ll get the wrong password message. As you can see it only takes a few lines of code and this will work on NT or Unix. Additionally, you would enclose the contents of the admin area inside the success portion of the if statement. For those using Unix servers, you also have .htacess available which we’ll look at next.

Using .htaccess (UNIX only)

For those who aren’t already familiar with .htaccess it is a server side password protection scheme. By uploading a file named .htaccess into a directory you can protect every file in that directory and beneath that directory from unauthorized use. Here’s a good way to look at the way .htaccess fundamentally works:        <————– same as typing —>   <———- upload .htaccess here —>  <—- UNprotected –> <—- protected –>

The code inside the .htaccess file you would upload would look like this:

AuthName “Name to display”
AuthType Basic
AuthUserFile /home/usr/www/.htpasswd
AuthGroupFile /dev/null
require valid-user

The AuthUserFile should contain the absolute path to this file above. The .htpasswd file will contain the user id and pw combinations to allow into the admin area. The code inside the .htpasswd file (you can name this file anything you want, though) will contain the username and encrypted password combination in the format



Understanding B-TREE INDEXING…..

Posted: March 19, 2012 in Uncategorized

The essence of our technique is quite simple. Rows
are assigned tag values in the order in which they are
added to the table. Note that tag values identify rows in
a table, not records in an individual partition or in an
individual index. Each tag value appears precisely once
in each index. All vertical partitions are stored in B-tree
format with the tag value as the leading key. The important
novel aspect is how storage of this leading key is
reduced to practically zero.
The essence of our technique is that in each B-tree
page, the page header stores the lowest tag value among
all B-tree entries on that page, and the actual tag value
for each individual B-tree entry is calculated by adding
this value and the slot number of the entry within the
page. There is no need to store the tag value in the individual
B-tree entries; only a single tag value is required
per page. If a page contains tens, hundreds, or even
thousands of B-tree entries, the overhead for storing the
minimal tag value is practically zero for each individual
record. If the size of the row identifier is 4 or 8 bytes
and the size of a B-tree node is 8 KB, the per-page row
identifier imposes an overhead of 0.1% or less.
If all the records in a page have consecutive tag
values, this method not only solves the storage problem
but also reduces “search” for a particular key value in
the index to a little bit of arithmetic followed by a direct
access to the desired B-tree entry. Thus, the access performance
in leaf pages of these B-trees can be even better
than that achieved with interpolation search or in
hash indexes.
If records in a page do not have consecutive tag
values, the proposed method does not work, at least not
immediately. There are multiple ways to design for possible
gaps in the sequence of tags. One way is to prohibit
and avoid gaps, e.g., by means of a strict requirement
that rows in the table are only appended at the end
or they are deleted only in the order in which they were
added. Gaps in the tag sequence within one index usually
imply that the rows in the table lack consecutive tag
values and that the same problem exists in all B-tree
representing vertical partitions. Alternatively, gaps may
occur due to missing values or Null values.
A second way for dealing with gaps in the sequence
of tags is to retain ghost records in the B-tree pages.
During deletion of a single row in the table and the corresponding
records in all the table’s indexes, the B-tree be considered, including concurrency control and recovery,
bulk insertions and deletions, online index creation,
index creation with allocation-only logging, verification
to guard against corruption due to hardware or
software faults, etc.

Additional applications
In the discussions above, a single table was partitioned
vertically and tags assigned per table. Alternatively,
a table may be partitioned in multiple steps. The
first step groups columns into subsets and sort order
defined for each subset. Tags are assigned based on this
sort order. The second step partitions each subset into
storage structures, e.g., B-trees on tag columns with the
compression feature described earlier. Thus, in this storage
architecture, the earlier discussions apply not to a
traditional logical table or view but to each vertical partition
of such a table or view.
In more traditional database settings, there are some
real-world business processes in which sequential numbers
or identifiers are common, important, or even legally
required. For example, orders, invoices, cheques,
etc. fall into this category. For databases that describe
these real-world objects, indexes that map real-world
identifiers to additional information can benefit from the
compression method described. Even if there are large
gaps in the overall sequence, e.g., in vehicle identification
numbers, data in many index pages will be short
coherent sequences that may benefit. For small gaps,
ghost slots as described above might be sufficient.
In other words to maximize effective scan bandwidth,
column stores should be 100% full. Thus,
changes should be initially retained elsewhere using
techniques like differential files [SL 76] and then applied
in bulk. For efficient capture, e.g., during bulk
loading, the changes should likely be in row format
rather than column format [SAB 05].
A recent study of master-detail clustering within Btree
indexes [G 07] found that the proposed compression
method applies even there. In other words, multiple
columns can be stored in a single B-tree in a columnoriented
format rather than the traditional row-oriented
form. Each column is assigned to key range within the
B-tree such that the individual columns are concatenated.
Each record’s key consists of column identifier
and row identifier. After the column identifier has been
truncated using prefix truncation [BU 77], the row identifier
can be truncated using the presented design. Even
recent insertions and deletions in row format can be
represented in the same B-tree in yet another key range.
Finally, independent of the scan performance in relational
data warehousing environments, vertical partitioning
and columnar storage using B-trees as described
automatically turns row-level locking into column-level


MySQL Date and Time Formats

Posted: March 4, 2012 in Uncategorized

All of the larger systems that I’ve helped develop necessitated storing some combination of date and time fields. Almost universally, these fields presented difficulties of some kind or another. One particularly memorable challenge was converting between internal storage and display formats that featured the month name rather than a numeric one. We were able to overcome it using a combination of database and programming solutions, but it would have been much easier had we been using MySQL. Its thorough and flexible date handling allows you to switch between internal and external date/time formats with relative ease. Now I’d like to show you what we could have done in MySQL…


Some Date/Time Basics

MySQL supports a variety of date/time data type combinations. All you have to do, when creating a date/time column, is choose one of the following types that most closely matches the date/time portion and accuracy that you require:


  • DATE: Stores a date value in the form YYYY-MM-DD. For example 2008-10-23.
  • TIME: values in ‘HH:MM:SS’ format (or ‘HHH:MM:SS’ format for large hours values). TIME values may range from ‘-838:59:59’ to ‘838:59:59’.
  • DATETIME: Stores a date and time value of the form YYYY-MM-DD HH:MM:SS. For example 2008-10-23 10:37:22. The supported range of dates and times is 1000-01-01 00:00:00 all the way through to 9999-12-31 23:59:59
  • TIMESTAMP: Similar to DATETIME but more precise, as the TIMESTAMP includes milliseconds. There are a few other differences as well, depending on the version of MySQL and the mode in which the server is running.
  • YEAR: The YEAR type is a one-byte type used for representing years. It can be declared as YEAR(2) or YEAR(4) to specify a display width of two or four characters. The default is four characters if no width is given. For four-digit format, MySQL displays YEAR values in YYYY format, with a range of 1901 to 2155, or 0000.

A table containing DATE and DATETIME columns is created much the same way as any other column in a table. For example, we can create a new table called orders that contains order number, order item, order date and order deliverycolumns as follows:

 CREATE TABLE `MyDB`.`orders` ( `order_no` INT NOT NULL AUTO_INCREMENT, `order_item` TEXT NOT NULL, `order_date` DATETIME NOT NULL, `order_delivery` DATE NOT NULL, PRIMARY KEY (`order_no`) ) 

While it is common to store dates using a dash (-) as the delimiter and a colon (:) as the time delimiter, it is in fact possible to use any character, or even no character at all, between the date and time segments. Hence, the following formats all represent the same date and time:


 2009-10-20 10:39:11 20091020103911 2009/10/20 10.39.11 2009*10*20*10*39*11 

Input Values

MySQL will happily accept any of the following statements as date inputs:


 INSERT INTO tbl_name (idate) VALUES (19970505); INSERT INTO tbl_name (idate) VALUES ('19970505'); INSERT INTO tbl_name (idate) VALUES ('97-05-05'); INSERT INTO tbl_name (idate) VALUES ('1997.05.05'); INSERT INTO tbl_name (idate) VALUES ('1997 05 05'); INSERT INTO tbl_name (idate) VALUES ('0000-00-00'); INSERT INTO tbl_name (idate) VALUES (CURDATE()); 

The yyyy-mm-dd format used by MySQL is called the International (or ISO) date format. The beauty of using that format is that it prevents the days and months from being mixed up. Depending on the database locale, ambiguous months and days such as 3/4/1999 could be interchanged. I have run into this issue many times in Access, which relied on the ‘dd/mm/yyyy’ short date format!

Applications which insert dates must convert them from a pure date data type into a string comprised of some variation of year, month, and day format. The following VBScript code would convert a date and time into a YYYY-M-D h:m:s string such as ‘2006-3-3 15:2:7’:


 dt = CDate( dt ) ISODateTime = "'" & Year(dt) & "-" & Month(dt) & "-" & Day(dt) _ & " " & Hour(dt) & ":" & Minute(dt) & ":" & Second(dt) & "'" 

Note that, without the time formatting on the final line, the time portion would be left out and would be lost forever!


Output Values

By default, MySQL also displays dates in the International format:


 SELECT idate FROM tbl_name WHERE idate = CURDATE(); --returns 2011-01-21 

To display dates and times in a different format, use MySQL’s versatile DATE_FORMAT() function:



It accepts a number of specifiers in the format argument. These are the “%” character followed by format specifier characters. For example, %e specifies the numeric month; %h specifies the hour:


 SELECT DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y'); --returns something like 'Sunday October 2009' mysql> SELECT DATE_FORMAT('2007-10-04 22:23:00', '%H:%i:%s'); --returns something like '22:23:00' 

Working with Month Names

The date format standard where I work is ‘DD-MMM-YYYY’ where MMM is the abbreviated three-letter month name. Having worked with this format for years, I can’t think of a more diabolical way to drive developers batty! There have been many trials and tribulations, but we’ve gotten the hang of it now! Here’s what I’ve learned:

Ideally, you should exert control over the inputs by using some type of GUI widget that limits the month names to valid dates. A dropdown or calendar control works nicely for this purpose. Conversely, accepting date inputs via freeform text fields is asking for trouble.

Other inputs, such as those submitted from a partner via XML data, must be accepted in whatever format is given to you. Working with dates which contain full month names can be especially problematic. In these instances, MySQL’s STR_TO_DATE() function can help. It is the inverse of the DATE_FORMAT() function. It takes a value string (str) and a format string (format). The format string can contain literal characters and format specifiers beginning with “%”. Literal characters in format must match literally in str. Format specifiers in format must match a date or time part in str. STR_TO_DATE() returns a date/time value if the format string contains both date and time parts, or a date or time value if the string contains only date or time parts. If the date/time value extracted from str is invalid, STR_TO_DATE() returns NULL and produces a warning:


 SELECT STR_TO_DATE('01,5,2013','%d,%m,%Y'); '2013-05-01' SELECT STR_TO_DATE('May 1, 2013','%M %d,%Y'); '2013-05-01' 

Displaying month names is fairly easy because the DATE_FORMAT() function has specifiers for both full and abbreviated month names ( %M and %b respectively ). The MONTHNAME() function can also be of service there. Just be careful with abbreviated month names because some locales, like French (‘fr’), use a four letter abbreviation. The problem with French month abbreviations is that three letters is not enough to distinguish between June and July, which are juin and juillet in French.

Beginning with MySQL 5.0.25, the locale indicated by the lc_time_names system variable controls the language used to display day and month names and abbreviations. This variable affects the output from the DATE_FORMAT(), DAYNAME(), and MONTHNAME() functions. Locale names have a language and region portion listed by IANA ( such as ‘ja_JP’ or ‘pt_BR’. The default value is ‘en_US’.

The following function reads the language portion of the lc_time_names system variable to return a three letter month name in the current locale. It accepts the month number as the input parameter, ranging from 1 to 12:



The following tests highlight the workaround for the June/July issue in French. Juillet is thus displayed as jul:


 SELECT @@lc_time_names; -->returns 'en_US' SELECT GetThreeLetterMonthName(6); -->returns 'Jun' SELECT GetThreeLetterMonthName(7); -->returns 'Jul' SET lc_time_names = 'fr_CA'; SELECT GetThreeLetterMonthName(6); -->returns 'jui' SELECT GetThreeLetterMonthName(7); -->returns 'jul' 


MySQL’s thorough and flexible date handling allows you to switch between internal and external date/time formats with relative ease. Moreover, traditionally challenging three letter month names can be managed using MySQL’s lc_time_names system variable and locale-aware DATE_FORMAT(), DAYNAME(), and MONTHNAME() functions.


Advanced Pattern Matching with MySQL

Posted: March 4, 2012 in Uncategorized

The LIKE operator is great for finding words or phrases within strings. I’ve used it for those times that you need to match complex string patterns in MySQL. But you’ll be happy to learn that it provides a form of advanced pattern matching that is based on extended regular expressions used by Unix utilities such as vi, grep and sed.

The LIKE Operator

With LIKE, you can test for simple patterns using wildcards. It returns 1 if the expression matches the pattern; otherwise it returns 0. If either the expression or pattern is NULL, the result is also NULL. You can assign your own escape character instead of the backslash character (\). Here is the syntax:

expression LIKE pattern [ESCAPE 'escape_char']

LIKE recognizes the following two wildcard characters:

  • _ matches one character
  • % matches any number of characters, including zero characters

I recently used the LIKE statement to check email contact display names where users who were members of either the local network (LAN) or a partner network. The local contacts were in the format of “LastName, FirstName.” Those who belonged to the other department’s LAN were displayed as “LastName.FirstName.” To match both these cases, the following statement was used:

SELECT user FROM user_data WHERE email_display LIKE 'Gravelle%Robert';

That single statement would return rows where the email_display was formatted as either “Gravelle, Robert” or

Introducing the REGEXP operator

The regular expression, or regexp, is well known across many programming languages. Languages that support the regexp include C, C++, .NET, Java, JavaScript, PHP, Perl and many others. It is largely based on the powerful UNIX vi, grep, sed search tools.

MySQL’s implementation of regular expressions is based on the work of Henry Spencer. It works in much the same way that the LIKE operator does, except that it adds a lot of extra pattern matching capability. In fact, it even has an alias of RLIKE! Here’s the syntax using both statements:

expression REGEXP pattern
expression RLIKE pattern

Matching with Wildcard Characters

Regular expression wildcards differ slightly from those of the LIKE statement. Rather than matching any one character (_) or more characters (%), regular expressions match specific patterns called elements. You can still emulate the behavior of the LIKE statement, but the REGEXP can do a whole lot more.

To match any one character, use the period (.). You could say that it replaces the underscore character (_). However, when positioned before an asterisk (*), it then matches any number of characters, including none; that is to say, zero to N characters. We could therefore reformulate the LIKE expression as the following REGEXP:

'Gravelle, Robert' REGEXP 'Gravelle.*Robert'

The only caveat to using such a general pattern is that regular expressions are greedy! They will match as many characters as possible. Case in point, the following text would not match the ‘Gravelle%Robert’ LIKE expression because it matches against the entire string. However it would for a REGEX, because it doesn’t care where in the string the pattern matches:

Dear Mr. Gravelle,

I would like to thank you for your fan mail to Robert Pattinson.

You could say that the REGEX behaves more like ‘%Gravelle%Robert%’.

For that reason, it’s prudent to avoid such general patterns. If we know that there will be some delimiter in between the names, we can match using the plus sign (+). It matches the preceding element one or more times. Hence, there has to be at least one character there for the expression to match:

'Gravelle, Robert' REGEXP 'Gravelle.+Robert'

Better still, if we know how many characters may be used to delimit the names, we can specify this in the pattern as well, using curly braces ({m,n}), where it specifies m through n instances of the preceding element:

... RLIKE 'Gravelle.{1,2}Robert'

That will match ‘Gravelle, Robert’ or ‘Gravelle.Robert’, but it will still match too many other patterns for my liking.

Getting into Specifics

Let’s get down to brass tacks and outline exactly what we want. The first character should either be a period or a comma. In OR situations such as these, it’s best to define a valid character group by enclosing them between square brackets ([]). There may also be a space. To define a zero or one condition, follow the element by a question mark (?). This statement will really narrow down the field!

... REGEXP 'Gravelle[.,] ?Robert'

Many characters have equivalent named constants that you can use to make your code more readable. To use a named constant, enclose it between square brackets and periods. For instance, replacing the space in the above expression with its constant would yield the following:

... REGEXP 'Gravelle[.,][.space.]?Robert'

You can make the grouped characters exclusive, meaning that they are excluded from a match, by adding a caret (^) after the opening square bracket. The following expression will match as long as the “R” in “Robert” does not immediately follow the period or comma:

... REGEXP 'Gravelle[.,][^R]?Robert'

Grouping Elements

Of course an element can contain more than one character. To create a multi-character element, enclose them in parentheses (()). This construct would allow us to separate our conditions into a period (.) and a comma, followed by a space (, ). As you recall, the period is reserved to match against any character. Therefore, we need to escape it. This is accomplished by preceding it by a backslash (\). To create an OR condition with multi-character elements, use the vertical bar (|) character:


... REGEXP 'Gravelle(\.|(,[.space.]))Robert'


Position Markers

If that still isn’t precise enough for you, you can also denote positions within the string: namely the start (the caret (^)) and end (the dollar sign ($)). Obviously, these characters must be placed at the very start or end of the pattern! Since we know that our pattern matches the start of the string, we can include the caret:

... REGEXP '^Gravelle\.|(,[.space.])Robert'


And Now for Some Bad News

The extended POSIX.2 Regular Expressions on which the MySQL REGEX is based does not support backreferences, which are temporary variables that are created whenever parentheses are used. These can be referenced further along in the expression by using a numeric identifier such as $1, $2,…$n or \1, \2,…\n. Therefore, we can’t reference the first name again to make up the second part of the OR condition:

... REGEXP '^Gravelle(\.(Robert))|(,[.space.]$$'

The $1 above refers to the “Robert” that is enclosed in the innermost parentheses.

Instead, we have to rewrite it:
... REGEXP '^Gravelle(\.(Robert))|(,[.space.]$'


As long as you bear in mind that the MySQL implementation of Regular Expressions is not all-inclusive, I’m sure that you’ll agree that it’s leagues above the LIKE statement. Also, be prepared to accept the associated learning curve, because Regular Expressions really are a language all to themselves.