student :: geek :: photographer :: legend

PHP howto - Sanitize database inputs

July 24th, 2008 Denham Coote

When accepting data from a user, any data at all, it should be sanitized before making its way to your database.

What does this mean? Well, for one, you’re going to inspect the data and make sure that it doesn’t contain any malicious code, such as ill-intentioned JavaScript.  For another, you’re going to prepare the data so that when it gets added to your insert/update SQL it doesn’t break the SQL (or do other nasty things).  Exploiting that kind of breakage is known as a SQL injection attack.

The technical details of the types of attacks we’re protecting against are a bit out of the scope of this post, but there are numerous resources available which will explain far better than I am able to.

After a form has been submitted (via GET or POST), its data is stored in the superglobal array $_GET or $_POST.  Once we have this data, we can and should do a bunch of things to it, such as:

Stripping out malicious code

We’ll scan through the input, searching for anything that shouldn’t be there, like html code, <script> tags, etc.

<?php
function cleanInput($input) {

    // Order matters: whole <script> and <style> blocks and comments are removed
    // before the generic tag pattern, otherwise their contents get left behind.
    $search = array(
        '@<script[^>]*?>.*?</script>@si',   // Strip out javascript
        '@<style[^>]*?>.*?</style>@si',     // Strip out style blocks
        '@<![\s\S]*?--[ \t\n\r]*>@',        // Strip multi-line comments
        '@<[\/\!]*?[^<>]*?>@si'             // Strip out remaining HTML tags
    );

    $output = preg_replace($search, '', $input);
    return $output;
}
?>

Escaping ('slashing)

This part can sometimes get tricky, but not to worry, the code’s not too bad.  Basically we’re adding a backslash before any of the following: ' (single quote), " (double quote), \ (backslash) and NULL characters.  Depending on your server configuration, there are a bunch of ways of getting this done.  PHP has something called magic_quotes, which does this automatically.  Note, however, that this feature was deprecated in PHP 5.3 and removed in PHP 5.4, so don’t rely on it.  Another PHP function, addslashes(), is the manual version of magic_quotes.  addslashes("Where's Wally"); will return "Where\'s Wally".  A better option, if your server supports it, is mysql_real_escape_string().  It does much the same job, but it takes the character set of the current database connection into account, which addslashes() does not.  (The mysql_* functions were themselves deprecated in PHP 5.5 and removed in PHP 7; mysqli_real_escape_string() is the equivalent in the mysqli extension.)

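<?php
function sanitize($input) {
    if (is_array($input)) {
        // Start with an empty array so that empty input still returns a value
        $output = array();
        foreach($input as $var=>$val) {
            // Recurse, so nested arrays are cleaned too
            $output[$var] = sanitize($val);
        }
    }
    else {
        // Undo magic_quotes first, so the data isn't escaped twice
        if (get_magic_quotes_gpc()) {
            $input = stripslashes($input);
        }
        $input  = cleanInput($input);
        $output = mysql_real_escape_string($input);
    }
    return $output;
}
?>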

To use, we simply pass any input to the function. The function works on single strings, as well as deep arrays.

<?php
$bad_string = "Hi! <script src='http://www.evilsite.com/bad_script.js'></script> It's a good day!";

// Note: mysql_real_escape_string() needs an open database connection
$_POST = sanitize($_POST);
$_GET  = sanitize($_GET);
$good_string = sanitize($bad_string);
// $good_string is now "Hi!  It\'s a good day!" (script tag removed, quote escaped)
?>
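Escaping is not the only way to keep user data from breaking your SQL.  A minimal sketch of the alternative, prepared statements, follows.  It is an illustration rather than part of the sanitize() function above: it assumes the PDO extension with the SQLite driver (an in-memory database stands in for MySQL), and the table and column names are made up.

```php
<?php
// Sketch only: PDO with an in-memory SQLite database stands in for MySQL,
// and the "comment" table is a made-up example.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comment (id INTEGER PRIMARY KEY, body TEXT)');

// Input that would break a query built by plain string concatenation
$body = "It's a good day!'); DROP TABLE comment; --";

// The placeholder keeps the value separate from the SQL text, so the quote
// characters in $body cannot change the structure of the statement.
$insert = $pdo->prepare('INSERT INTO comment (body) VALUES (:body)');
$insert->execute(array(':body' => $body));

// Read the value back and count the rows to confirm nothing was mangled
$stored = $pdo->query('SELECT body FROM comment')->fetchColumn();
$rows   = (int) $pdo->query('SELECT COUNT(*) FROM comment')->fetchColumn();
```

With placeholders the database driver handles the quoting, so there is no manual slashing step to forget.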

Typecasting

Make sure that the data we’re inserting matches the expected type; e.g., someone’s age should be received as an integer value, and not a string.

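<?php
// Fall back to 0 when no age was supplied, to avoid an undefined index notice
$age = isset($_GET['age']) ? (int) $_GET['age'] : 0;
?>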
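One thing to keep in mind is that a plain (int) cast never fails: "abc" quietly becomes 0 and "12abc" becomes 12.  If you would rather reject bad input, PHP’s filter_var() can validate instead of convert.  The sketch below shows one way to do it; the helper name parseAge() and the 0 to 150 range are my own assumptions, not anything standard.

```php
<?php
// Hypothetical helper: returns the age as an int, or null when the input is
// not a whole number between $min and $max.
function parseAge($raw, $min = 0, $max = 150) {
    $age = filter_var($raw, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => $min, 'max_range' => $max),
    ));
    // filter_var() returns false when validation fails
    return ($age === false) ? null : $age;
}
```

A null result can then be used to show the form again with an error message, instead of silently storing 0.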

This is a very gentle introduction to sanitizing your database input, and I would certainly recommend that you do a lot more research on these methods in order to use them correctly in your given environment.

That’s it for today. If you found this useful, or would like to improve it, comments are always appreciated!

PHP howto - Database paging (pagination)

June 26th, 2008 Denham Coote

OK, so it’s not quite the nuclear bomb I promised, but it’s just as much fun :)

Database paging, for those of you who are interested, is when you split the number of results returned by a query into smaller chunks, and then show those one page at a time.  Think of how Google will display 10 results out of 4 236 735.  Same thing.

The basic idea is to:

  1. Run your query, limited to the number of desired results
  2. Get the number of results that there would have been, without the limit
  3. Display the first set of results
  4. Build and display <prev> and <next> links, which, when clicked…
  5. Display the prev/next set of results, moved down/up by the desired amount
  6. Repeat 4 & 5

The following code sample is a very basic implementation of this idea.  I have not checked the code, so apologies in advance if there are any bugs.

<?php
 
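$no_results  = TRUE;   // No results found yet
$howmany     = 10;     // Return 10 results per query
$line_output = "";     // Initialise the output string before appending to it
 
// Set default starting point of query to 0, or, if set, to $_GET['rs']
// Cast to int (and never below 0) so the URL value can't inject SQL into the LIMIT clause
$row_start  = (isset($_GET['rs'])) ? max(0, (int) $_GET['rs']) : 0;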
 
 
// Do our SQL query, with something like LIMIT 0, 10
$sql    = "SELECT SQL_CALC_FOUND_ROWS id, name, surname FROM person LIMIT ". $row_start .", ". $howmany ."";
$result = mysql_query($sql);
 
 
// Get the number of rows that would have been returned WITHOUT a limit clause, to be used later for paging.
$count_sql        = "SELECT FOUND_ROWS() AS total";
$count_sql_result = mysql_query($count_sql);
$count_row 	  = mysql_fetch_array($count_sql_result);
$count_result 	  = $count_row['total'];
 
// Start looping through our result set
while($row = mysql_fetch_array($result)) {
    $no_results = FALSE;
 
    // Save results of query to $line_output
    $line_output .= "
        <div class=\"someclassname\">
            <div>". $row['id'] ."</div>
            <div>". $row['name'] ."</div>
            <div>". $row['surname'] ."</div>
        </div>";
}
 
// Don't bother building paging if we don't have records
if ($no_results) {
    $line_output = "No records found...";
    $page_output = "";
}
else {
    // Build <prev> and <next> links and save to $page_output
    $rs_prev = $row_start - $howmany; // where would prev page start, given current start less no. of records
    $rs_next = $row_start + $howmany; // where would next page start, given current start plus no. of records
 
    // If for some reason the next <prev> starting point is negative, do not display <prev>
    // This happens when our current starting point is already 0
    // This may happen if some smartass manually changes the rs= bit in the url
    $page_output_prev 	= ($rs_prev < 0) ? "" : "<a href='?rs=".$rs_prev."'>Previous</a>";
 
    // Will the next page jump start point exceed the number of records returned?
    // If so, don't display <next>
    $page_output_next 	= ($rs_next >= $count_result) ? "" : "<a href='?rs=".$rs_next."'>Next</a>";
 
    // Just something to put between <prev> & <next>, IF they are both active
    if (($page_output_prev == "") || ($page_output_next == "")) {$page_output_breaker = "";}
    else { $page_output_breaker = " || ";}
 
    // Build final paging output
    $page_output = $page_output_prev . $page_output_breaker . $page_output_next;
}
 
// Write the outputs
echo $line_output;
echo $page_output;
 
?>

A few points worth taking note of:

Row counting

To get the total number of results, I have used

SELECT SQL_CALC_FOUND_ROWS

followed by a second query

SELECT FOUND_ROWS() AS total

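As stated in the comments, this will return the number of results that there would have been without a limit clause.  The other common way to achieve this is a second query using COUNT(*).  The FOUND_ROWS() way makes for slightly cleaner code, but it is not necessarily quicker, so it is worth measuring both on your own data.  Note too that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17, where the MySQL manual recommends the separate COUNT(*) query instead.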

Building the paging links

In the code I have used

$_GET['rs']

What this does is get the value from the part of the URL that looks something like http://www.yoursite.com/index.php?rs=10

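That value then becomes our next starting point, and is inserted into the LIMIT clause of the SQL query.  Because it comes straight from the URL, it should be cast to an integer before it goes anywhere near the query.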

I’ve seen some tutorials where page numbers are used instead of starting records.  This is fairly easy to achieve: divide the total number of records by the size of the desired result set (rounding up) to get the number of pages, then multiply the page number (less one) by that size to get the starting point for the limit.  I’ve not done that in this tutorial for the sake of simplicity.  Besides, Google uses the records, and not pages, method.  Can’t be that terrible :)
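For the curious, the page-number arithmetic could look something like the sketch below.  The function names totalPages() and pageToOffset() are made up for this example, and pages are assumed to be numbered from 1.

```php
<?php
// Hypothetical helpers for page-number paging; the names are illustrative.

// Number of pages needed to show $total records, $perPage at a time.
function totalPages($total, $perPage) {
    return (int) ceil($total / $perPage);
}

// Starting record (the LIMIT offset) for a 1-based page number.
// Pages outside the valid range are clamped to the nearest valid page.
function pageToOffset($page, $total, $perPage) {
    $last = max(1, totalPages($total, $perPage));
    $page = min(max(1, (int) $page), $last);
    return ($page - 1) * $perPage;
}
```

So with 95 records and 10 per page there are 10 pages, and page 3 starts at record 20, which is the value that would go into the LIMIT clause.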

Extending the functionality

In this example I’m echoing the result to screen.  You could instead wrap this up in a function and return the results.  Another easy modification would be to alternate the background colours, as shown in my previous howto.
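As a rough idea of what wrapping it up in a function might look like, here is a sketch of just the link-building part.  The name buildPagingLinks() and its parameters are my own invention; the logic follows the prev/next rules from the code above, and it returns a string rather than echoing.

```php
<?php
// Hypothetical wrapper around the prev/next logic; it returns the links as a
// string instead of echoing them.
function buildPagingLinks($rowStart, $perPage, $total) {
    $links = array();

    // Previous link: only when the previous page would not start below 0
    $prev = $rowStart - $perPage;
    if ($prev >= 0) {
        $links[] = "<a href='?rs=" . $prev . "'>Previous</a>";
    }

    // Next link: only when there are records beyond the current page
    $next = $rowStart + $perPage;
    if ($next < $total) {
        $links[] = "<a href='?rs=" . $next . "'>Next</a>";
    }

    // implode() only inserts the separator when both links are present
    return implode(' || ', $links);
}
```

Calling buildPagingLinks($row_start, $howmany, $count_result) would then replace the whole else branch that builds $page_output.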

And that’s it for today.  If you found this useful, or would like to improve it, comments are always appreciated!

PHP howto - Alternating background colours for table rows

June 23rd, 2008 Denham Coote

Often when writing something for the web, you’ll need to output data in a table (or, for CSS zealots, nicely formatted <div>s).  In order to improve readability, you might want to colour every second row differently. This is really easy:

$counter = 1;
while($counter < 10) {
 
    //set bgcolor
    $bgcol = ($counter % 2 == 1) ? "#ececec" : "#ffffff";
 
    //write the output html
    echo "<tr><td bgcolor=\"".$bgcol."\">Your content goes here...</td></tr>";
 
    //increment the counter
    $counter++;
}

The above code is merely to illustrate the idea behind alternating rows.  I’ve used a very short conditional (the ternary operator) that checks whether the current row number is odd or even.  Odd rows get the background colour #ececec and even rows get #ffffff.  At the end of each pass through the loop the counter is incremented, and we start over.

By the way, the line:

    $bgcol = ($counter % 2 == 1) ? "#ececec" : "#ffffff";

is equivalent to using:

    if ($counter % 2 == 1) {
        $bgcol ="#ececec";
    }
    else {
        $bgcol = "#ffffff";
    }

Some extensions to make this code better are things like inserting real data (from a call to a database), replacing the table code with <div>s, replacing the bgcolor values with a particular style class, and writing to a string/file/etc instead of echoing the results.  Another cool thing (if in a user-based environment) is to add a subsequent if statement that checks to see if the row being processed matches the currently logged on user - this way you can show a user that the row in question belongs to them by defining a third bg colour and highlighting it differently to the rest.

A slightly more real-world example of how the code may look is:

$sql = "SELECT id, name, surname FROM person";
$result = mysql_query($sql);
$counter = 1;
 
while($row = mysql_fetch_array($result)) {
 
    //set class
    $classname = ($counter % 2 == 1) ? "dark-div" : "light-div";
 
    //write the output html
    echo "
        <div class=\"".$classname."\">
            <div>". $row['id'] ."</div>
            <div>". $row['name'] ."</div>
            <div>". $row['surname'] ."</div>
        </div>";
 
    //increment the counter
    $counter++;
}
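The "highlight the logged on user’s row" idea mentioned earlier could be sketched along these lines.  The helper name rowClass() and the my-row class are assumptions for the sake of the example; dark-div and light-div are the classes used above.

```php
<?php
// Hypothetical helper: picks a CSS class for a row.
// Rows that belong to the current user get their own class; all other rows
// alternate between dark and light.
function rowClass($counter, $rowUserId, $currentUserId) {
    if ($rowUserId === $currentUserId) {
        return 'my-row';
    }
    return ($counter % 2 == 1) ? 'dark-div' : 'light-div';
}
```

Inside the loop you would then call something like rowClass($counter, $row['id'], $logged_in_id), where $logged_in_id is whatever your application uses to identify the current user.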

And in next week’s issue, how to assemble a nuclear bomb from ordinary household items! ;-)

A beginner’s take on developing a Facebook application

June 20th, 2008 Denham Coote

Grab your popcorn, coffee, blowup doll, whatever it is that passes time, and settle in for a lengthy post!

Being relatively new to the world of PHP development, as well as the associated technologies (MySQL, CSS, etc), I’m going to ask that you bear with me while I ramble on about things that, for many of you, may seem entirely trivial.  My reasoning for doing this is because a) repeating what I’ve learned reinforces the lessons, and b) someone else may find it beneficial.  The level of detail and technical jargon I use will vary.  Sorry.  So, without further ado, here goes…

The application is a platform for users to submit pick up lines (guys) and seduction tips (girls).  Users are allowed to vote on content added by the opposite sex (i.e., guys may vote for seduction tips, and girls for pick up lines) as well as being able to comment (on either type).

The above requirements are quite straightforward.  The implementation, however, is interesting.  While the application is really simple, a lot of users never get to see what’s really going on behind the scenes.  Here’s a quick rundown:

The first thing we do is grab the user’s sex from their profile.  This allows us to automatically present the correct input forms when they wish to post a pick up line / seduction tip.

Submitting

When the user submits a pick up line / seduction tip, it needs to be cleaned of any potentially harmful code (like bad JavaScript).  This is done by stripping out any illegal characters from their submission.

Once we have the submission, we give it a unique tracking id, so we can find it again later on.  We also record the pertinent details such as who made the submission (in this case, the user’s unique Facebook ID), the date and time, and the type of submission (pul/tip).  All of this is done behind the scenes, on the fly.  This information is then added to our database as a new record.

Formatting

Having hundreds of lines of content stored in a database is all good and well, but we need a way to display it.  For this I wrote a simple function (Yeah Tyler, simple ;-)) which could be called in a number of ways, depending on how we wished to display the content.  For example, listing the 10 most recent entries, whether they be pick up lines or seduction tips.  Or, more specifically, displaying all the entries of a single user.  Or a random selection.  Or the 5 most voted (more on voting in a bit!) for pick up lines, tips, or both.

Once the function is called (and provided with its criteria) it will send the data (encapsulated in named <div> tags) back to the calling file, which then applies any CSS styling we need (to make it pretty!) and pushes it to the browser.  The data it sends back include:

  • the author
  • date and time
  • number of votes for the item
  • the actual text of the item
  • number of comments.

Voting

Users are allowed to vote for content submitted by the opposite sex.  They may only vote once per item, but can vote on as many items as they like.  A few things need to happen here.  When the pick up line / seduction tip is displayed, a voting badge (think Digg) is attached to it.  This badge is generated along with every pul/tip that gets sent back to the browser.  Every time a line is displayed, we need to check a) whether the current user is allowed to vote for this item (opposite sex?) and b) if they are allowed to vote, whether they have voted for this item before.  If they have, we disable voting for that item.  If not, we provide a button which will then enable them to vote.

When a user votes for an item, we need to record some more details.  We need to know

  • who the vote is coming from (the user’s Facebook id)
  • which item the vote is for (the unique tracking id I mentioned earlier)
  • the date and time of the vote

The first two are important in terms of avoiding duplicate votes in the system.
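To make the two voting checks a bit more concrete, here is a rough sketch.  It is not the application’s actual code: the function name canVote(), the 'voter' and 'item' array keys, and passing the existing votes in as an array (rather than querying the database) are all assumptions made to keep the example self-contained.

```php
<?php
// Hypothetical sketch of the two checks; $existingVotes is a list of
// array('voter' => ..., 'item' => ...) records already stored for votes.
function canVote($voterId, $voterSex, $itemId, $authorSex, array $existingVotes) {
    // a) only the opposite sex may vote on an item
    if ($voterSex === $authorSex) {
        return false;
    }
    // b) one vote per user per item
    foreach ($existingVotes as $vote) {
        if ($vote['voter'] === $voterId && $vote['item'] === $itemId) {
            return false;
        }
    }
    return true;
}
```

In a real application the second check would more likely be a query against the votes table, but the rule is the same.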

Ranking

Once we have a table full of votes, we can use the information to do some form of ranking on our items.  A few possibilities include ranking based on:

  • Overall number of votes per user
  • Overall number of votes per item
  • Most active users (who’s doing the voting)
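As an illustration of the "votes per item" ranking, the tallying could be sketched like this.  Again, this is an assumption-laden example rather than the real code: rankItemsByVotes() is a made-up name, and the votes are passed in as a plain array with an 'item' key instead of being read from the database.

```php
<?php
// Hypothetical sketch: $votes is a list of array('voter' => ..., 'item' => ...).
// Returns an array of item id => vote count, highest count first.
function rankItemsByVotes(array $votes) {
    $counts = array();
    foreach ($votes as $vote) {
        $item = $vote['item'];
        $counts[$item] = isset($counts[$item]) ? $counts[$item] + 1 : 1;
    }
    arsort($counts);   // sort by count, descending, keeping the item ids as keys
    return $counts;
}
```

Swapping 'item' for 'voter' in the loop would give the "most active users" ranking instead.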

Commenting

Quite similar to voting in terms of code and structure, but a bit simpler.  Here there are fewer constraints.  A user can comment on any item, and more than once.  Here we record the following:

  • who the comment is coming from (the user’s Facebook id)
  • which item the comment is for (the unique tracking id I mentioned earlier)
  • the date and time of the comment
  • the actual comment itself (the text)

And that’s pretty much it!  There are a bunch of other cool bits and pieces, but that would make this an (even more) exceptionally long post.