Any data you accept from a user, any data at all, should be sanitized before it makes its way to your database.
What does this mean? Well, for one, you’re going to inspect the data and make sure that it doesn’t contain any malicious code, such as ill-intentioned JavaScript. For another, you’re going to prepare the data so that when it gets added to your insert/update SQL it doesn’t break the query (or do other nasty things). Exploiting unprepared input in this way is known as a SQL injection attack.
The technical details of the types of attacks we’re protecting against are a bit out of the scope of this post, but there are numerous resources available which will explain far better than I am able to.
After a form has been submitted (via GET or POST), its data is available in the superglobal array $_GET or $_POST. Once we have this data, we can and should do a bunch of things to it, such as:
Stripping out malicious code
We’ll scan through the input, searching for anything that shouldn’t be there, like HTML markup, <script> tags, etc.
<?php
function cleanInput($input) {
    $search = array(
        '@<script[^>]*?>.*?</script>@si',  // Strip out javascript
        '@<style[^>]*?>.*?</style>@si',    // Strip style blocks, contents included
        '@<![\s\S]*?--[ \t\n\r]*>@',       // Strip multi-line comments
        '@<[\/\!]*?[^<>]*?>@si'            // Strip out any remaining HTML tags
    );

    // The patterns run in order, so whole blocks go before the generic tag pattern
    $output = preg_replace($search, '', $input);
    return $output;
}
?>
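Stripping isn’t the only way to defuse markup, so here’s a quick sketch for comparison. It isn’t part of the function above; it just runs two built-in PHP functions, strip_tags() and htmlspecialchars(), on a made-up string so you can see how each one treats a <script> block.

```php
<?php
// Made-up sample input, for illustration only.
$input = "Hi! <script>alert('x')</script> Bye";

// strip_tags() removes the tags but keeps the text that sat between them.
$stripped = strip_tags($input);

// htmlspecialchars() removes nothing; it turns the special characters into
// entities so a browser shows the markup as text and doesn't run it.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // Hi! alert('x') Bye
echo $escaped, "\n";  // Hi! &lt;script&gt;alert(&#039;x&#039;)&lt;/script&gt; Bye
```

Notice that strip_tags() leaves the script body behind as plain text, which is why the patterns in cleanInput() match the whole block. htmlspecialchars() is meant for the moment you print data back into a page, not for storage.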
’slashing
This part can sometimes get tricky, but not to worry, the code’s not too bad. Basically we’re adding a backslash before any of the following: ' (single quote), " (double quote), \ (backslash) and NUL (the null byte). Depending on your server configuration, there are a few ways of getting this done. Older versions of PHP had a feature called magic_quotes, which did this automatically to all incoming data. Note, however, that it was deprecated in PHP 5.3 and removed in PHP 5.4. Another PHP function, addslashes(), is the manual version of magic_quotes: addslashes("Where's Wally") will return "Where\'s Wally". A better option, if your server supports it, is mysql_real_escape_string(). It does much the same job, but it also escapes newlines, carriage returns and Ctrl-Z, and it takes the character set of the current database connection into account. Be aware that the old mysql extension it belongs to was deprecated in PHP 5.5 and removed in PHP 7.0; on newer versions the equivalent is mysqli_real_escape_string(), or better still, prepared statements.
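To make the escaping concrete, here’s a tiny sketch of what addslashes() does to a string before it gets dropped into a query. The books table and the $title variable are made up for illustration, and nothing is actually sent to a database.

```php
<?php
// Made-up value containing a single quote.
$title = "Where's Wally";

// addslashes() puts a backslash in front of the quote.
$escaped = addslashes($title);

// Hypothetical query text; without the backslash, the quote in the title
// would end the SQL string early.
$sql = "INSERT INTO books (title) VALUES ('" . $escaped . "')";

echo $escaped, "\n"; // Where\'s Wally
echo $sql, "\n";     // INSERT INTO books (title) VALUES ('Where\'s Wally')
```

The sanitize() function that follows does the same kind of escaping, only with mysql_real_escape_string() in place of addslashes().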
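<?php
function sanitize($input) {
    if (is_array($input)) {
        $output = array();  // stays defined even when $input is an empty array
        foreach ($input as $var => $val) {
            $output[$var] = sanitize($val);  // recurse into nested arrays
        }
    } else {
        // Undo magic_quotes escaping on the old PHP versions that still have it
        if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
            $input = stripslashes($input);
        }
        $input = cleanInput($input);
        // Needs the legacy mysql extension and an open database connection
        $output = mysql_real_escape_string($input);
    }
    return $output;
}
?>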
To use, we simply pass any input to the function. The function works on single strings, as well as deep arrays.
<?php
$bad_string = "Hi! <script src='http://www.evilsite.com/bad_script.js'></script> It's a good day!";

$_POST = sanitize($_POST);
$_GET = sanitize($_GET);

$good_string = sanitize($bad_string);
// $good_string is now "Hi!  It\'s a good day!"
// (the script tag is gone; the space on either side of it remains)
?>
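Escaping isn’t the only way to keep user data from breaking a query. As a rough sketch of a different technique, prepared statements, here’s the same idea using PDO. This is an alternative and not what sanitize() does; the in-memory SQLite database, the users table and the sample values are all assumptions made to keep the example self-contained, and it needs the pdo_sqlite extension.

```php
<?php
// Assumption: pdo_sqlite is installed. The users table is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT, age INTEGER)');

// Hostile-looking input that would break a hand-built query string.
$name = "Robert'); DROP TABLE users;--";
$age  = 42;

// The placeholders keep the data out of the SQL text entirely,
// so no manual escaping is needed.
$insert = $pdo->prepare('INSERT INTO users (name, age) VALUES (:name, :age)');
$insert->execute(array(':name' => $name, ':age' => $age));

// Read the row back to show that the value was stored exactly as given.
$select = $pdo->prepare('SELECT name, age FROM users WHERE name = :name');
$select->execute(array(':name' => $name));
$row = $select->fetch(PDO::FETCH_ASSOC);

$count = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();

echo $row['name'], "\n";
echo $count, "\n";
```

With a real MySQL server only the connection string would change; the prepare() and execute() calls stay the same.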
Typecasting
Here we make sure that the data we’re inserting matches the expected type; for example, someone’s age should be stored as an integer value, and not as a string.
<?php $age = (int) $_GET['age']; ?>
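One thing worth knowing is that a plain cast never fails; it quietly turns whatever it gets into some integer. Here’s a small sketch comparing the cast with PHP’s filter_var(), which can reject bad values instead. The parseAge() helper and the 0 to 150 range are my own assumptions for illustration.

```php
<?php
// Hypothetical helper: returns an int for a valid age, or null otherwise.
// The 0-150 range is an arbitrary assumption.
function parseAge($raw) {
    $age = filter_var($raw, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => 0, 'max_range' => 150),
    ));
    return $age === false ? null : $age;
}

// A cast accepts anything and silently drops what it can't use.
var_dump((int) '25');        // int(25)
var_dump((int) '25abc');     // int(25)
var_dump((int) 'abc');       // int(0)

// Validation lets us tell good input from bad.
var_dump(parseAge('25'));    // int(25)
var_dump(parseAge('25abc')); // NULL
var_dump(parseAge('-3'));    // NULL
```

Whether you cast or validate depends on whether a bad value should be silently coerced or reported back to the user.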
This is a very gentle introduction to sanitizing your database input, and I would certainly recommend that you do a lot more research on these methods in order to use them correctly in your given environment.
That’s it for today. If you found this useful, or would like to improve it, comments are always appreciated!








