Introduction to code review


This course is an introduction to performing source code review to find security vulnerabilities in web applications. The application we are going to work on is a simple PHP application that allows users to upload and download files. Think of it as a over-simplified Dropbox. This exercise will also cover the different methods that can be used to perform code review.


Reasons for doing code review

There are multiple reasons for doing code review:

  • It can be faster than penetration testing. Some issues are really easy to spot during a code review (for example weak encryption), where others can take a lot more time (XSS for example).
  • Compliance can require you to perform security code review (for example as part of PCI DSS 6.3.2).
  • After doing penetration testing for a while; you want to do something different.
  • You want to find better bugs. Some of the bugs you will find during a code review can be surprisingly hard to discover with black-box testing.
  • You want to check if some code is backdoored (it's actually really hard to do).
  • You want to write an exploit for a bug.


There are plenty of ways to perform code review. You can find some of the methods below:

  • String matching/Grep for bugs.
  • Following user input.
  • Reading source code randomly.
  • Read all the code.
  • Check one functionality at a time (login, password reset...).

String matching/Grep for bugs

This is probably the fastest way to find low-hanging fruits; you just try to find patterns of known vulnerabilities. For example, you can use grep to find calls to the PHP system function:

$ grep -R 'system\(\$_' *

You can find a list of regular expressions to try on your code base in the GRaudit project (

This approach suffers from a lot of limitations:

  • You don't get a lot of coverage/assurance on the quality (and therefore security) of the source code. You just know that based on your list of patterns, you couldn't find any issues.
  • You need to know of all the dangerous functions/patterns.
  • You end up using very complex regular expressions.

This approach works pretty well for timeboxed reviews where you don't have enough time. It can also help you get familiar with a code base as part of a longer review. However, it's probably not the best way to perform proper reviews.

Following user inputs

Another way to go about doing a review is to follow all the user-controlled inputs and find all ways to access the application (the routes/URI available)

To get started you need to find all the ways to provide data to the application (example in PHP):

  • $_POST / $_GET / $_REQUEST
  • Data coming from the database (for stored XSS and second-order injections for example)
  • Data read from a file or a cache
  • ...

This method provides good coverage. However, you will need a good understanding of the framework/language used. Finally, you may end up reviewing the same function again and again if it's called multiple times.

By functionality

Another common way to do a review is to pick one functionality, for example:

  • "Password reset".
  • "Database access".
  • "Authentication".

And review all the code associated with this functionality. This work especially well if you do this across multiple applications/framework as they will all have different behaviors.

This approach gives you an excellent coverage for the functionalities your reviewed and will teach you what mistake people usually make for a given functionality. However, you only have coverage of what you reviewed.

Read everything

Finally, the more time-consuming way: just start reading the code one file at the time. A better ways to do this is to try to find weaknesses, not vulnerabilities. Then trying to see if the weaknesses can become vulnerabilities on their own or by combining them.

This method is obviously the most laborious way of working but it brings excellent coverage. It's crucial to keep good notes when using this approach.

What to look for?

When doing a review, you need to look for everything:

  • Weird behavior
  • Missing checks
  • Complexity
  • Security checks already in place
  • Differences between 2 functions/methods/classes
  • Comparison and conditions (if/else)
  • Regular expressions/string matching
  • What is missing?

You will probably end up seeing function/class/method you don't know. To solve this issue, you need to

  • Google it
  • Test its’ behavior

It’s going to take time (especially early on) but the more code you review, the easier it gets. Make sure you create a snippet with the function/class/method to test its' behavior. It will be convenient for your future reviews. To test it, you need to run it locally and try to find some edge cases that the developers may have missed.


Getting started

For this exercise, you have a really simple web application. We are going to go with the "Read everything" approach as there isn't that much source code to read.

To get started, don't focus on finding vulnerabilities, try to find weaknesses:

  • “That doesn’t seem right”
  • “What happens if I put [X] here”

Then, try to see if these weaknesses can become vulnerabilities. Either on their own or by combining them.

The application

The application is a straightforward application with a dozen security issues. As a user, you can only:

  • Register/Login/Logout
  • Upload and retrieve file

To get started you can get the code:

You can use the tool cloc ( to get a better idea of the size of the application:

% cloc .
      14 text files.
      13 unique files.                              
       2 files ignored. v 1.72  T=0.11 s (120.6 files/s, 46503.2 lines/s)
Language                     files          blank        comment           code
CSS                              2            676             11           3973
PHP                             10             48              4            289
SQL                              1              5              0              5
SUM:                            13            729             15           4267

List of weakness

You can find below the list of issues:

  • Hardcoded credentials or secrets
  • Information leak
  • Missing security flags
  • Weak password hashing mechanism
  • Cross-Site Scripting
  • No CSRF protection
  • Directory listing
  • Crypto issue
  • Signature bypass
  • Authentication bypass
  • Authorisation bypass
  • Remote Code Execution
Before reading any further read all the source code of the application and try to find the issues by yourself. Make sure you keep notes.

This page is intentionally left blank

The issues

Hardcoded credentials/secret

It's pretty easy to spot the hardcoded credentials or secrets if you read all the code. Here we can see that the following files contain some sensitive information:

  • classes/db.php:
    $lnk = mysql_connect("", "pentesterlab", "pentesterlab");
  • classes/jwt.php:
    return hash("sha256","donth4ckmebr0".$data);

A modern web application should store all secrets outside of the source code of the application. During deployment, the secrets and credentials should be injected inside the application (via environment variables or configuration file). This limits access to production secrets/credentials for developers.

Information leak

To find this issue, you need to think about the way the application will get deployed. The web root of the application is the main directory of the source code. Therefore, the following files and directories will be publicly available:

  • .git: this can be used to rebuild the full source code of the application.
  • deploy.sql: this contains the database password in clear text and the hashed admin password.

These files should not be in the web root of the application.

Security flags on cookies

The application use setcookie to set cookies for authentication. However, it doesn't set the secure and httpOnly flags on them in the following files:

  • login.php:
 setcookie("auth", User::createcookie($_POST['username'], $_POST['password']));
  • register.php:
setcookie("auth", User::createcookie($_POST['username'], $_POST['password']));

Weak password hashing mechanism

If you look at the register function in classes/user.php:

  public static function register($user, $password) {
    $sql = "INSERT INTO  users (login,password) values (\"";
    $sql.= mysql_real_escape_string($user);
    $sql.= "\", md5(\"";
    $sql.= mysql_real_escape_string($password);
    $sql.= "\"))";
    $result = mysql_query($sql);

You can see that the application uses md5 to hash passwords. Furthermore, the hashing is done without any seed. Finally, the hashing of the passwords is done by the database (as opposed to the application). This means that customers' password are likely to end up in the database logs or get transmitted in cleartext between the database and the application.

The application should use scrypt, bcrypt or PBKDF2 to store the password.

Cross-Site Scripting

The application uses the h function to escape data coming from the users. This function is just an alias for htmlentities defined in classes/phpfix.php:

  function h() {
    return call_user_func_array("htmlentities", func_get_args());

We can see that, h is not used in the following places:

  • index.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>
  • login.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>
  • register.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>

However, $error is not under user's control. Therefore, those are only weaknesses, not vulnerabilities. They should still get fixed to future-proof the application.

The exploitable XSS in located in index.php:

<form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="post" enctype="multipart/form-data">

As $_SERVER['PHP_SELF'] is user-controlled and can be used to trigger a XSS (using a path like /index.php/"><script...).

There is also another one in classes/user.php that can be triggered by registering multiple times the same username for example (due to the call to mysql_error).

No CSRF protection

Here we can see that none of the forms are protected by a CSRF token. This is an example of the "what is missing" bugs, you can't see it because it's not there.

Directory listing

When users retrieve the list of their files the following code is called (in classes/user.php):

  public static function getfiles($user) {
    $base = "files/".$user;
    if (!file_exists($base)) {
    return array_diff(scandir($base), array('..', '.'));

This code gets the list of files (scandir) in files/[USERNAME] and removes the parent (..) and current (.) directory from the list using array_diff.

Nothing in this code (or during registration) prevents an attacker to use the following usernames: .., ../.., ../../.. ... This will allow her to get a directory listing of any directory on the server hosting the application.

Crypto issue

The crypto issue is located in classes/jwt.php in the signing function:

  public static function signature($data) {
    return hash("sha256","donth4ckmebr0".$data);

Here we can see that the application is using a hash instead of a hmac. This creates multiple issues:

  • It's likely to be vulnerable to Length Extension
  • It's not the RFC way to sign JWT token. This implementation will not work with other applications.

Signature bypass

The signature bypass is located in classes/jwt.php:

  public static function verify($auth) {
    list($h64,$d64,$sign) = explode(".",$auth);
    if (!empty($sign) and (JWT::signature($h64.".".$d64) != $sign)) {
      die("Invalid Signature");

We can see that the application get three elements from the $auth variable: $h64,$d64 and $sign. Then it checks the signature $sign using: JWT::signature($h64.".".$d64) != $sign. However, it only checks the signature if one is provided: if (!empty($sign) and.... An attacker can provide a malicious $auth variable that doesn't contain a signature. This will allow him to bypass the signature mechanism.

Authentication bypass

The authentication bypass is based on a real vulnerability in the Play framework covered in another PentesterLab exercise: Play Session Injection.

Due to the weird JSON parser:

  public static function parse_json($str) {
    $data = explode(",",rtrim(ltrim($str, '{'), '}'));
    $ret = array();
    foreach($data as $entry) {
      list($key, $value) =  explode(":",$entry);
      $key = rtrim(ltrim($key, '"'), '"');
      $value = rtrim(ltrim($value, '"'), '"');
      $ret[$key] = $value;
    return $ret;

and the way the JSON is created:

    $header = str_replace("=","",base64_encode('{"alg":"HS256","iat":'.time().'}'));
    $token = "{";
    foreach($data as $key=>$value) {
      $token.= '"'.$key.'":"'.$value.'",';
    $token .= "}";

It's possible to create a malicious user named test","username":"admin to become admin in the application.

Authorization bypass

The authorization bypass is very simple. When a file is uploaded, it gets saved in the directory: /files/[USERNAME]/[FILENAME]:

  public static function addfile($user) {
    $file = "files/".$user."/".basename($_FILES["file"]["name"]);
    if (!preg_match("/\.pdf/", $file)) {
      return  "Only PDF are allowed";
    } elseif (!move_uploaded_file($_FILES["file"]["tmp_name"], $file)) {
      return "Sorry, there was an error uploading your file.";

It's trivial for an attacker to guess [USERNAME] and [FILENAME] to gain access to other users' files. The developper should ever put in place some access control or make the links impossible to guess.

Remote Code Execution

The RCE is located in the addfile function:

  public static function addfile($user) {
    $file = "files/".$user."/".basename($_FILES["file"]["name"]);
    if (!preg_match("/\.pdf/", $file)) {
      return  "Only PDF are allowed";
    } elseif (!move_uploaded_file($_FILES["file"]["tmp_name"], $file)) {
      return "Sorry, there was an error uploading your file.";

We already know that uploaded files are stored in the web root. We can see that the developer limits the file type users can upload to PDF. However, the regular expression is missing $ to match the end of the string. It's possible for an attacker to upload a file named exec.pdf.php. As the file's extension is .php and the file is in the web root, it will get executed when accessed by the attacker.


This exercise showed you how to perform your first code review to find potential security issues. I hope the course provides you with enough information and the methodology to get started. As always with security, practice makes perfect. Try to regularly review some code: changes introduced to fix a CVE, applications you're using everyday, applications in a weird language... Now it is time to apply what you just learned!

Did you enjoy this exercise?

Make sure you check out PentesterLab PRO and PentesterLab PRO Enterprise to develop your skills.