Exploiting and Preventing Insecure Deserialization

When developing a game, you may need to save a player’s run to a file so that you don’t lose their progress and they can return to where they left off. Similarly, when developing an online text editor, you may want to preserve the content that the user has written.

Indeed, there are many cases where we want to save the state of our application to restore it in the future. Two terms are used to define this process: serialization and deserialization.

What is deserialization (and serialization)?

Serialization is the process of converting the state of an application into a format suitable for transfer or storage. Deserialization is the reverse process: it rebuilds the state of the application from that format.
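
As a minimal illustration of this round trip, here is a sketch in Python that uses JSON as the storage format. The GameState class and the save and load helpers are hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class GameState:
    player: str
    level: int
    inventory: list


def save(state: GameState) -> str:
    """Serialize: turn the in-memory state into text that can be stored or sent."""
    return json.dumps(asdict(state))


def load(text: str) -> GameState:
    """Deserialize: rebuild the state from the stored text."""
    return GameState(**json.loads(text))


state = GameState(player="alice", level=3, inventory=["sword", "potion"])
saved = save(state)        # a plain string, ready to be written to a file
restored = load(saved)     # an object equal to the original state
```

Here the code that loads the data decides which class is built. The problems described in the rest of this article appear when the data itself makes that decision.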

We may want to serialize more than plain values, for example instances of classes. The serialization format must then support this, typically by recording the name of the class next to the data. This raises a question: what happens during deserialization if the user supplies a class name other than the one expected?

Classes can have methods that are called automatically when they are deserialized, when their members are read or written, or when they are destroyed. For example, a class could keep a list of all the temporary files it has created in order to delete them when it is destroyed. This is the case, for example, with the XLSXWriter class in PHP.

So if a user is able to control what the server deserializes, they can instantiate dangerous classes to grant themselves rights, delete files or execute arbitrary code.

Deserialization in PHP: how it works and possible exploitations

In PHP, the serialize function converts a data structure, together with its type, into a string. The unserialize function performs the reverse operation.

After the data is serialized, it is represented differently according to its type.

  • For a string, s:13:"Hello, world!";, where the number is the length of the string in bytes.
  • For an integer, i:42;.
  • For a boolean, b:0; or b:1;.
  • For the value null, N;.
  • For an array, a:1:{s:5:"hello";s:5:"world";}, where the first number is the number of entries and the braces enclose a sequence of key/value pairs.
  • For an object, O:5:"Hello":1:{s:7:"to_whom";s:5:"World";}, where the first number is the length of the class name, the second is the number of properties, and the rest follows the same layout as an array.
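
To make the layout of these representations concrete, the sketch below rebuilds them in Python. It is a partial illustration written for this article, not PHP's implementation: php_serialize and php_serialize_object are hypothetical helpers, and floats, references and non-public properties are left out.

```python
def php_serialize(value) -> str:
    """Encode a Python value using the layout of PHP's serialize() (partial sketch)."""
    if value is None:
        return "N;"
    if isinstance(value, bool):          # tested before int: bool is a subclass of int
        return f"b:{int(value)};"
    if isinstance(value, int):
        return f"i:{value};"
    if isinstance(value, str):
        # The length prefix counts bytes, not characters.
        return f's:{len(value.encode("utf-8"))}:"{value}";'
    if isinstance(value, list):
        value = dict(enumerate(value))   # a PHP list is an array with keys 0..n-1
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return f"a:{len(value)}:{{{body}}}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


def php_serialize_object(class_name: str, properties: dict) -> str:
    """Encode an object with public properties only."""
    body = "".join(php_serialize(k) + php_serialize(v) for k, v in properties.items())
    name_length = len(class_name.encode("utf-8"))
    return f'O:{name_length}:"{class_name}":{len(properties)}:{{{body}}}'


print(php_serialize("Hello, world!"))                       # s:13:"Hello, world!";
print(php_serialize({"hello": "world"}))                    # a:1:{s:5:"hello";s:5:"world";}
print(php_serialize_object("Hello", {"to_whom": "World"}))  # O:5:"Hello":1:{s:7:"to_whom";s:5:"World";}
```

Nothing in this format is secret or authenticated: anyone who knows the layout can write such a string by hand, class name included.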

An attacker will be particularly interested in the representation of an object, because they control the name of the class and can therefore build whatever object they want. Take the code below, for example.

php
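<?php
class TemporaryStorage {
	// Paths of the temporary files created by this instance.
	public $files = array();

	// Called automatically when the object is destroyed,
	// including objects created by unserialize().
	function __destruct() {
		foreach ($this->files as $temp_file) {
			echo "Deleting $temp_file...\n";
			@unlink($temp_file);
		}
	}
}

// Vulnerable: the serialized data comes straight from the request.
unserialize($_GET["user_data"]);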

If this class exists in the code, an attacker can abuse the destructor of TemporaryStorage to delete arbitrary files on the system. To do so, they forge an object whose class name is TemporaryStorage, with a files property whose value is a list of the file names to delete, and have it passed to the unserialize function.

O:16:"TemporaryStorage":1:{s:5:"files";a:2:{i:0;s:20:"/app/application.php";i:1;s:16:"/data/datbase.db";}}

If we transmit the above string, then we will get the following response from the server.

Deleting /app/application.php...
Deleting /data/datbase.db...

Finding exploitable classes written in-house may require access to the source code. However, widely used libraries also contain exploitable classes, which widens the possibilities for an attacker. On this point, the PHPGGC project collects exploitable classes (gadget chains) from frequently used libraries and frameworks.

If another vulnerability in your application allows arbitrary editing of PHP sessions, an attacker could use it to trigger a dangerous deserialization, because session data is stored in serialized form.

Deserialization in C# with Json.NET: how it works and possible exploitations

The Json.NET library serializes and deserializes data in JSON, a format that is generally considered safe. Unfortunately, when type name handling is enabled (it is disabled by default), the library uses the $type field of a JSON object to decide which class to instantiate. With the TypeNameHandling.Objects option, for instance, any type can be specified when deserializing a value whose declared type is object.

For example, consider the following case:

csharp
using System;
using System.Diagnostics;
using Newtonsoft.Json;

public class CommandManager {
	public string? output { get; set; }

	private string? _command;

	public string? command {
		get { return _command; }

		// The setter has a side effect: assigning the property runs the command.
		set {
			_command = value;

			Process p = new Process();
			// Needed for output redirection (already the default on .NET Core and later).
			p.StartInfo.UseShellExecute = false;
			p.StartInfo.RedirectStandardOutput = true;
			p.StartInfo.FileName = "/usr/bin/env";
			p.StartInfo.Arguments = _command;
			p.Start();

			output = p.StandardOutput.ReadToEnd();
			p.WaitForExit();

			Console.WriteLine(output);
		}
	}
}

This class has a custom setter for the command property: it executes the command it receives and stores the result in the output property. As explained above, if the Json.NET library is used to deserialize with type name handling enabled, we can specify the name of this class in the $type field and trigger the command setter.

csharp
var result = JsonConvert.DeserializeObject<object>(
	user_input,
	new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.Objects }
);

Suppose we use the above code to deserialize the data below.

json
{"$type":"CommandManager, myproject","command":"id"}

The command setter of CommandManager is then called and our id command is executed. As with PHP, projects such as ysoserial.net list classes that can be exploited this way.
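
The mechanism can be sketched independently of Json.NET. The Python toy below is not the library's code: naive_deserialize, LOADED_TYPES and the Note class are hypothetical, and the setter only records the command instead of running it. It shows the two steps that matter: the input picks the class, then assigning each field triggers the setters.

```python
import json


class Note:
    """A harmless type that the application expects to receive."""
    text = ""


class CommandManager:
    """Stand-in for the C# class above: its setter has a side effect."""
    executed = []                       # records commands instead of running them

    @property
    def command(self):
        return self._command

    @command.setter
    def command(self, value):
        self._command = value
        CommandManager.executed.append(value)   # the real class starts a process here


# Stands in for every class loaded in the process.
LOADED_TYPES = {"Note": Note, "CommandManager": CommandManager}


def naive_deserialize(text: str):
    """Instantiate whatever class the input names, then assign each field."""
    data = json.loads(text)
    cls = LOADED_TYPES[data.pop("$type")]       # the class is chosen by the sender
    obj = cls()
    for name, value in data.items():
        setattr(obj, name, value)               # runs property setters
    return obj


obj = naive_deserialize('{"$type": "CommandManager", "command": "id"}')
# CommandManager.executed now contains "id", although the caller expected a Note.
```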

Deserialization in other languages and security best practices

Generally speaking, the process is the same in any language that allows code or classes to be loaded dynamically. An attacker looks for a deserialization function, then for classes whose methods have useful side effects and were written without this threat in mind.
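
Python's pickle module is one example of the same pattern. In the sketch below, the Forged class is hypothetical and deliberately harmless: its __reduce__ method asks the loader to call int("42"), where an attacker would name a dangerous function instead. The second half follows the approach described in the Python documentation for restricting globals, here in its strictest form, which refuses every class and function named in the stream.

```python
import io
import pickle


class Forged:
    """When pickled, asks the loader to call int("42") instead of rebuilding a Forged."""

    def __reduce__(self):
        # An attacker would return something like (os.system, ("id",)) here.
        return (int, ("42",))


payload = pickle.dumps(Forged())
result = pickle.loads(payload)      # calls int("42"): the payload chose the callable


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses to resolve any class or function named in the stream."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"forbidden global: {module}.{name}")


def restricted_loads(data: bytes):
    """Load plain data (dict, list, str, int...) and reject everything else."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

With this restriction, restricted_loads(payload) raises pickle.UnpicklingError, while plain dictionaries and lists still load normally.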

We propose two ways to mitigate this problem. If you really need to represent complex data that is produced by the server and returned by the client, the server can attach a cryptographic signature to the serialized data and check that signature when the client returns the data. In this way, an attacker cannot modify the data to load arbitrary code without invalidating the signature.
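
A minimal sketch of this first approach in Python follows. It assumes HMAC-SHA256 as the signature scheme, which is one common choice when the same server both signs and verifies. SECRET_KEY, sign and verify are hypothetical names, and key storage and rotation are out of scope.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key: generated randomly, kept on the server, never sent to clients.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign(data: dict) -> str:
    """Serialize data and append an HMAC-SHA256 tag computed with the server key."""
    body = base64.urlsafe_b64encode(json.dumps(data).encode("utf-8"))
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + tag


def verify(token: str) -> dict:
    """Check the tag first and deserialize only if it matches."""
    body, _, tag = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid leaking the tag byte by byte.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("invalid signature")
    return json.loads(base64.urlsafe_b64decode(body))


token = sign({"level": 3})
data = verify(token)    # returns the original dictionary
```

The order of operations is the point: the signature is verified before any deserialization takes place, so forged input never reaches the deserializer.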

The other way is to use a more basic format such as JSON, with a library that does not allow arbitrary loading of classes (at least not by default).
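
The sketch below illustrates this second approach in Python, using the online text editor example from the introduction. The Document class and load_document function are hypothetical. Python's json module only ever produces basic values, so the application code, not the input, decides which class is built.

```python
import json
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    content: str


def load_document(text: str) -> Document:
    """Parse JSON into plain data, then build the one type this code expects."""
    data = json.loads(text)     # yields only dict, list, str, int, float, bool or None
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title, content = data.get("title"), data.get("content")
    if not isinstance(title, str) or not isinstance(content, str):
        raise ValueError("title and content must be strings")
    # Extra fields such as "$type" are simply ignored.
    return Document(title=title, content=content)


doc = load_document('{"$type": "CommandManager", "title": "Notes", "content": "Hello"}')
```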

Author: Arnaud PASCAL – Pentester @Vaadata