Creating a traceroute program in PHP

Posted on 30 Jul 2010
Tagged with: [ ICMP ]  [ IP ]  [ PHP ]  [ traceroute ]

Today I was reading this wonderful article about writing a traceroute program in Python in 40 lines. Even though traceroute is one of the many tools I use on a day-to-day basis, I never really got around to writing a version myself (something I like to do just to learn how things work). So while reading the post, I thought: Python is nice, but is it possible to do this in PHP as well? The answer: yes and no.

Even though PHP supports sockets, it does not give us all the power that comes with the socket API. Some things are hidden (for reasons), so we have to do some things manually. As PHP gets more powerful, more stable and more enterprise-ready by the day, I assume more and more people will start using PHP for things other than web scripts. One of those things might be daemons (programs running in the background on a server) or something similar. This would mean that more and more low-level APIs, like the socket API, should become available.

What is traceroute?

For those who don’t know what traceroute is, here is a short explanation. Traffic on the internet does not go directly from the source (a web server hosting a website) to the destination (your computer), but passes through a series of substations (routers, switches etc.). This is transparent, and as an end user you don’t really care about it. But when you can’t reach the web server, it could be that the web server itself is unavailable, or that something along the way has broken down. You can use traceroute to visualize the route the traffic takes from your computer to the web server (and normally, vice versa).

How does it work?

Traceroute works with ICMP (the Internet Control Message Protocol). This protocol is used for maintaining and controlling, basically, the whole internet. It is used to quickly figure out whether a server is online, it will tell you if a host or port is unavailable, if something needs to be redirected, and so on. It is the police force of the internet, there to keep things flowing smoothly.

As said, every time you send a packet across the internet, it goes through a series of substations. Each station checks whether the packet is addressed to it; if not, it sends the packet to a neighboring station a little bit closer to the final destination. However, this process does not go on indefinitely. When there is no end station, or something has been misconfigured somewhere, it would be bad if every station kept passing the packet along in circles. A special counter inside each packet tells a station how many more times it can be passed along. This is called the “time-to-live” (TTL). Every station along the way decreases this counter, and as soon as it hits 0, the station drops the packet and sends a message back to the source saying that the “TTL has expired”. This is not the same as telling the source that the destination is unknown; it merely says that the destination is further away than the initial TTL given to the packet.
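The TTL mechanism can be illustrated with a tiny simulation. This is only a sketch: the function name forwardPacket() and the addresses (taken from the documentation ranges) are made up for the example, and no real packets are sent.

```php
<?php
// Hypothetical helper: passes a packet along a list of routers and reports
// whether it is delivered or dropped because its TTL ran out.
function forwardPacket(array $routers, string $destination, int $ttl): string
{
    foreach ($routers as $router) {
        // Each router decreases the counter before passing the packet on.
        $ttl--;
        if ($ttl <= 0) {
            // Counter hit 0: this router drops the packet and reports back.
            return "TTL expired at $router";
        }
    }
    // The packet survived every router, so it arrives at its destination.
    return "delivered to $destination";
}

// Made-up path of three routers in front of the destination.
$routers = ['192.0.2.1', '198.51.100.1', '203.0.113.1'];
echo forwardPacket($routers, '203.0.113.80', 2), "\n";   // dropped by the second router
echo forwardPacket($routers, '203.0.113.80', 64), "\n";  // plenty of TTL, gets delivered
```

With three routers in the path, TTLs 1 to 3 each expire one station further along, and a TTL of 4 or more reaches the destination.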

Traceroute takes advantage of this TTL field. It sends a packet to a certain destination (the site you want to traceroute to) with a TTL of 1. This means the packet gets dropped by the first station it passes, which returns an ICMP message. We fetch this message, find out who sent it (namely, the first station), figure out how long the round trip took (the time from sending until receiving the ICMP message) and print this on the screen. After this we increase the TTL to 2 and send out the packet again. Now it passes the first station and gets dropped by the second. That station returns an ICMP message and we print the info. This continues until we have reached a certain TTL (normally 30) or until we have reached our final destination.
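The program in this post only looks at who sent the ICMP message, but the message itself also says why it was sent. The sketch below pulls the ICMP type and code out of a received buffer. It assumes the buffer starts with the IPv4 header, which is how raw sockets usually hand over packets; the helper name parseIcmpReply() is made up for this example. Type 11 is "time exceeded" and type 3 with code 3 is "port unreachable", which is what a destination typically answers to a UDP probe on an unused port.

```php
<?php
// Hypothetical helper: reads the ICMP type and code from a raw IPv4 packet.
// Assumption: $packet begins with the IP header, followed by the ICMP header.
function parseIcmpReply(string $packet): ?array
{
    if (strlen($packet) < 20) {
        return null; // too short to hold an IPv4 header
    }
    // The low 4 bits of the first byte give the header length in 32-bit words.
    $headerLength = (ord($packet[0]) & 0x0F) * 4;
    if ($headerLength < 20 || strlen($packet) < $headerLength + 2) {
        return null;
    }
    $type = ord($packet[$headerLength]);
    $code = ord($packet[$headerLength + 1]);

    if ($type === 11) {
        $meaning = 'time exceeded (an intermediate hop)';
    } elseif ($type === 3 && $code === 3) {
        $meaning = 'port unreachable (the destination itself)';
    } else {
        $meaning = 'other';
    }
    return ['type' => $type, 'code' => $code, 'meaning' => $meaning];
}

// Fake packet: a 20-byte IPv4 header (0x45 = version 4, 5 words) plus an ICMP header.
$fake = "\x45" . str_repeat("\x00", 19) . "\x0B\x00\x00\x00";
print_r(parseIcmpReply($fake));
```

Checking the type this way would be a more precise stop condition than comparing addresses, but it is an extension, not part of the program below.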

Overcoming the PHP socket API problems

In essence, our traceroute program is fairly simple. It does not have as many features as normal traceroute programs, but the basics are pretty similar. Normal traceroute programs also send out multiple packets per hop so you get an average round-trip time (it’s even possible that each packet ends up at a different station; normal traceroute programs will show you that too).

We rely on two different protocols here. First, we have ICMP, which we need because those are the packets we want to get back from our substations. Secondly, our probe packets are based on UDP. It’s possible to use TCP, but UDP is much easier to work with. Because we need to adjust the TTL of the packets (something that is normally taken care of automatically by the TCP/IP stack of your OS), we need to use PHP’s socket_set_option() function. However, setting the TTL is not something you can do directly, because PHP does not define the constants needed.

First of all, PHP only offers the SOL_SOCKET level for socket_set_option(), which sets options purely for the socket itself, but the underlying API accepts other levels as well. In another language, we could use the SOL_IP level and, with the IP_TTL option, set the TTL:

// Set the TTL for packets to 3
socket_set_option ($socket, SOL_IP, IP_TTL, 3);

The constants SOL_IP and IP_TTL are not defined in PHP. This is because these values are not only OS-specific, but also known under different names. SOL_IP is Linux-specific (I’m not sure whether the same option exists for Windows sockets, but they do things very differently anyway). The BSDs (and Mac OS X) use a different constant: IPPROTO_IP. Luckily, both have the same value. IP_TTL, however, differs between operating systems. For instance, under Linux its value is 2, while under Mac OS X (and other BSDs) it’s 4.

Since PHP’s sockets (at least, the socket_set_option() function) are a thin layer on top of the actual socket API, PHP doesn’t really care what kind of flags you give it. So even though PHP does not really support this, the API underneath does: we can pass those numbers without any problems and it works. Sweet!
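A quick way to convince yourself that the raw numbers really reach the OS is to set the TTL on a plain UDP socket (no root needed for that) and read it back. This is a sketch under a few assumptions: it uses the option numbers mentioned above (2 on Linux, 4 on Mac OS X and the BSDs), it relies on PHP_OS_FAMILY (PHP 7.2 or newer), and the helper names ipTtlOption() and setAndReadTtl() are made up for this example.

```php
<?php
// Assumption: option numbers as given in this post (Linux: 2, BSD/macOS: 4).
function ipTtlOption(string $osFamily): ?int
{
    switch ($osFamily) {
        case 'Linux':
            return 2;
        case 'Darwin':
        case 'BSD':
            return 4;
        default:
            return null; // not covered here, look it up in the OS headers
    }
}

// Sets the TTL on a fresh UDP socket and reads it back.
// Returns null when the sockets extension or the option is unavailable.
function setAndReadTtl(int $ttl): ?int
{
    $option = ipTtlOption(PHP_OS_FAMILY);
    if ($option === null || !extension_loaded('sockets')) {
        return null;
    }
    $socket = @socket_create(AF_INET, SOCK_DGRAM, SOL_UDP);
    if ($socket === false) {
        return null;
    }
    $level = 0; // SOL_IP on Linux, IPPROTO_IP on BSD: both are 0
    $ok = @socket_set_option($socket, $level, $option, $ttl);
    $value = $ok ? @socket_get_option($socket, $level, $option) : false;
    socket_close($socket);
    return is_int($value) ? $value : null;
}

var_dump(setAndReadTtl(3));
```

If the value that comes back matches the one you set, the option number is right for your platform; a null means the check could not be performed here.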

Running the code:

Even though we can run this program, it will probably result in errors. That is because Linux (or any other decent OS) will not let you tinker with the internals of TCP/IP packets without good cause, and using protocols like ICMP directly is not something everybody is allowed to do.

If you really want to do this, you need to be root. Therefore, this program only works when run as root. This means it probably won’t work when run from a web server; you have to run it from the command line:

jthijssen@tarabas:~/traceroute$ sudo php traceroute.php
Tracerouting to destination: 199.6.1.164
  1   192.168.1.1      0.004 ms  192.168.1.1
  2   *                0.005 ms  static.kpn.net
  3   (timeout)
  4   139.156.113.141  0.005 ms  nl-asd-dc2-ias-csg01-ge-3-2-0-kpn.net
  5   195.190.227.221  0.005 ms  asd2-rou-1022.nl.euroringen.net
  6   134.222.229.105  0.005 ms  asd2-rou-1001.NL.eurorings.net
  7   134.222.97.186   0.007 ms  kpn-1402.xe-0-0-0.jun1.galilei.network.bit.nl
  8   213.154.236.75   0.012 ms  213.154.236.75
  9   199.6.1.164      0.012 ms  pub3.kernel.org

This is a traceroute to www.kernel.org. I’ve removed the second hop (because that’s the IP at my place). The third hop returned a timeout; the station there probably did not send an ICMP packet back to us.
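Because the program only works as root, it can be friendlier to check this up front instead of failing on socket creation. A minimal sketch, assuming the posix extension is available (it usually is on Linux and Mac OS X command-line builds, but not on Windows); the helper name isRunningAsRoot() is made up for this example.

```php
<?php
// Hypothetical helper: true or false when the effective user ID can be read,
// null when the posix extension is not available (for example on Windows).
function isRunningAsRoot(): ?bool
{
    if (!function_exists('posix_geteuid')) {
        return null;
    }
    // The effective user ID of root is 0.
    return posix_geteuid() === 0;
}

$root = isRunningAsRoot();
if ($root === false) {
    echo "Raw ICMP sockets need root; try running this script with sudo.\n";
} elseif ($root === null) {
    echo "Could not determine the current user; continuing anyway.\n";
}
```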

The actual code:

The code itself can be found on GitHub. However, it’s small enough to post here as well.

<?php

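    // PHP does not define these constants itself, so we do (unless they already exist).
    if (! defined ("SOL_IP")) define ("SOL_IP", 0);
    if (! defined ("IP_TTL")) define ("IP_TTL", 2);    // On OSX, use '4' instead of '2'.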

    $dest_url = "www.google.com";   // Fill in your own URL here, or use $argv[1] to fetch from commandline.
    $maximum_hops = 30;
    $port = 33434;  // Standard port that traceroute programs use. Could be anything actually.

    // Get IP from URL
    $dest_addr = gethostbyname ($dest_url);
    print "Tracerouting to destination: $dest_addr\n";

    $ttl = 1;
    while ($ttl <= $maximum_hops) {
        // Create ICMP and UDP sockets
        $recv_socket = socket_create (AF_INET, SOCK_RAW, getprotobyname ('icmp'));
        $send_socket = socket_create (AF_INET, SOCK_DGRAM, getprotobyname ('udp'));

        // Set TTL to current lifetime
        socket_set_option ($send_socket, SOL_IP, IP_TTL, $ttl);

        // Bind receiving ICMP socket to any local IP (no port needed since it's ICMP)
        socket_bind ($recv_socket, "0.0.0.0", 0);

        // Save the current time for roundtrip calculation
        $t1 = microtime (true);

        // Send a zero sized UDP packet towards the destination
        socket_sendto ($send_socket, "", 0, 0, $dest_addr, $port);

        // Wait for an event to occur on the socket or timeout after 5 seconds. This will take care of the
        // hanging when no data is received (packet is dropped silently for example)
        $r = array ($recv_socket);
        $w = $e = array ();
        socket_select ($r, $w, $e, 5, 0);

        // If the socket is still in $r there is data to read; otherwise a timeout has occurred.
        if (count ($r)) {
            // Receive data from socket (and fetch the address of the station that sent it)
            socket_recvfrom ($recv_socket, $buf, 512, 0, $recv_addr, $recv_port);

            // Calculate the roundtrip time
            $roundtrip_time = (microtime(true) - $t1) * 1000;

            // No decent address found, display a * instead
            if (empty ($recv_addr)) {
                $recv_addr = "*";
                $recv_name = "*";
            } else {
                // Otherwise, fetch the hostname for the address found
                $recv_name = gethostbyaddr ($recv_addr);
            }

            // Print statistics
            printf ("%3d   %-15s  %.3f ms  %s\n", $ttl, $recv_addr,  $roundtrip_time, $recv_name);
        } else {
            // A timeout has occurred, display a timeout
            printf ("%3d   (timeout)\n", $ttl);
        }

        // Close sockets
        socket_close ($recv_socket);
        socket_close ($send_socket);

        // Increase TTL so we can fetch the next hop
        $ttl++;

        // When we have hit our destination, stop the traceroute
        // (isset() avoids a warning when the very first hop timed out)
        if (isset ($recv_addr) && $recv_addr == $dest_addr) break;
    }

Conclusion:

Working with the PHP socket API is fun. Because PHP is platform-independent, it’s hard to come up with methods that work on all platforms. But because PHP passes a lot of the functionality of its socket functions straight down to the socket API of the OS, you can still benefit from everything the OS offers.

Expanding the example above is easy. Instead of sending one packet, you could send three and show the average round-trip time. You could use the $argv parameters to get the host you want from the command line. You could check whether you are actually root before running, and there are many more things that could be improved. Also, take a look at other (decent) traceroute implementations. There are many more ways to figure out the route a packet takes. It’s up to you to implement them.
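As a starting point for two of these ideas, here is a small sketch that averages the round-trip times of several probes and takes the target from the command line. The helper name averageRoundTrip() and the sample values are made up for this example; a null entry stands for a probe that timed out.

```php
<?php
// Hypothetical helper: averages the round-trip times (in ms) of several probes.
// Probes that timed out are passed as null and left out of the average.
function averageRoundTrip(array $samplesMs): ?float
{
    $answered = array_filter($samplesMs, function ($sample) {
        return $sample !== null;
    });
    if (count($answered) === 0) {
        return null; // every probe timed out
    }
    return array_sum($answered) / count($answered);
}

// Take the target from the command line, falling back to a default.
$target = $argv[1] ?? 'www.google.com';

// Made-up samples: three probes, of which the second one timed out.
$average = averageRoundTrip([12.0, null, 18.0]);
printf("%s: %s\n", $target, $average === null ? '(timeout)' : sprintf('%.3f ms', $average));
```

In the real program, the samples would come from running the send/receive part of the loop three times per TTL value.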