Network programming

Paul Harrison -- pfh@csse.monash.edu.au


Networks

A network is a collection of computers connected together somehow.

  • The computers communicate by sending each other small packets of data (generally on the order of a kilobyte).

    Examples: ethernet, wireless, modem, wide area networks

  • Every computer in the network can communicate directly with every other computer in the network.

  • Every computer has a network address, often called a hardware address.

    Example: the hardware address of my laptop is 00:30:65:01:02:54
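
    A hardware address of this kind is six bytes written in hexadecimal. The following is a minimal Python 3 sketch that converts between the written form and the raw bytes. The two function names are invented for this illustration.

```python
def parse_hardware_address(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into the six raw bytes it stands for."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated fields: %r" % text)
    return bytes(int(part, 16) for part in parts)


def format_hardware_address(raw):
    """Convert six raw bytes back into the usual colon-separated form."""
    return ":".join("%02x" % byte for byte in raw)


address = parse_hardware_address("00:30:65:01:02:54")
print(len(address), "bytes:", list(address))
print(format_hardware_address(address))
```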


    Internetworks

  • A computer can be joined to more than one network. These computers are called gateways.

  • These computers can relay packets between networks.

  • A collection of networks connected by gateways is called an internetwork.

    Example: the internet

  • Each computer in an internetwork has another address. On the internet, this is called its IP address (IP stands for Internet Protocol).

    Example: the IP address of my laptop is 130.194.225.100

  • Gateways use this address to decide which network to send a packet on to.

    Example: a packet sent from mandarin.csse.monash.edu.au to www.anu.edu.au passes through six gateways. In the trace below, the round trip to the final host takes roughly 160 to 190 milliseconds.

      # traceroute www.anu.edu.au   
      traceroute to www.anu.edu.au (150.203.99.8), 30 hops max, 38 byte packets
       1  caul-gw.monash.edu.au (130.194.227.254)  4.219 ms  3.191 ms  3.243 ms
       2  monash-gw-28.net.monash.edu.au (130.194.28.1)  4.516 ms  4.043 ms  4.250 ms
       3  vic-gw.vrn.edu.au (203.21.130.33)  13.524 ms  19.739 ms  20.501 ms
       4  vic-act.atm.net.aarnet.edu.au (192.12.76.34)  164.509 ms  168.172 ms  184.508 ms
       5  anu-anuhub.carno.net.au (203.22.212.34)  207.311 ms  130.980 ms  131.975 ms
       6  hanhuba.anu.edu.au (150.203.205.3)  154.367 ms  137.398 ms  146.712 ms
       7  www.anu.edu.au (150.203.99.8)  161.976 ms  184.624 ms  187.591 ms

  • Most computers on the internet also have a domain name. This is a human readable name.

    Examples: mandarin.csse.monash.edu.au, www.monash.edu.au
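
    The decision a gateway makes when it forwards a packet can be sketched in a few lines of Python 3. The routing table below is hypothetical: the networks and link names are invented for illustration and are not taken from any real gateway, and choose_route is a made-up helper. The only idea shown is that the gateway picks the most specific network that contains the destination IP address.

```python
import ipaddress

# Hypothetical routing table for a gateway: (destination network, next hop).
# The entries are invented for illustration only.
ROUTES = [
    (ipaddress.ip_network("130.194.0.0/16"), "local campus network"),
    (ipaddress.ip_network("150.203.0.0/16"), "link towards Canberra"),
    (ipaddress.ip_network("0.0.0.0/0"), "default upstream link"),
]


def choose_route(destination):
    """Return the next hop for the most specific network containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if address in net]
    # The longest prefix (largest prefixlen) is the most specific match.
    net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


print(choose_route("150.203.99.8"))      # an address from the traceroute above
print(choose_route("130.194.225.100"))
print(choose_route("192.0.2.1"))         # matches only the default route
```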


    The TCP/IP protocol stack

    The vast majority of network applications today use TCP/IP.

  • TCP stands for Transmission Control Protocol.

    Examples of protocols that run over TCP: HTTP (Hyper-Text Transfer Protocol), FTP (File Transfer Protocol), SSH (Secure SHell)

  • TCP and IP are actually two layers of a Protocol Stack.

    Application (eg HTTP)
    Transport (TCP)
    Network (IP)
    Data Link (eg ethernet)

    (this is not a stack in the sense of something you can push things onto and pop things off of!)

  • Each layer in the stack builds on top of the layer below it.

  • The Data Link layer does the actual business of communicating with other computers.

  • The Data Link layer is, in a certain specific, well defined sense, unreliable. A packet sent over, say, ethernet is not guaranteed to arrive at its destination.

  • The Network layer handles packaging packets so they can travel through the internet. It also deals with working out the hardware addresses corresponding to the IP addresses of different computers. It is also the layer at which routing occurs.

  • The Network layer is also unreliable.

  • TCP converts streams of outgoing data into packets, and incoming packets into streams. It hides the idea of packets, and instead presents the network as something kind of like a file that you can read from and write to.

  • TCP also provides reliability: if packets are lost in transit, they will be re-sent.
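
    The "stream, not packets" idea can be seen in a small Python 3 sketch. It uses socket.socketpair(), which returns two already-connected stream sockets on one machine, as a stand-in for the two ends of a TCP connection (an assumption made so the example runs without a network). The data is written in three pieces but read back as one continuous stream.

```python
import socket

# socketpair() gives two already-connected stream sockets on one machine.
# It is a stand-in here for the two ends of a TCP connection.
sender, receiver = socket.socketpair()

# Write the data in three separate pieces...
for piece in (b"Hello, ", b"wor", b"ld!\n"):
    sender.sendall(piece)
sender.close()  # signals end-of-stream to the reader

# ...and read it back as one continuous stream, like reading a file.
with receiver.makefile("rb") as stream:
    received = stream.read()
receiver.close()

print(received)
```

    The reader sees no trace of where one write ended and the next began, so any message boundaries an application needs have to be part of its own protocol.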


    Clients and servers

    Network applications can usually be divided into two different types: servers and clients.

  • Servers provide a service.

    Examples: Apache and IIS give access to web pages, SSHD (Secure SHell Daemon) lets you run a shell on the machine it is running on.

  • Clients "connect" to servers, and make use of the service they provide.

    Examples: Netscape connects to web servers and fetches web pages from them, SSH lets you connect to an SSHD and start a remote shell.

  • On the internet, servers "listen" for incoming connections on TCP ports. Ports are identified by 16-bit numbers, so each computer has 65535 usable ports (1 to 65535; port 0 is reserved).

    Examples: Apache and IIS generally listen on port 80, SSHD generally listens on port 22

    (these ideas of "connections" and "ports" are just a protocol, an agreement between client and server; in reality it's still all just packets)
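
    As a rough illustration of ports, the Python 3 sketch below sets up a listening socket and a client on the same machine and prints the (address, port) pair at each end of the connection. Using the loopback address 127.0.0.1 and port 0 (which asks the system for any free port) are choices made here so the example runs anywhere; a real server would use a fixed, agreed port.

```python
import socket

# Listen on the loopback address; port 0 asks the system for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_address = listener.getsockname()

# Connect a client to it. The system gives the client its own port too.
client = socket.create_connection(server_address)
accepted, client_address = listener.accept()

server_end = accepted.getsockname()
client_end = client.getsockname()
print("server end:", server_end)
print("client end:", client_end)

for s in (client, accepted, listener):
    s.close()
```

    The server end keeps the port it was listening on, while the client end has a different port chosen by the system. The pair of endpoints together identifies the connection.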


    Using TCP

  • Use the sockets interface.

  • A socket is a sort of general purpose object for connecting things together. One of the things it is used for is as the ends of a TCP/IP connection.

    For clarity, these examples are written in Python and don't handle errors. The corresponding C code is very similar. Writing network code in a low level language such as C can be very dangerous! Bored teenagers can use mistakes such as writing data beyond the end of an array to gain control of your computer.

    (using sockets in C will be covered in more detail in Trevor Dix's talk on Unix Processes and Sockets)


    Writing a client

    from socket import *
    
    server_name = 'www.csse.monash.edu.au'
    server_port = 80
    
    # Work out IP address corresponding to the server's domain name
    server_address = gethostbyname(server_name)
    
    # Create a socket
    my_socket = socket(AF_INET,SOCK_STREAM)
    
    # Connect the socket to the server
    my_socket.connect((server_address, server_port))
    
    # Make some file objects from the socket object
    # The equivalent C function is "fdopen"
    # (Alternatively use send and recv on the socket itself,
    #  but this can be tricky)
    incoming_stream = my_socket.makefile("r")
    outgoing_stream = my_socket.makefile("w")
    
    # Now we can communicate with the server we just connected to
    
    # We should be connected to the School's web server,
    # so let's send an HTTP request...
    outgoing_stream.write("GET /index.html HTTP/1.0\r\nHost: %s\r\n\r\n" % server_name)
    outgoing_stream.flush()
    
    # ... and print out the reply (headers, then the web page)
    print(incoming_stream.read())
    
    # Finally, clean up
    incoming_stream.close()
    outgoing_stream.close()
    my_socket.close()
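
    The client above leaves out error handling. One possible way to add it in Python 3 is sketched below. The function name fetch and the 10 second timeout are assumptions made for this example, and returning None on failure is just one of several reasonable designs.

```python
import socket


def fetch(host, port, path, timeout=10.0):
    """Send a minimal HTTP/1.0 request and return the raw reply as bytes.

    Returns None if the connection fails or times out.
    """
    request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
    try:
        # create_connection resolves the name and connects in one step
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                chunk = sock.recv(4096)
                if not chunk:    # empty read: the server closed the connection
                    break
                chunks.append(chunk)
            return b"".join(chunks)
    except OSError as error:     # refused connections, timeouts, failed lookups
        print("could not fetch %s: %s" % (host, error))
        return None
```

    Calling fetch("www.csse.monash.edu.au", 80, "/index.html") would then correspond to the client above, with the reply returned as bytes rather than printed.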
    
    

    Writing a server

  • The server is more complicated because it has to deal with multiple clients simultaneously.

    import sys, os
    from socket import *
    
    port = 10000
    
    # Create a socket to listen for connections
    listening_socket = socket(AF_INET,SOCK_STREAM)
    
    # Tell the socket which port it is meant to be listening to
    # Note: only root can bind to ports below 1024
    # Make sure you choose a port that some other application isn't using!
    listening_socket.bind(('', port))
    
    # Tell the socket to listen for connections
    listening_socket.listen(5)
    
    # listening_socket is a special kind of socket whose only job is to
    # listen for connections.
    
    while 1:
        # Accept a connection
        accepted_socket, address = listening_socket.accept()
    
        # Accepting the connection created another socket!
    
        # This is so that you can keep listening for new connections
        # while you deal with this connection.
        
        # accepted_socket behaves the same way as the socket in the client
    
        # Deal with this connection in a separate process, because we want
        # to keep listening for new connections
        if os.fork() != 0:
            # Parent process: close the new socket and keep looping
            accepted_socket.close()
    
        else:
            # Child process: close the listening socket and
            # handle the new connection
            listening_socket.close()
    
            incoming_stream = accepted_socket.makefile("r")
            outgoing_stream = accepted_socket.makefile("w")
    
            # Do something server-ish
            outgoing_stream.write("Hello world!\n")
    
            # Closing the streams also flushes any buffered output
            incoming_stream.close()
            outgoing_stream.close()
            accepted_socket.close()
            sys.exit(0)
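
    os.fork is only available on Unix-like systems. As a variation on the server above (not a replacement for it), the Python 3 sketch below handles each connection in a separate thread instead of a separate process. The function names are invented for this example, and it binds to 127.0.0.1 with a system-chosen port so that it can be tried without affecting other machines.

```python
import socket
import threading


def handle_client(connection):
    """Talk to one client, then close its socket."""
    with connection:
        connection.sendall(b"Hello world!\n")


def serve(listening_socket):
    """Accept connections forever, handing each one to its own thread."""
    while True:
        connection, address = listening_socket.accept()
        worker = threading.Thread(target=handle_client, args=(connection,))
        worker.daemon = True
        worker.start()


def start_server(port=0):
    """Start serving in a background thread; return the listening socket."""
    listening_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listening_socket.bind(("127.0.0.1", port))   # port 0: let the system choose
    listening_socket.listen(5)
    threading.Thread(target=serve, args=(listening_socket,), daemon=True).start()
    return listening_socket
```

    After listener = start_server(), the port to connect to is listener.getsockname()[1]. Threads share memory, so unlike the forking version the handlers must be careful with any data they have in common.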
    


    But in real life...

    ... there is more to network programming than client-server TCP/IP.


    Higher level protocols

  • There are various protocols built on top of TCP/IP to make network programming simpler.

    Examples: CORBA, XML-RPC

  • There are also many environments and libraries that can simplify network programming.

    Examples: libraries for languages such as Python and Perl
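
    As one illustration of both points, Python 3 ships XML-RPC modules in its standard library (xmlrpc.server and xmlrpc.client). The sketch below runs a server and a client in the same program purely so that it is self-contained. The add function and the choice of 127.0.0.1 with a system-chosen port are assumptions made for the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The service this example server offers."""
    return a + b


# Port 0 lets the system pick a free port; logRequests=False keeps it quiet.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client calls the remote function as if it were a local method.
host, port = server.server_address
proxy = ServerProxy("http://%s:%d/" % (host, port))
result = proxy.add(2, 3)
print(result)

server.shutdown()
server.server_close()
```

    There are no sockets, streams or request formats in sight: the library takes care of the connection and of encoding the call and its result.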


    UDP

    UDP (User Datagram Protocol) is a protocol that can be used instead of TCP.

  • UDP provides a means to send packets of data to other computers without setting up a connection.

  • UDP is, in a certain specific, well defined sense, unreliable. Applications that use UDP have to provide their own mechanism for re-sending packets that disappear en route.

    Examples: DNS (Domain Name System), games such as Quake
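
    A minimal Python 3 sketch of sending a single UDP packet follows. Both sockets live in one program and use the loopback address so that the example is self-contained; the 2 second timeout is an arbitrary choice for this illustration.

```python
import socket

# A UDP "server" is just a socket bound to a port; there is no listen/accept.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the system choose
receiver.settimeout(2.0)                 # don't wait forever for a lost packet
receiver_address = receiver.getsockname()

# The sender needs no connection: each sendto is one self-contained packet.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", receiver_address)

try:
    data, sender_address = receiver.recvfrom(1024)
    print("received", data, "from", sender_address)
except socket.timeout:
    data = None
    print("packet lost; a real application would re-send it here")

sender.close()
receiver.close()
```

    Note the contrast with TCP: there is no connect, listen or accept, and it is up to the program to notice that a packet never arrived.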


    NAT networks

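    NAT (Network Address Translation) networks appear on the internet to have a single IP address. The gateway to the network rewrites the source address of outgoing packets and the destination address of incoming packets (usually along with the port numbers), keeping a table of active connections so that replies reach the right computer.

  • Computers on a NAT network can normally only be clients, not servers: the gateway has no table entry for a connection that was not started from inside, unless it has been specifically configured to forward a port.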

  • NAT networks are useful because there are a limited number of IP addresses, and IP addresses cost money.
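
    The bookkeeping done by a NAT gateway can be modelled roughly as a lookup table. The Python 3 class below is a toy model only: the class, its method names, the private addresses and the starting port 40000 are all invented for illustration, and a real gateway also tracks protocols, remote endpoints and timeouts.

```python
class NatGateway:
    """Toy model of the translation table kept by a NAT gateway."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outgoing = {}   # (private ip, private port) -> public port
        self.incoming = {}   # public port -> (private ip, private port)

    def translate_outgoing(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (private_ip, private_port)
        if key not in self.outgoing:
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outgoing[key])

    def translate_incoming(self, public_port):
        """Rewrite the destination of a reply, or None if nobody asked for it."""
        return self.incoming.get(public_port)


gateway = NatGateway("130.194.225.100")
print(gateway.translate_outgoing("192.168.0.2", 5000))
print(gateway.translate_outgoing("192.168.0.3", 5000))
print(gateway.translate_incoming(40001))
print(gateway.translate_incoming(12345))   # unsolicited packet: no mapping
```

    The last call shows why a computer behind the gateway cannot easily act as a server: a packet arriving on a port with no table entry has nowhere to go.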


    Peer-to-peer software

    Peer-to-peer software acts as both a client and server. Multiple computers cooperate to provide a service.

    Examples: Gnutella, The Circle


    Resources

  • Man pages: socket, ip, tcp

  • Network related UNIX commands: ifconfig, route, traceroute, ping, tcpdump, netstat

  • Unix Network Programming by W. Richard Stevens

  • The RFCs are the definitive but highly technical reference on internet protocols.

  • This page: http://www.csse.monash.edu.au/~pfh/network-talk/network.html