Understanding HTTP using sockets

Tue 01 Nov 2016

Creating a simple HTTP server using sockets

With the ever-increasing popularity of web-based software (web apps, microservices, REST, SOAP etc.), it's always good to think about what's going on behind the scenes. This is particularly useful for me when venturing into frameworks like Flask or Django.

So, how low shall we go? As low as the main transport protocols used in the internet stack - TCP and UDP.

TCP and UDP

Without going into too much detail, the two main standards for transporting bytes across the internet are UDP and TCP. They differ in the nature of the connection: TCP requires a handshake (a formal connection between a server and a client process) before any data is exchanged, whilst UDP is connectionless, so a client can send a message to a server with no guarantee that it arrives complete, or at all.
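To make the contrast concrete, here is a minimal sketch of a UDP exchange on the local machine. It is written for Python 3 (unlike the Python 2 listings later in this post), and the names `receiver` and `sender` are just illustrative. Note that there is no `listen`, `accept` or `connect` step: each datagram is simply addressed and sent.

```python
import socket

# UDP receiver: SOCK_DGRAM selects UDP; port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # don't block forever if the datagram never arrives
port = receiver.getsockname()[1]

# UDP sender: no handshake, just address the datagram and send it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"Hello over UDP", ("127.0.0.1", port))

data, addr = receiver.recvfrom(1024)
print("received:", data.decode("utf-8"))

sender.close()
receiver.close()
```

On the loopback interface the datagram will almost always arrive, but nothing in UDP itself promises that, which is why the receiver sets a timeout.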

For the purposes of this post we'll focus on TCP, since HTTP traffic is nearly always carried over TCP (hence the typical labelling of the internet protocol stack as TCP/IP).

So how do we transfer messages between two processes over the network using TCP? One answer - sockets.

Sockets

Sockets are the magic that interfaces between your program and the operating system. The OS provides a socket API, which can be accessed through libraries in most programming languages, so a developer can pick any - as long as it's Python.

Let's create client and server scripts that communicate over TCP using sockets.

server_tcp.py

import socket

#socket.SOCK_STREAM indicates TCP
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 12345))
serversocket.listen(1)

(clientsocket, address) = serversocket.accept()
msg = clientsocket.recv(1024)
print "server received: "+msg

client_tcp.py

import socket

#socket.SOCK_STREAM indicates TCP
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("localhost", 12345))

msg = "Hello World from client"

print "client sending: "+msg
clientsocket.send(msg)

Running the server script first and then the client script, each in its own process, results in the following:

client_output

client sending: Hello World from client

server_output

server received: Hello World from client
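If juggling two terminals is a nuisance, the same one-way exchange can be sketched in a single Python 3 process by running the server side in a thread. This is an adaptation rather than the scripts above: the helper `serve` and the list `received` are made up for the sketch, port 0 lets the OS pick a free port instead of 12345, and in Python 3 the message has to be encoded to bytes before it is sent.

```python
import socket
import threading

received = []  # filled in by the server thread


def serve(serversocket):
    # Accept one client, read one message, then close the connection.
    clientsocket, address = serversocket.accept()
    received.append(clientsocket.recv(1024).decode("utf-8"))
    clientsocket.close()


# socket.SOCK_STREAM indicates TCP
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
serversocket.listen(1)
port = serversocket.getsockname()[1]

server_thread = threading.Thread(target=serve, args=(serversocket,))
server_thread.start()

clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("127.0.0.1", port))
clientsocket.sendall("Hello World from client".encode("utf-8"))
clientsocket.close()

server_thread.join()
serversocket.close()
print("server received: " + received[0])
```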

The above example can be modified to perform two-way communication by adding a send call to the end of the server script and a recv call to the end of the client script:

server_tcp.py

print "server sending reply" 
clientsocket.send("server received your message")

client_tcp.py

msg = clientsocket.recv(1024)
print "client received: "+msg

giving the output:

client_output

client sending: Hello World from client
client received: server received your message

server_output

server received: Hello World from client
server sending reply

As may be clear from the above example, if we were to use this in anger we would soon need to define a standard way to communicate, i.e. a protocol. This would include the definition of the message type (plain text or something else?), the message length and methods to handle server/client requests (including authentication and standard error messages). This is where HTTP comes in.
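To get a feel for what "defining a protocol" means at its simplest, here is a toy sketch in Python 3 that only tackles the message-length problem: every message is prefixed with its length as a 4-byte integer, so the receiver knows how much to read. The function names `send_msg`, `recv_exact` and `recv_msg` are invented for this illustration and have nothing to do with HTTP.

```python
import socket
import struct


def send_msg(sock, text):
    # Prefix the UTF-8 payload with its length as a 4-byte big-endian integer.
    payload = text.encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    # recv() may return fewer bytes than requested, so loop until we have n.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    # Read the 4-byte length header, then exactly that many payload bytes.
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length).decode("utf-8")


# socketpair() returns two already-connected sockets, handy for a quick demo.
a, b = socket.socketpair()
send_msg(a, "Hello World from client")
send_msg(a, "second message")
first = recv_msg(b)
second = recv_msg(b)
print(first)
print(second)
a.close()
b.close()
```

Without the length prefix the two messages could arrive glued together in a single `recv` call, since TCP delivers a stream of bytes rather than discrete messages.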

Before reading on, run the server script and try opening the address in a web browser (http://localhost:12345/). You should see the same kind of server messages, this time showing the raw HTTP request sent by the browser.

HTTP

HTTP is a protocol for defining messages sent throughout the web. As suggested above, communication via HTTP is usually done using sockets and the TCP transport protocol. So if we were to modify the server example above, what would the message look like?

  1. Firstly a status line, which includes the version of HTTP and a status (if this message is a response)
  2. Header fields, including:
    • content-type: text/html or application/json (message format)
    • content-length: (length of message in bytes)
  3. Empty line
  4. Message body
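The four parts listed above can be assembled in a few lines. The following is a rough Python 3 sketch, not a complete implementation of HTTP: `build_response` is a hypothetical helper and only the two headers mentioned in the list are set.

```python
def build_response(body, status="200 OK", content_type="text/html"):
    # Encode the body first so Content-Length counts bytes, not characters.
    body_bytes = body.encode("utf-8")
    lines = [
        "HTTP/1.1 " + status,                       # 1. status line
        "Content-Type: " + content_type,            # 2. header fields
        "Content-Length: " + str(len(body_bytes)),
    ]
    # Each line ends in CRLF; one extra CRLF forms the empty line (3).
    head = "".join(line + "\r\n" for line in lines) + "\r\n"
    return head.encode("ascii") + body_bytes        # 4. message body


response = build_response("<b>Hello World</b>")
print(response.decode("utf-8"))
```

Computing the length from the encoded bytes matters as soon as the body contains non-ASCII characters, where the character count and the byte count differ.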

Time to modify the server script:

server_tcp_http.py

import socket

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 12345))
serversocket.listen(1)

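# Status line, header fields, an empty line, then the message body.
# Strictly, HTTP lines end in "\r\n"; most clients tolerate a bare "\n".
msg = """HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body>
<b>Hello World</b>
</body>
</html>
"""

(clientsocket, address) = serversocket.accept()
clientsocket.recv(1024)  # read the client's request (ignored here)
clientsocket.sendall(msg)
clientsocket.close()

As you can see, we have added some HTML to the body of the HTTP message. Running the server and client scripts gives:

client output:

client sending: Hello World from client
client received: HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body>
<b>Hello World</b>
</body>
</html>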

To give this a more authentic look, we can switch the Python client script for a web browser. Run the server script again but now use a browser to navigate to http://localhost:12345/. The browser now understands the HTTP message and renders the HTML:

[Image: HTTP server implemented using sockets]
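The same round trip can also be checked without a browser. The sketch below (Python 3, with the server in a thread and an OS-chosen port rather than 12345) swaps the browser for the standard library's `http.client`, which will only accept the reply if it parses as HTTP. The names `serve_once`, `BODY` and `RESPONSE` are made up for the example, and the Content-Length and Connection headers are additions that the script above does not send.

```python
import http.client
import socket
import threading

BODY = b"<html><body><b>Hello World</b></body></html>"
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    + b"Content-Type: text/html\r\n"
    + b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    + b"Connection: close\r\n"
    + b"\r\n"
    + BODY
)


def serve_once(serversocket):
    # Accept one client, read its request, send the canned response.
    clientsocket, address = serversocket.accept()
    clientsocket.recv(1024)  # this sketch ignores what was asked for
    clientsocket.sendall(RESPONSE)
    clientsocket.close()


serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
serversocket.settimeout(5)           # give up if no client ever connects
serversocket.listen(1)
port = serversocket.getsockname()[1]

thread = threading.Thread(target=serve_once, args=(serversocket,))
thread.start()

# http.client plays the part of the browser here.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
reply = conn.getresponse()
status = reply.status
content_type = reply.getheader("Content-Type")
html = reply.read()
conn.close()

thread.join()
serversocket.close()
print(status, content_type, html.decode("utf-8"))
```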

The server script can be modified to parse 'GET' and 'POST' requests (along with other HTTP methods), but this is beyond the scope of this post. Python provides lots of libraries to simplify communications over HTTP, with vanilla Python itself containing SimpleHTTPServer (http.server in Python 3). As such, there is seldom a need to develop web servers using sockets directly - but hey, it's interesting.
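For the curious, here is a rough idea of what the first step of that parsing might look like in Python 3. It is a simplified sketch, not what any real server does: `parse_request` is a hypothetical helper that assumes the whole request has already been read into one bytes object and does no error handling.

```python
def parse_request(raw):
    # The head (request line plus headers) ends at the first empty line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line, e.g. "GET / HTTP/1.1"
    method, path, version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so store them lower-cased.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


raw = (b"POST /greet HTTP/1.1\r\n"
       b"Host: localhost:12345\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
method, path, version, headers, body = parse_request(raw)
print(method, path, headers["content-length"], body)
```

A server could then branch on `method` and `path` to decide which response to send back.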
