Creation of a Proxy Web Server – Set 1:

Socket programming is much simpler in Python than in C. The programmer does not have to manage the low-level details of sockets and can focus on the application layer instead of the network layer. In this section, we will develop a simple multi-threaded proxy server capable of handling HTTP traffic.

We will follow three steps for implementation.

  1. Creation of incoming socket:

We will create a socket named server socket in the __init__ method of the server class. This socket accepts the incoming connections. We will then bind the socket to an address and wait for clients to connect.
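As a minimal sketch of this step, the class below creates, binds, and starts listening on a TCP socket in its constructor. The class name `Server`, the attribute `server_socket`, and the default host, port, and backlog values are assumptions made for illustration; only the port 12345 comes from the testing instructions in this article.

```python
import socket


class Server:
    """Holds the listening socket of the proxy (illustrative names)."""

    def __init__(self, host="127.0.0.1", port=12345, backlog=10):
        # TCP socket over IPv4 for the incoming client connections.
        self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow quick restarts without waiting for the old socket to time out.
        self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Bind to the chosen address and start listening for clients.
        self.server_socket.bind((host, port))
        self.server_socket.listen(backlog)
```

Passing `port=0` asks the operating system to pick any free port, which is convenient when experimenting.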


  2. Accepting clients and processing requests:

This is the most important step. We wait for a client’s connection request, and as soon as a connection is made we hand the request to a separate thread, which leaves the main loop free to accept the next one. Handling requests concurrently in this way improves the server’s throughput considerably.
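The accept loop could look roughly like the sketch below. The function names `accept_loop` and `handle_client` are hypothetical, and the handler here only echoes the data back as a placeholder; in the proxy it would be replaced by the forwarding logic described in the next step.

```python
import socket
import threading


def handle_client(conn, client_addr):
    # Placeholder handler (an assumption): echo one chunk back to the client.
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def accept_loop(server_socket, handler=handle_client):
    """Accept clients forever, serving each one in its own thread."""
    while True:
        try:
            conn, client_addr = server_socket.accept()
        except OSError:
            break  # the listening socket is no longer usable
        # A daemon thread does not keep the process alive on shutdown.
        worker = threading.Thread(
            target=handler, args=(conn, client_addr), daemon=True
        )
        worker.start()
```

One thread per connection is the simplest design; a thread pool would bound resource use under heavy load, but that is beyond this sketch.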


  3. Redirecting the traffic:

The proxy server acts as an intermediary between the client and the destination server. It fetches the data from the destination and then passes it on to the client.

  • We will first extract the URL from the received request data.
  • After this, we will find the destination address of the request. The address is a tuple of the form (destination_ip_address, destination_port_no). We will receive the data from this address.
  • We will then set up a new connection to the destination server and send it a copy of the original request. The server replies with a response. All response messages use the generic message format of RFC 822.
  • We will then redirect the server’s response to the client. The original connection to the client is conn. The response may be larger than MAX_REQUEST_LEN, the most we read in a single call, so we keep reading in a loop. An empty read marks the end of the response.
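The four bullets above can be sketched as two small functions. The names `parse_destination` and `relay`, the value of `MAX_REQUEST_LEN`, the timeout, and the default port 80 are assumptions for illustration; the sketch uses `urllib.parse.urlsplit` from the standard library to pull the host and port out of the request line rather than slicing the string by hand.

```python
import socket
from urllib.parse import urlsplit

MAX_REQUEST_LEN = 4096  # assumed size of a single read, in bytes


def parse_destination(request: bytes):
    """Return (host, port) taken from the first line of an HTTP request."""
    first_line = request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = first_line.split()
    if len(parts) < 2:
        raise ValueError("malformed request line")
    url = parts[1]
    if "://" not in url:
        url = "//" + url  # lets urlsplit treat "host:port" as a network location
    target = urlsplit(url)
    if target.hostname is None:
        raise ValueError("request line carries no host")
    return target.hostname, target.port or 80


def relay(conn, request: bytes, timeout=10.0):
    """Forward the request to its destination and stream the reply to conn."""
    host, port = parse_destination(request)
    with socket.create_connection((host, port), timeout=timeout) as upstream:
        upstream.sendall(request)
        while True:
            chunk = upstream.recv(MAX_REQUEST_LEN)
            if not chunk:  # empty read: the server closed the connection
                break
            conn.sendall(chunk)
```

Note that the loop only ends when the destination closes the connection. With keep-alive connections the read would instead hit the timeout, so a fuller implementation would need to honour `Content-Length` or chunked encoding.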

Finally, we will close the connections and handle any errors, so that a single failed request does not bring the server down.
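One possible shape for this cleanup is a per-client handler that wraps the forwarding step in `try`/`except`/`finally`. The names `handle_client` and `forward`, and the value of `MAX_REQUEST_LEN`, are hypothetical; `forward` stands for whatever function sends the request to the destination and relays the reply.

```python
import logging
import socket

MAX_REQUEST_LEN = 4096  # assumed size of a single read, in bytes


def handle_client(conn, client_addr, forward):
    """Serve one client; always close its socket, whatever happens."""
    try:
        request = conn.recv(MAX_REQUEST_LEN)
        if request:
            forward(conn, request)
    except (OSError, ValueError) as exc:
        # OSError covers socket failures and timeouts; ValueError is assumed
        # to signal a request that could not be parsed.
        logging.warning("request from %s failed: %s", client_addr, exc)
    finally:
        conn.close()
```

Because the exception is caught inside the worker, the accept loop in the main thread keeps running even when one request fails.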

Testing the server:
  1. We will start the server in a terminal and keep it running, then switch to our favorite browser.
  2. We will then go to the browser’s proxy settings and change the proxy server to ‘localhost’ and the port to ‘12345’.
  3. We will now open any website that is served over plain HTTP, and its contents should load in the browser.
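The same check can also be scripted instead of using a browser. The sketch below builds a `urllib.request` opener that sends HTTP requests through the proxy on localhost:12345; the helper names `build_proxy_opener` and `fetch_via_proxy` are assumptions, and the proxy must already be running for the fetch to succeed.

```python
import urllib.request

PROXY_URL = "http://localhost:12345"  # address used in the steps above


def build_proxy_opener(proxy_url=PROXY_URL):
    """Return an opener that routes plain-HTTP requests through the proxy."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


def fetch_via_proxy(url, proxy_url=PROXY_URL, timeout=10):
    """Fetch url through the proxy and return (status_code, body_bytes)."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read()
```

Calling `fetch_via_proxy("http://example.com/")` while the server is running should show the request arriving in the server's terminal.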

Once the server is running, we can monitor the requests that clients send through it. This data can be used to build statistics. We can also blacklist an IP address or block a website.
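A rough sketch of such monitoring and blocking is shown below. The function name `record_and_check`, the two sets, and the sample entries in them are made up for illustration (the IP address comes from a range reserved for documentation); the idea is to call the function before forwarding a request and to drop the request when it returns `False`.

```python
import threading
from collections import Counter

# Hypothetical sample entries; replace with real addresses and hosts.
BLOCKED_CLIENTS = {"203.0.113.7"}
BLOCKED_HOSTS = {"blocked.example"}

request_counts = Counter()        # requests seen per destination host
_counts_lock = threading.Lock()   # handlers run in several threads


def record_and_check(client_ip, host):
    """Count the request, then report whether it may be forwarded."""
    host = host.lower()
    with _counts_lock:
        request_counts[host] += 1
    return client_ip not in BLOCKED_CLIENTS and host not in BLOCKED_HOSTS
```

`request_counts.most_common(10)` then gives the ten most requested hosts.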

