Sunday 2 October 2011

Snooping Internet Traffic

The internet as we know it is a global system of interconnected computer networks that use a standard set of rules (called protocols). It is used by billions of users worldwide.

Everything you do on the internet, like browsing, watching videos, chat, email, etc., involves a flow of data from one machine to another and gives rise to internet traffic.
Internet traffic actually consists of discrete pieces of information called packets.

So all data that is to be transmitted over the internet (internet traffic) is first broken down into discrete chunks that are much smaller in size, and then these pieces are transmitted from the source to the destination.
The packets are transmitted according to certain protocols, like Transmission Control Protocol (TCP)/Internet Protocol (IP), User Datagram Protocol (UDP), etc., depending on the nature of the traffic. A typical packet contains perhaps 1,000 or 1,500 bytes.
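To make the idea of "breaking data into chunks" concrete, here is a minimal Python sketch. The function name packetize and the 1,500-byte limit are our own choices for illustration; real protocol stacks also add headers to every chunk, which this sketch leaves out.

```python
def packetize(data: bytes, max_payload: int = 1500) -> list[bytes]:
    """Split data into chunks of at most max_payload bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


# A hypothetical 4,000-byte message, just for illustration.
message = b"x" * 4000
chunks = packetize(message)
print([len(c) for c in chunks])  # [1500, 1500, 1000]
```

Joining the chunks back together in order gives the original message, which is roughly what the receiving side has to do.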

You can capture the internet traffic on your local machine using free utilities such as tcpdump and Wireshark.

tcpdump is a common Linux command-line utility for analyzing packets. With its help, we can intercept (snoop on) the packets going in and out of the network.

Wireshark is another packet analyzer, which is GUI based. It is free and open source and can easily be used for snooping purposes.

We ran Wireshark and tcpdump at the same time on Ubuntu to capture live traffic while we were using the internet as usual.
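If you want to try a similar capture yourself, the sketch below only builds and prints a tcpdump command line; it does not run it. The interface name eth0 and the packet count are assumptions, and this is not necessarily the exact command we used. The flags themselves are standard tcpdump options: -i selects the interface, -c stops after a number of packets, and -X prints each packet in hex and ASCII.

```python
import shlex


def tcpdump_command(interface: str = "eth0", count: int = 10) -> list[str]:
    """Build (but do not run) a tcpdump command that dumps packets in hex and ASCII."""
    return ["sudo", "tcpdump", "-i", interface, "-c", str(count), "-X"]


# Print the command so it can be pasted into a terminal.
print(shlex.join(tcpdump_command()))  # sudo tcpdump -i eth0 -c 10 -X
```

Capturing packets usually needs root privileges, which is why the command starts with sudo.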

Here is some interesting analysis:

Below you see a sample packet captured through tcpdump:


Here is the same packet with some parts highlighted:


The first line of the output is a summary of what's going on.

In the sample packet, we can see that it was sent at around 1:03 pm from the same computer that was running the script, ECELAB - PC 15, from port 38285, and that it was going to a host with IP 74.125.236.109, which is one of Google's IP addresses. (Try putting it in your browser's address bar.)

Note that instead of a number, the port of the destination host is written as "www", which translates to port number 80 (the default port for web traffic).

The rest of the top line deals with the various flags which were set and at the end, the total length of the packet is given.

The data below the top line is divided into 3 columns. The first column gives the offset (in hex) of the first block of data in the second column, counted in bytes from the start of the packet.
For example:
0x0000 - is the offset of "4500" in the above picture
0x0040 - is the offset of "772e" in the above picture

The second and third columns show the actual data that the packet is carrying, in hex and ASCII format respectively.
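The three-column layout described above can be reproduced with a few lines of Python. This is only a sketch that imitates the look of the tcpdump output (16 bytes per row, hex digits in groups of four, dots for non-printable bytes); the function name hex_dump and the sample bytes are made up for illustration.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as rows of: offset, hex in groups of 4 digits, ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        digits = row.hex()
        # Group the hex digits four at a time (two bytes per group).
        groups = " ".join(digits[i:i + 4] for i in range(0, len(digits), 4))
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"0x{offset:04x}:  {groups:<39}  {text}")
    return "\n".join(lines)


# Hypothetical sample data: a computer name followed by 12 zero bytes.
print(hex_dump(b"ADMIN-PC" + bytes(12)))
```

Each row covers 16 bytes, which is why the fifth row of a dump starts at offset 0x0040 (4 x 16 = 64 bytes in).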

Now let's try to decode select sections of the packet, to see what they mean.

In tcpdump:


The packet that we are looking at is an example of an IP (Internet Protocol) packet. IP is the principal protocol for communication across a network. A packet mainly consists of two parts:

  1. Header - Holds information about the packet, such as source, destination, protocol used, version, length, etc.
  2. Payload - The actual data the packet carries is called the payload of the packet.

The first 160 bits of an IPv4 packet are reserved for the header. (This is the minimum header size; optional fields can make it longer.)
First of all, let's remind ourselves that 2 hex digits take up 1 byte of space.
So, 160 bits = 20 bytes = 40 hexadecimal digits.
As we saw in the packet, the hexadecimal digits are grouped in "chunks" of 4 digits each, which gives us 10 chunks for the header.
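The arithmetic above can be checked against the packet itself. In IPv4, the first byte of the header holds the version in its upper 4 bits and the header length (counted in 32-bit words) in its lower 4 bits. The sketch below applies that to the "4500" chunk from our capture; the function name is our own.

```python
def ipv4_header_length(packet: bytes) -> int:
    """Return the IPv4 header length in bytes, read from the first byte."""
    first_byte = packet[0]
    version = first_byte >> 4           # upper 4 bits
    header_words = first_byte & 0x0F    # lower 4 bits, in 32-bit words
    if version != 4:
        raise ValueError(f"not an IPv4 packet (version {version})")
    return header_words * 4


header_bytes = ipv4_header_length(bytes.fromhex("4500"))
header_bits = header_bytes * 8
hex_digits = header_bytes * 2
chunks_of_four = hex_digits // 4
print(header_bytes, header_bits, hex_digits, chunks_of_four)  # 20 160 40 10
```

So the leading "45" says: version 4, header length 5 words, which is the 20 bytes (10 chunks) we worked out by hand.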

Let's decode some of the hex data:

Now look at the second half of the 5th data block ("8011") of the header, which is 11. From this byte, we can find out the protocol being used for communication. (The first half, 80, is the time-to-live field.)
Since (11)(Base 16) = (17)(Base 10),
let's see what protocol number 17 means:





Aha! It is UDP, the "User Datagram Protocol".
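The same lookup can be done in Python. This is a small sketch: the table below lists only three well-known protocol numbers (the full list is maintained by IANA), and the function name is our own.

```python
# A tiny subset of the assigned IP protocol numbers.
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def decode_ttl_protocol(chunk: str) -> tuple[int, int, str]:
    """Split a 4-digit hex chunk into time-to-live, protocol number and name."""
    ttl = int(chunk[:2], 16)
    protocol = int(chunk[2:], 16)
    return ttl, protocol, IP_PROTOCOLS.get(protocol, "unknown")


print(decode_ttl_protocol("8011"))  # (128, 17, 'UDP')
```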

Let's look at the last two data-blocks of the header.

They are: c0a8 04ff
Let's decode them two digits at a time:
(c0)(Base 16) = (192)(Base 10)
(a8) = (168)
(04) = (4)
(ff) = (255)

Look at the packet again. What do you see?
It gave us the destination IP address!
So you see, the header basically contains information about the packet.
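Here is the same conversion as a short Python sketch, using the standard ipaddress module. The helper name chunks_to_ipv4 is made up for this post.

```python
import ipaddress


def chunks_to_ipv4(*chunks: str) -> str:
    """Turn hex chunks such as 'c0a8', '04ff' into a dotted IPv4 address."""
    raw = bytes.fromhex("".join(chunks))
    if len(raw) != 4:
        raise ValueError("an IPv4 address is exactly 4 bytes (8 hex digits)")
    return str(ipaddress.IPv4Address(raw))


print(chunks_to_ipv4("c0a8", "04ff"))  # 192.168.4.255
```

The two chunks just before these (the 7th and 8th of the header) hold the source IP address and can be decoded in exactly the same way.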

Similarly, in Wireshark, we can see the source IP portion of the packet by clicking on the required field in the second pane of the window:

Now, let's look at the payload part of the packet.

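If you look at the packet, you'll see that the port being used is "netbios-dgm" (the NetBIOS Datagram Service, port 138).
What is this NetBIOS? It is a networking service that Windows uses for, among other things, file sharing and for finding other computers on the local network.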

You know, when you click on the network shares and a bunch of PCs pop up which are currently on the wifi network which you're on? Yeah, that's NetBios in action.
We can actually see that in action in this packet. Look at the ASCII part of the packet. We can see "ADMIN-PC" written there, which is the Computer Name of someone on the network.
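Spotting readable text in a payload can also be automated. The sketch below pulls out every run of 4 or more printable ASCII characters, much like the Unix strings tool. The payload bytes here are invented for illustration (only the name "ADMIN-PC" comes from our capture), and the function name is our own.

```python
import re


def printable_strings(payload: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(payload)]


# Hypothetical payload: binary bytes mixed with two readable names.
payload = bytes([0x00, 0x11, 0x02]) + b"ADMIN-PC" + bytes([0x00, 0x01]) + b"WORKGROUP" + bytes(3)
print(printable_strings(payload))  # ['ADMIN-PC', 'WORKGROUP']
```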

There you go, that was the basic anatomy of an IP Packet.

Let's see some other packets which caught our attention while snooping on the network:





Look at that! So that's what goes on when we are googling!

A jpeg file being transferred:



Bash Script

Let's look at the script we prepared for our project.

Traffic Snooper

Go on, try it! You can run "./project.sh -h" to see a list of usage options and some explanation. Read it and take a look at the comments to find out what actually goes on in there.

Here it is in action:


And here is what we got:




Well, that's it! I hope you enjoyed reading this post and gained some valuable insight into the world of web snooping :)

Sources:

http://en.wikipedia.org/wiki/Internet
http://computer.howstuffworks.com/question525.htm

http://tutorials.papamike.ca/pub/tcpdump.html

http://en.wikipedia.org/wiki/Network_packet

http://en.wikipedia.org/wiki/IPv4#Header

http://www.grc.com/port_138.htm

http://www.tldp.org/LDP/abs/html/



Apoorv Singh - 2011028
Divyanshu Bansal - 2011045
