Routers keep this mapping in a list for future lookup. The chain keeps on going, with each step out to the wider network adding enough information so the network knows how to handle your packet at each step. Packets hop between routers from the originating machine until they get to where they should be going, or otherwise you get an error returned to you. When a router receives a packet, it takes a look at the target IP address to see if it is an address it should know about. If it is, then it will try to connect to the machine and forward the packet. If it isn’t, then the packet will be forwarded to another router higher up in the hierarchy to deal with.
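To make the router’s decision a bit more concrete, here is a minimal sketch in Python. The routing table, the network ranges and the hop descriptions are all made up for illustration; a real router holds far more detail, but the "do I know this address, or do I pass it up?" logic is roughly this shape.

```python
import ipaddress

# Hypothetical routing table: the networks this router knows about,
# and what it does with packets addressed to each of them.
ROUTES = {
    ipaddress.ip_network("192.168.1.0/24"): "deliver locally",
    ipaddress.ip_network("10.0.0.0/8"): "forward to office router",
}

# Anything the router does not recognise goes to the next router up.
DEFAULT_ROUTE = "forward to upstream router"


def next_hop(destination: str) -> str:
    """Decide what to do with a packet addressed to `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return DEFAULT_ROUTE
    # If several networks match, prefer the most specific one.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]
```

Calling `next_hop("192.168.1.42")` picks the local network, while an address the router has never heard of falls through to the upstream router.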
All internet computers will have an IP address, and this is what is used to pass the packets of data around. But wait, no one is typing dotted quads in to surf the net, right? Because we are rubbish at remembering numbers, there’s an aliasing system in place that links a friendly name to the IP address. This is the Domain Name System, or DNS for short, and the service is offered by DNS servers. These are machines on the internet that can turn easy-to-remember human names into actual IP addresses.
This list could get pretty long, so there’s a whole chain of DNS machines in place so one DNS machine doesn’t have to remember every single machine on the internet. Your router will know the nearest DNS to you. On the DNS machine itself, if it doesn’t know the friendly name you’re looking for, it will pass it on to a higher level DNS machine, which will either know the mapping or will ask a machine that is higher up in the chain to do the lookup. The DNS lookup will cascade up the chain until the name is found; if no machine in the chain knows it, you will get an error: host could not be resolved. The chaining of DNS machines is pretty clever, and is classic divide and conquer. Your nearest DNS machine knows the machines near it, the DNS’s DNS will know the DNSs in the region, and the regional DNS will know the national DNS, which knows the international DNS and so on. It’s a little more complicated than I have described but you get the idea.
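The cascade can be sketched as a toy in Python. This models only the simplified picture above, not how real DNS servers talk to each other; the `ToyResolver` class, the `printer.home` name and the addresses (taken from ranges reserved for documentation) are invented for the example, and the caching step is an assumption of the sketch.

```python
class ToyResolver:
    """A pretend DNS machine that knows a few names and one parent."""

    def __init__(self, known, parent=None):
        self.known = dict(known)  # friendly name -> IP address
        self.parent = parent      # the DNS machine higher up, if any

    def resolve(self, name):
        if name in self.known:
            return self.known[name]
        if self.parent is None:
            # Top of the chain and still no answer.
            raise LookupError(f"host could not be resolved: {name}")
        address = self.parent.resolve(name)
        self.known[name] = address  # remember it for next time
        return address


# Hypothetical three-level chain: local -> regional -> national.
national = ToyResolver({"www.catpictures.com": "203.0.113.10"})
regional = ToyResolver({}, parent=national)
local = ToyResolver({"printer.home": "192.168.1.20"}, parent=regional)
```

Asking `local` for `www.catpictures.com` walks up two levels before an answer comes back, and each machine on the way keeps a copy so the next lookup is quicker. Asking for a name nobody knows raises the "could not be resolved" error at the top of the chain.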
It’s pretty much the same as the way routing works: a service knows about its own little corner of the network in detail, and also knows a more widely connected service it can forward requests on to. You can think of it as a local post office sending letters locally, where the postie knows who lives at each house and can deal with local mail easily. For letters going further afield, the local post office will send them to the regional post office, which may send them to a different local post office for delivery to a house, or may forward them to the national post office if they are going further still. The national post office would forward them to another region’s post office, and then in turn to the other local post office. Note that your letter isn’t copied to every local post office just on the off chance that it might get lucky and reach the right address. The internet is segmented, so packets of data are not broadcast or forwarded any further than the local segment, or a single route out on to another segment. Otherwise the whole net would collapse in a packet storm with everyone talking to everyone on the network, and most of the packets being discarded as not for this address.
When you connect to www.catpictures.com, your computer first needs to get the IP address of the machine that provides that website. Your router doesn’t know the address, so it will ask the router that it is connected to (your ISP’s), which in turn may ask another router until the friendly name can be mapped to an actual IP address. Your computer will then send a request to get the default page of the website. It will pump out a packet which will contain your local IP address and your MAC, plus the request itself (plus some other networky stuff). Your router will look at it, realise that this packet needs forwarding on, and then pass it to the “bigger” router that it is connected to. Your packet will hop, router to router, until it gets to the destination. On the way, it will pick up a little fluff from each hop, so that when there is a reply, it can be routed back the way it came, and back to you.
And that is how the internet used to work. Remember that TCP/IP was originally designed for the American military on their own networks, so everybody could pretty much see what everyone else was up to, if they had a mind to. Not a problem when all the networking gear was internal to the military, and by default it was trusted not to be up to no good.
While network devices usually only listen for messages they have business knowing about (these will have the network device’s own IP and MAC on them so they can be recognised), they can be instructed to enter promiscuous mode. This sounds a lot more exciting than it actually is; all it means is that the network device will pick up any packet that goes by on the local network segment, so your communications can be eavesdropped.
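As a rough illustration, the difference between normal and promiscuous mode comes down to one check. The `ToyNetworkCard` class and the MAC addresses below are invented for the sketch; real network hardware does this filtering in silicon, and I am assuming the usual convention that a card also accepts frames sent to the broadcast address.

```python
from dataclasses import dataclass

# Frames sent to this address are meant for everyone on the segment.
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass
class ToyNetworkCard:
    mac: str
    promiscuous: bool = False

    def accepts(self, frame_destination_mac: str) -> bool:
        """Would this card pick up a frame with the given destination?"""
        if self.promiscuous:
            return True  # listen to everything going past
        destination = frame_destination_mac.lower()
        return destination in (self.mac.lower(), BROADCAST_MAC)


# Hypothetical cards on the same local segment.
polite = ToyNetworkCard(mac="aa:bb:cc:00:00:01")
nosy = ToyNetworkCard(mac="aa:bb:cc:00:00:02", promiscuous=True)
```

A frame addressed to `aa:bb:cc:00:00:03` is ignored by `polite` but picked up by `nosy`, which is all that eavesdropping on a local segment amounts to.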
However, once the protocol was opened up to the great unwashed, and people started trying to make money off the ‘net, it was realised that there needed to be some privacy and secrecy to protect the transactions. At this point the HyperText Transfer Protocol, HTTP (which sits on top of TCP/IP), was extended to include HTTPS, a secure protocol. You’ll see this in your browser at the start of the address of the website you’re browsing: https://. The internet was not really designed with security in mind, so a lot of solutions to provide privacy and trust between communications have been shoehorned on top as a result.
When your browser connects to an https-enabled website, there’s a clever little handshake that takes place where the server uses a certificate to vouch that it is the website you think it is. These certificates are issued by a Certificate Authority (CA) in order to establish trust. A CA can and does revoke these certificates, although it might take a day or two for the revocation to be fully propagated globally. The handshake takes a few steps. First, your computer will send information about what sorts of encryption it can handle; there are many different implementations of encryption, and both you and the website need to agree on what to use at the outset. The next step is where the server establishes its bona fides, and sends the certificate and some other info back to your machine. The certificate carries the server’s public key; the matching private key is never sent. Public and private keys are parts of asymmetric cryptography and we’ll get on to that in a bit. For now, your machine has to decide if it trusts the certificate it’s just been sent, and it will either do that outright if you’ve OK’d it on your computer, or it will check with the CA to see if it is OK. Certificates can also expire, so this is checked too. In some instances, for extra security, the website will also ask that you send your own certificate so both parties trust each other.
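If you want to watch the handshake happen, Python’s standard `ssl` module will do all of the steps above for you. This is only a sketch: the helper names `make_context` and `fetch_certificate_summary` are my own, the port and timeout are assumptions, and you need a working internet connection to actually call the second function.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    # The default context trusts the CAs installed on your machine,
    # insists on a valid certificate and checks the hostname matches.
    return ssl.create_default_context()


def fetch_certificate_summary(hostname: str, port: int = 443) -> dict:
    """Do a TLS handshake with `hostname` and report what was agreed."""
    context = make_context()
    with socket.create_connection((hostname, port), timeout=10) as raw:
        # The handshake happens here; it raises an error if the
        # certificate is expired, untrusted or for a different site.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),   # e.g. which TLS version
                "cipher": tls.cipher()[0],   # the agreed encryption
                "issuer": cert.get("issuer"),
                "expires": cert.get("notAfter"),
            }
```

Calling `fetch_certificate_summary` with the name of any https website should come back with the agreed encryption, who issued the certificate and when it expires.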
Your machines will then agree on the encryption key that will be used to encrypt and decrypt all communications between the two machines from now on. Your computer generates a random key and encrypts it with the public key you got from the website. Only the holder of the website’s private key can decrypt this message. At this point you and the website are the only parties who know this encryption key, and it will be discarded when your session with the website is closed. Every piece of data that is sent or received will be encrypted with this key, but it is important to realise that the routing information that helps the packet of data get around the internet is still readable. Your data (the onion core) is protected, however. Anyone eavesdropping on your conversation cannot read what you’re sending, but they will know who you are talking to.
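Here is a toy version of that key swap in Python, purely to show the shape of the idea. It uses textbook RSA with laughably small numbers and a simple XOR scramble in place of a real cipher, so it is in no way secure; the key values and function names are assumptions made for the example, and real websites use properly sized keys (newer versions of the protocol also agree the key in a different way).

```python
import secrets

# Toy key pair built from the primes 61 and 53. NOT secure.
PUBLIC_N, PUBLIC_E = 3233, 17  # the website hands these out freely
PRIVATE_D = 2753               # the website keeps this to itself


def wrap_key(session_key: int) -> int:
    """Your computer: encrypt the session key with the public key."""
    return pow(session_key, PUBLIC_E, PUBLIC_N)


def unwrap_key(wrapped: int) -> int:
    """The website: recover the session key with the private key."""
    return pow(wrapped, PRIVATE_D, PUBLIC_N)


def scramble(data: bytes, session_key: int) -> bytes:
    """Toy symmetric cipher: XOR with the key. Run twice to undo."""
    key_bytes = session_key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(data))


# Your computer picks a random session key and sends it wrapped.
session_key = secrets.randbelow(PUBLIC_N - 2) + 2
wrapped = wrap_key(session_key)

# The website unwraps it; both ends now share the same key.
server_key = unwrap_key(wrapped)
ciphertext = scramble(b"GET /index.html", session_key)
plaintext = scramble(ciphertext, server_key)
```

An eavesdropper sees `wrapped` and `ciphertext` go past, but without the private key cannot get back to the session key (at least not with real-sized numbers).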