📶

4. Networking

👉
Become job-ready by solving real-world challenges and build your professional cybersecurity skills with the National Cyber League!

TCP/IP Model

The Internet as we know it is powered by a host of networking protocols and technologies. The most fundamental of these, the ones that deliver us the latest tweets and cat photos, can be described with the TCP/IP model of networking. The TCP/IP Model is one of two popular models (the other being the OSI Model) that describe how different networking protocols work with each other to provide connectivity and data transfer.

In the TCP/IP Model, there are 4 layers of abstraction:

  1. Link Layer - responsible for handling the transmission of data between computers that are directly connected, physically or wirelessly.
  2. Internet Layer - responsible for routing data between devices on the Internet which are not directly connected (this is "IP").
  3. Transport Layer - responsible for delivering data to the right application on a device and, in the case of TCP, for ensuring that the delivery is reliable (this is "TCP").
  4. Application Layer - the actual data being transferred between applications.
TCP/IP model

The reason why we have these different layers of abstraction is to allow others to build upon existing and widely adopted technology to quickly create new applications without having to start from scratch. Think of the TCP/IP Model and the overall Internet as a giant postal system.

Now imagine you want to send a box of cookies to your friend Alex. Alex lives a 10-hour drive away, so rather than delivering the box yourself, you use the postal system. Your only responsibilities are to package the cookies, write Alex's name and address on the package, and drop the package off at the post office. The postal system takes care of everything else. You don't have to worry about which specific post offices your package goes through; you just want it to arrive at Alex's house safely.

The Internet works in much the same way: you tell your computer that you want to send some data somewhere else on the Internet and it just goes. Applying the postal system analogy to the TCP/IP Model:

  • Link Layer - the mail trucks moving the package between post offices
  • Internet Layer - the street address of the addressee building
  • Transport Layer - the name/unit number of the person you’re addressing
  • Application Layer - the actual message being sent

This practice is called "encapsulation", whereby your data is "contained" by each outer layer: a Link layer packet holds an Internet layer packet, an Internet layer packet holds a Transport layer packet, and a Transport layer packet holds the Application layer data. The diagram below visualizes how data goes from a computer to a server through a router and a datacenter. At the router and the datacenter, packets are read only at their Link layer and Internet layer; the router and datacenter don't care about the Transport layer or Application layer. This is very much like the postal system: each distribution center only reads the address you're sending to and puts the package on a mail truck that gets it closer to that address. It doesn't care who you're sending it to or what the contents of your mail might be.

TCP/IP model encapsulation
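
The encapsulation described above can be sketched in a few lines of Python. This is a toy model, not a real network stack: the `Layer` class, the header fields, and the hop names `home-router` and `isp-router` are all made up for illustration. It shows a router replacing the Link layer wrapper while leaving everything inside the Internet layer untouched.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer's wrapper: a header plus whatever it carries inside."""
    name: str
    header: dict
    payload: object  # the next layer up, or raw bytes at the top


def encapsulate(message: bytes) -> Layer:
    """Wrap an application message in Transport, Internet, and Link layers."""
    application = Layer("application", {}, message)
    transport = Layer("transport", {"dst_port": 80}, application)
    internet = Layer("internet", {"dst_ip": "128.8.28.36"}, transport)
    link = Layer("link", {"next_hop": "home-router"}, internet)
    return link


def router_forward(frame: Layer, next_hop: str) -> Layer:
    """A router reads only the Link and Internet layers, then re-wraps.

    The Transport and Application layers pass through unopened.
    """
    packet = frame.payload            # strip the old Link layer
    assert packet.name == "internet"  # the router looks no deeper than this
    return Layer("link", {"next_hop": next_hop}, packet)


frame = encapsulate(b"GET /cats.jpg")
forwarded = router_forward(frame, "isp-router")
print(forwarded.header)          # the Link header changes at every hop
print(forwarded.payload.header)  # the Internet header stays the same
```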

Internet Protocol

The base unit of data on the Internet is a "packet", which is just a small amount of raw data along with some metadata, such as its source and destination. You can imagine this as analogous to a letter that you send in the mail, which contains its contents as well as shipping information.

In IP, or the "Internet Protocol", there are two kinds of IP addresses, IPv4 and IPv6. We'll focus on IPv4 addresses for now. An IP address is how computers identify each other on the Internet; you can imagine it as the street address of the computer.

An IP packet contains a source and a destination IP address, much like the "From" and "To" addresses you write on an envelope. The source IP address lets the receiving computer know where to send packets back for further communication. IP addresses are also what allow network routers, the devices that pass packets along toward their destination, to determine how to route those packets.
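
To make the "From" and "To" addresses concrete, here is a minimal sketch that packs and reads back a 20-byte IPv4 header using Python's standard `struct` and `socket` modules. The helper names `build_header` and `read_addresses` are made up for this example, the source address `192.0.2.10` is a placeholder from a range reserved for documentation, and the checksum is left at zero, so this is not a header you could send on a real network.

```python
import socket
import struct

# IPv4 header layout (20 bytes, no options), in network byte order:
# version/IHL, DSCP/ECN, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(src: str, dst: str, payload_len: int) -> bytes:
    """Pack a minimal IPv4 header (checksum left as 0 for simplicity)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of five 32-bit words
    return IPV4_HEADER.pack(
        version_ihl, 0, IPV4_HEADER.size + payload_len,
        0, 0,
        64,  # time to live
        6,   # protocol number 6 means the payload is TCP
        0,   # checksum, not computed in this sketch
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def read_addresses(header: bytes):
    """Return the (source, destination) addresses, as a router would read them."""
    fields = IPV4_HEADER.unpack(header[:IPV4_HEADER.size])
    return socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9])


header = build_header("192.0.2.10", "128.8.28.36", payload_len=100)
print(len(header))             # 20
print(read_addresses(header))  # ('192.0.2.10', '128.8.28.36')
```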

You can imagine network routers as the postal distribution hubs where mail gets put on large trucks and sent to a distribution hub closer to the destination. The Internet also works in hubs, with large geographic regions serviced by certain routing hubs. Organizations that manage these hubs are designated as an autonomous system (AS) by their Regional Internet Registry and are given an autonomous system number that will refer to the list of IP addresses that the organization manages. Some larger organizations may have multiple AS numbers, especially if they manage many different IP addresses whose devices are geographically far apart.

Just as the postal system uses zip codes to group mailing addresses into a general geographic region, AS numbers are used to group lists of IP addresses under the organization that manages them, often within a general geographic region. For example, the zip codes 20001 and 20002 represent addresses in parts of Washington, DC and the AS number AS27 represents IP addresses at the University of Maryland.

These groupings can be further broken down into smaller segments. In the postal system, ZIP+4 codes identify smaller geographic regions inside a zip code. For example, 20001-4427 refers specifically to a single block on 9th Street in DC. In the IP system, groups of IP addresses can be referred to by their prefix using CIDR notation. These groups are called CIDR blocks or IP address blocks, and an AS may consist of one or many of these CIDR blocks. For example, the 128.8.0.0/16 block is one of several blocks of IP addresses operated in AS27.
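
As a quick illustration of CIDR notation, Python's standard `ipaddress` module can check whether an address falls inside a block. The `/16` means the first 16 bits of the address (here `128.8`) are the shared prefix, leaving 16 bits for individual addresses. The address `128.9.0.1` and the `/24` sub-block are picked only as examples.

```python
import ipaddress

block = ipaddress.ip_network("128.8.0.0/16")
address = ipaddress.ip_address("128.8.28.36")
outside = ipaddress.ip_address("128.9.0.1")

print(address in block)     # True: the first 16 bits (128.8) match
print(outside in block)     # False: the prefix is 128.9, not 128.8
print(block.num_addresses)  # 65536, which is 2 ** (32 - 16)

# A block can be split into smaller blocks, much like ZIP+4 inside a zip code.
smaller = ipaddress.ip_network("128.8.28.0/24")
print(smaller.subnet_of(block))  # True
print(address in smaller)        # True
```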

Term Table

TCP/IP        TCP/IP Example    USPS              USPS Example
IP Address    128.8.28.36       Street Address    999 9th St NW
Port          TCP Port 80       Unit Number       #200
AS Number     AS27              Zip Code          20001
CIDR Block    128.8.0.0/16      ZIP+4             20001-4427

Internet Connections

So how does this all help us view pictures or videos online? These technologies are working constantly without us knowing, and that's kind of the point: if we all needed in-depth network engineering knowledge to use the Internet, it wouldn't be so ubiquitous and easily accessible. So let's follow the path a packet has to take to get from your browser to the web server hosting your favorite cat photos.

  1. First, your browser will hand its request to your computer.
  2. Then, your computer will use its WiFi chip to send a wireless signal to your home WiFi router.
  3. Next, your home router will send the data packets through the cables that leave your house, whether they run along utility poles or are buried in the ground.
  4. Through those cables, your data packets will arrive at your residential Internet Service Provider (ISP), which will route your data to the next appropriate ISP.
  5. If the pictures are being hosted by a company, then the data packets will very likely arrive at a business ISP. Business ISPs are similar to residential ISPs such as Verizon, Comcast, or Spectrum, but they generally serve only business customers and provide a higher level of service guarantee (e.g. a guarantee of no significant outages).
  6. From the business ISP, your data packets will eventually enter the datacenter that physically houses the server.
  7. At the datacenter, your packets will be routed to the destination server and then processed by the server.
  8. To make the return journey, data packets sent by the server will take the same or a similar path back to your home computer, and your browser will finally render the picture.
The path of a data packet between you and a remote server

Advanced Info

TCP

The two most common transport layer protocols are TCP and UDP. TCP stands for the "Transmission Control Protocol" and its key feature is that it provides what is called reliable service. With TCP, data packets that are lost and not delivered to the intended destination are re-transmitted to ensure that they are delivered. TCP accomplishes this by numbering each packet and ensuring that for every packet it sends out, it receives an "ACK" or acknowledgement packet back from the receiver.

In the postal analogy, this is like having tracking on your packages, where you can be notified whether your package has been delivered or not. To do this, TCP has to set up a "connection" in order to keep track of all the packets it sends out and receives. To start a connection between two computers:

  1. Computer A will send a "SYN" or synchronize packet to computer B.
  2. Upon receiving the "SYN" packet, computer B will reply with a "SYN" of its own along with an "ACK" for the "SYN" from step 1. In practice both are carried in a single packet, commonly called a "SYN-ACK".
  3. When computer A receives the "SYN" from computer B, computer A will reply with an "ACK" to computer B indicating that it has received the "SYN".

At this point, both computers have synchronized and acknowledged each other. This process is called the "TCP 3-way handshake" and it marks the beginning of a connection. No application layer data is exchanged in this handshake process, but once the handshake is complete, the two computers can send each other data packets and communicate freely.

TCP connection open handshake
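
The handshake above can be traced with a small simulation. This is only a sketch of the bookkeeping, not real networking code: the `Segment` class and the starting numbers 1000 and 5000 are made up, and the second step is modeled as a single packet carrying both the SYN and ACK flags. Each side acknowledges the other's SYN by replying with that SYN's number plus one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int                   # the sender's own counter
    ack: Optional[int] = None  # the next number expected from the other side


def three_way_handshake(a_start: int, b_start: int):
    """Return the three packets exchanged when A opens a connection to B.

    a_start and b_start are the initial counter values each side picks.
    """
    syn = Segment("A", "SYN", seq=a_start)
    # B acknowledges A's SYN and sends its own SYN in the same packet.
    syn_ack = Segment("B", "SYN+ACK", seq=b_start, ack=syn.seq + 1)
    # A acknowledges B's SYN, completing the handshake.
    ack = Segment("A", "ACK", seq=a_start + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(a_start=1000, b_start=5000):
    print(segment)
```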

Each packet that is sent through a TCP connection has an identifier that is used in the corresponding "ACK" packet. For example, if a packet with the ID "x" is sent out, the sender expects an "ACK" packet back with the ID counter incremented by 1, which means that the data packet was successfully delivered. (This is a simplification: in real TCP the identifier is a sequence number that counts bytes, so the acknowledged number advances by the amount of data received.) If the sender does not receive an "ACK" packet back, it presumes that either the data packet or the "ACK" packet was lost. Either way, the sender re-transmits the data packet and awaits an "ACK" from the receiver to ensure that the packet was successfully delivered.

TCP data packet flow
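
The send, wait for "ACK", and re-transmit loop can be sketched as a small simulation. It is deliberately simplified: it sends one packet at a time, the function name `send_reliably` and the `drop_attempts` parameter are invented for this example, and it only models lost data packets (a lost "ACK" would also require the receiver to discard duplicates, which is not shown).

```python
def send_reliably(packets, drop_attempts, max_tries=5):
    """Deliver packets one at a time, resending each until it is acknowledged.

    drop_attempts is a set of (packet_id, attempt) pairs that the simulated
    network loses, which keeps the run deterministic.
    Returns (delivered, log), where log records every transmission.
    """
    delivered, log = [], []
    for packet_id, data in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            log.append(("send", packet_id, attempt))
            if (packet_id, attempt) in drop_attempts:
                continue  # no ACK arrives, so the sender times out and retries
            delivered.append(data)
            log.append(("ack", packet_id + 1))  # ACK names the next expected ID
            break
        else:
            raise TimeoutError(f"packet {packet_id} was never acknowledged")
    return delivered, log


# The first attempt at packet 1 is lost, so it is sent twice.
delivered, log = send_reliably(["a", "b", "c"], drop_attempts={(1, 1)})
print(delivered)
for entry in log:
    print(entry)
```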

This ID increases with each packet sent, and two counters are kept for each TCP connection, with each counter tracking the packets sent in one direction. In the picture above, the ID in the orange PSH packets keeps track of the packets sent from the client to the server, and the ID in the purple PSH packets keeps track of the packets sent from the server to the client.

Data packets are often, though not always, sent as a "PSH" or push packet. The "PSH" flag informs the receiving computer that it should "push" the data received so far to the application. This is handled by the protocol, and users of the protocol generally do not have to worry about setting this particular flag on the packet.

At the end of a TCP connection, the connection must be closed so that both computers can stop keeping track of the IDs sent and the "ACK" packets received. To do so, the two computers perform a handshake similar to the initial "SYN" 3-way handshake, except that closing a connection uses "FIN" or finish/end packets instead of "SYN" packets.

TCP connection closing handshake

Finally, TCP has another kind of packet called the "RST" or reset packet. A "RST" packet is a really aggressive way to terminate a connection. When a "RST" packet is used, there are no "ACK" or "FIN" packets; the one "RST" packet gets sent out and that's it. It is a very succinct message that terminates the connection on both sides immediately, without a handshake. A computer often sends a "RST" packet when it receives a TCP packet outside of a TCP connection, e.g. a stray packet or a severely delayed packet, to notify the other side to cease communication.

UDP

UDP, on the other hand, is a sharp contrast to TCP. UDP stands for "User Datagram Protocol" and the difference is in the name itself. UDP is a "datagram"-centric protocol, meaning that you should treat each UDP datagram as a single, self-contained message. This is because UDP has none of the reliability guarantees of TCP: if you send out a UDP packet and it is lost, you'll never know unless the receiving end explicitly informs the sender. UDP does not have a 3-way handshake to start a connection and it does not have "ACK" packets to acknowledge receipt; it's simply fire and forget.

UDP data packet flow
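
Here is a minimal sketch of "fire and forget" using Python's standard `socket` module, with both ends on the same machine (the loopback address `127.0.0.1`) so that it does not depend on a real network. Note that there is no connection setup at all: the sender simply hands a datagram to the network. The message `b"hello"` and the 2-second timeout are arbitrary choices.

```python
import socket

# Receiver: bind to any free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connect(), no handshake, just send a datagram and move on.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
bytes_sent = sender.sendto(b"hello", address)

# sendto() reports how many bytes were handed to the network, not whether
# they arrived; the sender gets no acknowledgement either way.
try:
    data, source = receiver.recvfrom(1024)
except socket.timeout:
    data = None  # a lost datagram simply never shows up
finally:
    sender.close()
    receiver.close()

print(bytes_sent, data)
```

On the loopback interface the datagram will normally arrive, but nothing in UDP itself tells the sender so. Any confirmation would have to be built by the application.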

TCP vs UDP

So why would anyone use UDP over TCP if UDP can suffer from packet loss? Well, the short answer is that sometimes you just don't care if you lose some packets. Because UDP doesn't have the handshakes or the "ACK" packets, it uses significantly less bandwidth than TCP: fewer packets sent, less data used, simple enough. So in a situation where you don't care about packet loss and want to use as little bandwidth as possible (or to be bandwidth efficient for high bandwidth use cases), UDP is a great choice.
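
A rough back-of-the-envelope comparison makes the difference visible. The sketch below uses the minimum header sizes of each protocol (20 bytes for TCP, 8 bytes for UDP) and a deliberately simplified model of TCP: a 3-packet opening handshake, one "ACK" per data packet, and a 3-packet close. Real TCP stacks often acknowledge several packets at once and may close with four packets, so treat the numbers as illustrative only. The function names and the 10 messages of 100 bytes are made up for the example.

```python
TCP_HEADER_BYTES = 20  # minimum TCP header size
UDP_HEADER_BYTES = 8   # fixed UDP header size


def tcp_cost(messages: int, payload_bytes: int):
    """Packets and bytes for a simplified TCP exchange (see assumptions above)."""
    control_packets = 3 + messages + 3  # open handshake, ACKs, close handshake
    packets = control_packets + messages
    total_bytes = packets * TCP_HEADER_BYTES + messages * payload_bytes
    return packets, total_bytes


def udp_cost(messages: int, payload_bytes: int):
    """Packets and bytes for the same messages sent as UDP datagrams."""
    return messages, messages * (UDP_HEADER_BYTES + payload_bytes)


print(tcp_cost(10, 100))  # (26, 1520)
print(udp_cost(10, 100))  # (10, 1080)
```

These figures count only transport layer headers and payload; the Internet and Link layer headers add the same per-packet cost to both protocols.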

Another great use case for UDP is streaming video/audio. Whenever you're on Discord voice chatting with your friends, you're actually sending your voice over UDP. This is because a few packets lost in a voice call is not a big deal: if you didn't hear the other person, just ask them to repeat. Most of the time, the loss of 1 or 2 packets means that just a fraction of a second is cut out and you'll still hear the majority of the words or sentence spoken. That is preferable to randomly hearing voice clips from several seconds earlier.

Like voice data, streaming video is also well suited to UDP, again because some packet loss is acceptable. The loss of 1 packet might just mean that a few pixels didn't make it. Video generally has at least 24 frames per second, meaning that the screen updates 24 times for each second of video. If you lose 1 packet, that's only a few pixels for 1/24 of a second (about 42 milliseconds). The average viewer is unlikely to notice, and the video player can also apply clever techniques to fill in those pixels from previous frames, making the effects of the packet loss virtually unnoticeable. The key benefit of using UDP for video streaming is that video streaming uses a lot of bandwidth, so the video provider can save bandwidth by not having to exchange "ACK" packets as it would with TCP.

Finally, it's important to note that while TCP and UDP both have port numbers, the same port number does not necessarily point to the same application in both protocols. For example, a web server might be listening on TCP port 80; it accepts new connections solely on that TCP port and not on UDP port 80. If you tried to reach the web server via UDP port 80, it would not work because the web server is not listening there. Some applications are explicitly designed to support both TCP and UDP, e.g. DNS servers often listen on both TCP port 53 and UDP port 53. In the case of the DNS server, supporting both was the choice of the application developer rather than something provided by the networking protocols themselves.
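
The separation between TCP ports and UDP ports can be seen with a short sketch using Python's standard `socket` module on the loopback address. It opens a TCP listener on a free port chosen by the operating system and then binds a UDP socket to the same port number. The second bind is expected to succeed because the two protocols keep separate sets of ports, although it could still fail in the unlikely case that another program already holds that UDP port.

```python
import socket

# A TCP listener on a free loopback port chosen by the operating system.
tcp_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_listener.bind(("127.0.0.1", 0))
tcp_listener.listen()
port = tcp_listener.getsockname()[1]

# A UDP socket bound to the same port number. This does not collide with
# the TCP listener, and nothing sent to one is delivered to the other.
udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_socket.bind(("127.0.0.1", port))
udp_port = udp_socket.getsockname()[1]

print(f"TCP port {port} and UDP port {udp_port} are open at the same time")

udp_socket.close()
tcp_listener.close()
```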

By the Way, What is the Cloud?

More and more, we hear the word "cloud" in our technology vocabulary, e.g. iCloud or Google Cloud. The "cloud", put in the simplest terms, is just servers (or services) that you’re renting from someone else. Before we had the cloud, companies wishing to run servers had to buy their own servers and put them into a datacenter somewhere and pay a datacenter operator for the Internet connectivity and electricity.

With the advent of cloud technology, the cloud operators purchase servers, obtain Internet connectivity, pay for electricity, and set things up physically in a datacenter. Then a company, or an individual, can ask to rent some computing capacity from that cloud operator, and the cloud operator will bill the customer the appropriate rental fees. Just as we don't need to be network engineers to use the Internet, cloud technology means that we don't need to be datacenter experts to start hosting our own servers on the Internet.
