Deep Packet Inspection (DPI) is used for in-depth analysis of the packets sent over the Internet. Every parcel of digital information, including the email you send, the VoIP calls you make, and the websites you load, is transmitted across the web in a formatted piece of structured data known as a “packet.” Inside this packet is structured metadata that ensures your data is routed to the proper destination.
Analyzing these packets is a process known as deep packet inspection (DPI), and the practice is employed daily by enterprise companies, internet service providers (ISPs), and media companies.
What is Deep Packet Inspection?
Internet traffic is composed of small bundles of data known as packets. Packets wrap digital information in a bundle of metadata that identifies the traffic source, destination, content, and other valuable details. Analyzing digital traffic is a lot like analyzing automobile traffic: patterns reveal useful insights. By studying metadata such as headers with Deep Packet Inspection (DPI), network specialists can learn how best to optimize servers to reduce overhead, detect and deter hackers, combat malware, gather statistical information, and glean intimate details about user behavior, all while performing traffic management.
As mentioned before, DPI is mostly used by ISPs, media companies, and enterprises. To use an analogy: if packets are mail, ISPs are the postal service, and they have access to unencrypted web traffic as well as packet metadata like headers. For instance, ISPs often use DPI to measure data usage, enforce data limits, throttle bandwidth, comply with regulations, prioritize traffic, balance load, and collect statistical data from their subscribers. This provides ISPs with an abundance of useful information, and the companies leverage access to user data in a number of ways. Most ISPs in the United States are allowed to turn user data over to law enforcement agencies. Additionally, many ISPs use consumer data to target advertising, analyze file sharing habits, and tier service access and speeds.
DPI allows us to inspect packets beyond the header and footer. It can dig deeper and extract granular information, such as the application to which the packet belongs and the packet content.
DPI strips the header and footer from the packet and inspects the payload to perform signature matching, looking for specific strings and other details.
There are several methods that DPI uses to perform the inspection. Popular ones include port-based, statistical, and automaton-based approaches. The port-based approach is the standard protocol identification technique: it inspects the port fields in the TCP/UDP headers and compares them with the port numbers commonly assigned to each protocol. Statistical analysis focuses on classifying the traffic rather than inspecting the payload, and gathers generic information such as packet lengths and port numbers to do so. The automaton-based approach is the widely preferred pattern/regular expression matching technique, which uses a finite state machine for the pattern matching. The machine has the following states: an initial state, an acceptance state for matched patterns, and intermediate states for partial matches. Matching begins in the initial state when a payload string enters the automaton engine, and if the process reaches the acceptance state, a match has been found.
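As a rough illustration of the first and last methods above, the following Python sketch pairs a port lookup with a small finite state machine that scans a payload for one fixed byte pattern. The port table, the function names, and the sample pattern are assumptions made for this example; real engines compile many regular expressions into much larger state machines.

```python
# Hypothetical table of well-known ports, for illustration only.
WELL_KNOWN_PORTS = {25: "SMTP", 53: "DNS", 80: "HTTP", 443: "HTTPS"}


def classify_by_port(dst_port):
    """Port-based identification: consult only the TCP/UDP port field."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


def build_dfa(pattern):
    """Build a state machine that accepts any payload containing `pattern`.

    State k means "the last k bytes match the first k bytes of the pattern".
    State 0 is the initial state, len(pattern) is the acceptance state, and
    the states in between represent partial matches.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    alphabet = set(pattern)
    dfa = [{} for _ in pattern]
    dfa[0][pattern[0]] = 1
    fallback = 0  # state to resume from after a mismatch
    for state in range(1, len(pattern)):
        for byte in alphabet:
            dfa[state][byte] = dfa[fallback].get(byte, 0)
        dfa[state][pattern[state]] = state + 1
        fallback = dfa[fallback].get(pattern[state], 0)
    return dfa


def payload_matches(dfa, payload):
    """Feed the payload through the machine one byte at a time."""
    state, accept = 0, len(dfa)
    for byte in payload:
        state = dfa[state].get(byte, 0)  # bytes outside the pattern reset to 0
        if state == accept:
            return True
    return False


dfa = build_dfa(b"abab")
print(classify_by_port(443))
print(payload_matches(dfa, b"xxaababxx"))
```

Each payload byte causes exactly one state transition, which is why this style of matching suits inspection at line rate.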
Now that you grasp the idea of DPI, let’s understand how DPI works and how DPI technology has evolved.
OSI Model and Flow of Data Packets
To understand how DPI works and how this technology has evolved, we need to understand how a data packet flows through the OSI protocol stack.
As per the OSI model, the communication system between the sender and receiver of a network packet is partitioned into seven layers:
- The Application Layer – Responsible for interacting with the application software.
- The Presentation Layer – Responsible for compression, encryption, and formatting of data being presented.
- The Session Layer – Responsible for creating, managing and ending a session’s communication.
- The Transport Layer – Responsible for sequencing and delivery of data.
- The Network Layer – Responsible for the addressing and routing of the network packets.
- The Data Link Layer – Responsible for formatting the packet as per the medium of transmission of packets.
- The Physical Layer – Responsible for defining the actual media and characteristics of the transmitted data.
When we type a URL in the address bar, the data typically flows through the OSI protocol stack in the following way:
- We type the URL in the address bar of the browser. The application layer interacts with the corresponding software, the web browser. The browser makes an HTTP request to access the web page from the web server. The request is passed through the next layer of the OSI model – the presentation layer.
- The presentation layer is concerned with the actual format of data being presented. When the browser receives the data from the web server, the presentation layer presents it in a proper format like JPEG, MPEG, MOV, HTML etc. This layer can also encrypt and compress the data.
- The next layer is the session layer. This layer is responsible for creating, managing and ending the session’s communication between the sender and receiver of the data. The session layer, the presentation layer, and the application layer are mainly responsible for composing the payload of a packet.
- The transport layer deals with the sequencing and delivery of the data. It segments the data into packets, sequences the packets, establishes a connection between the source and destination of the packets, and then sends them on through the next layer of the OSI model. Note: the transport layer is not concerned with managing and ending sessions. It only handles the connection between the sender and the receiver of the data.
- The network layer is responsible for the addressing and routing of the network packets. It deals with how the network packets will travel from one part of the network to the other. However, it is not concerned with whether the packets received are error free. The transport layer takes care of that.
- The data link layer formats the packets as per the medium used for transmitting them, for example a wireless medium or an Ethernet connection.
- The physical layer does not change the actual data of the packets. It defines the actual media and characteristics of the transmitted data. The physical layer, the data link layer, the network layer, and the transport layer are mainly responsible for composing the headers of network packets.
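To make the split between headers and payload concrete, here is a minimal Python sketch that peels the data link, network, and transport headers off a raw frame and returns what is left as the payload. It assumes an Ethernet II frame carrying IPv4 and TCP; the function name and the returned dictionary keys are chosen for this example only.

```python
import socket
import struct


def parse_frame(frame):
    """Split an Ethernet II frame carrying IPv4/TCP into headers and payload."""
    # Data link layer: destination MAC, source MAC, EtherType (14 bytes).
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("only IPv4 is handled in this sketch")

    # Network layer: IPv4 header. The low nibble of byte 0 is the header
    # length in 32-bit words; bytes 2-3 hold the total datagram length.
    ip = frame[14:]
    ip_header_len = (ip[0] & 0x0F) * 4
    total_len = struct.unpack("!H", ip[2:4])[0]
    ttl, protocol = ip[8], ip[9]
    if protocol != 6:
        raise ValueError("only TCP is handled in this sketch")
    ip = ip[:total_len]  # drop any Ethernet padding after the datagram

    # Transport layer: TCP header. The high nibble of byte 12 is the header
    # length in 32-bit words.
    tcp = ip[ip_header_len:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    tcp_header_len = (tcp[12] >> 4) * 4

    return {
        "src_mac": src_mac.hex(),
        "dst_mac": dst_mac.hex(),
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "ttl": ttl,
        "src_port": src_port,
        "dst_port": dst_port,
        # Everything after the TCP header was composed by the upper layers.
        "payload": tcp[tcp_header_len:],
    }
```

Shallow inspection would stop at the address and port fields in this dictionary, while deeper forms of inspection go on to read `payload`.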
Types of Packet Inspections
Initially, packet inspection was used in traditional firewalls. They would use this technology to monitor and filter packets for network security. Later, this technology gradually evolved to Deep Packet Inspection. Now, DPI is widely used in modern next-generation firewalls for enhancing network security, though the usage of DPI is not at all limited to that. It is widely used for content optimization, network and subscriber analysis and content regulation.
The three types of packet inspection:
- Shallow Packet Inspection
- Medium Packet Inspection
- Deep Packet Inspection
Shallow Packet Inspection
Shallow Packet Inspection is widely used in traditional firewalls. It works mainly in the first three layers of the OSI model. This technology examines mainly the headers of the network packets to decide on whether the packet should be passed or should be dropped.
Shallow packet inspection mainly observes the source and destination IP addresses, the number of packets the message is broken into, the total number of hops in routing the packet, and the synchronization data for reassembling the packets to decide whether the packet should be processed further.
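A header-only decision of this kind can be sketched in a few lines of Python. The rule set below is hypothetical (the networks are reserved documentation ranges), and a real firewall would evaluate many more header fields; the point is only that the payload is never consulted.

```python
import ipaddress

# Hypothetical rule set; both networks are reserved documentation ranges.
BLOCKED_SOURCES = [ipaddress.ip_network("203.0.113.0/24")]
PROTECTED_DESTINATIONS = [ipaddress.ip_network("192.0.2.0/24")]


def shallow_verdict(src_ip, dst_ip, ttl):
    """Return "pass" or "drop" using header fields only, never the payload."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if ttl <= 0:
        return "drop"  # the packet has used up its hop budget
    if any(src in net for net in BLOCKED_SOURCES):
        return "drop"  # source address is on the block list
    if not any(dst in net for net in PROTECTED_DESTINATIONS):
        return "drop"  # not addressed to a network behind this firewall
    return "pass"


print(shallow_verdict("198.51.100.7", "192.0.2.10", ttl=57))
print(shallow_verdict("203.0.113.9", "192.0.2.10", ttl=57))
```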
Medium Packet Inspection
Medium Packet Inspection is widely used in application proxies. These devices examine the packet headers and a limited amount of the packet payload. That information is then matched against a pre-loaded parse list, which system administrators can easily update. A parse list permits or denies specific packet types based on their data format and associated location on the Internet, rather than on their IP addresses alone.
Medium packet inspection technology can look into the presentation layer of the packet’s payload, which enables it to detect certain file formats. Using medium packet inspection devices, administrators can thus prevent client computers from receiving, for example, Flash files from YouTube or image files from social networking sites. Medium packet inspection can even prioritize some packets based on the associated application commands and file formats of the data. It can dig into the packet to identify the application protocol commands associated with it and then permit or deny the packet based on that information.
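The parse-list idea can be sketched as follows, assuming the device already knows which host a response came from. The inspection limit, the deny list entries, and the host names are hypothetical; the file signatures ("magic numbers") are the standard leading bytes of each format.

```python
INSPECT_LIMIT = 16  # look at only the first few payload bytes

# Leading bytes that identify a file format.
MAGIC_NUMBERS = [
    (b"FWS", "flash"),            # uncompressed SWF
    (b"CWS", "flash"),            # zlib-compressed SWF
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF8", "gif"),
]

# Hypothetical parse list: (site, format) pairs the administrator denies.
DENY_LIST = {("youtube.com", "flash"), ("social.example", "jpeg")}


def detect_format(payload):
    """Guess the file format from the start of the payload, or return None."""
    head = payload[:INSPECT_LIMIT]
    for magic, name in MAGIC_NUMBERS:
        if head.startswith(magic):
            return name
    return None


def medium_verdict(host, payload):
    """Deny when the format and its origin site both match a parse-list entry."""
    fmt = detect_format(payload)
    for site, denied_format in DENY_LIST:
        from_site = host == site or host.endswith("." + site)
        if from_site and fmt == denied_format:
            return "deny"
    return "permit"


print(medium_verdict("www.youtube.com", b"CWS\x0a" + b"\x00" * 32))
print(medium_verdict("www.youtube.com", b"<html>hello</html>"))
```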
Medium packet inspection was quite an advancement over shallow packet inspection. The problem with this technology is that it scales poorly, which limits its usefulness to a large extent.
Medium packet inspection technology can look into the payload of the packets only up to a certain extent, so medium packet inspection devices have only limited application awareness. Something more was needed.
Deep Packet Inspection
Deep Packet Inspection technology evolved for that purpose. It looks into the payload of the packets and can identify the origin and content of each packet to make further decisions.
Deep packet inspection devices use expressions to define patterns of interest in network data streams. They can handle packets based on specific patterns present in the payload.
A DPI device can therefore look into the payload of all the data packets passing through it in real time. That means a DPI device can look inside all the traffic from a specific IP address, pick out all the HTTP traffic, capture all the traffic that is meant for or coming from a specific mail server, and reassemble those emails as the user types them out.
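As a simplified illustration of expression-based matching, the Python sketch below picks the HTTP requests out of the traffic sent by one watched IP address. It assumes packets arrive as `(source_ip, payload)` pairs with each request fitting in a single payload; the expressions and the function name are invented for this example.

```python
import re

# Patterns of interest, written as regular expressions over raw payload bytes.
HTTP_REQUEST = re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) (\S+) HTTP/1\.[01]\r\n")
HOST_HEADER = re.compile(rb"\r\nHost:[ \t]*([^\r\n]+)", re.IGNORECASE)


def http_requests_from(packets, watched_ip):
    """Yield (method, host, path) for each HTTP request sent by watched_ip."""
    for src_ip, payload in packets:
        if src_ip != watched_ip:
            continue
        request = HTTP_REQUEST.match(payload)
        if request is None:
            continue  # not an HTTP request, ignore it
        host = HOST_HEADER.search(payload)
        yield (
            request.group(1).decode(),
            host.group(1).decode() if host else None,
            request.group(2).decode(),
        )


packets = [
    ("10.0.0.5", b"GET /inbox HTTP/1.1\r\nHost: mail.example\r\n\r\n"),
    ("10.0.0.9", b"GET /other HTTP/1.1\r\nHost: www.example\r\n\r\n"),
    ("10.0.0.5", b"\x16\x03\x01\x00\x05hello"),
]
print(list(http_requests_from(packets, "10.0.0.5")))
```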
Who deep packet inspection affects
Beyond enterprise and SMB companies, DPI primarily affects:
- Media companies. Media companies have a storied history of consolidation. When ISPs buy media companies, they combine broadcast data with digital data to determine everything from television and web programming to corporate and consumer internet service prices.
- Law enforcement agencies. It is legal, and sometimes required, for ISPs to gather and share DPI-derived data in cases involving intellectual property violation and drug and human trafficking.
- Consumers. Most consumers are aware that, love it or lump it, personal data is for sale. Most consumers are likely unaware that their ISP is probably analyzing, anonymizing, and reselling personal browsing data to advertising companies.
Why deep packet inspection matters
Although packet sniffing is an old tactic, the sheer scale of connected devices makes DPI more relevant today than ever before. DPI is relevant for three primary reasons:
- The scale of connectivity. The Internet today, particularly mobile, is more important now to more people for more reasons than ever before. Every company and organization relies on network inspection technology to optimize traffic, reduce overhead, and fend off cyber-attacks. DPI isn’t the only line of defense, but for many organizations, scanning and analyzing packets is the first line of defense.
- IoT market. Like the mobile market before it, IoT is booming. More and more devices are connected to the Internet every day, which increases concern about their repeated exploitation for DDoS attacks. Many contemporary IoT devices lack the standard firmware and security standards that could protect them from being lassoed into a zombie botnet. DPI can help shield ISPs and networks from IoT DDoS attacks and help security analysts learn more about critical IoT security flaws.
- Privacy concerns. DPI helps media companies and ISPs learn about customers in previously unimaginable ways. Every page you load and every piece of communication you send is filtered and routed through an ISP. No longer “dumb pipes,” internet service providers are vertically integrating with media companies (the Comcast NBC/Universal and AT&T Time Warner mergers are two examples) and leveraging their data to target consumers with advertising and to assist law enforcement agencies with intelligence gathering. ISPs use DPI to analyze consumer behavior on the Internet and sell personal browsing data to marketing and advertising companies, a practice that raises concerns about consumer privacy. DPI can also give security agencies the means for unauthorized surveillance of a user’s activity, and governments can use it to restrict users from accessing content that runs against their agenda.
Applications of Deep Packet Inspection
Deep Packet Inspection technology has several applications. Some major applications include:
It is widely used in next-generation firewalls to monitor and filter traffic on a per-application basis instead of a per-port basis, which makes it easier to troubleshoot network problems.
A Deep Packet Inspection device can detect and filter a wide range of malware, including trojans, viruses, spyware, adware, and other malicious applications. It does so mainly through the following approaches:
- URL Detection – Deep packet inspection devices can compare incoming and embedded URLs against a database of known malicious websites.
- Object Detection – Deep packet inspection devices can look into the traffic to search for potentially harmful executables and objects and then analyze them to detect malware.
- Signature Detection – Deep packet inspection devices can look into the payload of data packets to search for the signatures of known malware. Signature matching is done against a database of known malware signatures, which is usually kept up to date with the help of security service providers.
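The three approaches can be combined in a single check, sketched below in Python. The host list, the signature name, and the signature bytes are placeholders invented for this example (a real device would load them from a vendor-maintained database); the executable markers are the standard leading bytes of Windows PE and ELF files.

```python
from urllib.parse import urlsplit

# Placeholder databases; real ones come from a security service provider.
MALICIOUS_HOSTS = {"malware.example"}
EXECUTABLE_MAGIC = (b"MZ", b"\x7fELF")  # Windows PE and ELF file headers
SIGNATURES = {"demo-trojan": bytes.fromhex("deadbeef4142")}


def inspect(url, body):
    """Return a list of findings for one requested URL and its response body."""
    findings = []
    host = urlsplit(url).hostname
    if host in MALICIOUS_HOSTS:                 # URL detection
        findings.append("url:" + host)
    if body.startswith(EXECUTABLE_MAGIC):       # object detection
        findings.append("object:executable")
    for name, signature in SIGNATURES.items():  # signature detection
        if signature in body:
            findings.append("signature:" + name)
    return findings


print(inspect("http://malware.example/setup.exe", b"MZ\x90\x00" + bytes.fromhex("deadbeef4142")))
print(inspect("http://www.example/index.html", b"<html></html>"))
```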
DPI devices can look into the traffic to search for requested URLs and block URLs which are potentially harmful or inappropriate.
Protocols and Application Recognition
DPI can look into the traffic to distinguish between email services, including IMAP, POP3, and SMTP. It can identify protocols like HTTP, FTP, and TCP. It can also look into the payload of data traffic to detect certain content types such as Flash, YouTube, and Windows Media. It can identify a wide variety of tunneling, session, peer-to-peer, messaging, and VoIP protocols so that it can route the data for further processing.
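One simple way to recognize a protocol regardless of the port it runs on is to match the first bytes of a flow against known opening sequences. The table below is a small, hand-picked sample and the function name is an assumption for this sketch; commercial classifiers track far more protocols and also use flow state.

```python
import re

# Opening byte sequences that are characteristic of each protocol.
PROTOCOL_PATTERNS = [
    ("HTTP", re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/\d\.\d")),
    ("SMTP", re.compile(rb"^(EHLO|HELO) \S+", re.IGNORECASE)),  # client greeting
    ("POP3", re.compile(rb"^\+OK")),                            # server greeting
    ("IMAP", re.compile(rb"^\* OK")),                           # server greeting
    ("SSH", re.compile(rb"^SSH-\d\.\d+-")),                     # version banner
    ("TLS", re.compile(rb"^\x16\x03[\x00-\x04]")),              # handshake record
]


def identify_protocol(first_bytes):
    """Return the first protocol whose pattern matches the start of the flow."""
    for name, pattern in PROTOCOL_PATTERNS:
        if pattern.match(first_bytes):
            return name
    return "unknown"


print(identify_protocol(b"EHLO client.example\r\n"))
print(identify_protocol(b"\x16\x03\x01\x02\x00\x01"))
```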
DPI can be used to maintain QoS (Quality of Service) for the end users. It can be used to differentiate between different types of traffic and to prioritize or throttle down those different types of traffic to maintain basic QoS.
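Prioritization of this kind is often modeled as a priority queue: once DPI has labeled a packet with a traffic class, the class decides how soon the packet leaves the queue. The class names and priority values in this Python sketch are hypothetical.

```python
import heapq
import itertools

# Hypothetical traffic classes; a lower number is served first.
PRIORITY = {"voip": 0, "web": 1, "p2p": 2}
LOWEST = max(PRIORITY.values()) + 1  # used for unclassified traffic


class QosQueue:
    """Release packets in priority order, first-in first-out within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves order

    def enqueue(self, traffic_class, packet):
        priority = PRIORITY.get(traffic_class, LOWEST)
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = QosQueue()
for traffic_class, packet in [("p2p", "a"), ("web", "b"), ("voip", "c"), ("voip", "d")]:
    queue.enqueue(traffic_class, packet)
print([queue.dequeue() for _ in range(len(queue))])
```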
Billing and Metering of Traffic
DPI can be used by Internet Service Providers to offer subscribers different levels of access in terms of usage, data limits, bandwidth, and so on. It can also be used to comply with certain traffic regulations, to prioritize traffic, and to balance load.
Sometimes Internet Service Providers use DPI to gather statistical information about their subscribers. For example, ISPs can gather information on the web browsing habits of their subscribers and later use it to enhance marketing revenues.
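The data-limit side of metering can be pictured with a small per-subscriber counter. This is a sketch under assumed names and an assumed quota; actual billing systems are considerably more involved.

```python
from collections import defaultdict


class UsageMeter:
    """Count bytes per subscriber and flag those who exceed a data limit."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.used = defaultdict(int)

    def record(self, subscriber, packet_len):
        """Add one packet to the subscriber's total and return the action."""
        self.used[subscriber] += packet_len
        if self.used[subscriber] > self.quota_bytes:
            return "throttle"
        return "normal"


meter = UsageMeter(quota_bytes=3000)  # tiny quota so the example trips it
for length in (1500, 1500, 1500):
    print(meter.record("subscriber-42", length))
```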
Application Distribution and Load Balancing
DPI can be used to look into the packet content and then redirect packets to different destinations for the purpose of load balancing and fault tolerance.
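A content-aware balancer might, for instance, read the path in an HTTP request line and send the request to a pool of servers dedicated to that kind of content. The pool names, path prefixes, and the `down` parameter in this Python sketch are all hypothetical.

```python
import zlib

# Hypothetical backend pools keyed by URL path prefix.
POOLS = {"/video/": ["video-1", "video-2"], "/api/": ["api-1"]}
DEFAULT_POOL = ["web-1", "web-2"]


def choose_backend(request_line, client_ip, down=frozenset()):
    """Pick a server from the pool matching the request path.

    Servers listed in `down` are skipped, which gives basic fault tolerance.
    """
    path = request_line.split(b" ")[1].decode()
    pool = DEFAULT_POOL
    for prefix, servers in POOLS.items():
        if path.startswith(prefix):
            pool = servers
            break
    healthy = [server for server in pool if server not in down]
    if not healthy:
        raise RuntimeError("no healthy server for " + path)
    # Hash the client address so one client keeps landing on the same server.
    return healthy[zlib.crc32(client_ip.encode()) % len(healthy)]


print(choose_backend(b"GET /api/users HTTP/1.1", "10.0.0.1"))
print(choose_backend(b"GET /video/a.mp4 HTTP/1.1", "10.0.0.1", down={"video-1"}))
```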
DPI can be used to examine the traffic and to block content that is potentially harmful or unlawful.
DPI can be used to look into the packet content and automatically detect and block unauthorized sharing of copyrighted content, including music or video files.
DPI serves many purposes, from network security to potentially being a real threat to privacy. But wait! What about encryption? We are seeing a huge push to encrypt all Internet traffic, so won’t that hinder DPI? The newer version of the web protocol, HTTP/2, in practice requires encrypted communications to work at all, because major browsers support it only over TLS.
HTTPS does encrypt the connection, but your browser still has to make DNS requests, which are sent primarily via UDP and in cleartext, so that data can be collected, as can any unencrypted links or cookies sent incorrectly without HTTPS. These additional bits may be very telling about what type of content you are looking at. That is why you should encrypt your DNS.
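To show how little effort it takes to read an unencrypted DNS request, here is a minimal Python sketch that extracts the queried name from the payload of a UDP packet sent to a DNS server. It assumes a standard query with an uncompressed name; the function name is chosen for this example.

```python
import struct


def dns_query_name(udp_payload):
    """Return the domain name asked for in a plain DNS query."""
    # The 12-byte DNS header starts with ID, flags, and question count.
    _query_id, flags, question_count = struct.unpack("!HHH", udp_payload[:6])
    if flags & 0x8000 or question_count == 0:
        raise ValueError("not a DNS query")

    # The name follows the header as length-prefixed labels ending in a 0 byte.
    labels = []
    pos = 12
    while True:
        length = udp_payload[pos]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        labels.append(udp_payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


# A hand-built query for the A record of example.com.
query = (
    struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    + b"\x07example\x03com\x00"
    + struct.pack("!HH", 1, 1)
)
print(dns_query_name(query))
```

Anyone on the network path can run the same few lines, which is the reason encrypted DNS transports exist.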
So, are we safe from DPI? The short answer is no. These companies aren’t so quick to get rid of DPI; rather, they may use a technology called SSLBump, which acts as a man-in-the-middle for DPI. For instance, your workplace impersonates the encrypted site you are visiting. Because it controls your computer, you don’t even know it is happening. It then decrypts your Internet traffic, runs DPI on it, and re-encrypts it before sending it on to the Internet.
These tools were originally developed for detecting bad things on a network. There are better methodologies for detecting bad things on a network than DPI/SSLBump, and they are also better in terms of privacy, though companies don’t seem willing to give up DPI.
This article is from ThePrivacyMachine and is published with authorization. It does not represent Gigalight Community's position. Before reproducing it, please contact the original author.