OpenVPN is a widely used VPN solution. Because of its popularity and the confidentiality it provides, the Chinese authorities have started blocking OpenVPN connections.
Since the content of the OpenVPN handshake packets is encrypted, we believe that the Great Firewall (GFW) recognizes the connection by analyzing packet size and frequency patterns.
We are trying to find a way to bypass this statistical analysis. We also want to implement a system that filters and modifies network packets in real time.
There's a virtual network device driver in the Linux kernel called tun. It lets a user-space program read and write network packets.
OpenVPN itself relies on the tun driver to redirect network traffic so that it can encrypt it. When sending network packets, OpenVPN changes the routing table so that network traffic (except for the traffic to the VPN server itself) is routed to a tun device. OpenVPN reads the packets from that device, encrypts them, and sends them to the server through the normal network interface. Receiving network packets works similarly.
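
As a rough illustration of how a user-space program attaches to a tun device, here is a minimal Python sketch. The constants come from `<linux/if_tun.h>`; the helper names (`build_ifreq`, `open_tun`, `read_packets`) and the device name `tun0` are our own assumptions, and opening the device requires root or `CAP_NET_ADMIN`.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>.
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001    # layer-3 (IP) device, as opposed to IFF_TAP
IFF_NO_PI = 0x1000  # do not prepend the 4-byte packet-info header


def build_ifreq(name: str, flags: int = IFF_TUN | IFF_NO_PI) -> bytes:
    """Pack the ifreq buffer passed to the TUNSETIFF ioctl."""
    encoded = name.encode("ascii")
    if len(encoded) > 15:
        raise ValueError("interface name must be at most 15 bytes")
    # 16-byte name, 16-bit flags, then padding to 40 bytes
    # (the size of struct ifreq on 64-bit Linux).
    return struct.pack("16sH22x", encoded, flags)


def open_tun(name: str = "tun0") -> int:
    """Open /dev/net/tun and attach the file descriptor to `name`."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    except OSError:
        os.close(fd)
        raise
    return fd


def read_packets(fd: int):
    """Yield raw IP packets routed to the device, one per read."""
    while True:
        yield os.read(fd, 65535)
```

Each `os.read` on the descriptor returns one whole IP packet that the kernel routed to the device, and an `os.write` of a raw IP packet injects it back into the network stack.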
This gives us the idea of implementing another virtual network device driver. It works like the tun driver: we can change the routing table to redirect traffic to it, and then read and modify the packets passing through.
We've patched OpenVPN version 2.3.1. In this patched version, the server side closes the connection as soon as the handshake is finished, and the client side stops sending packets at the same time. This way we can capture only the packets sent during the handshake phase and try to find a pattern.
We came up with the following ways to change the statistical pattern:
- Craft packets with a wrong checksum and insert them among the normal IP packets sent by the client.
- Insert IP packets with a carefully chosen TTL value: large enough to pass through GFW, but small enough that the packet is dropped before it reaches the OpenVPN server.
- Insert other no-op UDP/TCP packets that the OpenVPN server will simply ignore.
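
The first two ideas can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names are hypothetical, the "wrong checksum" is taken to mean the UDP checksum (a bad IP header checksum would likely be dropped by the first router), and the addresses and TTL in the usage note are placeholders.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp_packet(src: str, dst: str, sport: int, dport: int,
                     payload: bytes, ttl: int = 64,
                     corrupt_checksum: bool = False) -> bytes:
    """Build a raw IPv4/UDP packet, optionally with a bad UDP checksum."""
    src_b = socket.inet_aton(src)
    dst_b = socket.inet_aton(dst)
    udp_len = 8 + len(payload)

    # The UDP checksum covers a pseudo-header, the UDP header and the payload.
    pseudo = src_b + dst_b + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_len)
    udp_header = struct.pack("!HHHH", sport, dport, udp_len, 0)
    udp_sum = internet_checksum(pseudo + udp_header + payload) or 0xFFFF
    if corrupt_checksum:
        # Pick a different, non-zero value (zero would mean "no checksum").
        udp_sum = udp_sum % 0xFFFF + 1
    udp_header = struct.pack("!HHHH", sport, dport, udp_len, udp_sum)

    # 20-byte IPv4 header without options; the checksum is filled in afterwards.
    ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + udp_len, 0, 0,
                            ttl, socket.IPPROTO_UDP, 0, src_b, dst_b)
    ip_sum = internet_checksum(ip_header)
    ip_header = ip_header[:10] + struct.pack("!H", ip_sum) + ip_header[12:]
    return ip_header + udp_header + payload
```

For example, `build_udp_packet("10.8.0.2", "192.0.2.1", 40000, 1194, b"decoy", ttl=5)` yields a decoy that expires after five hops, and passing `corrupt_checksum=True` yields one that the receiving host should discard. The right TTL depends on where GFW sits on the path, which we would have to measure.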
We can do this in kernel space or user space. However, since performance is not our concern for now, we will probably do it in user space.
We need to read the tun driver source code and get familiar with the IP packet structure.
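
To make the IP packet structure concrete, the fixed 20-byte part of an IPv4 header can be decoded as below. This is a minimal sketch: the `IPv4Header` type and `parse_ipv4_header` function are names we made up for illustration, and IP options are not decoded.

```python
import socket
import struct
from typing import NamedTuple


class IPv4Header(NamedTuple):
    version: int
    header_len: int  # in bytes, including any options
    total_len: int   # header plus payload, in bytes
    ttl: int
    protocol: int    # 6 = TCP, 17 = UDP
    checksum: int
    src: str
    dst: str


def parse_ipv4_header(packet: bytes) -> IPv4Header:
    """Decode the fixed part of an IPv4 header from a raw packet."""
    if len(packet) < 20:
        raise ValueError("packet shorter than the minimum IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    version = ver_ihl >> 4
    if version != 4:
        raise ValueError(f"not an IPv4 packet (version {version})")
    return IPv4Header(version, (ver_ihl & 0x0F) * 4, total_len, ttl,
                      protocol, checksum,
                      socket.inet_ntoa(src), socket.inet_ntoa(dst))
```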
We expect this system to work with any application-layer protocol, so we need to define a language for expressing the filtering and modification rules.
We may start by reading the L7 filter (a layer-7 content filtering system) to find some useful hints.
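
We have not designed the rule language yet. Purely as a thinking aid, the sketch below shows one shape that rules could take once parsed: a match on a few header fields plus an action that maps one packet to zero or more packets. Every name here (`Rule`, `apply_rules`, `pass_through`, `drop`, `duplicate`) is hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Callable, List, Optional

# An action turns one packet into zero or more packets to send.
Action = Callable[[bytes], List[bytes]]


def pass_through(packet: bytes) -> List[bytes]:
    return [packet]


def drop(packet: bytes) -> List[bytes]:
    return []


def duplicate(packet: bytes) -> List[bytes]:
    return [packet, packet]


@dataclass(frozen=True)
class Rule:
    action: Action
    protocol: Optional[int] = None  # IP protocol number; None matches any
    dst_port: Optional[int] = None  # TCP/UDP destination port; None matches any

    def matches(self, packet: bytes) -> bool:
        if len(packet) < 20:
            return False
        header_len = (packet[0] & 0x0F) * 4
        protocol = packet[9]
        if self.protocol is not None and protocol != self.protocol:
            return False
        if self.dst_port is not None:
            # Ports exist only for TCP (6) and UDP (17).
            if protocol not in (6, 17) or len(packet) < header_len + 4:
                return False
            (port,) = struct.unpack("!H", packet[header_len + 2:header_len + 4])
            if port != self.dst_port:
                return False
        return True


def apply_rules(packet: bytes, rules: List[Rule]) -> List[bytes]:
    """Apply the first matching rule; unmatched packets pass unchanged."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action(packet)
    return [packet]
```

A first-match policy keeps the behavior easy to reason about. A real rule language would also need to match on payload content, which is where the L7 filter may offer hints.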
There are many tools that help people deal with network traffic. We can capture and analyze traffic using tcpdump and wireshark, and filter and change the flow of packets with iptables. The goal of our system is to modify network traffic in real time according to the given rules. We think the system we are designing can be applied to many scenarios in network traffic analysis and protocol implementation.
However, our design is limited in that we can only change traffic on the client side. We don't know whether GFW will one day start filtering the response packets from the server side.
What's more, we are unable to fully test this system against GFW, since GFW does not block every OpenVPN connection. Thus we cannot tell whether we have succeeded in tricking GFW or whether GFW simply did not block our connection during testing.