patch-2.4.0-test6 linux/Documentation/networking/ethertap.txt

diff -u --recursive --new-file v2.4.0-test5/linux/Documentation/networking/ethertap.txt linux/Documentation/networking/ethertap.txt
@@ -1,88 +1,262 @@
-Documentation on setup and use of EtherTap.
+Ethertap programming mini-HOWTO
+-------------------------------
 
-Contact Jay Schulist <jschlst@turbolinux.com> if you
-have questions or need futher assistance.
+The ethertap driver was written by Jay Schulist <jschlst@turbolinux.com>;
+contact him for further information. This document was written by
+bert hubert <bert.hubert@netherlabs.nl>. Updates are welcome.
 
-Introduction
-============
+What ethertap can do for you
+----------------------------
 
-Ethertap provides packet reception and transmission for user
-space programs. It can be viewed as a simple Ethernet device,
-which instead of receiving packets from a network wire, it receives
-them from user space.
+Ethertap allows you to easily run your own network stack from userspace.
+Tunnels can benefit greatly from this. You can also use it to do network
+experiments. The alternative would be to use a raw socket to send data and
+libpcap to receive it. Ethertap replaces both with a single device and can
+also do ARP for you if you want.
 
-Ethertap can be used for anything from AppleTalk to IPX to even 
-building bridging tunnels. It also has many other general purpose
-uses.
+The more technical blurb:
 
-Ethertap also can do ARP for you, although this is not enabled by
-default.
+Ethertap provides packet reception and transmission for user space programs.
+It can be viewed as a simple Ethernet device which, instead of receiving
+packets from a network wire, receives them from user space.
 
-SetUp
-=====
+Ethertap can be used for anything from AppleTalk to IPX to even building
+bridging tunnels. It also has many other general purpose uses.
 
-First you will have to enable Ethertap in the kernel configuration.
-Then you will need to create any number of ethertap device files,
-/dev/tap0->/dev/tap15. This is done by the following command.
+Configuring your kernel
+-----------------------
 
-mknod /dev/tap* c 36 16  ( 17 18 19 20 for tap1,2,3,4...)
+Firstly, you need this in Code maturity level options:
 
-** Replace * with the proper tap device number you need. **
+	#
+	# Code maturity level options
+	#
+	CONFIG_EXPERIMENTAL=y
 
-Now with your kernel that has ethertap enabled, you will need
-to ifconfig /dev/tap* 192.168.1.1 (replace 192.168.1.1 with the
-proper IP number for your situation.)
+Then, in Networking Options, you need Netlink support:
 
-If you want your Ethertap device to ARP for you would ifconfig
-the interface like this: ifconfig tap* 192.168.1.1 arp
+	CONFIG_NETLINK=y
 
-Remember that you need to have a corresponding /dev/tap* file
-for each tap* device you need to ifconfig.
+This allows the kernel to exchange data with userspace applications. There
+are two ways of doing this. The new way works with netlink sockets, and I
+have no experience with that yet. ANK uses it in his excellent iproute2
+package; see for example rtmon.c. iproute2 can be found on
+ftp://ftp.inr.ac.ru/ip-routing/iproute2*
 
-Now Ethertap should be ready to use.
+The new way is partly described in netlink(7), available on
+http://www.europe.redhat.com/documentation/man-pages/man7/netlink.7.php3
 
-Diagram of how Ethertap works. (Courtesy of Alan Cox)
-====================================================
+There is also a Netlink-HOWTO, available on http://snafu.freedom.org/linux2.2/docs/netlink-HOWTO.html
+Sadly I know of no code using ethertap with this new interface.
 
-This is for a tunnel, but you should be able to
-get the general idea.
+The older way works by opening character special files with major number 36.
+Enable this with:
 
-        1.2.3.4 will be the router to the outside world
-        1.2.3.5 our box
-        2.0.0.1 our box (appletalk side)
-        2.0.0.* a pile of macintoys
+	CONFIG_NETLINK_DEV=m
 
+Please be advised that this support is going to be dropped at some point in
+the future!
 
-[1.2.3.4]-------------1.2.3.5[Our Box]2.0.0.1---------> macs
+Then finally, in the Network Devices section:
 
-The routing on our box would be
+	CONFIG_ETHERTAP=m
 
-                ifconfig eth0 1.2.3.5 netmask 255.255.255.0 up
-                route add default gw 1.2.3.4
-                ifconfig tap0   2.0.0.1 netmask 255.255.255.0 up arp
-                (route add 2.0.0.0 netmask 255.255.255.0)
+You can of course build it directly into the kernel if you want; modules
+are not required.
 
-C code for a Simple program using an EtherTap device
-====================================================
+Setting it all up
+-----------------
 
-This code is just excerpts from a real program, so some parts are missing
-but the important stuff is below.
+First we need to create the /dev/tap0 device node:
 
-void main (void)
-{
-	int TapDevice, eth_pkt_len = 0;
-	unsigned char full_pkt_len[MAX_PKT_LEN];
+	# mknod /dev/tap0 c 36 16
+	# mknod /dev/tap1 c 36 17
+	(etc)
+
+Load the relevant modules (ethertap.o, netlink_dev.o, perhaps netlink.o),
+and bring up your tap0 device:
+
+	# ifconfig tap0 10.0.0.123 up
+
+Now your device is up and running, and you can ping it as well. This
+confused me to no end: nothing is connected to our ethertap as yet, so
+how is it that we can ping it?
+
+It turns out that the ethertap is just like a regular network interface -
+even when it's down you can ping it. We need to route stuff to it:
+
+	# route add -host 10.0.0.124 gw 10.0.0.123
+
+Now we can read /dev/tap0 and when we ping 10.0.0.124 from our
+localhost, output should appear on the screen.
+
+	# cat /dev/tap0
+	:ßVU:9````````````````````````şışET@?'
+
+
+Getting this to work from other hosts
+-------------------------------------
+
+For this to work, you often need proxy ARP. 
+
+	# echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp 
+
+eth0 here stands for the interface that connects to 'other hosts'.
 
-	TapDevice = open("/dev/tap0", O_RDWR);
-        if(TapDevice < 0)
-        {
-                perror("Error opening device");
-                exit(1);
-        }
+Chances are that you are trying this on a non-routing desktop computer, so
+you need to enable IP forwarding:
 
-	write(TapDevice, full_packet, eth_pkt_len);
+	# echo 1 > /proc/sys/net/ipv4/ip_forward 
 
-	close(TapDevice);
+You should now be able to ping 10.0.0.124 from other hosts on your
+10.0.0.0/8 subnet. If you are using public IP space, it should work from
+everywhere.
 
-	return;
+ARP
+---
+
+If we were to take things very literally, your TCP/IP pseudo stack would
+also have to implement ARP and MAC addresses. This is often a bit silly, as
+the ethertap device is a figment of our imagination anyway. However, should
+you want to go 'all the way', you can add the 'arp' flag to ifconfig:
+
+	# ifconfig tap0 10.0.0.123 up arp
+
+This may also be useful when implementing a bridge, which needs to bridge
+ARP packets as well.
+
+The sample program below will no longer work then, because it does not
+implement ARP. 
+
+Sample program
+--------------
+
+A sample program is included somewhere in the bowels of the netfilter
+source. I've extracted this program and list it here. It implements a very
+tiny part of the IP stack and can respond to any pings it receives. It gets
+confused if it receives ARP, as it tries to parse it by treating it as an IP
+packet.
+
+/* Simple program to listen to /dev/tap0 and reply to pings. */
+#include <fcntl.h>
+#include <netinet/ip.h>
+#include <netinet/ip_icmp.h>
+#if defined(__GLIBC__) && (__GLIBC__ == 2)
+#include <netinet/tcp.h>
+#include <netinet/udp.h>
+#else
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#endif
+#include <string.h>
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+
+u_int16_t csum_partial(void *buffer, unsigned int len, u_int16_t prevsum)
+{
+	u_int32_t sum = 0;
+	u_int16_t *ptr = buffer;
+
+	while (len > 1)  {
+		sum += *ptr++;
+		len -= 2;
+	}
+	if (len) {
+		union {
+			u_int8_t byte;
+			u_int16_t wyde;
+		} odd;
+		odd.wyde = 0;
+		odd.byte = *((u_int8_t *)ptr);
+		sum += odd.wyde;
+	}
+	sum = (sum >> 16) + (sum & 0xFFFF);
+	sum += prevsum;
+	return (sum + (sum >> 16));
+}
+
+int main(void)
+{
+	int fd, len;
+	union {
+		struct {
+			char etherhdr[16];
+			struct iphdr ip;
+		} fmt;
+		unsigned char raw[65536];
+	} u;
+
+	fd = open("/dev/tap0", O_RDWR);
+	if (fd < 0) {
+		perror("Opening `/dev/tap0'");
+		return 1;
+	}
+
+	/* u.fmt.ip.ihl in host order!  Film at 11. */
+	while ((len = read(fd, &u, sizeof(u))) > 0) {
+		u_int32_t tmp;
+ 		struct icmphdr *icmp
+			= (void *)((u_int32_t *)&u.fmt.ip + u.fmt.ip.ihl );
+		struct tcphdr *tcp = (void *)icmp;
+		struct udphdr *udp = (void *)icmp;
+		
+		fprintf(stderr, "SRC = %u.%u.%u.%u DST = %u.%u.%u.%u\n",
+			(ntohl(u.fmt.ip.saddr) >> 24) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 16) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 8) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 0) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 24) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 16) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 8) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 0) & 0xFF);
+
+		switch (u.fmt.ip.protocol) {
+		case IPPROTO_ICMP:
+			if (icmp->type == ICMP_ECHO) {
+				fprintf(stderr, "PONG! (iphdr = %u bytes)\n",
+					(unsigned int)((char *)icmp
+						       - (char *)&u.fmt.ip));
+
+				/* Turn it around */
+				tmp = u.fmt.ip.saddr;
+				u.fmt.ip.saddr = u.fmt.ip.daddr;
+				u.fmt.ip.daddr = tmp;
+
+				icmp->type = ICMP_ECHOREPLY;
+				icmp->checksum = 0;
+				icmp->checksum
+					= ~csum_partial(icmp,
+							ntohs(u.fmt.ip.tot_len)
+							- u.fmt.ip.ihl*4, 0);
+
+				{
+					unsigned int i;
+					for (i = 44;
+					     i < ntohs(u.fmt.ip.tot_len); i++){
+						printf("%u:0x%02X ", i,
+						       ((unsigned char *)
+							&u.fmt.ip)[i]);
+					}
+					printf("\n");
+				}
+				write(fd, &u, len);
+			}
+			break;
+		case IPPROTO_TCP:
+			fprintf(stderr, "TCP: %u -> %u\n", ntohs(tcp->source),
+				ntohs(tcp->dest));
+			break;
+
+		case IPPROTO_UDP:
+			fprintf(stderr, "UDP: %u -> %u\n", ntohs(udp->source),
+				ntohs(udp->dest));
+			break;
+		}
+	}
+	if (len < 0)
+		perror("Reading from `/dev/tap0'");
+	else fprintf(stderr, "Empty read from `/dev/tap0'\n");
+	return len < 0 ? 1 : 0;
 }
+			
