Because it's slow. Using /dev/mem is way, way faster, even in Python, and zippity-quick in C.
Based on C code by Dom and Gert at http://elinux.org/RPi_Low-level_peripherals. Using mmap is potentially much faster than sysfs (http://en.wikipedia.org/wiki/Sysfs), which is what almost all RPi tutorials recommend.
So where do these addresses and constants come from? See http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf, page 90. You'll want to read everything from page 90 to page 104 if you want to know what's going on. And take a look at the table of contents (pages 2 and 3) to see the chip's capabilities.
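To make the datasheet arithmetic concrete, here is a minimal sketch of how a pin number turns into a register offset and a bit position. The register offsets are the ones I read from the GPIO register table in that PDF, so double-check them against your copy. The helper names (`fsel_location`, `set_output`, `set_pin`) are made up for illustration, and a plain `bytearray` stands in for the mmap'd region so the logic can be tried off the board.

```python
import struct

# Offsets from the GPIO base, as read from the BCM2835 datasheet register table.
GPFSEL0 = 0x00  # first function-select register (3 bits per pin, 10 pins each)
GPSET0 = 0x1C   # writing a 1 bit drives that pin high
GPCLR0 = 0x28   # writing a 1 bit drives that pin low


def fsel_location(pin):
    """Return (byte offset, bit shift) of the 3-bit function field for a pin."""
    return GPFSEL0 + 4 * (pin // 10), 3 * (pin % 10)


def read_reg(mem, offset):
    """Read one little-endian 32-bit register from a buffer."""
    return struct.unpack_from("<I", mem, offset)[0]


def write_reg(mem, offset, value):
    """Write one little-endian 32-bit register to a buffer."""
    struct.pack_into("<I", mem, offset, value & 0xFFFFFFFF)


def set_output(mem, pin):
    """Set the pin's function field to 001 (output), leaving other pins alone."""
    offset, shift = fsel_location(pin)
    value = read_reg(mem, offset)
    value &= ~(0b111 << shift)   # clear the 3-bit field
    value |= 0b001 << shift      # 001 selects output
    write_reg(mem, offset, value)


def set_pin(mem, pin, high):
    """Drive a pin high or low through the set/clear registers."""
    base = GPSET0 if high else GPCLR0
    write_reg(mem, base + 4 * (pin // 32), 1 << (pin % 32))
```

On real hardware `mem` would be the mmap of /dev/mem at the GPIO base address rather than a `bytearray`, and the set/clear registers take effect on write instead of storing the value.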
The ARM architecture is used for many different chips made by different manufacturers, and you can find open-source Verilog and VHDL implementations. Manufacturers differentiate their products by offering different sets of peripherals, so knowledge gained with one ARM implementation is largely transferable to another. The first ARM family I got acquainted with was the AT91SAM7 family, and the GPIO there isn't so different from Broadcom's BCM2835 used in the Raspberry Pi. The Broadcom chip has a memory management unit (MMU), which the SAM7 family lacked, so one couldn't run Linux on the SAM7 chips.
Also see my RPi hacking repository on GitHub, where I did some mmap hacking in a C module of the xxmodule.c variety.
My example for the BeagleBone Black is taken from an excellent blog post by Alexander Hiam. The BBB uses a different ARM chip, the Sitara AM335x from TI. The Technical Reference Manual contains the information we need to talk to the GPIO.
Section 2 contains the memory map for the chip, and on page 181 we see that 0x481AC000 is where to find the registers to control GPIO2, the third GPIO block. Those registers are listed in section 25.4.1 on page 4871. The following pages, all the way out to 4897, detail the different bits and fields contained in those registers.
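As a rough sketch of how those manual lookups end up in code: the base address below is the one from the memory map, while the 4 KiB mapping length and the three register offsets are my own reading of the register table and should be verified against the TRM before you trust them. The function names are hypothetical. `map_gpio2` needs root on an actual board; the other two helpers work on any writable buffer.

```python
import mmap
import os
import struct

GPIO2_BASE = 0x481AC000  # from the memory map in the TRM
GPIO2_SIZE = 0x1000      # assumption: one 4 KiB page covers the register block

# Register offsets as I read them in the TRM register table; verify before use.
GPIO_OE = 0x134            # output enable: a 0 bit makes the pin an output
GPIO_CLEARDATAOUT = 0x190  # writing a 1 bit drives that pin low
GPIO_SETDATAOUT = 0x194    # writing a 1 bit drives that pin high


def map_gpio2():
    """Map the GPIO2 register block from /dev/mem (root required)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, GPIO2_SIZE, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=GPIO2_BASE)
    finally:
        os.close(fd)  # the mapping keeps its own duplicate of the descriptor


def make_output(regs, bit):
    """Clear the pin's bit in GPIO_OE so the pin becomes an output."""
    oe = struct.unpack_from("<I", regs, GPIO_OE)[0]
    struct.pack_into("<I", regs, GPIO_OE, oe & ~(1 << bit) & 0xFFFFFFFF)


def drive(regs, bit, high):
    """Drive one pin high or low via the set/clear registers."""
    offset = GPIO_SETDATAOUT if high else GPIO_CLEARDATAOUT
    struct.pack_into("<I", regs, offset, 1 << bit)
```

On the board, usage would look like `regs = map_gpio2(); make_output(regs, 5); drive(regs, 5, True)`, where bit 5 is an arbitrary example rather than a pin the original post uses.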