A few things are going on here.
First of all, the data vs. voice bit is correct: it takes much less "overhead" to move a bit of text than it does to make a voice connection.
As for Nextel's PTT (Push-to-Talk) feature, it is in many ways similar to a text message, in that the Nextels actually digitize the voice and can then store-and-forward it, so the network can slip it into the next open slot.
We use a similar system in the fire department, called the "Astro" system. Sometimes you call county and hear your own voice a few seconds later (an eternity by computer standards) because the network was too busy to accept your voice packet, so it was stored and then forwarded once network bandwidth cleared.
Also, Nextel tends to build its own infrastructure, with more exclusive antenna sites and so forth. Unlike Verizon or Cingular, which typically lease tower space from a 3rd party and share physical infrastructure, Nextel has (in many places) its own towers and antennas.
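
To make the store-and-forward idea above a bit more concrete, here is a toy sketch in Python. It is not how Nextel or the Astro system is actually implemented; the class name, the "slots per tick" model, and the packet labels are all made up for illustration. It only shows the general behavior: a packet that arrives while the link is busy waits in a queue and goes out, late, once a slot opens up.

```python
from collections import deque


class StoreAndForwardLink:
    """Toy model (hypothetical): a link that sends a fixed number of packets per time slot."""

    def __init__(self, slots_per_tick):
        self.slots_per_tick = slots_per_tick  # packets the link can carry per tick
        self.queue = deque()                  # packets stored while the link is busy
        self.tick_count = 0                   # current time, in ticks

    def submit(self, packet):
        # Store the packet together with the time it was handed to the network.
        self.queue.append((self.tick_count, packet))

    def tick(self):
        # Forward as many queued packets as there are open slots, oldest first.
        delivered = []
        for _ in range(min(self.slots_per_tick, len(self.queue))):
            submitted_at, packet = self.queue.popleft()
            delivered.append((packet, self.tick_count - submitted_at))
        self.tick_count += 1
        return delivered  # list of (packet, delay in ticks)


if __name__ == "__main__":
    link = StoreAndForwardLink(slots_per_tick=1)
    for packet in ["other-1", "other-2", "my voice"]:
        link.submit(packet)
    for _ in range(3):
        print(link.tick())
```

With one slot per tick and two packets already ahead of it, "my voice" is delivered two ticks after it was submitted, which is the same effect as hearing yourself a few seconds late on a busy network.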

I used to have Nextel, and it's the phone of choice for emergency responders, but their coverage in this area is poor, so I was forced to drop Nextel and go to Verizon. But I miss the "bee-beep" of the walkie-talkie.