Okay, I usually focus on the things that can happen when you look at your phone. This is, of course, the core of Howard Rheingold's Shibuya Epiphany: while visiting Japan, he noticed that more people were looking at their phones than talking on them. Since there are more times, in my opinion, when people want to interact discreetly with their mobile device rather than blather out loud for anyone nearby to hear, I concentrate mostly on that sort of interaction. However, every once in a while I wonder what could happen if you really pushed the Voice envelope.
Imagine a voice-powered mobile phone that looked just like a Bluetooth headset. That's it, no display, just a thing that sits on your ear and maybe a mini-boom for a microphone. Now, imagine you press a button and are connected digitally to a voice recognition system, which you can then use to dial the phone. But it's not a generic system that you're interacting with - think 411 or TellMe - it's your own data, customized by you beforehand using a web-based admin. You can import your address book, calendaring information, etc. so that all the stuff you need is just a simple command away.
"Call Mom," would dial your Mom's number. "Today's appointments?" would bring back an electronic voice that would tell you when your next meeting was. "Email Dave... Hey, let's get together next weekend. " would send off an email with an MP3 attachment of your message. "Message Mike... Are you up for lunch later?" would send off an MMS message to Mike. "Todo... pick up my laundry on the way home" would pop a voice record in your todo list, and to get them back you could ask, "List today's todos".
Now, all this functionality would be centralized and provided, for cost's sake, over VoIP. You only get a normal voice connection if you're dialing someone; otherwise the requests and responses are sent over a data connection to a central server. The cooler the server is, with all its whiz-bang technology, the more next-gen functionality it could offer, like speech-to-text. So you could do some of the above, and instead of a voice recording, you actually send off a text message or an email, or the to-do item is recorded as text instead of sound. But even if you didn't have that functionality, it'd still work pretty well with just recordings. For example, imagine if the voice service enabled something like automatic podcasting. You say something like "New Podcast... Hey everyone, I'm podcasting from my minimobi and I wanted to share some thoughts..." and bam, it's recorded and posted to your weblog.
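The routing rule here is simple enough to sketch in a few lines. This is only an illustration under my own assumptions: the `Plan` type, the `"dial"` action name, and the `speech_to_text` flag are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    channel: str  # "voice" for a normal call, "data" for everything else
    payload: str  # "call", "text", or "audio"


def plan_request(action: str, speech_to_text: bool) -> Plan:
    """Pick the connection and payload format for one request."""
    if action == "dial":
        # Only an actual phone call gets a normal voice connection
        return Plan("voice", "call")
    # Everything else goes to the central server over the data connection;
    # without speech-to-text the server just keeps the recording
    return Plan("data", "text" if speech_to_text else "audio")


if __name__ == "__main__":
    print(plan_request("dial", speech_to_text=True))
    print(plan_request("add_todo", speech_to_text=False))
```

Note that the fallback to audio means the handset never has to know whether the server is one of the fancy ones.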
Continuing from there, you could get prompted for calls with computer-generated voices ("Dave's calling. Answer?") and get information from the World Wide Web through a VoiceXML-style query system as well. Hell, open the platform up so that instead of needing a telco infrastructure to provide voice services, anyone with a web server, a VoIP connection and a VoiceXML app can provide cool apps. Sports scores, traffic info, etc.
This would be a pretty cool device, no? It'd be like the iPod Shuffle of mobile phones - as simple as it gets. In fact, that's a cool idea! Imagine if instead of sitting on just one ear, you could attach another headphone to make a pair. Now you can listen to your music from home being streamed to your phone via a system like Orb.com, or use a system like Sprint's new Sirius or Rhapsody streaming audio.
All of it would be controlled via a centralized internet server over 3G data services using voice-recognition systems, hooked up to a variety of external systems like email, SMS/MMS, calendaring, weblog, and audio services. It'd be a pretty hip little device if you ask me. Even without a screen and a camera. :-)