When I look at all of the AI services available today, I can’t find a single one, holistically, that I like. ChatGPT works well on the web, but its mobile apps leave a few things to be desired on iOS, and a lot on Android. Gemini works well enough on the web that I use it now, and the same goes for Android and iOS, but it still needs work on all three. Google have stopped the bleeding, but haven’t really closed any wounds. Claude doesn’t even try. Their talk means nothing when using their AI is a pain.
These AI companies have had a grand opportunity to include blind people. The technology is relatively new. They built these interfaces and frontends from scratch. But instead they chose to move fast and break things, including accessibility.
Granted, there have been improvements in a few of them in a few cases. But Claude hasn’t improved since practically day one, and the others only focus on web, not mobile. And mobile is where it’s at! More blind people have iPhones now than computers, just like the general public.
Now, I will say that expert blind people, more expert than the rest of us who just want to get on with our day, can easily work around all these gaping holes in the user experience. I’m not here for that. I don’t care that I can search for some button or image that marks the start of the AI’s response, or that I can monitor a button to see when the AI is done responding. I want an easy experience, so that AI doesn’t add to the soul-crushing stress I’m already under as a blind person with a high-stress job.
So let’s look at each AI service and see what’s going on with them at this moment. My test will be simple:
Open the website.
Find the prompt box, chat box, or wherever I can type my prompt.
Type a prompt and review it.
Press Enter and hear the response.
So, let’s dive in.
Okay, so I open a new tab and go to chatgpt.com. After logging in, I’m immediately put into a “section editable multi-line ask anything.” It doesn’t say “edit,” but I’m in NVDA focus mode, so I assume I can type here. So, I type “What are Tech Priests in Warhammer 40K?” After pressing Enter, ChatGPT says “ChatGPT is generating a response.” Then, “ChatGPT is still generating a response” every few seconds. I like this. A lot. It could be shorter; “still generating” would work without being too short. But it gets the job done. It’s equivalent to a progress spinner. As long as it spins, we know something is happening.
After it’s done, it says “ChatGPT says:” and then the response. This is what I want. This makes chatting useful. In apps like iMessage on iOS, or Messages by Google on Android, if a person responds to you in a chat, VoiceOver or TalkBack will be instructed by the app to read the message. Again, this is what I want.
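For the web developers reading this, here’s a rough sketch of the kind of markup and script that produces announcements like these. This is just my guess at the general pattern, built on an ARIA live region; I haven’t inspected ChatGPT’s actual code, so every element name, message, and timing value here is illustrative.

```typescript
// Rough sketch of screen-reader announcements for a streaming chat UI.
// Assumes a visually hidden ARIA live region; the ids, messages, and timing
// are illustrative, not what any particular product actually ships.

const status = document.createElement("div");
status.setAttribute("role", "status"); // implies aria-live="polite" and aria-atomic="true"
status.className = "visually-hidden";  // assumed CSS class that hides it off-screen
document.body.appendChild(status);

function announce(message: string): void {
  // Clearing first helps some screen readers re-announce repeated text.
  status.textContent = "";
  window.setTimeout(() => { status.textContent = message; }, 50);
}

let statusTimer: number | undefined;

function startGenerating(): void {
  announce("Generating a response.");
  // Re-announce every few seconds: the audio equivalent of a spinner.
  statusTimer = window.setInterval(() => announce("Still generating."), 5000);
}

function finishGenerating(replyText: string): void {
  window.clearInterval(statusTimer);
  // Announce the finished reply once, the way a messaging app reads an
  // incoming message.
  announce(`Assistant says: ${replyText}`);
}
```

The point isn’t the exact code; it’s that the whole pattern is a visually hidden region and a few lines of script, which is why it’s so frustrating when apps don’t bother.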
On iOS, it works about the same way, except during the streamed response, all VoiceOver can do is speak the response, chunk by chunk, unless you exit the app. If you tap something else, it will move on to the next chunk in the stack. And that’s really annoying. But we blind people put up with it because we are *so* used to receiving crumbs that we just live with it.
On Android, there is nothing after sending. TalkBack is silent unless you touch the output and hope it’s done generating.
Gemini is a bit different. It was made by Google, who has an accessibility team. So, let’s see how they work.
I open gemini.google.com, and type in “What does LLM stand for?” I press Enter, and NVDA says “ask Gemini. Gemini is typing.” Then after a moment, “Gemini replied.” Then, after a much longer moment while Gemini streams the response, it reads the response.
So, a few things. The “ask Gemini” is the placeholder text in the text field. This is really annoying on iOS when writing in Braille. Anyway, you type your prompt, and the placeholder goes away. You press Enter, the text is sent to the server and cleared from the text field, and the placeholder comes back.
Then, we get “Gemini is typing,” and then “Gemini replied.” I wish they’d get rid of the “replied” message, because it’s a lie. Gemini has not replied yet. Google really needs to hire more blind people. Scratch that, everyone needs to hire more blind people. Well, besides taxis. And pilots. And animators. Anyway…
On Microsoft Edge with NVDA on Windows, the actual output is read very well. On Firefox with Orca on Linux, the output is read in a way that makes me think there’s a live region set to polite. The text kind of changes and repeats a lot. I’ll have to see if I’ve set something up the wrong way before I commit to saying Gemini sucks on Linux with Firefox. Otherwise, it works well on the web.
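My guess at what’s happening, for the curious: if the reply container is a polite live region that’s also marked atomic, every streamed chunk makes the screen reader re-read the whole thing from the top. That would produce exactly this “changes and repeats” behavior. This is only a hypothesis about the markup, so take the snippet below as a general illustration, not a diagnosis of Gemini.

```typescript
// Illustration of why a streamed reply can "change and repeat a lot" when it
// lives inside a polite, atomic live region. Hypothetical markup, not
// anything I've confirmed from Gemini's page.

const reply = document.createElement("div");
reply.setAttribute("aria-live", "polite");
reply.setAttribute("aria-atomic", "true"); // the likely culprit
document.body.appendChild(reply);

function onChunk(chunk: string): void {
  // With aria-atomic="true", each mutation tells the screen reader to re-read
  // the entire container, so the start of the reply repeats as it grows.
  reply.textContent += chunk;
}

// A gentler pattern: drop aria-atomic and append each chunk as its own node,
// so only the newly added text gets announced.
function onChunkAdditionsOnly(target: HTMLElement, chunk: string): void {
  const span = document.createElement("span");
  span.textContent = chunk;
  target.appendChild(span);
}
```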
On Android, Gemini’s home, it speaks nothing after sending the message. I’ve told the Gemini folks about this, and there has been no resolution. On iOS, it does say “Gemini replied” because they know the NFB use iOS and they don’t want to mess with “The Nation’s Blind.” I’m not a part of NFB, BTW. But the iOS version does have this odd issue where, if reading in Braille, I cannot access list items. Since Gemini loves lists, this is a big issue for Braille users like me.
Next time, I’ll talk about Claude and Copilot. Let me know what you think of this, and if there is interest in any deeper dives.