Initial Proposal




Observing the users



Location : SSMU building.
Event: "Seeing Voices Montreal"

Background:

As our application is targeted at users with hearing impairments, we attended a screening of a silent movie in the SSMU building, followed by a casual meet and greet. Approximately thirty people attended, including people with hearing disabilities and their relatives and friends.

Primary motivation:
Our main focus was to observe how deaf people communicate with people who may not understand sign language, with special attention to how they perform essential tasks. As we could not observe users doing the task we want to help them with, i.e. making a phone call, we decided to focus on how they interacted with us through a text interface and to use that as our initial guide.

Observations:
The movie screened was a completely silent film, with the dialogue communicated exclusively through sign language and English subtitles. At the socializing session at the Gerts bar we got to communicate with people from the deaf community. Our communication depended on an interpreter being present, which is a major drawback considering that interpreters must be paid for their services ($200/hr). To use our time with the attendees efficiently, we compiled a list of questions and recorded the responses using the Evernote smartphone application. Given the communication constraints, this allowed us to gain a rudimentary understanding of the users' requirements. We put the questions to four of the attendees, three of whom were completely deaf and one of whom had partial hearing. The questions were as follows:

1) What is your preferred form of communication (sign language, lip reading, etc.)?
2) What difficulties do you face when using smartphones?
3) Have you ever used text to speech systems?
4) Have you ever used audio recognition (Natural Language Processing) systems?
5) How would you go about doing a daily task, e.g. a bank transfer?

A summary of our major discoveries is as follows:
- The degree of deafness varies greatly from person to person. A recurring response was not to generalize deafness, as each level of deafness comes with its own associated skills.
- The main difficulty faced on smartphones is making phone calls, with the majority of users communicating through text messaging.
- None of the people we talked to had used a text to speech system before.
- For audio recognition systems, they informed us of a new app that allows users to communicate in a group setting. The application, ‘AVA’, is described further in the related products section.
- For banking and similar needs, they primarily use an online option or otherwise visit the location in person with an interpreter.



Identifying the problem


From the results of our observations we were able to identify a few major issues:
- Regular phone calls are effectively impossible, with even partially deaf people having substantial difficulty.
- An interpreter is required in the majority of situations, so there is almost no direct communication, which raises cost and privacy concerns.
- Routine tasks such as dealing with banks, ISPs and telecom providers are not possible over the phone and generally not possible without external assistance.
- Current applications are difficult to set up.

The problem we are attempting to solve is helping deaf people make these increasingly essential phone calls while letting them provide input in an efficient and timely manner. Dealing with companies over the phone has become increasingly important, as a large number of services are only available through phone-based customer service centers. These include dealing with credit cards, getting better offers from cable and telecom providers, requesting extensions on bill payments, etc. We aim to target this category of services, specifically those that use Interactive Voice Response (IVR) menus. We aim to cut out the middle man and use audio recognition to convert IVR menus, as well as human operator speech, to text and present them to the user as textual options. The user will be able to type their message as text, which will be conveyed to the other party through a text to speech interface.



Personas


Emma Charles

Description :
She is in her mid 20s and is partially deaf. She is very outgoing, is an exec in various groups that help people with hearing disabilities, and organizes events to bring them together. She is at university studying psychology and is good at carrying conversations through lip reading and hearing aid devices. She can communicate well with non-deaf people but is still at a loss for ways to converse on telephone calls.

Goals :
She wants a higher level of independence and wants to be able to carry out tasks like banking transactions over the phone and making hospital appointments.

Level of Expertise :
She can hear a little, but she is still afraid of receiving calls. When she receives a call, she cannot understand what the caller said because she gets no visual cues to help her.

Accessibility Considerations :
She is not totally deaf, but her hearing is limited and she has difficulty receiving and making calls. To achieve her goals, we need to:
1. Enable a way for her to understand what the person on the other end of the call is saying
2. Enable a way for her to communicate what she wants to say to the other person on the phone

Emily Watson

Description :
She is in her mid 40s and completely deaf. As her life has broadened, she has to interact with non-deaf people in many different ways: calling her children, getting help from service centres, making appointments, and receiving calls from coworkers.

Level of Expertise :
She usually uses e-mail when she has to talk with someone, whether formally or informally. When she has to call or meet in person, she either hires an interpreter or asks someone she knows for help. She has a smartphone and uses it for tasks like messaging and maps, but does not use any of the applications catered to deaf people as they are not intuitive enough for her.

Accessibility Considerations :

She is totally deaf, so she faces many difficulties whenever she has to meet or call someone. To achieve her goals, we need to:
1. Make an application that is easy for everyone to use, including older people
2. Find a way to communicate with people over the phone that does not rely on hearing
3. Translate what each person says into text as quickly as possible so the conversation stays continuous



Use Case Scenario

The user downloads our app from the Google Play Store. On creating their account, the user is given the opportunity to personalise the app by first entering a basic message that they would like to use when they first come into contact with an operator. An example would be:
“Hi, my name is Allison Shaw. I am a deaf user and am using an assistive technology to aid my conversation with you. The app is facilitated by audio recognition so I would please ask you to keep your answers short, talk slowly and clearly so that our conversation can go smoothly. Can we continue?”
The user would then enter some personal information that they can select easily when the operator asks for it for security purposes, e.g. mother's maiden name, date of birth, SIN, etc. After they have entered this information, they are ready to make their first call. They call the hospital to request an appointment with a doctor.

First menu:

"To continue in English press one"

User presses one

Second menu:
"If you know the extension of the person you are trying to reach enter it now"
"To get information on hospital timing press one"
"To request a checkup press two"
"To speak to the receptionist press zero at any time"

User presses zero
Music plays for a while

Operator picks up: “Hello, this is Mandy, how are you?”
User TTS: “Hi, my name is Allison Shaw. I am a deaf user and am using an assistive technology to aid my conversation with you. The app is facilitated by audio recognition so I would please ask you to keep your answers short, talk slowly and clearly so that our conversation can go smoothly. Can we continue?”
Operator: “Yes”
User TTS: “I would like to make an appointment with Dr. Allison”
Operator: “How about Friday at five pm?”
User TTS: “Sounds good, thank you! Bye!”
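
The menu handling in this scenario can be sketched in a few lines. The following is a minimal illustration, not the team's implementation: it assumes the audio recognizer has already produced one text string per prompt, and it turns the prompts of the form "To ... press ..." into key/label pairs that could be shown as textual options. The names MenuOption, parse_prompt and parse_menu are hypothetical, and Python is used here only for brevity.

```python
import re
from typing import List, NamedTuple, Optional

# Spoken digits as a recognizer might transcribe them (assumption).
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


class MenuOption(NamedTuple):
    """One selectable IVR option (hypothetical type for illustration)."""
    key: str    # keypad digit to send
    label: str  # text shown to the user


# Matches prompts shaped like "To <action> press <digit>" or "For <x> press <digit>".
_PROMPT = re.compile(
    r"^\s*(?:to|for)\s+(?P<label>.+?)\s*,?\s+press\s+(?P<key>\w+)\b",
    re.IGNORECASE,
)


def parse_prompt(prompt: str) -> Optional[MenuOption]:
    """Return the option described by one transcribed prompt, or None."""
    match = _PROMPT.match(prompt)
    if match is None:
        return None
    word = match.group("key").lower()
    # Accept both spelled-out digits ("two") and numerals ("2").
    key = DIGITS.get(word, word if word.isdigit() else None)
    if key is None:
        return None
    label = match.group("label").strip()
    return MenuOption(key, label[0].upper() + label[1:])


def parse_menu(prompts: List[str]) -> List[MenuOption]:
    """Keep only the prompts that describe a keypad option."""
    options = (parse_prompt(p) for p in prompts)
    return [o for o in options if o is not None]


second_menu = parse_menu([
    "If you know the extension of the person you are trying to reach enter it now",
    "To get information on hospital timing press one",
    "To request a checkup press two",
    "To speak to the receptionist press zero at any time",
])
for option in second_menu:
    print(f"[{option.key}] {option.label}")
```

Prompts that do not follow the pattern, such as the extension prompt, are skipped in this sketch. A real interface would probably still display them as plain text so that the user does not miss any information.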



Related Products


We have outlined the currently available technologies dealing with communication amongst deaf users below. They can be placed into two distinct categories:

Speech to Text Services:
This category of applications takes audio input from the smartphone's microphone and provides a transcription of the dialogue. A few of the technologies have been listed below:

Dictionary services:
This category of applications is geared towards people with normal hearing communicating with deaf people. These applications provide English to sign language translations using videos. Another class of these applications focuses on teaching people sign language. Examples include MobileSign and British Sign Language Finger Spelling.



Comparison with other products



High level Design



Group Contributions



Feasibility

For this project, our team plans to design and implement an Android application that provides the user with a transcription of a phone conversation with an IVR service. The goal of this application is to give users an efficient way to deal with banks and similar services. The major focus of our application will be the user interface. A basic application that transcribes a call has already been programmed by Muhammad Ammar. This will speed up the development process greatly, as all our time can be used to optimize the user experience. An initial list of libraries to use has already been compiled.
All three team members have experience with Android development. We will be using the CMU Sphinx audio recognition library and Voice Crafts Text to Speech Library. Muhammad has previous experience with both of these libraries and has already built the basic structure for audio recognition over phone calls.