IHearYou
Emotional speech recognition for everyone.
Pitch Presentation: https://pitch.com/public/00c1a1be-aa56-4639-82a2-68f7aa1a29cb
Problem Background
People with hearing impairments can now caption conversations and calls like everyone else, thanks to voice-to-text technology. But what about emotions? How can we convey emotions in a non-spoken medium?
We know that there is a fundamental analogy between sign language (visual) and spoken language (acoustic/phonological). Just as phonemes are minimal units without meaning that combine to form the words of spoken language, cheremes, whatever the national sign language, are minimal units without meaning that, as formational parameters, combine to give rise to the signs of sign language.
When a person with a hearing impairment communicates by text, the correspondence between emotion and visual sign is lost.
Presently, we have successful apps with voice-to-text and text-to-voice technology, and there is an extensive literature on AI models that can predict a speaker's emotion by analysing a recorded clip of their voice. What we don't have is an application that carries that emotion into text.
Research Insights
The main situation we wanted to understand was: how do deaf people deal with an emergency?
We interviewed three people with hearing impairments and one ASL interpreter, and asked them to walk us through their actions and feelings when approaching an emergency situation.
What we discovered is that there are two scenarios in an emergency. In the first, support is provided freely and immediately by a CODA (Child of Deaf Adults), since deaf people generally have a very supportive network of people helping each other.
Otherwise, there are voice-to-text applications on the market, such as RogerVoice or Google Live Transcribe.
During the interviews, while texting our questions, we ran into the same situation ourselves: interviewees often asked us to slow down or rephrase a sentence because something was difficult for them to catch in written text, namely the emotion or intention behind it.
Proposed Solution
We narrowed the solution down to adding emotion to text. The underlying user need: it is hard to recognize other people's emotions by text, and users wish their voice-to-text app could recognize emotions so that they could communicate with people better.
The features we would like to have in IHEARYOU are:
- As a person with a hearing impairment, I would like to express my emotions and understand others' emotions through text.
- I would like to change the font size so that I can recognize text visually and quickly and get involved in the conversation.
- When I am on a phone call using a live transcription app, I would like the text to be nuanced with the speaker's emotion.
A note on the last feature: automatic recognition of emotions from speech and text is a challenging problem. Recent AI and deep-learning solutions restrict audio emotions to a small set of classes: angry, disgusted, fearful, happy, sad, neutral, surprised, and calm; for these emotions, text can have a visual correspondence.
Speech Emotion Recognition (SER) is tough because emotions are subjective and annotating audio is challenging. If emotions can be encoded visually in text, people with hearing impairments would have fewer difficulties understanding a voice-to-text conversation.
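As a rough sketch of how the app could consume such a model, the snippet below sends a recorded audio clip to a hypothetical SER endpoint (/api/emotion, not part of our current build) and falls back to a neutral label when the response is unexpected.

```js
// Hypothetical client-side call to a Speech Emotion Recognition (SER) service.
// The /api/emotion endpoint and its response shape are assumptions, not part of the current app.
const EMOTION_LABELS = [
  "angry", "disgusted", "fearful", "happy",
  "sad", "neutral", "surprised", "calm",
];

async function recognizeEmotion(audioBlob) {
  const form = new FormData();
  form.append("clip", audioBlob, "clip.webm");

  const response = await fetch("/api/emotion", { method: "POST", body: form });
  if (!response.ok) throw new Error(`SER request failed: ${response.status}`);

  const { label } = await response.json(); // e.g. { label: "happy" }
  return EMOTION_LABELS.includes(label) ? label : "neutral"; // fall back to neutral
}
```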
Solution Explanation
We created a list of user pain points, shown below.
User Pain Points
User Story #1: As a user, I want to use voice-to-text with emotion, so that I understand non-visual signals.
Scenario #1: Understand emotion with text
Acceptance Criteria:
- User can see a one-to-one font ⇿ emotion correspondence
- User can see text spacing that corresponds to vocal speed
- User can see text contrast that corresponds to vocal intensity
(Settings: equaliser and text controls such as font typeface and size; see the style-mapping sketch below)
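A minimal sketch of how these criteria could map onto text styles in the app; the font pairings and numeric thresholds below are illustrative assumptions, not final design decisions.

```js
// Illustrative mapping from emotion, speech rate, and loudness to CSS text styles.
// Fonts and thresholds are assumptions for the sketch, not final design choices.
const EMOTION_FONTS = {
  angry: "'Archivo Black', sans-serif",
  happy: "'Comic Neue', cursive",
  sad: "Georgia, serif",
  neutral: "'Inter', sans-serif",
};

function textStyleFor({ emotion, wordsPerMinute, intensityDb }) {
  return {
    fontFamily: EMOTION_FONTS[emotion] || EMOTION_FONTS.neutral, // font ⇿ emotion
    letterSpacing: wordsPerMinute > 160 ? "0.01em" : "0.08em",   // spacing ⇿ vocal speed
    fontWeight: intensityDb > 70 ? 700 : 400,                    // contrast ⇿ vocal intensity
  };
}

// Usage with React inline styles:
// <span style={textStyleFor({ emotion: "happy", wordsPerMinute: 140, intensityDb: 65 })}>…</span>
```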
User Story #2: As a user, I want to switch to another language, so that I can use the app anywhere.
Scenario #1: Translate
Acceptance Criteria:
- User can place the device near the speaker they want to listen to
- User can see voice-to-text in another language
- App recognizes the spoken language and translates the transcribed text (see the transcription sketch below)
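A sketch of live transcription in a chosen language using the browser Web Speech API, with translation delegated to a hypothetical /api/translate endpoint (the translation service is an assumption, not something we built).

```js
// Live transcription in a chosen language via the browser Web Speech API (Chrome's
// webkitSpeechRecognition); translation goes through a hypothetical /api/translate endpoint.
function startTranscription(sourceLang, targetLang, onTranslated) {
  const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new SpeechRecognition();
  recognition.lang = sourceLang;   // e.g. "it-IT"
  recognition.continuous = true;
  recognition.interimResults = false;

  recognition.onresult = async (event) => {
    const transcript = event.results[event.results.length - 1][0].transcript;
    const res = await fetch("/api/translate", {   // hypothetical translation endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: transcript, from: sourceLang, to: targetLang }),
    });
    const { translated } = await res.json();
    onTranslated(translated);
  };

  recognition.start();
  return recognition; // the caller can call .stop() to end the session
}
```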
User Story #3: As a user, when I don't understand something, I want it explained to me in simpler terms.
Scenario #1: Paraphrase and translate
Acceptance Criteria:
- App paraphrases using real-life terminology
- App explains dictionary terms in plain language (see the glossary sketch below)
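A minimal sketch of the plain-language explanation idea, assuming a small hand-maintained glossary rather than a full dictionary API; the entries are illustrative.

```js
// Annotate difficult terms with plain-language explanations from a small glossary.
// The glossary entries are illustrative; a fuller build could query a dictionary service.
const GLOSSARY = {
  "cardiac arrest": "the heart suddenly stops beating",
  "evacuate": "leave the building right away",
};

function explainTerms(sentence) {
  let annotated = sentence;
  for (const [term, plain] of Object.entries(GLOSSARY)) {
    const pattern = new RegExp(`\\b${term}\\b`, "gi");
    annotated = annotated.replace(pattern, `${term} (${plain})`);
  }
  return annotated;
}

// explainTerms("Please evacuate now.")
// → "Please evacuate (leave the building right away) now."
```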
Based on our target users' pain points, we decided to start with a text chat box with selectable emotion as the first feature, and then, with increasing technical difficulty, work toward the automatic emotion recognition feature.
Feedback
The feedback we received was enthusiastic, given the general lack of resources for deaf people. Interviewees enjoyed the simplicity of the design and the efficacy of the solution. They would like to test the website or app, and proposed offering it as a plug-in for existing live transcription apps.
Lo Fi & Hifi Mockups
We created a user flow chart, from which the designer derived the HiFi mockups.
The chat box with emotion control has a dropdown menu from which an emotion can be selected, and the background is colored according to the selected emotion.
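A minimal React sketch of this chat box; the emotion-to-color palette and component structure are illustrative assumptions, not the exact production mockup.

```jsx
import { useState } from "react";

// Illustrative emotion → background color palette; the real mockup's palette may differ.
const EMOTION_COLORS = {
  happy: "#FFF3B0",
  sad: "#BBDEFB",
  angry: "#FFCDD2",
  calm: "#C8E6C9",
  neutral: "#ECEFF1",
};

export default function EmotionChatBox({ onSend }) {
  const [emotion, setEmotion] = useState("neutral");
  const [text, setText] = useState("");

  return (
    <div style={{ background: EMOTION_COLORS[emotion], padding: "1rem", borderRadius: "8px" }}>
      {/* Dropdown menu for selecting the emotion attached to the message */}
      <select value={emotion} onChange={(e) => setEmotion(e.target.value)}>
        {Object.keys(EMOTION_COLORS).map((label) => (
          <option key={label} value={label}>{label}</option>
        ))}
      </select>
      <input
        value={text}
        placeholder="Type your message…"
        onChange={(e) => setText(e.target.value)}
      />
      <button onClick={() => onSend({ text, emotion })}>Send</button>
    </div>
  );
}
```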
Implementation Details
Technical implementation
- Where is it hosted? It’s hosted on Netlify.com
- What is your tech stack? We used React and JavaScript to build the web app.
Technical challenges
- What was the hardest part of development? The hardest part was figuring out how to build the app with React Native, but we came to realize we weren't skilled enough in React Native, so we decided to switch over to basic React with JavaScript.
- Does your app have any scaling issues? Not really; we believe this app could go even further by integrating into existing applications.
- What are some key takeaways? Start small: pick a small MVP that is workable within a tight deadline. Given the time constraints, don't attempt to learn a new coding skill while trying to build an app that is due in a few weeks.
Future Steps
We are not continuing the project as a team, because we are heading down different professional paths. Nonetheless, the product is interesting and socially useful, and we would have liked to complete it by adding all the desired features, specifically the API for speech emotion recognition.
Learnings
Product Manager Learnings:
Viviana Letizia
I have learned how to collaborate with a cross-functional team, deal with conflict and difficult situations, prioritize features, and listen to customers' needs. I am already applying these learnings in my work and receiving good feedback.
Designer Learnings:
YeaGyeong (Rachel) Cho
Designer Learnings:
Jo Sturdivant
- Adapting to an Established Team: Joining the team in week 6 of 8 was challenging, as I had to quickly adapt to existing workflows, dynamics, and goals. This mirrors real-world situations where you often integrate into teams mid-project, and flexibility is essential.
- Work-Blocking for Efficiency: With only two weeks to complete the project, I learned the importance of a structured work-blocking system. This approach allowed me to manage my time effectively and meet deadlines under pressure.
- Making Data-Driven Design Decisions: Unlike my past projects, I had to rely on research conducted by others. This was a valuable experience in using pre-existing data to guide design decisions, helping me focus on the core insights without starting from scratch.
Developer Learnings:
Kat Sauma
Developer Learnings:
Vanady Beard
As the back-end developer, I learned how important it is to create efficient and reliable systems that support the entire application. This experience also taught me the importance of optimising the database and ensuring the backend is scalable and easy to maintain.
Developer Learnings:
Stephen Asiedu
As a back-end developer, I've come to understand the importance of being familiar with various database systems and modules. This knowledge enables me to build diverse applications and maintain versatility in my work. I've also learned that the responsibility for making the right choices rests on my shoulders, guided by my best judgement.
Developer Learnings:
Dre Onyinye Anozie
- Gained more familiarity with the agile project management process while working with a team including a product manager, designer, and developers.
- Learned a lot about building applications that prioritize accessibility features for people with hearing impairments, blindness, and more. This allowed me to look into the technologies that already exist.
Developer Learnings:
Maurquise Williams
- Process of Creating an MVP: Developing a Minimum Viable Product (MVP) taught me how to focus on delivering core functionalities balancing between essential features and avoiding scope creep.
- Collaboration in a Real-World Tech Setting: This experience taught me how to collaborate efficiently in a fast-paced tech environment, keeping the team aligned and productive, even while working remotely across time zones.
- Sharpening Critical Thinking and Problem-Solving Skills: This experience honed my ability to think critically and solve problems efficiently. By tackling challenges and finding quick solutions, I sharpened my decision-making and troubleshooting skills in a dynamic, real-world setting.
Developer Learnings:
Jeremiah Williams
All in all, this experience was awesome. I learned that when coding with others, being transparent is key.
Developer Learnings:
Justin Farley
I learned how important communication is when working with a team. Communication provides understanding, advice, ideas, and much more. While working with the product team, I’ve found that communication keeps everything flowing smoothly. Working with a team also showed me that every member brings something different to the table and we all have to work together in order to align and meet our end goal.
Full Team Learning
We have learned how to collaborate and work through stressful situations, and we successfully derived a solution from a complex problem, emotion recognition, for the purpose of inclusivity.