Week 6 - June 15 - 19
Week 6 done!
This week’s main focus for me was to get the transcript up and running for all users’ input. See, we had the local user’s transcript working, but the script could not yet receive the remote users’ ASR.
It seems simple enough to take all of the remote users’ text and add it to the transcript div, but no, this required much more coding, because we stream each user’s text through a peer connection in WebRTC by sending a peer message and setting a peer listener.
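To give a rough idea of how the text travels between peers, here is a minimal sketch of the sending side. This is not our exact code - the message type "caption", the list name peerIds, and the function names are placeholders I made up for the example:

```javascript
// Sketch of the sending side (placeholder names, not our exact code).
// peerIds is kept up to date through EasyRTC's room occupant listener,
// which reports everyone else currently in the room.
var peerIds = [];

easyrtc.setRoomOccupantListener(function (roomName, occupants) {
    peerIds = Object.keys(occupants);
});

// Whenever the local ASR (or the RTT text box) produces text, send it to every
// peer as a peer message. "caption" is just the message type the listener on
// the other side will match on.
function broadcastText(text) {
    peerIds.forEach(function (peerId) {
        easyrtc.sendDataP2P(peerId, "caption", { text: text });
    });
}
```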
At first, Norman suggested we try adding a protocol to the text stream so other users could flag the text separately for the transcript and directly for the captioning div. However, that was a new concept to me and I could not figure it out, so I did more research on how I could receive all of the users’ text and add it to the transcript div. After some time reading up on the EasyRTC classes (because it’s the toolkit we use for WebRTC), I understood much more about how the peer connection for the text stream worked and realized that we would need to add lines of code in the peer listener to receive the remote users’ text and add it to the transcript div. I went straight to the main.js for the page and added a couple of lines so that all users’ text is also added to the transcript, and it worked! It was not pretty, but we were able to receive all texts and print them to the transcript.
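The receiving side really is just a few lines inside the peer listener. The sketch below is how I picture it; the element id "transcript" and the message type "caption" are placeholders rather than our exact names:

```javascript
// Sketch of the receiving side: the peer listener fires for every peer message.
// Besides updating the live captioning div (which we already did), we now also
// append the incoming text to the transcript div.
easyrtc.setPeerListener(function (senderEasyrtcid, msgType, msgData) {
    if (msgType === "caption") {
        var line = document.createElement("p");
        line.textContent = msgData.text;
        document.getElementById("transcript").appendChild(line);
    }
});
```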
Yay, the transcript worked! But there were a couple of things we needed to fix. The first draft of the code streamed the text to the transcript div with no names, so the text just piled up. Next on my list, I needed to set names for each user to identify who is talking. I decided to add an input on the first page of WebRTC - before a user enters a room - to collect the users’ names. It was required too, so users could not proceed without inserting their name. I then had the code store the input and assign it to the user by calling easyrtc.setUsername(theusernameoftheuser), also in the main.js of the page. That was successful, and now the transcript was looking a bit better!
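Roughly, the name handling looks like the sketch below. The input id, app name, and function names are placeholders for whatever the entry page actually uses:

```javascript
// Sketch of the name handling (placeholder ids and names, not our exact code).
// The name field is marked required in the HTML, so the page will not let a
// user join a room without filling it in.
var nameInput = document.getElementById("username"); // <input id="username" required>

function joinRoom() {
    // Store the name with EasyRTC before connecting, so other peers can look it up later.
    easyrtc.setUsername(nameInput.value);
    easyrtc.connect("captionApp",
        function (myEasyrtcid) { console.log("connected as " + myEasyrtcid); },
        function (errorCode, message) { console.log("connect failed: " + message); });
}

// When printing a remote user's text, translate their easyrtcid back into the
// username they typed in, e.g. "Emelia: Hello".
function labelFor(senderEasyrtcid) {
    return easyrtc.idToName(senderEasyrtcid) + ": ";
}
```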
Now, this was a bigger one to fix. The text stream for ASR was streaming to the transcript div successfully; for RTT, however, there was an error. Every time a key is pressed, it prints to the transcript div. So to demonstrate, if I were to type “Hello” in the text box, the transcript div would show:
Emelia: H
Emelia: e
Emelia: l
Emelia: l
Emelia: o
And that is a pain in the b*** and does not work. This was because the peer message listener is set to listen for each key that is pressed, so with the code I added, it streams each character separately. We approached this problem on Thursday, and Norman had a great suggestion: add a timer to print the text instead of having it streamed directly onto the transcript div. The code would start a 5-second timer after every text that is streamed, and if the 5 seconds go by with no new text, it would print the buffered text to the transcript div. This way, the text would be streamed in sentences rather than character by character. Norman helped write up a draft of the code for the concept, but by that time it was Friday afternoon and the week had finished.
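The timer idea is basically a debounce: buffer the incoming characters and only flush them to the transcript once the sender has been quiet for 5 seconds. A rough sketch of the concept (my own take, not Norman’s exact draft) could look like this:

```javascript
// Sketch of the timer/debounce idea for RTT (not the exact drafted code).
// Incoming characters are buffered per sender; a 5-second timer restarts on
// every new character, and only when it fires do we print the whole buffer
// as one line in the transcript.
var buffers = {};   // senderEasyrtcid -> accumulated text
var timers = {};    // senderEasyrtcid -> pending timeout id
var FLUSH_DELAY_MS = 5000;

function onRttCharacter(senderEasyrtcid, character) {
    buffers[senderEasyrtcid] = (buffers[senderEasyrtcid] || "") + character;

    // Restart the 5-second countdown every time a new character arrives.
    clearTimeout(timers[senderEasyrtcid]);
    timers[senderEasyrtcid] = setTimeout(function () {
        appendToTranscript(senderEasyrtcid, buffers[senderEasyrtcid]);
        buffers[senderEasyrtcid] = "";
    }, FLUSH_DELAY_MS);
}

function appendToTranscript(senderEasyrtcid, text) {
    var line = document.createElement("p");
    line.textContent = easyrtc.idToName(senderEasyrtcid) + ": " + text;
    document.getElementById("transcript").appendChild(line);
}
```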
So, in all, this week was another successful week. We were able to get the transcript up and running (with a few errors still to fix) as well as add a name/ID for each user. Next week, we should be able to check the transcript off our list and move on to adding another ASR - AppTek (which I will explain more next week) - and then integrate all three ASRs - MS Azure, GCP Web Speech API, AppTek - together and see which one works best. See you all next week! :)