An Interview with ReadSpeaker® CTO Fredrik Larsson
Does the mere mention of text-to-speech (TTS) systems evoke memories of the robotic voice of the HAL 9000 computer in the epic 1968 film, 2001: A Space Odyssey? The latest generation of TTS technology bears little resemblance to the systems that were in use in the 1960s. In fact, recent advances in TTS features – particularly in the areas of voice quality, web-based access, and mobile compatibility – have transformed TTS from an accessibility-focused technology for the blind into an attractive, affordable tool for the general public. In the education space, it has opened new doors for students who have dyslexia or other learning challenges, experience eye fatigue while studying, want to access course materials on handheld devices with tiny displays, have language barriers, or need to listen to course content while commuting to and from school. A number of studies conducted over the past 20 years have provided compelling evidence that bimodal content presentation – reading content while hearing the words – can result in improvements in reading comprehension, recall, vocabulary skills, motivation, and confidence.
Text-to-speech systems have been available to the general public for about 30 years, but early systems were known for their stiff, robotic voices. The technology was expensive, not available for all devices or operating systems, and required each user to install the text-to-speech software on his or her personal computer. All of these factors clearly limited the possible application areas for the technology. Due to these shortcomings, early systems – particularly those known as “screen readers” – were primarily used as computer accessibility tools by the blind and visually impaired. The current generation of TTS software has changed all of that, with web-based functionality that can convert online content to speech across a wide range of devices and provide high voice quality. I recently had an opportunity to catch up with Fredrik Larsson, Chief Technology Officer for ReadSpeaker, one of the pioneers in this area.
Jeanne Heston (JH): What was it that first attracted you to ReadSpeaker 10 years ago?
Fredrik Larsson (FL): Niclas [Niclas Bergstrom, the founder and CEO of ReadSpeaker] was looking for blind people to test new ideas for using audio on the Internet. I was at the university at the time, studying engineering physics and computer science, and I had been using screen readers for years, so I was intrigued by the notion of applying my knowledge of existing systems to modern technology on the Internet. When I joined the company, we did not know where the technology would take us, but we were convinced that text-to-audio had the potential to benefit a broader audience. In the early days, we applied the technology to email messages, homepage reading by phone, and web-based newspapers, so our focus was always on improving the quality of the reading – leveraging existing technologies and perfecting the front-end algorithms to suit a broader target group. That is what led us to invent the ReadSpeaker technology.
JH: What sort of response have you received from ReadSpeaker users?
FL: The response has been extremely positive. Users really like the fact that there is nothing for them to download onto their computers. Once a website developer has implemented ReadSpeaker, all of the content within the site is speech-enabled for all site visitors – including all page content, RSS feeds, Word and PDF documents, and mobile apps. We have also heard from users who benefited from ReadSpeaker-enabled sites and cloud-based software during their recoveries from accidents and surgeries – temporary disabilities that lasted for months.
JH: How do you determine which new features should be added to the ReadSpeaker service?
FL: In the beginning, we mainly conducted research with users of screen readers – all blind – because they understood the benefits and shortcomings of existing systems. They provided us with a great deal of information that we used to refine the initial releases. We eventually reached out to a broader set of target users for our research, but we had to take a different approach because they were not familiar with screen readers. So we asked the question, “How should this service work in order to be useful to you?”
JH: What has it been like to lead the ReadSpeaker development team for the past 10 years?
FL: It’s been very exciting for all of us: a chance for team members to learn about technology and applications that they would not be exposed to at a typical web development company. The technology is evolving so quickly that there are really no reference books to tell you what to do.
JH: Where do you see this technology going in the future?
FL: In the future, the quality of the voices will continue to improve, encouraging an increase in use of text-to-audio by the general public to access the content that they now read on a daily basis. I also expect to see increases in the number of online sites and content types that will be speech-enabled, including concierge services, forms, and documents.
Have you or your students used text-to-speech features within the past year or two? If so, we would love to hear from you. Please share your experiences in the comments.