Hear Me Now: Fact Versus More Context, Part I

Ross Weinstein, Monday March 30, 2015

As Oscar Wilde once said, "There is only one thing worse than being talked about, and that is NOT being talked about." Obviously, for a celebrity those are words to live by, but if you are anything like me, then you are more likely to feel visceral emotion when you read something in print about yourself, your family, your company, or things that you care deeply about.

Most recently, Voicebrook and our customers were featured in an article in CAP Today, entitled "Hear me now? Another audition for speech recognition." The author, Kevin B. O'Reilly, highlighted the positive impacts of speech recognition, while tempering the positives with some open-ended questions about its value for every Pathologist. Overall it was a VERY POSITIVE article about our customers' results that addressed many of the benefits of VoiceOver as the leading speech recognition reporting solution for Pathology.

Despite the article being very positive, I walked away feeling like a few of the examples used did not tell the entire story and could be taken the wrong way without the benefit of additional context. I also understand that there is only so much context that can be provided in an article of that length. This blog attempts to fill in some of those gaps, as well as bring focus to those things that help our clients save time, money and lives.

More Context: Pathology Report Context Plays a Major Part In Word Recognition

"When Pete Fisher, MD, says his name aloud, the speech-recognition system he uses spits out the words “deep fissure” on the screen."

While this is a fun way to start an article about speech recognition in Pathology, it is not a practical example of the recognition accuracy you should expect for Pathology reports. In a recent blog post, Andrew Boutcher discusses the importance of context in helping speech recognition software recognize words.

The Dragon Pathology speech model used in VoiceOver is intended for Pathology report creation and it uses that context to determine the recognition of words in a report. Absent specific phrase training for "Pete Fisher" or adding his name into the context, as described in Andrew's blog post, "deep fissure" would be the likely result. With additional training and adding "Pete Fisher" to the vocabulary as a phrase, the recognition of Dr. Fisher's name should be highly accurate, if not 100%.

Fact: You Can Dictate and Sign-Out a Report in One Session

“It’s kind of revolutionized how we do the whole process of signing out cases,” Dr. (Pete) Fisher says of the system he uses, Voicebrook’s VoiceOver. “What happens is, I sit at my desk with the computer and my case. As I speak, I watch the text spit out on the screen, read it, review it, hit the button and it’s gone. It goes to the hospital and the doctors—it’s all done.”

Yes, the majority of our 3500 VoiceOver users choose to create and sign out reports in real-time. Being able to deliver a highly accurate report to the patient care team as soon as you finish dictating can have profound benefits for patient outcomes and can dramatically reduce transcription costs. That said, we do offer alternative workflows for the entire Pathology practice that still keep transcriptionists in the mix for those practices that are not 100% ready to move to real-time dictation and sign-out. The long-term goal is to convert everyone over to real-time, and in Dr. Fisher's words, when you "sit down to sign out a bunch of cases, it’s pew! pew! pew!—one after the other—and they’re gone."

More Context: Ten Percent of UPMC Pathologists Use the Software

Despite the strong results in the gross room with speech recognition, UPMC has not taken the tack of requiring pathologists to adopt the technology. Some handling high volumes of dermatopathology cases have made the switch, he says, but overall fewer than 10 percent of the health system’s 90-plus pathologists are using the technology.

“We still have traditional methods of reporting available to them,” Dr. Parwani says. “When it comes to adoption of technology, even when the technology is good, if there’s an alternative available that’s already in place and that people are accustomed to, people are resistant to change. . . . That’s my experience with any technology, and I’ve deployed many different types of technology across the hospital.”

Dr. Parwani makes a great point. It is difficult to force people to use any technology, especially if you leave them the choice of not having to change, and frankly forced technology adoption can have very negative consequences.

What is lost in these statements is that UPMC is using VoiceOver for 100% of its gross reports, and we have found that gross reporting has been the biggest bottleneck when it comes to turnaround time. Enabling the gross room was the intention of the project, and all UPMC PAs and residents are using it, and using it effectively. The 10% of the Pathologists who are using VoiceOver have done so of their own volition. The fact is that in an academic laboratory, where volume is not the priority, these technologies offer different benefits and address different pain points than they do in high-volume private laboratories and non-teaching hospitals.

Fact: Laboratories Using VoiceOver Will Create More Accurate Reports

UPMC conducted a study of their use of VoiceOver and they found..."the average turnaround time fell by 81 percent, from 554.4 minutes to 102.8 minutes. The median TAT was slashed at an even greater rate of 85 percent, dropping from 203.5 minutes to 30. Most gross descriptions were completed within an hour using speech recognition. But surely the rate of “deep fissures” must have been greater using the technology? No, Dr. Parwani and his colleagues found. Transcription errors fell by 48 percent (Kang HP, et al. Am J Clin Pathol. 2010;133[1]:156–159)."
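For readers who want to check the percentages quoted above against the raw minutes, here is a minimal sketch of the arithmetic. The function name `percent_reduction` is my own, chosen for illustration; it is not taken from the study or from VoiceOver, and the only inputs are the four turnaround times quoted from the article.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Turnaround times in minutes, as quoted from the CAP Today article.
mean_drop = percent_reduction(554.4, 102.8)
median_drop = percent_reduction(203.5, 30.0)

print(f"Average TAT reduction: {mean_drop:.1f}%")
print(f"Median TAT reduction:  {median_drop:.1f}%")
```

Rounded to whole numbers, both results agree with the 81 percent and 85 percent figures in the article.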

As Kevin pointed out in his article, there continues to be skepticism about the accuracy of speech recognition versus a human transcriptionist. I believe this skepticism derives from a comfort level of having multiple people check a report, and our own discomfort with interpreting certain accents. The problem with this thinking is that while most transcriptionists are excellent at what they do, they are still people and people are inconsistent by nature.

By contrast, speech recognition software can't make spelling mistakes, and when a PA or Pathologist dictates a case, they can review the report for "content and diagnostic accuracy" without worrying about spelling, while the specimen is still in front of them. In a traditional transcription workflow, the delay between dictation and review can introduce its own set of diagnostic inaccuracies, since the case is no longer fresh in mind and the specimen is no longer in front of you.

As for accents, an accent simply means that words are consistently pronounced the same way. This makes for excellent recognition by a system that is programmed to learn how you speak, and since the pathologist or PA reviews the dictation while it is still on the screen, any misrecognitions are caught in real-time.

All told, the UPMC Case Study showed that the skepticism is unfounded, and that the impact of a highly accurate and adaptable speech recognition solution coupled with real-time editing was a dramatic 48% decrease in transcription errors.

This concludes Part I of this blog post. In the remainder of this discussion I will focus on the Stony Brook case study, the learning curve, transcription expense and availability, and Chester County's use of VoiceOver as a way to discuss what type of users benefit most from the solution. Hopefully this added context helped clarify some of the less obvious points of the article. I look forward to sharing the rest with you soon.