TMS, thank you, will check that out as well.
To add a little meat to the request: I'm building a tool that compares two documents, which are different versions of transcripts from audio calls, and assigns a score based on the content of one "golden" version. At this point in the process, I have two text strings (StringA and StringB) that I'm normalizing before tokenizing and dumping into a pair of columns on the spreadsheet.
So what I'm trying to do in this phase is remove score deductions we don't really care about, like when Transcript A has "There is 1 line of text here" and Transcript B has "There is ONE line of text here".
I have a hidden sheet that has 132 values and their replacements (besides numbers, there are contractions and a few other terms I want to swap out).
Since these transcripts will be between 20 and 50,000 words each, I think looping through the cells to replace may add a bit too much time, but I haven't finished making the adjustments to the first code bit you provided to test that out.
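To make the idea concrete, here is a rough sketch of doing the swap on the whole string in memory, before anything is tokenized or written to cells. It is in Python purely for illustration; the same lookup-table approach should carry over to whatever language the workbook macro uses. The `REPLACEMENTS` entries, the `build_normalizer` name, and the choice to lower-case everything are my assumptions, standing in for the 132 rows on the hidden sheet.

```python
import re

# Hypothetical stand-in for the hidden sheet: value -> replacement.
REPLACEMENTS = {
    "1": "one",
    "2": "two",
    "can't": "cannot",
    "won't": "will not",
}

def build_normalizer(replacements):
    """Return a function that applies every replacement in one pass."""
    # Lower-case both sides so "ONE" and "one" end up identical (assumption).
    table = {k.lower(): v.lower() for k, v in replacements.items()}
    # Longest keys first, so a longer term wins over a shorter one it contains.
    keys = sorted(table, key=len, reverse=True)
    # The lookarounds keep "1" from matching inside "10" or "a1".
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )

    def normalize(text):
        # One scan of the string; each match is looked up in the table.
        return pattern.sub(lambda m: table[m.group(0)], text.lower())

    return normalize

normalize = build_normalizer(REPLACEMENTS)
string_a = normalize("There is 1 line of text here")
string_b = normalize("There is ONE line of text here")
```

Because the pattern is built once and each transcript is scanned once, the cost should grow with the length of the text rather than with the number of cells times the number of replacement rows, though I haven't timed it on a 50,000-word transcript.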
Thanks again for your guidance!