A year ago in the 2024 Relevance Report, I detailed how we were “atomizing” our processes in communications at Microsoft to identify where we could apply generative AI. I talked about the twenty-step journey that an earned media story takes in our organization and noted where there was opportunity for AI (and automation). Over the last year, we have been methodically building and applying AI across those steps.
Step 10 in that process for us is where interviews are conducted, recorded and transcribed. This is a process many of us will be familiar with – setting our phone or recorder down on the table to capture the audio of an interviewer talking with our spokesperson. We then take that audio file and have it transcribed, sometimes by hand, other times with existing AI tools. It can be time-consuming, often decentralized, and at times, prone to error. We had a strong sense that AI could transform the process and an instinct that it would bring other benefits.
And so, we set off to build a tool that any of our teams (including our agencies) could use to upload a file to a secure location and have it quickly transcribed and simultaneously translated. A few key points in that last sentence are worth noting:
- Secure: These recordings often contain unreleased material that is highly confidential, so we built a solution that stores the files securely on our cloud.
- Translated: At times, it’s useful to have these transcripts in additional languages for our teams around the world to have insight into what was said.
- Quickly: Speed is often of the essence, so a simple solution with a fast turnaround time for delivering a transcription was essential.
Using Microsoft’s Power Platform and our Azure AI services, we were able to quickly build a solution that met all of these needs, and it’s now released and being put to use by our teams. Audio files are uploaded and transcripts are returned as a Word document, often in a matter of minutes. As we rolled out this AI-powered solution, we didn’t just meet our initial goals; we uncovered valuable lessons and unexpected benefits along the way.
- This is a solution for 80% of the cases. There will remain times when we still need human intervention to get a transcription to 100% accuracy. And there are other tools (such as automatic transcription in Microsoft Teams) that do the job just fine. Horses for courses, as they say.
- A useful byproduct is helping improve our data estate. Transcripts previously sat in many different places – often an individual inbox, or desktop folder – meaning the IP was siloed and locked up. Our solution ensures that all transcripts are stored securely in a single location.
- That created another useful byproduct: we can now reason across those transcripts using AI and ask questions of them, such as “What did our executive say in that interview two months ago?”
We’re building more and learning as we go. Tools such as transcription and translation free up our teams to do more of the work that they love, and help us find new ways to unlock the gold mine of information we create on a daily basis.
As AI continues to revolutionize communications, the potential to streamline, innovate, and unlock new insights is boundless. This is just the beginning. Now is the time to embrace this AI-driven future, to push the boundaries of what's possible, and to ensure that we stay ahead in the evolving world of communications.
Steve Clayton is the vice president of communications strategy at Microsoft. He has more than 25 years of experience at Microsoft across technical, strategy, and storytelling roles, and leads the Microsoft strategy team whose primary focus is to reinvent how the company operates its communications.