What is the future of AI, and how will it affect the way that we conduct work? In this episode we speak with an expert in the field: Vijay Parthasarathy, Head of AI and ML at Zoom. Vijay draws on his experiences in AI at Zoom, Meta, and Apple, as well as the unique industry perspective of Zoom. Because Zoom is a leading video chat application used by businesses and individuals globally, its AI Companion will no doubt affect the future of work, and we as engineering leaders need to be prepared for these changes in our own workflows as well as in the products we build.
Vijay Parthasarathy is Head of AI and ML at Zoom, where he is helping to pioneer Zoom’s human-centered innovations aimed at developing products that empower smarter experiences and workflows, easing everyday pain-points and unleashing worker productivity, connectivity and engagement. Prior to joining Zoom, he also worked as an engineering leader in AI and ML at Meta and Apple, where he led teams that built distributed machine learning frameworks. He was also a committer to the Apache Cassandra project, the open source NoSQL distributed database, and previously wrote a book on Learning Cassandra for Packt Publishing.
Join us for this revealing conversation about the future of AI and meeting tools, and how virtual meetings may become more productive than in-person meetings!
Listen on Spotify
Listen on Apple Podcasts
Watch the video:
Show notes with links to jump ahead are below
Show Notes from Episode 25 – Vijay Parthasarathy on AI and the Future of Work
Timestamp links will open that part of the show in YouTube in a new window
- Introduction
- 00:00 Vijay’s opening quote: “The AI Companion today can summarize a meeting, take notes, find the action items, and send you an email at the end of the meeting … this enables people to have a real time conversation and focus on the conversation … it’s more accurate nowadays … the second part is, say you are late to a meeting, you can use AI Companion to query what happened in the meeting, without interrupting anyone … these features will help you be more productive and engaged in a meeting.”
- After the opening quote, Arin and David talk about their views on the future of work and how AI will change it. David notes that at the beginning of 2023 we seemed to be at the peak of a bad hype cycle, where all of the talk was about AI taking our jobs. While that’s still a concern and topic of discussion, we also see more realistic discussions happening now about practical ways that AI will augment our work. We are just beginning to scratch the surface of what is possible!
- Arin introduces Vijay: Vijay Parthasarathy is Head of AI and ML at Zoom, where he is helping to pioneer Zoom’s human-centered innovations aimed at developing products that empower smarter experiences and workflows, easing everyday pain-points and unleashing worker productivity, connectivity and engagement. Prior to joining Zoom, he also worked as an engineering leader in AI and ML at Meta and Apple, where he led teams that built distributed machine learning frameworks. He was also a committer to the Apache Cassandra project, the open source NoSQL distributed database, and previously wrote a book on Learning Cassandra for Packt Publishing.
- A new age of productivity?
- 05:30 Arin starts the conversation by asking Vijay if we are at an “Oppenheimer” moment where something very destructive is being unleashed on the world, or are we about to enter a new age of productivity enabled by AI? Vijay notes that the current progress of AI is something that computer scientists wanted 40 years ago, and it took a lot longer than expected. But the progress in the last 4-5 years has been incredible and is increasing.
- Vijay is an optimist and doesn’t see widespread job loss, but notes that “I think these models are getting really good at solving some specific problems and the ability for it to create new jobs, new possibilities. The problems which we were not being able to solve, we were not able to solve before, can be solved better now.”
- David talks about the impact of ChatGPT on universities and higher education, and how many educators now expect and require their students to use AI in assignments. Vijay agrees this is an area with big impact and that it will change the workflow of students and educators. David talks about using a local copy of an LLM to avoid privacy concerns, and he has found great productivity from having “conversations” with that LLM, including using it for drafting complicated client communications.
- Is AI here to help you?
- 11:15 Arin talks about a recent series of podcast episodes on Freakonomics, and how an economist interviewed in that series said something like “We should not be thinking about how to build AI that is better than humans at chess, but how to use AI to make humans better at chess.” Arin and Vijay agree that they like that framing of using AI to assist humans and make them better, not trying to replace them.
- Vijay notes that “I believe most of the jobs will be enhanced and people will be more productive in doing their job and being more efficient and move on a higher level cognition to think about what are the problems which are more important for them, for the business and try to solve them. And I think right now, the way the tools are, it is going to augment your productivity. It is going to augment what you do already.”
- AI in technical workforces
- 13:56 Vijay jokes about how software developers often argue over which text editor, Vim or Emacs, is better, but software IDEs have made coding more efficient than plain text editors. This is an example of how new tooling can help developers be more productive and take on more complicated work.
- Vijay talks about the impact of AI on software development: “So there is autocomplete compiling the smaller piece of code and then showing you feedback immediately. You don’t need to write code perfectly on first time right. I mean, back in the days when you punch the wrong card, you have to redo your whole program … I think this wave is interesting because [AI] will help you write better code. It can help you write better code faster for your task, writing better test cases, writing better documentation.”
- Arin and David also talk about the potential impact of AI on other technical roles like DevOps and system monitoring and maintenance.
- Vijay’s journey with AI
- 17:00 Vijay talks about what he’s seen with the progression of AI over his career, working at Zoom, Meta, and Apple in AI roles. He explains that classic ML could be more explainable in terms of probability. At Meta, it was all about scale and training models quickly since they had so much data. Now more people “understand the imperfections of AI, what it can do, what it cannot do, which is a good news overall.”
- After being asked by David about what he sees as coming next, Vijay talks about multimodal AI. ChatGPT is primarily text-based (as of the time of our interview), and soon we will see it become more multimodal and deal with video and audio. Arin and David both agree with this prediction and talk about their experiments with tools that allow for ingestion of audio directly into an LLM for things like podcast transcriptions.
- Automatic Translation
- 23:45 Arin asks Vijay about the future of automatic translation: do we still need to learn foreign languages, or can meeting tools automatically translate for us in real time? Vijay notes that transcription and translation are available today in Zoom for text modes. Their next step is to improve latency and accuracy, since there is some delay in the translated text. As with most real-time captioning, the text autocorrects itself as sentences are spoken.
- Vijay talks about the challenges of speech to speech translation, where you have the audio converted to another language: “So that’s the challenge in live transcription. So when you talk about speech to speech translation, the cost of correcting is much higher, right? So humans do this interpretation. Like if you go to big conferences, with different delegates from different governments, the European Parliament for example, has multiple [human] interpreters in different languages.” Current technology doesn’t allow for that sort of accurate voice translation in high-stakes conversations, but it likely will in the future. Accuracy also varies a lot across languages right now, depending on the popularity of each language and how much data is available to models for training.
- The AI Companion in Zoom
- 28:20 Zoom has a goal to make Zoom meetings more productive than an in-person meeting, which is a fascinating goal. Vijay talks more about the role of AI in Zoom now: “The AI companion today for example, can summarize the meeting, take notes, find all the action items, can send you an email at the end of the meeting … So this enables people to have real time conversation, focus on the conversation rather than taking notes and stuff like that.”
- “And the second part of it is let’s say you are late to a meeting, you can use AI companion and query what happened in the meeting and without interrupting people.” Instead of interrupting the meeting to ask what happened so far, you can ask AI companion to explain what happened so far. “These features will help you be more productive and be more engaged in a meeting.”
- Vijay also talks about other upcoming features they want to build into AI Companion, including features that help you prepare for a meeting, such as drafting an agenda.
- A message from our sponsor: WebRTC.ventures
- 31:12 Building custom WebRTC video applications is hard, but your go live doesn’t have to be stressful. WebRTC.ventures is here to help you build, integrate, assess, test, and manage your live video application for web or mobile!
- Using AI to augment instead of replace
- 32:03 Arin talks about how he’s fascinated by the ideas of how AI can help us to be more productive. It’s not about replacing jobs, but allowing us to do more than we could previously. As an example, he talks about using AI to help automatically produce marketing clips from the Scaling Tech Podcast, and how tools like that can help us do more than we do now.
- Arin brings up sentiment analysis in the call center, which can show how an agent is handling a call with a customer and whether the customer’s sentiment is improving, indicating that they are being helped. Vijay talks about the Zoom virtual agent and contact center products. You can also track what’s going on in a call to see if the agent is asking enough questions, and use metrics to see how the call contributes to revenue growth.
- AI Architecture in Zoom
- 35:21 How much of AI in Zoom is done in the cloud versus on the edge or client side? Vijay explains that this is an interesting mix due to the architecture of Zoom as a desktop client. Features like virtual backgrounds, audio suppression, and noise cancellation are all AI models, and they run on the edge in your client application. This increases your CPU usage when using the application, but it also increases privacy since that processing stays local, so it’s a tradeoff.
- More computationally intensive models, such as large language models, are run in the cloud. Zoom uses their own models as well as third party models. Automatic speech recognition, machine translation, and other such models are better deployed in the cloud.
- Zoom also applies AI in the background to improve the call experience and call quality in general.
- Conversational AI
- 37:49 Conversational AI is another modality for communication, and Vijay talks about this interesting use case. The chatbots need to be built to use corporate data and to know what they can and cannot answer, so that the answers are grounded. The data needs to come from your knowledge base, but the chatbot can still use LLMs to make the answers more conversational.
- Privacy in AI at Zoom
- 40:27 Arin asks Vijay, “How is data privacy addressed in AI at Zoom and kind of what have you learned that you would recommend to others implementing an LLM into their business practice?”
- Vijay refers to a recent public statement they made that “we [at Zoom] don’t use audio, video, screen sharing, chat or any type of poll or any type of content in the meeting to train our ML models, [nor do] we allow our third parties to train their ML models, either. And we take privacy seriously, and we do believe in responsible AI development.”
- Vijay also talks about using call data to generate updates to the model and feed back into the knowledge base used for answering questions, as well as how a call summary might eventually differ depending on which meeting participant it is provided to. For example, your personal meeting summary may include action items specific to you, but leave out details from the conversation that are less relevant to you.
- Virtual Coach
- 44:45 Vijay talks about the idea of a virtual coach, someone that you can interact with in a meeting and have a conversation back and forth. This is something Zoom is experimenting with around sales conversations initially, though it could be used in other use cases also. Imagine that you are able to have sales reps practice their pitches with a virtual coach.
- Zoom’s bet on AI
- 50:37 Vijay talks about Zoom betting heavily on AI and using it to improve productivity. They see ways to make Zoom a video platform that is used before, during, and after the call, and focusing on those value adds is how they will continue to grow and compete.
- Conclusion
- 52:05 We conclude the conversation, and Vijay recommends that listeners looking to learn more about the use of AI in Zoom visit Zoom.us for details on many of the new features discussed in this episode.
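
For readers who want a concrete picture of the "grounded answers" idea from the Conversational AI segment (37:49), here is a minimal Python sketch. It is not Zoom's implementation, which the episode does not describe in detail. The knowledge base entries, the `grounded_answer` function, and the word-overlap threshold are all hypothetical, chosen only for illustration. The sketch answers only from a small knowledge base and declines when nothing matches well enough; a real system would likely use better retrieval and then pass the retrieved text to an LLM to phrase it conversationally.

```python
import re

# Hypothetical knowledge base: every entry here is invented for illustration.
KNOWLEDGE_BASE = {
    "How do I reset my password?": "Use the 'Forgot password' link on the sign-in page.",
    "How do I schedule a meeting?": "Open the Meetings tab and choose 'Schedule'.",
}

REFUSAL = "I don't have that information in my knowledge base."


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def grounded_answer(question, min_overlap=0.5):
    """Return the best-matching knowledge base answer, or decline.

    Similarity is the Jaccard overlap between the word sets of the user's
    question and each known question. The 0.5 threshold is an arbitrary
    assumption, not a recommended value.
    """
    asked = tokens(question)
    best_score, best_answer = 0.0, None
    for known_question, answer in KNOWLEDGE_BASE.items():
        known = tokens(known_question)
        union = asked | known
        score = len(asked & known) / len(union) if union else 0.0
        if score > best_score:
            best_score, best_answer = score, answer
    # Declining is what keeps the bot grounded: it never answers
    # from outside the knowledge base.
    if best_score < min_overlap:
        return REFUSAL
    return best_answer


if __name__ == "__main__":
    print(grounded_answer("How can I reset my password?"))
    print(grounded_answer("What is the weather tomorrow?"))
```

The key design choice is the explicit refusal path: the bot knows what it cannot answer, which is the property Vijay describes, and the conversational polish can be layered on top of the retrieved answer afterwards.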