Sidecar Sync
Welcome to Sidecar Sync: Your Weekly Dose of Innovation for Associations. Hosted by Amith Nagarajan and Mallory Mejias, this podcast is your definitive source for the latest news, insights, and trends in the association world with a special emphasis on Artificial Intelligence (AI) and its pivotal role in shaping the future. Each week, we delve into the most pressing topics, spotlighting the transformative role of emerging technologies and their profound impact on associations. With a commitment to cutting through the noise, Sidecar Sync offers listeners clear, informed discussions, expert perspectives, and a deep dive into the challenges and opportunities facing associations today. Whether you're an association professional, tech enthusiast, or just keen on staying updated, Sidecar Sync ensures you're always ahead of the curve. Join us for enlightening conversations and a fresh take on the ever-evolving world of associations.
Sidecar Sync
Project Strawberry, Hugging Face Speech-to-Speech Model, & AI and Grid Infrastructure | 47
In this special hurricane edition of Sidecar Sync, Amith and Mallory dive into the intersection of AI and associations. From the upcoming launch of OpenAI's Project Strawberry, set to redefine how AI handles reasoning, to Hugging Face's cutting-edge speech-to-speech model, the hosts cover the latest AI advancements. They also explore how AI's power demands are fueling climate tech investment in grid infrastructure. Plus, you'll get a look at how AI could help manage power grids and help associations serve members and customers more efficiently.
🛠 AI Tools and Resources Mentioned in This Episode:
Hugging Face Speech-to-Speech ➡ https://huggingface.co
Project Strawberry ➡ https://openai.com
OpenAI ➡ https://openai.com
The Hottest Sectors in Climate Tech? Follow the VC Money ➡ https://shorturl.at/nzmKu
‘Strawberry’ Article ➡ https://shorturl.at/ouPay
Chapters:
00:00 - Introduction
03:26 - Overview of Leveraging AI in Associations
06:02 - How Associations Can Use AI to Unlock Their Data
12:22 - Counter-Positioning Your Business for Success
15:19 - Multi-Tiered AI Offerings for Associations
18:28 - The Plummeting Cost of AI Tokens
24:41 - Impacts of Token Costs on AI Accessibility
28:46 - Generational Differences in AI Adoption
34:51 - Why Leaders Need to Adopt AI Now
🚀 Follow Sidecar on LinkedIn
https://linkedin.com/sidecar-global
👍 Please Like & Subscribe!
https://twitter.com/sidecarglobal
https://www.youtube.com/@SidecarSync
https://sidecarglobal.com
More about Your Hosts:
Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.
📣 Follow Amith on LinkedIn:
https://linkedin.com/amithnagarajan
Mallory Mejias is the Manager at Sidecar, and she's passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.
📣 Follow Mallory on LinkedIn:
https://linkedin.com/mallorymejias
Small models are where a lot of the action is, because they allow you to do things on device and at low cost or, in some cases, effectively no cost. That, to me, is exciting, and you can build a lot of applications with it.

Welcome to Sidecar Sync, your weekly dose of innovation. If you're looking for the latest news, insights and developments in the association world, especially those driven by artificial intelligence, you're in the right place. We cut through the noise to bring you the most relevant updates, with a keen focus on how AI and other emerging technologies are shaping the future. No fluff, just facts and informed discussions. I'm Amith Nagarajan, chairman of Blue Cypress, and I'm your host.

Welcome back to the Sidecar Sync, your resource for all things association and artificial intelligence. My name is Amith Nagarajan.
Speaker 2:And my name is Mallory Mejias.
Speaker 1:And we are your hosts and we have another action-packed episode lined up for you today, with three interesting topics at that intersection of AI plus associations. Before we dive in, let's take a quick moment to hear from our sponsor.
Speaker 2:Today's sponsor is Sidecar's AI Learning Hub. The Learning Hub is your go-to place to sharpen your AI skills, ensuring you're keeping up with the latest in the AI space. With the AI Learning Hub, you'll get access to a library of lessons designed around the unique challenges and opportunities within associations, weekly live office hours with AI experts, and a community of fellow AI enthusiasts who are just as excited about learning AI as you are. Are you ready to future-proof your career? You can purchase 12-month access to the AI Learning Hub for $399. For more information, go to sidecarglobal.com/hub. Hello listeners, this is a special hurricane edition of the Sidecar Sync podcast, for Amith, who's in New Orleans and about to experience Hurricane Francine. How's it going, Amith?
Speaker 1:Well, right now it's the calm before the storm. The storm's supposed to come through here in a few hours, I think, maybe later this evening at this point. At the moment we have power, at the moment I've got an internet connection, and we are recording the podcast. I've got my kids at home; their schools are closed today and probably tomorrow as well. Hopefully the whole area will fare reasonably well. I think it's going to be a Category 1 when it hits the coastline and then quickly downgrade to a tropical storm. So hopefully everyone in the area stays safe and there's a minimal amount of damage. But these things usually do hit us roughly once a year; we get something, maybe not a direct hit, but something pretty close by, as you know from living here, and it's just kind of part of the flow of New Orleans life. I'm sure there are a lot of people in the French Quarter just hanging out drinking hurricanes.
Speaker 2:I do remember one year we had a set of mutual friends whose wedding in the French Quarter fell during a hurricane, and they still chose to have it. I didn't go, but apparently it was a great time because the storm was not that bad, of course, thankfully. And you know, I'm sure you're right, there will be people in the French Quarter having a grand old time.
Speaker 1:Well, you know, if history is a guide for structural stability, the French Quarter can be a good place to think about, because a lot of those buildings are 250-plus years old and they're still there. The ground is a little bit higher in the French Quarter than it is in other parts of town. Up here I'm in Uptown, close to the universities, and our ground is just a touch higher; I think maybe we're a couple of feet above sea level instead of five feet below sea level. But yeah, it's dicey when you're in New Orleans and there's a big storm coming. So hopefully AI will help us figure out how to fix that over time.
Speaker 2:And it really should. I will say this has made me happy that I'm in Atlanta. This is the first time, having grown up in Louisiana and lived there for most of my life, that I am not impacted by this hurricane, so that feels good. But we're absolutely thinking of everybody in the storm's path, and it got me thinking: this is episode 47 of the Sidecar Sync podcast, and we have never missed a week of posting, come internet problems, hurricanes, other bad weather, or travel. So I was just thinking, I'm pretty proud of us.
Speaker 1:Yeah, that's pretty awesome. And also you say 47, I'm thinking that's pretty amazing because I don't keep track of the episode numbers that closely in my head, but I do have a general idea of where we're at. I knew we're approaching a year, so we'll have to think of something fun to do for the 52nd episode.
Speaker 2:And the 50th. I feel like the 50th is going to be a big one too.
Speaker 1:50th is good too, yeah.
Speaker 2:Yeah, so maybe we'll do two back-to-back promos. Y'all stay tuned, all right. Today we will be talking about OpenAI's Project Strawberry, then Hugging Face's new speech-to-speech model, and finally we'll wrap up with a talk around AI and grid infrastructure tech. OpenAI's Project Strawberry is an upcoming AI model that's generating a lot of interest in the tech world due to its reported advanced capabilities. As a reminder, OpenAI is the company behind ChatGPT, and you may recall, as a listener of the Sidecar Sync podcast, that we covered Strawberry in an earlier episode, when the name of the model was Q-Star. Now I want to give a little disclaimer: while Project Strawberry seems like it will be quite impressive, it's important to note that a lot of the information we have so far is based on reports and leaks. The full extent of its capabilities will become clearer upon its official release and public testing. So what is it?
Speaker 2:Project Strawberry is said to possess significantly improved reasoning and problem-solving abilities compared to current AI models. Some of its reported capabilities include advanced mathematical skills, being able to solve unseen math problems that current chatbots struggle with; improved programming abilities; the capability to solve complex word puzzles, like the New York Times Connections; the ability to perform deep research autonomously on the web; and enhanced reasoning for answering subjective questions, like those related to product marketing strategies, for example. In terms of development and release, it may be integrated into ChatGPT-5, potentially releasing as early as fall of this year. OpenAI is aiming to launch Strawberry as part of a chatbot, potentially within ChatGPT, and the project is said to be in the final stages, with a possible release within the next two weeks, so we might have a very interesting episode coming up soon.
Speaker 2:Strawberry is described as a reasoning model representing the second of five stages of AI innovation defined by OpenAI, and it appears to have the ability to trigger self-talk reasoning steps multiple times throughout a response.
Speaker 2:Now OpenAI is reportedly considering high-priced subscriptions for access to its next-generation AI models, including Strawberry. Executives are weighing charging users potentially as much as $2,000 over an undetermined period of time for access to their most advanced models. So, Amith, when you sent this to me, we had a quick discussion around the idea of system one and system two thinking, so I want to define that really quickly for our listeners. You can think of system one thinking as fast, intuitive and automatic; it's your brain's quick response system, requiring little conscious effort, so you can think fight or flight. And then system two thinking is slower, more deliberate and analytical, and involves conscious reasoning and problem solving. You mentioned to me, Amith, that the current models we see right now use system one thinking, or that's the way you can think about it, whereas something like Strawberry falls into the system two category. Can you talk a little bit about this?
Speaker 1:Sure, and for those that aren't particularly familiar with this classification of system one and system two, this draws directly on the work of Daniel Kahneman, the famous psychologist who recently passed away, who wrote a book called Thinking Fast and Slow, and we'll include a link to that in the show notes. I would highly recommend that book to anyone who's interested in digging deeper and understanding how this works in the biological brain, and the similarity with AI is worth noting in some ways; we're simply borrowing those terms to help us understand the way AI works currently. So think of system one as the thinking-fast side, where it's intuitive: you're not actually thinking about thinking, you're just doing stuff, right? It's kind of like a reaction. I think I mentioned to you in the text thread when we were talking about this: if you see an alligator on the street of New Orleans, which doesn't happen too often, but you definitely see them in the ponds, and if you were very close to it, you might react by running away or you might react by jumping back. You did not stop and say, oh no, there's an alligator, it might eat me, I better leave. You didn't think through that, you just jumped away, right? So there's basically a rapid response. And then system two might be: oh, we need to figure out how to plan the Digital Now conference, and we need to figure out how to market it well so that we get hundreds of people there and it's an awesome event. That's complex, and we say, well, how do you break that down? How do you break that into smaller pieces? What is the series of steps and thoughts? Or take a problem that has no known solution, where you say, oh well, we want to create a room temperature superconductor or something like that; you have a very different approach to how you break down a problem like that.
Speaker 1:So system two thinking more broadly, I would say is something that requires metacognition, or thinking about thinking where you have to basically come up with a plan and current models, in spite of their incredible capability, do not plan, they do not reason, they simply predict the next token, as we've talked about. Essentially, think about the current generation of AI that we have available to us right now as very powerful statistical models. These are doing probabilistic, next token or next word prediction. Now what's happened is these models have scaled so much and they're so powerful that they actually have these amazing capabilities to write prose and poetry, to write code, to interact with us, to be brainstorming partners, all the amazing stuff everybody has basically been freaking out about for the last two, three years.
Speaker 1:But they don't think, they do not reason, and they most definitely do not plan, and so they get stumped on problems that are as simple as how many R's are in the word strawberry, which is why the strawberry name gets thrown out there. So, how many R's are in the word strawberry? Well, if you said, oh, okay, let me look at the word and let me count the R's, it's a pretty easy problem to solve. But if you're not doing that, if you're just saying, what would be the answer to that question based upon the corpus of text that I have, which would tell me what I should predict the number would be, it's not based on actually breaking down the problem and saying, well, to figure out how many R's are in the word strawberry, I have to look at the word, I have to count the number of instances of the letter R, and then I have to tell you that number. That's what a deterministic algorithm would do. That's what we would do also if we were thinking about the problem first.
Speaker 1:But the reason these models, including many of the most recent models, fail at that is that they don't have enough of a corpus of content that has that answer pre-written. It's a weird question, right? How many R's are in the word strawberry; how many times has that been asked prior to all of this stuff going on? So that's why these systems are failing: the training data doesn't have that, even though it's a simple computation.
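To make that contrast concrete, the deterministic version of the strawberry question really is a couple of lines of code: inspect the word and count, rather than predict an answer from training data. A minimal sketch in Python:

```python
# Deterministic counting: look at the actual characters instead of predicting an answer.
word = "strawberry"
r_count = word.count("r")  # examines each character; always yields 3
print(f"Number of r's in '{word}': {r_count}")
```

A next-token predictor, by contrast, never executes a procedure like this; it only emits whatever number looks most plausible given its training text.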
Speaker 1:So what if a model was able to determine whether it should slow down and think, as opposed to immediately react? Right, don't just jump back from the alligator with an instantaneous reaction, but take a breath, think about it, and let your bigger brain kick in to do the harder work: the longer-range planning, the step-by-step breaking of complex things into parts. AI models do not do that right now. They are about to, and Strawberry is possibly going to be the first model that has this capability. OpenAI has been talking about it for a long time, possibly through well-architected leaks and drama or whatever; you can have whatever theory you want about why they've been doing what they've been doing, but I think they've got something interesting. And I think, actually, I don't think, I know, from just reading what people are talking about across the community generally, that this is the thing researchers are working on in terms of model design. So if you can bake actual reasoning, planning and all of these capabilities into the model, it's astounding what that's going to do.
Speaker 2:Well, I can say I really like the name Strawberry a lot more than Q-Star. It's more fun, it's very memorable. Am I understanding you correctly that Strawberry is not a next-word predictor, or is there some element of it that will be, but with these other pieces in the background?
Speaker 1:The short answer is we don't know yet, because we don't know what Strawberry is or isn't, but my suspicion is that it's an ensemble or a hybrid where you have multiple different models within it. We've talked on this podcast before, Mallory, about mixture of experts models, where we have different kinds of models that are kind of combined. You know, Mistral had the first broadly available model that was open source and was an MoE model, with eight underlying models, and there's a lot of belief that GPT-4 in its first edition was an MoE model as well. And a mixture of experts model is basically multiple models that are combined together with an intelligent routing mechanism.
Speaker 1:So the question or the prompt comes in, and then the first thing the model is doing is deciding which subcomponent or subcomponents of its mixture of experts it's going to use. It's like saying, hey, I got a question in my email inbox, and my inbox might get questions about our upcoming meeting, but it also might get questions about the knowledge base within our domain, and it might get a complaint, and each of these types of emails might need to be handled by different people on my team, right? So I might say, oh well, this one needs domain expertise; I'm member services, that's not my strength, so I'm going to send it to our publications person who comes from the field and can probably answer this. That's the idea; it's a team-based approach. So model architectures have already embraced this idea, but the mixtures of experts thus far have been mixtures of next token predictors.
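To make that routing idea a bit more concrete, here is a minimal sketch of a mixture-of-experts-style dispatcher. The expert names and the keyword scoring are purely illustrative assumptions for this example; real MoE models route per token inside the network with learned gating weights, not with rules like these.

```python
from typing import Callable, Dict

# Hypothetical "experts" standing in for specialized sub-models.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "meetings": lambda q: f"[meetings expert] answering: {q}",
    "knowledge_base": lambda q: f"[knowledge-base expert] answering: {q}",
    "complaints": lambda q: f"[complaints expert] handling: {q}",
}

# Illustrative keyword lists standing in for a learned gating function.
KEYWORDS = {
    "meetings": ["conference", "register", "agenda", "meeting"],
    "knowledge_base": ["standard", "guideline", "how do i", "definition"],
    "complaints": ["refund", "unhappy", "cancel", "complaint"],
}

def route(question: str) -> str:
    """Send the question to whichever expert scores highest on crude keyword matching."""
    scores = {name: sum(kw in question.lower() for kw in kws) for name, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return EXPERTS[best](question)

print(route("How do I register for the annual conference?"))
```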
Speaker 1:Those experts are just tuned differently, with different capabilities as their strengths, and as powerful as that is, Strawberry will probably be a similar kind of architecture, where there's some kind of intelligent routing and then, within that model architecture, you'll have the ability to run longer-range planning, do step-by-step reasoning and execute things if you give the model more time. You might say, hey, just like I'd work with a team, I might say, Mallory, I'd really appreciate it if you could develop a plan for marketing the Digital Now conference. Can you do that over the next week and come back? And you would put some thought into that. You'd come back and say, what resources do we have, what are our goals, what types of things might we want to do? And you'll come back with a plan in a week.
Speaker 1:If I say, oh, I need it, like, in five seconds, well, you're just going to blurt out whatever it is you can immediately think of, right? And both of those capabilities have purpose and value. The model might completely self-determine whether it should take more resources or not, or perhaps it's something where you have to tell the model, take your time thinking about this and do the following, right? But I suspect that it will probably be able to self-determine whether the nature of the prompt is such that it requires or would benefit from longer-range thinking and planning. Does that kind of make sense?
Speaker 2:It does, it does. I think the part I kept getting hung up on was the system two, the thinking slowly, and understanding why a model would need to think slower, because, right, it's technology; it just happens as quickly as it can, or as quickly as the chip can support. But the idea of mixture of experts, and essentially all these things happening, where you provide a prompt and then all of these experts, let's say, consult, quote unquote, with one another, this routing that you mentioned, it makes sense that that is the slowdown and that is why we're getting the more thoughtful output.
Speaker 1:Exactly.
Speaker 2:Amith, how does this kind of system, if we're talking about mixture of experts architecture here, compare to multi-agentic systems, which we've discussed at length on the pod?
Speaker 1:Yeah, you know, multi-agent systems are basically similar. If you think about it, it's like you have units of capability that are constructed together, kind of like building blocks; think of them as Lego blocks, right. In a multi-agent system, you have prompting, you have memory, you have all these resources which are kind of like Lego blocks, and you construct them to create an agent that does certain things. What we're saying here is that the underlying model itself will have the capacity to handle a broader range of tasks, and because the model itself will be capable of doing this, you end up potentially with much more powerful capabilities. Because in an agentic system, what you're doing is taking a fairly low-powered model in the grand scheme of things, like a GPT-4o or a Claude 3.5 Sonnet, and trying to make it smarter. What we're doing is essentially putting a layer of software on top of that to say, oh, Mallory's asked for a marketing plan for Digital Now. GPT-4o can certainly give you an answer to that question. But what if we had a multi-agent system that said, okay, I'm a marketing planning expert; the first thing I'm going to do when I get a prompt is ask a model to break this down into a set of tasks. So that's the first thing we do: we go to, say, Claude 3.5 or Gemini or Llama 405B and say, give us a breakdown of what we should do. Let's not try to solve the problem; let's break this down into a set of steps, a chain of thought, a chain of tasks, a tree of thought. These are all the different terms that get chucked around like candy in this field, and basically they all mean the same damn thing, which is essentially that you're going to break it down into a sequence of steps. It's like the simplest concept in the world.
Speaker 1:Complexity has been added for reasons that make sense, because they're all subtly different, but also because it's helpful for people, I think, to feel like it's really cool and more complex than it really is. Basically, though, it's just breaking the thing down into small pieces. So for the agent, the first step would be: break it down into a plan. And then what would the agent do? The agent would say, okay, step one, think about our goals. Okay, what should our goals be? Maybe that gets broken down into five other steps.
Speaker 1:And so the agent breaks it down into small enough tasks where it thinks that it can actually farm out these tasks either sequentially or in parallel to different LLMs. And so I might say, okay, now I've broken it down into a series of tasks I want to do paid advertising, I want to do email marketing, I want to create a landing page, I want to do these eight different things. And then the agent says, okay, I need to do those eight different things. And it starts firing off those eight different tasks to eight different prompt strategies, to maybe the same LLM or different LLMs, and then it pulls those results back in and it constructs them into one integrated marketing plan. Right, well, if the model itself had the capacity to do that, it would have a higher level of understanding, because right now the model is only doing one little narrow task. So the model only knows that little bit that you've given it.
Speaker 1:So in theory, if the model itself was doing an entire, broader task, it potentially could be far more robust in its capability, because it actually has visibility and understanding.
Speaker 1:In this kind of agentic system, by contrast, each step in the process has zero understanding of where that step sits relative to all the other steps.
Speaker 1:That's just kind of the nature of how these things work at the moment, and the way you construct long-term memory, the way you construct state and all this other stuff that you do in agent systems is a software layer above the model.
Speaker 1:But if you bake it into the model itself, then in theory that will result in far more powerful emergent capabilities and the model will have the ability to like use tools itself, like researching on the web. In concept, this could help us do original science, where a model is expressing hypotheses and the hypotheses aren't just predicted token sequences but are based on some kind of reasoning steps and then go experiment. There's all sorts of cool stuff that could come from it. But, to answer your question, at surface level it's actually very much the same as what a multi-agent system does, but under the hood it's very different because the model itself has far more sophistication and from a performance perspective it should allow us to scale in an interesting way. So I'm really pumped about it. I think OpenAI tends to be the leader in a lot of new research, but there's a lot of other companies working on the exact same problems, so I expect to see a flurry of this this fall.
Speaker 2:And to really simplify this, when we're talking about agents or multi-agent systems, we just mean multiple models in one system. When we're talking about something like Strawberry, we think we're talking about one model with, let's say, experts within it. So one model versus multiple is that like a simplified way to think about it?
Speaker 1:Yeah, you can definitely think of it that way, and the word model in this particular context could be in quotes, in the sense that they might all be the same actual model. It might be Claude working across all of those; it's just a different prompting strategy. So, like, the way you would prompt Claude to give you an email campaign would be a different prompt than what you would use for creating a web page, or a different prompt for creating something else. So the idea is that Claude may be doing all of those things for you. It might be one underlying language model, but you're prompting it in different ways.
Speaker 1:And what an agent system does is it's essentially software that does all of this multitude of steps for you and then pulls back those results. So if you want to simulate what an agent does, you could say, okay, I, as a user, am going to break down this complex task. I'm going to ask an LLM for a set of steps, then I'm going to go ask that same LLM for the output for each of those other things. Then I'm going to take the response from each of those separate prompts, pull them back up, go back to the first level and say, here are the results for all these pieces; bring it all together for me. And the LLM will do that. That's basically what agents are.
Speaker 1:They're actually very simple pieces of software. They just kind of, you know, they're kind of like the general contractor that's constructing the house, but the subcontractors, you know, put up the framing, they do the siding, they do the roof, they do the electricity and so forth. So the agent system is kind of like just overseeing a complex task, but it's super inefficient in a lot of ways. It's very much lacking in terms of context sharing. So there's a lot to be desired.
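As a rough illustration of that general-contractor pattern, here is a hedged sketch of an orchestrator that asks a model for a plan, farms the sub-tasks out one by one, and stitches the results back together. The `call_llm` function is a placeholder for whatever chat-completion client you actually use; none of this is a specific vendor's API or the internals of any particular agent product.

```python
from typing import List

def call_llm(prompt: str) -> str:
    """Placeholder for a real chat-completion call (OpenAI, Anthropic, a local Llama, etc.)."""
    raise NotImplementedError("wire this up to your model of choice")

def plan(task: str) -> List[str]:
    # Step 1: ask the model to break the task down, not to solve it yet.
    raw = call_llm(f"Break this task into a short numbered list of sub-tasks:\n{task}")
    return [line.split(".", 1)[-1].strip() for line in raw.splitlines() if line.strip()]

def run_agent(task: str) -> str:
    subtasks = plan(task)
    # Step 2: farm each sub-task out as its own prompt (sequentially here; could run in parallel).
    results = [call_llm(f"Complete this sub-task for '{task}':\n{sub}") for sub in subtasks]
    # Step 3: pull the pieces back together into one integrated answer.
    combined = "\n\n".join(f"- {sub}:\n{res}" for sub, res in zip(subtasks, results))
    return call_llm(f"Combine these pieces into one integrated plan for '{task}':\n{combined}")

# Example: run_agent("Develop a marketing plan for the Digital Now conference")
```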
Speaker 1:And, by the way, agent systems don't go away if Strawberry's successful. What happens is agent systems can still be built on top of layers like Strawberry, and this has happened with software since the beginning of time: you build something, it becomes more powerful, you build another layer on top of it. So embedding these capabilities into the model would mean that some use cases for agents aren't necessary to be agents anymore, but it also means that you can do so much more with agents. Agents like Betty and Skip in our ecosystem will become just crazy powerful if we plug a model like Strawberry into them. They already have multi-agentic structures like I'm describing, but they'll become far, far more powerful if the underlying model is smarter.
Speaker 2:And I was just curious on this last piece, Amith. Well, it sounds like you just kind of answered it, but when Strawberry is released, if it's at the price point I read in one article of $2,000 per month, is this something you're going to run and try out and plug into your products? Do you have any sort of other use cases that you personally would want to use something like this for?
Speaker 1:Yeah, I mean, as soon as anything comes out, our group will collectively go and experiment with it pretty much immediately if it's from a major player or if it's something particularly interesting. So yeah, for sure. For us, the price is not really relevant to whether or not we'll test it, because we'd look at it and say, hey, you know, this is kind of like a fine wine; we'll use it sparingly in certain cases, we're not going to pour it out of the faucet like water. So I think you're still going to have uses for models like GPT-4o mini or Llama 7B. Collectively, I am super excited about what something like Strawberry is going to bring us, but I'm actually more excited about the scaling of the power in small models, and I've said this a number of times on this pod over the last, you know, 47 episodes.
Speaker 1:Small models are more exciting because if you look at what small models do, like Llama 3.1 7B, the 7-billion-parameter model, the smallest Llama model, or Microsoft Phi, which is P-H-I, in its third generation, and many others, these small open source models are super fast, super cheap, and they're open source. And, most importantly, the capability of Llama 3.1's smallest model is about on par with earlier versions of very high-end models, right? So the compression of size, while retaining functionality and capability from the larger models of the prior generation, just keeps happening consistently. So it's a really exciting time.
Speaker 1:And so, actually, I misspoke: it's Llama 3.1 at 8 billion parameters, and then there's the 70 billion. I always flip the two around. But the small model is unbelievably powerful; it's as good as GPT-3.5 Turbo from last summer, and Llama 70B is about on par with the first version of GPT-4. So why do I say that's more exciting? Well, not that I have to choose, because it's kind of a cool time and we can do both. But those models you can use at light speed, and you can use them all over the place in solutions, and then the fine wine might be Strawberry, right? You say, okay, I'm going to use that only in situations where I know I need that layer of true power. So if it's super expensive, that's fine; you just use it sparingly.
Speaker 2:Next we're talking about Hugging Face Speech-to-Speech. Hugging Face is a leading platform in the field of AI and machine learning, particularly focused on natural language processing. You can think of it like a model hub. Hugging Face provides a vast repository of pre-trained AI models for NLP, or natural language processing, tasks. These models can be easily accessed, downloaded and fine-tuned for specific applications.
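As a small illustration of that hub workflow, the `transformers` library can pull a pretrained model down from Hugging Face in a few lines. The Whisper checkpoint below is just one example of a speech model hosted on the hub, not the speech-to-speech release discussed in this episode.

```python
# pip install transformers torch
from transformers import pipeline

# Downloads the pretrained model from the Hugging Face hub on first use, then runs locally.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

result = asr("meeting_recording.wav")  # path to any local audio file
print(result["text"])
```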
Speaker 2:Hugging Face recently released multilingual speech-to-speech, so its cross-platform pipeline for running GPT-4o-like experiences on device can now seamlessly switch languages mid-conversation with only a 100-millisecond delay. They released a short video recently on LinkedIn showing this quick switch in languages between Chinese, English and French, and I will say there is a slight lag, but honestly it's pretty imperceptible, and I would say I hear longer lags in regular phone trees when I'm calling customer service in English, in the same language. Hugging Face announced that speech-to-speech has been received really well on GitHub, and it's working to expand the languages included. Amith, we know large language models can interact with users right now in multiple languages. So what would you say is novel, in your opinion, about speech-to-speech on Hugging Face? Is it its ability to detect language? The reduced latency? Is it all of those things? What do you think is impressive?
Speaker 1:It's all of those things. It's open source, it's super inexpensive, you can inference or run it on almost any device and it's going to get faster. It's going to go from 100 milliseconds probably down to half of that, and then half of that. It's going to keep progressing at the rate we've been talking about. The languages will explode and it'll basically have all known languages at some point, right, and these will be available for multiple different vendors.
Speaker 1:I think the other key thing here is that it's speech-to-speech. It's not speech-to-text-to-text-to-speech, right, where traditionally what we've been doing is essentially using multiple models to communicate via speech. So if you use ChatGPT's voice mode (they're working on a native GPT-4o speech capability, meaning it's out there, but most people don't have access to it yet), the current version people do have access to, where you talk to ChatGPT, works like this: first you're talking to it, then it's converting that to text, then it's running the text prompt, then it's getting text back, and then it's converting that back to audio. So the idea here is that speech-to-speech allows you to do translation, but it also allows you to go direct; you're not losing the modality of speech.
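A rough sketch of the difference being described here, with each stage as a stand-in function (these are not the actual Hugging Face speech-to-speech components, just placeholders that show where a cascaded pipeline drops the audio's tone and timing):

```python
def transcribe(audio: bytes) -> str:
    # speech -> text: tone, pauses and emphasis are dropped at this hop
    return "how should i prepare for this hurricane"

def generate_reply(text: str) -> str:
    # text -> text: the language-model step, working only from the flattened transcript
    return "Stock up on water, charge your devices, and follow local guidance."

def synthesize(text: str) -> bytes:
    # text -> speech: the voice and tone are re-invented by the TTS model, not carried over
    return text.encode("utf-8")  # stand-in for generated audio

def cascaded_pipeline(audio_in: bytes) -> bytes:
    """Traditional approach: speech -> text -> text -> speech, losing information at each hop."""
    return synthesize(generate_reply(transcribe(audio_in)))

# A direct speech-to-speech model replaces all three hops with a single audio-in, audio-out step,
# so tone, pauses and speaker identity can carry through end to end.
```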
Speaker 1:So think about it from an information theory perspective for a moment. You say, okay, where is there higher information density, in text or in audio? Well, clearly, audio can be converted to text and you lose information, because you don't have tone, you don't have pauses, you don't have all the things that make audio a richer modality. When you go from text back to audio, of course, you're upscaling, so to speak, and you're trying to gain information, but at the same time you're losing a lot.
Speaker 1:So if I go to a model and I say you know, how should I prepare for the hurricane? Just like that, you know it might interpret it one way. But if I said it kind of like, hey, how should I prepare for this hurricane? You know it's a little bit different. It's kind of like understanding my tone, same exact words. The text would not capture that, but the audio input would and therefore the model potentially can take into account a lot more of what it is that I'm thinking, doing, feeling, et cetera. And the same thing is true also for translation, because literal translations, word for word, one of the reasons they've historically been really unsuccessful compared to human translators is that they don't really capture the context of what's going on in the conversation, whereas these AI-based translators have been really good because they have this broader context window that can understand pretty much the full conversation at this point and be super accurate at translation because of that. So it's really powerful on a lot of levels.
Speaker 1:It actually ties to my comments in the last topic pretty well: small models are where a lot of the action is, because they allow you to do things on device and at low cost or, in some cases, effectively no cost. That, to me, is exciting; you can build a lot of applications. So for associations, you might say, okay, what do we do with this? What if you had, in your meetings app, a real-time translator? Say you have an international crowd coming to your event and people are speaking whatever language. You build a simple tool that pipes real-time audio from the AV system to the device, and that's translated in real time. It's got a 100-millisecond lag or whatever, but it's imperceptible, so I can keep an earbud in and listen in whatever language, and it sounds like the person speaking, right, as opposed to a translation; it's a translation, but it keeps the voice and all that. That would be amazing, and there are a lot of other applications, but that's just one that comes to mind.
Speaker 2:In my mind I was thinking that, even though this is speech-to-speech, it was still going through that text phase. So that's really interesting to think about. So this model, we can assume, was trained on lots of audio to develop this capability, not text data, I'm assuming?
Speaker 1:Okay, yeah, I haven't read the paper. They published a paper on it as well, which is another great thing about the open source communities: you not only get the software and the weights, but you also typically get a paper that explains how they trained and what they trained on. Sometimes it's interesting just to skim those and see what they talk about, but they'll tell you about their training approach with the content and where they got it. These guys are big open source advocates based in France, and they are publishing tons of research all the time, but they are, as you described earlier, a hub where pretty much everyone posts stuff as soon as it's ready for consumption. Have you played around in Hugging Face? Yeah, I have. Actually, I think there was news, maybe in the last day or so, that the Firefox browser now includes Hugging Chat built right in. Wow, so that's a cool thing to check out.
Speaker 2:Right.
Speaker 1:We'll include that in the show notes as well.
Speaker 2:Now I know you mentioned translating keynote sessions as one potential use case, and then my mind immediately goes as well to customer service calls and being able to translate those real time. And this leads me to kind of a broader thought, and I don't know if you know a ton about this, amit, but it's just something I think about a lot, especially given how advanced AI is getting, thinking that we still have to call these major players like Verizon and Delta and we're still working through these phone trees and it seems like you have to say the right keyword right for them to move you on to the customer service rep. Do you know why we aren't seeing kind of major companies use AI or at least it doesn't seem like they are using AI in their customer service calls?
Speaker 1:You know, my suspicion is that for the larger players there are cultural roadblocks. There's also the complexity of systems integration. For Delta Airlines to implement this would be a lot harder than for something like Klarna, who we talked about before: a smaller company rolling out a customer service bot. But it also creates an expectation in the mind of the consumer, right? Think about the mainstream consumer and what their expectations are; those expectations quickly turn into demands. You will see adoption of these tools, both because it's better for the consumer and because it saves a lot of money. So ultimately, those companies, 100%, are going to be deploying this kind of tech. I don't know about Verizon, actually, but everyone else probably will.
Speaker 2:Don't get me started on Verizon.
Speaker 1:Yeah, exactly. So the way I describe customer service tech, and technology in general (I think this is a true statement), is that the first priority for companies has always been to serve themselves. Meaning, when you think about your interactive voice response, IVR, phone tree that you're talking about: those systems were never built to optimize for customer experience. They were built to hopefully minimize how bad the experience was, but they were built to optimize for cost. And if you think about the original chatbots that have been on the web for 10-plus years, and longer in some cases, they're horrendous, but they're not optimized to serve the customer or improve the experience. They're optimized to reduce cost, primarily. I mean, there are other motivations that could exist; it's like, oh, we don't have people on the clock 24 hours a day and this thing can help in the meantime. But generally speaking, they're extremely limited in value. And the first thing I always do is say agent, and give me an actual person.
Speaker 2:But they're better now. I mean, they make you say it like 25 times before you get to the agent. Oh, so frustrating.
Speaker 1:Proving my point right. It's not about the experience of the customer, it's about reducing costs, because if they do that, I might go away and then I don't have to call them again, or you know, and they don't really care, right? So I mean, I don't want to be a cynic about all companies. There's definitely plenty of companies who do value creating a great customer experience, but, generally speaking, over the course of time, the technology has been focused on how do you optimize costs and how do you optimize other priorities internally, which rarely are about the customer experience, with some exceptions, obviously. What I would say now is that you're entering a world where, fortunately, the alignment of the priorities is similar, where the customer experience is both improved and the company saves a crazy amount of money.
Speaker 1:And the Klarna case study is a great one. We included it in the book and we talked about it in this podcast. Klarna is a buy now, pay later company. They power a lot of e-commerce brands people who want to purchase a product but essentially pay for it in installments over time. They're one of several companies in the BNPL space and they implemented a ChatGPT4-based chatbot, and they found that the chatbot was able to, in its first month of operation answer the equivalent number of inquiries as 700 people, which is, I think, about a third of the number of people they actually employ in that role. And what was really interesting? So that's the cost-saving side of it, right, and so that's exciting, that's great.
Speaker 1:The flip side of it is that the customers were happier with the outcome of the AI. Why? Not necessarily because the bot did a better job responding, although over time it most certainly will do a better job of responding, but because of the amount of time it took. Like, when you call Delta, would you rather have your problem solved in 90 seconds or 10 minutes? Right, you want to be off the phone or off of the chatbot as quickly as possible with your resolution; you want the information you went there for, or you have your transaction handled, or whatever it is. The chatbot from Klarna was able to resolve cases in an average of, I think, about two minutes, if I'm remembering correctly, whereas the human agents took an average of 11 minutes. So that is a massive improvement in customer experience, and to me that's a big opportunity, coming back to this topic of speech-to-speech, because it opens up a new modality and, again, it's a richer form of information coming in: audio that isn't downscaled to text and then upscaled later back to audio doesn't lose the information, right? It's more likely that a speech-to-speech model, or one that is a component of these systems, will be able to serve customers in a brilliant way, so that you call Delta Airlines down the road, hopefully in the not-too-distant future, and a quote-unquote person answers, which is an AI.
Speaker 1:And you just have a conversation. You're like, yeah, I'd like to change my flight, or I'm having a problem, I didn't get miles on my last flight, or whatever it is. They're like, oh, let me look it up; they authenticate you quickly, and they're like, oh, Mallory, sure, here you go, we're going to solve that problem for you, and you're done. It's like having your own dedicated Delta expert.
Speaker 1:Also, when you go through those phone trees, you know very well that you either struck gold and got one of their good customer service reps, or, every once in a while, you get someone who doesn't know how to spell Delta; they just have no idea how to serve you. You're like, damn, I'm going to have to hang up and get back on this call to roll the dice again. So what if everyone got the best customer service rep instantly? Wouldn't that be cool?
Speaker 1:And this is a technology that will enable exactly that outcome. For associations: you're not Fortune 500 companies, but people expect from you the same things they expect from the largest brands in the world, which may not be fair, but it's reality. So understanding this technology is important, because it can not only help you play defense, where you're thinking about how to provide a comparable experience to what people are looking for, but it also allows you to create a broader surface area to serve your customers and members in new ways. Right, translating your content, like we talked about in the context of a conference, but also think about an LMS, for example, being able to dynamically translate and have conversations with an AI agent, voice-to-voice, about the content, so that the LMS really becomes a synchronous platform where the best tutor in the world in your domain of expertise sits side by side with the pre-recorded content. Probably all the mainstream LMS providers will have that in the next couple of years, in my estimation.
Speaker 2:Lots of use cases there. From your experience, would you say customer service, or we could call it member service, is a big pain point for a lot of the associations that you've spoken with, Amith?
Speaker 1:Yeah, for sure. I mean, it's an area of significant time investment. So if you think about where do associations invest their staff's time? A big chunk of it is interacting with members and answering questions, and what I get excited about with this type of technology isn't that you're eliminating that, but that you're freeing up your people to actually have meaningful conversations with people. So, rather than answering an endless stream of emails that are asking you basically the same 20 or 30 questions, you can have an AI. Just take care of that. Give people a better experience because it's immediate, and then have your people focus on, first of all, figuring out better things they can do to serve members. But also, when you do have those interactions, you're not hurried. You can actually take the time to go deeper with the member and make sure that you're really connecting them with the best possible outcome.
Speaker 1:It's kind of like your experience, all of our experiences, when you go to the doctor's office. I always feel rushed. It's like, oh, you're going to spend 10 minutes with the doctor, if you're lucky. They don't remember anything about you; maybe they remember your name, maybe not, they pretend like they do, and you have 10 minutes to tell them all about whatever it is that ails you. Wouldn't it be great if it didn't feel that way and you just had all the time in the world, at your pace, to address whatever the issue is? And that's a customer service problem in the medical world, same thing there. So I just get pumped about this, because I think it can democratize access to great services, and associations can be a big part of that.
Speaker 2:All right. Topic three: AI and grid infrastructure tech.
Speaker 2:Recent reports highlight an interesting trend in climate tech investments, particularly related to AI. While overall investment in climate tech startups has declined since peaking in 2021, one sector is seeing significant growth, and that is grid infrastructure technology. Grid infrastructure startups raised $2.73 billion globally in the first half of 2024, with US deals accounting for almost half of that. The investment is driven largely by the increasing power demands of AI and electric vehicles. According to John McDonough, a senior analyst at PitchBook, VC investment in electric grid infrastructure is on track to surpass last year's $4.37 billion. A significant portion of this funding is going toward battery energy storage, new battery technology, AI and software for grid management, and hardware for grid management.
Speaker 2:The surge in electricity demand due to AI, particularly for data centers, is driving interest in technologies to strengthen and manage the power grid more efficiently, which underscores how AI is a driver of innovation in other sectors, particularly in managing and improving our power infrastructure to meet growing energy needs. So, Amith, when you sent this to me, you mentioned that many associations and association leaders, rightfully so, are concerned about the environmental impact of AI, which I don't think we've dedicated a whole topic to on this podcast. Can you speak a little bit to those concerns?
Speaker 1:Well, first of all, we have to recognize that AI consumes an enormous amount of power, an unbelievable amount, and it's growing really rapidly. Every time you turn around, there's some new data center coming up that is going to be the next bigger version, the next supercomputer, the next cluster, the next thing. So these things are all very, very energy intensive, and that is a valid concern, both in terms of the financial costs (energy right now is effectively a scarce resource; it's expensive) and, independent of the financial side, in that the way it's produced, generally speaking, is not clean. Most energy is still not clean. So there are valid concerns there. At the same time, I think that the investment being made in this sector, coupled with AI itself becoming a massive assistant in solving these problems, makes me very optimistic that all of this AI investment, in terms of the energy side and the carbon footprint of it, is going to pay massive dividends, probably in the next decade. So think about what you said about storage. A lot of the problem with our energy infrastructure is that when power is available isn't necessarily when it's going to be consumed. Think about solar: solar is generated during the day, assuming there are no clouds in the sky and so forth, and that isn't necessarily when it needs to be consumed; sometimes it is, sometimes it isn't. So storage is a helpful thing, because if I can have a solar farm, store the energy, and then redistribute it when I need to, that's very interesting.
Speaker 1:Another problem is transmission. We lose a large percentage of energy in transmission, because the way we transmit energy is the way we've done it for a long time, and there's loss in transmission. For every additional mile there's a percentage of lossiness, so we can't generate power too far away from where that power is consumed. Even though we have a national grid, it's really a bunch of localized grids that are connected, because the farther you try to transmit the power, the more loss there is. The loss basically generates a lot of heat, so that has its own issues, but really, more than anything, you're just losing a large percentage of it. It's like you're driving an oil tanker down the road and there's a hole, and so you started off with a million gallons of gas or whatever it is, and by the time it gets to where it's going, it's lost a certain amount, right? That's kind of the picture I'd paint in people's minds. You see transmission wires overhead and you assume the power just gets there.
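To put rough numbers on that per-mile loss: if a line loses a fixed fraction of its power every mile, the loss compounds with distance. The 0.05 percent per mile figure below is purely an illustrative assumption for the arithmetic, not a real grid statistic.

```python
# Compounding transmission loss: delivered = sent * (1 - loss_per_mile) ** miles
loss_per_mile = 0.0005      # 0.05% lost per mile: an illustrative assumption, not a measured value
sent_megawatts = 1000.0

for miles in (50, 200, 500, 1000):
    delivered = sent_megawatts * (1 - loss_per_mile) ** miles
    lost_pct = 100 * (1 - delivered / sent_megawatts)
    print(f"{miles:>5} miles: {delivered:7.1f} MW delivered ({lost_pct:.1f}% lost)")
```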
Speaker 1:Well, how do you lose energy? Well, it's just the nature of the materials. It's not 100% efficient in terms of the way that material is transmitting or conducting the electricity, which is, by the way, one of the reasons there was so much excitement about this idea of a room temperature superconductor, which didn't actually turn out to be true. But last year we talked about it a little bit on one of our early pod episodes. The theory behind a room temperature superconductor. Superconductivity essentially means there's zero loss in transmission of electrical current and so in theory and this has been produced in the lab at like, basically close to absolute zero temperature. So if you could do superconductivity with materials that are affordable at scale, right in room temperature or like in the real world, then you can transmit power with no loss, and that solves one of the major problems. Also, it means that the chips that we have, if we have superconductivity at that level, you're going to lose a lot less energy. Chips can be smaller, faster, more energy efficient, so that's interesting.
Speaker 1:Why did I bring that up? It's not part of the particular investments you're talking about; that's a materials science problem more than anything else, and materials science is an area that's exploding, in large part due to AI, so I'm excited about that. And then the last thing I'd say is there are a lot of optimizations in terms of grid management. You mentioned the software and hardware being able to be smarter about how we manage the power that we do have and how we're transmitting it. That's good in terms of better quality of service, fewer brownouts and blackouts, but it's also a good thing from an efficiency perspective, which both saves money and reduces the carbon footprint. So I'm pumped about this article and the statistics about the investment, because where capital flows, innovation goes.
Speaker 2:What a good segue. I fully understand that there's a huge energy consumption piece to AI, but it's always helpful for me to go out and find some stat or data point, so I felt like this would be helpful to share for those of our listeners who might be trying to bring this back down to the ground. By 2025, it's estimated that data centers will consume between 3% and 13% of global electricity, which is a large range, but think about that 13%. So for me, that definitely helps put this into terms that I can understand.
Speaker 2:When thinking about greater environmental impact, I think if you ask the average person, right, what can you do to help the environment, they might say things like recycle more, or eat less meat, or maybe drive electric vehicles, even though, Amith, I think you told me that they're actually not as good for the environment as you would think. But I want to know, in terms of this conversation around energy consumption and AI, is there anything an association can do or keep in mind as they're implementing AI, or even an individual, or do we just kind of have to let this play out?
Speaker 1:Well, I mean, I'll come back to the analogy of the fine wine versus tap water thing that I talked about earlier. The fine wine might have a financial cost, and so you allocate it more closely and more carefully, because you don't want to spend the money or you don't want to wait as long for the response, and you build systems that use that resource a lot more sparingly.
Speaker 1:The same thing applies to our energy consumption. If you use the smaller models, they're much more energy efficient, so you will both save money and have a smaller carbon footprint. It's easy to just say, oh yeah, we're going to go with ChatGPT or Claude, and we're just going to use the most intense model because it's the most powerful. Using another metaphor, it's like flying an Airbus A380 double-decker that can seat 600 people and putting one person on it. You don't need the A380 for all missions; you might have a Cessna that works just fine, and so it's the same thing.
Speaker 1:Small models have their purpose, and I get excited about small models, especially if you inference them locally on your own computer or run them on your phone, right? Apple had their event this week where they talked about Apple Intelligence and the silicon they have, which is capable of running their own models on device, and those are quite capable models. That excites me, because that efficiency is going to result in a lot of the workloads being done by super efficient chips. I'm also excited about hardware innovations, like we talked about with Groq with a Q, G-R-O-Q, and their GroqCloud platform. We're big fans of those guys because they have a radically more efficient hardware architecture for AI inference. These massive data centers you're talking about do two things: they train models and then they serve models, or what people call inference, which is basically when you run the model. The ratio of training versus inference is constantly changing, but as demand grows, it's going to be a massive amount of inference, and so if you're able to be more efficient the way Groq is, it's both way faster and less energy intensive.
Speaker 1:So I think an association should be, especially as they're looking to scale their use of AI. Right, like when you're in the experiment phase. It's not a big deal, you're not even a rounding error, but if you're going to scale your use of AI in a significant way, you should definitely be thinking about these issues. It's part of the responsible AI framework, where you're thinking about the ethics of data privacy, you're thinking about models and how they affect people, you're thinking about biases, and you should also be thinking about carbon footprint energy consumption. The good news is that there is an incentive to save money, which everyone's tuned into around the world, and that same incentive will drive people to use the more efficient models, so that's exciting as well.
Speaker 2:So your advice, then, is not to stop using these powerful large language models, but simply to identify opportunities where you can use small models, if possible.
Speaker 1:Yeah, last time I checked, horses have a smaller carbon footprint than vehicles that we drive. But I'm not suggesting you go back in time and ride a horse around. I mean you could, but I don't even know if that's true. I think it is, but I'm assuming that the horse's output is lesser than a typical car.
Speaker 1:But my point would be that the technology is going to go forward and if you want to be relevant in your field, you have to use AI to be relevant and to be effective in your field. But you can be smart about it. You don't need to just say, oh, we've got plenty of money, so let's just turn on the car engine and yeah, I'm not driving it right now. I'm going to keep my car running in my driveway just because I want the air conditioning on all summer long in New Orleans, because I never want to experience not having AC in my car. It's like that's ridiculous, like I don't know if anyone hopefully no one does that. But, like you know, even if I had an unlimited amount of money, I didn't care about wasting the money. That would be just crazy, like to run your car all the time.
Speaker 1:The same thing is true with wasted resources and AI. If I used GPT-4o for every single inference request I had, it would be unnecessary, wasteful, and it would have a bigger environmental impact. So there are a lot of choices, and one of the choices you can make is using a mixture of different models to serve your association's needs. It's not the simplest topic in the world to think through, so I don't recommend it as a concern for people who are in that early, early phase, which is almost everyone right now; most people are at the very, very beginning of the journey. You should flag this as an area of importance, and note it as something you should be thinking about actively as you grow your use of AI, but don't let it stop you from getting started in the journey.
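One way to act on that fine-wine-versus-tap-water advice in practice is a simple selection gate: try to satisfy each request with a small, cheap model and only escalate to the heavyweight one when the request seems to warrant it. The model names and the escalation rule below are illustrative assumptions, not recommendations of specific products.

```python
def needs_heavyweight_model(prompt: str) -> bool:
    """Crude escalation rule (illustrative only): long or multi-step requests go to the big model."""
    multi_step_markers = ("plan", "strategy", "step by step", "analyze", "compare")
    return len(prompt) > 2000 or any(marker in prompt.lower() for marker in multi_step_markers)

def call_model(model_name: str, prompt: str) -> str:
    """Placeholder: wire this to whatever small and large models you actually run."""
    raise NotImplementedError

def answer(prompt: str) -> str:
    if needs_heavyweight_model(prompt):
        return call_model("large-frontier-model", prompt)   # the "fine wine": costly, energy-hungry
    return call_model("small-local-model", prompt)          # the "tap water": cheap and efficient
```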
Speaker 2:All right everyone, thank you for tuning in to today's episode. For any of our listeners in the storm's path, stay safe, and we will see you all next week.
Speaker 1:Sidecar is here with more resources, from webinars to boot camps, to help you stay ahead in the association world. We'll catch you in the next episode. Until then, keep learning, keep growing and keep disrupting.