Sidecar Sync

Ian Andrews Talks Groq’s Fast, Smart AI Inference Tech | 80

Amith Nagarajan and Mallory Mejias Episode 80


In this special episode of Sidecar Sync, we dive into the future of AI infrastructure with Ian Andrews, Chief Revenue Officer at Groq (that’s Groq with a Q!). Ian shares the story behind Groq’s rise, how their LPU chip challenges Nvidia’s dominance, and why fast, low-cost, high-quality inference is about to unlock entirely new categories of AI-powered applications. We talk about the human side of prompting, the evolving skillset needed to work with large language models, and what agents and reasoning models mean for the future of knowledge work. Plus, Ian shares how Groq uses AI internally, including an incredible story about an AI-generated RFP audit that caught things humans missed. Tune in for practical insights, forward-looking trends, and plenty of laughs along the way.

🔎 Find Out More About Ian Andrews and Groq:
https://www.linkedin.com/in/ianhandrews/
https://www.groq.com

🔎 Check out Sidecar's AI Learning Hub and get your Association AI Professional (AAiP) certification:
https://learn.sidecar.ai

📕 Download ‘Ascend 2nd Edition: Unlocking the Power of AI for Associations’ for FREE
https://sidecar.ai/ai

📅 Find out more about digitalNow 2025 and register now:
https://digitalnow.sidecar.ai/

✨ Power Your Newsletter with AI Personalization at rasa.io:
https://rasa.io/products/campaigns/

🛠 AI Tools and Resources Mentioned in This Episode:
Superhuman ➡ https://superhuman.com
Groq Cloud ➡ https://www.groq.com
ChatGPT ➡ https://chat.openai.com
Claude (Anthropic) ➡ https://www.anthropic.com
Perplexity ➡ https://www.perplexity.ai

Chapters:

00:00 - Introduction
01:24 - Meet Ian Andrews, CRO of Groq
05:48 - Ian’s Generative AI “Aha” Moment
10:00 - What Groq Is and What It Does
14:57 - The Launch and Growth of Groq Cloud
18:43 - What “High Quality Inference” Really Means
23:29 - Agents and Interns: Speed in Autonomous Work
29:4

🚀 Follow Sidecar on LinkedIn
https://linkedin.com/sidecar-global

👍 Please Like & Subscribe!
https://twitter.com/sidecarglobal
https://www.youtube.com/@SidecarSync
https://sidecarglobal.com

More about Your Hosts:

Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.

📣 Follow Amith on LinkedIn:
https://linkedin.com/amithnagarajan

Mallory Mejias is the Manager at Sidecar, and she's passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.

📣 Follow Mallory on Linkedin:
https://linkedin.com/mallorymejias

Speaker 1:

Speed's cool, but I can only read so fast. Why would I ever need speed? Like, I don't need it to go faster than I can read. Which, if you step back, you know, there were probably people who said that when we were transitioning from horses to cars. So it will become a funny comment in the future.

Speaker 3:

Welcome to Sidecar Sync, your weekly dose of innovation. If you're looking for the latest news, insights and developments in the association world, especially those driven by artificial intelligence, you're in the right place. We cut through the noise to bring you the most relevant updates, with a keen focus on how AI and other emerging technologies are shaping the future. No fluff, just facts and informed discussions. I'm Amith Nagarajan, Chairman of Blue Cypress, and I'm your host.

Speaker 2:

Hello everyone, and welcome to today's episode of the Sidecar Sync podcast. My name is Mallory Mejias and I'm one of your co-hosts, along with Amith Nagarajan, and the Sidecar Sync podcast is your go-to place for all things associations, innovation and artificial intelligence. Today we have a really exciting episode lined up for you with someone from Groq, and that is Groq with a Q, as we always say on the podcast, not Grok with a K. We don't just have someone on the podcast today; we actually have Ian Andrews, the chief revenue officer at Groq. So we have an exciting conversation lined up for you. Before we get into a little bit more info on Ian's background, I want to share a quick word from our sponsor.

Speaker 4:

Let's face it: generic emails do not work. Emails with the same message to everyone result in low engagement and missed opportunities. Imagine each member receiving an email tailored just for them. Sounds impossible, right? Well, rasa.io's newest AI-powered platform, rasa.io Campaigns, makes the impossible possible. rasa.io Campaigns transforms outreach through AI personalization, delivering tailored emails that people actually want to read. This opens up the door to many powerful applications like event marketing, networking, recommendations and more. Sign up at rasa.io/campaigns. Once again, that's rasa.io/campaigns. Give it a try; your members and your engagement rates will thank you.

Speaker 2:

Ian Andrews is helping to deliver fast AI inference to the world at Groq as chief revenue officer. You've got to imagine, to work at a place like Groq with a Q, you must be pretty impressive, and Ian is exactly that. Prior to Groq, he served as CMO at Chainalysis, achieving 4x growth in customers and revenue in under four years. Prior to Chainalysis, he was SVP of products and marketing at Pivotal, where he guided the company from launch to a successful IPO and later a $2.7 billion acquisition by VMware. Ian's earlier experience includes revenue roles at market-creating firms like Opsware and Aster Data. Ian specializes in launching innovative products, scaling early-stage companies and building high-performing teams. His expertise spans AI infrastructure, blockchain analytics, cloud-native applications and enterprise software.

Speaker 2:

So what do we have lined up for you all today? Well, of course, we're going to be talking about AI inference: what it is, why it matters and why Groq is dominating in that area. Now, Ian's going to provide a much more sophisticated answer as to what AI inference is, but to give you the short and sweet, it's essentially running the AI model. So why would we need AI models to generate output faster than we can even read? Ian's got an answer for that, so stay tuned. We also discuss how an AI inference company like Groq is actually leveraging AI internally. We have a really awesome conversation lined up for you all today, so please enjoy this interview with Ian Andrews. Ian, thank you so much for joining us on the Sidecar Sync podcast. I'm hoping, as we kick off this episode, you can share a little bit with our listeners about your background. Who are you? Who is Ian?

Speaker 1:

Great question. So I am the chief revenue officer at a company called Groq. I've spent about 25 years in technology. Actually, very early in my career I worked with Amith at one of his companies, so we go way, way, way back. And I think of myself as a technologist. I've always tended to spend time in early emerging technology, and I've built a career around bringing that technology to the world.

Speaker 2:

Awesome, awesome. Well, if you've worked with Amith, and I know Amith has pretty much exclusively worked with associations for his whole career, does that mean you have some familiarity with our association audience?

Speaker 1:

I do, absolutely. For the first couple of years of my career (I live in Washington, DC) I was very focused on the association market.

Speaker 2:

Awesome, awesome. And you were a speaker at digitalNow 2022, which is Sidecar's annual conference. Is that right?

Speaker 1:

I did. I came and talked about blockchain analytics. At the time, I was chief marketing officer for a company called Chainalysis, and we built a product that allowed companies and government agencies to actually understand what was going on in the blockchain, and so I think I maybe scared the audience a little bit by going deep into the bowels of cryptocurrency and talking about North Korean hackers and Russian ransomware gangs and narco traffickers. Hopefully didn't put people off the technology too much.

Speaker 2:

It was a great time. I was attending digitalNow that year for the first time as an attendee; I was working for the greater Blue Cypress family of companies. And before we started recording this episode, I remembered that, for me, that event, digitalNow 2022, is kind of my aha moment for generative AI. It's the first time that my mind was really blown, and it kind of kicked off this whole career path after it. I thought it would be a fun question to ask you if you have had a generative AI aha moment, a moment where your mind was blown, where you knew: I need to be working in this for the foreseeable future.

Speaker 1:

When ChatGPT launched and I started playing with it was probably the first moment where I was like, wow, this is actually very real. Because I've worked for a couple of companies over the years in the data and analytics space, I would describe myself as fairly familiar with machine learning, and I brought solutions to customers to help them do early web and clickstream analytics, so big data, kind of at-scale information processing. But honestly, that technology was very hard to use and very expensive to implement; a lot of companies didn't get value from it. The first time I started playing with ChatGPT, I turned around and showed it to my kids, and my kids picked it up and immediately started using it. My now 11-year-old, I think, went and started writing a story and then came back and read the whole thing to me, and he had kind of interactively gotten the AI to give him something about superheroes and dragons. And I was like, this is very real. And if you rewind back to that period two and a half years ago, the technology kind of looks like a toy in comparison to what we have today. So I think I've had a series of moments over the last two and a half years where I'm like, wow, I can't believe that's possible. Probably one comes to mind most recently.

Speaker 1:

I use a product called Superhuman for email; I've been a longtime Superhuman user. It sits on top of Gmail, and its original claim to fame was that it just goes really fast. It's a very streamlined interface, and as someone who gets thousands of emails a week, just getting through that pile is pretty difficult. They introduced some typical AI features along the way, where you can auto-draft a response, but it was never super compelling to me.

Speaker 1:

I tend to write very short, precise emails, and the AI wanted to add all this flourish and lots of extra words that I would never actually use.

Speaker 1:

I never found that all that valuable. But they just introduced a feature where it will look at an email, and if it's an open question or something that requires a follow-up and I don't get a response, it will automatically put a prompt email at the top of my inbox. It'll draft the email, and the drafting is now precise; it's basically how I would write the message. I was totally blown away. I've been using this feature for the last two weeks, and I'm going to get very soon to the point where I never have to write email again. Those reminders are just going to fire off automatically, because it's still human-in-the-loop, but the quality is such that I don't really need to look at it that closely. So I wake up every day and I get a list of these emails that Superhuman has suggested I send, to follow up on topics from the last couple of weeks that are left unresolved. It's very good.

Speaker 2:

I think I need Superhuman. I've not heard of it; it sounds like a pretty cool tool.

Speaker 1:

Yeah, I'm sadly not an investor, so I'm not even shilling my own bag here. It really is a great product.

Speaker 3:

I'm going to be downloading it shortly. You get it on the phone?

Speaker 1:

It's great.

Speaker 2:

And I wanted to share with our listeners, you all are actually the first group of people to hear this: Ian will be joining us at digitalNow 2025.

Speaker 1:

I just signed up, quite literally. It's amazing.

Speaker 2:

I just gave him the dates. He's coming, he'll be there.

Speaker 1:

It's blocked on the calendar for sure.

Speaker 3:

Awesome. Well, you know, that is eight months from now, right? Just a little bit under eight months, and that's more than one doubling of AI power in that timeframe. So, as we have our discussion today, it'll be very interesting to reflect back on this conversation, which is being recorded in late March, relative to early November. Lots will be happening in the world of AI between now and then.

Speaker 1:

There's no doubt. You know, we could sit here and try and plan out the content for that talk today, and it would almost certainly be obsolete by the time we get to November.

Speaker 2:

Agreed, agreed. So, Ian, I want to talk about Groq, where you work. Everyone, that is Groq with a Q, just so you know. For someone who is not familiar with Groq, who's never heard of it, how would you describe what Groq does?

Speaker 1:

So, Groq builds a vertically integrated AI inference platform that gives you super high quality AI responses, incredibly fast, at very reasonable prices. Now, that's a lot of words. So, inference engine: a fancy way of saying we run the models. If you're familiar with OpenAI or Anthropic, they are foundation model companies. They build models; they also run them. Groq doesn't do that. We run other people's models. We run lots of open source models. So if you're familiar with the Meta family of models called Llama, we run a lot of Llama. But we also run open source models from Google and from OpenAI, like their Whisper models for speech-to-text conversion, or transcription.

Speaker 1:

And Groq was founded by a guy named Jonathan Ross. Jonathan is famous for building a chip called the Tensor Processing Unit, which he designed at Google as a 20% project. People probably aren't that familiar with the TPU unless they're in AI, but if you've ever used a Google service, from search to mail, or ever ridden in a Waymo, the self-driving car division of Google, all of that is powered by the TPU. So the TPU is now an incredible part of the backbone of Google's infrastructure. And Jonathan, after creating the TPU, came to the realization that the rest of the world was going to catch up to where Google was and, like a lot of visionary founders, he got the direction of the market correct but, quite honestly, missed on the timing a little bit. So he founded Groq back in 2016 and assembled a world-class team to build our chip, called the LPU, which is the foundational element that gives us our speed and our economic advantages over alternatives like NVIDIA. But he was very early to market, because back in 2016, no one was talking about artificial intelligence seriously. OpenAI themselves, I think, were only created around that year. The average enterprise was hoping to do business intelligence on top of a data warehouse, and that was kind of state of the art.

Speaker 1:

Machine learning was still restricted to really the high end of the landscape, and so the company struggled commercially, actually, until that moment we were talking about a little while ago: ChatGPT launched, and suddenly we all woke up to the realization that, wow, the world, how we work, how we live, is about to change in a really dramatic way. I think the default for most people would be NVIDIA as the destination of choice to run AI workloads, but obviously they're, one, very expensive and, two, very hard to actually get access to, particularly if you're outside the US market; GPUs are kind of generally unavailable. And so Groq started getting a lot of attention from that moment on as a potential alternative, a contender, if you will, to NVIDIA. The second big moment, though, happened a little over a year ago, when we launched Groq Cloud, and so this is the second part of what Groq is.

Speaker 1:

We're not just a chip company; we're actually a vertically integrated service. So you can go to groq.com right now and interact with models running on Groq. You can experience the speed yourself, the performance, the quality. But we have a whole enterprise platform, so for anyone who's building applications that use AI, Groq Cloud is a great destination for you. We launched that a year ago. We've had over a million people sign up for it, nearly 200,000 monthly active users on our paid developer tier, and we've seen tens of thousands of signups in the last two months. So it really has been this kind of organic phenomenon of popularity, and really something special in the market.

Speaker 3:

You know, I can relate to that timeline, specifically the last part, well, really the whole timeline you mentioned, because some of the stuff we've been doing in AI dates back to the early and mid-2010s, and it was tough going. I mean, we were selling a personalization engine in this market, the association market, starting in 2014, 2015. The technology was super rudimentary back then, but it was very difficult to convince anyone that it was worth investing in. AI? You might as well have been selling sci-fi to someone. So I can definitely relate to that.

Speaker 3:

And the more recent history you described is super interesting to hear from your perspective on the Groq team, because over here in our little world, we happened to have a hackathon down in Florida, I think it was February of '24. And I remember I was at this hackathon. I had heard of LPUs and Groq just in passing at some point prior to that, so I had a general, vague idea of it. But then one of the developers at the hackathon said, hey, we've got to check this thing out, and demoed Groq Cloud, I think very early on, like you guys had just released it. And it was insane, because we're used to just kind of sitting there like, oh, you want to send your query of any significance to OpenAI? Go use the bathroom, get a cup of coffee, come back and hopefully it's there, and hopefully you don't see a "something went wrong" error.

Speaker 1:

It really is, and thank you for being a customer. I mean, I think you bet very early on the platform and were patient with us as we were scaling up, and that early support, I think, led to significant success for the company. We were able to do a large equity raise in August last year, which we've invested back into scaling our global capacity. By the end of Q2, we'll have 10x'd our fleet globally while driving really significant performance enhancements at our networking and software layers. So, combined, it's a massive scale-up of Groq Cloud, which in turn fuels the opportunity for more startups, more application builders, to come and use the service, which we're very excited about.

Speaker 2:

All right, I want to hop in here as the least technical person on this call, for all of our listeners. You mentioned a few terms, Ian: inference, TPUs and LPUs. I don't think we defined those, so just to clarify for anyone new to the industry jargon, inference is just running a model?

Speaker 1:

So if you've ever gone to ChatGPT and typed in a question and gotten a response, in the backend the model ran your input prompt and provided you an output response. That's inference.

Speaker 2:

That makes sense. And then a TPU?

Speaker 1:

The TPU is just a chip that Google uses specifically to run AI and machine learning workloads. It stands for Tensor Processing Unit. And then Groq's chip is called the LPU, or Language Processing Unit. So yeah, lots of jargon in the technology circles for sure.
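For the developers in the audience, what Ian describes maps to a short API call. Here is a minimal sketch of an inference request against Groq Cloud using Groq's Python SDK; the model name is illustrative, so check Groq's documentation for currently available models.

    # Minimal inference sketch against Groq Cloud.
    # pip install groq; expects a GROQ_API_KEY environment variable.
    from groq import Groq

    client = Groq()  # picks up GROQ_API_KEY automatically

    # Prompt goes in, the model runs, a response comes out: that's inference.
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # illustrative; see Groq's model list
        messages=[{"role": "user", "content": "What is AI inference, in one sentence?"}],
    )
    print(response.choices[0].message.content)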

Speaker 2:

Yeah, but it's helpful, and I think that makes sense. When you talked about inference with Groq, I think you used phrases like higher quality inference and faster inference. Speed makes sense, and I've seen Groq in action; it's pretty mind-blowing how insanely fast these responses are generated. What do you mean when you say high quality?

Speaker 1:

Yeah, well, if you rewind back a couple of years to the first time you saw ChatGPT, you could ask certain questions like, hey, write me an essay in the voice of a ninth grader, and you might get a pretty reasonable essay. But I think if you looked at it closely you might be able to piece together that it was probably produced not by a human but actually by an AI. And if you now go and run that same question against any of the modern LLMs, you're going to get a much higher quality response in terms of the output of the model. It might actually be hard to distinguish between human-produced and AI-produced. So on that measure of quality, I think we've seen a great progression across AI models. But specifically in the context of Groq versus other providers, there are a few tricks that people can use to go fast.

Speaker 1:

You can actually reduce the precision of a model and you'll get more tokens quickly, so the text will return faster. But the trade-off is that you then get lower quality responses, and that's often a trade that is not desirable, right? You can imagine, you want the highest quality possible at the fastest speed. And so the other corner of that triangle is cost. Usually, to have fast and high quality, you have to pay a lot. And this is where I think Groq is unique in the market: as compared to, say, NVIDIA, we're significantly cheaper, much faster and equivalent or better quality.

Speaker 2:

It sounds pretty good: cheaper, better precision, faster. In terms of speed, it definitely feels like a nice-to-have if we're just using consumer-grade AI tools, but can you share some examples of what kinds of applications might become possible when response times drop from seconds to milliseconds?

Speaker 1:

Totally. This is a great question, and actually when we first launched Groq, that was the reaction we got from a lot of people. Like, speed's cool, but I can only read so fast; why would I ever need it to go faster than I can read? Which, if you step back: there were probably people who said that when we were transitioning from horses to cars. Why would I ever want to go further than the horse can take me in 30 minutes? Or why would I ever want to go faster than a horse can gallop? I can't imagine the possible scenarios where that applies. So it will become a funny comment in the future. But let me make some concrete suggestions.

Speaker 1:

If you've ever been on hold in a call center, you call your healthcare company or the cable company and you get put in one of these queues where they tell you endlessly how important you are and how sorry they are that you're waiting. Probably one of the most frustrating experiences ever. Imagine that we can scale that call center up, not by hiring more people, but by providing an AI interaction that's as good as or better than the human-to-human interaction. So live voice, not text, not a chat app, but actually talking to an agent that sounds like a human and has all the same information retrieval and decision-making authority as a human does. That capability exists today, but it requires a lot of compute to make it possible. Because you don't want to say, hey, can you help me with my cable bill? and then wait three minutes for a response; you want it to be an interactive conversation like the one that we're having right now. And so when you go into different modalities, from textual-only to audio, speed starts to matter a lot in terms of the quality of experience. But I'll take it; go ahead.

Speaker 1:

I was going to say I'll take it one step further, and this requires maybe a little bit of imagination for the audience: when we start to talk about agents, not just a call center agent, but think of agents as being almost like employees or members of your team and staff. They have the ability to autonomously work on tasks that you direct them to take on. And I think the state of the art today, basically to frame what's possible, is you can have agents that are probably at the level of a good intern. You know, high school, college level experience; they don't deeply understand your business necessarily, but they can work tirelessly at information retrieval and synthesis, and they can probably give you a pretty good answer. That's the state of the art today in the world of agents, and as a result, you actually will carve up tasks into very specific, narrow domains, right, as you would with an intern. You don't say, hey, Amith, as my intern, can you make the business plan for the year? You probably say, hey, can you go do some research on the market landscape and product opportunity related to member management software, and here's three sources that I want you to make sure you go look at. Then you get some information back, and then you send them off on another task, and that assembles itself into all the data necessary to produce that annual business plan.

Speaker 1:

Agents kind of operate in the same way. And so, tying that all together from a speed perspective, you're now not thinking about a one-to-one interaction, you know, Mallory and Ian interacting. It's one-to-many, and I think most people are going to have a fleet of agents. Your marketing agents will support everything from content measurement to content creation, preparation to dissemination. All of that is going to be autonomously delivered, and so speed becomes really important here. With these fleets of agents that interact with each other, you're no longer operating in human terms; you're now at computer speed.

Speaker 3:

You know, Ian, building on that, one of the things we've talked about a lot since the fall on this pod is this new class of models, the so-called reasoning models, like what was called Strawberry, then o1, o3, and then the DeepSeek R1 moment earlier this quarter that everybody freaked out about. The key thing that's happening there, that we talk about on the pod, is this idea of giving the model longer to think, essentially giving it more compute cycles to do the inference you referred to earlier. Which of course is a direct tie back to your comment about speed, because if you can do more and more with less actual time, that is potentially really powerful. And there are big research moments in AI: someone writes a paper revealing that something like test-time compute, or runtime compute, correlates with this next scaling law of AI models, where you take the same size model but you run it for longer and you get a better output. It's kind of like saying, hey, Ian, what's two plus two? You can instantly answer. But if I give you a complex question and you have to instantly answer, you're just guessing, whereas if I give you a couple of minutes to think about it, you can go solve the problem. And it's very, very similar to what's happening within the chain of thought in the models themselves. So I think that feeds back to the point about why fast inference is super important.

Speaker 3:

But then, what if you do that in parallel a hundred or a thousand times within one of these agents? The other scaling law that's now being quote-unquote discovered, but is also, I think, pretty obvious, is that if you have multiple AI models running in parallel solving the same problem, and then you ask another AI to pick the best answer out of 10 or 20 or 50 or a thousand iterations of the same problem, you can actually get a much better answer again out of the exact same piece of software, the exact same model. And we ourselves, with our Skip platform, are in fact doing exactly that with Groq. One of the things Skip has to do is figure out writing SQL queries, which is pretty mundane and pretty simple. But if you have a thousand different tables of data and lots of metadata to understand what's in those tables, there are lots of ways to approach it, just like a human programmer or a human data analyst might write different queries and test them.

Speaker 3:

Well, what if Skip can do 10 or 20 or 100 at a time in parallel and get to the best query? And the same thing for writing code to present a chart or whatever. So we're actually doing this, and this was not possible even really six months ago. On the fast inference thing, I think people do go back to their personal experience in the chat application, and of course that's an amazing use case, but it's just one tiny example of where fast inference is necessary.
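For the curious, the parallel best-of-N pattern Amith describes can be sketched in a few lines of Python. This is an illustrative outline, not Skip's actual implementation; the model name is a placeholder, and a production version would need error handling around the judge's reply.

    # Illustrative best-of-N sketch: fan out N candidate answers in parallel,
    # then ask a judge model to pick the best one.
    from concurrent.futures import ThreadPoolExecutor
    from groq import Groq

    client = Groq()  # expects GROQ_API_KEY in the environment
    MODEL = "llama-3.3-70b-versatile"  # placeholder model name

    def generate(question):
        resp = client.chat.completions.create(
            model=MODEL,
            temperature=1.0,  # sampling variety yields distinct candidates
            messages=[{"role": "user", "content": question}],
        )
        return resp.choices[0].message.content

    def best_of_n(question, n=10):
        # Fast, cheap inference is what makes firing off N attempts at once practical.
        with ThreadPoolExecutor(max_workers=n) as pool:
            candidates = list(pool.map(generate, [question] * n))
        numbered = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
        judged = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content":
                f"Question: {question}\n\nCandidates:\n{numbered}\n\n"
                "Reply with only the number of the best candidate."}],
        )
        return candidates[int(judged.choices[0].message.content.strip().strip("[]"))]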

Speaker 1:

It's totally true. I think the anecdote that stuck with me on these reasoning models, if people haven't used them, is: imagine that I asked you to write a five-paragraph essay and you're not allowed to use the backspace key, so you only get the opportunity to type forward. How well is that essay going to be written? You can't go back and revise a previous sentence. If you have a typo, you can't delete it. You're only allowed to print forward.

Speaker 1:

And that's a non-reasoning model, right? It is, at the basic level, a predictor of the next few characters of text, and each character follows the last, whereas the reasoning models can go back and edit. And so if you just think about your own ability to produce that perfect five-paragraph essay in one shot, versus your ability to write out a draft, then go back and review the draft, realize that maybe the first paragraph really should be the second paragraph and the last sentence doesn't make any sense and you want to rephrase it: you end up with a much higher quality product. It takes more time, but with Groq you can shrink that time back down to be better than or equivalent to a non-reasoning model. So it's a dramatically different experience and a significant improvement in output quality.

Speaker 2:

Ian, are there any particular sectors or areas where you're seeing especially strong interest or innovative applications of Groq?

Speaker 1:

Yeah, I think there's a couple of things going on. I think about this a lot, actually, because I have three kids, two of whom are teenagers, so there's always the question of, what should I study in school, where should I apply my time? And it seems clear to me that the models are very, very good at information synthesis. So if you think about any role where the job is basically to collect data, summarize that data and then present that information to somebody higher up the organizational chart who's making decisions: those jobs, to me, are very likely to be replaced, maybe first augmented, but ultimately replaced by AI systems. I was in New York recently and met with a couple of our customers who are building what I would describe as equity market analysts. And they actually demoed it: a company earnings call was going on, and they're taking a live transcript of that call and comparing it against every previous earnings call transcript, public statement and press release the company's ever put out. So they're looking for inconsistencies, they're looking for changes in previously stated strategy. They're comparing the financial numbers against the previous forecasts as well as what the market consensus was going into the earnings call.

Speaker 1:

You know, that used to be a team of people, not particularly senior people in the organization, right? That's like an entry-level kind of Wall Street job. That's now a software application. And you can find that in other industries as well. I think about legal work. There's the partner who has a ton of experience in tricky situations; they've worked in the industry for a long time, they have a lot of context. But underneath that partner you have a team of associates, and underneath them you have a team of first-years, and those first-years do, again, that information synthesis task. They'll go research precedent and case law, they'll do all the discovery analysis. All of that is being consumed by AI. And so I think you can take that paradigm: anything that's information synthesis and summarization should probably be moving from a human-driven task to an AI, initially augmented and then supplanted. I think the other thing is that we're seeing accessibility of information change quite a bit.

Speaker 1:

So if you imagine how the three of us would have had a meeting, let's say, pre-pandemic five years ago: we probably would have scheduled that meeting and tried to get together in person. I would have flown down to New Orleans and we would have sat across the table from each other. Then the pandemic happened and we all ended up on Zoom or one of the other equivalent meeting services, but the meeting itself still stayed in that virtual room. Right, you could record it and distribute the recording, but let's be honest, how many of us sat down and watched somebody else's meeting? It certainly is not my favorite way to do it. But now, with AI in the meeting, not only can I get a recording which is turned into a full transcript, I can get a summarization that's very focused on the type of conversation that I'm trying to have. So if it's a candidate interview, it focuses on candidate strengths, weaknesses and background. If it's a customer meeting, it focuses on next steps and key elements of the discussion. And that's incredibly productive. But what's really valuable is that now I can share that content with everyone in the organization. So you get a level up in information sharing, which is hard as organizations get bigger; keeping everyone on the same page is changing dramatically.

Speaker 1:

And then maybe the third example I'll give is the whole concept of customer service, which has been economically driven.

Speaker 1:

You know, like call centers are sort of the most expensive means of interacting with your membership or your customers.

Speaker 1:

And so the entire strategy, if you operate a call center of any scale, is how do we have people not actually talk to humans?

Speaker 1:

Those are the metrics: how many people can we deflect from the call center, and how quickly can we end the calls that do show up in the call center? Which, if you think about it, is totally misaligned with providing a quality customer experience. It's like, oh, if you have a question, you should probably just go to my website and research it and find it on your own. And I think what's happened with AI is now we can actually provide the most knowledgeable person in your organization, times an infinite number of clones of that individual, who is always available, never tired, never goes on a coffee break and can be available via text, audio, whatever format or modality you want. And so it just expands the meaningful ways in which you can interact with your customers or your membership, well beyond what we were able to do in a financially reasonable way in the past. So I would categorize the trends that I'm seeing along those three axes.

Speaker 3:

You know, related to the last point you made, in the world of associations we're talking about member services.

Speaker 3:

Much like its corporate counterpart, customer service, member service is viewed in a similar light.

Speaker 3:

I wouldn't say that it's looked at exclusively as a cost center, but it's definitely thought of, at least partially, in that way, and associations have attempted to put in place various kinds of technology to help improve and automate and increase efficiency and so forth.

Speaker 3:

But principally, the way I would translate what you said is that it's been all about the organization having lower cost per transaction or per interaction, which leads to the misalignment you described. And I think, beyond just the efficiency, what you said a moment ago about having the most knowledgeable person in your organization, or maybe even the most knowledgeable person on the planet, available at all times, not only solves the problem the customer thinks they have, but also flips the script and creates the opportunity in our world for associations to provide more value, more continuously. So that rather than the member only calling when they really have to, because they don't quite like the experience for the reasons you mentioned, they're like, wow, this is amazing. I really should be calling this member services hotline a lot more often, or emailing them, because I can get instant, amazing results.

Speaker 1:

A hundred percent. If people haven't tried 1-800-CHATGPT, you have to do this. I don't make a lot of actual phone calls these days; I'm more of a text message person. But the experience is incredible, and the technology is now at such a state that it's reasonable for everyone listening to this podcast: your organization could offer a similar experience.

Speaker 2:

Well, this might be more of a question for you, Amith, but on paper, as I said, Groq seems amazing. Groq with a Q: faster, better quality. I think member services is a huge category, which we actually just did a series of episodes on, Ian, right before this one, talking about AI-augmented or -enabled member services. But, Amith, I'm wondering, for our listeners who are in agreement that Groq is amazing, sounds good: what do we do with this information? Do we need to be asking our vendors if they're using Groq? Do we need to be doing something ourselves with Groq? Amith, what's your take on that?

Speaker 3:

My thought process, first and foremost, is that the awareness of this technology existing and being available at scale, at low cost, with high quality, is important to note in your brains. Because a lot of people still have year-old or two-year-old experiences lodged in their minds, saying, yeah, ChatGPT was interesting, but it was not that great, and it was kind of slow.

Speaker 3:

I can't imagine using that with my members, right? And so that type of dated perspective can be problematic in terms of strategy when you think what is possible. So I think that's the first thing. And then the second part that I would move on to share is that when you think about having effectively unlimited free or not literally free, but close to free cost of incremental inference and heading in that direction, right, the cost curves are very quickly pushing all these costs further down. It opens up new possibilities, right. The thing I mentioned about running 10, 20, 50, 100 queries in parallel to test the best query, to give the customer the best answer. That wasn't possible even six months ago and it would have been potentially technically possible but just not cost effective, not reasonable in terms of time. So I think, more than anything, it's about opening your mind to the possibilities and starting to rethink business models.

Speaker 2:

Ian, I wanted to ask about a saying that I hear many times, not just about tech, but in business and life all the time, which is the cobbler's children having no shoes. I want to know how Groq, one of the leading AI companies in the world, is using artificial intelligence internally. Because you would think, right, that if you're enabling all these incredible things with inference, you must be doing incredible things internally as well. So I was hoping you could talk a little bit about that.

Speaker 1:

Well, we definitely use it in software development, which for this audience may be not as interesting as some of the other items I'll talk about, but it's an accelerant. We're a relatively small organization; Groq's about 300 employees, and we're building a lot, right? We're doing everything from semiconductors to deploying a global cloud to the whole software interaction layer that wraps around that to make it easy to use for everyone that's a customer. So in that way, AI is scaling our ability to build and deliver technology. But I'll pull a couple of examples from my organization.

Speaker 1:

Just recently we had a proposal response where a prospective customer had sent us an RFP, a fairly complex document, like 10 pages of questions, and we diligently prepared a response, kind of a typical human-driven process. Multiple people reviewed it; it was on my desk, it was on a couple of the other exec team members' desks, because it was a pretty important customer opportunity. And at the end we're like, yeah, we're ready to ship this thing. Then we said, hey, let's just let the AI take a look at it. And so we took both the proposal and the original RFP, submitted them and said, hey, what do you think about our proposal? Is there any way we can improve it? That was the prompt, nothing more complex or sophisticated than that, and the response that came back was incredible. It caught that we had some inconsistencies; because it was a pretty long RFP, they'd asked kind of the same question in a few different ways. It's like, well, in section one and section three you're referencing the same thing, but you've indicated different values in the response. And, you failed to clearly answer this question out of section two. And, section four is just kind of poorly written; you could clean this up and the clarity would be much higher. It was like we had an expert member of the team with both very good copywriting skills and the domain knowledge to reason about what was a very technical subject. And so this, to me, was one of those aha moments we talked about earlier, where I wasn't actually sure what I would get back, if there was going to be anything valuable in the response. It was really kind of a final check-off in the process that we just happened to throw in there, and now I'm like, wow, I'm never going to send a proposal to anyone again without doing this step, because it clearly caught some things that were missed by the humans involved. And so I think we're constantly pushing the boundaries of what parts of our jobs we can automate, eliminate or augment, and it's a lot of fun. I really encourage people to go out and try it.
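As a rough sketch of that kind of review pass (the prompt mirrors the one Ian quotes; the file names and model name are stand-ins for illustration):

    # Rough sketch of an AI audit pass over a proposal before it ships.
    from groq import Groq

    client = Groq()  # expects GROQ_API_KEY in the environment

    rfp = open("rfp.txt").read()                   # the customer's original RFP
    proposal = open("proposal_draft.txt").read()   # the human-written response

    review = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # placeholder model name
        messages=[{"role": "user", "content":
            f"Here is an RFP:\n\n{rfp}\n\n"
            f"Here is our draft proposal:\n\n{proposal}\n\n"
            "What do you think about our proposal? Is there any way we can "
            "improve it? Flag inconsistencies, unanswered questions, and "
            "sections that could be written more clearly."}],
    )
    print(review.choices[0].message.content)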

Speaker 1:

Your comment earlier, amit, about people having one experience where like, oh, it couldn't do this I think the mindset shift that's necessary to excel in the current time is to assume the AI is like one of your kids or an intern. As I said earlier, they're going to need fairly precise instructions, but they learn really quickly, and so, if you don't quite get the answer that you want or the result you know, send another prompt like refine your original request, provide a little bit more information. Sometimes you can even chain together prompts, so you can ask one LLM hey, I want the following things Can you help me write the prompt that will get this other model to do this for me? And so you get the prompt written by one AI and then you hand that prompt off to another system and you tend to get much better results.

Speaker 1:

We actually have some customers who are doing that. They've built their own image generation model, but they use Groq to improve the customer prompt. So a user might come in and say, hey, I want a picture of leprechauns, because it's St. Patrick's Day, as we just had a few days ago, dancing in a field. But that prompt is not particularly specific, so you're liable to get either low quality or kind of random images. So they'll feed that, transparently to the user, through an LLM running on Groq, and the output of that goes into their image generation model, and they get much higher quality images as a result.
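A two-stage chain like the one Ian describes might look roughly like this; the refiner model name is a placeholder, and generate_image stands in for whatever image model the customer runs.

    # Illustrative prompt chain: an LLM expands a terse user request into a
    # detailed prompt, which is then handed to a separate image model.
    from groq import Groq

    client = Groq()  # expects GROQ_API_KEY in the environment

    def refine_prompt(user_request):
        resp = client.chat.completions.create(
            model="llama-3.3-70b-versatile",  # placeholder refiner model
            messages=[{"role": "user", "content":
                "Rewrite this image request as a detailed, specific "
                f"image-generation prompt:\n\n{user_request}"}],
        )
        return resp.choices[0].message.content

    detailed = refine_prompt("leprechauns dancing in a field for St. Patrick's Day")
    # generate_image(detailed)  # hypothetical call into the customer's image model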

Speaker 3:

Yeah, it's pretty awesome.

Speaker 3:

It goes to the point both of how quickly this stuff is changing but also the skill sets that you need to work effectively with these tools, like this whole area of prompt engineering or prompt design.

Speaker 3:

And, you know, the smarter the AI gets, either through an iterative model or a multi-stage model like you're describing, or just because the model's smarter inherently, it might make prompt design perhaps a little bit less of a challenge on the one hand. But on the flip side of that, there are always skills to learn, and so what you're describing, I think, is a good reminder to all of our listeners to just continue to hone and refine your skills and not be stuck in a particular mindset. I find myself doing this all the time, where I just get used to using a tool a certain way and I ask the same questions. Right, we all have that. Our neural nets have those Grand Canyons that get formed after multiple repetitive uses of a particular pathway, and it's hard to break out of those Grand Canyons. So it's really, really interesting. The AI has no such problem, but we have to reinforce our own desire to change, constantly learning.

Speaker 1:

My wife and I are taking the kids on vacation in a few weeks, and we have the opportunity to design a menu from a set of choices that we'll have while we're on vacation. My wife started kind of filling it out by hand, asking everybody what they want to eat. And I just took this PDF menu, threw it into an LLM and said, hey, five of us are going on vacation, we've got five days' worth of meals, but we're going to go out to dinner one night, so skip that one. And we want to have a particular breakfast over Easter, so make sure that's pancakes. And we're going to depart early morning, so you don't need the full day, just breakfast.

Speaker 1:

Give me a menu. And it spit it back in like two seconds, and it wasn't perfect; it recommended some things that the kids were like, oh, that's gross, Dad. So, okay, well, the kids don't like these foods, can you fix it? And then pretty quickly we got the full menu taken care of, in like three minutes. So you can even find stuff in your own life that has nothing to do with work where you can start to apply the technology, and you start to feel out the boundaries of what's really useful and what maybe doesn't quite work so well yet.

Speaker 3:

I did something similar the other day. I had this ski trip with a group of friends, and I had a whole bunch of data in an Excel spreadsheet about money we spent, shared expenses, reallocating it, all this stuff. I just dropped it into an LLM and said, hey, analyze the spreadsheet. First of all, are there any incorrect things in there? And on top of that, tell me where we could have been smarter in terms of how we spent money, and give me an idea of how we spent the money on this trip.

Speaker 3:

And that was actually run through Anthropic's latest Claude 3.7 Sonnet model, on the desktop, where I enabled the feature that lets it run its own code inside the artifacts feature. So it started writing a bunch of code, and I was watching it write the code. I'm like, man, that's pretty good. And then it started running the code. It had errors, it ran the code again, had more errors, it fixed them, and then after about maybe 45 seconds it showed me this thing, which was like a pie chart of how we spent the money, the allocations across different people. It called out that one of the guys rented a particularly expensive car: yeah, you didn't need to spend that much on this rental. I thought it was awesome. I'm like, that's way too much inference to throw at this; I probably burned a thousand gallons of water or something.

Speaker 1:

But yeah, it's so funny how the LLMs are very conservative on money spending. I've had other people tell me stories about throwing their credit card bills in there and asking for guidance. It's like, you know, you really shouldn't be taking so many Ubers; public transportation's available, you'd save like $70. So there's something in the training data set that is very frugal. They've been watching late-night CNBC or something like that, the Money Stuff podcast or something, maybe.

Speaker 2:

Claude calling you out, so you don't have to call yourself out, or your friends out. Ian, I'm going to put you on the spot. Amith mentioned Claude; I'm a big Claude fan over here. Do you have a preferred model that you go back to, or do you kind of switch it up?

Speaker 1:

You know, I tend to switch around based on tasks, but I definitely like Claude. I think what they're doing in terms of software development is outstanding, and the user interface is really good. I'm a big fan of Perplexity because they've really focused on web search combined with LLMs for answering questions, and now their deep research feature, I think, is fantastic.

Speaker 1:

I was having a debate with one of the folks here at Groq the other day about the cost of operating infrastructure and data centers in Europe, and they had a particular point of view on how much more expensive the UK was relative to someplace like Norway, where energy is abundant and nearly free. And I said, I don't think you're right. And they said, no, no, I'm absolutely right, I know the answer. And so I went and looked it up, and as opposed to having to assemble all this data myself, I just fired off a quick, basic query, like, give me the operating expenses for data centers in Europe. In a couple of minutes it came back with research, with citations about the sources of data, and it turned out I was right. It's about 2x more expensive to run a server in the UK versus Norway, but not 10x, which was the point my colleague was arguing. So I'm a big fan there.

Speaker 1:

OpenAI has got some great tools as well. Just yesterday they launched a pretty incredible update to their audio models. So if you haven't, go play with OpenAI.fm; you can interact with a bunch of different generative voice models. It's a ton of fun.

Speaker 3:

Will do. We've talked about audio a couple of different times so far on this pod in different contexts, both in terms of the importance of inference speed to provide real-time, high-quality audio, which you just mentioned, and because there's so much innovation happening in the world of audio, a lot of it in open source. I mean, even OpenAI has played that game, at least on one side of the audio equation, with Whisper, which you guys have turbocharged on Groq, which is super exciting. And I find that to be particularly compelling because that's kind of our natural modality. That's the way we tend to interact with each other, so the fidelity and resolution, so to speak, of audio over text is dramatic, and I think there's an opportunity there.

Speaker 3:

The question I wanted to ask you, Ian, is about a different domain. We talked about research and synthesis of information, perhaps some analytics. But I have teenagers as well, as you know, and guiding them on career decisions, I think that's a fool's errand on a good day, but nonetheless I try to do it myself, and I just say, try to do things where there's human connection and communication, where that's emphasized. My question to you is, especially with audio becoming so lifelike, what are your thoughts on AIs being able to form connections, of sorts, with people? You might put the connections in air quotes, but nonetheless it might feel as such to a lot of users.

Speaker 1:

Yeah, it's not something that I've personally gotten into. To be honest, like many people, I've probably seen the movie Her a few too many times, which paints a little bit of a dystopian outcome for getting too attached to your AI. But I definitely know people who have formed those bonds, and they talk emotionally to the AI. They say please and thank you. Although I've been told that, it turns out, if you're mean in your prompts you actually often get better results.

Speaker 2:

I'm pretty nice in my prompts. I'm just going to say it.

Speaker 1:

Yeah, exactly. You're probably nicer to it than you might be to some of the real humans in your life. It's a really common thing, and so I think there is a real potential, as these things get more lifelike and they get much more emotional, or sympathetically emotional, maybe, that you're just naturally going to be attached to them. This was kind of the pitch around the GPT-4.5 model. OpenAI came out and said, hey, it's not a frontier model, it's not going to blow people away with new achievements on benchmarks, but we think it's going to be much more pleasing to interact with as a human companion, which is a very interesting positioning tactic. And that was kind of the early result: people were like, oh yeah, I actually feel like I could chat with this thing for a while and talk about, you know, not work problems, not "synthesize this data," but, here's something going on in my life, give me some advice. So I think we're going there. It's going to happen.

Speaker 2:

As we wrap up this episode, Ian, I want to hear: what are you looking forward to at Groq, and what AI infrastructure developments do you think we'll be seeing in the next few years, six months, whatever that may be?

Speaker 1:

Well, we're on a mission to bring inference to the world, which means both making it globally available and affordable for everyone. So we're investing significantly in our core technology. We've got new chips coming this year that we're super excited about: faster, more efficient, able to run even bigger models as the frontier model labs continue producing those. We're rolling out new data centers around the world, adding capacity every week and bringing that to more markets all around the world. And we're building some really cool application-level tools to make consuming this easier and easier. So stay tuned for more on that front. But going from idea to working in production is just going to get shorter and shorter as we go through the year. So it's a lot of fun this year.

Speaker 2:

Awesome, awesome. Ian, thank you so much for joining us on the Sidecar Sync podcast. This was a really fun conversation and I know our listeners are going to get a lot out of it, so thank you for joining us.

Speaker 1:

My pleasure, Mallory.

Speaker 3:

Thanks for tuning into Sidecar Sync this week. Looking to dive deeper? Download your free copy of our new book, Ascend: Unlocking the Power of AI for Associations, at ascendbook.org. It's packed with insights to power your association's journey with AI. And remember, Sidecar is here with more resources, from webinars to boot camps, to help you stay ahead in the association world. We'll catch you in the next episode. Until then, keep learning, keep growing and keep disrupting.