Sidecar Sync

Andrej Karpathy's Eureka Labs, ChatGPT-4o Mini is Mighty, and Llama 3.1 Unveiled | 40

Amith Nagarajan and Mallory Mejias Episode 40

Send us a text

In this week's episode of Sidecar Sync, Amith and Mallory dive into the latest innovations in artificial intelligence and their implications for the association sector. They explore Andrej Karpathy's new venture, Eureka Labs, and its revolutionary AI-native education platform. The discussion then shifts to the release of ChatGPT-4o Mini, a smaller yet powerful model from OpenAI, and Meta's groundbreaking Llama 3.1 models. Listen in as they unpack how these advancements can transform education and professional development within associations and beyond.

🛠 AI Tools and Resources Mentioned in This Episode:
Eureka Labs ➡ https://eurekalabs.ai
ChatGPT-4o Mini ➡ https://www.openai.com
Llama 3.1 ➡ https://www.meta.ai
Khanmigo ➡ https://www.khanmigo.ai/ 
Andrej Karpathy's YouTube channel ➡ https://www.youtube.com/@AndrejKarpathy

Chapters:
00:00 - Introduction
05:09 - Andrej Karpathy’s Eureka Labs
09:08 - The Importance of AI in Education for Associations
12:22 - Exploring Khan Academy's Khanmigo and Its Implications
17:56 - Overview of ChatGPT-4o Mini and Its Capabilities
22:10 - Impact of Cost Reduction in AI Models
24:43 - AI Generated Tagging and Its Applications
28:24 - Meta's Release of Llama 3.1 and Its Significance
34:06 - Capabilities for Synthetic Data Generation
40:24 - Synthetic Data Generation and Its Applications
46:08 - Open Source vs. Closed Source Debate in AI
48:26 - Closing

🚀 Follow Sidecar on LinkedIn
https://www.linkedin.com/company/sidecar-global

👍 Please Like & Subscribe!
https://twitter.com/sidecarglobal
https://www.youtube.com/@SidecarSync
https://sidecarglobal.com

More about Your Hosts:

Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.

📣 Follow Amith on LinkedIn:
https://www.linkedin.com/in/amithnagarajan

Mallory Mejias is the Manager at Sidecar, and she's passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.

📣 Follow Mallory on Linkedin:
https://www.linkedin.com/in/mallorymejias

Amith:

It's going to be an explosion, right, like this Cambrian explosion of models we're already seeing. You're going to see more of it because of this particular decision. Welcome to Sidecar Sync, your weekly dose of innovation. If you're looking for the latest news, insights and developments in the association world, especially those driven by artificial intelligence, you're in the right place. We cut through the noise to bring you the most relevant updates, with a keen focus on how AI and other emerging technologies are shaping the future. No fluff, just facts and informed discussions. I'm Amith Nagarajan, Chairman of Blue Cypress, and I'm your host. Greetings everyone and welcome back to another episode of the Sidecar Sync. We are excited to be back with you today with some crazy interesting news, all about artificial intelligence, and we're going to tell you all about why these particular topics matter so much to the association sector. My name is Amith Nagarajan.

Mallory:

And my name is Mallory Mejias.

Amith:

And we are your hosts. Before we dive into all of the fun and crazy things that have been happening in the world of AI and how they apply to associations, we're going to take a moment for a quick word from our sponsor.

Mallory:

Today's sponsor is Sidecar's AI Learning Hub. The Learning Hub is your go-to place to sharpen your AI skills, ensuring you're keeping up with the latest in the AI space. With the AI Learning Hub, you'll get access to a library of lessons designed around the unique challenges and opportunities within associations, weekly live office hours with AI experts, and a community of fellow AI enthusiasts who are just as excited about learning AI as you are. Are you ready to future-proof your career? You can purchase 12-month access to the AI Learning Hub for $399. For more information, go to sidecarglobal.com/hub. Amith, we are on episode 40 of the Sidecar Sync. I can barely believe that you and I have been meeting every single week to talk about AI, at least for an hour, for 40 weeks. What do you think about that?

Amith:

I think it's been a lot of fun. It's crazy how quickly it's gone by too, because it does seem like we just started it. So 40 episodes, it's a good start.

Mallory:

It is a great start. We're also seeing a lot of new listeners or should I say viewers joining us on YouTube, so if you are listening on your favorite podcasting platform, you should also subscribe to our YouTube channel and join us there.

Amith:

Yeah, YouTube is a fun medium and I think over time we will experiment with some additional things we can do for our YouTube viewers.

Mallory:

For sure. It also means you and I have to look just a bit nicer, remembering that we do have a YouTube audience as well. Amith, yesterday I told you a little bit about this. I got the chance to co-host an Intro to AI webinar for an association of HVAC companies. That's something that we offer at Sidecar and it was a really great session. It was kind of a mock-up of our regular Intro to AI webinar that we do for associations and nonprofits every month, but this time all the examples and use cases were tailored to HVAC companies, so it was definitely a test of my AI skills and knowledge. We mostly used the same tools. We showed ChatGPT, Claude, Gemini and Midjourney, but I will say the types of things we were doing with these tools were a little different.

Amith:

Yeah, it makes sense. You know that last mile problem, as they'd say in telecom, or it's like that final bit of contextualization where you make it exactly what these people think about in their terminology, with their examples. It makes such a big difference because you're taking away that intellectual gap that someone has to translate a generalized concept or a similar concept that's in another space into their world. So if you're talking to accountants and you're using examples from lawyers, sure the accountants can pretty much understand what you're talking about, but they have to translate it in their mind. So you've done that work and I'm sure that it was well-received that way. That's really cool.

Mallory:

Yeah, we've gotten some good feedback so far. For me particularly, I normally focus on Midjourney in that webinar, and typically with Sidecar we are creating kind of cartoon images, if you all have seen our branding before, but HVAC companies are creating images of real people. So it was definitely a challenge for me to learn how to create more photorealistic images in Midjourney, but, happy to say, I've added that skill to my toolkit.

Amith:

Yeah, that's really cool. Well, and I'm excited about some of the content around video, images and all these new AI tools that we're including in the forthcoming edition of Ascend, which the team here has been hard at work on for a number of months, and I think we're probably going to have it out there on Amazon in the next two or three weeks, so I can't wait to see that drop.

Mallory:

Absolutely. Everybody, stay tuned for Ascend's second edition. Today, we've got some exciting topics lined up. We're going to be talking about Andrej Karpathy's new startup, Eureka Labs. We're going to be talking about GPT-4o Mini and the release of another family of models, I should say, Llama 3.1. So it has been a really exciting couple of weeks in the world of AI, starting with Andrej Karpathy. If you don't know him, he's a Slovak-Canadian computer scientist renowned for his contributions to artificial intelligence, particularly in deep learning and computer vision. He was a founding member of OpenAI, the company behind ChatGPT, he became senior director of AI at Tesla, and now he's launched a new venture called Eureka Labs.

Mallory:

So what is it? Eureka Labs is described as an AI-native education platform that seeks to create a new kind of school. Eureka Labs envisions a learning environment where AI and human teachers collaborate seamlessly, allowing anyone to learn anything efficiently. The platform will use AI teaching assistants to support and scale the efforts of human teachers. These AI assistants will guide students through expertly designed course materials, making learning more interactive and personalized. This teacher-plus-AI symbiosis is expected to expand both the reach and depth of education, allowing more people to learn a wider array of subjects more effectively.

Mallory:

Now, its inaugural offering is an AI course called LLM101n. It's an undergraduate-level class designed to guide students through the process of training their own AI, similar to a scaled-down version of the AI teaching assistant itself. The course is available online, with plans to run both digital and physical cohorts. Overall, Eureka Labs is targeting the education sector, particularly focusing on digital learning and AI education. The platform is expected to appeal to universities, online learning platforms, tech enthusiasts eager to explore AI and, I'm thinking, even associations, Amith. So what were your initial thoughts when you saw this release?

Amith:

First of all, for those in the world of AI, Andrej Karpathy is one of those people where you just follow what he talks about and you listen very closely, because he's a brilliant mind and he has historically been pretty generous with his time in regards to sharing ideas and kind of open communication with the community. In fact, he has a series of YouTube videos that are really great, that explain lower-level concepts than most folks want to dig into. But if you want to dig deeper on some of the fundamentals of AI and deep learning, he has some great stuff on ideas like backpropagation and other things that are really interesting, and I'd encourage people who want to go deeper to check him out. You can just Google him on YouTube, and we'll include the links in the show notes. So anything he does, I think, is worth noting.

Amith:

And then, specifically, what caught my attention about this is that it's in the bullseye of what I think associations need to pay attention to: how do you deliver professional education, or education of any kind, in a better way, right? How do you do it differently? How do you take that next leap forward? And what they're talking about at Eureka Labs seems to be similar to what the Khan Academy has done with their Khanmigo, which we've talked about in prior episodes of this podcast, and similar to the vision that we laid out for what associations should be doing from an education perspective in this forthcoming edition of Ascend, where we talk about personalization, we talk about tutoring, we talk about the whole idea of guiding the learning journey based upon the individual. So it seems to touch on similar themes. Because Karpathy is a fundamental research scientist, I'm thinking they're going to be creating some new innovations that are different than just applying large language models. Like, Khanmigo is really cool, but they've essentially just taken GPT-4 and tailored it to work in the context of what Khanmigo needs to do.

Amith:

So I'm very curious, as they share more, whether there are some fundamental improvements in the models that they're making that make this a better fundamental technology platform for education. I think people need to pay attention really closely to this, because it's at the heart of what many associations consider a key part of their value prop: the way they deliver education.

Mallory:

If you go to the Eureka Labs website right now, you see kind of a big paragraph of text and then a link to that course that I mentioned in the intro for this topic. But there's not a ton of other information just yet, as I guess they're kind of building out their products and offerings. But something that stood out to me was the idea of the human plus the assistant. I feel like a lot of these tools that we're looking at these days tend to lead with "this can do it all for you." But just the fact that they're leading with this AI-plus-human symbiosis, I think that's really powerful.

Amith:

Yeah, I agree, and I think it touches on the idea of how can the humans involved in education spend more time on the human part of education as opposed to the things that are automation capable, and so that's exciting.

Amith:

I also think they and many other institutions are pursuing a similar vision.

Amith:

That's exciting because the 8 billion people online could have access to the best tutoring, the best education, and that's really what AI promises to bring: the ability to, you know, personalize and to deliver meaningful education.

Amith:

If you think about your own experiences in life, and this has certainly been my experience in my journey, sometimes you encounter a teacher, whoever that may be, it might be in a formal setting or an informal setting, and that person really makes a big impact on your life because they inspire you. They perhaps get you interested in a subject that you may not have thought you were good at or may not have thought you were interested in, and a lot of that isn't because of the actual content, but it's the way they connected with you, the way they contextualized it, the way they just related to you. So I think there's more opportunity for that with the humans involved. But I also think that there aren't enough humans who are good at that to do it at scale and have that kind of impact for every person on the planet. So if AI can play a role to approximate that or be a facsimile of that, that's exciting.

Mallory:

I'm thinking in terms of prompt engineering and how we always say it's good to tell the AI to adopt a persona. So I'm going to tell you that, Amith, for this next question. If you were the director of education at an association, or even someone who worked at a company and was in charge of education, what would you be doing right now to prepare for a potential product like this?

Amith:

Well, if I was a director of education at an association and I wasn't super familiar with what we were just talking about, I would do everything I could to learn about the fundamental capabilities. So the first thing I'd recommend is go to the Khan Academy; it's just khanacademy.org. We'll include that in the show notes. There's free access to Khanmigo now available. It used to be like 10 bucks a month or something, but through a grant from Microsoft, they made it available for free to every K-12 educator in North America, and I think you can get a free trial for anyone else as well. But if you have to pay $10, pay $10. The idea is, go in there and play with this thing. What they've done is, I think, a pretty good preview of what might be possible in your own organization.

Amith:

So Khanmigo is pretty interesting because Khanmigo learns the student and is able to guide the student in kind of a Socratic way, as opposed to just providing the answer. Think about, like, you know, your experience with Sonnet or with ChatGPT. You say, hey, what's the answer to this question? And it gives you the answer. Well, but is that really the best way to educate someone? Are you really teaching them anything? Sure, it's solving your problem, but then you're moving on to the next thing.

Amith:

But if I go to Khanmigo and I say I don't know how to solve this problem, Khanmigo is not just going to give me the answer. It's going to try to guide me to the right answer by helping me learn how to solve the problem. So that's the difference. That's a significantly different approach than what a generic large language model would do. I suspect that has something to do with the way they're training models at Eureka Labs as well, as they're optimizing for this use case, whereas Khanmigo is built on top of a generic language model. If Karpathy's team is building a model that's good at this at the fundamental level, where that kind of mindset of being a good educator is built into the model itself, that could be a new level of capability.

Mallory:

To take that preparation step a bit further, do you think the director of education in this imaginary scenario should be taking a catalog of all of their educational offerings and seeing which might be worth training a new AI model on? Is there anything else, kind of more tangible, that they could be doing in the next six months?

Amith:

Beyond getting that basic sense of what the AI technology currently can do, which is important, and I suspect most education directors and other similar roles in associations haven't really taken a deep dive, I do think taking an inventory of your current offerings and thinking about where you can apply this would be good. Rather than thinking about how to apply it to everything, it might make sense to say, well, could we potentially apply it to just one of our courses? Could we apply it to something really simple, that would be fairly easy on a comparative basis to the entirety of what we offer? Because many associations have quite comprehensive offerings in this whole world of education. There are also obviously different modalities. You have live instruction, you have web-based instruction that's synchronous, you have asynchronous courses in an LMS. You have a lot of different modalities, and the way AI can help will potentially be different in each of these scenarios. So I think it is important to start thinking about that and maybe run a small experiment, see if you can maybe use one of these new tools. And the other thing you could do is take some of the content from one of your courses and upload it into one of these AI tools. Again, you have to be thoughtful about privacy of your data and things like that, making sure you're comfortable with the vendor that you're working with. But you could take your content to something like Claude Sonnet, ChatGPT or Gemini, give it to the model, and then ask questions in the context of a good prompt, to your earlier point about prompt engineering, and say, hey, this is what you are doing, you are the tutor on this subject matter, and then see what the interactions are like. That will give you a basic idea, you know, because that AI will then have familiarity with your content. My main thought here is that this announcement from Eureka Labs is a really important signal that there is mainstream, you know, fundamental research interest in this topic. Not that this is the first time this has been stated.
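
For readers who want to try the experiment Amith describes, here is a minimal Python sketch using OpenAI's API. The file name, prompt wording, and learner question are placeholder assumptions, not a recommendation of any specific setup:

```python
# Minimal sketch of the "upload your content, prompt it to tutor" experiment.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable. The course file and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

# Your association's own course material (hypothetical file name).
course_excerpt = open("course_module_1.txt").read()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a patient tutor for the following course material. "
                "Guide the learner Socratically: ask questions, give hints, "
                "and only reveal the full answer as a last resort.\n\n"
                + course_excerpt
            ),
        },
        {"role": "user", "content": "I don't understand this module. Can you help?"},
    ],
)

print(response.choices[0].message.content)
```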

Amith:

Education is one of the obvious bullseye targets for AI to make a big difference. I think there's a lot of excitement in the field, but if you're an association thinking that, you know, you're doing just fine and you're delivering value to your professionals and all of that, maybe you are, but you've got to look at this technology and understand that it fundamentally changes the expectations of the consumer. People are going to be receiving education through AI-enabled platforms and really AI-native modalities in the very near future. Some people already are, and if they come to your association, they're receiving a very static, one-size-fits-all type of experience. That's just the kind of thing that, up until recently, might have been considered state-of-the-art, with a really nice LMS on the web and on a mobile device. It's not going to be good enough for long. So you have to look at it both in terms of that risk, which is the friction you're creating if you don't enable your content with this technology in the near future, but also in terms of the opportunity.

Amith:

The flip side is that a lot of associations look first at what they consider the center of their universe, the nucleus of their audience. Well, let's say, hey, we deal with this medical specialty or this other area, this very narrow vertical. But their content may have applicability in other contexts as well, and so AI potentially could be helpful in broadening their market appeal. So I think it's both a risk mitigation thing but, more importantly, an opportunity to deliver much greater value to their current audience and potentially expand the audience. So I think that's the thing people have to look at and say, well, if major players in Silicon Valley like this are really putting their focus on education, associations hopefully will see that as a great opportunity.

Amith:

I think maybe Eureka Labs would be a great partner for some associations. I don't know what their business model is. We're certainly going to be investigating it, because whatever they're doing is going to be interesting. But there will be many companies like this, right? There are others out there doing this. It's time to take a look.

Mallory:

Moving on to topic two for today, GPT-4o Mini. On July 18th, OpenAI, the company behind ChatGPT, released GPT-4o Mini, and we covered the release of GPT-4o on the podcast earlier this year. The new release is pretty much what it sounds like: a small model that is significantly smarter, cheaper and just as fast as GPT-3.5 Turbo. So, diving into a few key features of GPT-4o Mini: this model is over 60% cheaper than GPT-3.5 Turbo. Despite being a smaller model, GPT-4o Mini outperforms GPT-3.5 Turbo in various benchmarks, including textual intelligence and multimodal reasoning. It supports text and vision inputs and outputs, with plans to include audio and video in the future, and it can handle up to 128,000 tokens, which is useful for processing large documents or maintaining long conversations. And, just as a heads-up, one token is about three-fourths of a word. Now, Amith, we have revisited this idea over and over, the idea of smaller, smarter models being released. We've seen it across companies at this point in the AI space. Why do you think that is?

Amith:

Well, I mean, fundamentally: accessibility, performance, you know, cost. It's making it possible to build applications at scale that perform quickly, inexpensively and are usable in so many other domains. So GPT-4o Mini, I think, is one of these things where people are going to look back at it and say, yeah, that really opened up the doors to a lot of new applications. GPT-3.5 Turbo was a workhorse for a long time. It's really old at this point; it's like 18 months old, which I think in the world of AI means it's really ancient, but it's actually still a pretty good model. But compared to GPT-4o mini, it's not even comparable. So 4o mini is almost as good as 4o in a lot of testing and for lots of use cases. So you now have a super inexpensive and quite quick model to weave into business applications.

Amith:

You know, I think OpenAI would probably pursue this path generally because they realize the importance of it. But also the competitive pressure is substantial. You have different sizes of models from other companies. You know, Claude is probably their main commercial competitor, with the Sonnet and Haiku flavors of the Claude model being the medium and small models, respectively. That's just their branding, but the idea is that those smaller, faster models are quite performant and they're capable of doing a lot of really cool things. You see the same thing with Mistral producing small models that are really performant and really inexpensive. So it's going to be a major area where people play. The capabilities of current state models are so much greater than the way people are using them that even these small models are way more capable than what most people realize. So it's exciting because it basically means the stuff is close to free.

Mallory:

I think some people probably hear us talk about this idea of things getting cheaper and maybe they say, well, you know, ChatGPT is still 20 bucks a month for me to use, maybe not thinking about the full context of what this means. And someone in our family of companies, Dre McFarlane, posted this in our AI channel, which is where we post a lot of updates and new tools and things like that with one another, and this was a quote from OpenAI: the cost per token of GPT-4o mini has dropped by 99% since text-davinci-003, a less capable model introduced in 2022. And I think the point that Dre made, as one of the creators of Betty Bot, was that they originally trained Betty Bot on that text-davinci-003 model, and we're seeing a 99% reduction in cost. And what did you say, 18 months, or maybe it was a little longer than that? So can you talk a little bit about how you've seen this reduction in cost affect the products that we've spun up in the Blue Cypress family?

Amith:

Sure Well, betty's a great example. When we were first starting work on Betty in 2022, right around the time ChatGPT had its moment in the public consciousness we were looking at the cost of inference the AI model cost as a pretty major component of what we'd have to figure out what the economics were, and the original model for Betty was to actually pass that along to the customer. We had a license fee for Betty, which would be for the Betty software and then customers would directly pay for their token usage, and we thought that would be a fairly significant impediment to adoption, because there was a variable there that people really couldn't predict that well, and in the first six months of Betty's adoption in early 23, it turned out to be a factor, and then, in kind of the I think it was the March April timeframe of 2023, if memory serves me GPT-4 came out and then GPT-3-5 became much less expensive, and so between the two we were able to significantly lower the cost, but it was still a meaningful component of a Betty investment and deployment, and since then what you described is accurate. So now there really isn't much incremental cost for inference on top of the base license. So that's really exciting in terms of adoption of tools like Betty on top of the base license. So that's really exciting in terms of adoption of tools like Betty.

Amith:

You know Skip, which is our AI business analyst and AI reporting analyst.

Amith:

That product is the same story and, in fact, Skip is a very, very heavy user of advanced AI models in order to do the work that it does for code generation and for, you know, true MBA-level data analytics.

Amith:

So having lower-cost models that are quite capable, in a blend of models where you can, let's just say, for example, use something like GPT-4o for some things but then use 4o mini for other things, really makes those types of products more accessible, and that's applicable to all sorts of problems in the association domain. So, for example, if you say, hey, I have a million documents that I want to process in some way through a language model, for example, I want to automatically tag every article we've ever published in our journal, and our journal goes back to 1920 or something like that and we have a million articles, doing that with GPT-4o when it first came out probably was actually already pretty reasonable, but GPT-4o mini makes that something where you could say, yeah, I can do that for a few thousand dollars versus hundreds of thousands of dollars. So the applications become more palatable when you have access to really high quality models at a really low cost. Plus, of course, the speed is great too.
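
To make that tagging example concrete, here is a hedged sketch of what such a batch job might look like. The tag taxonomy, folder layout, and truncation limit are illustrative assumptions; an association would substitute its own taxonomy and add error handling and rate limiting for a million-article run:

```python
# Illustrative sketch of bulk article tagging with a small, cheap model.
# Assumes the OpenAI Python SDK and a folder of plain-text articles.
import pathlib
from openai import OpenAI

client = OpenAI()
TAGS = ["clinical practice", "policy", "workforce", "technology"]  # placeholder taxonomy

def tag_article(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Tag the article with up to 3 of these tags: "
                        + ", ".join(TAGS)
                        + ". Reply with a comma-separated list only."},
            # Truncate very long articles; 128k tokens is the ceiling,
            # but shorter inputs keep the per-article cost tiny.
            {"role": "user", "content": text[:8000]},
        ],
    )
    return response.choices[0].message.content

for path in pathlib.Path("journal_archive").glob("*.txt"):
    print(path.name, "->", tag_article(path.read_text()))
```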

Mallory:

Okay, kind of a counterpoint here, then. Back in 2022, if an association wanted to do AI-generated tagging, do you think it would have been best for them to wait at that point?

Amith:

Not necessarily. I would say it would be more of this thing where, since it was still a somewhat scarce resource, they might apply it to just new content that they're publishing. They might apply it to particular types of content that are really the most important content elements. So you'd be kind of looking at it as a gated resource or a constrained resource. You'd say, hey, I only have a budget of X dollars per month that I want to spend on this, so therefore you limit it to what you can do with that budget. And now the budget just goes way, way further. 99% is kind of that inverse of going to infinity on capability, right? So it's a really exciting thing, and I know we're going to talk about Llama 3.1 in a minute, so I won't go there, but just the idea of small models being super fast and super inexpensive makes it possible to do more and to be less concerned with cost. We don't think about internet bandwidth and delivery of education or delivery of video anymore. It's just kind of assumed that there's high-speed bandwidth available in most places and the incremental cost is pretty close to zero.

Amith:

I make a point about this when I do executive briefings for association leadership teams; I do those fairly regularly for folks who ask for them. I'll deliver an hour of education and I'll talk a lot about kind of this curve of what's happened, these doublings, and part of what I talk about is how that's also reduced the cost of what were previously scarce, expensive, out-of-reach resources. And the point that I usually make, because it's usually over Zoom video or Teams video, is that we're having a high-bandwidth, really high-quality video conference with 10, 20, 40, 100 people, whatever it is, and we didn't think about the cost. No one thought, oh, should we do this AI executive briefing with Amith, because it's going to cost us $1,000 in video conferencing bills. Not that long ago, you would have been thinking very, very carefully about that before you used video conferencing; it was a scarce and expensive resource. Now we use it for everything because it's effectively free. The same thing is happening here with AI.

Mallory:

I know we talk a lot on the pod about how one day, maybe in the near future, we'll just use the AI. We won't necessarily know the model that's behind it, we won't know how big it is, it'll just be there for whatever purpose we need it. But at this point, if you go into ChatGPT, you can actually toggle between GPT-4, which they're calling their legacy model, GPT-4o or Omni, and then GPT-4o mini. I'm wondering, Amith, are you ever opting for Mini at this point, I know it was just released, or do you find yourself just sticking with GPT-4o?

Amith:

I've been using 4o. I played with 4o mini just for kicks, just to see what it was like in the playground, but I haven't really done anything with it. For the most part, it's 4o for me. We have, I forget what it's called, the next level up from the individual subscriptions; I think it's the Teams plan at Blue Cypress that we pay for.

Amith:

So it's a little bit more expensive and we get really good performance out of that. So when I'm using it, I just haven't bothered. Honestly, I'm almost at the point you described, which is the future where people don't really care that much about the model. When I have my consumer hat on and I'm working on business problems, or I'm talking to the models about marketing, or I'm just brainstorming ideas, a lot of times it's over voice too, and I'm just walking around talking to my phone, I don't really care which model it is. These models are all really good. So if I'm doing something really deep, like at a technical level, I might want to make sure I'm using the best and latest model, but oftentimes I just really don't care that much because all the models are so good at this point.

Mallory:

Well, that is a good segue into our third topic for today, which is Llama 3.1. And I think, Amith, you just sent me this topic yesterday, right?

Amith:

Came out yesterday morning.

Mallory:

Exactly. So you can be assured, listening to the Sidecar Sync podcast, you're getting the latest AI news. Meta recently announced the release of Llama 3.1, the latest and most advanced version of its open source large language model family. Llama 3.1 is available in three sizes: 8 billion (8B), 70 billion (70B), and 405 billion (405B) parameters. The 405B model is the largest open source AI model to date, designed to rival top proprietary models like OpenAI's GPT-4 and Anthropic's Claude 3.5 Sonnet.

Mallory:

Llama 3.1 introduces capabilities for synthetic data generation and model distillation, which I want to talk about in just a bit, enabling developers to use its outputs to improve other models, which is a significant step for open source AI. The Meta website says, quote, "Until today, open source language models have mostly trailed behind their closed counterparts when it comes to capabilities and performance. Now we're ushering in a new era with open source leading the way." Listeners, you can try out Llama 3.1 right now by going to meta.ai, and you actually don't even have to log in. I didn't test it out much, but I dabbled with it just a bit before this call. Amith, we've seen lots of new model releases lately. Does anything stand out for you with the Llama release?

Amith:

So the Llama 3.1 release is on the heels of the 3.0 release that came out in, I believe, the April-May timeframe, and 3.0 was a really good release. They had mentioned that they had a bigger model they were still training, and that they were going to update the three-series models with 3.1. So with 3.1, even the two smaller versions, the really small one at 8 billion and the medium-sized one at 70 billion, have both been updated, and they both are better than the 3.0 versions, which is really cool. And then there's the 405 billion parameter model. First of all, this model is considerably smaller than GPT-4o. It's smaller than what we believe Claude 3 Opus is. So it's a little bit smaller size-wise, but its benchmarking shows that it's essentially at parity with GPT-4o and with Claude 3.5 Sonnet in pretty much all categories; in fact, it's a little bit better in some. So it's kind of like saying, hey, do you have a Ferrari or a Lamborghini? One goes 210 miles an hour, the other one goes 205. You know, they're both way faster than you're probably ever going to drive. So you're good, and I think we're getting to that point for the current sets of use cases that we have.

Amith:

So what is notable about the bigger model is that it is on the order of these proprietary closed models, and it's totally open source and you can deploy it anywhere. You can deploy it on Azure and AWS. You can deploy it in an environment you have complete control over. That means that if you have a highly sensitive application where you need to maintain 100% control over your data and never send it to any third-party vendor, but you also need frontier-level AI capability from the smartest and best models, you now have an option. You have the ability to deploy Llama 3.1 405B in a private cloud type environment, or even on physical hardware if you wanted to. You'd need some pretty beefy hardware to run a model that size, but you can do that, and that becomes very affordable for enterprises that are interested in that kind of private deployment. So if you're in the healthcare sector, or if you're in a particular field where, for whatever reason, your data sensitivity is so high that you really need to focus on control, this is a whole new capability that didn't exist until now, because the earlier open source models, including Llama's earlier and current small models, are not at the level of 405B. So 405B is a big deal, because it gives you essentially the capabilities of the biggest, most powerful models, but in a totally free, open source format that can be deployed anywhere. So to me that's a significant shift.
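
As one hedged illustration of that kind of private deployment: self-hosting stacks such as vLLM can serve an open model behind an OpenAI-compatible endpoint, so your application code and data never leave your own environment. The host, port, and model identifier below are assumptions about how you might set it up:

```python
# Sketch: querying a privately hosted Llama 3.1 model through an
# OpenAI-compatible server (for example, one started with vLLM).
# No data leaves your infrastructure. The URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your private inference server
    api_key="not-needed-locally",         # many local servers ignore the key
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-405B-Instruct",
    messages=[{"role": "user", "content": "Summarize this sensitive document..."}],
)
print(response.choices[0].message.content)
```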

Amith:

Companies like Mistral and many others are going to fast-follow with all sorts of derivative products that are new models based on the Llama architecture or, in some cases, are parallel universes to the Llama architecture, like Mistral kind of is, and you're going to see a lot of innovation.

Amith:

And lots of innovation leads to lots of growth, lower costs, better capabilities, and you don't get that quite as directly from the closed models coming out with new capabilities. The last thing I'll quickly mention is that I'm very excited about this release because Groq, G-R-O-Q, that company we've talked about here that has the language processing units, the LPUs, with 10 times the speed for runtime or inference for AI models compared to the GPU-based approaches. They have just crazy fast inference, and they are a launch partner with Meta on the Llama 3.1 family of models, or what they call the herd of models actually, which I think is kind of cool. So the Llama 3.1 herd all runs on Groq. If you're building an application, you can inference your app on Groq. What that means is you're going to have this state-of-the-art AI capability that's way faster than either Claude or ChatGPT, so that's really exciting too. For applications as sophisticated as something like Skip that we were talking about earlier, that's a big deal. So you can be assured that our teams are all over this stuff.

Mallory:

We do have a previous episode on Groq as well, and I will reiterate what Amith just said: it is fast, and that was just a few months ago; I don't know if it's even faster now. In my mind, when I see Claude or ChatGPT work, I'm like, oh, it's quick enough, I don't really need that to be faster. But then you see the examples of a Groq chip and you realize your responses are, you know, near instant. So I recommend checking out that episode if that's of interest to you. Amith, I'm not sure how to phrase this next question, but I feel like some of our listeners might have it too, so I'm going to do my best. Because these models are open source, does that mean this is kind of the new foundation for AI models, in the sense that if someone else out there wanted to create a brand new AI model, could they use the Llama 3.1 family as kind of their starting foundation and build on top of that? Is that how that works?

Amith:

Sure, yeah, there's a lot of ways you can do that. The interesting thing about these AI models is that if you look at the actual software code, it's very limited. It's like 1,000, 2,000 lines of code, it's not particularly interesting, and they all work in similar ways. What matters is the open weights, which are essentially these parameters we talk about: the output of the pre-training process you do when you're taking massive amounts of data and spending months of time and millions or hundreds of millions of dollars these days on GPU farms and clusters that are generating the models. That process results in these things called weights, and those weights are open. If you have the model itself, which is that small amount of source code, and you have the weights, you can not only run it, but you can do what you're referring to, which is create various versions of it. You can further train it through fine-tuning, you can actually quantize the model to shrink it. You can do all sorts of things in order to build additional capabilities on top of it. Some of the fine-tuning might be to create flavors of the model that are particularly good at certain things. So, for example, in the prior iterations of Llama, there was something called Code Llama that Meta also released, which was particularly good at code generation. I'm sure there'll be a Code Llama for 3.1 as well, but people created versions of Llama that did all sorts of things.
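
For readers curious what "having the weights" looks like in practice, here's a minimal sketch using Hugging Face's transformers library with the smallest Llama 3.1 model. It assumes you've installed transformers and accelerate and accepted Meta's license for the gated repository on huggingface.co; fine-tuning and quantization would build on this same starting point:

```python
# Sketch: loading open Llama 3.1 weights locally with Hugging Face transformers.
# Assumes `pip install transformers accelerate` and license acceptance for
# the gated meta-llama repo on huggingface.co.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Generate a short continuation to confirm the weights are running locally.
inputs = tokenizer("The key benefit of open weights is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```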

Amith:

So you're going to see the same thing happen. It's going to be an explosion of innovation. It's one of the reasons Linux became what it is. If you think about Linux as kind of the standard infrastructure for the web and for the internet, that didn't happen overnight. It used to be that, back in the day, the closed source Unix systems were far better than Linux when Linux first came out. But because of the community behind Linux, it's become so much more robust, more secure, more reliable and more capable. That's why it's become the standard, because it's become the better operating system for enterprise-scale everything, basically. And that's really the strategy behind Llama, and Zuckerberg talked about that in his release comments yesterday.

Amith:

So you know, this is not something that Meta is doing out of the goodness of their hearts. This is because it's good for them. If Llama is successful at becoming kind of the gold-standard open source AI model, then there are going to be millions and millions of developers working on it, tons of companies investing in it, and that helps them, because all of the products they make money off of are based on Llama as well, or will be going forward.

Mallory:

So pretty much everyone on earth just got access to a frontier AI model to do with what they please, right?

Amith:

Yep, yeah, and the small model, the 8 billion, you can run almost anywhere. Out of the three, look at the performance stats, the benchmarks, of the Llama 3.1 8 billion. Actually, I think on Groq they call it Llama 3.1 Instant, which it basically is. It's so small and fast and efficient that you can probably run it on a phone. I'm guessing that they're going to package it into future versions of all sorts of devices. So to me, that gets really, really exciting when you can have on-device AI. We talked about that a little bit with Apple in the past. We've talked about that with Microsoft's Phi models, which are also all really small. All of this stuff is going in the same direction, right? It's basically to make it become invisible, where the technology is just assumed. It's part of every application on every device, everywhere, and we're going to be there in the next year or two in terms of that basic capability, that basic assumption. And then, of course, the question is, well, where do you keep pushing in terms of new capabilities? What can we not currently do?

Amith:

One thing I didn't mention earlier when we were talking about 4o mini, and it's relevant to that topic and also to the smaller Llama 3.1 models, is that these smaller models are so cheap and so fast that you can actually use them in a way that you wouldn't have been able to use models until recently, which is in a multi-agentic style. What that means, basically, is you can go out to the model and ask it to do five things at the same time in parallel, bring back the results, compare them, analyze them, compress them and then reprocess them. So there's this idea of zero-shot, which is just going to the model with the prompt and hoping you get something good. And then there's multi-shot: five-shot, and so on. A lot of the different benchmarks actually have criteria for whether it's zero-shot, two-shot, five-shot, et cetera. And what multi-agent solutions do is take the approach of n shots, where they're basically going back to the model over and over. So, for example, something like Skip, that's what its internal architecture is doing. Sometimes Skip will go to multiple models in parallel and get those models to do some piece of the work, and then compare the results and pick the best version and then iterate from there.
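
A minimal sketch of the parallel, n-shot pattern Amith describes is below. The prompt, the sample count of five, and the one-call judging step are illustrative assumptions; real agent architectures like the one he mentions do considerably more:

```python
# Sketch: fan the same prompt out to a small, fast model several times in
# parallel, then use one more call to pick the best draft.
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()

def draft(prompt: str) -> str:
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # encourage varied drafts
    )
    return r.choices[0].message.content

prompt = "Write a one-sentence summary of why small models matter."
with ThreadPoolExecutor(max_workers=5) as pool:
    drafts = list(pool.map(draft, [prompt] * 5))

# Naive judging step: ask the model to pick the clearest draft.
judge = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": "Pick the clearest option and repeat it verbatim:\n\n"
                          + "\n\n".join(f"{i+1}. {d}" for i, d in enumerate(drafts))}],
)
print(judge.choices[0].message.content)
```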

Amith:

And that would have been both way too expensive and way too slow until recently, right? When you have unlimited bandwidth, you start doing real-time video and virtual reality, whereas a few years ago you wouldn't have done that; you would have been real happy with just high-quality phone calls, like when Skype first came out, you know, 15 years ago or whatever. So it's the same thing with this. It gives you new applications and capabilities. So even until the frontier models are better at reasoning, for example, you can get really good reasoning out of agents, because the agents do what I described of chaining together prompts and doing multiple shots and stuff like that. My point is that these innovations make those kinds of applications more possible and more affordable.

Mallory:

I mentioned we would touch on this, so I want to make sure that we do. Llama 3.1 introduces the capability for synthetic data generation and, honestly, you've brought this up a few times on the pod. It might be worth having a topic around it one day, but can you explain what this means?

Amith:

Sure. Well, synthetic data generation actually is kind of what it sounds like: it's basically using a language model to generate content for you. So let's say that I wanted to create my own model and I needed a lot of really high quality training data to build it, let's say, specific to my association. I wanted to build a model that was used for some purpose; it doesn't even matter what it is, but the idea is I need a lot of data for that. And maybe I have some example data, maybe 10,000 or 20,000 pieces that are good starting points, but maybe I need 5 million pieces of content. So I can use the Llama 3.1 family of models, both by license and by capability, and I'll explain what that means in a second, to generate this new content: by prompting the language model, essentially saying, hey, here are several examples, I need you to produce more examples, and then writing a program that keeps prompting the model over and over and over again, asking it for more results. Then I save those results and I have my synthetic data. So, by license and by capability, here's what I mean.

Amith:

Up until recently, most of these large models said you're not allowed to use them for synthetic data generation, because what they're essentially trying to do is protect their moat. OpenAI specifically, their terms of use do not allow you to use GPT-4 to generate training data, because if they did, then you could use a much lesser model and fine-tune it, or even pre-train a new model, with something coming out of GPT-4. So the license restriction is kind of a moat-protection mechanism. And then by capability, I mean that even if your license allows you to do it, if the output isn't great, then there's no value to it.

Amith:

But these models are so sophisticated that they're quite extraordinary, actually, at developing synthetic data. There are still a lot of questions to be answered about synthetic data in terms of the efficacy of training models based on it. But all of the initial indicators from the research that's been published, and models like Microsoft's Phi-3 and a lot of the work that Mistral is doing, use synthetic data, really high-quality synthetic data. It's going to be an explosion, right, like this Cambrian explosion of models we're already seeing. You're going to see more of it because of this particular decision.
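
As a hedged sketch, the repeated-prompting loop Amith describes might look something like this. The seed examples, batch size, output format, and the self-hosted endpoint are all illustrative assumptions; in practice you would validate each batch and confirm the model's license permits this use:

```python
# Sketch: generating synthetic training examples by repeatedly prompting a
# permissively licensed model with a few seed examples.
import json
from openai import OpenAI

# Hypothetical self-hosted Llama 3.1 endpoint (OpenAI-compatible server).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")

seeds = [
    {"question": "What does HVAC stand for?",
     "answer": "Heating, ventilation, and air conditioning."},
]

with open("synthetic.jsonl", "a") as out:
    for _ in range(1000):  # keep prompting until you have enough data
        r = client.chat.completions.create(
            model="meta-llama/Meta-Llama-3.1-70B-Instruct",
            messages=[{"role": "user",
                       "content": "Here are example Q&A pairs as JSON:\n"
                                  + json.dumps(seeds)
                                  + "\nProduce 5 new, different pairs as a JSON list."}],
        )
        # A real pipeline would parse and validate each batch before saving.
        out.write(r.choices[0].message.content + "\n")
```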

Mallory:

Do you think, or do you know, if OpenAI and Anthropic are using synthetic data generation from their own models to train their models?

Amith:

I don't know that either of them have been particularly open about what their training data sets are. I'd be shocked if they're not, because both of them have had frontier capabilities for a long time that have been synthetic-data capable. So I would be fairly confident in saying that both of those companies have been heavily utilizing synthetic data, but I don't really know the answer to that definitively. It would be hard to imagine that they wouldn't, because it'd be an advantage that they'd be giving up for no reason, I think.

Mallory:

And then my last question for you, Amith. We had a whole episode, I think it was a really early one, dedicated to kind of this open source versus closed source debate. Anthropic's Claude and OpenAI's GPT models are closed source, Google Gemini is closed source, but Google Gemma is actually open source, and then, like we just talked about, Llama is open source. Can you give, at a high level, your take on what you think is most important with either, and what you should keep an eye on if you're a business leader listening to this podcast?

Amith:

Well, I mean, there's a couple of sides to it, so I'll talk about it from the perspective of a business making a decision on which model to choose, and then I'll also talk about kind of the societal implications, the AI safety implications, and I'd love to hear your thoughts on this topic as well, Mallory. First of all, on the business side, making a decision on which model to use, always start with the basic rubric of asking: what are the capabilities and what are the costs? With any software vendor of any kind, you need to look at it and say, what will this product do for me and how much will it cost me? And "how much" might not just be financial; it might also be deployment costs, complexity of integration, things like that. On the capability side, what you might find is that with open source you actually get capabilities that you don't get with closed source, because you can use it with data that you consider too sensitive to provide to Claude or to OpenAI. So that's one thing that essentially creates a new capability. It's not that the fundamental technology in Llama can do something that OpenAI's cannot, but you are willing to use it for this more sensitive application because you have it in a controlled environment. So that's one thing, and there probably are situations where Llama is better or Anthropic is better, but they're all getting so good that I think for most use cases it's going to be fairly commoditized.

Amith:

In terms of the fundamental capabilities, you might want to think about who you're partnered with for deployment. Llama is going to be deployed across all the major cloud providers. It's going to be available on Groq. It's going to be available from Meta directly. It's going to be available in a lot of places. That is an advantage, because you have portability. You can also do a lot of other things with fine-tuning. But the flip side is there's a little bit more complexity in managing an open source LLM like this if you're using it in an environment you're in control of. If you're deploying it through AWS or Azure or Groq, it's just as simple; it's just an API, just like using OpenAI. But there's pros and cons to that. So I think as a business decision maker, you're looking at it with a capabilities versus potential risks versus cost type of rubric, as you would with any software decision.

Amith:

Societally, I think there's an interesting conversation to be had about what's safer: closed source frontier models or open source. The people at OpenAI would probably continue to strenuously argue in front of policymakers in Congress and elsewhere that open source is dangerous, that open source could lead to state actors and others doing really bad things with the AI. And there's a valid point to be argued there: if you release the most cutting-edge AI for anyone to use any way they want, what does that mean for the world, right? What does it mean for export controls? What does it mean for defense? What does it mean for a variety of things?

Amith:

The flip side of it is there's a massive amount of ego that goes into that statement, and also essentially a regulatory-capture mindset. OpenAI would want people to say, oh no, open source is really bad, you've got to have closed source, because that protects their business. The open source folks would say, well, actually, the safest model is the open model, because it's something you can look at: you can inspect it, you know how it was trained, you know where it's deployed, you have control over it. And, by the way, there are a lot of people who are going to be doing a lot of bad things no matter what you do, and aggressively deploying good AI is really the only possibility you have to protect against bad AI, whether it's bad AI that's based on open source or bad AI that's based on something else.

Amith:

So I tend to lean in that direction. I think the debate is a very interesting one and I think there's good points in both directions or on both sides of it, but I think it's one of these things where, ultimately, it's a hard question to answer with a definitive yes or no, good or bad, open source or closed source, because it kind of depends. So that's where my head is at. What are your thoughts based on everything you've been exposed to over the last year, year and a half with this stuff?

Mallory:

It is a great question. I've learned a ton about this from you, Amith, honestly, and kind of your takes. I don't think I've made a ton of technology decisions in my career, so I feel like I can't really approach it from that angle. But I think as a human, as a person, I tend to be of the "all ships rise" kind of mindset. So I would lean open source, in the sense that you can collaborate more, you can promote more innovation, more creativity. But I also understand the concern that, well, now everybody has access to 3.1, and what are they going to do with that technology?

Mallory:

I think, in the end, I believe the more eyes we have on something, the more we'll be able to prepare for kind of the bad uses of it, like you said. So I would say I'm leaning open source. However, I mean, I'm using Anthropic's Claude every single day, so it's not to say that I'm only going to use open source AI models, but in theory I think that's the path I align with more. But I'm sure I have much more to learn, kind of, on the flip side. Now, if I were Sam Altman, I would be like, proprietary all the way, I want to make as much money as I can off of this thing that we created. So I definitely understand the other side too.

Amith:

Completely. And I do think there's validity to saying, okay, well, Meta put 405B out into the world. Kim Jong-un in North Korea can download it just like anyone else and use it for whatever, and it's a very powerful piece of technology. There are no export controls over that. Anyone can take it and run it and do stuff with it. So is that good or bad? What does that ultimately mean? I think that there are a lot of reasons to have a degree of thoughtfulness, at least, around open source.

Amith:

Regulatory control ultimately is extremely unlikely to have any impact. There's been conversation around this for a long time already. Very little has actually been done. To the extent that you see in the EU there's a little bit more regulatory control, or attempted regulatory control, and really what you're seeing there is there's just a stifling of innovation. A lot of companies are saying, yeah, we're not going to operate there, we're not going to offer our products there, and it's happening everywhere else. It's not really stopping anything because you have to have every country in the world agree to that at the same time, basically to have that occur. So I don't know what the answer is.

Amith:

I think that we all have to be thoughtful about this. We have to be willing to hear, you know, opposing viewpoints, now more than ever, truthfully, about everything, of course, not just about technology and something as important as AI. But the more you know about this stuff, the more you should know that you really don't know a whole lot, and that's how I feel every single day about this stuff. It's a bit overwhelming, it's a bit exciting. It's actually very exciting, but it also is humbling, and it needs to teach us all that, as we look to apply these technologies in the best possible ways for our organizations, we also need to keep our citizen hat on, or at least available to us, and say, what's the best use case and the best approach for deploying this thing throughout our world?

Mallory:

That is a great point to end this episode on. Everyone thanks for tuning in to today's episode. If you liked it, please drop us a review on your favorite podcasting platform or, if you're joining us on YouTube, give us a like, give us a subscribe. We so appreciate it and we will see you next week.

Amith:

Thanks for tuning in to Sidecar Sync this week. Looking to dive deeper? Download your free copy of our new book, Ascend: Unlocking the Power of AI for Associations, at ascendbook.org. It's packed with insights to power your association's journey with AI. And remember, Sidecar is here with more resources, from webinars to boot camps, to help you stay ahead in the association world. We'll catch you in the next episode. Until then, keep learning, keep growing and keep disrupting.