Google announces Gemma 4 open AI models, switches to Apache 2.0 license



Google’s Gemini AI models have improved by leaps and bounds over the past year, but you can only use Gemini on Google’s terms. The company’s Gemma open-weight models have provided more freedom, but Gemma 3, which launched over a year ago, is getting a bit long in the tooth. Starting today, developers can begin working with Gemma 4, which comes in four sizes optimized for local usage. Google has also acknowledged developer frustrations with AI licensing, so it’s dumping the custom Gemma license in favor of Apache 2.0.

As with past versions of its open-weight models, Google has designed Gemma 4 to be usable on local machines. That can mean plenty of things, of course. The two large Gemma variants, 26B Mixture of Experts and 31B Dense, are designed to run unquantized in bfloat16 format on a single 80GB Nvidia H100 GPU. Granted, that’s a $20,000 AI accelerator, but it’s still local hardware. If quantized to run at lower precision, these big models will fit on consumer GPUs.
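To see why those numbers work out, here is a back-of-the-envelope sketch of the weight memory each large model needs. It assumes the nominal parameter counts in the model names (26 billion and 31 billion) and counts weights only, so the KV cache, activations, and runtime overhead are left out. The int8 and int4 rows are illustrative quantization levels, not formats the announcement names.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Weights only; KV cache, activations, and runtime overhead are ignored.

# Bytes per parameter. bfloat16 comes from the article; int8 and int4
# are assumed quantization levels used here purely for illustration.
BYTES_PER_PARAM = {"bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes / 1e9 bytes-per-GB
    return params_billions * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for name, size in [("26B Mixture of Experts", 26.0), ("31B Dense", 31.0)]:
        for dtype in ("bfloat16", "int8", "int4"):
            gb = weight_memory_gb(size, dtype)
            print(f"{name:24s} {dtype:9s} ~{gb:5.1f} GB")
```

By this estimate, the 31B Dense weights take roughly 62GB in bfloat16, which leaves some headroom on an 80GB H100, while a 4-bit version drops to about 15.5GB, within reach of a high-end consumer card.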

Google also claims it has focused on reducing latency to really take advantage of Gemma’s local processing. The 26B Mixture of Experts model activates only 3.8 billion of its 26 billion parameters in inference mode, giving it much higher tokens-per-second than similarly sized models. Meanwhile, 31B Dense is more about quality than speed, but Google expects developers to fine-tune it for specific uses.
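The speed advantage of the Mixture of Experts design can be roughed out with simple arithmetic. The sketch below uses the parameter counts from the article plus one assumption that is not from Google: the common rule of thumb that generating a token costs about 2 floating-point operations per active parameter. It ignores attention cost and memory bandwidth, so treat the result as an order-of-magnitude comparison, not a benchmark.

```python
# Compare per-token compute for the MoE and dense Gemma 4 variants.
# ASSUMPTION: ~2 FLOPs per active parameter per generated token, a common
# rule of thumb that ignores attention and memory-bandwidth effects.

MOE_TOTAL_PARAMS = 26e9     # from the article
MOE_ACTIVE_PARAMS = 3.8e9   # from the article
DENSE_PARAMS = 31e9         # nominal size of 31B Dense


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for each token."""
    return active / total


def approx_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb compute cost of generating one token."""
    return 2.0 * active_params


if __name__ == "__main__":
    frac = active_fraction(MOE_ACTIVE_PARAMS, MOE_TOTAL_PARAMS)
    moe = approx_flops_per_token(MOE_ACTIVE_PARAMS)
    dense = approx_flops_per_token(DENSE_PARAMS)
    print(f"MoE active share:  {frac:.1%}")
    print(f"MoE FLOPs/token:   {moe:.2e}")
    print(f"Dense FLOPs/token: {dense:.2e}")
    print(f"Dense / MoE ratio: {dense / moe:.1f}x")
```

Note that the saving is in compute, not memory: all 26 billion parameters still have to be loaded, because which experts fire changes from token to token.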

The other two Gemma 4 models, Effective 2B (E2B) and Effective 4B (E4B), are aimed at mobile devices. These options were designed to maintain low memory usage during inference, running at an effective 2 billion or 4 billion parameters. Google says the Pixel team worked closely with Qualcomm and MediaTek to optimize these models for devices like smartphones, Raspberry Pi, and Jetson Nano. Not only do they use less memory and battery than Gemma 3, but Google also touts “near-zero latency” this time around.

More powerful, more open

All the new Gemma 4 models will reportedly leave Gemma 3 in the dust; Google claims these are the most capable models you can run on your local hardware. Google says the 31B Dense model will debut at number three on the Arena list of top open AI models, behind GLM-5 and Kimi 2.5. However, even the biggest Gemma 4 variant is a fraction of the size of those models, making it theoretically much cheaper to run.


