Connect with us

artificial intelligence

Google Unleashes Imagen 3, Heating Up the AI Image Generator Race

Published

on


Google is putting the icing on the cake for a busy week in the generative AI space with the launch of Imagen 3, its brand-new text-to-image model. This release builds upon the success of Imagen 2, introduced in December 2023, which already rivaled industry heavyweights like Dall-E 3 and MidJourney v5.

Imagen 3, originally announced in May, boasts enhanced capabilities in understanding and executing complex prompts, generating images with improved details, and better prompt adherence compared to its predecessor. It is pretty versatile, producing good results that range from photorealism to art and 3D compositions.

“Imagen 3 is our highest quality text-to-image model, capable of generating images with even better detail, richer lighting, and fewer distracting artifacts than our previous models,” Google said in its official announcement.

Imagen 3’s prompt improvements allow users to describe desired images in natural language without complex prompt engineering. The model’s training also incorporated richer image captions, enabling it to capture nuanced details like specific camera angles or compositions and long text prompts when needed.

The tech giant has placed particular emphasis on Imagen 3’s enhanced text rendering capabilities. Although noticeably improved, our initial tests show that its capabilities are not quite on par with other models like Dall-E 3, Auraflow, or Flux.

Generations by Imagen 3 and Grok 2 using the same prompt

Google has also stressed its commitment to safety and responsibility in the development and deployment of Imagen 3. The company implemented what it described as “extensive filtering and data labeling” processes to minimize harmful content in the model’s training datasets. Additionally, Google said it conducted thorough evaluations, including red team exercises, to identify and fix potential vulnerabilities.

It is also important to note that Imagen 3 integrates SynthID, Google’s watermarking tool. SynthID embeds a digital signature directly into the pixels of generated images. This watermark is imperceptible to the human eye but detectable by specialized software, providing a means to identify AI-generated content.

Currently, Imagen 3 is available through Google’s ImageFX platform and Vertex AI. Looking ahead, Google plans to introduce popular editing features from Imagen 2, such as inpainting (editing elements in the image) and outpainting (expanding it), to Imagen 3 in the coming months. The company has also announced intentions to expand Imagen 3’s availability across its broader product ecosystem, including integration into the Gemini app, Google Workspace, and Google Ads.

This release is part of a broader Google strategy that aims to put Gemini and AI technology in basically all of its services and hardware. This week, the company introduced its new Pixel 9 lineup, which was designed with AI capabilities at its core. The new Pixel phones can handle certain generative AI tasks locally, including text-based tasks and small image generations.

The release of Imagen 3 comes amid a flurry of activity in the AI image generation space. Elon Musk’s xAI recently unveiled Grok 2, featuring the Flux.1 image generator, which has gained attention for its ability to produce highly realistic, uncensored images alongside strong text generation capabilities.

Meanwhile, MidJourney, another key player in the field, announced an imminent v6.2 update to its model. The company also teased the development of MidJourney v7, slated for release in the coming months. Ideogram, another contender in the AI image generation arena, has also hinted at a forthcoming update to its model. Finally. the Open Model Initiative has chosen Flux.1 as the foundation for developing its state-of-the-art open-source image generation model.

Edited by Ryan Ozawa.

Generally Intelligent Newsletter

A weekly AI journey narrated by Gen, a generative AI model.



Source link

AI

Decentralized AI Project Morpheus Goes Live on Mainnet

Published

on



Morpheus went live on a public testnet, or simulated experimental environment, in July. The project promises personal AIs, also known as “smart agents,” that can empower individuals much like personal computers and search engines did in decades past. Among other tasks, agents can “execute smart contracts, connecting to users’ Web3 wallets, DApps, and smart contracts,” the team said.



Source link

Continue Reading

artificial intelligence

How the US Military Says Its Billion Dollar AI Gamble Will Pay Off

Published

on



War is more profitable than peace, and AI developers are eager to capitalize by offering the U.S. Department of Defense various generative AI tools for the battlefields of the future.

The latest evidence of this trend came last week when Claude AI developer Anthropic announced that it was partnering with military contractor Palantir and Amazon Web Services (AWS) to provide U.S. intelligence and the Pentagon access to Claude 3 and 3.5.

Anthropic said Claude will give U.S. defense and intelligence agencies powerful tools for rapid data processing and analysis, allowing the military to perform faster operations.

Experts say these partnerships allow the Department of Defense to quickly adopt advanced AI technologies without needing to develop them internally.

“As with many other technologies, the commercial marketplace always moves faster and integrates more rapidly than the government can,” retired U.S. Navy Rear Admiral Chris Becker told Decrypt in an interview. “If you look at how SpaceX went from an idea to implementing a launch and recovery of a booster at sea, the government might still be considering initial design reviews in that same period.”

Becker, a former Commander of the Naval Information Warfare Systems Command, noted that integrating advanced technology initially designed for government and military purposes into public use is nothing new.

“The internet began as a defense research initiative before becoming available to the public, where it’s now a basic expectation,” Becker said.

Anthropic is only the latest AI developer to offer its technology to the U.S. government.

Following the Biden Administration’s memorandum in October on advancing U.S. leadership in AI, ChatGPT developer OpenAI expressed support for U.S. and allied efforts to develop AI aligned with “democratic values.” More recently, Meta also announced it would make its open-source Llama AI available to the Department of Defense and other U.S. agencies to support national security.

During Axios’ Future of Defense event in July, retired Army General Mark Milley noted advances in artificial intelligence and robotics will likely make AI-powered robots a larger part of future military operations.

“Ten to fifteen years from now, my guess is a third, maybe 25% to a third of the U.S. military will be robotic,” Milley said.

In anticipation of AI’s pivotal role in future conflicts, the DoD’s 2025 budget requests $143.2 billion for Research, Development, Test, and Evaluation, including $1.8 billion specifically allocated to AI and machine learning projects.

Protecting the U.S. and its allies is a priority. Still, Dr. Benjamin Harvey, CEO of AI Squared, noted that government partnerships also provide AI companies with stable revenue, early problem-solving, and a role in shaping future regulations.

“AI developers want to leverage federal government use cases as learning opportunities to understand real-world challenges unique to this sector,” Harvey told Decrypt. “This experience gives them an edge in anticipating issues that might emerge in the private sector over the next five to 10 years.

He continued: “It also positions them to proactively shape governance, compliance policies, and procedures, helping them stay ahead of the curve in policy development and regulatory alignment.”

Harvey, who previously served as chief of operations data science for the U.S. National Security Agency, also said another reason developers look to make deals with government entities is to establish themselves as essential to the government’s growing AI needs.

With billions of dollars earmarked for AI and machine learning, the Pentagon is investing heavily in advancing America’s military capabilities, aiming to use the rapid development of AI technologies to its advantage.

While the public may envision AI’s role in the military as involving autonomous, weaponized robots advancing across futuristic battlefields, experts say that the reality is far less dramatic and more focused on data.

“In the military context, we’re mostly seeing highly advanced autonomy and elements of classical machine learning, where machines aid in decision-making, but this does not typically involve decisions to release weapons,” Kratos Defense President of Unmanned Systems Division, Steve Finley, told Decrypt. “AI substantially accelerates data collection and analysis to form decisions and conclusions.”

Founded in 1994, San Diego-based Kratos Defense has partnered extensively with the U.S. military, particularly the Air Force and Marines, to develop advanced unmanned systems like the Valkyrie fighter jet. According to Finley, keeping humans in the decision-making loop is critical to preventing the feared “Terminator” scenario from taking place.

“If a weapon is involved or a maneuver risks human life, a human decision-maker is always in the loop,” Finley said. “There’s always a safeguard—a ‘stop’ or ‘hold’—for any weapon release or critical maneuver.”

Despite how far generative AI has come since the launch of ChatGPT, experts, including author and scientist Gary Marcus, say current limitations of AI models put the real effectiveness of the technology in doubt.

“Businesses have found that large language models are not particularly reliable,” Marcus told Decrypt. “They hallucinate, make boneheaded mistakes, and that limits their real applicability. You would not want something that hallucinates to be plotting your military strategy.”

Known for critiquing overhyped AI claims, Marcus is a cognitive scientist, AI researcher, and author of six books on artificial intelligence. In regards to the dreaded “Terminator” scenario, and echoing Kratos Defense’s executive, Marcus also emphasized that fully autonomous robots powered by AI would be a mistake.

“It would be stupid to hook them up for warfare without humans in the loop, especially considering their current clear lack of reliability,” Marcus said. “It concerns me that many people have been seduced by these kinds of AI systems and not come to grips with the reality of their reliability.”

As Marcus explained, many in the AI field hold the belief that simply feeding AI systems more data and computational power would continually enhance their capabilities—a notion he described as a “fantasy.”

“In the last weeks, there have been rumors from multiple companies that the so-called scaling laws have run out, and there’s a period of diminishing returns,” Marcus added. “So I don’t think the military should realistically expect that all these problems are going to be solved. These systems probably aren’t going to be reliable, and you don’t want to be using unreliable systems in war.”

Edited by Josh Quittner and Sebastian Sinclair

Generally Intelligent Newsletter

A weekly AI journey narrated by Gen, a generative AI model.



Source link

Continue Reading

artificial intelligence

AI Startup Hugging Face is Building Small LMs for ‘Next Stage Robotics’

Published

on



AI startup Hugging Face envisions that small—not large—language models will be used for applications including “next stage robotics,” its Co-Founder and Chief Science Officer Thomas Wolf said.

“We want to deploy models in robots that are smarter, so we can start having robots that are not only on assembly lines, but also in the wild,” Wolf said while speaking at Web Summit in Lisbon today.  But that goal, he said, requires low latency. “You cannot wait two seconds so that your robots understand what’s happening, and the only way we can do that is through a small language model,” Wolf added.

Small language models “can do a lot of the tasks we thought only large models could do,” Wolf said, adding that they can also be deployed on-device. “If you think about this kind of game changer, you can have them running on your laptop,” he said. “You can have them running even on your smartphone in the future.”

Ultimately, he envisions small language models running “in almost every tool or appliance that we have, just like today, our fridge is connected to the internet.”

The firm released its SmolLM language model earlier this year. “We are not the only one,” said Wolf, adding that, “Almost every open source company has been releasing smaller and smaller models this year.”

He explained that, “For a lot of very interesting tasks that we need that we could automate with AI, we don’t need to have a model that can solve the Riemann conjecture or general relativity.” Instead, simple tasks such as data wrangling, image processing and speech can be performed using small language models, with corresponding benefits in speed.

The performance of Hugging Face’s LLaMA 1b model to 1 billion parameters this year is “equivalent, if not better than, the performance of a 10 billion parameters model of last year,” he said. “So you have a 10 times smaller model that can reach roughly similar performance.”

“A lot of the knowledge we discovered for our large language model can actually be translated to smaller models,” Wolf said. He explained that the firm trains them on “very specific data sets” that are “slightly simpler, with some form of adaptation that’s tailored for this model.”

Those adaptations include “very tiny, tiny neural nets that you put inside the small model,” he said. “And you have an even smaller model that you add into it and that specializes,” a process he likened to “putting a hat for a specific task that you’re gonna do. I put my cooking hat on, and I’m a cook.”

In the future, Wolf said, the AI space will split across two main trends.

“On the one hand, we’ll have this huge frontier model that will keep getting bigger, because the ultimate goal is to do things that human cannot do, like new scientific discoveries,” using LLMs, he said. The long tail of AI applications will see the technology “embedded a bit everywhere, like we have today with the internet.”

Edited by Stacy Elliott.

Generally Intelligent Newsletter

A weekly AI journey narrated by Gen, a generative AI model.



Source link

Continue Reading
Advertisement [ethereumads]

Trending

    wpChatIcon