Google’s new AI tool Whisk uses images as prompts

Google has yet another AI tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures your starter image’s “essence” rather than recreating it with new details. So, it’s better for brainstorming and rapid-fire visualizations than edits of the source image.

The company describes Whisk as “a new type of creative tool.” It opens on a bare-bones screen with inputs for style and subject. This introductory interface only lets you choose from three predefined styles: sticker, enamel pin and plushie. I suspect Google found that those three lend themselves to the kind of rough-outline outputs the experimental tool is best suited to in its current form.

As you can see in the image above, it produced a solid image of a Wilford Brimley plushie. (Google’s terms forbid pictures of celebrities, but Wilford slipped through the gates, Quaker Oats in tow, without alerting the guards.)

Whisk also includes a more advanced editor (found by clicking “Start from scratch” from the main screen). In this mode, you can use text or a source image in three categories: subject, scene and style. There’s also an input bar to add more text for finishing touches. However, in its current form, the advanced controls didn’t produce results that looked anything like my queries.

For example, check out my attempt to generate the late Mr. Brimley in a lightbox scene in the style of a walrus plushie image I found online:

Screenshot of an AI generation tool producing images of a man who looks a bit like Wilford Brimley. (Google / Screenshot by Will Shanklin for Engadget)

Whisk spit out what looks like a vaguely Wilford Brimley-esque actor eating oatmeal inside a lightbox frame. As far as I can tell, that dude is not a plushie. So, it’s clear why Google recommends using the tool more for “rapid visual exploration” and less for production-ready content.

Google acknowledges that Whisk will only draw from “a few key characteristics” of your source image. “For example, the generated subject might have a different height, weight, hairstyle or skin tone,” the company warns.

To understand why, look no further than Google’s description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So, the result is an image based on Gemini’s words about your image — not the source image itself.
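A rough way to picture that hand-off is the Python sketch below. It is only an illustration of the caption-then-generate idea as Google describes it, not Whisk's actual code: `remix`, `build_prompt`, `RemixInputs` and the two fake model functions are all hypothetical names, and the real Gemini and Imagen 3 calls are replaced with stand-ins so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union

# Hypothetical stand-ins for the two models. Neither is a real Google API.
Captioner = Callable[[bytes], str]   # image bytes -> text description
Generator = Callable[[str], bytes]   # text prompt -> generated image bytes

# Each slot accepts either typed text or an uploaded image.
Slot = Union[str, bytes]


@dataclass
class RemixInputs:
    subject: Slot
    scene: Optional[Slot] = None
    style: Optional[Slot] = None
    extra_text: str = ""


def describe(slot: Slot, caption: Captioner) -> str:
    """Text passes through unchanged; images are reduced to a caption."""
    return slot if isinstance(slot, str) else caption(slot)


def build_prompt(inputs: RemixInputs, caption: Captioner) -> str:
    """Combine the filled-in slots into a single text prompt."""
    parts = [f"Subject: {describe(inputs.subject, caption)}"]
    if inputs.scene is not None:
        parts.append(f"Scene: {describe(inputs.scene, caption)}")
    if inputs.style is not None:
        parts.append(f"Style: {describe(inputs.style, caption)}")
    if inputs.extra_text:
        parts.append(f"Details: {inputs.extra_text}")
    return "\n".join(parts)


def remix(inputs: RemixInputs, caption: Captioner, generate: Generator) -> bytes:
    # Only the text prompt reaches the generator; the source pixels never do.
    return generate(build_prompt(inputs, caption))


# Fake models, so the sketch runs without any network access.
FAKE_CAPTIONS = {
    b"img-a": "an older man with a white mustache",
    b"img-b": "a walrus plushie",
}


def fake_caption(image: bytes) -> str:
    return FAKE_CAPTIONS[image]


def fake_generate(prompt: str) -> bytes:
    # A real generator would return image data; this one just echoes the prompt.
    return prompt.encode("utf-8")


result = remix(
    RemixInputs(subject=b"img-a", style=b"img-b", extra_text="eating oatmeal"),
    fake_caption,
    fake_generate,
)
print(result.decode("utf-8"))
```

Whatever the captioner leaves out (an exact face, a specific hairstyle) is simply not available to the generator in this arrangement, which lines up with Google's warning about height, weight, hairstyle and skin tone.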

Whisk is only available in the US, at least for now. You can try it at the project’s Google Labs site.

This article originally appeared on Engadget at https://www.engadget.com/ai/googles-new-ai-tool-whisk-uses-images-as-prompts-210105371.html?src=rss
