Not Investing Advice

Market Design for Text-to-Image Generation

Or "surge pricing for art", or "How can we design a market system to pay artists for their tributes to our AI overlords?"

Anthony Lee Zhang
Oct 1, 2022

An idea that’s been floating around economics/policy circles recently is that consumers provide data that is a key input into tech firms’ ML/AI models, and that consumers should be paid for the use of their data. This idea floats around under many names, such as data as labor, data as oil, consumer data ownership, and many others.

Despite substantial discussion, the idea hasn’t, in my opinion, picked up much policy momentum so far. Here’s one possible reason why. Twitter, Facebook, YouTube, Amazon, Netflix, and a variety of other firms do make a lot of money off my data. My shitposts, video watching, product browsing, and other behavior generate data that is useful for predicting what I and other people will want to watch, buy, or click.

This data makes these firms a lot of money. But the cost of producing it, from my perspective, is very low. I watch YouTube videos, browse Amazon products, and produce shitposts on Twitter incidentally, in the course of everyday life; I do not generally go out of my way to expend costly effort to produce shitposts. Twitter, Facebook, YouTube, and the like are fairly effective at gathering these kinds of data because, simply by providing a good user experience, they can collect such incidentally produced user data for their ML models.

There was never any real need to pay people to click on videos — sure, it would be nice if I got a few cents for every video I watched, but not enough for me to push strongly to make this happen. There was also never any strong sense that figuring out how to pay people for clicking on videos would generate higher quality video watching data — so while regulators, policymakers, and academics have bounced this idea around for a while, no firms really invested much in trying to make this idea work. When data has zero cost to produce, there is no real value in creating a market for the data. The market clearing price for your shitposts is functionally zero.

An interesting development, which has the potential to overturn this logic, is the recent rise of text-to-image software, such as DALL-E and Stable Diffusion. Text-to-image software generates seemingly miraculous results. But it essentially does so by stealing the work in large corpora of art, and “gluing” the pieces together in a computer-assisted way.

There seem to be large ethical, and potentially legal, problems behind these miracles. I do not care a huge amount if YouTube makes money off my video clicks. It seems a much more severe problem if a text-to-image firm is making large amounts of money from the unpaid labor of artists producing inputs for these art-Frankenstein-machines. Regulators have good reasons to be concerned about IP law and policy around these cases.

But here’s a thought: even without regulatory actions, markets may fix the artist compensation problem in the art case, much more efficiently than they did in the case of incidentally generated user click data.

Why is this? The core reason is precisely that making art is much more costly than browsing Amazon or watching YouTube videos. As a result, the elasticity of art output with respect to compensation is much higher. If a firm figures out how to pay people to click on videos, you don’t get many more video watches, and you can’t improve your AI much. But if you figure out an efficient way to pay artists for producing precisely whichever images are most valuable to a given text-to-image bot at a given point in time, you potentially get much more art, substantially improving your text-to-image bot!

Conceptually, a market design solution for text-to-image bots would basically try to match the supply and demand for art inputs into the AI bot. It would figure out which parts of “image space” have high demand and low supply. Suppose the world suddenly wants lots of art of cyberpunk hot dog stands. The current outputs from DALL-E for this particular input are… unimpressive.

There is too much demand for cyberpunk hot dog stand pictures, relative to the supply of art which “feeds into” creating a good cyberpunk hot dog stand through AI. But a market can fix this problem! The AI could figure out what kinds of art feed into this (presumably, cyberpunk art and hot dog art) and increase the prices paid to artists creating these kinds of art.
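To make the price adjustment concrete, here is a minimal sketch of one way it could work. Everything in it is an assumption for illustration: the `surge_prices` function, the tag names, the made-up counts, and the simple "demand divided by supply" rule are hypothetical, not how any real text-to-image firm prices art.

```python
def surge_prices(demand, supply, base_price=10.0, cap=5.0):
    """Return a hypothetical price per tag that rises with the demand/supply ratio.

    demand: tag -> number of user requests touching that tag (assumed input)
    supply: tag -> number of training images with that tag (assumed input)
    """
    prices = {}
    for tag, requests in demand.items():
        available = supply.get(tag, 0)
        # Add 1 to supply so a tag with no art yet does not divide by zero.
        ratio = requests / (available + 1)
        # Never pay less than the base price, and cap the surge multiplier.
        multiplier = min(max(ratio, 1.0), cap)
        prices[tag] = base_price * multiplier
    return prices


# Made-up numbers: lots of requests for cyberpunk and hot dogs, little art.
demand = {"cyberpunk": 900, "hot dog": 400, "landscape": 100}
supply = {"cyberpunk": 299, "hot dog": 99, "landscape": 999}
prices = surge_prices(demand, supply)
print(prices)
```

With these invented numbers, cyberpunk and hot dog art are priced above the base rate, while landscapes, which are already plentiful relative to demand, stay at the base price.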

From the artists’ side, a market could look like the following. Artists create art, and submit it for evaluation and “purchase” by the AI. The AI would then “score” the art according to its current market value in the supply-demand system. This score is determined by how much the given piece of art “fills demand gaps”, giving the AI valuable information in parts of text space (like cyberpunk hot dog stands) where many people want good pictures, but there are currently not enough high-quality inputs. The AI would then pay for the (right to use the) art, according to the art’s score. Effectively, in this system, the AI runs an art market with surge pricing. Artists are paid, and what they’re paid depends on how much their art inputs help the machine produce output images in high demand.
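The scoring step can be sketched in a few lines. This is only an illustration under strong assumptions: the `Prompt` record, the tag-overlap measure of relevance, the "requests times quality shortfall" measure of a demand gap, and the `rate` parameter are all hypothetical stand-ins. A real system would need some way to estimate how much a piece of art actually improves the model's output, which is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    tags: frozenset   # concepts the prompt combines (assumed representation)
    requests: int     # how often users ask for it
    quality: float    # current output quality, from 0 (bad) to 1 (good)


def score(art_tags, prompts):
    """Score a submission by how much unmet demand its tags touch."""
    total = 0.0
    for p in prompts:
        # Relevance: share of the prompt's tags that this piece covers.
        overlap = len(art_tags & p.tags) / len(p.tags)
        # Demand gap: many requests and poor current output means a big gap.
        gap = p.requests * (1.0 - p.quality)
        total += overlap * gap
    return total


def payment(art_tags, prompts, rate=0.01):
    """Pay in proportion to the score; `rate` is dollars per score point."""
    return rate * score(art_tags, prompts)


prompts = [
    Prompt(frozenset({"cyberpunk", "hot dog stand"}), requests=1000, quality=0.2),
    Prompt(frozenset({"landscape"}), requests=100, quality=0.75),
]
print(payment({"cyberpunk"}, prompts))   # art aimed at an underserved prompt
print(payment({"landscape"}, prompts))   # art aimed at a well-served prompt
```

In this toy setup, a cyberpunk piece earns more than a landscape because it feeds a prompt that is both popular and currently badly served. As demand shifts or quality improves, the same piece would score differently, which is the surge pricing behavior.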

In principle, a firm that figured out how to solve this market problem could produce a text-to-image product much better than firms that just use publicly available training datasets! Just as surge pricing brings drivers to locations with high demand for rides and low supply of drivers, weaknesses in the text-to-image AI would be fixed by a stream of optimizing artists, chasing whatever inputs have the highest market value at any point in time.

Market design for text-to-image bots thus seems like both a significant ethical problem and a big market opportunity. The first firm that figures out how to efficiently pay artists for their contributions to our text-to-image AI overlords can potentially make a very large amount of money, all while improving both the livelihoods of artists and the quality of text-to-image bot output.
