Scott Beynon on NonZero Newsletter

1 Comment

Feb 23, 2024

Yes, when he said this I wondered about the different perspectives you might get if you approach it from the LLM as opposed to the image generator POV. If you come from the text-to-image side of things it's harder to imagine any understanding going on, because it screws up so readily, and not just with things like negation, but even with simple concepts. Midjourney gives you six-fingered hands and elbows that bend backwards, from which it's easier to conclude that it understands nothing about anatomy - it's just gluing photos together.
