All Your Images Are Belong to Zuck

If you have what you consider to be a hard-line stance against AI-generated images, and you post your photos and/or artwork to Instagram, Threads, and/or Facebook, you should likely either rethink that hard-line stance or stop posting your images.

Zuckerberg’s Going to Use Your Instagram Photos to Train His AI Machines:

During his earnings call for Meta’s fourth quarter results yesterday, Mark Zuckerberg made it clear he will use images posted on Facebook and Instagram to train his generative AI tools with.

Last month, Meta announced a standalone AI image generator to compete with the likes of DALL-E and Midjourney.

Meta has already admitted that it has used what it calls “publicly available” data to train its AI tools with.

Essentially, if you have a public Facebook or Instagram profile where you post photographs, there is a strong chance that Meta is using your work to train its AI image generator tools.

Yeah, this sucks, though it’s not surprising. I’ve stopped posting to Instagram, but still post a lot on Facebook, because this is where most of my friends are. I wish Mastodon would get more traction (I’m not tempted by either Threads or Bluesky; Threads is just another arm of Meta, Bluesky is more Jack Dorsey, neither is actually federating yet despite a lot of lip service, and neither currently allows post schedulers to tie in, which keeps me from using them for Norwescon posts), or, even better, that there was more of a push back towards actual self-owned blogs (like this one!) that aren’t locked behind virtual walls. But I don’t want to lose track of all of my friends, so until something major shifts, I’ll stick around, which means I’m probably going to end up shrugging and resigning myself to feeding Zuck’s AI machines, which I have definite ethical issues with.

I’m Training AI Chat Bots (Non-Consensually)

The Washington Post has published an article looking at the websites used to train “Google’s C4 data set, a massive snapshot of the contents of 15 million websites that have been used to instruct some high-profile English-language AIs, called large language models, including Google’s T5 and Facebook’s LLaMA.” If you scroll down far enough, there’s a section titled “Is your website training AI?” that lets you drop in a URL to see if it was scraped and included in the data set.

I checked three strings: “michaelhans” (to cover both this site and its prior address at michaelhanscom.com), “djwudi” (for my DJ’ing blog), and “norwescon” (which I’ve written or tweaked and edited much of the content for). All three of them are represented.

  • norwescon.org: 45k tokens, 0.00003% of all tokens, rank 528,147
  • michaelhanscom.com: 37k tokens, 0.00002% of all tokens, rank 635,948
  • djwudi.com: 3.7k tokens, 0.000002% of all tokens, rank 4,002,025
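
As a rough sanity check on those percentages, the token counts and shares above can be used to back out how big the whole data set would have to be. This is only a sketch: the function and variable names are mine, and the only inputs are the three rounded figures listed above, so the results are ballpark numbers rather than an official size for C4.

```python
# Figures as listed above: (site, approximate tokens, share of all tokens in percent).
# Both columns are rounded, so anything derived from them is approximate.
reported = [
    ("norwescon.org", 45_000, 0.00003),
    ("michaelhanscom.com", 37_000, 0.00002),
    ("djwudi.com", 3_700, 0.000002),
]


def implied_total_tokens(tokens, share_percent):
    """Back out the data set size implied by one site's token count and share."""
    return tokens / (share_percent / 100)


# One estimate per site; they differ because the shares are rounded to one digit.
estimates = {
    site: implied_total_tokens(tokens, share_percent)
    for site, tokens, share_percent in reported
}

for site, total in estimates.items():
    print(f"{site}: about {total / 1e9:.0f} billion tokens implied")
```

The three estimates land in the same neighborhood (somewhere in the 150 to 185 billion token range), which at least suggests the tool’s numbers are internally consistent, and puts into perspective just how small a drop any one personal site is in that bucket.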

For the record, I’m not terribly excited about this. I’m also under no illusion that anything can be done; this stuff is all out on the open web, and as it’s free for actual people to browse through and read, it’s also free for bots to scrape and ingest into whatever databases they keep. Sometimes this is a good thing, for projects like the Internet Archive. Sometimes it’s unwittingly helping to train our new AI overlords.

AI Art, Ethics, and Where I Stand

While nobody specifically asked, since I have some friends who are all about the AI art and some who believe it’s something that should be avoided because of all the ethical issues, and since I’m obviously having fun playing with it with my “AImoji” project, I figured I’d at least make a nod to the elephant in the room.

An AI generated image of an African elephant standing in what appears to be a Victorian sitting room.

There are absolutely some quite serious ethical questions around AI generated artwork. To my mind the three most serious are (not in any particular order):

  1. Much of the material used to train the AI engines was scraped off the internet, often without any consideration of copyright, certainly without any attempt to get permission from the original creators/artists/photographers/subjects/etc., and some people have even found medical images that were only approved for private use by their doctor, but somehow ended up in the training sets. That situations like this are likely (hopefully) in the minority doesn’t absolve the companies who acquired and used the images to create their AI engines from being responsible for using these images.

  2. As the AI engines continue to improve, it is getting more and more difficult to distinguish an AI generated image from one created by an artist. There are also a number of people and organizations who have flat-out stated that they are looking at AI generated imagery as a way to save money, because it means they now don’t have to pay actual artists to create work. Obviously, this is not a particularly good approach to take.

  3. Because some of the engines are able to create images in the style of a particular artist, and the output quality continues to improve, there have already been instances where a living artist is being credited for creating work that was generated by an AI bot. And, of course, if you can create an image that looks like your favorite artist’s work for low or no cost…well, for a lot of people, they’ll happily settle for an AI generated “close enough” rather than an actual commissioned piece. Obviously, this is also not a particularly good approach to take.

I’m enjoying playing with the AI art generation tools. I’m also watching the discussions of the ethical questions around how they can and should be used.

The issues above are all very real and very serious. It’s also true that AI art can be just another tool in an artist’s toolbox. I’ve seen artists who use AI art generators to play with ideas until they find inspiration, or who use parts of the generated output in their own work. I’ve seen reports of people who want to commission art using the generator to get a rough idea of what they’re looking for, which they can then give to an artist as a rough example or proof of concept. So there are ways to use AI art generators in, well, more-ethical ways (it’s hard to argue they’d be entirely ethical when the generators have unethical underpinnings).

So, where I stand in my use at this point:

  1. I don’t use living artists’ names to influence the style one way or another, and have only occasionally used dead artists’ names as keywords (I’ll admit, H.R. Giger has been a favorite to play with).

  2. I don’t feed images in, try to generate images of actual people, or use images of actual people (including myself) as source material.

    One caveat: if a tool does all of its processing locally on my device, I may use my own images, including some of myself. But nothing that feeds images into the systems.

  3. And, of course, anything I do is just for fun, and to make me, and maybe a few other people, laugh (or occasionally recoil in horror).

For a few months this past year, I used an AI-generated image of a dragon flying over a city skyline for the Norwescon website and social media banner image. This was always intended as a temporary measure to fill the gap between last year’s convention and getting art from this year’s Artist Guest of Honor, and as soon as we had confirmed art from our GOH, the AI-generated art came down. It was also chosen much earlier in the “isn’t AI art neat” period, before I’d read as much about the issues involved. As such, I won’t be using AI art for Norwescon again, and will go back to sourcing copyright-free images from NASA or other such avenues when we are in the interregnum period.

So: I understand those who see AI art as something that should be avoided. I also understand those who see it as another tool. And, honestly, I also understand those who just see a shiny new toy that they want to play with. I’m somewhere in the midst of all those points of view, and while I don’t personally see the need to avoid AI art bots entirely, I am consciously considering how I use them and what I use them for.

AI Weirdness • 2020 headlines: “Midway through 2020, people started suggesting that I train a neural net on 2020 headlines, and I was skeptical that there would be enough weird ones to make a decent project. Then 2020 continued to be 2020.”

AI Dungeon 2

I haven’t taken the time to try this yet, but this seemed like something quite a few people I know would be into: a Zork-style game with an AI backend, so you can do…well, anything, apparently.

I wrote earlier about a neural net-powered dungeon crawling text adventure game called GPT-2-Adventure in which gameplay is incoherent and dreamlike, as you encounter slippery sign text, circular passages, and unexpected lozenge rooms. A PhD student named Nathan trained the neural net on classic dungeon crawling games, and playing it is strangely surreal, repetitive, and mesmerizing, like dreaming about playing one of the games it was trained on.

Now, building on these ideas (and on an earlier choose-your-own-adventure-style game he built), Nick Walton has built a new dungeon-crawling game called AI Dungeon 2. Nick made a few upgrades, such as beefing up the AI to the huge GPT-2-1.5B model OpenAI recently released, adding a penalty for repetitive text, and expanding the dungeon game training examples to a bunch of modern human-written games from chooseyourstory.com.

I CAN’T STOP PLAYING THIS GAME

Here’s the actual game site: AI Dungeon. Have fun!

Linkdump for November 14th through November 29th

Sometime between November 14th and November 29th, I thought this stuff was interesting. You might think so too!

Back again!

Woohoo! We’ve reconfigured a few areas of the network here at Casey’s house, and it seems that things are back up and fully functional for me again. So, as things go here, I’ll do my best to return to updating my pages on a regular basis. I know, I know, something of a shock after about a month of near-nonexistent updates…but I’ll try.

Things for me are still in something of a holding pattern at the moment. I got word from the landlord of my apartment complex that the carpets are scheduled to be installed this Monday, so I should finally be able to get into my place Monday afternoon/evening sometime. I’ve made the requisite calls to the telephone and electric companies and am all set up there, so should be good to go as soon as I get the word from the landlord on Monday. I’ll be sending out the mailing address and phone number to those who need it in the near future.

Internet access options for me are still being investigated. I’m hoping to get set with a DSL line, I just need to get in contact with the local ISPs to see if my apartment has that as an option. I’m assuming it does (I’m going to be living right on Capitol Hill, just about 20 blocks or so uphill from downtown Seattle), but I’m not entirely sure yet. In any case, Casey has graciously allowed me to keep my webserver at his place until I have things up and running at my apartment, so the server shouldn’t be going down again at all; however, there may be a couple weeks where my online abilities are severely limited until I get my own connection up and running. It’ll all get straightened out eventually, and I’m just glad to have friends down here who are able and willing to assist me in all of this.

In other news, I’ve been playing a lot with my digital camera since I got it. I took some time recently to stitch together some panoramas I’d taken. The first three were all taken before I left Alaska — from top to bottom, the Inlet as seen from Earthquake Park in Anchorage, a view of the Palmer hayflats where I hit a bonfire with some friends, and Jewel Lake, a popular destination in South Anchorage.

Cook Inlet, Anchorage, AK

Bonfire Panoramic, Mat-Su Valley, AK

Jewel Lake, Anchorage, AK

The fourth shot was taken at Gas Works Park here in Seattle during the 4th of July celebrations, about an hour before the fireworks display. I wanted to try and capture the sheer mass of people; later reports placed it at around 6,000 people just at this park (and it was one of two major fireworks displays within Seattle). I think it came out pretty decently.

4th of July 2001, Gas Works Park, Seattle, WA

I’ve been out to see two movies since I came down here, and since it’s been a bit since I’ve seen them, I’ll just give brief rundowns of each. First off was Atlantis, Disney’s latest animated flick, and another fun one from Disney. Not one of their all-time classics, but very enjoyable, with some absolutely breathtaking animation at times. More recently was A.I., the Spielberg/Kubrick sci-fi collaboration. In brief: I believe it to be an astounding piece of work, quite possibly Spielberg’s best work yet, and a film that, while getting wildly mixed reviews, is very likely to stand the test of time like few other recent films. Very, very impressive filmmaking, and my hat is off to Spielberg, Kubrick, and the rest of the forces behind this film. I’ll most likely post more about it after I’ve had a chance to see it a second time.

That’s the majority of the big news so far. As mentioned earlier, now that things are up and running again, I’ll do my best to return to a more reliable update schedule here. It’s good to be back….