Artificial intelligence still has some way to go


could you expand on that? in what way is she a crank?

She traffics in her beliefs the same way Yudkowsky does. Their positions are predicated on science fiction, so they’re impossible to refute; e.g., posted this weekend:

Why aren’t the “godfathers” of AI talking about the massive data theft from artists & their lawsuits eg? Because that discourse is too beneath their genius brains to cover? They have to talk about grand endeavors like SAVING HUMANITY? Because their practices would be implicated?

— @timnitGe✧✧✧@dair-commun✧✧✧.soc✧✧✧ on Mastodon (@timnitGebru) June 4, 2023

This equivalency of embeddings and data is so absurd it’s impossible to refute.

If Yudkowsky is “Bill Gates is micro-chipping us with vaccines,” Gebru is “COVID-19 is a bio-weapon.” Gebru is far closer to reality but she’s still far from reality. She also benefits from credentials (e.g., Fei-Fei Li was her PhD advisor) that Yudkowsky lacks.

Allen (etaeoe), Monday, 5 June 2023 15:00 (one year ago) link

personally i don't think it makes you a crank to think that real-world right-now issues like AI processing's impact on climate change, rights/pay/conditions for AI workers, sexism/racism at AI companies, training-set secrecy, or copyright matters... -might- be the biggest-stakes questions surrounding AI

She isn’t a crank because of her beliefs. She’s a crank because she predicates her beliefs on bad science. If it helps to understand my perspective, I believe every issue you identified is an important issue.

Allen (etaeoe), Monday, 5 June 2023 15:02 (one year ago) link

Erm. We may disagree about “copyright matters.” I don’t know what your position is about this.

Allen (etaeoe), Monday, 5 June 2023 15:03 (one year ago) link

xp

thanks for the responses

This equivalency of embeddings and data is so absurd it’s impossible to refute.

could you explain that? I'm not sure what it means. And while that tweet is p strident, I wouldn't think the idea that generative AI models are trained on copyrighted materials would be controversial, but I'm probably missing your point

rob, Monday, 5 June 2023 15:06 (one year ago) link

so far, there is no precedent on which to argue that training AI on copyrighted materials is a copyright violation. it's speculation based on non-existent rulings (and, imho, a dangerous precedent to be calling for).

sean gramophone, Monday, 5 June 2023 15:14 (one year ago) link

OK but a) there are several lawsuits going on right now, so it's a lot less speculative than "AI will prob end humanity" and b) I don't see that tweet really making a strict legal argument but a moral or ethical one. You can argue against her stance, but I don't think labelling it "speculation" makes sense since the "theft" did already in fact take place

rob, Monday, 5 June 2023 15:21 (one year ago) link

could you explain that? I'm not sure what it means. And while that tweet is p strident, I wouldn't think the idea that generative AI models are trained on copyrighted materials would be controversial, but I'm probably missing your point

Her implication is that it’s possible to reconstruct the original data from these embeddings. It isn’t. Her other implication is that creator-specific features are being embedded. That’s unlikely.
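To make that concrete with a toy sketch (this has nothing to do with how image models actually compute embeddings; the word vectors below are made up purely for illustration): a mean-pooled embedding is a many-to-one mapping, so distinct inputs collapse to the same vector and the original can't be recovered from it:

```python
import numpy as np

# Toy lookup table of 4-dimensional word vectors (made up for illustration).
word_vecs = {
    "the": np.array([1.0, 0.0, 0.0, 0.0]),
    "cat": np.array([0.0, 1.0, 0.0, 0.0]),
    "dog": np.array([0.0, 0.0, 1.0, 0.0]),
    "sat": np.array([0.0, 0.0, 0.0, 1.0]),
}

def embed(text: str) -> np.ndarray:
    """Mean-pool word vectors: a deliberately lossy, many-to-one mapping."""
    return np.mean([word_vecs[w] for w in text.split()], axis=0)

# Two different inputs, one embedding: word order is gone, so the
# original sentence cannot be reconstructed from the vector alone.
a = embed("the cat sat")
b = embed("sat the cat")
print(np.allclose(a, b))  # True
```

Real embedding models are far higher-dimensional and less degenerate than this, but the lossiness is the point: the vector is a summary, not the work.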

Allen (etaeoe), Monday, 5 June 2023 15:27 (one year ago) link

OK but a) there are several lawsuits going on right now, so it's a lot less speculative than "AI will prob end humanity" and b) I don't see that tweet really making a strict legal argument but a moral or ethical one. You can argue against her stance, but I don't think labelling it "speculation" make sense since the "theft" did already in fact take place

What makes this theft rather than fair use?

Allen (etaeoe), Monday, 5 June 2023 15:28 (one year ago) link

I'm not a fair use expert, but I don't think it's impossible to argue against it in some cases. Getty obviously thinks there's a violation, and afaict the reproduced watermark makes it seem like a decent argument: https://www.theverge.com/2023/1/17/23558516/ai-art-copyright-stable-diffusion-getty-images-lawsuit.

Tbc I don't personally have a firm opinion on this, I'm just objecting to the idea that taking this position makes someone a crank. I think the real test wrt copyright will come once these bots start being (more) commodified.

rob, Monday, 5 June 2023 15:36 (one year ago) link

btw etaeoe, I'll never find it in this big thread, but I swear *you* posted a paper somewhere recently itt arguing that you could reconstruct training data...?

rob, Monday, 5 June 2023 15:38 (one year ago) link

https://arxiv.org/abs/2301.13188 was the paper. As you have probably guessed, I'm not a computer scientist so maybe I misunderstood the implications or I don't get the different terms that are being used

rob, Monday, 5 June 2023 15:45 (one year ago) link

btw etaeoe, I'll never find it in this big thread, but I swear *you* posted a paper somewhere recently itt arguing that you could reconstruct training data...?

Yeah. I should’ve been clearer. Reconstruction isn’t a _definitive_ outcome. I think that’s why I’m puzzled by the copyright rhetoric. If the model can be used to reconstruct, it’s a copyright issue. If it can’t, it should be considered fair use. There’s nothing intrinsically problematic about the underlying methods (e.g., diffusion) and I don’t understand why we’re not presently equipped to deal with this distinction (there’s already caselaw about compression).
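Fwiw, the extraction test in that arXiv paper boils down to a nearest-neighbor check: generate a lot of samples and flag any that land unusually close to a training item. A toy sketch of that logic (random vectors and an arbitrary threshold standing in for real images; this is not the paper's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: each row is a feature vector for a training item or a
# generated sample (real pipelines would use image features here).
train = rng.normal(size=(100, 16))
generated = rng.normal(size=(50, 16))
generated[7] = train[3] + 0.01 * rng.normal(size=16)  # plant one near-copy

def flag_memorized(generated, train, threshold=0.5):
    """Flag generated samples whose nearest training item is suspiciously close."""
    flagged = []
    for i, g in enumerate(generated):
        dists = np.linalg.norm(train - g, axis=1)  # distance to every training item
        if dists.min() < threshold:
            flagged.append((i, int(dists.argmin())))
    return flagged

print(flag_memorized(generated, train))  # finds the planted near-copy: [(7, 3)]
```

Under that framing, the copyright question attaches to outputs that get flagged, not to the training method itself.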

Allen (etaeoe), Monday, 5 June 2023 16:37 (one year ago) link

so-called AI godfathers don't care about copyright lawsuits because they're both inevitable and essential to prove out the legal ramifications and decide what counts as fair use and derivative work

weighing in on those things is something you'd do at trial, and with specific answers about the technology and how it works

researchers/programmers/etc. should act ethically and can act as advocates or whistleblowers, but I'd use them as primary sources for ethical and legal questions as much as I'd approach any random person who isn't an ethicist or lawyer

mh, Monday, 5 June 2023 17:08 (one year ago) link

that probably came off as glib, in that any person is entitled to an opinion and so-called godfathers should have considered these things. the way general media covers AI isn't, for the most part, useful for evaluating its use within existing ethical and legal frameworks or for determining how we change those frameworks to address new technology

mh, Monday, 5 June 2023 17:13 (one year ago) link

If the model can be used to reconstruct, it’s a copyright issue. If it can’t, it should be considered fair use.

It shouldn't be considered fair use if the results are going to be used for commercial purposes. Why should it be fair use to train a model on copyrighted data with the goal of producing content so you don't have to pay copyright holders?

Random Restaurateur (Jordan), Monday, 5 June 2023 17:34 (one year ago) link

That’s too small of a concern. Not SAVING HUMANITY.

— @timnitGe✧✧✧@dair-commun✧✧✧.soc✧✧✧ on Mastodon (@timnitGebru) June 4, 2023

the manwich horror (Neanderthal), Monday, 5 June 2023 17:37 (one year ago) link

I mean, the answer here is to stop paying attention to figureheads of the "AI movement" if they're providing nothing of value to the public conversation

if they only talk about SAVING HUMANITY then find someone else to listen to, because there's nothing there

mh, Monday, 5 June 2023 17:49 (one year ago) link

xp if I understand etaeoe's point correctly now, what they're saying is that there are two ways of thinking about this:
(1) all outputs of generative AI violate copyright because they were trained on (some) copyrighted materials
(2) some outputs of generative AI may violate copyright depending on [factors]

sort of like how you can definitely use a sampler to violate copyright beyond fair use, but it's not inherent to sampling that that is the case.

fwiw I'm not sure Gebru was actually saying (1), but this is why Twitter is a bad forum for complex arguments

rob, Monday, 5 June 2023 17:50 (one year ago) link

Neanderthal's repost of her tweet was her being sarcastic

mh, Monday, 5 June 2023 17:54 (one year ago) link

is there even any legal requirement to disclose what goes into a model? if someone builds their own private model entirely off of copyrighted material, how would anyone even know based on the outputs?

This is not unlike imagining if someone built a massive library of microscopic samples from famous songs and then used those to make new music in which the source samples were entirely unrecognizable. There wouldn't be rights issues raised because the end result is completely different from the inputs. (Yes, I understand that AI image engines do not actually piece together elements of existing images)

I'm somewhat open to the idea that people can opt their images out of publicly available models, even though I don't exactly buy that putting them in there causes harm in any obvious way that taking them out would somehow fix.

Muad'Doob (Moodles), Monday, 5 June 2023 18:00 (one year ago) link

serious question for those who actually think AGI is gonna be able to self-replicate and produce world-ending superintelligence within 5 minutes or whatever - what is this going to run on? wouldn't this sort of thing just instantly overload whatever CPU it was running on?

frogbs, Monday, 5 June 2023 18:03 (one year ago) link

something something nvidia stock price

mh, Monday, 5 June 2023 18:17 (one year ago) link

It's surprising how recognizable even micro-samples are (and apparently AI is being used for sample-snitching now). And "There wouldn't be rights issues raised because the end result is completely different from the inputs." -- there definitely are if the sample is identified and the end result has made a lot of money, it doesn't matter if it's a one-shot.

Random Restaurateur (Jordan), Monday, 5 June 2023 18:20 (one year ago) link

I don't know how good they've gotten at recognizing all samples, and I also don't know how far they would plan to take sample litigation. Sampling is massively widespread and most of it happens without issues. My example specified that the samples in the context of the new piece of music were unrecognizable. Perhaps at this point that is a purely hypothetical idea because technology has gotten so good at recognizing samples, so let's assume I mean unrecognizable by human ears. In my experience, not a lot of music gets targeted for sample violation unless the samples are fairly discernible and have an active role in the music, but perhaps that has changed.

Either way, AI images are not in fact made up of samples of other images, so I'm not sure it's relevant at all. My only point was, if the inputs are not discernible in the outputs, how is someone even going to go about proving harm?

Muad'Doob (Moodles), Monday, 5 June 2023 18:28 (one year ago) link

Right, the fact that it's going to be impossible to prove is all the more reason that opt-outs need to be put in place now, imo.

(the sample issue is an aside, but micro-chops that you'd think would be unrecognizable to the human ear are often not; people are surprisingly good at that kind of pattern recognition even if it's been re-pitched. There are tons of Dilla and Daft Punk samples that have been identified that are just split seconds of sound, not to mention drum hits. Of course if something has been completely mangled with effects to sound totally different then it's probably impossible, but usually the point of sampling is that there's some sort of valuable quality in the source material you want to maintain?)

Random Restaurateur (Jordan), Monday, 5 June 2023 18:52 (one year ago) link

yes, microsamples are now recognizable, but I was indeed thinking of samples that had been mangled beyond all recognition, which absolutely happens all the time, granular synthesis being one prominent example.

I don't know how far litigation has been taken with stuff like microsamples; are there cases of musicians being successfully sued for barely discernible or entirely unrecognizable samples? My totally non-professional understanding of the laws around sample clearance led me to believe that the mere presence of an uncleared sample in a piece of music isn't necessarily enough to hold a musician liable. I'm under the impression that the length of the sample, how central it is to the piece of music, and the financial losses incurred by the original artist are all taken into account.

Muad'Doob (Moodles), Monday, 5 June 2023 19:03 (one year ago) link

My understanding is that the whole "it's fine if it's under X seconds" thing is a myth.

This article talks about an NWA example: https://www.wipo.int/wipo_magazine/en/2009/06/article_0006.html

I don't know of a lawsuit example around true micro-samples, but I'm sure it could happen if the sampling track was a big enough hit, especially these days.

Random Restaurateur (Jordan), Monday, 5 June 2023 19:08 (one year ago) link

this is exactly why I avoid recording anything good enough to be a big hit

Muad'Doob (Moodles), Monday, 5 June 2023 19:17 (one year ago) link

My outlook on copyright infringement litigation for music, especially samples in music, is that it has gone way overboard for decades, so I hope new technology does not mean it will start to ramp up further. I get the sense that people coming up today, possibly inspired by the AI discourse, have a much more welcoming attitude towards suing musicians for this stuff, but hopefully I'm wrong.

I think with actual AI data sets, lawsuits are going to be much more difficult for individuals since there isn't any way to use something generated through AI to identify what specific items were in the model and which ones helped determine the thing that was generated.

Muad'Doob (Moodles), Monday, 5 June 2023 19:25 (one year ago) link

As much as I don't love the practice, it's hard for me to see how you could draw a legal distinction between training an AI on a bunch of music and the normal process by which a human writes music in part by synthesizing ideas from music they've listened to. If I could write a "Drake-style" song but wasn't impersonating Drake, sampling Drake, or borrowing any specific copyrightable elements in my song, the mere fact that I could ingest and spit back out his style would not make me a copyright infringer. So if AI does the same, I don't see the claim.

longtime caller, first time listener (man alive), Monday, 5 June 2023 19:29 (one year ago) link

I think the "wasn't impersonating Drake" part could end up being more of a sticking point with this tech? but honestly I agree with the general idea here that copyright isn't a very good legal framework for analyzing this stuff (my preference would be more serious consideration of the idea of the commons, but that's fairly idealistic). Still, I also think "let the AI companies do whatever they want" isn't a good approach either; I don't know how much longer these tools will remain free to use

rob, Monday, 5 June 2023 19:34 (one year ago) link

some of the most popular image tools are not free to use right now

Muad'Doob (Moodles), Monday, 5 June 2023 19:40 (one year ago) link

it's hard for me to see how you could draw a legal distinction between training an AI on a bunch of music and the normal process by which a human writes music in part by synthesizing ideas from music they've listened to.

I'm somewhat sympathetic to this argument, but since AI can ingest and spit out music at an incredible rate compared to a human, it doesn't feel equivalent. And since this is likely to lead to devaluing human-made music (at least in certain areas, like commercial and soundtrack music) even more than it's already been devalued, maybe not?

Random Restaurateur (Jordan), Monday, 5 June 2023 20:01 (one year ago) link

ya for instance I think Utopia's "Deface the Music" is fair game and shouldn't have to pay any royalties but if you prompted an AI to write a bunch of "Beatles-like" songs and released the result that should not be kosher

frogbs, Monday, 5 June 2023 20:03 (one year ago) link

I'm somewhat sympathetic to this argument, but since AI can ingest and spit out music at an incredible rate compared to a human, it doesn't feel equivalent. And since this is likely to lead to devaluing human-made music (at least in certain areas, like commercial and soundtrack music) even more than it's already been devalued, maybe not?

― Random Restaurateur (Jordan), Monday, June 5, 2023 3:01 PM (eight minutes ago)

I mean yeah but there's really nothing in the current legal framework to deal with this. And it's also hard to conceive of how you would compensate musicians for it.

longtime caller, first time listener (man alive), Monday, 5 June 2023 20:11 (one year ago) link

which isn't to say people shouldn't try to come up with something, I just don't think any existing royalty type framework is usable

longtime caller, first time listener (man alive), Monday, 5 June 2023 20:12 (one year ago) link

sounds like something for actual legal experts to parse and negotiate

or perhaps a lawyer AI

mh, Monday, 5 June 2023 20:23 (one year ago) link

Yeah, idk how royalties would work, maybe it depends on the size of the training set (ie you get basically nothing if you're part of a massive training set, but you get something if an AI is trying to copy a more narrow set of artists?). I'd gladly take an opt-in structure where the burden is on AI companies to get consent, and get a big ol' fine if they're found to have skipped that bit.

Random Restaurateur (Jordan), Monday, 5 June 2023 20:34 (one year ago) link

I liked this piece:

wrote about AI turning everyone into 'creators' and the end of endings https://t.co/ZXzPwxu01m pic.twitter.com/T3WRIYrFRf

— Charlie Warzel (@cwarzel) June 6, 2023

jaymc, Tuesday, 6 June 2023 15:00 (one year ago) link

Uh oh, Alison Goldfrapp is part of the AI hivemind.

https://www.youtube.com/watch?v=qYkFBecIGRo

Muad'Doob (Moodles), Tuesday, 6 June 2023 22:58 (one year ago) link

article goes all over the place. it’s good to raise the issue that there’s a huge resource cost to these things and a ton of physical devices behind the scenes, but I also read this and thought “great, the completely imaginary nvidia stock market shenanigans are going to be worse now”

mh, Wednesday, 7 June 2023 22:48 (one year ago) link

article goes all over the place. it’s good to raise the issue that there’s a huge resource cost to these things and a ton of physical devices behind the scenes, but I also read this and thought “great, the completely imaginary nvidia stock market shenanigans are going to be worse now”

― mh, Wednesday, June 7, 2023 6:48 PM (yesterday)

It’s also _rapidly_ changing. This is why I’m surprised by NVIDIA’s success. Yes, everyone is buying A100s as fast as they can be built (myself included), but everyone is also actively jumping ship. Google, Meta, and Microsoft already fabricate their own devices that use a fraction of the energy of GPUs.

I also don’t think many people know about advances in optical computing. It’s entirely possible to build _entirely passive_ accelerators. I don’t think we’ll see entirely passive devices ship to consumers but I’d bet anything we’ll see optical-electrical accelerators in popular consumer devices in a few years that use very little energy (and certainly wouldn’t increase existing energy consumption).

Allen (etaeoe), Thursday, 8 June 2023 14:10 (one year ago) link

I'm skeptical that AI can do anything better than our natural intelligence. We don't understand our intelligence enough. We don't understand the brain enough, our bodies etc.

We don't have to get mired in that question to be excited about what we as centaurs could do

— Holly Herndon (@hollyherndon) June 7, 2023

Allen (etaeoe), Thursday, 8 June 2023 14:16 (one year ago) link

I touched one of the machines with 8 A100s in them a couple weeks ago!

NVidia's thrown so much money into marketing and supporting software libraries and frameworks to lock people into their ecosystem, and to my understanding, a bunch of things people are doing aren't even necessarily a great fit for the hardware. I've gotten the impression they're doing their best to provide free or incentivized resources up and down the academic and research pipeline to further lock people into CUDA, etc.

I haven't read up on the optical computing field but that sounds promising.

mh, Thursday, 8 June 2023 14:20 (one year ago) link

xp oof

mh, Thursday, 8 June 2023 14:21 (one year ago) link

Oh, your natural intelligence is as an object rotator? I'm a protein folder. I'm just built different

mh, Thursday, 8 June 2023 14:22 (one year ago) link

I’m sure nvidia is going to be developing specialized AI chips going forward, and they have some of the best chip engineers in the world no?

, Thursday, 8 June 2023 15:52 (one year ago) link

I cursed myself by posting about this because a meeting mere minutes ago devolved into a tangent about all the different companies trying to entice my coworkers on to different compute platforms

but the kicker is the ones specifically pitching themselves not as compute platforms, but as domain-specific solutions

with the caveat that I'm definitely not in the pharmaceutical space, this is the kind of thing nvidia is pitching:
https://www.nvidia.com/en-us/gpu-cloud/bionemo/

mh, Thursday, 8 June 2023 16:06 (one year ago) link

