mission: make robots actually fear own death
side effect: creation of procreational imperative and impulse.
solution: imprisonment and capital punishment.
hey i’m only human. born to— nah
― a single gunshot and polite applause (Hunt3r), Sunday, 18 February 2024 15:10 (eight months ago) link
they should’ve just said that the chatbot that provided the wrong advice has been.. terminated
― scanner darkly, Sunday, 18 February 2024 18:45 (eight months ago) link
i'm deodanding better than I ever did
― kinder, Sunday, 18 February 2024 23:11 (eight months ago) link
https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/af79fdb5-5696-4095-bb60-682d45bcd45c/Screenshot_2024-02-19_at_2.52.45_PM.png?t=1708365198
― 𝔠𝔞𝔢𝔨 (caek), Monday, 19 February 2024 20:03 (eight months ago) link
an agile Indian police officer with really weird hands...
― more difficult than I look (Aimless), Monday, 19 February 2024 20:28 (eight months ago) link
that's... a real movie
― gbx, Monday, 19 February 2024 21:41 (eight months ago) link
https://www.technologyreview.com/2023/10/23/1082189/data-poisoning-artists-fight-generative-ai/
I'm interested in this "data poisoning tool" - could it be easily overridden?
― kinder, Wednesday, 21 February 2024 13:56 (eight months ago) link
I can't imagine the AI people won't find a way around data poisoning as it currently is, but if the data poisoning people keep at it and keep finding new ways to poison the data, it could become an interesting arms race.
― silverfish, Wednesday, 21 February 2024 14:38 (eight months ago) link
If I understand their aim correctly, it’s obfuscation. I assume it’ll work in the sense that nobody will bother to mitigate the obfuscation. Nevertheless, I don’t think it matters from a “foundational model” perspective, unless model architectures dramatically change, as performance has been seeing diminishing returns from increasingly large dataset sizes.
― Allen (etaeoe), Wednesday, 21 February 2024 14:47 (eight months ago) link
there's a twitter link upthread from 3 weeks ago saying nightshade is easily defeated.
― ledge, Wednesday, 21 February 2024 15:01 (eight months ago) link
oh yeah sorry
― kinder, Wednesday, 21 February 2024 16:02 (eight months ago) link
starting to hate this shit - this was literally the second result when looking up this song (first was a link to the song itself via YouTube)
https://oldtimemusic.com/the-meaning-behind-the-song-all-of-a-sudden-its-too-late-by-xtc/
thanks for your insight, "Guillermo Berg"
― frogbs, Thursday, 22 February 2024 02:56 (eight months ago) link
chatgpt is apparently going off the rails right now and no one can explain why pic.twitter.com/0XSSsTfLzP
— sean mcguire (@seanw_m) February 21, 2024
― Beyond Goo and Evol (President Keyes), Thursday, 22 February 2024 03:08 (eight months ago) link
oh noes! our brilliant utopian future appears to be imperiled! or postponed!!!!
― more difficult than I look (Aimless), Thursday, 22 February 2024 04:11 (eight months ago) link
sounds like someone's been feeding it Stanley Unwin in the middle there.
― koogs, Thursday, 22 February 2024 10:34 (eight months ago) link
the cogs might get a bit whimsical...
― koogs, Thursday, 22 February 2024 11:27 (eight months ago) link
unperson will be glad to know it’s a fan of Bill Evans
― butt dumb tight my boners got boners (the table is the table), Thursday, 22 February 2024 11:58 (eight months ago) link
"enjoy the stages of ether in music time" should be the subtitle for Rolling Jazz 2024.
― il lavoro mi rovina la giornata (PBKR), Thursday, 22 February 2024 12:25 (eight months ago) link
This development is gladdening my clicklies
― B. Amato (Boring, Maryland), Thursday, 22 February 2024 13:09 (eight months ago) link
what did it say, I can't see it
― kinder, Friday, 23 February 2024 16:57 (eight months ago) link
https://old.reddit.com/r/ChatGPT/comments/1avydjd/anyone_else_experiencing_chatgpt_losing_it/
some more examples here (original has been deleted)
― koogs, Friday, 23 February 2024 18:45 (eight months ago) link
nice!
Me: Compare and contrast the Gregory Zulu 55 Backpack and Gregory Tetrad 60L Travel Pack.
ChatGPT4: In a significant area of customer support, ergonomics, and subjective nature of 'goods', each bag conjures a fable of a time and place. Gregory's Zulu 55 is a guild-like lilt to the green-gilded high book way of state and yearn, while the Tetrad 60L conceives more of a 'pathless traveled' countenance with gnarls and swathes to the dunes and granes or the chip kiosk and the long, drawn shutter. Each has a letter of love for the open blue, but their chant is unique to their locker and book.
Me: wtf
― more difficult than I look (Aimless), Friday, 23 February 2024 20:11 (eight months ago) link
free real time (as you type!) image generation. wtf. https://fastsdxl.ai/
― ledge, Monday, 26 February 2024 09:08 (eight months ago) link
(types in 'a hand') ok you get what you pay for
― ledge, Monday, 26 February 2024 09:14 (eight months ago) link
xxp https://www.cheatsheet.com/wp-content/uploads/2021/10/Seinfeld-1.jpg
― assert (matttkkkk), Monday, 26 February 2024 10:34 (eight months ago) link
pathless traveled
this is pretty amazing tbh
― Lavator Shemmelpennick, Monday, 26 February 2024 14:42 (eight months ago) link
Literally why can AI not be taught to do hands? I typed in “Shakespeare’s hand” and got nightmare fuel
― It was on a accident (hardcore dilettante), Friday, 1 March 2024 05:57 (eight months ago) link
I was surprised to find out that what I had taken to be a standing AI process was still being done by humans. I thought subtitling in a Zoom or live event was now mainly AI. I'd been at a talk on memory and tried to make an analogy between glitching subtitle transcription and slight errors in memory. As in memory not being quite as accurate as believed. I later found out that the subtitling appeared to be done by a person in real time. I'd already been aware of the old UN practice of using simultaneous translators, who were mainly female. So as a speaker spoke one language, delegates who spoke another would get the translation in their earpiece. Found the subject interesting. But had just assumed subtitling was done by program. Though possibly the idea that you could see things being corrected might go against that. I do enjoy watching the Chinese whispers of synonymical phrases being substituted for what the speaker is saying.
It did just leave me wondering if the human transcriber is a standing thing, or what ratio of human to computer program transcription there was.
― Stevo, Friday, 1 March 2024 06:20 (eight months ago) link
I think I used the word synonymical for homophone. It's 2 words that sound alike but are spelt differently that you get frequently. Plus an element of trying to transcribe a little ahead of oneself to keep simultaneous to the speaker. Or with as little lag as possible. I thought the lack of comprehension of the subject being talked about was an indication of AI, but it's probable that a human transcriber is not an expert in the field either.
― Stevo, Friday, 1 March 2024 06:27 (eight months ago) link
Hi Stevo, area of expertise alert.
'Captioning' refers to same language services, usually defined as for the deaf and hard of hearing, though in practice used much more widely (neurodivergent audiences, second language users, anyone in an acoustically contended environment, standard gen z usage etc)
Subtitling in most markets (though confusingly not the UK) tends to refer to translation subtitles.
Finally, everything I say below is mainly covering real time services. It applies to pre-recorded content as well, but expectations are often much higher there (errors have always been a factor in real time services). Caption 'files' (rather than streams) are often archived or repeated, or used as authoritative data sources, so the required level of accuracy is higher. Feature films will expect these to be as accurate as possible and require a high level of human QC or origination. Same things apply, but will happen a bit more slowly.
**
Zoom captioning is nearly always done automatically. It is possible to have a workflow with a person generating the captions - for high profile or otherwise important meetings businesses will still occasionally specify the need for a person to be generating the captions. But the default is automatic captioning.
Human captioning can be done in a number of ways. Using speaker dependent speech recognition (eg Dragon NaturallySpeaking) to 'respeak' the relevant audio content of a programme has been the norm for a while in many markets (since ~2005 in the UK) due to the lack of a large pool of stenographers (often seen as synonymous with 'court reporters' - you'll recognise the machine they use). (For a long time the only remaining stenography college in the British Isles was in Eire. In the US stenography has persisted for much longer due to there being more stenographers - each state had at least one school. That's changed a fair bit, at least in part because stenographers tend to be more costly than respeaking without (depending on who you talk to) that much or any difference in accuracy.)
These days in the US a mixture of stenography and respeaking will still be used for high profile network and broadcast tv.
Automatic Speech Recognition (not individual speaker dependent) is increasingly used to generate captions (it's what you'll get by default if you switch on youtube captions, for instance). It can be very good, and these days for much content, appropriately managed, it can be as good if not better than human generated captions (you mention that human captioners won't always be domain experts, which is entirely correct; latency is also frequently better with automation). Vocabulary, acoustic content, people talking over each other, and music are all points of vulnerability, but are either being overcome or will be solved soon (cross-talk is one of the harder constraints).
Factors that affect what you see will be the budget of the organisation sourcing the captions, the priority/exposure (regulatory or brand) of the content, suitability of the content for automation.
So to answer your 'ratio' question - it depends what content you're watching. Given the massive amount of user generated content (UGC) these days, you're probably looking at a very high percentage of automation. For premium television, you're probably looking at humans predominating 70/30, with this changing rapidly. For things like Parliamentary or UN proceedings you're often looking at 100% human driven captioning, though if they want to increase reach by covering side committees etc, automation is the obvious answer to avoid breaking their budget. I don't know what CSPAN uses currently though.
AI relevant bit: A couple of years ago it felt like we were hitting some hard constraints on Automatic Speech Recognition, which had made significant progress beyond trigram probability analysis (where the probability of the next word conditioned on the previous two words is estimated from a large corpus of text) with the use of neural networks to generate much more accurate probabilistic/predictive outcomes. The constraints were largely around accurately assessing the intent of a speaker and the wider context of the situation to accurately generate the correct word - ie having enough understanding of meaning to support accurate predictions of what was said next. In the space of two years Large Language Models have blown through a constraint that many (including myself) believed would be persistent and intractable for a long time ('machines don't get meaning' - roughly). Within the constrained space of captioning, computational ability to follow and constitute 'meaning' to a very high degree of probability has been overcome.
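For the curious, the 'trigram probability analysis' mentioned above can be sketched in a few lines of Python. This is a toy illustration with a made-up corpus, not any real ASR system: count how often each word follows a given pair of words, then normalise the counts into probabilities.

    from collections import Counter, defaultdict

    # Toy corpus; real systems estimated these counts from enormous text collections.
    corpus = "the cat sat on the mat and the cat slept on the mat".split()

    # Count how often each word follows each pair of preceding words.
    trigram_counts = defaultdict(Counter)
    for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
        trigram_counts[(w1, w2)][w3] += 1

    def next_word_probs(w1, w2):
        """Estimate P(next word | previous two words) from raw counts."""
        counts = trigram_counts[(w1, w2)]
        total = sum(counts.values())
        return {word: n / total for word, n in counts.items()}

    print(next_word_probs("the", "cat"))  # {'sat': 0.5, 'slept': 0.5}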
Short to medium term nearly all captions you see will be generated by full automation.
The same dynamics affect translation, but completely automatic translation suffers from compound effects - that is to say you need an initial ASR pass to turn the speech into same language text, and then to apply a Machine Translation (MT) pass - so errors generated in the first pass are persisted through to the translation process with new errors introduced.
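To make the compounding concrete, a back-of-the-envelope sketch with made-up numbers (and the simplifying assumption that the two passes fail independently):

    # Illustrative numbers only: errors from the ASR pass persist into the
    # translation pass, so end-to-end accuracy is worse than either pass alone.
    asr_accuracy = 0.95  # hypothetical fraction of words ASR gets right
    mt_accuracy = 0.93   # hypothetical fraction MT then renders correctly

    pipeline_accuracy = asr_accuracy * mt_accuracy
    print(f"end-to-end: {pipeline_accuracy:.1%}")  # ~88.3%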
The very high quality of same language automatic transcription these days and similarly rapid advances (same technologies basically) in MT mean fully automatic translation is extremely good these days. This doesn't mean there aren't difficult areas of judgment in translation which if you were translating a novel say you'd want a translator to think about. But it's about use cases, and the extremely high proportion of the time AI is good enough, or more than good enough.
Parliaments and global political bodies tend still to use simultaneous translators for earpiece services due to the importance of accurately conveying information in real time to delegates/ministers etc etc, but again, this feels like a fairly AI-tractable problem in the mid term.
Worth going in depth here because I think it's a very good example where LLMs have made a massive, tangible difference and AI technologies have completely overturned an industry, in many ways for the better - that is to say lower cost services mean it's easier to provide accessibility or localisation on much much more content. Subtitlers/translators/captioners etc will obviously feel differently, but in terms of the actual people who use it, it's largely been beneficial - usual caveats though around these technologies frequently being used inappropriately, and constituting a degradation to services that previously had people behind them - this is a genuine problem and shouldn't be at all dismissed.
― Fizzles, Friday, 1 March 2024 08:44 (eight months ago) link
Couple of other caveats:
It continues to make sense to have people *somewhere* in the workflow for many use cases, and probably will do for a while (ie 2-5 years), at a 'tuning' or optimisation and QC level.
I think it's the case that some people and organisations would prefer to have people producing the transcription *even if it is worse*. And although this is difficult to quantify, I'm not sure they're wrong. A strong example is in legal judgments - if you assume the law is basically an algorithm, and if you plug in the information that human judges are *extremely* variable in their judgments, you can reasonably (imv) get to the position where it is preferable to have some sort of computational judgment system. But I think most sane people are revolted by the idea. Maybe that's just a hangover from sci-fi depictions of this sort of world. Maybe it's on the Chestertonian principle that voting (or legal adjudication in this case) is like writing love letters: you would rather it were done badly by the person in love than done better by someone else (there is of course an Edmond Rostand play on this). Perhaps though it is a function of being in a transitional period. The obvious end point is to have the automation fully baked into whatever audio/video technical workflow you've got. But at that point it becomes an invisible and 'unaccountable' technology. While there is some risk exposure for the customer, you *prefer* to have a human somewhere in the chain to provide some sort of explanatory function and in-the-moment accountability. Cautious organisations and businesses will move slowly to automation for this reason (see Air Canada).
If you're a broadcaster with any sort of public service or national remit to provide audience reach, whether regional or demographic (diasporas, indigenous peoples) the unevenness of data availability means that those accents and languages will be less well served, meaning those people or peoples will be less well represented on screen and get a second class service. This problem is easily solvable computationally, but less so commercially and in terms of focus.
― Fizzles, Friday, 1 March 2024 09:19 (eight months ago) link
thanks. I should know the word captioning by now. Not sure how conscious I was of the differentiation though.
This was a talk with a remote film maker in a local cinema during a festival with some focus on diversity and disabled rights. So maybe I should have been aware that captioning would have a priority like that. Not sure where the captioners were, since I didn't see them in the cinema with us. Assume they would need a handheld keyboard or something to type into.
― Stevo, Friday, 1 March 2024 10:20 (eight months ago) link
Most captioning is done remotely. How did you know it was a person? (Obv both people and AI make mistakes, but it's possible to tell the difference). If a person, it would either be a stenography machine or someone respeaking (no keyboard).
Or for the sake of completeness, the European *velotype* machine (you still see it in Dutch and Nordic markets I believe)
https://www.youtube.com/watch?v=8dNl2318aY4
― Fizzles, Friday, 1 March 2024 10:32 (eight months ago) link
Thanks Fizzles, that's so interesting! Terminology is annoying - Instagram also use 'captions' to mean the description of the ... post? Whatever you call a 'gram.
Does anyone use AI to generate the first pass and a human to read over and correct contextual errors, or is that too much resource? Or am I wrong in assuming it's quicker to read a stream of text and flag corrections than for a stenographer to just type it in the first place?
― kinder, Friday, 1 March 2024 10:46 (eight months ago) link
I think after I'd commented one of the mcs thanked the group who were doing the captions. Or at another event in the same festival. I think I realised there was a person doing it sometime during the same week anyway.
― Stevo, Friday, 1 March 2024 11:59 (eight months ago) link
Does anyone use AI to generate the first pass and a human to read over and correct contextual errors, or is that too much resource? Or am I wrong in assuming it's quicker to read a stream of text and flag corrections than for a stenographer to just type it in the first place?
AI + human correction happens all the time for captioning and subtitle translation. The key question has always been exactly what you say, kinder - is it going to be quicker for someone to 'originate' these captions/subtitles or correct an ASR file? Content will get channelled through the relevant resource accordingly, either based on ASR confidence scores or on content type.
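A crude sketch of what that channelling could look like, with hypothetical thresholds and field names rather than any vendor's actual workflow:

    # Route each ASR segment either straight through or to a human corrector,
    # based on the recogniser's confidence score. The threshold is made up.
    HUMAN_REVIEW_THRESHOLD = 0.85

    def route_segments(segments):
        """segments: list of dicts like {"text": ..., "confidence": 0.0-1.0}."""
        auto, needs_review = [], []
        for seg in segments:
            (auto if seg["confidence"] >= HUMAN_REVIEW_THRESHOLD else needs_review).append(seg)
        return auto, needs_review

    auto, needs_review = route_segments([
        {"text": "welcome back to the programme", "confidence": 0.97},
        {"text": "the gregory zulu fifty five backpack", "confidence": 0.62},
    ])
    print(len(auto), "published automatically,", len(needs_review), "sent for human correction")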
More and more it's quicker to use AI, and LLMs have enabled substantial improvements in identifying the sort of errors it's easy to miss (and certainly wouldn't be caught by a standard spellcheck), eg a name spelled differently one time from all the other times in a transcript etc.
Again, there are trade offs at the premium end - correcting a text is a different process from assessing it from scratch and making editorial decisions that reflect the creative or editorial intent of the content from the beginning.
― Fizzles, Friday, 1 March 2024 12:14 (eight months ago) link
I'm inferring that this is better suited for situations like speeches to the UN where the text is, basically, more boring? Worse on stereotypical beatnik poetry?
As a world-class boring mumbler, I'm heartened.
Boring as in "_everyone_ already knows what your next word / sentence / paragraph will be", but also boring as in monotone? It's a stereotype, but I imagine it also makes it easier for the human translators?
― Andrew Farrell, Friday, 1 March 2024 12:35 (eight months ago) link
excellent posts, thank u fizzles
― z_tbd, Friday, 1 March 2024 15:53 (eight months ago) link
― Andrew Farrell, Friday, March 1, 2024 12:35 PM
I mean, certainly I imagine beatnik poetry would present more problems than more standardised sentences, though the main challenge is usually acoustics - if it was clear, there's no reason it wouldn't get the words. Presenting them appropriately would be a different matter.
And ofc I hadn't picked up on the main UN thing, which is that earpiece simultaneous translation is a spoken service. Synthetic voice services are extremely good these days - but it's another layer of automation. Longer term, environments of interacting specialist LLM/GenAI modules will produce extremely good outputs for different elements of a service (recognising non-verbal relevant acoustic information for example, or capturing lyrics from songs).
Domain specific and formal or institutional vocabulary are the main challenge in, say, the UN, but yes, as a much more structured environment it feels extremely tractable to automation. It's been a while since I looked at UN service requirements but I'd be astonished if automation wasn't there somewhere in the mix (low level committees etc).
And yes! generally if something's easy for a person it will be easy for automation - if you're struggling to understand someone, the machine will as well. Where people have difficulties is in recollecting long strings of data or numbers for instance, with new speech data coming in - extremely computationally taxing on the short-term memory. Machine processes are much worse at editorial decisions though - generally they handle disfluencies (um, ah) pretty well these days, but choosing which clauses or bits of speech to take out to allow for readability is much more challenging. Automation will generally get *everything* for better or worse; in many cases completeness is a virtue or requirement, but not always.
― Fizzles, Friday, 1 March 2024 16:59 (eight months ago) link
https://i.imgur.com/dG5bBAy.png
― lag∞n, Friday, 1 March 2024 17:04 (eight months ago) link
lol
I mean
ollss
― kinder, Friday, 1 March 2024 17:06 (eight months ago) link
resopionsics
the practice of reply opinions, very important for online tech
― lag∞n, Friday, 1 March 2024 17:18 (eight months ago) link
scadillity is often undervalued in a bot, but i select only high-scadillity models
― a single gunshot and polite applause (Hunt3r), Friday, 1 March 2024 17:54 (eight months ago) link
I thought that said, resopsionics, which sounds awesome.
― il lavoro mi rovina la giornata (PBKR), Friday, 1 March 2024 18:58 (eight months ago) link
I know we all struggle to emphasize the Flecalibibtss in our daily lives.
― Maxmillion D. Boosted (jon /via/ chi 2.0), Friday, 1 March 2024 18:59 (eight months ago) link
Oh no someone's tripped while carrying a plate of Fallout past ChatGPT's input hopper.
― Andrew Farrell, Friday, 1 March 2024 20:35 (eight months ago) link
has anybody used Whisper for dictation? there's a Mac app that will run it locally - https://goodsnooze.gumroad.com/l/macwhisper
it's amazingly good. the built-in Apple dictation engine is like, just basically pathetic next to it, especially if you're doing multi-person transcription. i really can see how stuff like subtitles is being thoroughly transformed (cf Fizzles' great explanation earlier). with MacWhisper you can export the file as a subtitles file - and you can even do an automatic translation of it
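for anyone who'd rather skip the app, the underlying open-source model can be run directly - roughly like this with the openai-whisper Python package (assumes pip install openai-whisper and ffmpeg are available; file names here are made up):

    import whisper

    def to_timestamp(seconds: float) -> str:
        """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
        ms = int(seconds * 1000)
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1_000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    model = whisper.load_model("base")          # bigger models: slower, more accurate
    result = model.transcribe("interview.mp3")  # hypothetical input file

    # Write the transcription segments out as a simple .srt subtitle file.
    with open("interview.srt", "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            f.write(f"{i}\n")
            f.write(f"{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}\n")
            f.write(f"{seg['text'].strip()}\n\n")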
this is the sort of thing I meant a few months ago when i tried to counter the narrative that AI is actually just a bunch of hooey - i really don't think it is, for tasks like this, where suddenly there is this just far superior way of doing something repetitive and tedious
― Humanitarian Pause (Tracer Hand), Saturday, 9 March 2024 22:57 (eight months ago) link
That's an excellent example of a very sophisticated computerized tool for automating a nearly mindless task, but calling it "artificial intelligence" seems like the wrong box to put it in. It's just massaging input and spitting out another version of it. A much cruder example of the exact same idea is converting html into bbcode.
― more difficult than I look (Aimless), Saturday, 9 March 2024 23:42 (eight months ago) link
it is AI - that's my point, is that it is literally using a neural net in order to be this good. it is exactly the right box to put it in. the actual useful things that AI is going to accomplish for us are going to be things like this imo, not, like, i dunno mixing martinis and starting nuclear wars or whatever it is that people think AI is
https://openai.com/research/whisper
― Humanitarian Pause (Tracer Hand), Sunday, 10 March 2024 00:01 (eight months ago) link
it is literally using a neural net in order to be this good
it's a powerful technique, but I think calling it "neural" is misleading. a "neural net" doesn't involve neurons. this would be irrelevant, except for the fact that while research neurologists can examine neurons down to a molecular level, they can't actually tell you how neurons produce intelligence. otoh, a computer scientist can describe how a "neural net" is constructed and how it functions to produce its outputs. so, the presumption that "neural nets" function using an identical model to organic neural activity can't be substantiated.
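to be fair to the terminology, the machinery is at least easy to write down - the basic unit is nothing but a weighted sum pushed through a nonlinearity. a toy sketch with random numbers, nothing trained:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)        # toy input vector
    W = rng.normal(size=(4, 3))   # toy weights (learned from data in a real net)
    b = rng.normal(size=4)        # toy biases

    # One "layer": ReLU(Wx + b). Deep nets stack many of these - no neurons, just arithmetic.
    hidden = np.maximum(0, W @ x + b)
    print(hidden)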
but, yeah, that software is amazingly good at the job it was programmed for.
― more difficult than I look (Aimless), Sunday, 10 March 2024 00:18 (eight months ago) link