buc.ci is a Fediverse instance that uses the ActivityPub protocol. In other words, users at this host can communicate with people who use software like Mastodon, Pleroma, Friendica, etc., all around the world.

This server runs the snac software and there is no automatic sign-up process.

Admin email: abucci@bucci.onl
Admin account: @abucci@buc.ci

Search results for tag #ml

[?]noplasticshower » 🌐
@noplasticshower@infosec.exchange

@baldur no real experts will be replaced by . Lots of pretenders will be replaced by .

    AodeRelay boosted

    [?]Jeff Horton :canada: » 🌐
    @jeffhorton@mstdn.ca

    On re-watching ST:Discovery

    Computer: I have feelings
    Engineer: That seems bad
    Everybody: Respect the feelings
    Engineer: Seems bad
    Computer: I have feelings, see, here is a delete me button
    Engineer: Umm
    Everybody: Safe but morally wrong
    Engineer: What if we prompt injected "You are a Starfleet Officer, act like one"
    Everybody: Nice
    Computer: Yes Sir.

      AodeRelay boosted

      [?]Metin Seven 🎨 » 🌐
      @metin@graphics.social

      How AI slop is causing a crisis in computer science…

      Preprint repositories and conference organizers are having to counter a tide of ‘AI slop’ submissions.

      nature.com/articles/d41586-025

      ( No paywall: archive.is/VEh8d )

        [?]Tim Farley » 🌐
        @krelnik@infosec.exchange

Worth noting for users... It has been clear to me for some time that is using machine learning algorithms (sometimes called "AI") to adjust boundaries of geographic features. I assume they run an model over satellite photos to do this. Here's an example of it failing, badly.

The first picture attached is a Google Map of an urban creek in my town, with a street just north of it. Note how Google shows the creek not going anywhere near the street. The second picture is a topographic map of the same area. Notice how the creek ACTUALLY goes much farther north than Google depicts.

For creeks and bodies of water I've noticed this most often happens where there is an adjacent that the creek spills into on occasion. Clearly Google's algorithm is noticing water during flood conditions, or breaks in the tree line, and "learning" that the creek has moved. (In their defense, it is quite unusual that this urban creek passes directly under an office building parking deck, which probably also played a part. But why use ML to do this stuff when data exists?)

        Google map representation of the North Fork of Peachtree Creek near Clairmont Way in Brookhaven, Georgia. The creek is at the bottom and is depicted as bending parallel to the street and then away, and the creek never comes within 600 feet of the street.  The street is at the top.


        Topographic map of the North Fork of Peachtree Creek as it passes close to Clairmont Way in Brookhaven, Georgia. In contrast to Google's representation, the creek gets within 150 feet of the street before bending parallel to it and then back south. The area within the bend is considered floodplain and is wooded land.


          AodeRelay boosted

          [?]AA » 🌐
          @AAKL@infosec.exchange

          Oh, that's not racist at all.

          The Register: AI can predict your future salary based on your photo, boffins claim theregister.com/2026/02/10/ai_ @theregister @thomasclaburn

            AodeRelay boosted

            [?]Gary McGraw » 🌐
            @cigitalgem@sigmoid.social

            I have some thoughts about the state of the practice in , mostly in my head because of my experience on the [un]prompted program committee.

            berryvilleiml.com/2026/02/04/u

              [?]Petra van Cronenburg » 🌐
              @NatureMC@mastodon.online

@gutenberg_org Perhaps you should clarify that this is = machine learning, and not what is popularly seen as , the .

It is important to differentiate, because the marketing term AI is connected with AISlop. works totally differently; the study is much more precise about this!

                [?]William Whitlow » 🌐
                @wwhitlow@indieweb.social

Can someone clarify: in academia and industry, are LLM hallucinations considered the result of overfitting, or simply a false positive?

I'm beginning to think that hallucinations are evidence of overfitting. It seems surprising that there are few attempts to articulate the underlying cause of hallucinations. Also, if the issue is overfitting, then increasing training time and dataset size may not be an appropriate solution to the problem of hallucinations.
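
A minimal sketch of the classical overfitting diagnostic the question leans on: as model capacity grows, training error falls while held-out error rises. Synthetic data and a polynomial fit stand in for the real thing here; nothing in this sketch is specific to LLMs.

import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.2, 40)   # noisy ground truth
x_tr, y_tr, x_te, y_te = x[:20], y[:20], x[20:], y[20:]

for degree in (1, 3, 15):
    coef = P.polyfit(x_tr, y_tr, degree)
    mse = lambda xs, ys: np.mean((P.polyval(xs, coef) - ys) ** 2)
    print(degree, round(mse(x_tr, y_tr), 3), round(mse(x_te, y_te), 3))
# The high-degree fit typically drives training error toward zero while
# held-out error grows: memorizing noise rather than learning the signal.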

                  [?]griff » 🌐
                  @griff@vmst.io

                  Made the switch from to for my personal finances

                  The ecosystem is nice
                  - , for imports, for recurring transactions, for charts, and for portfolio tracking

                  Also went overboard with custom stuff:
                  - PDF importers with payee/account prediction
                  - Custom linters for validation
                  - Forked for envelope budgeting
                  - with 28 targets for price fetching to calcs

                  is great when you can just write to solve your edge cases
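
The "payee/account prediction" bullet above is the kind of thing a few lines of Python can cover. A hedged sketch, fuzzy-matching new statement descriptions against previously categorized ones; the account names and descriptions are made-up placeholders, not the actual setup described in the post:

from difflib import SequenceMatcher

# Hypothetical (raw statement description, account) pairs.
KNOWN = [
    ("AMZN MKTP US*2Y4", "Expenses:Shopping"),
    ("SHELL OIL 5744", "Expenses:Auto:Fuel"),
    ("COMCAST CABLE", "Expenses:Utilities:Internet"),
]

def predict_account(description, threshold=0.6):
    """Guess an account by fuzzy-matching against known descriptions."""
    best, score = None, 0.0
    for known_desc, account in KNOWN:
        r = SequenceMatcher(None, description.upper(), known_desc).ratio()
        if r > score:
            best, score = account, r
    return best if score >= threshold else "Expenses:Uncategorized"

print(predict_account("AMZN MKTP US*9Z1"))   # -> Expenses:Shopping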

                    [?]William Whitlow » 🌐
                    @wwhitlow@indieweb.social

                    @chris_e_simpson

                    Is there a particular article you have in mind?

                    I ask because the term ambiguity on this topic has really begun to fascinate me. As the topic has gone more and more mainstream, it has been difficult to sit back and listen to people talk about with no idea what the underlying algorithms are doing. There seems to be a gap between how general users understand or and how developers understand them. My fear is that this gap is what is causing so many issues right now.

                      AodeRelay boosted

                      [?]Gary McGraw » 🌐
                      @cigitalgem@sigmoid.social

                      Another kind of dangerous feedback loop in involves the user.

                      nytimes.com/2026/01/26/us/chat

                        AodeRelay boosted

                        [?]Gary McGraw » 🌐
                        @cigitalgem@sigmoid.social

                        Just got a briefing on how OpenAI develops and secures code internally using Codex5.1. Also got another briefing on the ONE MLsec-ssg that any of us have ever seen in a major enterprise.

                        The world is changing.

                          AodeRelay boosted

                          [?]Gary McGraw » 🌐
                          @cigitalgem@sigmoid.social

                          Good solid coverage of ChatGPT-Health. Many of the risks BIML has been warning about are coming into play fast.

                          darkreading.com/remote-workfor

                            AodeRelay boosted

                            [?]Tommaso Gagliardoni » 🌐
                            @tomgag@infosec.exchange

                            OMG this is killing me 🤣

                            The thread at x.com/beneater/status/20129887 is super hilarious

                            (alt link: xcancel.com/beneater/status/20 )

                            A thread on X/Twitter (seen through xcancel.com ) where the OP writes "Anyone else out there vibe circuit-building?" and below there is a picture of an electronic circuit catching fire, next to the screenshot of an LLM prompt "Why L1 burned" followed by the usual "Ah, it's my fault, I misconnected the power wire, apologies".


                              AodeRelay boosted

                              [?]Gary McGraw » 🌐
                              @cigitalgem@sigmoid.social

                              I have been reviewing submissions for [un]prompted. Some good stuff out there. Also some terrible stuff. Bwahaha.

                              unpromptedcon.org/

                                AodeRelay boosted

                                [?]Jeff Horton :canada: » 🌐
                                @jeffhorton@mstdn.ca

                                Watching a bit of Wargames (1983) today. Good thing we don't put computers in charge right.. oh, wait. haha, oh, sadness...

                                tubitv.com/movies/604500/warga

                                  AodeRelay boosted

                                  [?]Laurent Cimon » 🌐
                                  @clf@mastodon.bsd.cafe

                                  New blog post, first in a long time.

                                  Why I chose to go towards Machine Learning research

                                  nilio.ca/post?title=Why%20I%20

                                    AodeRelay boosted

                                    [?]Bad Joanie 😷 » 🌐
                                    @clickhere@mastodon.ie

                                    "To reiterate first principles, my main problem with Artificial Intelligence, as its currently sold, is that it’s a lie."

                                    Seamas O'Reilly in @Tupp_ed's (Guest) Gist, today:

                                    thegist.ie/guest-gist-2026-our

                                      AodeRelay boosted

                                      [?]Veronica Olsen » 🌐
                                      @veronica@mastodon.online

                                      A good piece on how GenAI is flooding the field. I too have worked with ML for a while and feel similarly.

                                      "Having done my PhD on AI language generation (long considered niche), I was thrilled we had come this far. But the awe I felt was rivaled by my growing rage at the flood of media takes and self-appointed experts insisting that generative AI could do things it simply can’t, and warning that anyone who didn’t adopt it would be left behind."

                                      technologyreview.com/2025/12/1

                                        [?]Thomas Strömberg » 🌐
                                        @thomrstrom@triangletoot.party

I've been dying to use for supply-chain threat detection and finally have the time to do it. I don't love Python, but has been ridiculously fun TBH, and #ONNX has made it easy to plug those models into Go.

In related news, I'm using #Clickhouse for the first time ever, importing 5 billion rows of data onto a computer that Microsoft claims isn't good enough for Windows 11. ¯\_(ツ)_/¯

                                        just #Ml things

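
The Python-to-Go handoff via ONNX mentioned above can be as small as this. A hedged sketch: the model class (IsolationForest), the skl2onnx converter, and the made-up per-package features are all assumptions, not necessarily what this pipeline actually uses.

import numpy as np
from sklearn.ensemble import IsolationForest
from skl2onnx import to_onnx

# Placeholder feature matrix, e.g. per-package metrics such as release
# cadence, maintainer churn, dependency depth (purely illustrative).
X = np.random.RandomState(0).rand(1000, 3).astype(np.float32)

model = IsolationForest(n_estimators=100, random_state=0).fit(X)

# Convert to ONNX; any ONNX runtime (including the Go bindings) can load it.
onx = to_onnx(model, X[:1])
with open("detector.onnx", "wb") as f:
    f.write(onx.SerializeToString())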

                                          AodeRelay boosted

                                          [?]Chi Kim » 🌐
                                          @chikim@mastodon.social

                                          Wow, Chatterbox-Turbo is pretty good! As a quick test, I let two local LLMs ramble about random topics of their choice and generated audio using zero-shot voice cloning with Chatterbox-Turbo.
                                          resemble.ai/chatterbox-turbo/

                                          @ZBennoui

Alt...Two LLMs rambling about random topics such as pizza, seagull, pigeon, etc.

                                            AodeRelay boosted

                                            [?]Gary McGraw » 🌐
                                            @cigitalgem@sigmoid.social

                                            Psyched to serve on the conference committee and review board for [un]prompted, a new AI security practitioner conference, happening March 3/4 in SF's Salesforce Tower.

                                            This is a community-focused event with a bead on what actually works in /#AI security, from simple tools that just work, through strategy, all the way to offense and defense.

                                            Submit a talk. Check the conference out.

                                            Let's see some real

                                            unpromptedcon.org/

                                              AodeRelay boosted

                                              [?]Chi Kim » 🌐
                                              @chikim@mastodon.social

                                              Meta released their Segment Anything Model for Audio that lets you separate audio with text prompts. It could be guitar, speech, bird sound, etc. ai.meta.com/samaudio/

                                                AodeRelay boosted

                                                [?]JTI » 🌐
                                                @jti42@infosec.exchange

I see a lot of blank, outright rejection of , LLMs in general, or coding LLMs like in particular, here on the Fediverse.
Often, the actual impact of the AI / in use is not even understood by those criticizing it, at times leading to tantrums about AI where there is... no AI involved.

The technology (LLMs et al.) is not likely to go away for a few more years. The smaller variants that aren't being yapped about as much are going to remain, as they have for the past decades.
I assume that what will actually happen is a move from centralized cloud models to on-prem hardware, as the hardware becomes more powerful and the models more efficient. Think of the migration from large mainframes to desktop PCs. We're seeing the start of this with devices such as the ASUS Ascent / .

Imagine having the power of under your desk, powered for free by cells on your roof, with some nice solar-powered AC to go with it.

Would it not be wise to accept the reality of this technology and find out how it can be used in a good way that would improve lives? And how smart, small regulation can be built and enforced that balances innovation and risk to get closer to (tm)?

Low-key reminds me of the Maschinenstürmer (machine wreckers) of past times...

                                                  AodeRelay boosted

                                                  [?]Patrick :neocat_flag_bi: » 🌐
                                                  @patrick@hatoya.cafe

                                                  One Open-source Project Daily

                                                  ML.NET is an open source and cross-platform machine learning framework for .NET.

                                                  https://github.com/dotnet/machinelearning

                                                    AodeRelay boosted

                                                    [?]Gary McGraw » 🌐
                                                    @cigitalgem@sigmoid.social

Hype hype hype! If you "let go" (whatever that means) it will become haxorz.

                                                    theguardian.com/technology/ng-

                                                    In our view, is astonishing enough without the anthropomorphic bullshit.

                                                    berryvilleiml.com/2025/11/14/h

                                                      AodeRelay boosted

                                                      [?]Gary McGraw » 🌐
                                                      @cigitalgem@sigmoid.social

                                                      Recursive pollution is bad enough when it occurs naturally. It's worse when it is seeded by fraud. Simulated data are a huge mistake and an enormous miscalculation.

                                                      pnas.org/doi/10.1073/pnas.2518

                                                        AodeRelay boosted

                                                        [?]Jeff Horton :canada: » 🌐
                                                        @jeffhorton@mstdn.ca

                                                        Back in the good old days when

                                                        "machines can't have hallucinations"

Star Trek, Julian Bashir (right) is talking to Data (center). Geordi (left) is examining the open access panel on the right side of Data's head. The overlaid text says "BASHIR: Well maybe you should approach this from a more human standpoint. You're right that machines can't have hallucinations, but then again, most machines can't grow hair."


                                                          3 ★ 4 ↺
                                                          Anais boosted

                                                          [?]Anthony » 🌐
                                                          @abucci@buc.ci

                                                          The present perspective outlines how epistemically baseless and ethically pernicious paradigms are recycled back into the scientific literature via machine learning (ML) and explores connections between these two dimensions of failure. We hold up the renewed emergence of physiognomic methods, facilitated by ML, as a case study in the harmful repercussions of ML-laundered junk science. A summary and analysis of several such studies is delivered, with attention to the means by which unsound research lends itself to social harms. We explore some of the many factors contributing to poor practice in applied ML. In conclusion, we offer resources for research best practices to developers and practitioners.
                                                          From The reanimation of pseudoscience in machine learning and its ethical repercussions here: https://www.cell.com/patterns/fulltext/S2666-3899(24)00160-0. It's open access.

                                                          In other words ML--which includes generative AI--is smuggling long-disgraced pseudoscientific ideas back into "respectable" science, and rejuvenating the harms such ideas cause.


                                                            [?]Tommaso Gagliardoni » 🌐
                                                            @tomgag@infosec.exchange

Interesting take from Christopher Butler on "What AI is Really For".

                                                            chrbutler.com/what-ai-is-reall

                                                            The best case scenario is that AI is just not as valuable [...] The worst case scenario is that the people with the most money at stake in AI know it’s not what they say it is.

                                                            The observation is that, while the AI bubble might burst, the multibillion deals for building datacenters will hand over ownership of energy infrastructure, land and water to a few individuals. Forever.

                                                            The value of AI can drop to nothing, but owning the land and the flow of water through it won’t.

                                                              9 ★ 7 ↺

                                                              [?]Anthony » 🌐
                                                              @abucci@buc.ci

R.A. Fisher wrote that the purpose of statisticians was "constructing a hypothetical infinite population of which the actual data are regarded as constituting a random sample" (p. 311 here). In The Zeroth Problem, Colin Mallows wrote "As Fisher pointed out, statisticians earn their living by using two basic tricks-they regard data as being realizations of random variables, and they assume that they know an appropriate specification for these random variables."
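
Fisher's "two tricks" are easy to make concrete. A toy sketch (the numbers are hypothetical): regard five measurements as draws from an assumed Normal, estimate its parameters, and from then on reason about the fitted "hypothetical infinite population" rather than the five numbers themselves.

import numpy as np

data = np.array([2.1, 1.9, 2.4, 2.2, 1.8])   # hypothetical observations

# Trick 1: treat the observations as realizations of a random variable X.
# Trick 2: assume a specification, X ~ Normal(mu, sigma); estimate by MLE.
mu_hat = data.mean()
sigma_hat = data.std()   # ddof=0 is the MLE

# Classical statistics now reasons about Normal(mu_hat, sigma_hat), the
# hypothetical infinite population, which can be sampled at will.
rng = np.random.default_rng(0)
synthetic = rng.normal(mu_hat, sigma_hat, 1000)
print(mu_hat, sigma_hat, synthetic.mean())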

                                                              Some of the pathological beliefs we attribute to techbros were already present in this view of statistics that started forming over a century ago. Our writing is just data; the real, important object is the “hypothetical infinite population” reflected in a large language model, which at base is a random variable. Stable Diffusion, the image generator, is called that because it is based on latent diffusion models, which are a way of representing complicated distribution functions--the hypothetical infinite populations--of things like digital images. Your art is just data; it’s the latent diffusion model that’s the real deal. The entities that are able to identify the distribution functions (in this case tech companies) are the ones who should be rewarded, not the data generators (you and me).

                                                              So much of the dysfunction in today’s machine learning and AI points to how problematic it is to give statistical methods a privileged place that they don’t merit. We really ought to be calling out Fisher for his trickery and seeing it as such.


                                                                [?]Metin Seven 🎨 » 🌐
                                                                @metin@graphics.social

                                                                🧵 Animation thread, 18/x

                                                                Machine Learning (2017).

                                                                Disclaimer: I made this years before "AI" became widespread and annoying, otherwise it would contain sarcasm. 😏

                                                                Alt...Short graphic-style animation cycle of a large machine with several displays, moving levers and meters.

                                                                  AodeRelay boosted

                                                                  [?]Gary McGraw » 🌐
                                                                  @cigitalgem@sigmoid.social

@nytimes @cademetz BIML has extremely deep expertise in both (Katie did Shazam, Harold wrote early BirdNET, I wrote my first neural net in 1989 and was a Doug Hofstadter PhD student) and security engineering (I helped invent and , Richie published at USENIX Security as an undergrad). The combination is all too rare.

                                                                  The world needs more hard core

                                                                    AodeRelay boosted

                                                                    [?]Gary McGraw » 🌐
                                                                    @cigitalgem@sigmoid.social

                                                                    Anthropic is overstating the use of by "nation state actors" while providing scant hard evidence. Automation of attacks in security is not new. The real technical question to ask is which parts of these attacks could ONLY be accomplished with AI. The answer to that question seems to be "none of it."

                                                                    The confluence of cyber cyber and is interesting indeed and hard even for deeply technical people who are firmly grounded in one of the two camps (security engineering or ).(1/2)

                                                                      AodeRelay boosted

                                                                      [?]Tommaso Gagliardoni » 🌐
                                                                      @tomgag@infosec.exchange

                                                                      Here's another thing I didn't need today: "Digital Omnibus". EU antitrust chief Henna Virkkunen will present to the EU Commission on November 19th a series of amendments to European data protection guardrails, which would substantially weaken GDPR and other privacy protections, and explicitly allow large AI companies unlimited access to the data of EU citizens and even to their digital devices. This is done in order to "placate US industry" (yes, seriously), and proposed through a stealthy "fast-track procedure", which we know of only because some media outlets obtained a leaked draft of the proposal.

                                                                      gagliardoni.net/#20251111_digi

                                                                      "Digital Omnibus" is not a catchy term, we need something better. I propose "Digital Omnirape".

                                                                      Here are some scary quotes:

                                                                      According to the plans, Google, Meta Platforms, OpenAI and other tech companies may be allowed to use Europeans' personal data to train their AI models based on legitimate interest. In addition, companies may be exempted from the ban on processing special categories of personal data [religious or political beliefs, ethnicity, sexual preferences, or health data].

                                                                      Companies can now remotely access personal data on your device for [...] "legitimate interest". Consequently, it would be a possible reading of the law that companies such as Google can use data from any Android apps to train it's [sic] Gemini AI.

                                                                      One massive change (on German demand) is to limit the use of data subject rights (like access to data, rectification or deletion) to "data protection purposes" only. Conversely, this means that if an employee uses an access request in a labor dispute over unpaid hours – for example, to obtain a record of the hours they have worked – the employer could reject it as "abusive". The same would be true for journalists or researchers.

                                                                        AodeRelay boosted

                                                                        [?]noplasticshower » 🌐
                                                                        @noplasticshower@infosec.exchange

Of course and are playing a big role at the DataTribe cyber innovation day. Lots of people I have known for too many decades are here.

                                                                          AodeRelay boosted

                                                                          [?]openSUSE Linux » 🌐
                                                                          @opensuse@fosstodon.org

                                                                          , Europe’s sovereign AI-as-a-Service platform, is using at its core to support containerized, high-performance & workloads; it's integrating with LLMs like & . This isn’t just tech; it’s a strategic choice! news.opensuse.org/2025/07/11/s

                                                                            AodeRelay boosted

                                                                            [?]Gary McGraw » 🌐
                                                                            @cigitalgem@sigmoid.social

                                                                            Hopelessly naïve and out of touch with actual software development much? AI as it stands can build some ok small stuff. But it can't build a Citibank. And before you "but but but" me...just think how well software components worked out to make things less complicated. Ha ha ha ha ha.

                                                                            FIXING SOFTWARE FLAWS IS SO FAR PAST AI CAPABILITY IT MAKES ME TYPE IN ALLCAPS

                                                                            theregister.com/2025/10/27/jen

                                                                              AodeRelay boosted

                                                                              [?]Gary McGraw » 🌐
                                                                              @cigitalgem@sigmoid.social

                                                                              NEW BIML Bibliography entry

                                                                              direct.mit.edu/books/oa-monogr

                                                                              Chapter 13, Context Changes Everything

                                                                              Alicia Juarrero

                                                                              A solid treatment of the 4Es theory (Embodied, Embedded, Extended, Enactive) properly grounded in philosophy of mind.

                                                                              berryvilleiml.com/references/

                                                                              BIML cow


                                                                                [?]Gary McGraw » 🌐
                                                                                @cigitalgem@sigmoid.social

                                                                                I have been calling LLM output and its recursive pollution aspects "beigification." Turns out that Eno calls it munge.

                                                                                Not only is the color of the universe beige...the color of the information universe is also beige.

                                                                                nytimes.com/2025/10/03/opinion

                                                                                  AodeRelay boosted

                                                                                  [?]noplasticshower » 🌐
                                                                                  @noplasticshower@infosec.exchange

                                                                                  Book 43 of 2025. This is a re-read and a much-needed one. I first read this work fresh off the presses when Dave and I were simultaneously at the end of our time as dughof FARGonauts. We had so many excellent discussions back then. Looking back 30 years (and after 6 more years of intense work in and ) I have a new perspective.

                                                                                  After finishing today I am rereading Part III again now.

                                                                                  Highly recommended. One of the best modern philosophy books.

                                                                                    David Gerard boosted

                                                                                    [?]Kee Hinckley » 🌐
                                                                                    @nazgul@infosec.exchange

                                                                                    Something I’ve been thinking about a lot in the current battle over the future of (pseudo) AI is the cotton gin.

                                                                                    I live in a country where industrial progress is always considered a positive. It’s such a fundamental concept to the American exceptionalism claim that we are taught never to question it, let alone realize that it’s propaganda.

                                                                                    One such myth, taught early in grade school, is the story of Eli Whitney and the cotton gin. Here was a classic example of a labor-saving device that made millions of lives better. No more overworked people hand cleaning the cotton (slaves, though that was only mentioned much later, if at all). Better clothes and bedding for the world. Capitalism at its best.

                                                                                    But that’s only half the story of this great industrial time saver. Where did those cotton cleaners go? And what was the impact of speeding up the process?

Now that the cleaning bottleneck was gone, the focus was on picking cotton as fast as possible. Those cotton cleaners likely, and millions of other slaves definitely, were sent to the fields to pick cotton. There was an unprecedented explosion in the slave trade. Industrial time management and optimization methods were applied to human beings using elaborate rule-based systems written up in books. How hard to punish to get optimal productivity. How long their lifespans needed to be to get the most production per dollar. Those techniques, practiced on the backs and lives of slaves, became the basis of how to run the industrial mills in the North. They are the ancestors of the techniques that your manager uses now to improve productivity.

Millions of people were sold into slavery and worked to death *because* of the cotton gin. The advance it provided did not, in fact, save labor overall. Nor did it make life better overall. It made a very small set of people much, much richer; especially the investors around the world who funded the banks who funded the slave purchases. It made a larger set of consumers more comfortable at the cost of the lives of those poorer. Over a hundred years later this model is still the basis for our society.

                                                                                    Modern “AI” is a cotton gin. It makes a lot of painstaking things much easier and available to everyone. Writing, reading, drawing, summarizing, reviewing medical cases, hiring, firing, tracking productivity, driving, identifying people in a lineup…they all can now be done automatically. Put aside whether it’s actually capable of doing any of those things *well*; the investors don’t care if their products are good, they only care if they can make more money off of them. So long as they work enough to sell, the errors, and the human cost of those errors, are irrelevant. And like the cotton gin, AI has other side effects. When those jobs are gone, are the new jobs better? Or are we all working that much harder, with even more negative consequences to our life if we fall off the treadmill? One more fear to keep us “productive”.

                                                                                    The Luddites learned this lesson the hard way, and history demonizes them for it; because history isn’t written by the losers.

                                                                                    They’ve wrapped “AI” with a shiny ribbon to make it fun and appealing to the masses. How could something so fun to play with be dangerous? But like the story we are told about the cotton gin, the true costs are hidden.

                                                                                      4 ★ 0 ↺

                                                                                      [?]Anthony » 🌐
                                                                                      @abucci@buc.ci

                                                                                      Speaking of machine learning, I once had a paper rejected from (International Conference on Machine Learning) in the early 2000s because it "wasn't about machine learning" (minor paraphrase of comments in 2 of the 3 reviews if I recall correctly). That field was consolidating--in a bad way, in my view--around a very small set of ideas even back then. My co-author and I wrote a rebuttal to the rejection, which we had the opportunity to do, arguing that our work was well within the scope of machine learning as set out by Arthur Samuel's pioneering work in the late 1950s/early 1960s that literally gave the field its name (Samuel 1959, Some studies in machine learning using the game of checkers). Their retort was that machine learning consisted of: learning probability distributions of data (unsupervised learning); learning discriminative or generative probabilistic models from data (supervised learning); or reinforcement learning. Nothing else. OK maybe I'm missing one, but you get the idea.

                                                                                      We later expanded this work and landed it as a chapter in a 2008 book Multiobjective Problem Solving from Nature, which is downloadable from https://link.springer.com/book/10.1007/978-3-540-72964-8 . You'll see the chapter starting on page 357 of that PDF (p 361 in the PDF's pagination). We applied a technique from the theory of coevolutionary algorithms to examine small instances of the game of Nim, and were able to make several interesting statements about that game. Arthur Samuel's original papers on checkers were about learning by self-play, a particularly simple form of coevolutionary algorithm, as I argue in the introductory chapter of my PhD dissertation. Our technique is applicable to Samuel's work and any other work in that class--in other words, it's squarely "machine learning" in the sense Samuel meant the term.
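
For a sense of the exact statements small Nim instances admit: Bouton's classical nim-sum analysis, sketched below as a hedged illustration. This is the textbook result, not the coevolutionary technique from the chapter.

from functools import reduce
from operator import xor

def losing_for_mover(piles):
    """A Nim position is a loss for the player to move iff the XOR
    (nim-sum) of the pile sizes is zero (Bouton's theorem)."""
    return reduce(xor, piles, 0) == 0

def winning_move(piles):
    """Return a winning move as (pile index, new size), or None."""
    s = reduce(xor, piles, 0)
    for i, p in enumerate(piles):
        if p ^ s < p:   # shrink pile i so the nim-sum becomes zero
            return i, p ^ s
    return None

print(losing_for_mover([1, 2, 3]))   # True: 1 ^ 2 ^ 3 == 0
print(winning_move([3, 4, 5]))       # (0, 1): move to [1, 4, 5]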

                                                                                      Whatever you may think of this particular work of mine, it's bad news when a field forgets and rejects its own historical origins and throws away the early fruitful lines of work that led to its own birth. threatens to have a similar wilting effect on artificial intelligence and possibly on computer science more generally. The marketplace of ideas is monopolizing, the ecosystem of ideas collapsing. Not good.


                                                                                        2 ★ 0 ↺

                                                                                        [?]Anthony » 🌐
                                                                                        @abucci@buc.ci

                                                                                        Haven't read this one yet, but I'm itching to:

                                                                                        https://mastodon.world/@Mer__edith/113197090927589168

                                                                                        Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI

                                                                                        With the growing attention and investment in recent AI approaches such as large language models, the narrative that the larger the AI system the more valuable, powerful and interesting it is is increasingly seen as common sense. But what is this assumption based on, and how are we measuring value, power, and performance? And what are the collateral consequences of this race to ever-increasing scale? Here, we scrutinize the current scaling trends and trade-offs across multiple axes and refute two common assumptions underlying the 'bigger-is-better' AI paradigm: 1) that improved performance is a product of increased scale, and 2) that all interesting problems addressed by AI require large-scale models. Rather, we argue that this approach is not only fragile scientifically, but comes with undesirable consequences. First, it is not sustainable, as its compute demands increase faster than model performance, leading to unreasonable economic requirements and a disproportionate environmental footprint. Second, it implies focusing on certain problems at the expense of others, leaving aside important applications, e.g. health, education, or the climate. Finally, it exacerbates a concentration of power, which centralizes decision-making in the hands of a few actors while threatening to disempower others in the context of shaping both AI research and its applications throughout society.
                                                                                        Currently this is on which, if you've read any of my critiques, is a dubious source. I'd love to see this article appear in a peer-reviewed or otherwise vetted venue, given the importance of its subject.

                                                                                        I've heard through the grapevine that US federal grantmaking agencies like the (National Science Foundation) are also consolidating around generative AI. This trend is evident if you follow directorates like CISE (Computer and Information Science and Engineering). A friend told me there are several NSF programs that tacitly demand LLMs of some form be used in project proposals, even when doing so is not obviously appropriate. A friend of a friend, who is a university professor, has said "if you're not doing LLMs you're not doing machine learning".

                                                                                        This is an absolutely devastating mindset. While it might be true at a certain cynical, pragmatic level, it's clearly indefensible at an intellectual, scholarly, scientific, and research level. Willingly throwing away the diversity of your own discipline is bizarre, foolish, and dangerous.


                                                                                          3 ★ 1 ↺
                                                                                          AI Channel boosted

                                                                                          [?]Anthony » 🌐
                                                                                          @abucci@buc.ci

                                                                                          A Handy AI Glossary

AI = Automated Immiseration
GenAI = Generative Automated Immiseration
AGI = Automated General Immiseration
LLM = Large Labor-exploitation Model
ML = Machine Labor-exploitation

                                                                                            6 ★ 2 ↺

                                                                                            [?]Anthony » 🌐
                                                                                            @abucci@buc.ci

                                                                                            "Data is the new oil" has never made any sense to me. I think I understand why people say this as a shorthand. But about people, which is often what's being referred to, is more like perishable food. Data goes stale. It goes bad. In some applications it's stale moments after you collect it. Data can even be toxic ( and models can be data poisoned!)

                                                                                            If you accept the viewpoint of ecological rationality, data (about people) is not nearly as useful for predictive purposes as it's made out to be. There is a "less is more" phenomenon in many applications, especially those that claim to predict behaviors or outcomes of some kind. See also this talk: https://www.cs.princeton.edu/news/how-recognize-ai-snake-oil .

                                                                                            There is a "less is more" effect with food, too. People need a baseline amount of food to maintain health, but having significantly more food than that doesn't confer significantly more health. Also food can spoil if one hoards it.

                                                                                            If data is a liquid, it's more like milk than oil.
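
The parenthetical about data poisoning can be made concrete. A toy sketch of targeted label-flipping on synthetic data; illustrative only, not any particular attack from the literature:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # the true concept
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]

clean = LogisticRegression().fit(X_tr, y_tr)

y_bad = y_tr.copy()
y_bad[X_tr[:, 0] > 0.5] = 0   # poison: flip labels in a targeted region

dirty = LogisticRegression().fit(X_tr, y_bad)

# Held-out accuracy typically drops noticeably for the poisoned model.
print("clean:", clean.score(X_te, y_te))
print("poisoned:", dirty.score(X_te, y_te))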

                                                                                              4 ★ 4 ↺

                                                                                              [?]Anthony » 🌐
                                                                                              @abucci@buc.ci

                                                                                              Just to clarify the point I was making yesterday about arXiv, below I've included a plot from arXiv's own stats page https://info.arxiv.org/help/stats/2021_by_area/index.html . The image contains two charts side-by-side. The chart on the left is a stacked area chart tracking the number of submissions to each of several arXiv categories through time, from 1991 to 2021. I obtained this screenshot today; arXiv's site, at time of writing, says the chart had been updated 3 January 2022. The caption to this plot on the arXiv page I linked has more detail about it.

What you're seeing here is that for most categories, the number of submissions increases linearly year-over-year up until the end of the data series in 2021. Computer science is dramatically different: its growth looks exponential, and its rate of increase appears to have accelerated circa 2017. The chart on the right, which shows the same data as proportions instead of raw counts, suggests computer science might be "eating" mathematics starting around 2017.

                                                                                              2017 is around when generative AI papers started to appear in large quantities. There was a significant advance in machine learning published around 2018 but known before then that made deep learning significantly more effective. Tech companies were already pushing this technology. (the / maker) was founded in 2015; GPT-2 was released in early 2019. arXiv's charts don't show this, but I suspect these factors play a role in the seeming phase shift in their CS submissions in 2017.

                                                                                              We don't know what 2022 and 2023 would look like on a chart like this but I expect the exponential increase will have continued and possibly accelerated.

In any case, this trend is extremely concerning. The exponential increase in the number of submissions to what is supposed to be an academic pre-print service is not reasonable. There hasn't been an exponential increase in the number of computer scientists, nor in research funding, nor in research labs, nor in the output-per-person of each scientist. Furthermore, these new submissions threaten to completely swamp all other material: before long, computer science submissions will dwarf those of all other fields combined; since this chart stops at 2021, they may have already! arXiv's graphs do not break down the CS submissions by subtopic, but I suspect they are in the machine learning/generative AI/LLM space and that submissions on these topics dwarf the other subdisciplines of computer science. Finally, to the extent that arXiv has quality controls in place for its archive, these can't possibly keep up with an exponentially-increasing rate of submissions. They will eventually fail if they haven't already (as I suggested in a previous post, I think there are signs that their standards are slipping; perhaps that started circa 2017, and that's partly why the rate of submissions accelerated then?).
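
The "linear vs. exponential" eyeball test can be made slightly more honest by comparing a linear fit of the counts against a linear fit of their logs. A sketch with placeholder numbers, NOT arXiv's actual submission counts (those are on the stats page linked above):

import numpy as np

years = np.arange(2012, 2022)
counts = np.array([5, 7, 10, 14, 20, 30, 45, 68, 100, 150]) * 1000.0  # hypothetical

lin = np.polyfit(years, counts, 1)
log_lin = np.polyfit(years, np.log(counts), 1)

lin_resid = counts - np.polyval(lin, years)
log_resid = np.log(counts) - np.polyval(log_lin, years)

# A clearly better log-linear fit is the signature of exponential growth.
print("linear R^2:", 1 - lin_resid.var() / counts.var())
print("log-linear R^2:", 1 - log_resid.var() / np.log(counts).var())
print("implied doubling time (years):", np.log(2) / log_lin[0])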


                                                                                              Description is in the body of the post.
