Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology, politics, and science fiction.

Spent many years on Reddit and then some time on kbin.social.

  • 0 Posts
  • 90 Comments
Joined 6 months ago
Cake day: March 3rd, 2024

  • Not necessarily. Curation can also be done by AIs, at least in part.

    As a concrete example, NVIDIA’s Nemotron-4 is a system specifically intended for generating “synthetic” training data for other LLMs. It consists of two separate LLMs; Nemotron-4 Instruct, which generates text, and Nemotron-4 Reward, which evaluates the outputs of Instruct to determine whether they’re good to train on.

    Humans can still be in that loop, but they don’t have to be. And even when they are, the AI can assist with the curation so that it’s not necessarily a huge task.
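The two-model setup described above can be sketched in a few lines. This is a toy illustration of the generate-then-filter structure, not the real Nemotron-4 API; `instruct_model`, `reward_model`, and the 0.5 threshold are all stand-in assumptions.

```python
import random

def instruct_model(prompt: str) -> str:
    """Stand-in generator: returns a candidate response for a prompt."""
    return f"Response to: {prompt} (draft {random.randint(0, 9)})"

def reward_model(response: str) -> float:
    """Stand-in scorer: rates a response between 0 and 1."""
    return random.random()

def build_synthetic_dataset(prompts, threshold=0.5, samples_per_prompt=4):
    """Keep only generations the reward model scores above `threshold`."""
    dataset = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            text = instruct_model(prompt)
            if reward_model(text) >= threshold:
                dataset.append({"prompt": prompt, "response": text})
    return dataset

data = build_synthetic_dataset(["What is a deer?", "Explain LLMs."])
```

The key point is that the filter sits between generation and the training set, so low-quality outputs never make it into the next model's data.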


  • It means that even if AI is having a large environmental impact right now, there’s no reason to assume it can’t be improved. As I said previously, a lot of research is being done on exactly that: methods to train and run AIs much more cheaply than has been possible so far. I see developments along those lines being discussed all the time in AI forums such as /r/localllama.

    Much like with blockchains, though, it’s really popular to hate AI, and “they waste enormous amounts of electricity” is an easy way to justify that. So news of such developments doesn’t spread easily.
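To give a sense of one such efficiency lever, here is some back-of-envelope arithmetic for weight quantization, one of the techniques commonly discussed on /r/localllama. The numbers are illustrative, not benchmarks.

```python
def model_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight-storage size in GB for a model with n_params parameters."""
    return n_params * bits_per_weight / 8 / 1e9

seven_b = 7e9                          # a 7-billion-parameter model
fp16 = model_memory_gb(seven_b, 16)    # 16-bit weights: 14.0 GB
int4 = model_memory_gb(seven_b, 4)     # 4-bit weights:   3.5 GB
```

Cutting the bits per weight from 16 to 4 shrinks the memory footprint fourfold, which is a large part of why quantized models can run on consumer hardware instead of datacenter GPUs.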




  • The term “model collapse” gets brought up frequently to describe this, but it’s commonly very misunderstood. There actually isn’t a fundamental problem with training an AI on data that includes other AI outputs, as long as the training data is well curated to maintain its quality. That already needs to be done with non-AI-generated training data anyway, so it’s not really extra effort. The research paper that popularized the term “model collapse” used an unrealistically simplistic approach: it just recycled all of an AI’s output into the training set for subsequent generations of AI, without any quality control or additional training data mixed in.
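The contrast between the paper's setup and curated training can be sketched as follows. The quality filter and the 30% cap on synthetic data are illustrative assumptions, not values from the paper.

```python
def next_training_set(human_data, ai_outputs, keep_fn, ai_fraction=0.3):
    """Curated update: filter AI outputs for quality and cap their share
    relative to the human-written data that stays in the mix."""
    curated = [x for x in ai_outputs if keep_fn(x)]
    max_ai = int(len(human_data) * ai_fraction)
    return human_data + curated[:max_ai]

def naive_recycle(ai_outputs):
    """The collapse-prone setup: train only on unfiltered AI output."""
    return list(ai_outputs)

human = ["h1", "h2", "h3", "h4"]
ai = ["good-a", "bad-b", "good-c"]
mixed = next_training_set(human, ai, keep_fn=lambda x: x.startswith("good"))
```

In the curated version, each generation still sees the original human data and only vetted synthetic additions; in the naive version, errors compound because nothing ever filters them out.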



  • Apparently, I am. People actually want this

    Thank you for recognizing this. It gets quite frustrating in threads like these about new AI tools being deployed when people declare “nobody wants this!” and I have to explain that there are actually people who do want it. I find many AI tools to be quite handy.

    I tend to get vigorously downvoted at that point, as if that would make the demand “go away” somehow. But sticking heads in the sand doesn’t accomplish anything except making people increasingly out of touch.


  • I don’t know what you’re suggesting is going on here; the images you linked don’t work as far as I can tell. Firefox throws a security certificate error. Are you hosting them yourself and collecting IP addresses from people who click on them? If so, that’s not exactly a Lemmy-specific flaw. That’s just basic Internet 101.