I code and do art things. Check https://private.horse64.org/u/ell1e for the person behind this content. For my projects, https://codeberg.org/ell1e has many of them.
@leminal.space
Windows could be considered uniquely responsible here since on a desktop you might think you're safe from this if you always shut it down properly after use. Yet "fast startup" is a thing. It's a questionable default that should be revisited.
Maybe it's just me, but I find fully automated transit slightly creepy, at least when it comes to passenger trains. I personally prefer when there's still a human on board keeping an eye on things.
We need the equivalent investment now. If average code is cheap, then the scarce resource is no longer the ability to produce it. The scarce resource is the ability to read it, to navigate it
You know what would help a lot with understanding the code one is working on? Writing it yourself without turning your brain off via AI.
But that's an insight the article somehow seems to be missing.
Bug report asking them to undo it: https://bugzilla.mozilla.org/...
Perhaps it's just me, but to me this article feels like belittling the problem by not differentiating between "hated" products and "harmful" products.
If a company makes you work on something that is hated, it's fair and good to have sympathy. If a company makes you work on something that is harmful or unethical, as many perceive Copilot to be, then an article about getting user hate that doesn't talk about ethics at all feels a little tone-deaf.
I don't know, perhaps that's just me. I certainly don't envy the writer for being employed to work on it.
While on some level I agree, perhaps it's time to push Linux phones as well?
For anybody who has any sort of techie knowledge, that could be a better long-term option once Linux phones get more momentum and funding.
Often it is respected, but the resulting problem is that platforms bundle their regular crawling with questionable AI scraping, which effectively blackmails websites into feeding AI.
For example, Googlebot, if enabled, won't just list you for search, but will also scrape your contents for Google's AI. Edit: see https://arstechnica.com/... as source. I imagine LinkedinBot, given it's Microsoft, will feed some other AI of theirs as well, on top of the previews.
Until regulation steps in to require AI bots to separately ask for crawling permission, or to actually get a proper license for reuse of the contents, this situation isn't going to improve.
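To illustrate why this is hard to solve on the website's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleAIBot` name, and the example URL are all made up for illustration. It only shows that permission is granted per user-agent token, not per purpose, so if one crawler serves both search and AI, allowing it allows both.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site wants search listing but no AI reuse.
# "ExampleAIBot" is a made-up name for a crawler that only feeds AI.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: Googlebot
Allow: /
"""


def allowed(agent: str, url: str = "https://example.org/post/1") -> bool:
    """Return whether robots.txt lets the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# Rules match on the user-agent token only. Nothing here can express
# "allowed for search, disallowed for AI" for a single crawler.
results = {agent: allowed(agent) for agent in ("ExampleAIBot", "Googlebot")}
print(results)
```

A dedicated AI crawler can be blocked this way, but a crawler that does double duty can only be allowed or blocked as a whole.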
If the accountability cannot be practically fulfilled, the reasonable policy becomes a ban.
What good is it to say "oh yeah you can submit LLM code, if you agree to be sued for it later instead of us"? I'm not a lawyer and this isn't legal advice, but sometimes I feel like that's what the Linux Foundation policy says.
Ultimately, the policy legally anchors every single line of AI-generated code
How would that even be possible? Given the state of things:
https://dl.acm.org/doi/10.1145/3543507.3583199
Our results suggest that [...] three types of plagiarism widely exist in LMs beyond memorization, [...] Given that a majority of LMs’ training data is scraped from the Web without informing content owners, their reiteration of words, phrases, and even core ideas from training sets into generated texts has ethical implications. Their patterns are likely to exacerbate as both the size of LMs and their training data increase, [...] Plagiarized content can also contain individuals’ personal and sensitive information.
https://www.theatlantic.com/...
Four popular large language models—OpenAI’s GPT, Anthropic’s Claude, Google’s Gemini, and xAI’s Grok—have stored large portions of some of the books they’ve been trained on, and can reproduce long excerpts from those books. [...] This phenomenon has been called “memorization,” and AI companies have long denied that it happens on a large scale. [...] The Stanford study proves that there are such copies in AI models, and it is just the latest of several studies to do so.
The court confirmed that training large language models will generally fall within the scope of application of the text and data mining barriers, [...] the court found that the reproduction of the disputed song lyrics in the models does not constitute text and data mining, as text and data mining aims at the evaluation of information such as abstract syntactic regulations, common terms and semantic relationships, whereas the memorisation of the song lyrics at issue exceeds such an evaluation and is therefore not mere text and data mining
https://www.sciencedirect.com/...
In this work we explored the relationship between discourse quality and memorization for LLMs. We found that the models that consistently output the highest-quality text are also the ones that have the highest memorization rate.
https://arxiv.org/abs/2601.02671
recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models. However, it remains an open question if similar extraction is feasible for production LLMs, given the safety measures [...]. We investigate this question [...] our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.
How does merely tagging the apparently stolen content make it less problematic, given I'm guessing it still won't have any attribution of the actual source (which for all we know, might often even be GPL incompatible)?
But I'm not a lawyer, so what do I know. Even from a non-legal angle, though, what is this road the Linux Foundation seems to embrace of just ignoring the license of projects? Why even have the kernel be GPL then, rather than CC0?
I don't get it. And the article calling this "pragmatism" seems absurd to me.
It doesn't seem to be voluntary at all, from what I can tell from the draft:
"Upon that notification, the provider shall, in cooperation with the EU Centre pursuant to Article 50(1a), take the necessary measures to effectively contribute to the development of the relevant technologies to mitigate the risk of child sexual abuse identified on their services. [...]"
“In order to prevent and combat online child sexual abuse effectively, providers of hosting services and providers of publicly available interpersonal communications services should take all reasonable measures to mitigate the risk of their services being misused for such abuse […]”
These quotes sound mandatory, not voluntary. And let's look at what the referenced technologies are:
"In order to facilitate the providers’ voluntary activities under Regulation (EU) 2021/1232 compliance with the detection obligations, the EU Centre should make available to providers detection technologies [...]"
"The EU Centre should provide reliable information on which activities can reasonably be considered to constitute online child sexual abuse, so as to enable the detection [...] Therefore, the EU Centre should generate accurate and reliable indicators,[...] These indicators should allow technologies to detect the dissemination of either the same material (known material) or of different new child sexual abuse material (new material), [...]"
Oops, it sounds again like mandatory scanning.
Source: https://cdn.netzpolitik.org/...
The new draft seems to do a better job of appearing less mandatory, but it still looks mandatory to me. Feel free to correct me if somebody can show that I'm wrong.