Software giant Adobe is facing a proposed class-action lawsuit. Elizabeth Lyon, an author from Oregon, filed the suit on behalf of a group of affected authors. She alleges that Adobe used a dataset containing pirated copies of their books to pre-train its compact language model, SlimLM.
SlimPajama-627B, the open-source dataset on which SlimLM heavily depends, is under fire for allegedly including roughly 191,000 unauthorized e-books, collectively known as the 'Books3' subset. Meta, Apple, and Anthropic have previously faced legal disputes of their own over datasets that incorporated Books3 content.
SlimLM is designed to handle document assistance tasks on mobile devices. The accusation, however, casts a shadow over the model's technical foundation and points to potential compliance risks in its training data.
Adobe has not yet issued a formal statement on the matter. As regulation tightens, the compliance of training data is rapidly becoming a central issue in the legal disputes that are increasingly common in the tech industry.
