Are eBooks Books and eLibraries Libraries?
A pending lawsuit seeks to settle those questions.
In an op-ed at Inside Higher Ed, a team of distinguished librarians argues, “The Internet Archive Is a Library.”
The Internet Archive, a nonprofit library in San Francisco, has grown into one of the most important cultural institutions of the modern age. What began in 1996 as an audacious attempt to archive and preserve the World Wide Web has grown into a vast library of books, musical recordings and television shows, all digitized and available online, with a mission to provide “universal access to all knowledge.”
Right now, we are at a pivotal stage in a copyright infringement lawsuit against the Internet Archive, still pending, brought by four of the biggest for-profit publishers in the world, who have been trying to shut down core programs of the archive since the start of the pandemic. For the sake of libraries and library users everywhere, let’s hope they don’t succeed.
You’ve probably heard of Internet Archive’s Wayback Machine, which archives billions of webpages from across the globe. Fewer are familiar with its other extraordinary collections, which include 41 million digitized books and texts, with more than three million books available to borrow. To make this possible, Internet Archive uses a practice known as “controlled digital lending,” “whereby a library owns a book, digitizes it, and loans either the physical book or the digital copy to one user at a time.”
Despite its incredible library collections, which serve the needs of millions of people, Hachette Book Group, HarperCollins Publishers, John Wiley & Sons Inc., and Penguin Random House assert that the Internet Archive is not a real library.
In their lawsuit against the Internet Archive, which could extract millions of dollars from the nonprofit organization, the publishers claim that the Internet Archive “badly misleads the public and boldly misappropriates the goodwill that libraries enjoy and have legitimately earned.” In their view, the archive’s “efforts to brand itself as a library” are part of a scheme to “fraudulently mislead” people, circumvent copyright law and limit how much profit publishers can extract from the ebook market. They describe the Internet Archive as a “pirate site” and its business model as “parasitic and illegal” and characterize controlled digital lending as “an invented paradigm that is well outside copyright law.”
The Internet Archive, in turn, argues that the practice of controlled digital lending constitutes fair use under copyright law, and asserts that “libraries have been practicing CDL in one form or another for more than a decade, and hundreds of libraries use it to lend books digitally today.”
To say the least, the language used by the publishers is wildly over the top. There’s rather clearly no intent on the part of the Internet Archive to mislead people, fraudulently or otherwise. But that doesn’t preclude its practices being outside copyright law. And, rather plainly, its existence puts a damper on people’s willingness to pay publishers money for ebooks.
As described, though, the Internet Archive isn’t a “pirate site.” It pays for the content it lends out. And, crucially, if it only lends one copy to one person at a time—as opposed to allowing an infinite number of people to download and keep a copy for themselves—it sounds very much like how libraries have operated for centuries.
Why is it so important to the publishers that the Internet Archive not be identified as a library? Primarily because Congress has long recognized the valuable role that libraries play in our copyright system and has created special allowances in the law for their work. In this suit, the publishers seek to redefine the Internet Archive on their own terms and, in so doing, deny it the ability to leverage the same legal tools that thousands of other libraries use to lend and disseminate materials to our users.
The argument that the Internet Archive isn’t a library is wrong. If this argument is accepted, the results would jeopardize the future development of digital libraries nationwide. The Internet Archive is the most significant specialized library to emerge in decades. It is one of the only major memory institutions to be created from the emergence of the internet. It is, and continues to be, a modern-day cultural institution built intentionally in response to the technological revolution through which we’ve lived.
Libraries are defined by collections, services and values. In The Librarian’s Book of Lists (ALA, 2010), George M. Eberhart offers this definition: “A library is a collection of resources in a variety of formats that is (1) organized by information professionals or other experts who (2) provide convenient physical, digital, bibliographic, or intellectual access and (3) offer targeted services and programs (4) with the mission of educating, informing, or entertaining a variety of audiences (5) and the goal of stimulating individual learning and advancing society as a whole.”
The Internet Archive has all these characteristics. It is a one-of-a-kind independent research library, with its holdings fully available in digital form. Its substantial physical and digital collections are unique. It employs librarians and other information professionals. It is open to all interested readers. It cooperates with peer libraries in support of archiving the information and contemporary discourse as manifested in the World Wide Web. It has an active community of researchers who depend on its collections. And it is an engaged, responsive, resource-sharing partner to hundreds of peer libraries. It is also now an integral part of the interlibrary loan system, sharing its holdings with other libraries worldwide. It shares the keystone values of all libraries: preservation, access, privacy, intellectual freedom, diversity, lifelong learning and the public good. And it does all this without commercial motive as a mission-driven not-for-profit organization.
Because my main interaction with the Internet Archive has been the aforementioned Wayback Machine, which is invaluable, I had little idea it did all of these things, certainly not to that level. So, it’s rather clearly a library. But, again, it’s plausible that eBooks aren’t accounted for under existing laws and regulations, and that the practices of this particular library, accessible as it is to anyone with a decent computer and Internet connection, unfairly infringe on rightsholders. And, alas, the op-ed doesn’t address that point.
Writing at The Nation, journalist Maria Bustillos argues for the importance of digital preservation in an essay titled “Just Because ChatBots Can’t Think Doesn’t Mean They Can’t Lie.” It begins with a dissection of a weird blog post, since taken down, by George Mason economist Tyler Cowen and the observation that ChatGPT and similar programs have filled the Web with nonsense, making Google searches much less reliable than they once were. She argues that being able to easily access any book in the world is a powerful antidote to that.
[I]t took me less than two minutes to access the original, correct, searchable text of The Advancement of Learning at the Internet Archive’s Open Library—for now, that is.
Unless the publishers’ lawsuit against the Internet Archive fails, that free, searchable online book will disappear—along with many millions of other valuable resources currently held at the Open Library. And until it is discovered and challenged, some incalculable amount of false information at Google will likely remain. (The Retreat of Learning, you might call it.)
The outcome of the lawsuit, hinging as it does on defining the legal ownership of digital books, may well determine the right of libraries to own and lend from their own collections, freely and without interference—whether those books are on paper, or digital.
At the heart of the dispute is the publishers’ contention that “ebooks are a fundamentally different product from physical books.” The Internet Archive loans its ebooks to patrons by scanning a paper book in its collection, storing away the paper copy, and loaning just the scan to one patron at a time, a common library practice known as Controlled Digital Lending, or CDL. The publishers claim that these ebooks are “infringing copies of the Publishers’ works that directly compete with the Publishers’ well-established markets for authorized consumer and library ebooks.”
So, this is indeed something different than what I understand to be the normal practice of libraries. (But I may be woefully out of date!) The Internet Archive is not lending either physical books or eBooks but rather scanned copies (PDFs?) of physical books. I have no idea what the law is on this practice.
But in its brief in opposition to the publishers, the Internet Archive argues that its model preserves traditional library practice in a digital world. By conflating licensed ebooks with the Open Library’s scans of physical books, they argue, the publishers expose the lawsuit’s true goal: “Plaintiffs would like to force libraries and their patrons into a world in which books can only be accessed, never owned, and in which availability is subject to the rightsholders’ whim.”
My university library subscribes to various databases of eBooks. They come with limitations as to how long they may be “checked out,” how many pages may be printed, etc. I don’t like that. But I’m hard-pressed to come up with a reason why the rights owner can’t put such restrictions on its properties.
In effect, the Internet Archive is fighting to prevent the devolution of ebooks into Netflix-like, un-ownable licensed products. An “authorized” licensed book that can’t be owned outright isn’t fundamentally a book at all; books that can only be licensed are impermanent objects that can disappear from the virtual shelves of libraries for any number of reasons.
Again, I don’t like this. I find it a little creepy that Netflix can go back and effectively change the past, whether it’s subtle things like changing the name of an actor retroactively or simply deleting scenes or entire episodes that have retrospectively become controversial. But I’m not sure that’s fundamentally different from George Lucas’ multiple revisions of the Star Wars canon or publishers bowdlerizing old books to conform to modern sensibilities.
The stakes in this lawsuit have become clearer in the years since it was filed, as attacks against the freedom of individuals to read, write, teach, and learn have escalated—shading, not infrequently now, into threats of violence: Florida Governor Ron DeSantis taking aim at academic freedom on multiple fronts; literal book bannings and library closings; open aggression against school board members and librarians. Do we want to live in a world where books can disappear with one click of DeSantis’s mouse?
While I find DeSantis’ actions deplorable, unless he’s bought Amazon, I suspect the books in question will continue to exist.
Jennie Rose Halperin, the director of Library Futures, a digital library policy and advocacy organization, told me: “If libraries do not maintain the right to purchase and lend materials digitally as well as physically on terms that are equitable and fair to the public, we risk further exacerbating divides in our democracy and society, as well as the continued privatization of information access. Just because a book is digital does not make it licensed software—a book is a book, in whatever form it takes.”
But that’s rather the question, no?
We decided, roughly a quarter century ago, that a digital copy of a song or collection of songs is different from a physical record or compact disc of the same material.
Many of us of a certain age grew up recording songs off of the radio onto cassette tapes or even copying friends’ entire cassette tapes onto our own cassette tapes. While the record companies, songwriters, and artists presumably disliked that practice, it was considered legal to do so unless it was done at scale. That is, while I could make copies for personal use I couldn’t make a hundred copies of an album and then sell them.
In the late 1990s, though, Napster and other peer-to-peer “sharing” applications upended that model. Suddenly, anyone with the equipment to do so could transform their entire music collection to digital files and then put them onto the Internet for anyone to download. Conceptually, it was no different than recording a buddy’s copy of the latest hit album. Economically, of course, it was radically different: it was piracy.
Napster could make many of the claims that the Internet Archive and its supporters make now. While its existence surely cut into the sales of “Livin’ la Vida Loca,” it was also a great way to find obscure songs that were no longer in print. Still, it was only a matter of months before the Recording Industry Association of America filed suit.
Libraries, it’s clear, need their traditional statutory protections now more than ever. The right of first sale, which allows libraries to own and loan the books in their own collections, in particular, must be preserved for digital books as well as print ones.
Although, again, the Internet Archive isn’t buying digital books—it’s copying print books into digital form.
But not every library appears to understand these stakes. Vermont State University recently announced that it will be closing all its physical libraries and moving to an “all-digital” model, ostensibly to save money—though ebook price gouging scandals have been plaguing libraries and universities for years, prompting ongoing fights in the courts.
If Vermont State University’s plan takes effect this summer, as scheduled—and at the time of writing, there’s been no indication that they’re backing down—we’ll be seeing a whole university system at the mercy of publishers who can remove library access to any book they please, at the drop of a hat. These are economic, as well as political, disasters waiting to happen.
As Internet Archive founder Brewster Kahle wrote in an e-mail: “If the library only negotiates access licenses for their students to view publishers’ database products, is it a library anymore? Or is it a customer service department for corporate database products?”
I agree that there’s something to be said for having a “permanent” copy of something.
Then again, the downside is rather obvious. Owning print copies is expensive, in that they have to not only be purchased but maintained. Someone has to physically check them in and out. They get worn out and have to be replaced or repaired. They get lost, stolen, or misplaced. Further, with relatively rare exceptions, they simply become obsolete in short order, becoming museum pieces to be warehoused rather than useful for contemporary research agendas.
Further, as people become more accustomed to doing their research via the Internet, there’s simply less demand for physical books. I walk past my university library on the way to and from my car daily. I seldom go inside. I don’t believe I’ve checked out a book or looked at a physical journal in the nearly ten years I’ve worked there.
Indeed, I don’t know whether the library even still subscribes to hard copies of academic journals, which must then be bound at the end of the year. I seriously doubt it. And the prices for access to journal databases constantly go up, too. It’s not immediately obvious to me why books are fundamentally different from journal articles in this regard.
Bustillos’ closing argument is more aesthetic and economic than academic:
In my lifetime, the tension between commercial and cultural imperatives in the world of books has never been more stark.
The future of digital culture must not be left in the hands of commercial interests, because corporations don’t protect or develop culture: They sell it. Which is fine, and healthy, so long as businesses stay in their lane—but they don’t. Again and again, corporate overreach like the lawsuit against the Internet Archive has shown that where there is more money to be made, business will all too happily interfere with schools, universities, and libraries—no matter the cost to the quality or utility or posterity of education, or art, or literature.
Hollywood and the music industry abound with examples of this imbalance. The stranglehold of commercial imperatives has already radically impoverished culture in the United States, as “works of art” are increasingly considered “intellectual property.” The pressure to produce blockbusters, hits and bestsellers drives the mega-marketing of increasingly mega-boring mega-sequels, sometimes featuring megastars and adapted from mega-bestsellers. New and innovative writers, directors, artists and musicians—who present a greater commercial risk—not only get less and less of the cultural pie; they have a harder time even getting to the table where the pie is cut. The desire to squeeze more and more profits out of ever-lengthening copyright terms means, too, that new artists are prevented from creating meaningful responses to the masterworks of the past—while the culture steadily grows poorer and poorer. Everywhere you look, considerations of profit are encroaching on innovation and creativity.
Movies have pretty much always been made at the commercial judgment of studio heads. Recorded music has mostly depended on the commercial judgment of label heads. The same has been true of books, which have long required a publishing house to decide there was a sufficient market to invest in the project. If anything, modern technologies have made it easier than ever to bypass these gatekeepers.
And now we have to worry about the safety and freedom of libraries in schools and universities, the integrity of digital archives, and the preservation of digital ownership rights, too. It’s high time for the pendulum to swing toward protecting cultural posterity; the courts should begin by ensuring the preservation of the Internet Archive.
Again, I find this overwrought.
I hate what DeSantis and others are doing in the culture wars. I suspect that the courts will ultimately limit the power of politicians to interfere in these matters.
As to “digital ownership rights,” the question at hand is precisely, Who owns what? My kids have various games on their Nintendo Switch that we have paid for but that they don’t really own, in the sense that Nintendo could stop supporting them, change fundamental aspects of them the next time the kids turn the device on and connect it to the Internet, etc. I’m sure I’ve agreed to all of that in some TOS that I didn’t read.
I have all manner of digital files, mostly in PDF, of academic journal and other articles that I’ve preserved for my own use. I suppose I “own” those as much as I own any of the hundreds of print books on my shelves.
With regard to the Internet Archive, we’ll find out whether owning a print copy of the book entitles one to make a digital copy of it and put it on the Internet for all to use. That strikes me as a fairly limited question.