Michael Weinberg's Techdirt Profile

Posted on Techdirt - 31 May 2024 @ 01:41pm

Clearing Rights For A ‘Non-Infringing’ Collection Of AI Training Media Is Hard

In response to a number of copyright lawsuits about AI training datasets, we are starting to see efforts to build ‘non-infringing’ collections of media for training AI. While I continue to believe that most AI training is covered by fair use in the US and therefore inherently ‘non-infringing’, I think these efforts to build ‘safe’ or ‘clean’ data sets (or whatever other word one might use) are quite interesting. One reason they are interesting is that they can help illustrate why trying to build such a data set at scale is such a challenge.

That’s why I was excited to read about Source.Plus (via a post from Open Future). Source.Plus is a tool from Spawning that purports to aggregate over 37 million “public domain and CC0 images integrated from dozens of libraries and museums.” That’s a lot less than are used to train current generative models, but still a lot of images that could be used for all sorts of useful things.

However, it didn’t take too much poking around on the site to find an illustration of why accurately aggregating nominally openly licensed images at scale can be such a challenge.

The site has plenty of OpenGLAM images that are clearly old enough to be in the public domain. It also has a number of newer images (like photographs) that are said to be licensed under CC0. Curious, I clicked on the first photograph I found on the Source.Plus home page:

[Image: photograph of a library reading room full of patrons, shot from above]

According to the image page on Source.Plus, the image was from Wikimedia Commons and licensed under a CC0 public domain dedication. It listed the creator as Pixabay and the uploader (to Wikimedia) as Philipslearning.

Clicking through to the Wikimedia page reveals that the original source for the image was Pixabay, and that it was uploaded on March 9, 2023 by Philipslearning (an account that appears to no longer exist, for whatever that is worth). The file metadata says that the image itself was taken on May 18, 2016.

Clicking through to the Pixabay page for the image reveals that the image is available under the Pixabay Content License. That license is fairly permissive, but does state:

  • You cannot sell or distribute Content (either in digital or physical form) on a Standalone basis. Standalone means where no creative effort has been applied to the Content and it remains in substantially the same form as it exists on our website.
  • If Content contains any recognisable trademarks, logos or brands, you cannot use that Content for commercial purposes in relation to goods and services. In particular, you cannot print that Content on merchandise or other physical products for sale.
  • You cannot use Content in any immoral or illegal way, especially Content which features recognisable people.
  • You cannot use Content in a misleading or deceptive way.
  • You cannot use any of the Content as part of a trade-mark, design-mark, trade-name, business name or service mark.

Which is to say, not CC0.

However, further investigation into the Pixabay Wikipedia page suggests that images uploaded to Pixabay before January 9, 2019 are actually released under CC0. Section 4 of the Pixabay terms confirms that. The additional information on the image’s Pixabay page confirms that it was uploaded on May 17, 2016 (which matches the metadata added by the unknown Philipslearning on the image’s Wikimedia page).

All of which means that this image is, in all likelihood, available under a CC0 public domain dedication. Which is great! Everything was right!
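As a rough sketch, the date check behind that conclusion might look like the following Python. The function name and the license labels are illustrative assumptions on my part, and the code only encodes the cutoff described above (uploads before January 9, 2019 treated as CC0), not Pixabay's actual terms.

```python
from datetime import date

# Cutoff as described in the post: Pixabay uploads made before this date
# are treated as CC0. Treating the cutoff day itself as non-CC0 is an assumption.
PIXABAY_CC0_CUTOFF = date(2019, 1, 9)


def pixabay_license(upload_date: date) -> str:
    """Return the (hypothetical) license label implied by a Pixabay upload date."""
    if upload_date < PIXABAY_CC0_CUTOFF:
        return "CC0"
    return "Pixabay Content License"


# The library photograph was uploaded to Pixabay on May 17, 2016.
print(pixabay_license(date(2016, 5, 17)))
```

A check like this is only as good as the upload date it is given, which in this case came from metadata several steps removed from the original source.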

At the same time, the accuracy of that status feels a bit fragile. That level of fragility is workable in the context of Wikipedia, or if you are looking for a handful of openly-licensed images. Is it likely to hold up at training set scale across tens of millions of images? Maybe? What does it mean to be ‘good enough’ in this case? If trainers do require permission from rightsholders to train, and one relied on Source.Plus/Wikimedia for the CC0 status of a work, and that status turned out to be incorrect, should the fact that they thought they were using a CC0 image be relevant to their liability?

Michael Weinberg is the Executive Director of NYU’s Engelberg Center for Innovation Law and Policy. This post is republished from his blog under its CC BY-SA 4.0 license. Hero Image: Interieur van de Bodleian Library te Oxford

Posted on Techdirt - 25 October 2022 @ 03:43pm

If GitHub Copilot Is A Copyright Problem, Perhaps The Problem Is Copyright

Last week a new GitHub Copilot investigation website created by Matthew Butterick brought the conversation about GitHub’s Copilot project back to the front of mind for many people, myself included. Copilot, a tool trained on public code that is designed to auto-suggest code to programmers, has been greeted by excitement, curiosity, skepticism, and concern since it was announced.

The GitHub Copilot investigation site’s arguments build on previous work by Butterick, as well as thoughtful analysis by Bradley M. Kuhn at the Software Freedom Conservancy. I find the arguments contained in these pieces convincing in some places and not as convincing in others, so I’m writing this post in the hopes that it helps me begin to sort it all out.

At this point, Copilot strikes me as a tool that replaces googling for Stack Overflow answers. That seems like something that could be useful. It also seems plausible that training such a tool on open public software repositories (including open source repositories) could be allowed under US copyright law. That may change if or when Copilot evolves, which makes this discussion a fruitful one to be having right now.

Both Butterick and Kuhn combine legal and social/cultural arguments in their pieces. This blog post starts with the social/cultural arguments because they are more interesting right now, and may impact the legal analysis as facts evolve in the future. Butterick and Kuhn make related arguments, so I’ll do my best to be clear which specific version of a point I’m engaging with at any given time. As will probably become clear, I generally find Kuhn’s approach and framing more insightful (which isn’t to say that Butterick’s lacks insight!).

What is Copilot, Really?

A large part of this discussion seems to turn on the best way to think about and analogize what Copilot is doing (the actual Copilot page does a pretty good job of illustrating how one might use it).

Butterick seems to think that the correct way to think about Copilot is as a search engine that points users to a specific part of a specific (often open source) software package. In his words, it is “a convenient alternative interface to a large corpus of open-source code”. He worries that this “selfish interface to open-source software” is built around “just give me what I want!” (emphasis his).

The selfish approach may deliver users to what they think they want, but in doing so hides the community that exists around the software and removes critical information that the code is licensed under an open source license that comes with obligations. If I understand the argument correctly, over time this act of hiding the community will drain open source software of its vitality. That makes Copilot a threat to open source software as a sustainable concept.

But…

The concern about hiding open source software’s community resonates with me. At the same time, Butterick’s starting point strikes me as off, at least in terms of how I search for answers to coding questions.

This is probably a good place to pause and note that I am a Very Bad coder who, nonetheless, does create some code that tends to be openly licensed and is just about always built on other open source code. However, I have nowhere near the skills required to make a meaningful technical contribution to someone else’s code.

Today, my “convenient alternative interface” to finding answers when I need to solve coding problems is Google. When I run into a coding problem, I either describe what I am trying to do or just paste the error message I’m getting into Google. If I’m lucky, Google will then point me to Stack Overflow, or a blog post, or documentation pages, or something similar. I don’t think that I have ever answered a coding question by ending up in a specific portion of open source code in a public repo. If I did, it seems unlikely that code – even if it had great comments – would get me where I was going on its own because I would not have the context required to quickly understand that it answered my question.

This distinction between “take me to part of open source code” (Butterick’s view) and “help me do this one thing” (my view) is important because when I look at the Copilot website, it feels like Copilot is currently marketed as a potentially useful Stack Overflow commenter, not someone with an encyclopedic knowledge of where that problem was solved in other open source code. Butterick experimented with Copilot in June and described the output as “This is the code I would expect from a talented 12-year-old who learned about JavaScript yesterday and prime numbers today.” That’s right at my level!

If you ask Copilot a question like “how can I parse this list and return a different kind of list?,” in most cases (but, as Butterick points out, not all!) it seems to respond with an answer synthesized from many different public code repositories instead of just pointing to a single “best answer” repo. That makes Copilot more of a Stack Overflow explorer than a public code explorer, albeit one that is itself trained by exploring public code. That feels like it reduces the type of harm that Butterick describes.

Use at Your Own Risk

Butterick and Kuhn also raise concerns about the fact that Copilot does not make any guarantees about the quality of code it suggests. Although this is a reasonable concern to have, it does not strike me as particularly unique to Copilot. Expecting Copilot to provide license-cleared and working code every time is benchmarking it against an unrealistic status quo.

While useful, the code snippets I find in Stack Overflow/blog post/whatever are rarely properly licensed and are always “use at your own risk” (to the extent that they even work). Butterick and Kuhn’s concerns in this area feel equally applicable to most of my Stack Overflow/blog post answers. Copilot’s documentation is fairly explicit about the value of the code it suggests (“We recommend you take the same precautions when using code generated by GitHub Copilot that you would when using any code you didn’t write yourself.”), for whatever that is worth.

Will Copilot Create One Less Reason to Interact Directly with Open Source Code?

In Butterick’s view, another downside of this “just give me what I want” service is that it reduces the number of situations where someone might knowingly interact with open source code directly. How often do most users interact directly with open source code? As noted above, I interact with a lot of other people’s open source software as an extremely grateful user and importer of libraries, but not as a contributor. So Copilot would shift my direct deep interaction with open source code from zero to zero.

Am I an outlier? Nadia Asparouhova (née Eghbal)’s excellent book Working in Public provides insight into open source software grounded in user behavior on GitHub. In it, she tracks how most users of open source software are not part of the software’s active developer community:

“This distribution – where one or a few developers do most of the work, followed by a long tail of casual contributors, and many more passive users – is now the norm, not the exception, in open source.”

She also suggests that there may be too much community around some open source software projects, which is interesting to consider in light of Butterick’s concern about community depletion:

“The problem facing maintainers today is not how to get more contributors but how to manage a high volume of frequent, low-touch interactions. These developers aren’t building communities; they’re directing air traffic.”

That suggests that I am not necessarily an outlier. But maybe users like me don’t really matter in the grand scheme of open source software development. If Butterick is correct about Copilot’s impact on more active open source software developers, that could be a big problem.

Furthermore, even if users like me are representative today, and Copilot is not currently good enough to pull people away from interacting with open source code, might it be in the future?

“Maybe?” feels like the only reasonable answer to that question. As Kuhn points out, “AI is usually slow-moving, and produces incremental change far more often than it produces radical change.” Kuhn rightly argues that slow-moving change is not a reason to ignore a possible future threat. At the same time, it does present the possibility that a much better Copilot might itself be operating in an environment that has been subject to other radical changes. These changes might enhance or reduce that future Copilot’s negative impacts.

Where does that leave us? The kind of casual interaction with open source code that Butterick is concerned about may happen less than one might expect. At the same time, today’s Copilot does not feel like a replacement for someone who wants to take a deeper dive into a specific piece of open source software. A different version of Copilot might, but it is hard to imagine the other things that might be different in the event that version existed. Today’s version of Copilot does not feel like it quite manifests the threat described by Butterick.

Copilot is Trained on Open Source, Not Trained on Open Source

For some reason, I went into this research thinking that Copilot had explicitly been trained on open source software. That’s not quite right. Copilot was trained on public GitHub repositories. Those include many repositories of open source software. They also include many repositories of code that is just public, with no license, or a non-open license, or something else. So Copilot was trained on open source software in the sense that its training data includes a great deal of open source software. It was not trained on open source software in the sense that its training data only consists of open source software, or that its developers specifically sought out open source software as training data.

This distinction also happens to highlight an evolving trend in the open source world, where creators conflate public code with openly licensed code. As Asparouhova notes:

“But the GitHub generation of open source developers doesn’t see it that way, because they prioritize convenience over freedom (unlike free software advocates) or openness (unlike early open source advocates). Members of this generation aren’t aware of, nor do they really care about, the distinction between free and open source software. Neither are they fired up about evangelizing the idea of open source itself. They just publish their code on GitHub because, as with any other form of online content today, sharing is the default.”

As a lawyer who works with open source, I think the distinction between “openly/freely licensed” and “public” matters a lot. However, it may not be particularly important to people using publicly available software (regardless of the license) to get deeper into coding. While this may be a problem that is exacerbated by Copilot, I don’t know that Copilot fundamentally alters the underlying dynamics that feed it.

As noted at the top, and attested to by the body of this post so far, this post starts with the cultural and social critiques of Copilot because that is a richer area for exploration at this stage in the game. Nonetheless, the critiques are – quite reasonably – grounded in legal concerns.

Fair Use

The legal concerns are mostly about copyright and fair use. Normally, in order to make copies of software, you need permission from the creator. Open source software licenses grant those permissions in return for complying with specific obligations, like crediting the original creator.

However, if the copy being made of the software is protected by fair use, the copier does not need permission from the creator and can ignore any obligations in a license. In this case, GitHub is not complying with any open source licensing requirements because it believes that its copies are protected by fair use. Since it does not need permission, it does not need to comply with license requirements (although sometimes there are good reasons to comply with the social intent of licenses even if they are not legally binding…). It has said as much, although it (and its parent company Microsoft) has declined to elaborate further.

I read Butterick as implying that GitHub and Microsoft’s silence on the details of its fair use claim means that the claim itself is weak: “Why couldn’t Microsoft produce any legal authority for its position? Because [Kuhn and the Software Freedom Conservancy] is correct: there isn’t any.”

I don’t think that characterization is fair. Even if they believe that their claim is strong, GitHub cannot assume that it is so strong as to avoid litigation over the issue (see, e.g. the existence of the GitHub Copilot investigation website itself). They have every reason to avoid pre-litigating the fair use issue via blog post and press release, keeping their powder dry until real litigation.

Kuhn has a more nuanced (and correct, as far as I’m concerned) take on how to interpret the questions: “In fact, these areas are so substantially novel that almost every issue has no definitive answers”. While it is totally reasonable to push back on any claims that the law around this question is settled in GitHub’s favor (Kuhn, again, “We should simply ignore GitHub’s risible claim that the “fair use question” on machine learning is settled.”), that is very different than suggesting that it is settled against GitHub.

How will this all shake out? It’s hard to say. Google scanned all the books in order to create search and analytics tools, claiming that their copies were protected by fair use. They were sued by The Authors Guild in the Second Circuit. Google won that case. Is scanning books to create search and analytics tools the same as scanning code to create AI-powered autocomplete? In some ways yes? In other ways no?

Google also won a case before the Supreme Court where they relied on fair use to copy API calls. But TVEyes lost a case where they attempted to rely on fair use in recording all television broadcasts in order to make it easy to find and provide clips. And the Supreme Court is currently considering a case involving Warhol paintings of Prince that could change fair use in unexpected ways. As Kuhn noted, we’re in a place of novel questions with no definitive answers.

What About the ToS?

As Franklin Graves pointed out, it’s also possible that GitHub’s Terms of Service allow it to use anything in any repo to build Copilot without worrying about additional copyright permissions. If that’s the case, they won’t even need to get to the fair use part of the argument. Of course, there are probably good reasons that GitHub is not working hard to publicize the fact that their ToS might give them lots of room when it comes to making use of user uploads to the site.

Where Does That Leave Things?

To start with, I think it is responsible for advocates to get out ahead of things like this. As Kuhn points out:

“As such, we should not overestimate the likelihood that these new systems will both accelerate proprietary software development, while we simultaneously fail to prevent copylefted software from enabling that activity. The former may not come to pass, so we should not unduly fret about the latter, lest we misdirect resources. In short, AI is usually slow-moving, and produces incremental change far more often than it produces radical change. The problem is thus not imminent nor the damage irreversible. However, we must respond deliberately with all due celerity — and begin that work immediately.”

At the same time, I’m not convinced that Copilot is a problem. Is it possible that a future version of Copilot would starve open source software of its community, or allow people to effectively rebuild open source code outside of the scope of the original license? It is, but it seems like that version of Copilot would be meaningfully different from the current version in ways that feel hard to anticipate. Today’s Copilot feels more like a fast lane to possibly-useful Stack Overflow answers than an index that can provide unattributed snippets of all open source software.

As it is, the acute threat Copilot presents to open source software today feels relatively modest. And the benefits could be real. There are uses of today’s Copilot that could make it easier for more people to get into coding – even open source coding. Sometimes the answer of a talented 12 year old is exactly what you need to get over the hump.

Of course, GitHub can be right about fair use AND Copilot can be useful AND it would still be quite reasonable to conclude that you want to pull your code from GitHub. That’s true even if, as Butterick points out, GitHub being right about fair use means that code anywhere on the internet could be included in future versions of Copilot.

I’m glad that the Software Freedom Conservancy is getting out ahead of this and taking the time to be thoughtful about what it means. I’m also curious to see if Butterick ends up challenging things in a way that directly tests the fair use questions.

Finally, this entire discussion may also end up being a good example of why copyright is not the best tool to use against concerns about ML dataset building. Looking to copyright for solutions has the potential to stretch copyright law in strange directions, cause unexpected side effects, and misaddress the thing you really care about. That is something that I am always wary of, and a prior that informs my analysis here. Of course, Amanda Levendowski makes precisely the opposite argument in her article Resisting Face Surveillance with Copyright Law.

Michael Weinberg is the Executive Director of NYU’s Engelberg Center for Innovation Law and Policy and Board President of Open Source Hardware Association. This article is reposted with permission from Michael Weinberg’s blog.

Posted on Techdirt - 27 May 2021 @ 03:33pm

A Second Cambrian Explosion of Open Source Licenses Or Is it Time For Open Source Lawyers to Have Fun Again?

As the open source world has grown, so have concerns about the context in which openly licensed items are used. While these concerns have existed since the beginning of the open source movement, today’s larger and more diverse movement has brought new urgency to them. In light of this revived interest within the community, the time may be ripe to begin encouraging experimentation with open source licensing again.

How We Got Here

While the history of open source software is long and varied (and predates the term open source software), for the purposes of this blog post its early evolution was driven by a fairly small group of individuals motivated by a fairly homogeneous set of goals. As the approach became more popular, the community developed a wide range of licenses designed to address a wide range of concerns. This “First Cambrian Explosion” of open source models and software licenses was a time of experimentation within the community. Licenses varied widely in structure, uptake, and legal enforceability.

Eventually, the sprawling nature of this experimentation began to cause problems. The Free Software Foundation’s Free Software Definition and the Open Source Initiative’s Open Source Definition were both attempts to bring some order to the open source software world.

In the specific context of licensing, the Open Source Initiative began approving licenses that met its criteria. Soon thereafter, it released a License Proliferation Report detailing the challenges created by this proliferation of licenses and proposing ways to combat them.

These activities helped to bring order and standardization to the world of open source licensing. While OSI continues to approve licenses, for well over a decade the conventional wisdom in the world of open source has been to avoid creating a new license if at all possible. As a result, for most of this century open source software license experimentation has been decidedly out of style.

Largely for the reasons described in the License Proliferation Report, this conventional wisdom has been beneficial to the community. License proliferation does create a number of problems. Standardization does help address them. However, in doing so standardization also greatly reduced the amount of license experimentation within the community.

Reduced experimentation means that concerns incorporated into approved licenses (access to modifications of openly licensed code) have been canonized, while concerns that had not been integrated into an approved license (restrictions on unethical uses of software) at the moment of formalization were largely excluded from consideration within the open source community.

What Changed

What has changed since the move towards codification of licenses? The open source software world has gotten a lot bigger. In fact, it has gotten so much bigger that it isn’t just the open source software world anymore. Creative Commons – today a towering figure in the world of openness – did not even exist when the Open Source Initiative started approving licenses. Now the open world is open source hardware, and Creative Commons-licensed photos, and open GLAM collections, and open data, and all sorts of other things (this is a whole other blog post). The open source world has moved beyond early debates that questioned the fundamental legitimacy of open source as a concept. Open source has won the argument.

An expansion of applications of open source has led to an expansion of people within open source. Those people are more diverse than the early open source software proponents and are motivated by a wider range of interests. They also bring with them a wider range of concerns, and a wider range of relationships to those concerns, than early open source adopters.

What is Happening Now

This broader community does not necessarily share the consensus about how to approach licensing that was developed in an earlier period of open source. They bring a range of viewpoints that did not exist in the earlier days of open source software into the open source community itself. They are also applying open source concepts and licenses to a range of applications that were not front of mind – or in mind at all – during the drafting of today’s canonical licenses.

Unsatisfied with the consensus rules that have delivered us the existing suite of (incredibly successful) licenses, parts of the community have begun experimenting again. Veteran open source lawyers are rewriting licenses with public understandability in mind. Community members are transforming their interpretation of open source development into licenses that invite collaboration without intending to adhere to the open source definition. Some of these licenses are designed to address concerns traditionally excluded from the scope of open source licenses. I am directly involved in the ml5.js attempt to do just that.

The creators of these experiments are responding to a standardized approach to licensing that does not fully accommodate their needs and concerns. In some cases the standardized approach does not accommodate these concerns because the community litigated including them in the past and decided it could or should not be done. However, even in those cases, that debate happened within a very different community in at least somewhat different contexts. The conclusions arrived at then are not necessarily valid for the broader world that open source finds itself inhabiting.

In light of that, it may be time to begin encouraging experimentation in open source licensing again. Encourage people to test out new approaches by applying them to real world problems. In some cases, the decisions made in the past will prove to be robust and sustainable. In others, a new debate will reveal the decisions’ shortcomings. In both cases, the open source community will be stronger by being tested from within.

Coda: Is This Post Just a Lawyer Advocating for Lawyers to Have More Fun?

Throw out the old ways of doing things! Try something new! Experiment! Is this just a call for lawyers to have fun by screwing around with exotic licensing concepts at the expense of everyone else’s stability (and sanity)?

It could be. But I don’t really think so. The thing about lawyers (as a group – there are always exceptions) is that novelty and instability make us nervous. Things that are tried and true will probably work. That means we do not have to worry about them. New things – who knows what will happen to them? That uncertainty makes lawyers nervous.

That is part of the reason why lawyers like today’s conventional wisdom. The canonical set of open source licenses has more or less worked for decades. It is unlikely that they will explode, and it is even less likely that they will explode in the face of the lawyer who uses them on any given project. In contrast, any lawyer who writes their own license is setting themselves up for a period of anxiety, waiting to discover what they missed or how things will go wrong.

Of course, some lawyers do think it is fun to cook up new open licenses. And maybe this post is a call for them to do more of it. But, on balance and as a whole, introducing new licenses into the world of open source will probably cause open source lawyers more anxiety than joy.

I think that anxiety is probably worth it. But that will be far from a universally held position.

Originally published on Michael Weinberg’s blog and republished under a CC-BY-SA 4.0 license.

Posted on Techdirt - 28 April 2015 @ 02:54pm

3D Printed Copyright Creep

When your car runs out of gas, you can fill it up at any gas station you like. You never worry if the company that made your car has an exclusivity deal with one gas station or another, or even if that company has a preference for one brand of gas. In fact, you would probably find it some combination of ridiculous, galling, and offensive if the company that made your car threatened you with a copyright infringement lawsuit if you didn’t go to their preferred gas station to fill up.

This dynamic is true for all sorts of things. Once you buy it, it is up to you to decide how you maintain it and replace what needs replacing. This is true of gas in a car, water in a bottle, and filters in a vacuum cleaner. But as software gets introduced into more and more everyday objects, some companies are trying to stretch copyright law beyond its limit in order to lock you into buying replacements only from them.

A decade ago, we saw this play out with 2D printers and toner ink. Some companies that made printers decided that they would prefer that consumers buy replacement toner (at a substantial markup) only from them. In order to attempt to lock themselves in as the only place to buy replacement toner, these companies designed their printers to look for a special verification chip on new toner cartridges to prove that the new cartridge came from them. When another company figured out a way around these chips, the printer manufacturers ran to copyright law to try and shut them down.

Fortunately, the courts saw through this ruse and were able to recognize that allowing consumers to choose where they get replacement toner for their printers has nothing to do with copyright law. Unfortunately, today some 3D printer manufacturers are trying this same gambit and hoping for a different outcome.

In a proceeding in front of the Copyright Office, 3D printer manufacturers offer a parade of horribles of what will happen [pdf] if users are free to choose the materials they use in their printers. Notably, none of these have anything to do with copyright. The only connection any of this has with copyright is that the printer manufacturers use a small line of code to verify if they sold the refills.

Just as adding a verification chip to a gas tank shouldn’t be used as a pretext to lock a car owner into a single source of gasoline, adding a verification chip shouldn’t be used as a pretext to lock a 3D printer user into a single source of 3D printing material.

3D printing is an emerging engine for innovation, and because of that this issue would be important even in isolation. However, the battle being fought over 3D printer material occurs against the backdrop of other attempts to use copyright as a pretext to limit consumer choice in all sorts of contexts. Be it accessing data from medical devices implanted in your body, repairing farm equipment that breaks down in the field, or unlocking your cell phone, the current proceeding before the Copyright Office, known as the “1201 triennial” after the part of the law that created it, is a preview of a future where manufacturers have the power to lock consumers into whatever they please.

That is what makes the Register of Copyrights’ decisions so important in this proceeding. The right decision will do more than clear the way for consumer choice: strongly siding with users and against copyright creeping into everything sends a strong message that copyright has its purpose, but that it should not be abused.

Public Knowledge is hosting a 3D printing event at the U.S. Capitol Visitor Center April 29 from 10:30 a.m. to 8 p.m. This free event is an opportunity to engage with 3D printing experts on panels and interact with the latest 3D printing technology. You may register here.

Michael Weinberg is a 3D printing advocate and can be found at michaelweinberg.org. He is a former Vice President of Public Knowledge and currently IP & General Counsel of Shapeways, but writes here in his personal capacity.

Posted on Techdirt - 11 March 2015 @ 01:45pm

Licensing Your 3D Printed Stuff: Why 3D Printed Objects Challenge Our Copyright Beliefs

This blog post is reprinted from Public Knowledge, and is quite timely. On Thursday of this week, we’ll be discussing this very topic at our Copia Inaugural Summit, with Natalia Krasnodebska from Shapeways. We’ll also be distributing copies of this new report at the event. If you haven’t signed up to attend or to join Copia, please check it out.

Among a host of other (arguably more important) wonders, widespread access to 3D printing raises all sorts of interesting intellectual property law questions. Some of these questions are the obvious result of combining physical objects, digital files, and the distributive power of the internet. Others, however, are less obvious. 3D printing has the potential to take many of the things we assume about intellectual property law and turn them on their heads.

The past fifteen years or so have given us all a collective informal education in intellectual property law. We have been taught to assume that everything we see on our computer screen is protected by intellectual property law (usually copyright), and that copying those things without permission can often result in copyright infringement (and potentially lawsuits).

By and large, this has been a reasonable rule of thumb. The things that we most often associate with our computer screens (the music, movies, software, photos, articles, and whatnot) happen to also be the types of things that are protectable by copyright. As copyright automatically protects things that are categorically eligible for protection, it is safe to begin from the assumption that the music, movies, software, photos, articles, and whatnot made in the last century that you find online are actively protected by copyright.

This easy assumption becomes less reasonable in the context of 3D printing. Many of the objects coming out of a 3D printer are simply not eligible for copyright protection. As “functional” objects, they are beyond copyright’s scope. They may be protectable by patent, but because patent protection is not automatic, many of these objects will simply not be protected by intellectual property at all. The idea that something is entirely unprotected by copyright or patent would have felt perfectly natural 30 years ago, but can feel deeply disorienting today.

Furthermore, unlike those music, movies, software, photos, articles, and whatnot, we often have to treat a physical object and the digital file that represents that object differently in the context of 3D printing and intellectual property. Although we do not often draw the distinction between a song and an .mp3 file, there are many situations where we are called on to conceive of an object and its digital file as fundamentally different intellectual property entities.

The importance of this difference manifests itself when people start to talk about licensing 3D printed things. Taking a page from the more traditional digital world, the conversation often starts with the relative strengths and weaknesses of various licenses. However, beginning there skips a fundamental and easy-to-overlook step: before considering which license to use, you need to know what you are actually licensing.

It was easy to skip this step with traditional digital media because the answer to “what can you license?” was almost always “everything.” But in the context of 3D printing, the answer is just as likely to be “nothing” or at least “only some parts.” Understanding what is and is not available to license is a new skill for our collective intellectual property education, and it is a critical one in the world of 3D printing.

In order to start this process, today we at Public Knowledge are releasing a new whitepaper called Licensing Your 3D Printed Stuff. Instead of focusing on the differences between licenses, this paper walks you through how to figure out what is even available to license in the first place. Because until you understand that, everything else is just a detail.
