Protect Your Code Against Licensing Risks  

In the race to cut the time and expense of software development, developers may be trading future business deals for speed. That’s because programs built with code from generative artificial intelligence (GenAI) tools may include open source code, which can create stumbling blocks during a business deal’s due-diligence process.

Many developers form startups with the express purpose of an eventual buyout by a larger organization. They develop proprietary products based on programs their developers write in a language such as Python or C++ and, increasingly, with GenAI coding tools.

Challenges can arise when snippets or lines of code within proprietary products contain open source code, which developers rely on because it is free and easy to use. In fact, a 2025 study by software security vendor Black Duck identified open source code in 99% of the merger and acquisition (M&A) transactions it audited.

Open source code is source code that is publicly available and free for developers to use, modify, and redistribute. Within the universe of open source code, however, are subsets of code that carry restrictive licenses.

Such restrictive licenses can potentially obligate startups to release their products under open source licenses, which may hinder their ability to charge fees for those products. The same Black Duck study revealed that 85% of M&A transactions included open source components with license conflicts. You don’t want your startup to have open source licensing conflicts that could delay, or even derail, a potential deal.

“Issues mostly come up in the course of an acquisition, when an acquirer might want to run an open source scan to determine the source of code within a proprietary product,” said Steve Argentieri, a partner in the Business Law Department of New York-based law firm Goodwin Procter LLP. “At this point, open source code is probably presenting more practical deal risk than legal risk. Tools such as Copilot have added additional protections to mitigate legal risks such as copyright risk.”

Fortunately, a startup can take steps to mitigate such licensing risks in GenAI code as its developers create products that may eventually be part of a sale. By crafting and enforcing processes and policies that govern coding, a startup can establish a clear line of sight into the origin of all its code, insulating itself from risk.

Identify Licensing Risk within GenAI-Created Code

Potential acquirers must understand what types of open source code appear within a startup’s product base so they can appropriately manage business risk. Open source code carries several types of business risk; a critical one is licensing risk, which encompasses potential legal liability, compliance issues, financial cost, and intellectual property exposure.

GenAI coding tools, which are built on large language models (LLMs) and natural language processing, generate code in response to prompts from developers. GenAI code can be particularly susceptible to licensing risk because GenAI coding tools may strip license information from the lines and snippets they reproduce. That means developers may not realize the code they are using carries a restrictive, rather than permissive, open source license.
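To see how that happens, consider a hypothetical utility function as it appears in its upstream open source project, and the same logic as a GenAI tool might reproduce it, with the license declaration gone. The function and identifiers below are invented for illustration:

```python
# As published in a hypothetical upstream project, the file declares
# its copyleft license with a standard SPDX identifier:
# SPDX-License-Identifier: GPL-3.0-only
def clamp(value, low, high):
    """Constrain value to the inclusive range [low, high]."""
    return max(low, min(value, high))

# As a GenAI tool might reproduce the same logic: the license marker
# is gone, so the developer sees only ordinary-looking code.
def clamp(value, low, high):
    return max(low, min(value, high))
```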

As GenAI tools become more popular with developers, this risk is rising. GenAI coding tools, including ChatGPT, GitHub Copilot, and Cursor, are now widespread: more than 76% of developers surveyed by Stack Overflow are either already using them or planning to use them. In the course of using these tools, developers may unknowingly insert snippets, or whole blocks, of open source code covered by restrictive licenses into their programs.

Copyleft

Copyleft is one type of restrictive open source license. It requires that any derivative use of the licensed code in other programs or products be made available under the same terms. Those terms require that future users be able to copy and change the code without charge, which means that proprietary products that charge licensing fees should not include copyleft code.
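The distinction between permissive and copyleft terms is often visible in a file’s header comment. A hypothetical pair of headers, shown for illustration only:

```python
# Permissive example: the code may be incorporated into proprietary
# products, provided attribution and the license text are preserved.
# SPDX-License-Identifier: MIT

# Copyleft example: distributed derivative works must be made
# available under the same license terms.
# SPDX-License-Identifier: GPL-3.0-only
```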

“Generative AI tools may have been trained on copyleft code or the copyleft code could have been incorrectly copied from somewhere else,” said Argentieri. “That creates a risk that you’re accidentally incorporating copyleft code into your software. When you distribute your product, technically you would have an obligation to make your source code available under that copyleft license.”

The creators of open source code, including copyleft variants, believe that code should be freely available to anyone who wants to use it. That freedom is not just a matter of cost, in that open source is available without charge, but also of users’ ability to use the code without restrictions. In other words, the philosophy of open source and copyleft is that middlemen should not be able to strip away the freedom to use open source code through restrictive licenses that charge fees.

Because of the licensing restrictions copyleft code carries, organizations need to understand whether copyleft code has found its way into their proprietary products.

Mitigate Code Licensing Risks

To avoid breaching copyleft licenses, and other types of open source licenses, organizations can implement policies and procedures governing the use of both GenAI-developed code and open source code in general. Oleh Komenchuk, ML lead and AI engineer at software development company Uptech, recommends using GenAI coding tools for prototyping, but not for the actual production code in products.

“AI speeds up our thinking, not our shipping,” Komenchuk said. His organization uses a four-step process to mitigate licensing risk: reviewing all AI-generated code, tracking where all code comes from, scanning code with tools designed to detect suspicious snippets, and limiting the use of AI to prototyping. Tools designed to scan code for license violations include Black Duck, FOSSA, and Snyk; a simplified sketch of the kind of check such scanners automate appears below.
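As an illustration only, the hypothetical standalone script below walks a source tree and flags files that declare a copyleft license through an SPDX identifier. Commercial scanners go much further, matching snippets against databases of known open source projects even when license headers have been stripped, which is precisely the GenAI failure mode described above:

```python
#!/usr/bin/env python3
"""Minimal sketch: flag files that declare a copyleft SPDX license.
Illustrative only; the license list is abbreviated, and header-based
checks cannot catch snippets whose license information was stripped."""
import re
import sys
from pathlib import Path

# A few well-known copyleft SPDX identifiers (not exhaustive).
COPYLEFT = {
    "GPL-2.0-only", "GPL-2.0-or-later",
    "GPL-3.0-only", "GPL-3.0-or-later",
    "AGPL-3.0-only", "AGPL-3.0-or-later",
}

SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")

def scan(root: Path) -> list[tuple[Path, str]]:
    """Return (file, license) pairs for files declaring copyleft terms."""
    flagged = []
    for path in root.rglob("*.py"):  # widen the glob for other languages
        text = path.read_text(errors="ignore")
        for match in SPDX_RE.finditer(text):
            if match.group(1) in COPYLEFT:
                flagged.append((path, match.group(1)))
    return flagged

if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for path, license_id in scan(root):
        print(f"COPYLEFT: {path} declares {license_id}")
```

A check like this fits naturally into a continuous-integration step that fails the build when a copyleft identifier turns up in proprietary code, though it is no substitute for the snippet-level matching that dedicated scanning tools perform.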

Argentieri suggested that startups, and all organizations using AI tools to generate code, develop policies to minimize AI risk. “If you are using third-party AI tools, understand what data is going into those and the rights around the data that is either being input or being trained on,” he said. “Licensing is just one risk; there are many other risks involved in using AI coding tools and other AI tools, including privacy, data security, and potentially leaking personal information into AI models.”

Amy Buttell is a Silver Spring, MD-based technology, legal, and business journalist, content creator, writer, and ghostwriter.