What is Open Software?

Software helps researchers collect data, analyze results, model systems, and share findings efficiently and accurately. Open Software is often described as software that can be adapted, that is, Open Source software. In the world of programming, this often includes some sense of collaboration: someone else who can view and edit a copy of your code can then suggest how to improve your program. Imagine you’re the author of a book and, for each new edition, you crowdsource edits that your readers think will improve the story: new characters, different timelines, you name it!

We’ll get into the nitty-gritty of different interpretations or implementations of open in the context of software shortly, but to start, the following CNBC news story, The Rise Of Open-Source Software (13:51), provides a high-level overview of the history and future directions of Open Source Software.

The Rise of Open Source Software | CNBC

With this underlying understanding of open software as adaptable, open source code, throughout this module we’re going to get more nuanced in exactly how we might interpret open and the impacts this can have on the research life cycle.

But first, in addition to adaptable code, when we talk about software that is open, we often call it free. Free is a term with many interpretations, so we’d like to clarify the difference between free software, Free Software (with capitals), and Open Source software.

Although perhaps nothing in life is truly free, free software simply means that no financial transaction is needed to acquire the software. Lots of software is ostensibly free, but not all of it has its underlying code available to be adapted by anyone wishing to do so. Often free software is provided without any sort of warranty or promise of support. Most social media platforms fit within this definition of free software: there’s no financial transaction needed to access the platform, but users “pay” by viewing advertisements or by having their behaviour tracked on the platform.

When we talk about Open Source software we’re describing the availability of the underlying code for people to edit, modify, and improve. This certainly does not imply that there is no price tag attached to said software. For example, some software is available free of charge for non-commercial use. Alternatively, some companies provide paid support contracts (with the software being free to use for those who do not wish to have access to customer support).

And finally, when we talk about Free Software, we’re describing a specific philosophical approach to the development and distribution of software that goes well beyond simple financial transactions and code adaptability. Free Software, in this latter sense, is defined by the four essential freedoms of the user (you) to:

Run the program as you wish;
View the source code and modify how the program works;
Redistribute the original program;
Distribute modified versions of the program.

Differentiating free software and Open Source software from Free Software is often done, per the GNU philosophy, by suggesting that we’re not talking about free in the sense of “free beer”, but “free speech”.

Dig Deeper

Learn more about the GNU philosophy:

The GNU articulation of Free Software: What is Free Software?
The GNU on how Open Source and Free Software differ: Why Open Source misses the point of Free Software
Richard Stallman, founder of the GNU Project, discusses the ethos behind and need for Free Software: Free software, free society: Richard Stallman at TEDxGeneva 2014 (13:39)

Where Does My Code Run?

When thinking about various forms of free and open software, we need to consider where the software will run: locally on our own computer or in the cloud? Software as a Service (SaaS) is a business model where software applications are hosted and made available to users over the internet. The primary benefits to software users are convenience (it’s easier to sign up for a service with your email address than it is to install a piece of software), low upfront cost (often zero, or a monthly fee), and access to massive-scale computing power. You probably use SaaS every day; Google Search, Zoom, and ChatGPT are all examples of SaaS applications.

In the context of Open, the largest drawback of most SaaS applications is the lack of transparency. The source code for SaaS applications is often proprietary and not shared. So, while users may be able to interact with a system without paying, they often won’t have the four essential freedoms of Free Software. Furthermore, software updates are frequent in SaaS systems, which leads to challenges in replication: if you have used a cloud-based service in your research and the providers materially change the service (or go out of business), then you may not be able to replicate your prior work.

Not all SaaS applications are closed source. Some, such as Mattermost (a Slack alternative), WordPress, H5P, and GitLab (a GitHub alternative), are provided as Open Source software with the option to self-host the application on your own computer, in addition to their SaaS business model.

The Many Facets of Open in Open Software

Just as free can be interpreted differently in different contexts, so can the intent of open.

When we talk about open software supporting the research life cycle, we can think of three tiers of open:

Software whose code is publicly available — that is, anyone can inspect and verify the code;
Software that is open source licensed — that is, not only is the code public, it is available for people to edit, modify, reuse and improve; and
Software that allows for creating human and machine interpretable content.

This last tier can seem like a bit of a leap if you’re used to working in programs like Microsoft Word and Excel, for example, to write and build visualizations like bar charts. We’ll look in more depth, and with examples, at what it means for the underlying structure of a document and its analysis to be both human and machine interpretable shortly. In the meantime, one example that you might already be familiar with, often used in Open Education, is the wiki. It is not uncommon for class research assignments to be built around editing existing pages on Wikipedia or even generating course-specific content on UBC’s Wiki. Wikis employ a lightweight markup language that works much like Markdown, a key tool in building human and machine readable content. More on this shortly!
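To make the idea of human and machine interpretable content a little more concrete, here is a minimal sketch in Python. The sample notes and the helper names (extract_headings, extract_list_items) are made up for illustration, and the two patterns below cover only a small corner of Markdown; real Markdown tools handle far more.

```python
import re

# A tiny Markdown document (hypothetical content). A person can read it
# as-is in any text editor, with no special software required.
NOTES = """\
# Field notes

## Methods

- Collected 12 soil samples
- Measured pH for each sample

## Results

Mean pH was 6.8.
"""


def extract_headings(markdown_text):
    """Return (level, title) pairs for lines that start with 1 to 6 '#' marks."""
    headings = []
    for line in markdown_text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings


def extract_list_items(markdown_text):
    """Return the text of simple bullet items (lines starting with -, * or +)."""
    items = []
    for line in markdown_text.splitlines():
        match = re.match(r"^\s*[-*+]\s+(.*\S)\s*$", line)
        if match:
            items.append(match.group(1))
    return items


if __name__ == "__main__":
    # The same file a human reads can be turned into an outline by a program.
    for level, title in extract_headings(NOTES):
        print("  " * (level - 1) + title)
    print(extract_list_items(NOTES))
```

The point of the sketch is that the structure (headings, lists) lives in the plain text itself, so both a reader and a short program can recover it without the application that originally produced the file.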

Let’s Briefly Recap

Open software can be interpreted by its price tag, the sharing of its underlying source code, and the philosophy that underpins how it should be used and distributed. In short, open source does not equal free, nor does closed source equal paid. And open source can come with many restrictions that Free Software should not.

In supporting transparency in the research process, a key element of open software includes using tools that enable human and machine interpretable content so that even if a particular piece of software is no longer available, a human should still be able to understand what the intent of the code was.

In the scholarship of teaching and learning, engaging in open platforms is just as important as it is in traditional research.

A Note on Licensing

While the above largely speaks to how software is developed and/or interacted with by users and developers, creating and sharing open software is an important part of open scholarship. If you are creating software and want to make it available for others to use, there are several standard copyright licenses you can choose from. We’ll explore open licenses in more detail in the open education section of POSE, but understanding software licensing is essential if you plan to modify existing open source code or incorporate another open source project into your own work.

It’s important to note that open source licenses are specifically designed for software, while Creative Commons licenses, which are used in open access publishing and open educational resources, are generally intended for content-based work like writing, art, and media, not software. It’s also important to note that not all open software licenses are compatible with each other, and you need to be mindful of this when building on someone else’s code. While we won’t dive into the complex details of which specific licenses are compatible, it’s still crucial to pay close attention to licensing when using or combining open source software projects.

Dig Deeper

If you wish to take a bit of a diversion and learn about software licensing, check out the following:

Introductory post from Free Code Camp on open source licenses
A bit more information from Opensource.com
And a 200-page book on open source and free software licenses for those wanting the full story
A dive into licensing implications for code created by GenAI tools
Are you interested in applying an open source license to something you created? If so, explore your workplace’s intellectual property policies. At UBC, this is Policy LR11.

Why Does it Matter?

In some aspects of the research process, being able to follow the linkages between outputs and inputs is a common expectation. For example, when we read a paper, we expect it to cite its sources. We also expect to be able to track down those sources and investigate the strength and validity of the claims being made.

Consider the following scenario:

Scenario – Open, Reproducible Research

Abdul is a geneticist working at UBC. They recently came across an article in their area of research. Abdul contacted the authors to inquire about accessing the data and the scripts used for both cleaning and analysing the data; if the results of the study could be confirmed, there could be a huge impact on Abdul’s area of practice. The authors forwarded Abdul the data, but responded that they hadn’t kept track of everything they did with the data; some of the cleanup and organization happened in Excel and subsequent analyses were done in the statistical program SPSS.

Without being able to reproduce the analyses done on the data – changes were not tracked in Excel and Abdul does not have access to SPSS, a proprietary, closed source application – Abdul is unable to verify the findings claimed in the research article. This leaves Abdul unsure how these findings should be evaluated and interpreted.

In the example above, the research that Abdul came across was neither transparent nor reproducible (refer back to the Open Research module on Reproducibility & Replicability). The use of open software in this situation would have been one step the researchers could have taken to remedy this. Such a choice would have also contributed to the long-term preservation and reliability of both their research inputs and outputs.

Increasingly, if data is being summarized, we expect to be able to review the underlying raw data to understand how it’s been transformed.

Likewise, we should also expect to be able to see and understand the software that was used to interpret that data and generate that output. Proprietary software limits our ability to do so. Open software, on the other hand, helps to increase this transparency.

Additionally, we’ve all encountered files in formats not supported by our operating system or any current program available. Ideally, whether it be the ethics application that initiated a research project, the data collection tool employed, the data processing tool used, or the final output – poster, audio, video, traditional manuscript – we expect to be able to review the content 5, 10, 15 years post-production. Open software, using open formats, helps to facilitate this.
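As a small illustration of what an open format buys us, the sketch below stores a couple of records as CSV, a plain-text format, and reads them back using only Python’s standard library. The records and the function names (to_csv_text, from_csv_text) are invented for this example; any tool that can read text could do the same job.

```python
import csv
import io

# Hypothetical records, invented purely for illustration.
RECORDS = [
    {"participant": "P01", "score": "42"},
    {"participant": "P02", "score": "37"},
]


def to_csv_text(records):
    """Serialize the records as CSV text, an open, plain-text format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["participant", "score"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Read CSV text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    text = to_csv_text(RECORDS)
    print(text)  # readable by a person in any text editor
    print(from_csv_text(text) == RECORDS)  # and recoverable by a program
```

Because the stored form is ordinary text with a simple, documented layout, it does not depend on any one vendor’s program remaining available in 5, 10, or 15 years.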

Reflecting back on the example above, if the data that Abdul needed had been readily available in a standardized, open format (more about this in the module on Open Data), and the cleaning and processing of this data had been done using a readily available open source software solution with proper documentation, Abdul could have more readily verified the results and potentially improved their own research practices. As we’ll see shortly, using scripts to handle the data would improve this transparency and reproducibility still further, making the process an open process using open tools.
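To give a sense of what scripted data handling might look like, here is a minimal sketch in Python. It is not the method used in the scenario: the column names (sample_id, genotype, expression), the sample values, and the cleaning rules are all assumptions made up for illustration. What matters is that every decision about the data is written down in code and recorded in a log, rather than made by hand in a spreadsheet.

```python
import csv
import io

# Hypothetical raw data; the columns and values are invented for this sketch.
RAW_CSV = """\
sample_id,genotype,expression
S1,AA,2.5
S2,,3.1
S3,ab,not recorded
S4,ab,4.0
"""


def clean_rows(raw_text):
    """Apply explicit cleaning rules and return (cleaned_rows, log).

    Assumed rules: drop rows with a missing genotype, drop rows whose
    expression value is not a number, and upper-case the genotype.
    """
    cleaned = []
    log = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        sample = row["sample_id"]
        if not row["genotype"]:
            log.append(f"dropped {sample}: missing genotype")
            continue
        try:
            expression = float(row["expression"])
        except ValueError:
            log.append(f"dropped {sample}: non-numeric expression")
            continue
        cleaned.append(
            {
                "sample_id": sample,
                "genotype": row["genotype"].upper(),
                "expression": expression,
            }
        )
    return cleaned, log


if __name__ == "__main__":
    rows, log = clean_rows(RAW_CSV)
    for entry in log:
        print(entry)
    print(rows)
```

Someone in Abdul’s position could rerun a script like this on the original data and see exactly which rows were removed and why, which is the kind of traceability that untracked spreadsheet edits cannot offer.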