By Cabe Attwell, Codio Contributing Editor
Recent studies show computer science students don’t adequately understand the difference between source-code plagiarism, copyright infringement, and true fair use. As such, is it any wonder plagiarism persists in both academic and professional settings?
It seems like a simple concept: collaboration is good; plagiarism is bad. But in the world of computer science, it isn’t that simple. There is disagreement about what constitutes plagiarism, copyright infringement, and free use, so it is no wonder plagiarism is a major concern in academic and professional CS environments alike.
The Definition of Plagiarism
According to the Merriam-Webster Dictionary, to plagiarize is “to use the words or ideas of another person as if they were your own words or ideas.” When it comes to source code, however, research suggests that students (who later become CS professionals) often plagiarize simply because their understanding of what counts as source-code plagiarism is murky.
In the paper “Source Code Plagiarism – A Student Perspective,” researchers found that the definition of source-code plagiarism was unclear to both professors and students. The majority of students surveyed did not understand the difference between source-code plagiarism and copyright infringement, and free use is another concept entirely. Without a clear-cut definition, research indicates, plagiarism is on the rise, whether intentional or not.
A recent news article published by Stanford reported that academic dishonesty among computer science students has gradually increased each year. While the university has turned to source-code plagiarism detection software to uphold the academic integrity of its CS students, there isn’t enough research to determine whether the programmers are copying intentionally or not.
In some ways, plagiarism in programming isn’t so different from traditional literary plagiarism. If something isn’t your idea, don’t pass it off as your own. That’s easy enough. But plagiarism in programming is different in that some programmers do not fully realize they are using the ideas of others. Until there is a clear definition of ethical practices, the situation isn’t likely to change.
In the aforementioned paper, a majority of computer science students stated that using an idea from another programmer or a textbook “for inspiration” did not constitute plagiarism and did not require citing a source. These are the same programmers who go on to work for companies like Waze and unknowingly commit acts that are not ethical.
It was big news when Google acquisition Waze was hit with a $150 million lawsuit for using open-source code in its navigational application. Open-source means free use, right? Not exactly.
Free use and Creative Commons licensing permit the use of copyrighted materials, depending on the purpose of that use. Typically, free use is granted for the expansion of education or the development of other open-source programs, for the overall benefit of the community in question. It is not granted for late homework assignments or commercialized products, even if you get consent.
In the case of Waze, the company did gain consent to create an open-source platform based on the contributions of the open-source community, but not to commercialize a product. As such, when the technology was purchased by Google for more than $1B, Waze was hit with a hefty lawsuit, and programmers everywhere heeded the warning.
Tools Against Plagiarism
Several programs exist today to catch source-code plagiarism. Among the most popular are Moss and iThenticate. These allow companies and educational institutions to check student work for plagiarism, to ensure students are honest in their practices and, most importantly, that they actually understand the material. While most functions in coding are built upon the same “wheel,” there are still many ways to bake a cake. And if you want to earn your keep in the highly competitive CS environment, you’d better be able to forge your own coding path.
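To give a rough sense of what such a check can look like, here is a minimal sketch in Python. It is an illustration only, not how Moss, iThenticate, or any other product named here works internally. The function names (normalized_tokens, shingles, similarity) and the window size k=4 are assumptions made up for this example. The idea is to replace every identifier and literal with a placeholder, so that renaming variables does not hide a copy, and then to measure how many short token sequences two submissions share.

```python
import io
import keyword
import tokenize

# Token types that carry layout or comments rather than program structure.
SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source: str) -> list[str]:
    """Tokenize Python source, hiding names and literals behind placeholders."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")    # any identifier looks the same
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)  # keywords and operators are kept as-is
    return tokens


def shingles(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the two programs' token shingles (0.0 to 1.0)."""
    sa = shingles(normalized_tokens(a), k)
    sb = shingles(normalized_tokens(b), k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = (
    "def add_all(values):\n    acc = 0  # running sum\n"
    "    for v in values:\n        acc += v\n    return acc\n"
)

print(similarity(original, renamed))  # 1.0: only names and a comment changed
```

A high score is a prompt for a human to look closer, not a verdict. Two students who solve a small exercise in the obvious way will often produce near-identical token sequences without copying anything.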
Because plagiarism detection is so important to computer science professors, Codio has integrated a plagiarism detection tool that is twice as effective. Its results are always available to the teacher, and the teacher decides whether plagiarism has occurred.
Most programmers don’t plagiarize intentionally. Members of the coding community help one another through forums like Stack Exchange and GitHub, but that doesn’t mean this help is something that can be commercialized.
First and foremost, the CS community needs an agreed-upon definition of source-code plagiarism and copyright infringement. If the academic community can drill these concepts into students through mandatory ethics courses, at least university-educated programmers will have a firm grasp of what plagiarism is, which should decrease its prevalence. As for self-taught programmers, the concept isn’t hard to grasp, and such ethics courses could also be made open-source.
If we all work together, we can support one another’s ingenuity to create new, innovative technology solutions that the world has never seen, and which would be impossible if we kept recreating copy-pasted programs that already exist. And where is the fun in that?