Changeset - 9a142ebfd720
Bradley Kuhn (bkuhn) - 8 years ago 2016-08-09 05:05:59
bkuhn@ebb.org
Add link to partial source release.
1 file changed with 1 insertions and 1 deletions:
0 comments (0 inline, 0 general)
www/conservancy/static/copyleft-compliance/vmware-code-similarity.html
 
@@ -12,13 +12,13 @@
 
<h1 id="establishing-a-baseline-of-the-ccfinderx-tool">Establishing A Baseline of the CCFinderX Tool</h1>
 
<p>CCFinderX offers many statistics for clone detection. After expert analysis, we concluded that the statistic most relevant to this situation is the &quot;ratio of similarity&quot; between the existing code and the new code. To establish a baseline, we considered two different comparisons of Free and Open Source Software (FOSS). First, we compared the Linux kernel, version 4.5.2, to the FreeBSD kernel, version 10.3.0. This comparison was inspired by the similar 2002 study <a href="#fn5" class="footnoteRef" id="fnref5"><sup>5</sup></a> of these two large C programs. Our hypothesis was that CCFinderX would find a small percentage of code similarity, since the FreeBSD and Linux projects collaborate on some subprojects and willingly share code under the 3-Clause BSD license for those parts. (These collaborations are public and well documented.)</p>
 
<p>The experiment confirmed the hypothesis. We found a 3.68% &quot;ratio of similarity&quot; when comparing code from the Linux kernel to the FreeBSD kernel.</p>
 
<p>Next, we compared the source code of the Linux kernel, version 4.5.2, to the LLVM+Clang system, version 3.8.0. These two projects are each a large program written in the C programming language, but they are not known to actively share code. We would expect some very minimal similarity simply due to chance, but something much lower than the 3.68% found between the Linux and FreeBSD kernels.</p>
 
<p>Indeed, when the same test was run to compare Linux to the LLVM+Clang system, the &quot;ratio of similarity&quot; was 0.075%.</p>
 
<h1 id="general-comparison-of-linux-kernel-to-vmware-sources">General Comparison of Linux Kernel to VMware sources</h1>
 
-<p>With the baseline established, we now begin relevant comparisons. First, we compare the Linux kernel version 2.6.34 to the sources released by VMware in their (partial) source release. the &quot;ratio of similarity&quot; between Linux 2.6.34 and VMware's partial source release is 20.72%. There is little question that much of VMware's kernel has come from Linux.</p>
 
+<p>With the baseline established, we now begin relevant comparisons. First, we compare the Linux kernel version 2.6.34 to the sources <a href="http://k.sfconservancy.org/vmkdrivers">released by VMware in their (partial) source release</a>. The &quot;ratio of similarity&quot; between Linux 2.6.34 and VMware's partial source release is 20.72%. There is little question that much of VMware's kernel has come from Linux.</p>
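<p>To put the 20.72% figure next to the two baselines, the following is a minimal Python sketch that divides it by each baseline ratio. The three percentages come from the text; the constant names and the <code>times_baseline</code> helper are hypothetical, and the resulting multiples are only a rough reading aid, not a statistic that CCFinderX reports.</p>

```python
# Ratios of similarity quoted in the text, in percent.
LINUX_VS_FREEBSD = 3.68      # Linux 4.5.2 vs. FreeBSD 10.3.0
LINUX_VS_LLVM_CLANG = 0.075  # Linux 4.5.2 vs. LLVM+Clang 3.8.0
LINUX_VS_VMWARE = 20.72      # Linux 2.6.34 vs. VMware partial source release


def times_baseline(ratio: float, baseline: float) -> float:
    """Return how many times larger `ratio` is than `baseline`."""
    return ratio / baseline


if __name__ == "__main__":
    vs_freebsd = times_baseline(LINUX_VS_VMWARE, LINUX_VS_FREEBSD)
    vs_llvm = times_baseline(LINUX_VS_VMWARE, LINUX_VS_LLVM_CLANG)
    print(f"vs. Linux/FreeBSD baseline:    {vs_freebsd:.1f}x")
    print(f"vs. Linux/LLVM+Clang baseline: {vs_llvm:.1f}x")
```

<p>Under these numbers, the VMware comparison is between five and six times the Linux/FreeBSD baseline, where code sharing is known to occur, and a few hundred times the Linux/LLVM+Clang baseline, where it is not.</p>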
 
<h1 id="methodology-of-showing-hellwigs-contributions-in-vmware-esxi-5.5-sources">Methodology Of Showing Hellwig's Contributions in VMware ESXi 5.5 Sources</h1>
 
<p>The following describes a methodology to show Hellwig's contributions to Linux, and how they compare to code found in VMware ESXi 5.5.</p>
 
<h2 id="extracting-hellwigs-contributions-from-linux-historical-repository">Extracting Hellwig's Contributions From Linux Historical Repository</h2>
 
<p>Excellent records exist of contributions made to Linux from 2002-02-04 through the present date. From 2002-02-04 through 2005-04-03, BitKeeper was used to store the revision control history of Linux. Each improvement contributed to Linux has information regarding who placed the contribution in Linux, and a comment field in which the contributor can credit others, such as by noting that the contribution actually came from someone else.</p>
 
<p>I extracted from the historical Linux tree the identifying numbers of all commits that either list Hellwig in the official Author field, or in which the person in the Author field left notes clearly indicating that the contribution was done by Hellwig. For the latter, the following regular expression search against the log file was used:</p>
 
<pre><code>(Submitted\s+by|original\s+patch|patch\s+(from|by)|originally\s+(from|by)).*Hellwig</code></pre>
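<p>The selection rule described above can be sketched in Python as follows. The regular expression is copied from the text; everything else is an assumption made for illustration, including the <code>is_hellwig_commit</code> and <code>select_commits</code> helpers and the dictionary layout of a commit record. The tool actually used for the search is not stated in the text. Because <code>.</code> does not match a newline by default, the credit phrase and the name must appear on the same line of the commit message, and the match is case-sensitive as written.</p>

```python
import re

# Pattern copied verbatim from the text above. No flags are set, so the
# search is case-sensitive and "." does not cross line boundaries.
CREDIT_RE = re.compile(
    r"(Submitted\s+by|original\s+patch|patch\s+(from|by)|originally\s+(from|by)).*Hellwig"
)


def is_hellwig_commit(author: str, message: str) -> bool:
    """Hypothetical helper: True if the commit is attributed to Hellwig.

    A commit qualifies if Hellwig appears in the Author field, or if the
    commit message credits Hellwig in a way the pattern recognizes.
    """
    if "Hellwig" in author:
        return True
    return CREDIT_RE.search(message) is not None


def select_commits(commits):
    """Return the ids of qualifying commits.

    `commits` is assumed to be an iterable of dicts with the keys
    "id", "author", and "message" (an illustrative layout only).
    """
    return [
        c["id"] for c in commits
        if is_hellwig_commit(c["author"], c["message"])
    ]


if __name__ == "__main__":
    sample = [
        {"id": "a1", "author": "Hellwig <hellwig@example.org>",
         "message": "fix locking"},
        {"id": "b2", "author": "Jane Doe <jane@example.org>",
         "message": "scsi cleanup\n\npatch from Hellwig"},
        {"id": "c3", "author": "Jane Doe <jane@example.org>",
         "message": "unrelated change"},
    ]
    print(select_commits(sample))
```

<p>Matches found this way are candidates only; each credited commit would still need to be read to confirm that the note really attributes the change to Hellwig.</p>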