Changeset - 20d1d0419125
Bradley Kuhn (bkuhn) - 8 years ago 2016-08-09 04:57:27
bkuhn@ebb.org
Fix code command formatting.
1 file changed with 7 insertions and 1 deletions:
www/conservancy/static/copyleft-compliance/vmware-code-similarity.html
 
@@ -21,13 +21,19 @@
 
<h2 id="extracting-hellwigs-contributions-from-linux-historical-repository">Extracting Hellwig's Contributions From Linux Historical Repository</h2>
 
<p>Excellent records exist of contributions made to Linux from 2002-02-04 through the present. From 2002-02-04 through 2005-04-03, BitKeeper was used to store the revision control history of Linux. Each improvement contributed to Linux carries information about who placed the contribution in Linux, and a comment field in which the contributor can credit others, such as by noting that the contribution actually came from someone else.</p>
 
<p>I extracted from the historical Linux tree the identifying number of every commit that either lists Hellwig in the official Author field, or in which the person in the Author field left notes clearly indicating that the contribution was done by Hellwig. For the latter, the following regular expression search against the log file was used:</p>
 
<pre><code>(Submitted\s+by|original\s+patch|patch\s+(from|by)|originally\s+(from|by)).*Hellwig</code></pre>
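<p>To illustrate the two selection criteria, here is a minimal Python sketch, not the linked Perl script. It applies the regular expression above to a commit message and separately checks the Author field. The function name <code>credits_hellwig</code>, the sample authors, and the sample messages are hypothetical, and the case-sensitive match is an assumption about how the original search behaved.</p>

```python
import re

# The pattern quoted above, compiled unchanged.
# Assumption: the match is case-sensitive, as the pattern is written.
CREDIT_PATTERN = re.compile(
    r"(Submitted\s+by|original\s+patch|patch\s+(from|by)|originally\s+(from|by)).*Hellwig"
)


def credits_hellwig(author, message):
    """Return True if the Author field names Hellwig or the message credits him."""
    if "Hellwig" in author:
        return True
    # '.' does not match a newline by default, so the name must appear
    # on the same line as the end of the credit phrase.
    return CREDIT_PATTERN.search(message) is not None


# Hypothetical log entries, invented purely for illustration.
samples = [
    ("Christoph Hellwig <hch@example.org>", "fix locking"),
    ("Some Maintainer <m@example.org>", "cleanup\n\nSubmitted by Christoph Hellwig"),
    ("Some Maintainer <m@example.org>", "Reviewed by Christoph Hellwig"),
]

for author, message in samples:
    print(credits_hellwig(author, message), "-", author)
```

<p>The third sample is deliberately rejected: a review credit is not one of the phrases in the pattern, so it is not treated as evidence of authorship.</p>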
 
<p>Specifically, I used <a href="https://github.com/conservancy/gpl-compliance-tools/blob/master/commit-id-list-matching-regex.plx">a script</a> to extract a list of commit ids from the <a href="git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git">historical Linux repository</a>. This method found 1,012 separate occasions of contribution by Hellwig from 2002-02-04 through 2005-04-03.</p>
 
<p>After finding these separate occasions of contribution, I then extracted the source code lines that Hellwig added or changed in each contribution in this repository. I did so by carefully cross-referencing the commits that Hellwig performed with the output of <code>git blame</code>. I specifically <a href="https://github.com/conservancy/gpl-compliance-tools/blob/master/extract-code-added-in-commits.plx">wrote a script</a> to extract only the lines that Hellwig changed or added in that repository, and placed only those contributions identifiable as Hellwig's into new files whose names matched the original filenames. This created a corpus of code that can be verified as added or changed by Hellwig and no one else.</p>
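<p>The cross-referencing idea can be sketched as follows. This is an illustration only and does not reproduce the linked script: it assumes the blame data is available in the format printed by <code>git blame --line-porcelain</code>, where each source line is preceded by a header beginning with the commit id and the line's content is prefixed with a tab. The function name <code>lines_from_commits</code> and the sample data, including the obviously fake commit ids, are hypothetical.</p>

```python
def lines_from_commits(porcelain, wanted):
    """Keep only the source lines that blame attributes to a commit id in `wanted`.

    `porcelain` is text in `git blame --line-porcelain` format.
    """
    kept = []
    current = None  # commit id of the entry being read, if any
    for raw in porcelain.split("\n"):
        if raw.startswith("\t"):
            # A tab-prefixed line carries the source text and ends the entry.
            if current in wanted:
                kept.append(raw[1:])
            current = None
        elif current is None and raw:
            # First header line of an entry: "<commit id> <orig line> <final line> ..."
            current = raw.split(" ", 1)[0]
        # Other header lines (author, filename, ...) are ignored.
    return kept


# Fake commit ids and blame output, invented purely for illustration.
SHA_A = "a" * 40
SHA_B = "b" * 40
sample = (
    f"{SHA_A} 1 1 1\nauthor Example One\nfilename demo.c\n\tint x;\n"
    f"{SHA_B} 2 2 1\nauthor Example Two\nfilename demo.c\n\tint y;\n"
    f"{SHA_A} 3 3 1\nauthor Example One\nfilename demo.c\n\treturn x;\n"
)

print(lines_from_commits(sample, {SHA_A}))
```

<p>Filtering by commit id rather than by author name means that a line counts only if it traces back to one of the commits already identified, which is the property the corpus relies on.</p>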
 
<p>Here are the specific commands I ran:</p>
 
<pre><code>$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git linux-historical
 
$ ./commit-id-list-matching-regex.plx `pwd`/linux-historical/.git Hellwig &#39;(Submitted\s+by|original\s+patch|patch\s+from|originally\s+by).*&#39; &gt; hellwig-historical.ids
 
$ ./extract-code-added-in-commits.plx --repository=`pwd`/linux-historical --output-dir=`pwd`/hellwig-historical --central-commit e7e173af42dbf37b1d946f9ee00219cb3b2bea6a --progress --blame-opts=-M --blame-opts=-C &lt; ./hellwig-historical.ids
 
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-current
 
$ ./commit-id-list-matching-regex.plx `pwd`/linux-current/.git Hellwig &#39;(Submitted\s+by|original\s+patch|patch\s+(from|by)|originally\s+(from|by)).*&#39; &gt; ./hellwig-current.ids
 
$ ./extract-code-added-in-commits.plx --progress --repository=`pwd`/linux-current --output-dir=`pwd`/hellwig-through-2.6.34 --fork-limit=14 --blame-opts=-M --blame-opts=-M --blame-opts=-C --blame-opts=-C --central-commit e40152ee1e1c7a63f4777791863215e3faa37a86 &lt; hellwig-current.ids</code></pre>
 
<p>Note: e40152ee1e1c7a63f4777791863215e3faa37a86 is the 2.6.34 version created by Linus Torvalds <script type="text/javascript">
 
<!--
 
h='&#108;&#x69;&#110;&#x75;&#120;&#x2d;&#102;&#x6f;&#x75;&#110;&#100;&#x61;&#116;&#x69;&#x6f;&#110;&#46;&#x6f;&#114;&#x67;';a='&#64;';n='&#116;&#x6f;&#114;&#118;&#x61;&#108;&#100;&#x73;';e=n+a+h;
 
document.write('<a h'+'ref'+'="ma'+'ilto'+':'+e+'">'+e+'<\/'+'a'+'>');
 
// -->
 
</script><noscript>&#116;&#x6f;&#114;&#118;&#x61;&#108;&#100;&#x73;&#32;&#x61;&#116;&#32;&#108;&#x69;&#110;&#x75;&#120;&#x2d;&#102;&#x6f;&#x75;&#110;&#100;&#x61;&#116;&#x69;&#x6f;&#110;&#32;&#100;&#x6f;&#116;&#32;&#x6f;&#114;&#x67;</noscript> on 2010-05-16 14:17:36 -0700, with Git commit comment: &quot;Linux 2.6.34&quot;.</p>