<!DOCTYPE html>
<html lang="en">

<head>

<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="shortcut icon" type="image/webp" href="https://branding.ewpratten.com/pfp/2022/460x460.webp" />

<link rel="canonical" href="https://ewpratten.com/blog/shift2/" />

<link rel="alternate" type="application/rss+xml" title="RSS" href="https://ewpratten.com/rss.xml">

<meta name="twitter:card" content="summary" />
<meta name="og:site" content="ewpratten.com" />
<meta name="og:site_name" content="Evan Pratten" />

<meta name="og:image"
content="https://branding.ewpratten.com/pfp/2022/460x460.webp" />

<meta property="og:description" content="XOR is pretty cool" />
<meta property="description" content="XOR is pretty cool" />
<meta name="description" content="XOR is pretty cool">

<meta property="og:title" content="Keyed data encoding with Python - Evan Pratten" />

<meta property="og:type" content="article" />

<title>Keyed data encoding with Python | Evan Pratten</title>

<link rel="stylesheet" href="/global.css">

<link rel="stylesheet" href="/dist/github-markdown-css/github-markdown-light.css" lazyload>
<link rel="stylesheet" href="/styles/bootstrap.css" lazyload>
<link rel="stylesheet" href="/styles/typography.css">

</head>

<body>

<div class="page">

<link rel="stylesheet" href="/styles/components/heading-card.css">

<div class="heading-card">
<div class="profile-photo-container">
<img src="https://branding.ewpratten.com/pfp/2022/460x460.webp" alt="Profile Photo" loading="lazy">
</div>
<div class="text-container">
<h1>Evan Pratten</h1>
<p>Software Developer</p>
</div>
</div>

<div class="container">

<link rel="stylesheet" href="/styles/components/navbar.css">

<div class="ewp-navbar">
<hr>
<ul class="navbar-items">
<li><a href="/">Home</a></li>
<li class="separator">|</li>
<li><a href="/timeline">Timeline</a></li>
<li class="separator">|</li>
<li class="dropdown-center">
<a href="#" role="button" data-bs-toggle="dropdown" aria-expanded="false">
More
</a>
<ul class="dropdown-menu">

<li><a class="dropdown-item" href="/photography">Photography</a></li>
<li><a class="dropdown-item" href="/contact">Contact</a></li>
</ul>
</li>

</ul>
<hr>
</div>
</div>

<article id="content" class="container markdown-body">

<h1 style="margin-bottom:0;padding-bottom:0;">Keyed data encoding with Python</h1>
<em>XOR is pretty cool</em>
<br><br>

<p>I have always been interested in text and data encoding, so last year I made my first encoding tool. <a rel="noopener" target="_blank" href="https://github.com/Ewpratten/shift64">Shift64</a> was designed to take plaintext data and a key, and convert them into a block of base64 that could, in theory, only be decoded with the original key. I had a lot of fun with this tool, and a very stripped-down version of it ended up as a bonus question on the <a rel="noopener" target="_blank" href="https://github.com/frc5024/Programming-Test/blob/master/test.md">5024 Programming Test</a> for 2018/2019. Yes, the key was in fact <code>5024</code>.</p>
<p>This tool had some issues. Firstly, the code was a mess and only accepted hard-coded values. This made it impractical as an everyday tool, and a nightmare to continue developing. Secondly, the encoder made use of entropy bits and self-modifying keys, which could end up producing encoded files larger than 1GB from just the word <em>hello</em>.</p>
<h2 id="shift2">Shift2</h2>
<p>One of the oldest items on my TODO list has been to rewrite shift64, so I made a brand new tool out of it. <a rel="noopener" target="_blank" href="https://github.com/Ewpratten/shift">Shift2</a> is both a command-line tool and a Python 3 library that can efficiently encode and decode text data with a single key (unlike shift64, which used two keys concatenated into a single string and separated by a colon).</p>
<h3 id="how-it-works">How it works</h3>
<p>Shift2 takes two inputs: a <code>file</code> and a <code>key</code>. These two strings are used to produce a single output, the <code>message</code>.</p>
<p>When encoding a file, shift2 starts by encoding the raw data with <a rel="noopener" target="_blank" href="https://en.wikipedia.org/wiki/Ascii85">base85</a>, so that everything passed to the next stage (even binary data) can be represented as a UTF-8 string. This base85 data is then XOR encrypted with a rotating key: each character is XORed with one character of the key, and the key repeats once it runs out. This operation can be expressed with the following (this example ignores the base85 encoding steps):</p>
<pre data-lang="python" style="background-color:#2b303b;color:#c0c5ce;" class="language-python "><code class="language-python" data-lang="python"><span>file = "</span><span style="color:#a3be8c;">Hello reader! I am some input that needs to be encoded</span><span>"
</span><span>key = "</span><span style="color:#a3be8c;">ewpratten</span><span>"
</span><span>
</span><span>message = ""
</span><span>
</span><span style="color:#65737e;"># XOR each character with one character of the key, cycling through the key.
</span><span style="color:#65737e;"># Note: the index is (i % len(key)) - 1, so the first character pairs with key[-1].
</span><span style="color:#b48ead;">for </span><span>i, char </span><span style="color:#b48ead;">in </span><span style="color:#96b5b4;">enumerate</span><span>(file):
</span><span>    message += </span><span style="color:#96b5b4;">chr</span><span>(
</span><span>        </span><span style="color:#96b5b4;">ord</span><span>(char) ^ </span><span style="color:#96b5b4;">ord</span><span>(key[i % </span><span style="color:#96b5b4;">len</span><span>(key) - </span><span style="color:#d08770;">1</span><span>])
</span><span>    )
</span><span>
</span></code></pre>
<p>The output of this contains non-displayable characters, so a second base85 encoding is used to make it printable. Running the example snippet above, then base85-encoding the <code>message</code> once, results in:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>CIA~89YF&gt;W1PTBJQBo*W6$nli7#$Zu9U2uI5my8n002}A3jh-XQWYCi2Ma|K9uW=@5di
</span></code></pre>
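<p>To see why the intermediate result needs a second encoding, here is a small sketch (not taken from shift2 itself) that XORs a short ASCII string using the same key indexing as the snippet above and checks which of the resulting characters are printable. It assumes Python's <code>base64.b85encode</code> is an acceptable stand-in for the base85 step, and <code>xor_text</code> is a hypothetical helper name.</p>

```python
import base64


def xor_text(text: str, key: str) -> bytes:
    """XOR each character with the key, indexing the key like the snippet above.

    Assumes both strings are plain ASCII, so every result fits in one byte.
    """
    return bytes(
        ord(char) ^ ord(key[i % len(key) - 1])
        for i, char in enumerate(text)
    )


raw = xor_text("Hello", "ewpratten")
print(raw)  # b'&\x00\x1b\x1c\x1d'
print([chr(b).isprintable() for b in raw])  # [True, False, False, False, False]

# base85 maps arbitrary bytes back onto printable ASCII characters.
wrapped = base64.b85encode(raw).decode("ascii")
print(wrapped.isprintable())  # True
```

<p>Only the first byte happens to be a printable character. The rest fall in the ASCII control range, which is why the raw XOR output cannot be displayed or pasted around as plain text.</p>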
<p>If you were using the shift2 command-line tool, you would see a different output than the base85 string shown earlier:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>B2-is8Y&amp;4!ED2H~Ix&lt;~LOCfn@P;xLjM_E8(awt`1YC&lt;SaOLbpaL^T!^W_ucF8Er~?NnC$&gt;e0@WAWn2bqc6M1yP+DqF4M_kSCp0uA5h-&gt;H
</span></code></pre>
<p>This is for a few reasons. Firstly, as mentioned above, shift2 uses base85 <strong>twice</strong>: once before and once after XOR encryption. Secondly, a file header is prepended to the output to help the decoder read the file. This header contains version info, the file length, and the encoding type.</p>
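<p>The two base85 passes around the XOR step can be sketched as follows. This is a minimal illustration rather than shift2's actual code: it assumes Python's <code>base64.b85encode</code> as the base85 variant, uses a plain repeating key (the snippet above offsets the key index by one), and leaves out the file header entirely, so its output will not match the command-line tool. The function names are made up for this example.</p>

```python
import base64
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encode(data: bytes, key: str) -> str:
    """base85 -> XOR -> base85 (no header is added in this sketch)."""
    text_safe = base64.b85encode(data)
    mixed = xor_with_key(text_safe, key.encode("utf-8"))
    return base64.b85encode(mixed).decode("ascii")


def decode(message: str, key: str) -> bytes:
    """Undo encode() by running its steps in reverse order."""
    mixed = base64.b85decode(message)
    text_safe = xor_with_key(mixed, key.encode("utf-8"))
    return base64.b85decode(text_safe)


secret = encode(b"Hello reader!", "ewpratten")
print(secret)
print(decode(secret, "ewpratten"))  # b'Hello reader!'
```

<p>Decoding works because XORing with the same key a second time cancels the first pass, so no separate decryption routine is needed.</p>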
<h3 id="try-it-yourself-with-pip">Try it yourself with pip</h3>
<p>I have published shift2 on <a rel="noopener" target="_blank" href="https://pypi.org/project/shift-tool/">pypi.org</a> for use with pip. To install shift2, ensure both <code>python3</code> and <code>python3-pip</code> are installed on your computer, then run:</p>
<pre data-lang="sh" style="background-color:#2b303b;color:#c0c5ce;" class="language-sh "><code class="language-sh" data-lang="sh"><span style="color:#65737e;"># Install shift2
</span><span style="color:#bf616a;">pip3</span><span> install shift-tool
</span><span>
</span><span style="color:#65737e;"># View the help for shift2
</span><span style="color:#bf616a;">shift2 -h
</span></code></pre>
<div id="demo" markdown="1">
<h3 id="try-it-in-the-browser">Try it in the browser</h3>
<p>I have ported the core code from shift2 to <a rel="noopener" target="_blank" href="http://www.brython.info/index.html">run in the browser</a>. This demo is entirely client-side, and may take a few seconds to load depending on your device.</p>
<input type="radio" id="encode" name="shift-action" value="encode" checked>
<label for="encode">Encode</label>
<input type="radio" id="decode" name="shift-action" value="decode">
<label for="decode">Decode</label>
<p><input type="text" id="key" name="key" placeholder="Encoding key" required><br>
<input type="text" id="msg" name="msg" placeholder="Message" required size="30"></p>
<p><button type="button" class="btn btn-primary" id="shift-button" disabled>shift2 demo is loading... (this may take a few seconds)</button></p>
</div>
<h3 id="future-plans">Future plans</h3>
<p>Because shift2 can also be used as a library (as outlined in the <a rel="noopener" target="_blank" href="https://github.com/Ewpratten/shift/blob/master/README.md">README</a>), I would like to write a program that lets users talk to each other IRC-style over a TCP port. This program would use either a pre-shared or a generated key to encode and decode messages on the fly.</p>
<p>If you are interested in helping out, or taking on this idea for yourself, send me an email.</p>
<!-- Python code -->
<script type="text/python" src="/assets/python/shift2/shift2demo.py"></script>

</article>

<link rel="stylesheet" href="/styles/components/footer.css">

<div class="footer">
<br>
<span class="gray">-- EOF --</span>
<p>
Site design &amp; content by: <a href="/contact">Evan Pratten</a><br>
Consider <a href="/donate" target="_blank">supporting my work</a> if you like what you see<br>
</p>
</div>
</div>

<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.2.2/dist/js/bootstrap.bundle.min.js"
integrity="sha384-OERcA2EqjJCMA+/3y+gxIOqMEjwtxJY7qPCqsdltbNJuaOe923+mo//f6V8Qbsw3"
crossorigin="anonymous"></script>

<!-- Global site tag (gtag.js) - Google Analytics -->
<script defer src="https://www.googletagmanager.com/gtag/js?id=G-5912H4H03P"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() { dataLayer.push(arguments); }
gtag('js', new Date());

gtag('config', 'G-5912H4H03P');
</script>
</body>

</html>