<head>
|
|
<title>Evan Pratten</title>
|
|
<meta charset="utf-8" />
|
|
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no" />
|
|
|
|
<!-- Begin Jekyll SEO tag v2.6.1 -->
|
|
<title>Keyed data encoding with Python | Evan Pratten</title>
|
|
<meta name="generator" content="Jekyll v4.0.0" />
|
|
<meta property="og:title" content="Keyed data encoding with Python" />
|
|
<meta property="og:locale" content="en_US" />
|
|
<meta name="description" content="XOR is pretty cool" />
|
|
<meta property="og:description" content="XOR is pretty cool" />
|
|
<link rel="canonical" href="http://0.0.0.0:4000/blog/2019/08/24/shift2" />
|
|
<meta property="og:url" content="http://0.0.0.0:4000/blog/2019/08/24/shift2" />
|
|
<meta property="og:site_name" content="Evan Pratten" />
|
|
<meta property="og:type" content="article" />
|
|
<meta property="article:published_time" content="2019-08-24T09:13:00-04:00" />
|
|
<script type="application/ld+json">
|
|
{"datePublished":"2019-08-24T09:13:00-04:00","mainEntityOfPage":{"@type":"WebPage","@id":"http://0.0.0.0:4000/blog/2019/08/24/shift2"},"@type":"BlogPosting","url":"http://0.0.0.0:4000/blog/2019/08/24/shift2","headline":"Keyed data encoding with Python","description":"XOR is pretty cool","dateModified":"2019-08-24T09:13:00-04:00","@context":"https://schema.org"}</script>
|
|
<!-- End Jekyll SEO tag -->
|
|
|
|
|
|
|
|
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
|
|
integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
|
|
<link rel="stylesheet" href="/assets/css/main.css">
|
|
<link rel="stylesheet" href="/assets/css/github-syntax.css">
|
|
<link href="https://fonts.googleapis.com/css?family=IBM+Plex+Mono:400,400i|IBM+Plex+Sans:100,100i,400,400i,700,700i" rel="stylesheet">
|
|
<link href="https://stackpath.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous">
|
|
</head>
|
|
|
|
<body>
|
|
|
|
<div class="site-ctr">
|
|
<!-- Navbar -->
|
|
<nav class="navbar navbar-dark sticky-top bg-dark navbar-expand-lg">
|
|
<!-- Navbar content -->
|
|
<!-- <div class="container"> -->
|
|
<a class="navbar-brand" href="/">Evan Pratten</a>
|
|
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarNavAltMarkup" aria-controls="navbarNavAltMarkup" aria-expanded="false" aria-label="Toggle navigation">
|
|
<span class="navbar-toggler-icon"></span>
|
|
</button>
|
|
<div class="collapse navbar-collapse" id="navbarNavAltMarkup">
|
|
<div class="navbar-nav ml-auto">
|
|
<a class="nav-item nav-link" href="/blog">Blog</a>
|
|
<a class="nav-item nav-link" href="/projects">Projects</a>
|
|
<!-- <a class="nav-item nav-link" href="/documentation">Documentation</a> -->
|
|
<a class="nav-item nav-link" href="/about">About</a>
|
|
</div>
|
|
<!-- </div> -->
|
|
</div>
|
|
</nav>
|
|
<!-- <div style="height:5vh"></div> -->
|
|
|
|
<!-- Header -->
|
|
<!-- <div class="header">
|
|
<div class="container">
|
|
<div class="content">
|
|
</div>
|
|
</div>
|
|
<div class="header-gap"></div>
|
|
</div> -->
|
|
|
|
<div class="reactive-bg">
|
|
<div class="post container">
|
|
<h1>Keyed data encoding with Python
|
|
|
|
</h1>
|
|
<h4>XOR is pretty cool
|
|
|
|
</h4>
|
|
<hr>
|
|
<p><em>2019-08-24 09:13:00 -0400
|
|
|
|
</em></p>
|
|
|
|
<br>
|
|
|
|
<p>I have always been interested in text and data encoding, so last year I made my first encoding tool. <a href="https://github.com/Ewpratten/shift64">Shift64</a> was designed to take plaintext data and a key, and convert them into a block of base64 that could, in theory, only be decoded with the original key. I had a lot of fun with this tool, and a very stripped-down version of it actually ended up as a bonus question on the <a href="https://github.com/frc5024/Programming-Test/blob/master/test.md">5024 Programming Test</a> for 2018/2019. Yes, the key was in fact <code class="highlighter-rouge">5024</code>.</p>

<p>This tool had some issues. Firstly, the code was a mess and only accepted hard-coded values. This made it very impractical as an everyday tool, and a nightmare to continue developing. Secondly, the encoder made use of entropy bits and self-modifying keys that would end up producing encoded files larger than 1 GB from just the word <em>hello</em>.</p>
|
|
|
|
<h2 id="shift2">Shift2</h2>
|
|
<p>One of the oldest items on my TODO list has been to rewrite shift64, so I made a brand new tool out of it. <a href="https://github.com/Ewpratten/shift">Shift2</a> is both a command-line tool and a Python3 library that can efficiently encode and decode text data with a single key (unlike shift64, which used two keys concatenated into a single string and separated by a colon).</p>
|
|
|
|
<h3 id="how-it-works">How it works</h3>
|
|
<p>Shift2 has two inputs: a <code class="highlighter-rouge">file</code> and a <code class="highlighter-rouge">key</code>. These two strings are used to produce a single output, the <code class="highlighter-rouge">message</code>.</p>

<p>When encoding a file, shift2 starts by encoding the raw data with <a href="https://en.wikipedia.org/wiki/Ascii85">base85</a>, to ensure that all data being passed to the next stage can be represented as a UTF-8 string (even binary data). This base85 data is then XOR-encrypted with a rotating key: each character is XORed with one character of the key, and the key repeats once it runs out. The XOR step can be expressed as follows (this example ignores the base85 encoding steps):</p>
|
|
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">file</span> <span class="o">=</span> <span class="s">"Hello reader! I am some input that needs to be encoded"</span>
<span class="n">key</span> <span class="o">=</span> <span class="s">"ewpratten"</span>

<span class="n">message</span> <span class="o">=</span> <span class="s">""</span>

<span class="c1"># XOR each character with one key character, cycling through the key.</span>
<span class="c1"># Note: "i % len(key) - 1" starts at key[-1], the last character of the key.</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">char</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="nb">file</span><span class="p">):</span>
    <span class="n">message</span> <span class="o">+=</span> <span class="nb">chr</span><span class="p">(</span>
        <span class="nb">ord</span><span class="p">(</span><span class="n">char</span><span class="p">)</span> <span class="o">^</span> <span class="nb">ord</span><span class="p">(</span><span class="n">key</span><span class="p">[</span><span class="n">i</span> <span class="o">%</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">])</span>
    <span class="p">)</span>
</code></pre></div></div>
|
|
|
|
<p>The output of this XOR step contains non-displayable characters, so a second base85 encoding is applied to make it printable. Running the example snippet above, then base85-encoding the <code class="highlighter-rouge">message</code> once, results in:</p>
|
|
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CIA~89YF>W1PTBJQBo*W6$nli7#$Zu9U2uI5my8n002}A3jh-XQWYCi2Ma|K9uW=@5di
</code></pre></div></div>
|
|
|
|
<p>If using the shift2 command-line tool, you would see a different output:</p>
|
|
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>B2-is8Y&4!ED2H~Ix<~LOCfn@P;xLjM_E8(awt`1YC<SaOLbpaL^T!^W_ucF8Er~?NnC$>e0@WAWn2bqc6M1yP+DqF4M_kSCp0uA5h->H
</code></pre></div></div>
|
|
|
|
<p>This is for two reasons. Firstly, as mentioned above, shift2 uses base85 <strong>twice</strong>: once before and once after XOR encryption. Secondly, a file header is prepended to the output to help the decoder read the file. This header contains version info, the file length, and the encoding type.</p>
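<p>The base85, XOR, base85 pipeline, and its reverse, can be sketched in a few lines. This is a minimal illustration and not shift2's actual source. It assumes Python's <code class="highlighter-rouge">base64.b85encode</code> for both base85 passes, leaves out the file header entirely, and starts the key at its first character (the earlier snippet begins at the key's last character because of its <code class="highlighter-rouge">- 1</code> offset). The names <code class="highlighter-rouge">xor_with_key</code>, <code class="highlighter-rouge">encode</code>, and <code class="highlighter-rouge">decode</code> are made up for this example.</p>

```python
import base64
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encode(raw: bytes, key: str) -> str:
    """base85 -> XOR -> base85. No file header is added in this sketch."""
    stage1 = base64.b85encode(raw)                       # make any input printable
    stage2 = xor_with_key(stage1, key.encode("utf-8"))   # may contain unprintable bytes
    return base64.b85encode(stage2).decode("ascii")      # printable again


def decode(message: str, key: str) -> bytes:
    """Undo encode(). XOR is its own inverse, so the same helper is reused."""
    stage2 = base64.b85decode(message)
    stage1 = xor_with_key(stage2, key.encode("utf-8"))
    return base64.b85decode(stage1)


encoded = encode(b"Hello reader!", "ewpratten")
assert decode(encoded, "ewpratten") == b"Hello reader!"
```

<p>Decoding with the wrong key will usually fail inside the final <code class="highlighter-rouge">base64.b85decode</code> call with a <code class="highlighter-rouge">ValueError</code>, because the XOR step no longer produces valid base85 text.</p>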
|
|
|
|
<h3 id="try-it-yourself-with-pip">Try it yourself with pip</h3>
<p>I have published shift2 on <a href="https://pypi.org/project/shift-tool/">pypi.org</a> for use with pip. To install shift2, ensure both <code class="highlighter-rouge">python3</code> and <code class="highlighter-rouge">python3-pip</code> are installed on your computer, then run:</p>
|
|
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Install shift2</span>
pip3 <span class="nb">install </span>shift-tool

<span class="c"># View the help for shift2</span>
shift2 <span class="nt">-h</span>
</code></pre></div></div>
|
|
|
|
<div id="demo">
|
|
<h3 id="try-it-in-the-browser">Try it in the browser</h3>
|
|
<p>I have ported the core code from shift2 to <a href="http://www.brython.info/index.html">run in the browser</a>. This demo is entirely client-side, and may take a few seconds to load depending on your device.</p>
|
|
|
|
<p><input type="radio" id="encode" name="shift-action" value="encode" checked>
|
|
<label for="encode">Encode</label>
|
|
<input type="radio" id="decode" name="shift-action" value="decode">
|
|
<label for="decode">Decode</label></p>
|
|
|
|
<p><input type="text" id="key" name="key" placeholder="Encoding key" required=""><br>
|
|
<input type="text" id="msg" name="msg" placeholder="Message" required="" size="30"></p>
|
|
|
|
<p><button type="button" class="btn btn-primary" id="shift-button" disabled>shift2 demo is loading… (this may take a few seconds)</button></p>
|
|
|
|
</div>
|
|
|
|
<h3 id="future-plans">Future plans</h3>
|
|
<p>Because shift2 can also be used as a library (as outlined in the <a href="https://github.com/Ewpratten/shift/blob/master/README.md">README</a>), I would like to write a program that allows users to talk to each other IRC-style over a TCP port. This program would use either a pre-shared or a generated key to encode and decode messages on the fly.</p>
|
|
|
|
<p>If you are interested in helping out, or taking on this idea for yourself, send me an email.</p>
|
|
|
|
<!-- Python code -->
|
|
<script type="text/python" src="/assets/python/shift2/shift2demo.py"></script>
|
|
|
|
|
|
</div>
|
|
</div>
|
|
|
|
</div>
|
|
<!-- <div id="particles-js"></div> -->
|
|
|
|
<div class="container foot" style="text-align:center;">
|
|
<br>
|
|
<span class="site-info">
|
|
Site design by: <a href="https://retrylife.ca">Evan Pratten</a> |
|
|
|
|
This site was last updated at: 2019-11-30 11:37:59 -0500
|
|
</span>
|
|
</div>
|
|
|
|
<!-- Brython -->
|
|
<script src="/assets/js/brython.js"></script>
|
|
<script src="/assets/js/brython_stdlib.js"></script>
|
|
|
|
<script>
|
|
function startPY(){
|
|
|
|
brython();
|
|
console.log("Started Python")
|
|
}
|
|
|
|
window.onload = startPY;
|
|
</script>
|
|
|
|
|
|
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
|
|
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
|
|
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
|
|
|
|
<!-- Offsets for links -->
|
|
<script>
|
|
(function ($, window) {
|
|
var adjustAnchor = function () {
|
|
|
|
var $anchor = $(':target'),
|
|
fixedElementHeight = 100;
|
|
|
|
if ($anchor.length > 0) {
|
|
|
|
window.scrollTo(0, $anchor.offset().top - fixedElementHeight);
|
|
}
|
|
|
|
};
|
|
|
|
$(window).on('hashchange load', function () {
|
|
adjustAnchor();
|
|
});
|
|
|
|
})(jQuery, window);
|
|
</script>
|
|
|
|
<!-- Global site tag (gtag.js) - Google Analytics -->
|
|
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-74118570-2"></script>
|
|
<script>
|
|
window.dataLayer = window.dataLayer || [];
|
|
function gtag() { dataLayer.push(arguments); }
|
|
gtag('js', new Date());
|
|
|
|
gtag('config', 'UA-74118570-2');
|
|
</script>
|
|
|
|
|
|
<!-- particles -->
|
|
<script>
|
|
var body = document.body
var particles = document.getElementById("particles-js")
// The #particles-js element is commented out above, so guard against null
if (particles) {
particles.style.height = body.scrollHeight + "px"
}
console.log(body.scrollHeight)
|
|
</script>
|
|
<script src="/assets/js/particles.min.js"></script>
|
|
<script>
|
|
particlesJS.load('particles-js', '/assets/js/particles.json', function () {
|
|
console.log('callback - particles.js config loaded');
|
|
});
|
|
</script>
|
|
|
|
<!-- Twitter embeds -->
|
|
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
|
|
|
|
|
|
</body>