<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.0">Jekyll</generator><link href="https://excessivelyadequate.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://excessivelyadequate.com/" rel="alternate" type="text/html" /><updated>2026-04-04T20:26:26+02:00</updated><id>https://excessivelyadequate.com/feed.xml</id><title type="html">Excessively Adequate</title><subtitle>This is a blog about programming and adjacent topics that interest me.
</subtitle><author><name>Noah Doersing</name></author><entry><title type="html">Wikipedia-style Redirects in BookStack</title><link href="https://excessivelyadequate.com/posts/redirects.html" rel="alternate" type="text/html" title="Wikipedia-style Redirects in BookStack" /><published>2025-03-30T21:45:00+02:00</published><updated>2025-03-30T21:45:00+02:00</updated><id>https://excessivelyadequate.com/posts/redirects</id><content type="html" xml:base="https://excessivelyadequate.com/posts/redirects.html"><![CDATA[<p>In lieu of writing a novel introduction, allow me to recycle<sup id="fnref:planet"><a href="#fn:planet" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> the one from a previous post…</p>

<blockquote>
  <p>At <a href="https://www.suedweststrom.de">work</a>, we recently moved our internal knowledge base from a relatively creaky <a href="https://www.dokuwiki.org/dokuwiki">DokuWiki</a> instance to a much more modern <a href="https://www.bookstackapp.com">BookStack</a> setup. It’s great and <em>requires</em> very little configuration, which – perhaps counter-intuitively – made me <em>want</em> to inflict some custom CSS (and a bit of JavaScript) upon it.</p>
</blockquote>

<p>…in which I wrote about adding <a href="/posts/booksthack.html">external link icons, a button to copy a page’s permalink, and tag-dependent banners</a>. All of this was possible without touching<sup id="fnref:complicate"><a href="#fn:complicate" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> BookStack’s source code, instead relying on a bit of CSS and JavaScript hacked into the “Custom HTML Head Content” option of BookStack’s settings.</p>

<p>Same goes for <em>redirects</em> as I’ve implemented<sup id="fnref:slow"><a href="#fn:slow" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> them.</p>

<h2 id="background">Background</h2>

<p>On Wikipedia – where I got<sup id="fnref:kiddo"><a href="#fn:kiddo" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> the idea from – redirects <a href="https://en.wikipedia.org/wiki/Wikipedia:Redirect#Purposes_of_redirects">cover all kinds of situations</a> where a page needs to be reachable under multiple names: think synonyms, alternate names, common misspellings, or non-standard romanizations. Increasing discoverability of sections of long articles by redirecting from identically-named pages is another use case.</p>

<p>In BookStack, depending on how large your knowledge base is, few of these reasons may apply. But at work, we’ve found that the structure dictated by BookStack’s <a href="https://www.bookstackapp.com/docs/user/content-overview/">book→chapter→page hierarchy</a> sometimes results in related topics becoming scattered across two or three “subtrees”, making it a bit tricky to find all relevant information from, say, a chapter index.</p>

<p>To make such connections more immediately apparent, we now tend to employ redirects sort of as “see also” pointers<sup id="fnref:concept"><a href="#fn:concept" class="footnote" rel="footnote" role="doc-noteref">5</a></sup> visible at the book/chapter overview level. <em>(That’s in addition to <a href="https://www.bookstackapp.com/docs/user/organising-content/">reorganizing things</a> as required, which is less of a band-aid solution, of course.)</em></p>

<h2 id="syntax">Syntax</h2>

<p>Seeing as you’re now totally convinced that wiki redirects are the best<sup id="fnref:marginally"><a href="#fn:marginally" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> thing since sliced bread, here’s how you’d set one up.</p>

<h3 id="on-wikipedia">On Wikipedia</h3>

<p>Let’s say there’s a page titled “VPN Gateway” but those darn readers<sup id="fnref:real"><a href="#fn:real" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> just keep searching for “VPN Appliance”. To keep the 404s away, one might create another page under “VPN Appliance” with the following content.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#REDIRECT [[VPN Gateway]]
</code></pre></div></div>

<p>If a reader now searched for “VPN Appliance” (or went directly to the URL of that page, <em>e.g.</em>, after being linked to it), they’d land on “VPN Gateway” with a little notice disclosing that they’ve been redirected.</p>

<h3 id="in-bookstack">In BookStack</h3>

<p>My implementation in BookStack is meant to be used identically, the sole difference stemming from the fact that Wikipedia is written using the <a href="https://en.wikipedia.org/wiki/Help:Wikitext">wikitext</a> markup language whereas BookStack’s <a href="https://www.bookstackapp.com/docs/user/wysiwyg-editor/">default editor</a> is of the WYSIWYG variety.</p>

<p>So if you wanted to redirect searches like “Oh no, everything’s down” to your server outage plan, you’d create a page with that name and the following content.</p>

<blockquote>
  <p>#REDIRECT <a href="https://demo.bookstackapp.com/books/it-department/page/server-outage-plan">https://demo.bookstackapp.com/books/it-department/page/server-outage-plan</a></p>
</blockquote>

<p>Note that the reference to the redirect target must be an actual link – just plain text isn’t enough – so type a space after pasting it in to turn it into one. That’s because</p>

<ol>
  <li>it makes parsing out the redirect target basically trivial,</li>
  <li>there might be <a href="https://github.com/BookStackApp/BookStack/issues/5411">changes</a> to how/if renaming pages affects existing inbound links in future, and</li>
  <li>this way, it’ll still work without JavaScript (just, y’know, requiring an additional click) <em>or</em> if a future change to BookStack’s HTML markup were to break my code.</li>
</ol>

<p>It’s also worth pointing out that my implementation really just follows any link you put after “#REDIRECT”, so it also supports redirects to books, chapters, sections of pages, or even external websites.</p>

<h2 id="implementation">Implementation</h2>

<p>At last!</p>

<p>Maintaining the same ridiculous comments-to-code ratio as in my <a href="/posts/booksthack.html">previous post on BookStack hacks</a>, I don’t think the code requires all that much explanatory prose, yet I’ll still write a few paragraphs – with a screenshot or two – after the listing.</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;script&gt;</span>
  <span class="c1">// make wikipedia-style redirects possible, see https://en.wikipedia.org/wiki/Wikipedia:Redirect</span>
  <span class="c1">// to redirect, create a page whose content begins with "#REDIRECT", then a link</span>
  <span class="nf">addEventListener</span><span class="p">(</span><span class="dl">"</span><span class="s2">load</span><span class="dl">"</span><span class="p">,</span> <span class="nx">e</span> <span class="o">=&gt;</span> <span class="p">{</span>

    <span class="c1">// determine base url (baseUrl != location.origin if bookstack is installed in a subdirectory, hence some substring action)</span>
    <span class="kd">const</span> <span class="nx">baseUrl</span> <span class="o">=</span> <span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">.</span><span class="nf">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">.</span><span class="nf">indexOf</span><span class="p">(</span><span class="dl">"</span><span class="s2">/books/</span><span class="dl">"</span><span class="p">));</span>

    <span class="c1">// helper function to show a notice above the page heading</span>
    <span class="c1">// note: "position: absolute" to avoid shifting content around a split-second after page load, which can be jarring</span>
    <span class="kd">const</span> <span class="nx">showRedirectNotice</span> <span class="o">=</span> <span class="nx">message</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">redirectNotice</span> <span class="o">=</span> <span class="s2">`
        &lt;p style="opacity: 0.75; font-style: italic; position: absolute; margin-top: -0.2em; overflow-x: hidden; z-index: 10;"&gt;
          (</span><span class="p">${</span><span class="nx">message</span><span class="p">}</span><span class="s2">)
        &lt;/p&gt;
      `</span><span class="p">;</span>
      <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.content-wrap</span><span class="dl">"</span><span class="p">).</span><span class="nf">insertAdjacentHTML</span><span class="p">(</span><span class="dl">"</span><span class="s2">afterbegin</span><span class="dl">"</span><span class="p">,</span> <span class="nx">redirectNotice</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="c1">// visual flourish: italicize redirect pages in book/chapter overviews and search results</span>
    <span class="kd">const</span> <span class="nx">listedPages</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelectorAll</span><span class="p">(</span><span class="dl">"</span><span class="s2">.entity-list .entity-list-item.page</span><span class="dl">"</span><span class="p">);</span>
    <span class="nx">listedPages</span><span class="p">.</span><span class="nf">forEach</span><span class="p">(</span><span class="nx">listedPage</span> <span class="o">=&gt;</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">snippet</span> <span class="o">=</span> <span class="nx">listedPage</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.entity-item-snippet .text-muted</span><span class="dl">"</span><span class="p">);</span>
      <span class="k">if </span><span class="p">(</span><span class="o">!!</span><span class="nx">snippet</span> <span class="o">&amp;&amp;</span> <span class="nx">snippet</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">trim</span><span class="p">().</span><span class="nf">startsWith</span><span class="p">(</span><span class="dl">"</span><span class="s2">#REDIRECT</span><span class="dl">"</span><span class="p">))</span> <span class="p">{</span>
        <span class="nx">listedPage</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">fontStyle</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">italic</span><span class="dl">"</span><span class="p">;</span>
      <span class="p">}</span>
    <span class="p">});</span>

    <span class="c1">// CASE 1: ON REDIRECT PAGE</span>

    <span class="c1">// only do stuff if we're on a page and the first paragraph begins with "#REDIRECT"</span>
    <span class="kd">const</span> <span class="nx">isPage</span> <span class="o">=</span> <span class="o">!!</span><span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">#page-details</span><span class="dl">"</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">firstParagraph</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.page-content p</span><span class="dl">"</span><span class="p">);</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">isPage</span> <span class="o">&amp;&amp;</span> <span class="nx">firstParagraph</span> <span class="o">&amp;&amp;</span> <span class="nx">firstParagraph</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">trim</span><span class="p">().</span><span class="nf">startsWith</span><span class="p">(</span><span class="dl">"</span><span class="s2">#REDIRECT</span><span class="dl">"</span><span class="p">))</span> <span class="p">{</span>

      <span class="c1">// quit if the url query string contains "no_redirect" (to enable the user to edit the page)</span>
      <span class="k">if </span><span class="p">(</span><span class="nx">location</span><span class="p">.</span><span class="nx">search</span><span class="p">.</span><span class="nf">includes</span><span class="p">(</span><span class="dl">"</span><span class="s2">no_redirect</span><span class="dl">"</span><span class="p">))</span> <span class="p">{</span>
        <span class="nx">showRedirectNotice</span><span class="p">(</span><span class="dl">"</span><span class="s2">Not redirected due to &lt;code&gt;no_redirect&lt;/code&gt; URL parameter</span><span class="dl">"</span><span class="p">)</span>
        <span class="k">return</span><span class="p">;</span>
      <span class="p">}</span>

      <span class="c1">// also quit if it looks like the user has just edited the page</span>
      <span class="k">if </span><span class="p">(</span><span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.notification.pos span</span><span class="dl">"</span><span class="p">).</span><span class="nx">textContent</span> <span class="o">==</span> <span class="dl">"</span><span class="s2">Page successfully updated</span><span class="dl">"</span><span class="p">)</span> <span class="p">{</span>
        <span class="nf">showRedirectNotice</span><span class="p">(</span><span class="dl">"</span><span class="s2">Not redirected because you've just updated this page – reload to be redirected anyway</span><span class="dl">"</span><span class="p">)</span>
        <span class="k">return</span><span class="p">;</span>
      <span class="p">}</span>

      <span class="c1">// parse out target url</span>
      <span class="kd">const</span> <span class="nx">redirectTargetUrl</span> <span class="o">=</span> <span class="nx">firstParagraph</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">a</span><span class="dl">"</span><span class="p">).</span><span class="nx">href</span><span class="p">;</span>

      <span class="c1">// if it's an external link, just go there</span>
      <span class="kd">const</span> <span class="nx">isExternalLink</span> <span class="o">=</span> <span class="o">!</span><span class="nx">redirectTargetUrl</span><span class="p">.</span><span class="nf">startsWith</span><span class="p">(</span><span class="nx">baseUrl</span><span class="p">);</span>
      <span class="k">if </span><span class="p">(</span><span class="nx">isExternalLink</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">location</span><span class="p">.</span><span class="nx">href</span> <span class="o">=</span> <span class="nx">redirectTargetUrl</span><span class="p">;</span>  <span class="c1">// could disable external redirects by commenting-out this line</span>
        <span class="k">return</span><span class="p">;</span>
      <span class="p">}</span>

      <span class="c1">// if internal, patch the current url (sans base url) and page title into the query string</span>
      <span class="c1">// this allows linking back to the redirect page (enabling edits) on the target page</span>
      <span class="kd">const</span> <span class="nx">patchedRedirectTargetUrl</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">URL</span><span class="p">(</span><span class="nx">redirectTargetUrl</span><span class="p">);</span>
      <span class="nx">patchedRedirectTargetUrl</span><span class="p">.</span><span class="nx">searchParams</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="dl">"</span><span class="s2">redirected_from</span><span class="dl">"</span><span class="p">,</span> <span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">.</span><span class="nf">replace</span><span class="p">(</span><span class="nx">baseUrl</span><span class="p">,</span> <span class="dl">""</span><span class="p">));</span>
      <span class="nx">patchedRedirectTargetUrl</span><span class="p">.</span><span class="nx">searchParams</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="dl">"</span><span class="s2">redirected_from_title</span><span class="dl">"</span><span class="p">,</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.page-content h1</span><span class="dl">"</span><span class="p">).</span><span class="nx">textContent</span><span class="p">);</span>

      <span class="c1">// go there!</span>
      <span class="nx">location</span><span class="p">.</span><span class="nx">href</span> <span class="o">=</span> <span class="nx">patchedRedirectTargetUrl</span><span class="p">.</span><span class="nx">href</span><span class="p">;</span>
      <span class="k">return</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// CASE 2: ON REDIRECT TARGET PAGE</span>

    <span class="c1">// note: shelves/books/chapters can also be redirect targets, so no need to ensure we're on a page here</span>

    <span class="c1">// check if relevant parameters are present in query string</span>
    <span class="kd">const</span> <span class="nx">queryParams</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">URLSearchParams</span><span class="p">(</span><span class="nx">location</span><span class="p">.</span><span class="nx">search</span><span class="p">);</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">queryParams</span><span class="p">.</span><span class="nf">has</span><span class="p">(</span><span class="dl">'</span><span class="s1">redirected_from</span><span class="dl">'</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="nx">queryParams</span><span class="p">.</span><span class="nf">has</span><span class="p">(</span><span class="dl">'</span><span class="s1">redirected_from_title</span><span class="dl">'</span><span class="p">))</span> <span class="p">{</span>

      <span class="c1">// patch "no_redirect" into link back to redirect page (to allow users to go back and edit that one easily)</span>
      <span class="kd">const</span> <span class="nx">patchedRedirectSourceUrl</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">URL</span><span class="p">(</span><span class="nx">baseUrl</span> <span class="o">+</span> <span class="nx">queryParams</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="dl">'</span><span class="s1">redirected_from</span><span class="dl">'</span><span class="p">))</span>
      <span class="nx">patchedRedirectSourceUrl</span><span class="p">.</span><span class="nx">searchParams</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="dl">"</span><span class="s2">no_redirect</span><span class="dl">"</span><span class="p">,</span> <span class="dl">""</span><span class="p">);</span>

      <span class="c1">// tell the user about the redirect and provide a link back to the redirect page</span>
      <span class="c1">// (the title comes from the url, so it is set via textContent instead of being interpolated as html)</span>
      <span class="nx">showRedirectNotice</span><span class="p">(</span><span class="s2">`Redirected from &lt;a href="</span><span class="p">${</span><span class="nx">patchedRedirectSourceUrl</span><span class="p">.</span><span class="nx">href</span><span class="p">}</span><span class="s2">" title="Click to modify the redirect"&gt;&lt;/a&gt;`</span><span class="p">);</span>
      <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">.content-wrap a</span><span class="dl">"</span><span class="p">).</span><span class="nx">textContent</span> <span class="o">=</span> <span class="nx">queryParams</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="dl">'</span><span class="s1">redirected_from_title</span><span class="dl">'</span><span class="p">);</span>

      <span class="c1">// clear url parameters without polluting history</span>
      <span class="kd">const</span> <span class="nx">unpatchedUrl</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">URL</span><span class="p">(</span><span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">);</span>
      <span class="nx">unpatchedUrl</span><span class="p">.</span><span class="nx">searchParams</span><span class="p">.</span><span class="k">delete</span><span class="p">(</span><span class="dl">"</span><span class="s2">redirected_from</span><span class="dl">"</span><span class="p">);</span>
      <span class="nx">unpatchedUrl</span><span class="p">.</span><span class="nx">searchParams</span><span class="p">.</span><span class="k">delete</span><span class="p">(</span><span class="dl">"</span><span class="s2">redirected_from_title</span><span class="dl">"</span><span class="p">);</span>
      <span class="nx">history</span><span class="p">.</span><span class="nf">replaceState</span><span class="p">({},</span> <span class="dl">''</span><span class="p">,</span> <span class="nx">unpatchedUrl</span><span class="p">.</span><span class="nx">href</span><span class="p">);</span>
    <span class="p">}</span>
  <span class="p">});</span>
<span class="nt">&lt;/script&gt;</span>
</code></pre></div></div>

<p>With the way I’ve implemented redirects, there are two cases to consider:</p>

<ol>
  <li>
    <p>If a reader navigates to a redirect page (<em>i.e.</em>, any page whose text starts with “#REDIRECT”), the code first checks</p>

    <ul>
      <li>if a special URL query parameter <code class="language-plaintext highlighter-rouge">no_redirect</code> is set (either manually or by following a link <em>back</em> from the redirect’s target) or</li>
      <li>whether the page has just been edited.</li>
    </ul>

    <p>If either of these two special cases applies, no redirect occurs; a helpful message is displayed instead. After all: Without this kind of mechanism, it’d be tricky to modify a redirect after setting it up since you’d never be “allowed” to remain on the redirect page.</p>

    <p>But in the common case, the reader needs to be quickly sent on their merry way to the link following the redirect “directive”. If it’s</p>

    <ul>
      <li>an external link: off they go, but</li>
      <li>for internal links, my code first patches two query parameters into the redirect target URL: the path of the redirect page and its name. These will come in handy now:</li>
    </ul>
  </li>
  <li>
    <p>Once a reader has been redirected – which, since it’s a freshly-loaded page and I didn’t want to set a cookie, is determined by the presence of our pair of query parameters – two steps remain:</p>

    <ul>
      <li>A message with a link back to the redirect page is shown (with the query parameter <code class="language-plaintext highlighter-rouge">no_redirect</code> set).</li>
      <li>Having served their purpose, the query parameters are removed from the URL.</li>
    </ul>
  </li>
</ol>
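
<p>The round trip described in the two cases above can be condensed into a pair of pure functions. This is a rough sketch rather than a drop-in replacement for the listing: the function names <code>buildRedirectUrl</code> and <code>readRedirectInfo</code> are made up for illustration, and <code>baseUrl</code> is assumed to be passed in instead of being derived from <code>location.href</code>.</p>

```javascript
// Sketch of the query-parameter round trip; function names are hypothetical.

// On the redirect page: work out where to send the reader.
function buildRedirectUrl(currentHref, currentTitle, targetHref, baseUrl) {
  // external targets are followed as-is
  if (!targetHref.startsWith(baseUrl)) {
    return targetHref;
  }
  // internal targets are told where the reader came from
  const target = new URL(targetHref);
  target.searchParams.set("redirected_from", currentHref.replace(baseUrl, ""));
  target.searchParams.set("redirected_from_title", currentTitle);
  return target.href;
}

// On the target page: recover the redirect source and a cleaned-up URL.
function readRedirectInfo(currentHref, baseUrl) {
  const current = new URL(currentHref);
  const params = current.searchParams;
  if (!params.has("redirected_from") || !params.has("redirected_from_title")) {
    return null; // this page load is not the result of a redirect
  }
  // link back to the redirect page, with redirects switched off
  const source = new URL(baseUrl + params.get("redirected_from"));
  source.searchParams.set("no_redirect", "");
  const title = params.get("redirected_from_title");
  // deleting from searchParams also updates current.href
  params.delete("redirected_from");
  params.delete("redirected_from_title");
  return { sourceHref: source.href, title, cleanedHref: current.href };
}
```

<p>Feeding the result of <code>buildRedirectUrl</code> into <code>readRedirectInfo</code> should yield the original page title, a link back to the redirect page with <code>no_redirect</code> set, and the target URL without the two helper parameters.</p>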

<p>Finally, a brief note on the “longevity” of this hack: As alluded to earlier, since it depends on the structure (class names and nesting) of the HTML markup generated by BookStack, it’s liable to break after some future update. Breakage will, however, not lead to “catastrophic failures” that could meaningfully impact your readers (think infinite redirect loops or similar mayhem) – in the worst case, redirects would just stop working and they’d have to click through manually. Like an animal.</p>
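
<p>One way to keep markup changes from turning into thrown errors is to treat every selector lookup as something that may come back empty. The following is a minimal sketch under that assumption: <code>textOf</code> and <code>isRedirectPage</code> are hypothetical helpers, and <code>root</code> stands in for <code>document</code> (or anything else offering <code>querySelector</code>).</p>

```javascript
// Hypothetical helper: trimmed text of the first match, or "" if the selector matches nothing.
function textOf(root, selector) {
  const element = root.querySelector(selector);
  return element ? element.textContent.trim() : "";
}

// A page only counts as a redirect if both lookups succeed;
// a missing element means "no redirect" instead of a TypeError.
function isRedirectPage(root) {
  return root.querySelector("#page-details") !== null &&
    textOf(root, ".page-content p").startsWith("#REDIRECT");
}
```

<p>In the browser, this would be called as <code>isRedirectPage(document)</code>. If a future BookStack update were to rename one of the classes, the check would simply return <code>false</code>.</p>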

<p class="wide"><img src="/static/redirects.png" alt="" /></p>

<p class="caption">A screenshot<sup id="fnref:bookdemo"><a href="#fn:bookdemo" class="footnote" rel="footnote" role="doc-noteref">8</a></sup> of a redirect target page.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:planet">
      <p>It’s good for the planet. <a href="#fnref:planet" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:complicate">
      <p>Which would unacceptably complicate updates. (Lesson learned more than the nominal “once”: Make updates as simple as possible to ensure they’re <em>actually</em> done.) <a href="#fnref:complicate" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:slow">
      <p>Before dropping you into the “background” section, let me admit that this implementation happened on a whim during a slow Friday afternoon (on which I had a bit of a headache to boot). So the “background” section is a bit of a retcon and really, ‘tis all because I had an idle thought: “Hey, redirects are a thing on Wikipedia, so why not BookStack? Can I do this in JS? I think so, let’s try it!” <a href="#fnref:slow" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:kiddo">
      <p><em>Anecdote:</em> I was a nerdy kid. I remember having a book about Wikipedia (probably got it for a birthday because my nerdiness came with a side of being unable to conceal it) when I was like 11 or 12, which is almost 20 years ago now. Just went looking for it and I’m pretty sure it’s <a href="https://upload.wikimedia.org/wikipedia/commons/8/8b/WikiPress_1_Wikipedia.pdf">this one</a>. (Warning: 270-page PDF with lots of long German words.) Pages 199-203 are about redirects, which, I now seem to recall, really appealed to me back then for some inexplicable reason…? So adding redirects into BookStack turned out to be one of these full-circle moments you tend to encounter with increasing frequency as you age. <a href="#fnref:kiddo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:concept">
      <p>That’s also a concept <a href="https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Layout#%22See_also%22_section"><del>stolen</del>adapted</a> from Wikipedia. <a href="#fnref:concept" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:marginally">
      <p>Alright, I’ll settle for “marginally useful”. <a href="#fnref:marginally" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:real">
      <p>Or, more likely – let’s be real – “AI” crawlers. <em>(The author sighed in mild dismay at the state of the internet while penning this footnote.)</em> <a href="#fnref:real" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bookdemo">
      <p>Note to self: To test my code on <a href="https://demo.bookstackapp.com">BookStack’s demo</a>, where the “Custom HTML Head Content” setting can’t be changed, I need to replace <code class="language-plaintext highlighter-rouge">addEventListener("load", e =&gt; {</code> with <code class="language-plaintext highlighter-rouge">(() =&gt; {</code> and the final line <code class="language-plaintext highlighter-rouge">});</code> with <code class="language-plaintext highlighter-rouge">})();</code>, then paste the resulting variant of the code snippet into the console. That’s required on every page (notably <em>again</em> after being redirected). Clunky, but it works! <a href="#fnref:bookdemo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[In lieu of writing a novel introduction, allow me to recycle the one from a previous post…]]></summary></entry><entry><title type="html">BookStack Hacks: Adding External Link Icons, Fewer Clicks to Copy a Page’s Permalink, and More</title><link href="https://excessivelyadequate.com/posts/booksthack.html" rel="alternate" type="text/html" title="BookStack Hacks: Adding External Link Icons, Fewer Clicks to Copy a Page’s Permalink, and More" /><published>2024-10-02T12:00:00+02:00</published><updated>2024-10-02T12:00:00+02:00</updated><id>https://excessivelyadequate.com/posts/booksthack</id><content type="html" xml:base="https://excessivelyadequate.com/posts/booksthack.html"><![CDATA[<p>At <a href="https://www.suedweststrom.de">work</a><sup id="fnref:threeoffour"><a href="#fn:threeoffour" class="footnote" rel="footnote" role="doc-noteref">1</a></sup>, we recently moved our internal knowledge base from a relatively creaky <a href="https://www.dokuwiki.org/dokuwiki">DokuWiki</a> instance to a much more modern <a href="https://www.bookstackapp.com">BookStack</a> setup. It’s great and <em>requires</em> very little configuration, which – perhaps counter-intuitively – made me <em>want</em> to inflict some custom CSS (and a bit of JavaScript) upon<sup id="fnref:customization"><a href="#fn:customization" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> it.</p>

<p>This kind of thing is even encouraged by <a href="https://danb.me">Dan Brown</a>, BookStack’s developer, who has built a page collecting some <a href="https://www.bookstackapp.com/hacks/">community-developed hacks</a> in addition to including a number of handy <a href="https://www.bookstackapp.com/docs/admin/hacking-bookstack/">hooks and other features that encourage hackery</a> with BookStack itself.</p>

<h2 id="adding-external-and-attachment-link-icons">Adding external (and attachment) link icons</h2>

<p>To motivate adding visual distinction (<em>aka</em>, perhaps, noise), let me divide the kinds of links commonly found in a wiki into three or four categories, which a vanilla BookStack installation renders identically.</p>

<ul>
  <li>
    <p>Internal links to other pages in that wiki – these are commonly called <em>wikilinks</em>. In a well-connected wiki, most links are going to be wikilinks, so they ought to look like standard links.</p>
  </li>
  <li>
    <p>Internal wikilinks whose target pages don’t exist. <a href="https://www.mediawiki.org/wiki/MediaWiki">MediaWiki</a> (which Wikipedia runs on) and DokuWiki color such <em>dead links</em> red. This alerts readers to the fact that they won’t find further information behind them and nudges editors towards filling in those gaps.</p>

    <p>Sadly, BookStack presently doesn’t provide the means to render dead links differently than links to existing pages. <em>(I’ve recently <a href="https://github.com/BookStackApp/BookStack/issues/5163#issuecomment-2386914611">inquired</a> whether this would be possible to implement.)</em></p>
  </li>
  <li>
    <p>Links to other<sup id="fnref:www"><a href="#fn:www" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> websites. To signal that by following such an <em>external link</em>, you’re leaving the safe and well-maintained confines of your knowledge base, MediaWiki and DokuWiki display an “<svg xmlns="http://www.w3.org/2000/svg" style="height: 0.8em;" viewBox="0 0 12 12"><g transform="translate(-1,1)"><path fill="currentColor" d="M6 1h5v5L8.86 3.85 4.7 8 4 7.3l4.15-4.16zM2 3h2v1H2v6h6V8h1v2a1 1 0 0 1-1 1H2a1 1 0 0 1-1-1V4a1 1 0 0 1 1-1"></path></g></svg>” icon<sup id="fnref:iconlicense"><a href="#fn:iconlicense" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> after the link text. Easily added to BookStack with some moderately-fancy CSS!</p>
  </li>
  <li>
    <p>This one’s specific to BookStack: Links to <em>attachments</em>. Attachments are also listed in a page’s sidebar, but to include some context, it sometimes makes sense to refer to them from within the page text. What’s more, since BookStack configures the underlying web server to prompt the user’s browser to <em>download</em> attachments (instead of <em>displaying</em>, say, PDFs), it’s handy to have attachment links stand apart.</p>
  </li>
</ul>

<p>So! Adding MediaWiki’s external link icon into BookStack is relatively easy with CSS – some explanation after the code:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;style&gt;</span>
  <span class="c">/* mark external links like on wikipedia https://en.wikipedia.org/wiki/Help:External_link_icons */</span>
  <span class="c">/* svg from https://en.wikipedia.org/w/skins/Vector/resources/skins.vector.styles/images/link-external-small-ltr-progressive.svg, licensed under the gnu general public license https://www.mediawiki.org/wiki/Copyright */</span>
  <span class="c">/* converted for use in css with https://www.svgbackgrounds.com/tools/svg-to-css/ */</span>
  <span class="nc">.page-content</span> <span class="nt">a</span><span class="o">[</span><span class="nt">href</span><span class="o">^=</span><span class="s1">"http"</span><span class="o">]</span><span class="nd">:not</span><span class="o">([</span><span class="nt">href</span><span class="o">^=</span><span class="s1">"https://your-bookstack.url"</span><span class="o">])</span> <span class="p">{</span>
    <span class="nl">background-image</span><span class="p">:</span> <span class="nb">url</span><span class="p">(</span><span class="err">'</span><span class="n">data</span><span class="p">:</span><span class="n">image</span><span class="p">/</span><span class="n">svg</span><span class="err">+</span><span class="n">xml</span><span class="p">,</span><span class="err">&lt;</span><span class="n">svg</span> <span class="n">xmlns</span><span class="err">=</span><span class="s1">"http://www.w3.org/2000/svg"</span> <span class="n">width</span><span class="err">=</span><span class="s1">"12"</span> <span class="n">height</span><span class="err">=</span><span class="s1">"12"</span> <span class="n">viewBox</span><span class="err">=</span><span class="s1">"0 0 12 12"</span><span class="err">&gt;&lt;</span><span class="n">title</span><span class="err">&gt;</span><span class="n">external</span> <span class="n">link</span><span class="err">&lt;</span><span class="p">/</span><span class="n">title</span><span class="err">&gt;&lt;</span><span class="n">path</span> <span class="n">fill</span><span class="err">=</span><span class="s1">"%23206ea7"</span> <span class="n">d</span><span class="err">=</span><span class="s1">"M6 1h5v5L8.86 3.85 4.7 8 4 7.3l4.15-4.16zM2 3h2v1H2v6h6V8h1v2a1 1 0 0 1-1 1H2a1 1 0 0 1-1-1V4a1 1 0 0 1 1-1"</span><span class="p">/</span><span class="err">&gt;&lt;</span><span class="p">/</span><span class="n">svg</span><span class="err">&gt;</span><span class="s2">');
    background-position: center right;
    background-repeat: no-repeat;
    background-size: 0.857em; /* matches the 12px icon size given bookstack'</span><span class="n">s</span> <span class="nb">default</span> <span class="m">14px</span> <span class="nb">text</span> <span class="n">size</span> <span class="err">*</span><span class="p">/</span>
    <span class="n">padding-right</span><span class="p">:</span> <span class="m">1em</span><span class="p">;</span>
  <span class="p">}</span>
<span class="nt">&lt;/style&gt;</span>
</code></pre></div></div>

<p>The CSS selector works like this: <code class="language-plaintext highlighter-rouge">.page-content</code> is a container element wrapped around, unsurprisingly, page content (it’s also applied to the editor) but not BookStack’s UI; <code class="language-plaintext highlighter-rouge">a[href^="http"]</code> selects all links within that whose targets start with <code class="language-plaintext highlighter-rouge">http</code> (importantly, this makes the rule <em>not</em> apply to relative links, which are internal by definition); and <code class="language-plaintext highlighter-rouge">:not([href^="https://your-bookstack.url"])</code> – modify this part to match your setup – <em>excludes</em> links beginning with your BookStack instance’s base URL, <em>i.e.</em>, wikilinks.</p>

<p>Any links matched by the selector are padded rightwards to make space for the SVG icon defined in the <code class="language-plaintext highlighter-rouge">background-image</code> property. It’s <em>so</em> neat how you can drop SVG code – sometimes requiring <a href="https://www.svgbackgrounds.com/tools/svg-to-css/">minor modifications</a>, but no <a href="https://www.base64decode.org">Base64</a> obfuscation – straight<sup id="fnref:nocurrentcolor"><a href="#fn:nocurrentcolor" class="footnote" rel="footnote" role="doc-noteref">5</a></sup> into CSS declarations. The <code class="language-plaintext highlighter-rouge">background-size</code> is chosen to yield a crisp 12-pixel icon at BookStack’s default text size.</p>

<p>Similarly, with a different selector and another<sup id="fnref:othericon"><a href="#fn:othericon" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> icon “<svg xmlns="http://www.w3.org/2000/svg" style="height: 0.8em;" viewBox="1 0 10.5 12"><g transform="translate(1,1)" fill="currentColor"><path d="M2 1V10h6v-8h1v8c0 .5523-.4477 1-1 1h-6c-.5523 0-1-.4477-1-1v-8c0-.5523.4477-1 1-1h6c.5523 0 1 .4477 1 1l-7 0"></path><path d="M 7,4 H 3 V 3 h 4"></path><path d="M 7,6 H 3 V 5 h 4"></path><path d="M 7,8 H 3 V 7 h 4"></path></g></svg>”, you can mark links to attachments:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;style&gt;</span>
  <span class="c">/* similarly, mark links to attachments */</span>
  <span class="nc">.page-content</span> <span class="nt">a</span><span class="o">[</span><span class="nt">href</span><span class="o">^=</span><span class="s1">"https://your-bookstack.url/attachments/"</span><span class="o">]</span> <span class="p">{</span>
    <span class="nl">background-image</span><span class="p">:</span> <span class="nb">url</span><span class="p">(</span><span class="err">'</span><span class="n">data</span><span class="p">:</span><span class="n">image</span><span class="p">/</span><span class="n">svg</span><span class="err">+</span><span class="n">xml</span><span class="p">,</span><span class="err">&lt;</span><span class="n">svg</span> <span class="n">xmlns</span><span class="err">=</span><span class="s1">"http://www.w3.org/2000/svg"</span> <span class="n">width</span><span class="err">=</span><span class="s1">"12"</span> <span class="n">height</span><span class="err">=</span><span class="s1">"12"</span> <span class="n">viewBox</span><span class="err">=</span><span class="s1">"0 0 12 12"</span><span class="err">&gt;&lt;</span><span class="n">title</span><span class="err">&gt;</span><span class="n">attachment</span> <span class="n">link</span><span class="err">&lt;</span><span class="p">/</span><span class="n">title</span><span class="err">&gt;&lt;</span><span class="n">g</span> <span class="n">transform</span><span class="err">=</span><span class="s1">"translate(1,1)"</span> <span class="n">fill</span><span class="err">=</span><span class="s1">"%23206ea7"</span><span class="err">&gt;&lt;</span><span class="n">path</span> <span class="n">d</span><span class="err">=</span><span class="s1">"M2 1V10h6v-8h1v8c0 .5523-.4477 1-1 1h-6c-.5523 0-1-.4477-1-1v-8c0-.5523.4477-1 1-1h6c.5523 0 1 .4477 1 1l-7 0"</span> <span class="p">/</span><span class="err">&gt;&lt;</span><span class="n">path</span> <span class="n">d</span><span class="err">=</span><span class="s1">"M 7,4 H 3 V 3 h 4"</span> <span class="p">/</span><span class="err">&gt;&lt;</span><span class="n">path</span> <span class="n">d</span><span class="err">=</span><span class="s1">"M 7,6 H 3 V 5 h 4"</span> <span class="p">/</span><span class="err">&gt;&lt;</span><span class="n">path</span> <span class="n">d</span><span class="err">=</span><span 
class="s1">"M 7,8 H 3 V 7 h 4"</span> <span class="p">/</span><span class="err">&gt;&lt;</span><span class="p">/</span><span class="n">g</span><span class="err">&gt;&lt;</span><span class="p">/</span><span class="n">svg</span><span class="err">&gt;</span><span class="s2">');
    background-position: center right;
    background-repeat: no-repeat;
    background-size: auto 0.857em; /* matches the 12px icon size given bookstack'</span><span class="n">s</span> <span class="nb">default</span> <span class="m">14px</span> <span class="nb">text</span> <span class="n">size</span> <span class="err">*</span><span class="p">/</span>
    <span class="n">padding-right</span><span class="p">:</span> <span class="m">0.92em</span><span class="p">;</span>
  <span class="p">}</span>
<span class="nt">&lt;/style&gt;</span>
</code></pre></div></div>
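
<p>To make the matching logic of the two selectors above a bit more tangible, here’s a rough JavaScript equivalent. It’s only a sketch for reasoning about which links end up with which icon: the function name <code class="language-plaintext highlighter-rouge">iconFor</code> and its return values are made up for illustration, and <code class="language-plaintext highlighter-rouge">BASE_URL</code> is the same placeholder as in the stylesheet.</p>

```javascript
// Rough JavaScript equivalent of the two CSS attribute selectors.
// BASE_URL is a placeholder, just like in the stylesheet.
const BASE_URL = "https://your-bookstack.url";

// Hypothetical helper: returns "attachment", "external", or "none".
function iconFor(href, baseUrl = BASE_URL) {
  // a[href^="https://your-bookstack.url/attachments/"]
  if (href.startsWith(`${baseUrl}/attachments/`)) {
    return "attachment";
  }

  // a[href^="http"] - the prefix "http" also covers "https";
  // relative links and other schemes never match
  if (!href.startsWith("http")) {
    return "none";
  }

  // :not([href^="https://your-bookstack.url"]) - skip wikilinks
  return href.startsWith(baseUrl) ? "none" : "external";
}

console.log(iconFor("https://en.wikipedia.org/wiki/Ent"));
console.log(iconFor("/books/the-two-towers"));
console.log(iconFor("https://your-bookstack.url/attachments/42"));
```

<p>Like the CSS, this compares the <code class="language-plaintext highlighter-rouge">href</code> attribute exactly as written in the markup, not the resolved URL, which is why relative links never pick up an icon.</p>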

<p><em>(Curious what that’ll look like? You’ll find a screenshot, also including what’s covered below, at the end of the post.)</em></p>

<h2 id="fewer-clicks-to-copy-a-pages-permalink">Fewer clicks to copy a page’s permalink</h2>

<p>At the time of writing, BookStack’s page URLs look like <code class="language-plaintext highlighter-rouge">https://your-bookstack.url/books/the-two-towers/page/the-last-march-of-the-ents</code>. Were a <a href="https://en.wikipedia.org/wiki/List_of_Friends_episodes">friendly</a> editor to rename that page to “The One Where the Ents Flood Isengard”, the URL would change accordingly, breaking<sup id="fnref:revisionsystem"><a href="#fn:revisionsystem" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> inbound links. Modifying book titles is even more impactful, affecting the URLs of all pages located in the relevant book.</p>

<p>While BookStack is smart enough to cascade name changes, <em>i.e.</em>, it automatically adjusts internal links as you rename pages (and books, and chapters), external references to BookStack don’t receive this treatment, of course. At work, this matters because we refer to BookStack pages in all kinds of places – internal tools, infrastructure alerts, task descriptions in various automation tools, and more – to provide context and more information.</p>

<p>To avoid links dying as we occasionally rename and move stuff, we<sup id="fnref:trytorememberto"><a href="#fn:trytorememberto" class="footnote" rel="footnote" role="doc-noteref">8</a></sup> refer to <em>permalinks</em> instead of the human-readable URLs: Internally, each page is stored with an identifier like <code class="language-plaintext highlighter-rouge">1337</code>, and links of the form <code class="language-plaintext highlighter-rouge">https://your-bookstack.url/link/1337</code> then redirect to the “standard” URL. BookStack <a href="https://www.bookstackapp.com/docs/user/content-permalinks/">provides that permalink</a> in a slightly roundabout way:</p>

<blockquote>
  <p>Simply select any block of text within a page and you’ll see a small popup box. Within this popup box will be an input containing the page permalink. A copy button next to the input allows you to copy the link with a single click.</p>
</blockquote>

<p>During our migration from DokuWiki, where we had to update a whole bunch of links to now point to BookStack, this felt like too many clicks, so I wrote a little bit of JavaScript that adds a “Copy permalink” button to every page’s sidebar:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;script&gt;</span>
  <span class="c1">// add a permalink, uh, link to the details section of the sidebar</span>
  <span class="nf">addEventListener</span><span class="p">(</span><span class="dl">"</span><span class="s2">load</span><span class="dl">"</span><span class="p">,</span> <span class="nx">e</span> <span class="o">=&gt;</span> <span class="p">{</span>
  
    <span class="c1">// check if we're on a page (shelves/books/chapters also have ids but permalinks to these don't work)</span>
    <span class="kd">const</span> <span class="nx">isPage</span> <span class="o">=</span> <span class="o">!!</span><span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">#page-details</span><span class="dl">"</span><span class="p">);</span>
  
    <span class="c1">// determine page id - can be extracted from the form for the "favorite" button in the sidebar of shelf/book/chapter/page pages</span>
    <span class="kd">const</span> <span class="nx">idInput</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">.actions form input[name="id"]</span><span class="dl">'</span><span class="p">);</span>
  
    <span class="k">if </span><span class="p">(</span><span class="nx">isPage</span> <span class="o">&amp;&amp;</span> <span class="nx">idInput</span><span class="p">)</span> <span class="p">{</span>
      <span class="kd">const</span> <span class="nx">id</span> <span class="o">=</span> <span class="nx">idInput</span><span class="p">.</span><span class="nx">value</span><span class="p">;</span>
  
      <span class="c1">// construct permalink url (baseUrl != location.origin if bookstack is installed in a subdirectory, hence some substring shenanigans)</span>
      <span class="kd">const</span> <span class="nx">baseUrl</span> <span class="o">=</span> <span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">.</span><span class="nf">substring</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">location</span><span class="p">.</span><span class="nx">href</span><span class="p">.</span><span class="nf">indexOf</span><span class="p">(</span><span class="dl">"</span><span class="s2">/books/</span><span class="dl">"</span><span class="p">));</span>
      <span class="kd">const</span> <span class="nx">permalinkUrl</span> <span class="o">=</span> <span class="s2">`</span><span class="p">${</span><span class="nx">baseUrl</span><span class="p">}</span><span class="s2">/link/</span><span class="p">${</span><span class="nx">id</span><span class="p">}${</span><span class="nx">location</span><span class="p">.</span><span class="nx">hash</span><span class="p">}</span><span class="s2">`</span><span class="p">;</span>
  
      <span class="c1">// link icon taken from resources/icons/link.svg</span>
      <span class="kd">const</span> <span class="nx">linkIcon</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">&lt;svg class="svg-icon" data-icon="link" role="presentation" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"&gt;&lt;path d="M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76 0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71 0-3.1-1.39-3.1-3.1M8 13h8v-2H8zm9-6h-4v1.9h4c1.71 0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76 0 5-2.24 5-5s-2.24-5-5-5"&gt;&lt;/path&gt;&lt;/svg&gt;</span><span class="dl">'</span><span class="p">;</span>
      <span class="kd">const</span> <span class="nx">permalinkTitle</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">Won</span><span class="se">\'</span><span class="s1">t change when you rename (or move) pages or books.</span><span class="dl">'</span><span class="p">;</span>
      <span class="kd">const</span> <span class="nx">permalinkLabel</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">Copy permalink</span><span class="dl">'</span><span class="p">;</span>
      
      <span class="kd">const</span> <span class="nx">permalinkHtml</span> <span class="o">=</span> <span class="s2">`&lt;a id="copy-permalink" href="</span><span class="p">${</span><span class="nx">permalinkUrl</span><span class="p">}</span><span class="s2">" class="entity-meta-item" title="</span><span class="p">${</span><span class="nx">permalinkTitle</span><span class="p">}</span><span class="s2">"&gt;</span><span class="p">${</span><span class="nx">linkIcon</span><span class="p">}${</span><span class="nx">permalinkLabel</span><span class="p">}</span><span class="s2">&lt;/a&gt;`</span><span class="p">;</span>
  
      <span class="c1">// append permalink to details section of sidebar</span>
      <span class="kd">const</span> <span class="nx">sidebarDetailsElement</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">.entity-meta</span><span class="dl">'</span><span class="p">);</span>
      <span class="nx">sidebarDetailsElement</span><span class="p">.</span><span class="nf">insertAdjacentHTML</span><span class="p">(</span><span class="dl">'</span><span class="s1">beforeend</span><span class="dl">'</span><span class="p">,</span> <span class="nx">permalinkHtml</span><span class="p">);</span>
  
      <span class="c1">// define click handler to copy permalink to clipboard</span>
      <span class="kd">const</span> <span class="nx">permalinkElement</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">"</span><span class="s2">#copy-permalink</span><span class="dl">"</span><span class="p">);</span>
      <span class="nx">permalinkElement</span><span class="p">.</span><span class="nf">addEventListener</span><span class="p">(</span><span class="dl">"</span><span class="s2">click</span><span class="dl">"</span><span class="p">,</span> <span class="nx">e</span> <span class="o">=&gt;</span> <span class="p">{</span>
          <span class="nx">e</span><span class="p">.</span><span class="nf">preventDefault</span><span class="p">();</span>
          <span class="nb">navigator</span><span class="p">.</span><span class="nx">clipboard</span><span class="p">.</span><span class="nf">writeText</span><span class="p">(</span><span class="nx">permalinkUrl</span><span class="p">);</span>

          <span class="c1">// color link green, then fade back to default color</span>
          <span class="nx">permalinkElement</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">color</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">var(--color-positive)</span><span class="dl">"</span><span class="p">;</span>
          <span class="nx">permalinkElement</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">transition</span> <span class="o">=</span> <span class="dl">""</span><span class="p">;</span>
          <span class="nf">setTimeout</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
            <span class="nx">permalinkElement</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">color</span> <span class="o">=</span> <span class="dl">""</span><span class="p">;</span>
            <span class="nx">permalinkElement</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">transition</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">color 1s</span><span class="dl">"</span><span class="p">;</span>
          <span class="p">},</span> <span class="mi">2000</span><span class="p">);</span>
      <span class="p">});</span>
    <span class="p">}</span>
  <span class="p">});</span>
<span class="nt">&lt;/script&gt;</span>
</code></pre></div></div>

<p>The inline comments explain what’s happening in nigh-excruciating detail, but in short: The script determines the page’s identifier, uses that to assemble a permalink, which it then patches into the sidebar, finally adding a click handler to copy the permalink to the clipboard (while providing visual feedback) instead of navigating to it.</p>
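
<p>The URL assembly is the only slightly fiddly part, so here it is once more as a standalone function that can be tried outside the browser. This is a sketch rather than a drop-in replacement: <code class="language-plaintext highlighter-rouge">buildPermalink</code> is a name invented for this example, it takes the full page URL as a string instead of reading <code class="language-plaintext highlighter-rouge">location</code>, and returning <code class="language-plaintext highlighter-rouge">null</code> for non-page URLs is my own choice.</p>

```javascript
// Standalone version of the permalink assembly, for illustration only.
// "href" stands in for location.href, "id" for the value of the hidden input.
function buildPermalink(href, id) {
  // split off the fragment (roughly what location.hash would contain)
  const hashIndex = href.indexOf("#");
  const hash = hashIndex === -1 ? "" : href.substring(hashIndex);

  // everything before "/books/" is the base URL, including any subdirectory
  const booksIndex = href.indexOf("/books/");
  if (booksIndex === -1) {
    return null; // not a book or page URL, so there is nothing to build
  }

  return `${href.substring(0, booksIndex)}/link/${id}${hash}`;
}

console.log(buildPermalink("https://your-bookstack.url/books/the-two-towers/page/the-last-march-of-the-ents", 1337));
console.log(buildPermalink("https://your-bookstack.url/wiki/books/a/page/b#bkmrk-intro", 1337));
```

<p>The second call shows why the script cuts the URL at <code class="language-plaintext highlighter-rouge">/books/</code> instead of using <code class="language-plaintext highlighter-rouge">location.origin</code>: the <code class="language-plaintext highlighter-rouge">/wiki</code> subdirectory survives, and so does the fragment pointing at a specific heading.</p>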

<h2 id="displaying-banners-or-making-other-style-changes-based-on-tags">Displaying banners (or making other style changes) based on tags</h2>

<p>To perform the initial migration of our DokuWiki content into BookStack, we’d built a script that renders DokuWiki’s formatting syntax as HTML, adjusts wikilinks to target BookStack’s URL scheme, collects images and other media, then uploads all that via <a href="https://www.bookstackapp.com/docs/admin/hacking-bookstack/#bookstack-api">BookStack’s API</a>. This process transferred most pages just fine, but some were in need of minor adjustment – which is why we had our script set a tag <code class="language-plaintext highlighter-rouge">check-import</code> on each page, aptly named to indicate the need to manually check whether everything’s still up to snuff.</p>

<p>Because tags are relatively inconspicuous<sup id="fnref:dailyuse"><a href="#fn:dailyuse" class="footnote" rel="footnote" role="doc-noteref">9</a></sup>, we were glad to find out that BookStack registers page tags<sup id="fnref:pagetags"><a href="#fn:pagetags" class="footnote" rel="footnote" role="doc-noteref">10</a></sup> in the form of <a href="https://www.bookstackapp.com/docs/admin/hacking-bookstack/#tag-classes">CSS classes on the <code class="language-plaintext highlighter-rouge">&lt;body&gt;</code></a> element…</p>

<blockquote>
  <p>While primarily for categorization, tags within BookStack can also provide opportunities for customization. […] As an example, a tag name/value pair of <code class="language-plaintext highlighter-rouge">Priority: Critical</code> will apply the following classes to the body: <code class="language-plaintext highlighter-rouge">tag-name-priority</code>, <code class="language-plaintext highlighter-rouge">tag-value-critical</code>, <code class="language-plaintext highlighter-rouge">tag-pair-priority-critical</code>.</p>
</blockquote>

<p>…allowing us to, in combination with a <code class="language-plaintext highlighter-rouge">::before</code> pseudo-element and the CSS <code class="language-plaintext highlighter-rouge">content</code> property, add a prominent explanatory banner to the top of any page tagged <code class="language-plaintext highlighter-rouge">check-import</code> which automatically disappears upon removal of that tag:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;style&gt;</span>
  <span class="c">/* pages not yet checked after our migration from dokuwiki are tagged, which bookstack helpfully registers in the form of classes on the body - since those tags are pretty subtle, make things a bit more noisy */</span>
  <span class="nc">.tag-name-checkimport</span> <span class="nf">#bkmrk-page-title</span><span class="nd">::before</span> <span class="p">{</span>
    <span class="nl">content</span><span class="p">:</span> <span class="s1">"This page imported from DokuWiki still needs checking – in case you've got a minute..."</span><span class="p">;</span>
    <span class="nl">color</span><span class="p">:</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-warning</span><span class="p">);</span>
    <span class="nl">font-size</span><span class="p">:</span> <span class="m">14px</span><span class="p">;</span>
    <span class="nl">font-weight</span><span class="p">:</span> <span class="nb">bold</span><span class="p">;</span>
    <span class="nl">line-height</span><span class="p">:</span> <span class="m">1.5</span><span class="p">;</span>
    <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span>
    <span class="nl">border</span><span class="p">:</span> <span class="m">1px</span> <span class="nb">solid</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-warning</span><span class="p">);</span>
    <span class="nl">border-radius</span><span class="p">:</span> <span class="m">0.2em</span><span class="p">;</span>
    <span class="nl">background-color</span><span class="p">:</span> <span class="n">color</span><span class="p">(</span><span class="n">from</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-warning</span><span class="p">)</span> <span class="n">srgb</span> <span class="n">r</span> <span class="n">g</span> <span class="n">b</span> <span class="p">/</span> <span class="m">0.1</span><span class="p">);</span>  <span class="c">/* brighten color for background */</span>
    <span class="nl">padding</span><span class="p">:</span> <span class="m">0.5em</span> <span class="m">0.75em</span><span class="p">;</span>
    <span class="nl">margin-bottom</span><span class="p">:</span> <span class="m">1em</span><span class="p">;</span>
  <span class="p">}</span>
<span class="nt">&lt;/style&gt;</span>
</code></pre></div></div>

<p>There’s nothing fancy here apart from my use of the <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/color_value/color"><code class="language-plaintext highlighter-rouge">color()</code> function</a> (a relatively new<sup id="fnref:colorfunction"><a href="#fn:colorfunction" class="footnote" rel="footnote" role="doc-noteref">11</a></sup> addition to the CSS specification) to brighten the <code class="language-plaintext highlighter-rouge">var(--color-warning)</code> defined in BookStack’s stylesheet.</p>
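
<p>Strictly speaking, the relative color syntax keeps the red, green, and blue channels of the variable and sets the alpha channel to <code class="language-plaintext highlighter-rouge">0.1</code>, so the lighter look comes from the page background showing through. The following sketch computes the resulting tint by hand. The hex value is a made-up stand-in for whatever <code class="language-plaintext highlighter-rouge">--color-warning</code> resolves to on your instance, a white background is assumed, and both function names are invented for this example.</p>

```javascript
// Parse a six-digit hex color like "#cf4d03" into [r, g, b].
function hexToRgb(hex) {
  const digits = hex.replace("#", "");
  return [0, 2, 4].map(i => parseInt(digits.substring(i, i + 2), 16));
}

// Composite a translucent foreground over an opaque background, per channel.
function compositeOver(foregroundHex, alpha, backgroundHex = "#ffffff") {
  const fg = hexToRgb(foregroundHex);
  const bg = hexToRgb(backgroundHex);
  return fg.map((channel, i) => Math.round(alpha * channel + (1 - alpha) * bg[i]));
}

// hypothetical warning color at 10% opacity on a white page
console.log(compositeOver("#cf4d03", 0.1));
```

<p>On a dark background the same declaration yields a dark tint instead, which is a nice side effect of using transparency rather than a fixed lighter color.</p>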

<p>We’ve also set up a job that regularly dynamically generates certain pages (mostly infrastructure overviews) based on data collated from various sources. To indicate that such pages shouldn’t be edited manually, they’re equipped with a tag <code class="language-plaintext highlighter-rouge">auto-update</code> that’s similarly associated with a CSS-powered notice:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;style&gt;</span>
  <span class="c">/* similarly, point out that changes on auto-updated pages won't be persisted */</span>
  <span class="nc">.tag-name-autoupdate</span> <span class="nf">#bkmrk-page-title</span><span class="nd">::before</span> <span class="p">{</span>
    <span class="nl">content</span><span class="p">:</span> <span class="s1">"This page is regularly dynamically generated by an external program - any changes you make here will be overwritten during the next update."</span><span class="p">;</span>
    <span class="nl">color</span><span class="p">:</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-info</span><span class="p">);</span>
    <span class="nl">font-size</span><span class="p">:</span> <span class="m">14px</span><span class="p">;</span>
    <span class="nl">line-height</span><span class="p">:</span> <span class="m">1.5</span><span class="p">;</span>
    <span class="nl">display</span><span class="p">:</span> <span class="nb">block</span><span class="p">;</span>
    <span class="nl">border</span><span class="p">:</span> <span class="m">1px</span> <span class="nb">solid</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-info</span><span class="p">);</span>
    <span class="nl">border-radius</span><span class="p">:</span> <span class="m">0.2em</span><span class="p">;</span>
    <span class="nl">background-color</span><span class="p">:</span> <span class="n">color</span><span class="p">(</span><span class="n">from</span> <span class="n">var</span><span class="p">(</span><span class="n">--color-info</span><span class="p">)</span> <span class="n">srgb</span> <span class="n">r</span> <span class="n">g</span> <span class="n">b</span> <span class="p">/</span> <span class="m">0.1</span><span class="p">);</span>  <span class="c">/* brighten color for background */</span>
    <span class="nl">padding</span><span class="p">:</span> <span class="m">0.5em</span> <span class="m">0.75em</span><span class="p">;</span>
    <span class="nl">margin-bottom</span><span class="p">:</span> <span class="m">1em</span><span class="p">;</span>
  <span class="p">}</span>
<span class="nt">&lt;/style&gt;</span>
</code></pre></div></div>

<p>(We could, alternatively, have the job that’s generating these pages include a variant of this notice <em>within</em> the page content – but I prefer this approach.)</p>
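
<p>One detail of the two selectors above is easy to miss: the tags are named <code class="language-plaintext highlighter-rouge">check-import</code> and <code class="language-plaintext highlighter-rouge">auto-update</code>, but the classes read <code class="language-plaintext highlighter-rouge">tag-name-checkimport</code> and <code class="language-plaintext highlighter-rouge">tag-name-autoupdate</code>. The sketch below mimics how such class names might be derived. The normalization (lowercasing, then dropping spaces and hyphens) is an assumption inferred from the documentation’s example and from those selectors, not copied from BookStack’s source, and the function names are invented, so inspect the <code class="language-plaintext highlighter-rouge">&lt;body&gt;</code> element of a tagged page to confirm what your version emits.</p>

```javascript
// Assumed normalization: lowercase, then drop whitespace and hyphens.
function normalizeTagPart(text) {
  return text.toLowerCase().replace(/[\s-]/g, "");
}

// Hypothetical helper producing the body classes for one tag.
// Emitting only the name class for value-less tags is also an assumption.
function tagClasses(name, value = "") {
  const normalizedName = normalizeTagPart(name);
  const normalizedValue = normalizeTagPart(value);
  const classes = [`tag-name-${normalizedName}`];
  if (normalizedValue) {
    classes.push(`tag-value-${normalizedValue}`);
    classes.push(`tag-pair-${normalizedName}-${normalizedValue}`);
  }
  return classes;
}

console.log(tagClasses("Priority", "Critical").join(" "));
console.log(tagClasses("check-import").join(" "));
```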

<hr />

<p>With these modifications in place, the screenshot below shows how a page<sup id="fnref:bookstackdemo"><a href="#fn:bookstackdemo" class="footnote" rel="footnote" role="doc-noteref">12</a></sup> might now appear: Notice the tag-dependent banner up top, the “Copy permalink” item in the sidebar, and the icons next to some links.</p>

<p class="wide"><img src="/static/booksthack.jpg" alt="" /></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:threeoffour">
      <p>Three of my most recent four posts start with those two words. Boss makes a dollar, I make a dime – and get to (under certain conditions) write about stuff I build on company time. <a href="#fnref:threeoffour" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:customization">
      <p>In the “Customization” category of BookStack’s settings, there’s a “Custom HTML Head Content” option that allows an administrator to conveniently patch a bit of code into each page’s <code class="language-plaintext highlighter-rouge">&lt;head&gt;</code> element without having to futz with template files (and thus complicating upgrades). <a href="#fnref:customization" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:www">
      <p>Like this one! <a href="#fnref:www" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:iconlicense">
      <p>This is, in fact, MediaWiki’s icon. As far as I can tell, being part of MediaWiki, it’s made available under the <a href="https://www.mediawiki.org/wiki/Copyright">GNU General Public License</a>. <a href="#fnref:iconlicense" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:nocurrentcolor">
      <p>Though you <a href="https://stackoverflow.com/a/76006610">can’t</a> refer to CSS variables or special keywords like <code class="language-plaintext highlighter-rouge">currentColor</code> from within SVGs embedded in this manner. <a href="#fnref:nocurrentcolor" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:othericon">
      <p>Hand-coded (with some help from <a href="https://yqnn.github.io/svg-path-editor/">SvgPathEditor</a>) based on the previous icon. <a href="#fnref:othericon" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:revisionsystem">
      <p>Old links may continue to work, but that’s not something you can rely on, according to <a href="https://www.bookstackapp.com/docs/user/content-permalinks/">BookStack’s documentation</a>: “Upon name changes of the book or page, BookStack will use the revision system to attempt resolving when old links are used but it is possible for some actions to cause old page links to no longer lead to the updated content.” <a href="#fnref:revisionsystem" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:trytorememberto">
      <p>…try to remember to… <a href="#fnref:trytorememberto" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:dailyuse">
      <p>As they should be in daily use, of course! <a href="#fnref:dailyuse" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:pagetags">
      <p>I’ve <a href="https://github.com/BookStackApp/BookStack/issues/5217">filed an issue</a> to explore the possibility of adding book and chapter tags into the <code class="language-plaintext highlighter-rouge">&lt;body&gt;</code>’s class list, as well. (My thinking is that setting subtly different background colors for all pages located in certain books would provide useful visual distinction.) <a href="#fnref:pagetags" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:colorfunction">
      <p>This was my first time using it, and I’m head over heels! <a href="#fnref:colorfunction" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bookstackdemo">
      <p>In BookStack’s <a href="https://demo.bookstackapp.com">demo instance</a>, in this case. <a href="#fnref:bookstackdemo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[At work1, we recently moved our internal knowledge base from a relatively creaky DokuWiki instance to a much more modern BookStack setup. It’s great and requires very little configuration, which – perhaps counter-intuitively – made me want to inflict some custom CSS (and a bit of JavaScript) upon2 it. Three of my most recent four posts start with those two words. Boss makes a dollar, I make a dime – and get to (under certain conditions) write about stuff I build on company time. &#8617; In the “Customization” category of BookStack’s settings, there’s a “Custom HTML Head Content” option that allows an administrator to conveniently patch a bit of code into each page’s &lt;head&gt; element without having to futz with template files (and thus complicating upgrades). &#8617;]]></summary></entry><entry><title type="html">Setting Up Amazon WorkSpaces With Simple AD Even if That’s Unavailable in Your AWS Region: Through VPC Peering, AD Connector, and a Route 53 Resolver Outbound Endpoint (Orchestrated With Terraform)</title><link href="https://excessivelyadequate.com/posts/sadwsp.html" rel="alternate" type="text/html" title="Setting Up Amazon WorkSpaces With Simple AD Even if That’s Unavailable in Your AWS Region: Through VPC Peering, AD Connector, and a Route 53 Resolver Outbound Endpoint (Orchestrated With Terraform)" /><published>2024-09-27T21:30:00+02:00</published><updated>2024-09-27T21:30:00+02:00</updated><id>https://excessivelyadequate.com/posts/sadwsp</id><content type="html" xml:base="https://excessivelyadequate.com/posts/sadwsp.html"><![CDATA[<p>Long title, but whatcha gonna do.</p>

<p>At <a href="https://www.suedweststrom.de">work</a>, which is a well-connected service provider in Germany’s energy industry and thus not the least likely target<sup id="fnref:knock"><a href="#fn:knock" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> of state-sponsored hackery, we’re constantly evaluating new solutions to improve our business continuity management (BCM) – without buzzwords, that’s preparations to get back up and running quickly if our main infrastructure were to be compromised.</p>

<p>One component of our strategy involves <strong>replicating certain parts of our infrastructure on the Amazon cloud using Terraform in a sort of colder-than-cold-standby manner where – except when we’re running disaster simulations – this infrastructure replica plain doesn’t exist</strong>; we only keep its Terraform specification<sup id="fnref:backupsdocs"><a href="#fn:backupsdocs" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> around. That’s kind of neat because non-existent infrastructure doesn’t cost all that much.</p>

<p>Recently, I was tasked with bringing <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/amazon-workspaces.html">Amazon WorkSpaces</a> into this mix to provide virtual desktops. In case you’re not familiar with it:</p>

<blockquote>
  <p>Amazon WorkSpaces enables you to provision virtual, cloud-based Microsoft Windows, Amazon Linux 2, Ubuntu Linux, or Red Hat Enterprise Linux desktops for your users, known as WorkSpaces. WorkSpaces eliminates the need to procure and deploy hardware or install complex software. You can quickly add or remove users as your needs change. Users can access their virtual desktops from multiple devices or web browsers.</p>
</blockquote>

<p>As we already use PCoIP-enabled zero/thin clients to access our existing infrastructure, being able to re-point these same (too-dumb-to-be-compromised) clients at WorkSpaces at a moment’s notice sounded ideal – with just one drawback: User management and authentication on WorkSpaces <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/manage-workspaces-directory.html">works via Active Directory</a>, and since that’s not required by the infrastructure subset we’re replicating on AWS, we weren’t planning on rebuilding our existing domain there. <strong>Keeping things simple ought to pay off when you’re scrambling to get up and running again</strong>, and having to configure (and secure) a Microsoft Active Directory in a pinch is, by all accounts, the <em>opposite</em> of simple.</p>

<p>Speaking of “simple” – luckily, Amazon offers <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/directory_simple_ad.html">Simple AD</a>, an inexpensive directory service that’s just “featureful” enough to support WorkSpaces.</p>

<blockquote>
  <p>Simple AD is a standalone managed directory that is powered by a Samba 4 Active Directory Compatible Server. […]</p>

  <p>Simple AD provides a subset of the features offered by AWS Managed Microsoft AD, including the ability to manage user accounts and group memberships, create and apply group policies, securely connect to Amazon EC2 instances, and provide Kerberos-based single sign-on (SSO). However, note that Simple AD does not support features such as multi-factor authentication (MFA), trust relationships with other domains, Active Directory Administrative Center, PowerShell support, Active Directory recycle bin, group managed service accounts, and schema extensions for POSIX and Microsoft applications.</p>
</blockquote>

<p>Given our minimalist use case, none of these limitations bother us.</p>

<p>Unfortunately, Simple AD is only <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/regions.html">available in certain AWS Regions</a>, which <code class="language-plaintext highlighter-rouge">eu-central-1</code> (Frankfurt) is not among<sup id="fnref:wsfra"><a href="#fn:wsfra" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> – yet being located in Germany and physically closest to us, that’s the Region we had selected for this project.</p>

<p>While we could migrate our infrastructure replica to, say, <code class="language-plaintext highlighter-rouge">eu-west-1</code> (Ireland) relatively easily (it being specified with Terraform and all), we decided<sup id="fnref:snipe"><a href="#fn:snipe" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> to first <strong>explore whether it’d be possible to connect a Simple AD in Ireland with WorkSpaces in Frankfurt</strong>. Some cursory googling yielded <a href="https://www.reddit.com/r/aws/comments/gz30tf/can_ad_connectors_be_used_with_simple_ad/">conflicting answers</a>, so we went ahead and just kind of tried – which wasn’t entirely straightforward (and, <em>spoiler alert</em>, doesn’t even end up saving money compared to a <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/directory_microsoft_ad.html">Managed Microsoft AD</a>), hence (and anyway): this blog post documenting what’s required.</p>

<p><em>Note:</em> If you aren’t using Terraform yourself, you’ll still be able to follow along via the AWS Management Console (or your IAC tool of choice). Plus, <strong>even if you’re not planning on implementing the exact same setup using the technologies mentioned in the title, this post will give some pointers on diverse topics</strong> like working with multiple AWS Regions, automatically joining EC2 instances to a directory via <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html">SSM</a> (and debugging seamless join failures), ports to open for directory services, setting up a VPC peering connection, getting started with WorkSpaces, and maybe more, who knows!</p>

<h2 id="overview">Overview</h2>

<p><img src="/static/sadwsp.drawio.svg" alt="" /></p>

<p>This <a href="https://www.drawio.com">draw.io</a> diagram provides a high-level overview of the infrastructure I’ll be setting up in this post, omitting some details like security groups. Don’t be shy about scrolling back up to it as you read on. The <span style="color: #6c8ebf;">blue</span> arrow signifies the <em>logical</em> path taken by AD authentication from WorkSpaces in Frankfurt to the Simple AD in Ireland. The <span style="color: #b85450;">red</span> arrows symbolize how EC2 instances are joined to the directory. <span style="color: #666666;">Grey</span> arrows merely indicate how WorkSpaces set up this way can seamlessly “talk” to other parts of the infrastructure.</p>

<h2 id="preamble-mapping-availability-zone-ids-to-az-names">Preamble: Mapping Availability Zone IDs to AZ names</h2>

<p>We’ve got a fairly standard VPC setup in Frankfurt. There’d be no point writing another word about it (I’ll show the code in a minute) if it weren’t for the fact that <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/azs-workspaces.html">Amazon WorkSpaces is only available in a subset of the Availability Zones for each supported Region</a>: in Frankfurt, for example, only <code class="language-plaintext highlighter-rouge">euc1-az2</code> and <code class="language-plaintext highlighter-rouge">euc1-az3</code> support WorkSpaces – so we wanted to make sure to deploy most of our infrastructure in these AZs and not, say, <code class="language-plaintext highlighter-rouge">euc1-az1</code>.</p>

<p>You might be more familiar with a different naming scheme for AZs: Those two zones would be <code class="language-plaintext highlighter-rouge">eu-central-1b</code> and <code class="language-plaintext highlighter-rouge">eu-central-1c</code>, right? Wrong! (Probably. (Confused?))</p>

<p>Quoting from the AWS documentation linked above:</p>

<blockquote>
  <p>[W]e independently map Availability Zones to names for each AWS account. For example, the Availability Zone <code class="language-plaintext highlighter-rouge">us-east-1a</code> for your AWS account might not be the same location as <code class="language-plaintext highlighter-rouge">us-east-1a</code> for another AWS account.</p>

  <p>To coordinate Availability Zones across accounts, you must use the AZ ID, which is a unique and consistent identifier for an Availability Zone. For example, <code class="language-plaintext highlighter-rouge">use1-az2</code> is an AZ ID for the <code class="language-plaintext highlighter-rouge">us-east-1</code> Region and it has the same location in every AWS account.</p>
</blockquote>

<p>Another <a href="https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html">documentation page</a> elaborates on the reasoning behind this:</p>

<blockquote>
  <p>This approach helps to distribute resources across the Availability Zones in an AWS Region, instead of resources likely being concentrated in Availability Zone “a” for each Region.</p>
</blockquote>

<p>If you’re logged into your AWS account right now, you can see how this randomized mapping worked out for you <a href="https://eu-central-1.console.aws.amazon.com/ec2/home?region=eu-central-1#Settings:tab=zones">here</a> in the EC2 console.</p>

<p>Since the subnet we’re intending to “house” our WorkSpaces in must be located in <code class="language-plaintext highlighter-rouge">euc1-az2</code> or <code class="language-plaintext highlighter-rouge">euc1-az3</code> (that’s where WorkSpaces is available, after all), yet throughout Terraform, the <code class="language-plaintext highlighter-rouge">eu-central-1x</code> nomenclature is used, we require a <a href="https://stackoverflow.com/questions/77763318/terraform-create-map-of-az-id-to-name">mapping function between the two</a>.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># usual required_providers boilerplate</span>
<span class="k">terraform</span> <span class="p">{</span>
  <span class="nx">required_providers</span> <span class="p">{</span>
    <span class="nx">aws</span> <span class="o">=</span> <span class="p">{</span>
      <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"hashicorp/aws"</span>
      <span class="nx">version</span> <span class="o">=</span> <span class="s2">"~&gt; 5.0"</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># for completeness' sake, first the provider for frankfurt</span>
<span class="c1"># note that we'll add a second aws provider for ireland later (with an alias)</span>
<span class="k">provider</span> <span class="s2">"aws"</span> <span class="p">{</span>
  <span class="nx">region</span> <span class="o">=</span> <span class="s2">"eu-central-1"</span>
  <span class="nx">allowed_account_ids</span> <span class="o">=</span> <span class="p">[...]</span>  <span class="c1"># fill in to prevent accidentally messing with other stuff</span>

  <span class="c1"># set some tags on created objects to track what's been set up via terraform</span>
  <span class="nx">default_tags</span> <span class="p">{</span>
    <span class="nx">tags</span> <span class="o">=</span> <span class="p">{</span>
      <span class="s2">"Terraform"</span>   <span class="p">=</span> <span class="s2">"true"</span>
      <span class="s2">"Environment"</span> <span class="p">=</span> <span class="s2">"..."</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># get available azs</span>
<span class="k">data</span> <span class="s2">"aws_availability_zones"</span> <span class="s2">"available"</span> <span class="p">{</span>
  <span class="nx">state</span> <span class="o">=</span> <span class="s2">"available"</span>
<span class="p">}</span>

<span class="nx">locals</span> <span class="p">{</span>
  <span class="c1"># map between az ids and names (look up by id, returns name)</span>
  <span class="c1"># via https://stackoverflow.com/questions/77763318/terraform-create-map-of-az-id-to-name</span>
  <span class="nx">az_id_to_name_map</span> <span class="o">=</span> <span class="p">{</span> <span class="nx">for</span> <span class="nx">az</span> <span class="nx">in</span> <span class="k">data</span><span class="p">.</span><span class="nx">aws_availability_zones</span><span class="p">.</span><span class="nx">available</span><span class="p">.</span><span class="nx">zone_ids</span> <span class="o">:</span> <span class="nx">az</span> <span class="o">=&gt;</span> <span class="nx">element</span><span class="p">(</span><span class="k">data</span><span class="p">.</span><span class="nx">aws_availability_zones</span><span class="p">.</span><span class="nx">available</span><span class="p">.</span><span class="nx">names</span><span class="p">,</span> <span class="nx">index</span><span class="p">(</span><span class="k">data</span><span class="p">.</span><span class="nx">aws_availability_zones</span><span class="p">.</span><span class="nx">available</span><span class="p">.</span><span class="nx">zone_ids</span><span class="p">,</span> <span class="nx">az</span><span class="p">))</span> <span class="p">}</span>

  <span class="c1"># then set up a list of two or three azs (to set up subnets in later on, depending on your resiliency requirements)</span>
  <span class="c1"># making sure "euc1-az2" (where workspaces is available) is at index 0</span>
  <span class="c1"># see https://docs.aws.amazon.com/workspaces/latest/adminguide/azs-workspaces.html</span>
  <span class="nx">azs</span> <span class="o">=</span> <span class="p">[</span>
    <span class="kd">local</span><span class="p">.</span><span class="nx">az_id_to_name_map</span><span class="p">[</span><span class="s2">"euc1-az2"</span><span class="p">],</span>
    <span class="kd">local</span><span class="p">.</span><span class="nx">az_id_to_name_map</span><span class="p">[</span><span class="s2">"euc1-az3"</span><span class="p">],</span>
    <span class="c1">#local.az_id_to_name_map["euc1-az1"]</span>
  <span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If we now create public and private subnets based on that <code class="language-plaintext highlighter-rouge">azs</code> list, we can be sure that the respective 0-indexed subnets will be suitable for WorkSpaces.</p>
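
<p>As an aside, the same lookup can be written more compactly with <code class="language-plaintext highlighter-rouge">zipmap</code>. This is a minimal sketch rather than the code we actually run: the local <code class="language-plaintext highlighter-rouge">az_id_to_name_map_zipped</code> and the output are hypothetical names, and it rests on the same assumption as the snippet above, namely that <code class="language-plaintext highlighter-rouge">zone_ids</code> and <code class="language-plaintext highlighter-rouge">names</code> are returned in matching order.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical alternative to az_id_to_name_map (look up by id, returns name)
# assumes zone_ids and names are index-aligned, just like the element()/index() variant
locals {
  az_id_to_name_map_zipped = zipmap(
    data.aws_availability_zones.available.zone_ids,
    data.aws_availability_zones.available.names
  )
}

# purely illustrative output to inspect the per-account mapping after "terraform apply"
output "az_id_to_name_map" {
  value = local.az_id_to_name_map_zipped
}
</code></pre></div></div>

<p>Printing the map once can be a handy sanity check, since the result differs between AWS accounts.</p>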

<h2 id="two-vpcs">Two VPCs<sup id="fnref:lotr"><a href="#fn:lotr" class="footnote" rel="footnote" role="doc-noteref">5</a></sup></h2>

<p>We like to use the <a href="https://github.com/terraform-aws-modules/terraform-aws-vpc"><code class="language-plaintext highlighter-rouge">terraform-aws-modules/vpc/aws</code> module</a> to keep VPC boilerplate<sup id="fnref:boilerplate"><a href="#fn:boilerplate" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> to a minimum. Accordingly, our VPC in Frankfurt is set up like this:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">locals</span> <span class="p">{</span>
  <span class="nx">vpc_cidr</span> <span class="o">=</span> <span class="s2">"10.0.0.0/16"</span>
<span class="p">}</span>

<span class="c1"># vpc, documentation see https://github.com/terraform-aws-modules/terraform-aws-vpc</span>
<span class="k">module</span> <span class="s2">"vpc"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/vpc/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.6.0"</span>

  <span class="nx">name</span>                         <span class="o">=</span> <span class="s2">"main-vpc-in-frankfurt"</span>
  <span class="nx">cidr</span>                         <span class="o">=</span> <span class="kd">local</span><span class="p">.</span><span class="nx">vpc_cidr</span>
  <span class="nx">enable_nat_gateway</span>           <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">single_nat_gateway</span>           <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">enable_dns_hostnames</span>         <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">azs</span>                          <span class="o">=</span> <span class="kd">local</span><span class="err">.</span><span class="nx">azs</span>
  <span class="nx">public_subnets</span>               <span class="o">=</span> <span class="p">[</span><span class="nx">for</span> <span class="nx">k</span><span class="p">,</span> <span class="nx">v</span> <span class="nx">in</span> <span class="kd">local</span><span class="p">.</span><span class="nx">azs</span> <span class="o">:</span> <span class="nx">cidrsubnet</span><span class="p">(</span><span class="kd">local</span><span class="p">.</span><span class="nx">vpc_cidr</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="nx">k</span><span class="p">)]</span>
  <span class="nx">private_subnets</span>              <span class="o">=</span> <span class="p">[</span><span class="nx">for</span> <span class="nx">k</span><span class="p">,</span> <span class="nx">v</span> <span class="nx">in</span> <span class="kd">local</span><span class="p">.</span><span class="nx">azs</span> <span class="o">:</span> <span class="nx">cidrsubnet</span><span class="p">(</span><span class="kd">local</span><span class="p">.</span><span class="nx">vpc_cidr</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="nx">k</span> <span class="o">+</span> <span class="mi">3</span><span class="p">)]</span>
  <span class="nx">database_subnets</span>             <span class="o">=</span> <span class="p">[</span><span class="nx">for</span> <span class="nx">k</span><span class="p">,</span> <span class="nx">v</span> <span class="nx">in</span> <span class="kd">local</span><span class="p">.</span><span class="nx">azs</span> <span class="o">:</span> <span class="nx">cidrsubnet</span><span class="p">(</span><span class="kd">local</span><span class="p">.</span><span class="nx">vpc_cidr</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="nx">k</span> <span class="o">+</span> <span class="mi">6</span><span class="p">)]</span>
  <span class="nx">create_database_subnet_group</span> <span class="o">=</span> <span class="kc">true</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Nothing too special (except for the fact that we specially prepared the <code class="language-plaintext highlighter-rouge">azs</code> list as explained above) – it’s just a bunch of subnets housing various servers and a couple of fairly beefy databases that aren’t relevant in the context of this post. Our WorkSpaces machines will be able to access these resources <em>and</em> we’ll be able to control this access by referencing the WorkSpaces security group (more on that later).</p>
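
<p>In case the <code class="language-plaintext highlighter-rouge">cidrsubnet</code> calls look opaque: adding 8 bits to the <code class="language-plaintext highlighter-rouge">/16</code> prefix yields <code class="language-plaintext highlighter-rouge">/24</code> subnets, and the third argument selects which one. The sketch below spells out the results, assuming the two-entry <code class="language-plaintext highlighter-rouge">azs</code> list from above; the output name is made up for illustration.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># worked example for the cidrsubnet() calls in the vpc module, assuming
# local.azs holds two entries (k = 0, 1) and local.vpc_cidr = "10.0.0.0/16"
# (hypothetical output, only here to make the resulting ranges visible)
output "subnet_cidrs_example" {
  value = {
    # 10.0.0.0/24, 10.0.1.0/24
    public = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k)]
    # 10.0.3.0/24, 10.0.4.0/24
    private = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 3)]
    # 10.0.6.0/24, 10.0.7.0/24
    database = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 6)]
  }
}
</code></pre></div></div>

<p>The offsets of 3 and 6 leave room for a third AZ per subnet type without any of the ranges colliding.</p>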

<p>To set up a VPC for Simple AD in Ireland, we first need to add another AWS provider into our Terraform project, setting an <code class="language-plaintext highlighter-rouge">alias</code> to assign a unique name to this second provider.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">provider</span> <span class="s2">"aws"</span> <span class="p">{</span>
  <span class="nx">alias</span> <span class="o">=</span> <span class="s2">"ireland"</span>  <span class="c1"># to avoid conflicts with main provider, https://dev.to/devops4mecode/deploy-aws-resources-in-different-aws-account-and-multi-region-with-terraform-multi-provider-and-alias-ie9</span>

  <span class="nx">region</span> <span class="o">=</span> <span class="s2">"eu-west-1"</span>
  <span class="nx">allowed_account_ids</span> <span class="o">=</span> <span class="p">[...]</span>

  <span class="c1"># set some tags on created objects to track what's been set up via terraform</span>
  <span class="nx">default_tags</span> <span class="p">{</span>
    <span class="nx">tags</span> <span class="o">=</span> <span class="p">{</span>
      <span class="s2">"Terraform"</span>   <span class="p">=</span> <span class="s2">"true"</span>
      <span class="s2">"Environment"</span> <span class="p">=</span> <span class="s2">"..."</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="k">data</span> <span class="s2">"aws_availability_zones"</span> <span class="s2">"available_ireland"</span> <span class="p">{</span>
  <span class="nx">state</span> <span class="o">=</span> <span class="s2">"available"</span>

  <span class="c1"># include this line for any resources you wish to create in ireland</span>
  <span class="k">provider</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The VPC itself – only intended to house our Simple AD and a small EC2 instance for AD administration tasks – can be more basic than our “main” VPC in Frankfurt, only requiring public subnets (in different AZs, but here it doesn’t matter <em>which</em> AZs) for the AD’s two domain controllers <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_tutorial_create.html">according to Amazon’s documentation</a>. I opted for public subnets to avoid having to set up (more importantly: pay for) a <a href="https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html">NAT gateway</a> for internet access.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># vpc for simple ad, roughly matching https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_tutorial_create.html</span>
<span class="k">module</span> <span class="s2">"vpc_ireland"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/vpc/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.6.0"</span>

  <span class="nx">name</span>               <span class="o">=</span> <span class="s2">"ireland-vpc-for-simple-ad"</span>
  <span class="nx">cidr</span>               <span class="o">=</span> <span class="s2">"10.40.0.0/16"</span>
  <span class="nx">enable_nat_gateway</span> <span class="o">=</span> <span class="kc">false</span>
  <span class="nx">azs</span>                <span class="o">=</span> <span class="p">[</span><span class="s2">"eu-west-1a"</span><span class="p">,</span> <span class="s2">"eu-west-1b"</span><span class="p">]</span>
  <span class="nx">public_subnets</span>     <span class="o">=</span> <span class="p">[</span><span class="s2">"10.40.0.0/20"</span><span class="p">,</span> <span class="s2">"10.40.16.0/20"</span><span class="p">]</span>
  <span class="nx">private_subnets</span>    <span class="o">=</span> <span class="p">[]</span>

  <span class="c1"># terraform modules require a different "provider override" notation than resources</span>
  <span class="nx">providers</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">aws</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Since we’ll be establishing a connection between the two VPCs through VPC peering, their CIDR ranges must be disjoint, so I selected <code class="language-plaintext highlighter-rouge">10.40.0.0/16</code> for <code class="language-plaintext highlighter-rouge">vpc_ireland</code>. That’s because “40” looks<sup id="fnref:prof"><a href="#fn:prof" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> like “AD” if you squint a little (or, admittedly, a lot).</p>
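
<p>The AZ names for <code class="language-plaintext highlighter-rouge">vpc_ireland</code> are hardcoded above, which is fine since it doesn’t matter which AZs are used. If you’d rather derive them from the <code class="language-plaintext highlighter-rouge">available_ireland</code> data source, something along these lines should work. It’s a sketch, with <code class="language-plaintext highlighter-rouge">azs_ireland</code> being a hypothetical name not used elsewhere in this post.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code>locals {
  # hypothetical local: the first two available az names in ireland
  # (slice() excludes the end index, so this returns the elements at index 0 and 1)
  azs_ireland = slice(data.aws_availability_zones.available_ireland.names, 0, 2)
}

# the vpc_ireland module could then be given
#   azs = local.azs_ireland
# instead of the hardcoded list
</code></pre></div></div>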

<h2 id="simple-ad">Simple AD</h2>

<p><a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_getting_started.html#gsg_create_directory">Setting up Simple AD</a> is, well, simple. (That’s where using an AWS-managed service really shines.)</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">locals</span> <span class="p">{</span>
  <span class="nx">simplead_fqdn</span> <span class="o">=</span> <span class="s2">"ad.ourcooldomain.com"</span> <span class="c1"># we've registered a different domain for this project, but you don't need to know about it</span>
<span class="p">}</span>

<span class="k">resource</span> <span class="s2">"aws_directory_service_directory"</span> <span class="s2">"simplead"</span> <span class="p">{</span>
  <span class="nx">type</span> <span class="o">=</span> <span class="s2">"SimpleAD"</span>
  <span class="nx">size</span> <span class="o">=</span> <span class="s2">"Small"</span>

  <span class="nx">name</span>       <span class="o">=</span> <span class="kd">local</span><span class="p">.</span><span class="nx">simplead_fqdn</span>
  <span class="nx">short_name</span> <span class="o">=</span> <span class="s2">"ad"</span> <span class="c1"># note: admin username then ad\administrator</span>
  <span class="nx">password</span>   <span class="o">=</span> <span class="kd">local</span><span class="p">.</span><span class="nx">envs</span><span class="p">[</span><span class="s2">"SIMPLEAD_ADMIN_PASSWORD"</span><span class="p">]</span>

  <span class="nx">vpc_settings</span> <span class="p">{</span>
    <span class="nx">vpc_id</span>     <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">vpc_id</span>
    <span class="nx">subnet_ids</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">public_subnets</span>
  <span class="p">}</span>

  <span class="k">provider</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span> <span class="c1"># won't work in frankfurt</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Where does <code class="language-plaintext highlighter-rouge">local.envs["SIMPLEAD_ADMIN_PASSWORD"]</code> come from, I hear you ask?</p>

<p>Since it’s bad practice to commit sensitive data like passwords into source control (our Terraform specification for this project lives in a Git repository), we’ve stored the password we’re planning to use for the <code class="language-plaintext highlighter-rouge">ad\administrator</code> user in a <code class="language-plaintext highlighter-rouge">.env</code> file (<a href="https://stackoverflow.com/a/64026661">duly</a> <code class="language-plaintext highlighter-rouge">.gitignore</code>d, of course)…</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">SIMPLEAD_ADMIN_PASSWORD</span><span class="p">=</span><span class="s">§up0rS3cretAd4dminPässwor&amp;</span>
</code></pre></div></div>

<p>…which is imported into a <code class="language-plaintext highlighter-rouge">local</code> value using this code snippet:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># import environment variables</span>
<span class="c1"># via https://stackoverflow.com/questions/59789730/how-can-i-read-environment-variables-from-a-env-file-into-a-terraform-script</span>
<span class="nx">locals</span> <span class="p">{</span>
  <span class="nx">envs</span> <span class="o">=</span> <span class="p">{</span> <span class="nx">for</span> <span class="nx">tuple</span> <span class="nx">in</span> <span class="nx">regexall</span><span class="p">(</span><span class="s2">"(.*)=(.*)"</span><span class="p">,</span> <span class="nx">file</span><span class="p">(</span><span class="s2">"</span><span class="p">${</span><span class="nx">path</span><span class="p">.</span><span class="k">module</span><span class="p">}</span><span class="s2">/.env"</span><span class="p">))</span> <span class="o">:</span> <span class="nx">tuple</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">=</span><span class="o">&gt;</span> <span class="nx">sensitive</span><span class="p">(</span><span class="nx">tuple</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
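
<p>For comparison, here’s an alternative that isn’t what we did: Terraform can also read secrets from environment variables prefixed with <code class="language-plaintext highlighter-rouge">TF_VAR_</code> into input variables. A minimal sketch, with the variable name <code class="language-plaintext highlighter-rouge">simplead_admin_password</code> being an assumption:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical alternative to the .env approach: a sensitive input variable,
# populated from the environment variable TF_VAR_simplead_admin_password
variable "simplead_admin_password" {
  type      = string
  sensitive = true
}

# in the directory resource, the password would then be referenced as
#   password = var.simplead_admin_password
</code></pre></div></div>

<p>Either way, marking a value as sensitive only redacts it in Terraform’s CLI output; it still ends up in the state, so that needs to be protected accordingly.</p>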

<p>Now, upon running <code class="language-plaintext highlighter-rouge">terraform apply</code>, AWS will take 5-10 minutes to provision <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_getting_started.html#prereq_simple">two domain controllers and DNS servers</a> (the DNS configuration will become important in a minute) for you. As usual with AWS-managed multi-AZ deployments, updates will be performed automatically in a manner that shouldn’t impact availability.</p>

<p><em>Pricing (as of September 2024)</em>: Assuming you haven’t used AWS directory services before, Simple AD will be free for the first month. After that, it’s <a href="https://aws.amazon.com/directoryservice/other-directories-pricing/">roughly $40/month</a>. It is, in fact, free in perpetuity if directly connected to WorkSpaces with at least one active user per month, but since we’ll use it in conjunction with AD Connector (where the same policy applies – so no costs there), AWS will charge those $40/month – which is still less than half the price of an AWS Managed Microsoft AD.</p>

<h2 id="a-small-windows-ec2-instance-for-ad-administration-tasks">A small Windows EC2 instance for AD administration tasks</h2>

<p>With Simple AD up and running, it’s time to test whether we can have AWS automatically join an EC2 instance to it.</p>

<p>We’ll be needing such an instance for more than just testing, anyway: While it’s possible to <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/manage-workspaces-users.html">create AD users within the Amazon WorkSpaces console</a> – and for Managed Microsoft ADs <a href="https://aws.amazon.com/about-aws/whats-new/2024/09/aws-managed-microsoft-ad-users-groups-using-apis/">newly</a> in the AWS Management Console – those interfaces lack the functionality required for setting up a user equipped with the rights required to join future EC2 instances and WorkSpaces to the directory as needed within an AD Connector context. (We could use the <code class="language-plaintext highlighter-rouge">ad\administrator</code> user for this, but that’s bad practice.)</p>

<p>The first step is creating a key pair for the instance’s local administrator user (which we shouldn’t ever need to log into if the instance joins the AD successfully). I usually do this manually<sup id="fnref:tfkey"><a href="#fn:tfkey" class="footnote" rel="footnote" role="doc-noteref">8</a></sup> in the EC2 console under “Network &amp; Security” &gt; “Key Pairs”, making sure I’m in the correct Region. Here, in Ireland, I created a key pair called <code class="language-plaintext highlighter-rouge">simplead-admin-server-keypair</code>.</p>

<p>Then there’s the usual security group boilerplate, made a little bit less verbose by the <code class="language-plaintext highlighter-rouge">terraform-aws-modules/security-group/aws</code> module. As we’re planning to connect to this machine only via <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/fleet-rdp.html">Fleet Manager Remote Desktop</a> instead of standard RDP, no ingress rules are necessary. We could limit egress, but since we won’t install third-party software or surf the web on this server, we opted not to do so (for now).</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># security group for ad admin server, allowing egress to anywhere</span>
<span class="k">module</span> <span class="s2">"security_group_simplead_administration_server"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/security-group/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.1.2"</span>

  <span class="nx">name</span>        <span class="o">=</span> <span class="s2">"simplead-admin-server-sg"</span>
  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Simple AD Admin Server Security Group"</span>
  <span class="nx">vpc_id</span>      <span class="o">=</span> <span class="k">module</span><span class="err">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">vpc_id</span>

  <span class="c1"># allow all traffic out (and none in), by default to 0.0.0.0/0 and ::/0</span>
  <span class="nx">egress_rules</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"all-all"</span><span class="p">]</span>

  <span class="nx">providers</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">aws</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
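
<p>For reference, a locked-down variant might look roughly like the following. This is an untested sketch rather than what we deployed: it assumes that HTTPS to anywhere (for SSM Agent and the PowerShell package downloads) plus unrestricted traffic within the Ireland VPC (for the directory’s domain controllers and DNS servers) is all this server needs. It relies on the security group module’s <code class="language-plaintext highlighter-rouge">https-443-tcp</code> named rule and the VPC module’s <code class="language-plaintext highlighter-rouge">vpc_cidr_block</code> output.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical replacement for egress_rules = ["all-all"] in the module block above
egress_with_cidr_blocks = [
  # https to anywhere: ssm agent endpoints, nuget/psreadline downloads (assumption: nothing else is needed)
  { rule = "https-443-tcp", cidr_blocks = "0.0.0.0/0" },
  # anything within the ireland vpc, which covers the directory's domain controllers and dns servers
  { rule = "all-all", cidr_blocks = module.vpc_ireland.vpc_cidr_block },
]
</code></pre></div></div>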

<p>As for the disk image to bootstrap our instance with: the latest Windows Server 2022 <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html">AMI</a> will do the trick.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># dynamically determine latest windows server 2022 base ami</span>
<span class="k">data</span> <span class="s2">"aws_ami"</span> <span class="s2">"latest_windows_server_2022_base_ireland"</span> <span class="p">{</span>
  <span class="nx">most_recent</span> <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">owners</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"amazon"</span><span class="p">]</span>
  <span class="nx">filter</span> <span class="p">{</span>
    <span class="nx">name</span> <span class="o">=</span> <span class="s2">"name"</span>
    <span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"Windows_Server-2022-English-Full-Base-*"</span><span class="p">]</span>
  <span class="p">}</span>

  <span class="k">provider</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
<span class="p">}</span>
</code></pre></div></div>

<p><em>(Reminder in case you’ve become blind to it: Always add <code class="language-plaintext highlighter-rouge">provider = aws.ireland</code> to resources that relate to infrastructure in Ireland, even on a <code class="language-plaintext highlighter-rouge">data</code> source that merely retrieves an AMI ID like this.)</em></p>

<p>The Terraform definition of the EC2 instance itself is a little more involved – you’ll be familiar with most of the following arguments if you’ve set up EC2 instances using Terraform before…</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ad admin server</span>
<span class="k">module</span> <span class="s2">"simplead_administration_server"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/ec2-instance/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.6.1"</span>

  <span class="nx">name</span> <span class="o">=</span> <span class="s2">"simplead-admin-server"</span>

  <span class="nx">ami</span>                         <span class="o">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">latest_windows_server_2022_base_ireland</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">ignore_ami_changes</span>          <span class="o">=</span> <span class="kc">true</span>       <span class="c1"># since we dynamically determine the latest windows ami, its id will change each month or so – so prevent terraform from thinking it needs to replace this instance each time that happens</span>
  <span class="nx">instance_type</span>               <span class="o">=</span> <span class="s2">"t3.small"</span> <span class="c1"># bit sluggish, but enough</span>
  <span class="nx">vpc_security_group_ids</span>      <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">security_group_simplead_administration_server</span><span class="p">.</span><span class="nx">security_group_id</span><span class="p">]</span>
  <span class="nx">subnet_id</span>                   <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">public_subnets</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
  <span class="nx">key_name</span>                    <span class="o">=</span> <span class="s2">"simplead-admin-server-keypair"</span> <span class="c1"># created manually in the ec2 console</span>
  <span class="nx">associate_public_ip_address</span> <span class="o">=</span> <span class="kc">true</span>

  <span class="c1"># assign a 50gb c: drive</span>
  <span class="nx">root_block_device</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
      <span class="nx">encrypted</span>   <span class="o">=</span> <span class="kc">true</span>
      <span class="nx">volume_type</span> <span class="o">=</span> <span class="s2">"gp3"</span>
      <span class="nx">volume_size</span> <span class="o">=</span> <span class="mi">50</span>
    <span class="p">},</span>
  <span class="p">]</span>

  <span class="c1"># set up an iam instance profile and grant permissions that</span>
  <span class="c1"># 1. enable the ssm agent to perform its tasks and</span>
  <span class="c1"># 2. allow the ssm agent to interact with directory services (so, join this instance to a directory)</span>
  <span class="nx">create_iam_instance_profile</span> <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">iam_role_description</span>        <span class="o">=</span> <span class="s2">"IAM role for simplead_administration_server EC2 instance"</span>
  <span class="nx">iam_role_policies</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">AmazonSSMManagedInstanceCore</span>    <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"</span>
    <span class="nx">AmazonSSMDirectoryServiceAccess</span> <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonSSMDirectoryServiceAccess"</span>
  <span class="p">}</span>

  <span class="c1"># powershell code for setting up ad management tooling components on server</span>
  <span class="c1"># via https://github.com/neillturner/terraform-aws-adclient/blob/master/main.tf</span>
  <span class="c1"># as user data, see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html#ec2-windows-user-data</span>
  <span class="c1"># note that "&lt;&lt;-EOF" instead of "&lt;&lt;EOF" lets us indent the heredoc: terraform strips the leading whitespace shared by all of its lines</span>
  <span class="nx">user_data</span> <span class="o">=</span> <span class="o">&lt;&lt;-</span><span class="nx">EOF</span>
    <span class="o">&lt;</span><span class="nx">powershell</span><span class="o">&gt;</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-ADLDS</span>          <span class="c1"># "AD LDS Snap-Ins and Command-Line Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-AD-PowerShell</span>  <span class="c1"># "Active Directory module for Windows PowerShell"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-AD-Tools</span>       <span class="c1"># "AD DS and AD LDS Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-DNS-Server</span>     <span class="c1"># "DNS Server Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">GPMC</span>                <span class="c1"># "Group Policy Management"</span>

      <span class="c1"># fix powershell not accepting keyboard input by installing current PSReadLine version</span>
      <span class="c1"># via https://repost.aws/questions/QUGfM8RX3bSaadv5P_8f6byg/i-am-unable-to-paste-text-or-type-while-using-fleet-manager-in-certain-windows</span>
      <span class="nx">Install-PackageProvider</span> <span class="o">-</span><span class="nx">Name</span> <span class="nx">NuGet</span> <span class="o">-</span><span class="nx">MinimumVersion</span> <span class="mf">2.8</span><span class="err">.</span><span class="mf">5.201</span> <span class="o">-</span><span class="nx">Force</span>
      <span class="nx">Install-Module</span> <span class="o">-</span><span class="nx">Name</span> <span class="nx">PSReadLine</span> <span class="o">-</span><span class="nx">Force</span>
    <span class="o">&lt;/</span><span class="nx">powershell</span><span class="o">&gt;</span>
  <span class="nx">EOF</span>

  <span class="c1"># dependency not necessary in practice during testing</span>
  <span class="c1"># yet: logically, an instance can't join to a directory that doesn't exist yet</span>
  <span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">]</span>

  <span class="nx">providers</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">aws</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
  <span class="p">}</span>
<span class="err">}</span>
</code></pre></div></div>

<p>…but a couple words about the IAM instance profile and, in a minute, <code class="language-plaintext highlighter-rouge">user_data</code> are in order. From Amazon’s <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/setup-instance-permissions.html#instance-profile-add-permissions">documentation</a>:</p>

<blockquote>
  <p>An instance profile is a container that passes IAM role information to an Amazon Elastic Compute Cloud (Amazon EC2) instance at launch. You can create an instance profile for Systems Manager by attaching one or more IAM policies that define the necessary permissions to a new role or to a role you already created.</p>
</blockquote>

<p>So this equips our instance with an IAM role that, given certain policy attachments, allows software running on that instance to access AWS services<sup id="fnref:s3"><a href="#fn:s3" class="footnote" rel="footnote" role="doc-noteref">9</a></sup> with permissions specified in those policies. In this case, as outlined in the inline comment above, attaching the <code class="language-plaintext highlighter-rouge">AmazonSSMManagedInstanceCore</code> policy to our IAM instance profile allows Amazon’s <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html">SSM Agent</a> (which comes preinstalled on Amazon’s Windows AMIs) to perform its various functions, one of which – <a href="https://aws.amazon.com/blogs/mt/applying-managed-instance-policy-best-practices/">requiring</a> the other <code class="language-plaintext highlighter-rouge">AmazonSSMDirectoryServiceAccess</code> policy – is seamlessly joining an instance to a domain.</p>

<p>The <code class="language-plaintext highlighter-rouge">user_data</code> argument contains a sequence of PowerShell commands <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html#ec2-windows-user-data">executed by <code class="language-plaintext highlighter-rouge">EC2Launch</code></a> when the instance is initially launched (you could add <code class="language-plaintext highlighter-rouge">&lt;persist&gt;true&lt;/persist&gt;</code> behind the closing <code class="language-plaintext highlighter-rouge">&lt;/powershell&gt;</code> tag to have <code class="language-plaintext highlighter-rouge">EC2Launch</code> run these commands on each boot, which comes in handy in some contexts). These specific commands here install various Windows components for Active Directory administration. Also, we use <code class="language-plaintext highlighter-rouge">NuGet</code> to install the current <code class="language-plaintext highlighter-rouge">PSReadLine</code> version (without which PowerShell <a href="https://repost.aws/questions/QUGfM8RX3bSaadv5P_8f6byg/i-am-unable-to-paste-text-or-type-while-using-fleet-manager-in-certain-windows">won’t</a> accept keyboard input when using Fleet Manager Remote Desktop).</p>
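
<p>As a sketch of that run-on-every-boot variant (abbreviated to a single command, and not something this setup relies on), the <code class="language-plaintext highlighter-rouge">user_data</code> argument would then look something like this:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># fragment: goes inside the ec2 instance module block in place of the user_data shown above
user_data = &lt;&lt;-EOF
  &lt;powershell&gt;
    Install-WindowsFeature RSAT-AD-PowerShell
  &lt;/powershell&gt;
  &lt;persist&gt;true&lt;/persist&gt;
EOF
</code></pre></div></div>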

<p>Back to SSM Agent: Having granted it the permission to join this here instance to a directory is <em>required but not sufficient</em>: We need to let it know <em>which</em> directory to join the instance to. That’s done with an <code class="language-plaintext highlighter-rouge">aws_ssm_document</code> resource – basically a configuration file for SSM where we specify the directory ID, name, and IP addresses of the DNS servers, all of which can be pulled from the <code class="language-plaintext highlighter-rouge">aws_directory_service_directory.simplead</code> resource.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ssm configuration to join ec2 instance to ad via ssm, via https://stackoverflow.com/a/63706452</span>
<span class="k">resource</span> <span class="s2">"aws_ssm_document"</span> <span class="s2">"join_simplead"</span> <span class="p">{</span>
  <span class="nx">name</span>          <span class="o">=</span> <span class="s2">"join-simplead"</span>
  <span class="nx">document_type</span> <span class="o">=</span> <span class="s2">"Command"</span>
  <span class="nx">content</span> <span class="o">=</span> <span class="nx">jsonencode</span><span class="p">(</span>
    <span class="p">{</span>
      <span class="s2">"schemaVersion"</span> <span class="p">=</span> <span class="s2">"2.2"</span>
      <span class="s2">"description"</span>   <span class="p">=</span> <span class="s2">"aws:domainJoin"</span>
      <span class="s2">"mainSteps"</span> <span class="p">=</span> <span class="p">[</span>
        <span class="p">{</span>
          <span class="s2">"action"</span> <span class="p">=</span> <span class="s2">"aws:domainJoin"</span><span class="p">,</span>
          <span class="s2">"name"</span>   <span class="p">=</span> <span class="s2">"domainJoin"</span><span class="p">,</span>
          <span class="s2">"inputs"</span> <span class="p">=</span> <span class="p">{</span>
            <span class="s2">"directoryId"</span>    <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span>
            <span class="s2">"directoryName"</span>  <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">name</span><span class="p">,</span>
            <span class="s2">"dnsIpAddresses"</span> <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">dns_ip_addresses</span>
          <span class="p">}</span>
        <span class="p">}</span>
      <span class="p">]</span>
    <span class="p">}</span>
  <span class="p">)</span>

  <span class="k">provider</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
<span class="p">}</span>

<span class="c1"># associate the configuration with the instance</span>
<span class="k">resource</span> <span class="s2">"aws_ssm_association"</span> <span class="s2">"join_simplead_administration_server"</span> <span class="p">{</span>
  <span class="nx">name</span> <span class="o">=</span> <span class="nx">aws_ssm_document</span><span class="p">.</span><span class="nx">join_simplead</span><span class="p">.</span><span class="nx">name</span>
  <span class="nx">targets</span> <span class="p">{</span>
    <span class="nx">key</span>    <span class="o">=</span> <span class="s2">"InstanceIds"</span>
    <span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">simplead_administration_server</span><span class="p">.</span><span class="nx">id</span><span class="p">]</span>

    <span class="c1"># can also join based on a tag instead of a list of instance ids, see https://www.flypenguin.de/2021/10/18/aws---auto-join-windows-clients-to-a-managed-ad/</span>
  <span class="p">}</span>

  <span class="k">provider</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
<span class="p">}</span>
</code></pre></div></div>
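
<p>The tag-based variant mentioned in the comment above could be sketched as follows. The <code class="language-plaintext highlighter-rouge">JoinSimpleAD</code> tag and the resource name are hypothetical; this post sticks with targeting by instance ID.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical tag-based association: targets every instance tagged JoinSimpleAD = true
resource "aws_ssm_association" "join_simplead_by_tag" {
  name = aws_ssm_document.join_simplead.name
  targets {
    key    = "tag:JoinSimpleAD"
    values = ["true"]
  }

  provider = aws.ireland
}
</code></pre></div></div>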

<p>That’s it! After <code class="language-plaintext highlighter-rouge">terraform apply</code> and a few minutes (up to a quarter of an hour in my tests) of twiddling your thumbs waiting for <code class="language-plaintext highlighter-rouge">EC2Launch</code> to execute our PowerShell commands and SSM Agent to join the instance to the directory (and reboot it at least once), you should be able to remote into the instance using Fleet Manager Remote Desktop. Upon logging in with the AD administration credentials – <code class="language-plaintext highlighter-rouge">ad\administrator</code> and the password from the <code class="language-plaintext highlighter-rouge">.env</code> file – you’ll be greeted with an “Other user” login screen. That means things worked! (If not, don’t fret: Down at the bottom of this post, you’ll find <a href="#debugging-seamless-ad-join-failures">some pointers on debugging seamless AD join failures</a>.)</p>

<p><em>Pricing (as of September 2024):</em> The EC2 instance will set you back around $1.25/day, $0.99 of which is for the instance itself (which shrinks to zero when the instance isn’t running), with the 50 GB of block storage assigned to the <code class="language-plaintext highlighter-rouge">C:\</code> drive costing $0.14. The remaining $0.12 can be explained by the public IP address associated with the instance (a little wasteful, but still cheaper than a NAT gateway).</p>

<h2 id="creating-a-directory-user-for-ad-connector-and-while-youre-at-it-your-first-workspaces-user">Creating a directory user for AD Connector (…and, while you’re at it, your first WorkSpaces user)</h2>

<p>With our administration instance up and running, it’s prime time to set up the aforementioned AD user with permissions to join future EC2 instances and WorkSpaces to the directory as required by AD Connector. While we could reuse the <code class="language-plaintext highlighter-rouge">ad\administrator</code> user for this, that’s <a href="https://en.wikipedia.org/wiki/Principle_of_least_privilege">bad practice</a>.</p>

<p>As it just so happens, I won’t need to explain that here because AWS provides<sup id="fnref:archive"><a href="#fn:archive" class="footnote" rel="footnote" role="doc-noteref">10</a></sup> a <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html#connect_delegate_privileges">handy step-by-step guide in the AD Connector docs</a>. What’s not mentioned there is that the user’s password must be compliant with <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_passwords_account-policy.html#default-policy-details">AWS password requirements</a>, so keep that in mind.</p>

<p><em>(Note that this means you can’t just take the Terraform code from this post and <code class="language-plaintext highlighter-rouge">terraform apply</code> it from top to bottom – there’s this manual step in-between.)</em></p>

<p>Assuming you’ve followed these instructions and have created an AD user account (<code class="language-plaintext highlighter-rouge">ad\adconnector</code>, say – and make sure you <em>un</em>check the “User must change password at next logon” checkbox) that’s a member of the <code class="language-plaintext highlighter-rouge">Connectors</code> group, store its password in the <code class="language-plaintext highlighter-rouge">.env</code> file – we’ll need it in a bit.</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">SIMPLEAD_ADMIN_PASSWORD</span><span class="p">=</span><span class="s">§up0rS3cretAd4dminPässwor&amp;</span>
<span class="py">SIMPLEAD_ADCONNECTOR_SERVICE_ACCOUNT_PASSWORD</span><span class="p">=</span><span class="s">ÆnotherPässwor&amp;That'sEv3nM0arSecre1</span>
</code></pre></div></div>

<p>Before closing the connection to the AD administration server, feel free to already create a standard AD user (no non-default group memberships needed, but once again do make sure to uncheck the “User must change password at next logon” checkbox) for your first WorkSpace.</p>

<p><img src="/static/sadwsp-user.png" alt="" /></p>

<p class="caption">I’ve heard pictures drive engagement, so here, 3300ish words in, is a screenshot of the user creation modal for my <code class="language-plaintext highlighter-rouge">noah</code> user.</p>

<h2 id="letting-the-two-vpcs-talk-to-each-other-with-a-vpc-peering-connection">Letting the two VPCs talk to each other with a VPC peering connection</h2>

<p>Before setting up AD Connector (let alone WorkSpaces), we need to connect the two VPCs in Frankfurt and Ireland. You won’t be surprised to learn that AWS offers <a href="https://medium.com/awesome-cloud/aws-difference-between-vpc-peering-and-transit-gateway-comparison-aws-vpc-peering-vs-aws-transit-gateway-3640a464be2d">a number of technologies</a> for linking multiple VPCs together. If you’re dealing with a dense “network of networks”, a <a href="https://aws.amazon.com/transit-gateway/">Transit Gateway</a> might prove a more maintainable solution than a more basic and inexpensive VPC peering connection – utter overkill here, though. Quoting from <a href="https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html">the docs</a>:</p>

<blockquote>
  <p>A VPC peering connection is a networking connection between two VPCs that enables you to route traffic between them […]. Instances in either VPC can communicate with each other as if they are within the same network. You can create a VPC peering connection between your own VPCs, or with a VPC in another AWS account. The VPCs can be in different Regions (also known as an inter-Region VPC peering connection).</p>
</blockquote>

<p>Inter-Region VPC peering connection! Sounds like just the tool for the job.</p>

<blockquote>
  <p>[R]esources in the VPCs (for example, EC2 instances and Lambda functions) in different AWS Regions can communicate with each other using private IP addresses, without using a gateway, VPN connection, or network appliance. The traffic remains in the private IP address space. All inter-Region traffic is encrypted with no single point of failure, or bandwidth bottleneck. Traffic always stays on the global AWS backbone, and never traverses the public internet, which reduces threats, such as common exploits, and DDoS attacks.</p>
</blockquote>

<p>Brilliant – where can I sign up?</p>

<p>Right within Terraform, of course, and <a href="https://github.com/grem11n/terraform-aws-vpc-peering/blob/master/examples/single-account-multi-region/README.md">the <code class="language-plaintext highlighter-rouge">grem11n/vpc-peering/aws</code> module</a> makes the establishment of the VPC peering connection and the required <a href="https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-routing.html">routing table updates</a> surprisingly painless.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="s2">"vpc_peering_single_account_multi_region_main_ireland"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"grem11n/vpc-peering/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"7.0.0"</span>

  <span class="nx">providers</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">aws</span><span class="p">.</span><span class="nx">this</span> <span class="o">=</span> <span class="nx">aws</span>
    <span class="nx">aws</span><span class="p">.</span><span class="nx">peer</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
  <span class="p">}</span>

  <span class="c1"># important: vpc cidrs must be disjoint!</span>
  <span class="nx">this_vpc_id</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>
  <span class="nx">peer_vpc_id</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">vpc_id</span>

  <span class="nx">auto_accept_peering</span> <span class="o">=</span> <span class="kc">true</span>
<span class="p">}</span>
</code></pre></div></div>

<p>One quick <code class="language-plaintext highlighter-rouge">terraform apply</code> later, any infrastructure in Frankfurt can now talk to the AD domain controllers in Ireland. Well, almost…</p>

<h2 id="extending-the-simple-ad-security-group-to-allow-inbound-traffic-from-frankfurt">Extending the Simple AD security group to allow inbound traffic from Frankfurt</h2>

<p>When setting up a Simple AD directory, AWS automatically creates a security group associated with the directory’s domain controllers and DNS servers. At the time of writing, this security group isn’t referenced<sup id="fnref:secdocs"><a href="#fn:secdocs" class="footnote" rel="footnote" role="doc-noteref">11</a></sup> from the “Directory details” page of your directory, so you’ll need to go hunting for it in the <a href="https://eu-west-1.console.aws.amazon.com/vpcconsole/home?region=eu-west-1#SecurityGroups:">VPC dashboard</a>: it’ll be named something like <code class="language-plaintext highlighter-rouge">d-9367c582c5_controllers</code>, with the prefix being your directory’s ID. During Simple AD creation, Terraform also captures the security group’s ID as <code class="language-plaintext highlighter-rouge">aws_directory_service_directory.simplead.security_group_id</code>, so you can also identify it that way.</p>
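
<p>For instance, a small <code class="language-plaintext highlighter-rouge">output</code> block along these lines (the output name is arbitrary) would print that ID after each <code class="language-plaintext highlighter-rouge">terraform apply</code>:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># surfaces the id of the aws-created security group in terraform's output
output "simplead_security_group_id" {
  value = aws_directory_service_directory.simplead.security_group_id
}
</code></pre></div></div>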

<p><img src="/static/sadwsp-securitygroup.png" alt="" /></p>

<p>As you can see in the screenshot above, this security group permits access from the host VPC’s CIDR block to a plethora of ports of the directory servers (that’s because AD comprises a <a href="https://www.encryptionconsulting.com/ports-required-for-active-directory-and-pki/">variety of services</a>). So, with our VPC peering connection established, we’ll now need to extend the security group rules to <em>also</em> allow access from our network in Frankfurt in order for instances located there to communicate with the directory.</p>

<p>As far as I’m aware (…but I’m by no means an expert!), Terraform doesn’t provide a convenient way of retrieving existing security group rules, then duplicating them but with certain fields modified. So I just went through the inbound rules listed above and manually implemented them in Terraform. Again, the <code class="language-plaintext highlighter-rouge">terraform-aws-modules/security-group/aws</code> module makes this relatively painless, enabling me to specify the rules in a dense, almost tabular manner instead of as separate <code class="language-plaintext highlighter-rouge">resource</code>s:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create relevant security group rules to enable access from peered vpc in frankfurt</span>
<span class="c1"># can verify that you haven't missed any by sorting by port range in the aws management console</span>
<span class="k">module</span> <span class="s2">"security_group_extensions_simplead_controllers"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/security-group/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.1.2"</span>

  <span class="nx">create_sg</span>         <span class="o">=</span> <span class="kc">false</span> <span class="c1"># because it already exists and isn't directly managed by terraform</span>
  <span class="nx">security_group_id</span> <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">security_group_id</span>

  <span class="c1"># define source cidr block (will be used in ingress_with_cidr_blocks below if none other given)</span>
  <span class="nx">ingress_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="kd">local</span><span class="p">.</span><span class="nx">vpc_cidr</span><span class="p">]</span>

  <span class="c1"># rules (i added some descriptions)</span>
  <span class="nx">ingress_with_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>   <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>    <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"icmp"</span><span class="p">,</span> <span class="nx">description</span> <span class="o">=</span> <span class="s2">"All ICMP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>   <span class="mi">53</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>    <span class="mi">53</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"DNS"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>   <span class="mi">53</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>    <span class="mi">53</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"DNS"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>   <span class="mi">88</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>    <span class="mi">88</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Kerberos Authentication"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>   <span class="mi">88</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>    <span class="mi">88</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Kerberos Authentication"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">123</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">123</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"NTP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">135</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">135</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"RPC"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">138</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">138</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"NetBIOS Datagram Service"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">389</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">389</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"LDAP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">389</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">389</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"LDAP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">445</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">445</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"SMB"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">445</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">445</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"SMB"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">464</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">464</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Kerberos"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">464</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">464</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Kerberos"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span>  <span class="mi">636</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>   <span class="mi">636</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"LDAPS"</span><span class="p">},</span>
    <span class="c1">#{from_port =  636, to_port =   636, protocol = "udp",  description = "LDAPS"},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">3268</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span>  <span class="mi">3269</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"LDAP (Global Catalog)"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">1024</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">65535</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"Various TCP"</span><span class="p">},</span>  <span class="c1"># contains previous entry, but i'm merely replicating what aws sets</span>
  <span class="p">]</span>

  <span class="nx">providers</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">aws</span> <span class="o">=</span> <span class="nx">aws</span><span class="p">.</span><span class="nx">ireland</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>You’d think it’d also be possible to, instead of granting access to the entire CIDR block of our VPC in Frankfurt, create a security group there, assign it to instances you want to join to the directory, and reference it <em>instead of</em> the CIDR block. While cross-VPC and even cross-account security group references are possible, cross-Region ones sadly <a href="https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-security-groups.html">won’t</a> work:</p>

<blockquote>
  <p>You can’t reference the security group of a peer VPC that’s in a different Region. Instead, use the CIDR block of the peer VPC.</p>
</blockquote>

<p>Can’t be helped.</p>

<p>You can verify that these rules have been successfully applied (and that VPC peering works) by trying to ping the IP address of one of your directory’s DNS servers (determine it by checking the value of <code class="language-plaintext highlighter-rouge">aws_directory_service_directory.simplead.dns_ip_addresses[0]</code> via <code class="language-plaintext highlighter-rouge">terraform console</code>, for example) from an existing instance<sup id="fnref:outicmp"><a href="#fn:outicmp" class="footnote" rel="footnote" role="doc-noteref">12</a></sup> in Frankfurt.</p>
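
<p>If you’d rather not poke around in <code class="language-plaintext highlighter-rouge">terraform console</code>, one option is to expose the addresses as an output. This is only a minimal sketch: the output name <code class="language-plaintext highlighter-rouge">simplead_dns_ip_addresses</code> is made up for illustration, and the attribute is wrapped in <code class="language-plaintext highlighter-rouge">sort()</code> on the assumption that the provider may hand it back as an unordered set.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical output (the name is an assumption) listing the directory's dns server ips
output "simplead_dns_ip_addresses" {
  # sort() returns a list in a stable order, whether the attribute is a list or a set
  value = sort(aws_directory_service_directory.simplead.dns_ip_addresses)
}
</code></pre></div></div>

<p>After the next <code class="language-plaintext highlighter-rouge">terraform apply</code>, running <code class="language-plaintext highlighter-rouge">terraform output simplead_dns_ip_addresses</code> should then list the addresses to ping.</p>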

<h2 id="routing-ad-dns-traffic-from-frankfurt-to-ireland-with-a-route-53-resolver-outbound-endpoint--sketches-of-alternate-approaches">Routing AD DNS traffic from Frankfurt to Ireland with a Route 53 Resolver outbound endpoint (&amp; sketches of alternate approaches)</h2>

<p>Pinging the DNS servers is nice and all, but having DNS requests to <a href="https://learn.microsoft.com/en-us/archive/blogs/servergeeks/dns-records-that-are-required-for-proper-functionality-of-active-directory">various subdomains</a> of <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code> (our Simple AD’s fully qualified domain name) delivered to them is required for many AD operations. In our VPC in Ireland, this seems to happen automatically (or at least it’s not an issue during seamless domain joins there), but in Frankfurt, we must first configure Amazon’s <a href="https://docs.aws.amazon.com/vpc/latest/userguide/AmazonDNS-concepts.html#AmazonDNS">internal DNS resolution</a>, called the Route 53 Resolver, to forward requests matching the AD’s FQDN to said DNS servers.</p>

<p>That’s required because sadly, AD Connector – which we’ll set up in the next section – doesn’t magically take care of this. (I’ve performed a number of tests where domain joins would fail due to retroactively-obvious<sup id="fnref:familiar"><a href="#fn:familiar" class="footnote" rel="footnote" role="doc-noteref">13</a></sup> DNS resolution issues.)</p>

<p>One potential approach to sort this out is <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_dhcp_options_set.html">creating a DHCP options set</a>…</p>

<blockquote>
  <p>AWS recommends that you create a DHCP options set for your AWS Directory Service directory and assign the DHCP options set to the VPC that your directory is in. This allows any instances in that VPC to point to the specified domain and DNS servers to resolve their domain names.</p>
</blockquote>

<p>…but that won’t work here: While simple, this method sends <em>all</em> DNS traffic in the relevant VPC to the DNS servers configured in the options set (here: the AD DNS servers). In Frankfurt, with DNS traffic thusly redirected to Ireland, this predictably breaks<sup id="fnref:tried"><a href="#fn:tried" class="footnote" rel="footnote" role="doc-noteref">14</a></sup> all kinds of things. And while the Simple AD DNS servers fall back to the Route 53 Resolver for anything they are not authoritative<sup id="fnref:authoritsource"><a href="#fn:authoritsource" class="footnote" rel="footnote" role="doc-noteref">15</a></sup> for, that’s <em>Ireland’s Route 53 Resolver</em>, which of course isn’t aware of infrastructure in Frankfurt.</p>

<p>Instead, it’s possible to configure<sup id="fnref:primer"><a href="#fn:primer" class="footnote" rel="footnote" role="doc-noteref">16</a></sup> the Route 53 Resolver in Frankfurt to proxy <em>only</em> DNS queries for <code class="language-plaintext highlighter-rouge">*ad.ourcooldomain.com</code> to the Simple AD DNS servers in Ireland <a href="https://aws.amazon.com/blogs/networking-and-content-delivery/integrating-your-directory-services-dns-resolution-with-amazon-route-53-resolvers/">using an Outbound Endpoint</a> while letting Amazon’s standard DNS resolution handle the rest as before.</p>

<p><em>(Note that Route 53 Resolver endpoints, while being a very “AWS-y” solution to this problem, are surprisingly expensive – so I’ll discuss an alternate solution a bit later.)</em></p>

<p>Any Route 53 Resolver endpoint requires you to set up (or reuse) a security group which, in the case of outbound endpoints, <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-forwarding-outbound-queries.html#resolver-forwarding-outbound-queries-endpoint-values">regulates</a> which targets the endpoint can forward DNS requests to.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># security group for route 53 resolver outbound endpoint</span>
<span class="k">module</span> <span class="s2">"security_group_ad_dns_outbound_endpoint"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/security-group/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.1.2"</span>

  <span class="nx">name</span>        <span class="o">=</span> <span class="s2">"ad-dns-outbound-endpoint-sg"</span>
  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"AD DNS Outbound Endpoint Security Group"</span>
  <span class="nx">vpc_id</span>      <span class="o">=</span> <span class="k">module</span><span class="err">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>

  <span class="c1"># there's probably no reason to restrict outbound dns traffic (since targets are explicitly specified in rules later on), but do it anyway</span>
  <span class="nx">egress_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">vpc_ireland</span><span class="p">.</span><span class="nx">vpc_cidr_block</span><span class="p">]</span>
  <span class="nx">egress_with_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">53</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">53</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">53</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">53</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">},</span>
  <span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The endpoint itself, then, can be defined as follows, specifying whether to allow unencrypted and/or HTTPS-encrypted DNS requests as well as IPv4 and/or IPv6 addressing.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># route 53 resolver outbound endpoint for forwarding dns queries concerning ad fqdn to ad dns servers while keeping other dns queries as is</span>
<span class="k">resource</span> <span class="s2">"aws_route53_resolver_endpoint"</span> <span class="s2">"ad_dns_outbound_endpoint"</span> <span class="p">{</span>
  <span class="nx">name</span>      <span class="o">=</span> <span class="s2">"ad-dns-outbound-endpoint"</span>
  <span class="nx">direction</span> <span class="o">=</span> <span class="s2">"OUTBOUND"</span>

  <span class="nx">security_group_ids</span> <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">security_group_ad_dns_outbound_endpoint</span><span class="p">.</span><span class="nx">security_group_id</span><span class="p">]</span>

  <span class="nx">dynamic</span> <span class="s2">"ip_address"</span> <span class="p">{</span>
    <span class="nx">for_each</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">private_subnets</span>
    <span class="nx">content</span> <span class="p">{</span>
      <span class="nx">subnet_id</span> <span class="o">=</span> <span class="nx">ip_address</span><span class="p">.</span><span class="nx">value</span>
    <span class="p">}</span>
  <span class="p">}</span>

  <span class="c1"># the construct above repeats the following for each private subnet (=index of module.vpc.private_subnets)</span>
  <span class="c1">#ip_address {</span>
  <span class="c1">#  subnet_id = module.vpc.private_subnets[0]</span>
  <span class="c1">#}</span>

  <span class="nx">protocols</span>              <span class="o">=</span> <span class="p">[</span><span class="s2">"Do53"</span><span class="p">]</span> <span class="c1"># unencrypted only (all aws-internal anyway)</span>
  <span class="nx">resolver_endpoint_type</span> <span class="o">=</span> <span class="s2">"IPV4"</span>   <span class="c1"># ipv4 (no need for ipv6 between frankfurt and ireland - again, all aws-internal anyway)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This allocates an IP address for the outbound endpoint in each private subnet (to save costs at the expense of some resiliency, you could – and, in this scenario, we actually did – constrain it to two subnets only).</p>
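
<p>Constraining the endpoint to two subnets could look roughly like the following variation of the <code class="language-plaintext highlighter-rouge">dynamic</code> block, which would take the place of the original one inside the <code class="language-plaintext highlighter-rouge">aws_route53_resolver_endpoint</code> resource. Treat it as a sketch: it assumes the first two entries of <code class="language-plaintext highlighter-rouge">module.vpc.private_subnets</code> are the ones you want, ideally located in different Availability Zones.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  # variation of the dynamic block above: allocate ip addresses in the first two private subnets only
  dynamic "ip_address" {
    # slice()'s end index is exclusive, so this selects elements 0 and 1
    for_each = slice(module.vpc.private_subnets, 0, 2)
    content {
      subnet_id = ip_address.value
    }
  }
</code></pre></div></div>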

<p>With the outbound endpoint all set up, the redirection of DNS requests for <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code> (and subdomains thereof) now needs to be configured within a <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-rules-managing.html">resolver rule</a> associated with our endpoint:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"aws_route53_resolver_rule"</span> <span class="s2">"ad_dns_outbound_rule"</span> <span class="p">{</span>
  <span class="nx">name</span>                 <span class="o">=</span> <span class="s2">"simplead-dns-outbound-rule"</span>

  <span class="nx">rule_type</span>            <span class="o">=</span> <span class="s2">"FORWARD"</span>                                     <span class="c1"># forward...</span>
  <span class="nx">domain_name</span>          <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">name</span> <span class="c1"># ...anything suffixed by this domain...</span>

  <span class="nx">resolver_endpoint_id</span> <span class="o">=</span> <span class="nx">aws_route53_resolver_endpoint</span><span class="p">.</span><span class="nx">ad_dns_outbound_endpoint</span><span class="p">.</span><span class="nx">id</span> <span class="c1"># ...via this endpoint...</span>

  <span class="c1"># ...to these target dns servers</span>
  <span class="nx">dynamic</span> <span class="s2">"target_ip"</span> <span class="p">{</span>
    <span class="nx">for_each</span> <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">dns_ip_addresses</span>
    <span class="nx">content</span> <span class="p">{</span>
      <span class="nx">ip</span> <span class="o">=</span> <span class="nx">target_ip</span><span class="p">.</span><span class="nx">value</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="c1"># associate rule with the vpc our outbound endpoint lives in</span>
<span class="k">resource</span> <span class="s2">"aws_route53_resolver_rule_association"</span> <span class="s2">"simplead_dns_outbound_association"</span> <span class="p">{</span>
  <span class="nx">resolver_rule_id</span> <span class="o">=</span> <span class="nx">aws_route53_resolver_rule</span><span class="p">.</span><span class="nx">ad_dns_outbound_rule</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">vpc_id</span>           <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">terraform apply</code> should take less than five minutes.</p>

<p>Afterwards, similarly to how you earlier verified that the peering connection worked by pinging the directory servers from Frankfurt, you can now run <code class="language-plaintext highlighter-rouge">nslookup ad.ourcooldomain.com</code> on any preexisting EC2 instance in Frankfurt – if the outbound endpoint works correctly, this should now resolve instead of timing out.</p>

<p><em>Pricing (as of September 2024):</em> Surprisingly<sup id="fnref:whysurpr"><a href="#fn:whysurpr" class="footnote" rel="footnote" role="doc-noteref">17</a></sup> expensive, as indicated previously. A Route 53 Endpoint <a href="https://aws.amazon.com/route53/pricing/">costs</a> $0.125/hour per ENI, <em>i.e.</em>, for each <code class="language-plaintext highlighter-rouge">ip_address { subnet_id = ... }</code> block in the definition of your <code class="language-plaintext highlighter-rouge">aws_route53_resolver_endpoint</code>. Specifying just one <code class="language-plaintext highlighter-rouge">ip_address</code> block won’t work because, as with most fully managed services of this kind, AWS requires you to deploy Route 53 Endpoints in at least two subnets so that maintenance can be performed without downtime. So, times two, that’s $0.25/hour, which <a href="/posts/1m.html">comes out</a> to $180ish/month (plus negligible traffic charges).</p>
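
<p>To sanity-check that number for your own setup, the arithmetic can be sketched as a few Terraform locals. Everything here is hypothetical scaffolding: the local and output names are mine, a month is assumed to have 30 days, and the hourly rate is simply the one quoted above as of September 2024.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># back-of-the-envelope cost estimate (all names are assumptions, not part of the actual setup)
locals {
  resolver_eni_hourly_usd = 0.125   # price per eni and hour, as of september 2024
  resolver_eni_count      = 2       # one per ip_address block, minimum of two
  hours_per_month         = 24 * 30 # assuming a 30-day month
}

output "resolver_endpoint_monthly_usd" {
  value = local.resolver_eni_hourly_usd * local.resolver_eni_count * local.hours_per_month
}
</code></pre></div></div>

<p>With two ENIs, this evaluates to 180. Iterating over three private subnets, as the unmodified <code class="language-plaintext highlighter-rouge">dynamic</code> block further up would do in a three-subnet VPC, bumps it to 270.</p>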

<p>For us, these costs aren’t a deal-breaker since this infrastructure isn’t meant to be permanent – but your mileage may vary. Hence:</p>

<h3 id="alternative-roll-your-own-dns-forwarding-setup-with-unbound">Alternative: Roll your own DNS forwarding setup with Unbound</h3>

<p>Whilst writing this post, I came across an alternative in an old <a href="https://serverfault.com/questions/853305/forwarding-all-dns-request-in-a-zone-to-another-dns-server-in-route53">Server Fault answer</a>: As <a href="https://aws.amazon.com/blogs/security/how-to-set-up-dns-resolution-between-on-premises-networks-and-aws-by-using-unbound/">outlined in an AWS Security Blog post from 2016</a> (that’s from before Route 53 Endpoints were a thing), it’s possible to set up <a href="https://nlnetlabs.nl/projects/unbound/about/">Unbound</a>, an open-source DNS resolver, on a tiny EC2 instance and configure it to implement essentially the same forwarding rule we defined above while routing any other requests to AWS-provided DNS. Server setup and configuration can be performed wholly with <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html">EC2 user data</a>.</p>

<p>This instance (or, for resiliency, two instances – there are no tricky synchronization requirements here, they’d be two independent and identical DNS forwarders) can then be referenced in a DHCP options set as mentioned earlier.</p>
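
<p>The DHCP options set part of that could be sketched as follows. This is untested and rests on assumptions: <code class="language-plaintext highlighter-rouge">aws_instance.unbound_a</code> and <code class="language-plaintext highlighter-rouge">aws_instance.unbound_b</code> are hypothetical names for the two forwarder instances (which aren’t defined anywhere in this post), and the resource names are made up as well. If your VPC module already manages a DHCP options set, you’d want to configure the name servers there instead.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># dhcp options set pointing instances in frankfurt at the unbound forwarders
# (aws_instance.unbound_a and aws_instance.unbound_b are hypothetical, not defined in this post)
resource "aws_vpc_dhcp_options" "unbound_forwarders" {
  domain_name_servers = [
    aws_instance.unbound_a.private_ip,
    aws_instance.unbound_b.private_ip,
  ]
}

# attach the options set to the frankfurt vpc
resource "aws_vpc_dhcp_options_association" "unbound_forwarders" {
  vpc_id          = module.vpc.vpc_id
  dhcp_options_id = aws_vpc_dhcp_options.unbound_forwarders.id
}
</code></pre></div></div>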

<p>While I haven’t tested this solution (and it’s from far-gone 2016), I can’t think of a reason as to why it <em>shouldn’t</em> still work in 2024 and beyond.</p>

<h3 id="not-an-alternative-attaching-ns-and-glue-entries-pointing-at-your-ad-dns-servers-to-your-domain">(Not an) alternative: Attaching NS and Glue entries pointing at your AD DNS servers to your domain</h3>

<p>Not being a DNS expert, I briefly tried to <a href="https://www.youtube.com/watch?v=lIFE7h3m40U">bodge</a> together a DNS-based solution where I’d set up some subdomains, NS records and <a href="https://serverfault.com/questions/764937/why-dont-ns-records-contain-ip-addresses">glue records</a> to <em>publicly</em> delegate authority for <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code> to the AD DNS servers. After <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver.html">all</a>:</p>

<blockquote>
  <p>For public domain names, Route 53 Resolver performs recursive lookups against public name servers on the internet.</p>
</blockquote>

<p>I won’t detail all the variations I tried since this <em>can’t</em> work. That’s because – and I’m only 60% confident in the following explanation – the NS records for <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code> would need to resolve to the private IP addresses of the AD DNS servers, which the Route 53 Resolver, being located outside the VPC, can’t reach and thus also can’t recursively query.</p>

<p><em>(Apart from the fact that this idea went nowhere, it’d only work if you, in fact, set an existing domain whose DNS you control as your AD’s FQDN (which isn’t otherwise required), plus it’d leak the internal IP addresses of your AD DNS servers to the whole wide world, which – despite public access being locked out through security group rules – isn’t ideal to say the least.)</em></p>

<h2 id="enabling-seamless-domain-joins-and-workspaces-in-frankfurt-with-ad-connector">Enabling seamless domain joins (and WorkSpaces) in Frankfurt with AD Connector</h2>

<p>With the Amazon-managed EC2 instances hosting the directory in Ireland now accessible via VPC peering and integrated into DNS resolution, we could already <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/join_windows_instance.html">manually join EC2 instances</a> in Frankfurt to the AD.</p>

<p>But <em>seamless</em> joins of EC2 instances rely on AWS APIs to initialize the join, and in my testing, there’s no way of having those API calls reference a directory in a different Region. Similarly, WorkSpaces must be associated<sup id="fnref:registered"><a href="#fn:registered" class="footnote" rel="footnote" role="doc-noteref">18</a></sup> with a directory located in the same Region as those WorkSpaces.</p>

<p>While primarily designed to forge a connection to, say, an on-premises Microsoft Active Directory, <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/directory_ad_connector.html">AD Connector</a> can of course <a href="https://repost.aws/knowledge-center/workspaces-ad-different-region">also target directories hosted in other AWS Regions</a>, and luckily for us, it supports Simple AD. This is not advertised or explicitly documented anywhere, as far as I can tell, but it “just works”.</p>

<p>I’m not sure exactly how AD Connector works under the hood, but <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html">from what</a> I can gather, it functionally acts as a proxy:</p>

<blockquote>
  <p>AD Connector is a directory gateway with which you can redirect directory requests to your on-premises Microsoft Active Directory without caching any information in the cloud. […] When connected to your existing directory, all of your directory data remains on your domain controllers. AWS Directory Service does not replicate any of your directory data. […]</p>

  <p>When you sign in to any AWS application or service integrated with an AD Connector (AWS IAM Identity Center included), the app or service forwards your authentication request to AD Connector which then forwards the request to a domain controller in your self-managed Active Directory for authentication. If you are successfully authenticated to your self-managed Active Directory, AD Connector then returns an authentication token to the app or service (similar to a Kerberos token).</p>
</blockquote>

<p>Never mind <em>how</em> AD Connector works – it’s not going to work if we don’t set it up. So, having already created an <code class="language-plaintext highlighter-rouge">ad\adconnector</code> directory user with the permissions required by AD Connector earlier, it’s time to put it to work:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ad connector in frankfurt, making ireland simple ad available there</span>
<span class="k">resource</span> <span class="s2">"aws_directory_service_directory"</span> <span class="s2">"adconnector"</span> <span class="p">{</span>
  <span class="nx">name</span>     <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">name</span>               <span class="c1"># must be identical to ad name in ireland</span>
  <span class="nx">password</span> <span class="o">=</span> <span class="kd">local</span><span class="p">.</span><span class="nx">envs</span><span class="p">[</span><span class="s2">"SIMPLEAD_ADCONNECTOR_SERVICE_ACCOUNT_PASSWORD"</span><span class="p">]</span> <span class="c1"># of service user we set up with domain join permissions</span>
  <span class="nx">size</span>     <span class="o">=</span> <span class="s2">"Small"</span>
  <span class="nx">type</span>     <span class="o">=</span> <span class="s2">"ADConnector"</span>

  <span class="nx">connect_settings</span> <span class="p">{</span>
    <span class="nx">customer_dns_ips</span>  <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">simplead</span><span class="p">.</span><span class="nx">dns_ip_addresses</span>
    <span class="nx">customer_username</span> <span class="o">=</span> <span class="s2">"adconnector"</span>                                                  <span class="c1"># service user we set up with domain join permissions</span>
    <span class="nx">subnet_ids</span>        <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">private_subnets</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">private_subnets</span><span class="p">[</span><span class="mi">1</span><span class="p">]]</span> <span class="c1"># at least two</span>
    <span class="nx">vpc_id</span>            <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>
  <span class="p">}</span>

  <span class="c1"># make sure not to create before dns set up correctly</span>
  <span class="c1"># (not actually required here just yet, but potentially for resources that transitively depend on this)</span>
  <span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_route53_resolver_rule_association</span><span class="p">.</span><span class="nx">simplead_dns_outbound_association</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Once <code class="language-plaintext highlighter-rouge">terraform apply</code> has spent 5-7 minutes waiting for Amazon to set up the AD Connector (which automatically performs connectivity checks, failures of which you can debug as outlined <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html#connect_verification">in the docs</a>), you should be good to go.</p>

<p><em>Pricing (as of September 2024)</em>: Identical to Simple AD – free if connected to WorkSpaces with at least one active user per month. Otherwise, it’s <a href="https://aws.amazon.com/directoryservice/other-directories-pricing/">roughly $40/month</a>.</p>

<h2 id="optionally-another-ec2-instance-for-ad-connector-testing-and-maybe-debugging">Optionally, another EC2 instance for AD Connector testing (and maybe debugging)</h2>

<p>To really make sure AD Connector works <em>and</em> our DNS forwarding shenanigans function correctly, I recommend setting up a small EC2 instance in Frankfurt and attempting to, as with the instance in Ireland, have SSM Agent perform a seamless domain join. <em>(If you don’t run into any issues with that, you can terminate the instance again – but if you</em> do <em>run into issues, it’s nice to have a machine ready for debugging instead of trying to figure out potentially-cryptic WorkSpaces error messages.)</em></p>

<p>First create an EC2 key pair named <code class="language-plaintext highlighter-rouge">adconnector-test-server-keypair</code> in Frankfurt – then, add this code to your Terraform project (which is almost identical to the specification of the EC2 instance in Ireland, so, in a shocking turn of events, I don’t believe there’s any need for more exposition):</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="s2">"security_group_adconnector_test_server"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/security-group/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.1.2"</span>

  <span class="nx">name</span>        <span class="o">=</span> <span class="s2">"adconnector-test-server-sg"</span>
  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"AD Connector Test Server Security Group"</span>
  <span class="nx">vpc_id</span>      <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>

  <span class="nx">egress_rules</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"all-all"</span><span class="p">]</span>
<span class="p">}</span>

<span class="k">data</span> <span class="s2">"aws_ami"</span> <span class="s2">"latest_windows_server_2022_base"</span> <span class="p">{</span>
  <span class="nx">most_recent</span> <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">owners</span>      <span class="o">=</span> <span class="p">[</span><span class="s2">"amazon"</span><span class="p">]</span>
  <span class="nx">filter</span> <span class="p">{</span>
    <span class="nx">name</span>   <span class="o">=</span> <span class="s2">"name"</span>
    <span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"Windows_Server-2022-English-Full-Base-*"</span><span class="p">]</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="k">module</span> <span class="s2">"adconnector_test_server"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/ec2-instance/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.6.1"</span>

  <span class="nx">name</span> <span class="o">=</span> <span class="s2">"adconnector-test-server"</span>

  <span class="nx">ami</span>                    <span class="o">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">latest_windows_server_2022_base</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">ignore_ami_changes</span>     <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">instance_type</span>          <span class="o">=</span> <span class="s2">"t3.small"</span>
  <span class="nx">vpc_security_group_ids</span> <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">security_group_adconnector_test_server</span><span class="p">.</span><span class="nx">security_group_id</span><span class="p">]</span>
  <span class="nx">subnet_id</span>              <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">private_subnets</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="c1"># note: private subnet in frankfurt instead of public subnet in ireland, hence no associate_public_ip_address argument</span>
  <span class="nx">key_name</span>               <span class="o">=</span> <span class="s2">"adconnector-test-server-keypair"</span>

  <span class="nx">root_block_device</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
      <span class="nx">encrypted</span>   <span class="o">=</span> <span class="kc">true</span>
      <span class="nx">volume_type</span> <span class="o">=</span> <span class="s2">"gp3"</span>
      <span class="nx">volume_size</span> <span class="o">=</span> <span class="mi">50</span>
    <span class="p">},</span>
  <span class="p">]</span>

  <span class="nx">create_iam_instance_profile</span> <span class="o">=</span> <span class="kc">true</span>
  <span class="nx">iam_role_description</span>        <span class="o">=</span> <span class="s2">"IAM role for adconnector_test_server EC2 instance"</span>
  <span class="nx">iam_role_policies</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">AmazonSSMManagedInstanceCore</span>    <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"</span>
    <span class="nx">AmazonSSMDirectoryServiceAccess</span> <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonSSMDirectoryServiceAccess"</span>
  <span class="p">}</span>

  <span class="nx">user_data</span> <span class="o">=</span> <span class="o">&lt;&lt;-</span><span class="nx">EOF</span>
    <span class="o">&lt;</span><span class="nx">powershell</span><span class="o">&gt;</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-ADLDS</span>          <span class="c1"># "AD LDS Snap-Ins and Command-Line Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-AD-PowerShell</span>  <span class="c1"># "Active Directory module for Windows PowerShell"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-AD-Tools</span>       <span class="c1"># "AD DS and AD LDS Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">RSAT-DNS-Server</span>     <span class="c1"># "DNS Server Tools"</span>
      <span class="nx">Install-WindowsFeature</span> <span class="nx">GPMC</span>                <span class="c1"># "Group Policy Management"</span>

      <span class="c1"># fix powershell not accepting keyboard input by installing current PSReadLine version</span>
      <span class="c1"># via https://repost.aws/questions/QUGfM8RX3bSaadv5P_8f6byg/i-am-unable-to-paste-text-or-type-while-using-fleet-manager-in-certain-windows</span>
      <span class="nx">Install-PackageProvider</span> <span class="o">-</span><span class="nx">Name</span> <span class="nx">NuGet</span> <span class="o">-</span><span class="nx">MinimumVersion</span> <span class="mf">2.8</span><span class="err">.</span><span class="mf">5.201</span> <span class="o">-</span><span class="nx">Force</span>
      <span class="nx">Install-Module</span> <span class="o">-</span><span class="nx">Name</span> <span class="nx">PSReadLine</span> <span class="o">-</span><span class="nx">Force</span>
    <span class="o">&lt;/</span><span class="nx">powershell</span><span class="o">&gt;</span>
  <span class="nx">EOF</span>

  <span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">]</span>
<span class="p">}</span>

<span class="k">resource</span> <span class="s2">"aws_ssm_document"</span> <span class="s2">"join_adconnector"</span> <span class="p">{</span>
  <span class="nx">name</span>          <span class="o">=</span> <span class="s2">"join-adconnector"</span>
  <span class="nx">document_type</span> <span class="o">=</span> <span class="s2">"Command"</span>
  <span class="nx">content</span> <span class="o">=</span> <span class="nx">jsonencode</span><span class="p">(</span>
    <span class="p">{</span>
      <span class="s2">"schemaVersion"</span> <span class="p">=</span> <span class="s2">"2.2"</span>
      <span class="s2">"description"</span>   <span class="p">=</span> <span class="s2">"aws:domainJoin"</span>
      <span class="s2">"mainSteps"</span> <span class="p">=</span> <span class="p">[</span>
        <span class="p">{</span>
          <span class="s2">"action"</span> <span class="p">=</span> <span class="s2">"aws:domainJoin"</span><span class="p">,</span>
          <span class="s2">"name"</span>   <span class="p">=</span> <span class="s2">"domainJoin"</span><span class="p">,</span>
          <span class="s2">"inputs"</span> <span class="p">=</span> <span class="p">{</span>
            <span class="s2">"directoryId"</span>    <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span>
            <span class="s2">"directoryName"</span>  <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">name</span><span class="p">,</span>
            <span class="s2">"dnsIpAddresses"</span> <span class="p">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">dns_ip_addresses</span>
          <span class="p">}</span>
        <span class="p">}</span>
      <span class="p">]</span>
    <span class="p">}</span>
  <span class="p">)</span>
<span class="p">}</span>

<span class="k">resource</span> <span class="s2">"aws_ssm_association"</span> <span class="s2">"join_adconnector_test_server"</span> <span class="p">{</span>
  <span class="nx">name</span> <span class="o">=</span> <span class="nx">aws_ssm_document</span><span class="p">.</span><span class="nx">join_adconnector</span><span class="p">.</span><span class="nx">name</span>
  <span class="nx">targets</span> <span class="p">{</span>
    <span class="nx">key</span>    <span class="o">=</span> <span class="s2">"InstanceIds"</span>
    <span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="k">module</span><span class="p">.</span><span class="nx">adconnector_test_server</span><span class="p">.</span><span class="nx">id</span><span class="p">]</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The first time I <code class="language-plaintext highlighter-rouge">terraform apply</code>ed this instance into existence (which, like the administration instance in Ireland, takes up to a quarter of an hour), the seamless domain join failed (<em>i.e.</em>, I couldn’t log in with AD credentials and running the <code class="language-plaintext highlighter-rouge">systeminfo</code> command-line tool as the local administrator yielded <code class="language-plaintext highlighter-rouge">Domain: WORKGROUP</code>). A manual domain join worked out just fine, though, so I tried setting up the EC2 instance again without any changes<sup id="fnref:insanity"><a href="#fn:insanity" class="footnote" rel="footnote" role="doc-noteref">19</a></sup> and that time (and ever since), the seamless domain join succeeded. I’m not sure what to make of that other than it confirming a certain <a href="https://www.youtube.com/watch?v=nn2FB1P_Mn8">age-old adage of IT professionals</a>.</p>

<p>So, if everything’s worked out, you can terminate this instance again. (If not, even after a retry: Down at the bottom of this post, you’ll find <a href="#debugging-seamless-ad-join-failures">some pointers on debugging seamless AD join failures</a>.)</p>

<p>That said, it’s a good “boilerplate-y” starting point for production EC2 instances you wish to join to your directory.</p>
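
<p>To illustrate that, here’s a minimal sketch of how the same SSM document could be associated with any number of instances by targeting a tag instead of individual instance IDs. The tag key <code class="language-plaintext highlighter-rouge">DomainJoin</code> and its value are made up for this example, and the instances would still need the two IAM policies attached to the test server above.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: target instances by tag instead of listing instance ids one by one
# assumption: the tag key "DomainJoin" and the value "adconnector" are hypothetical
resource "aws_ssm_association" "join_adconnector_by_tag" {
  name = aws_ssm_document.join_adconnector.name
  targets {
    key    = "tag:DomainJoin"
    values = ["adconnector"]
  }
}
</code></pre></div></div>

<p>Any instance launched with <code class="language-plaintext highlighter-rouge">tags = { DomainJoin = "adconnector" }</code> should then be picked up by the association without further Terraform changes.</p>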

<h2 id="registring-the-directory-with-workspaces">Registering the directory with WorkSpaces</h2>

<p>We’ve almost reached the goal (which, if you’ve lost track, 6500ish words removed from the title and all, is setting up a WorkSpace in Frankfurt that’s joined to the Simple AD in Ireland). <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/register-deregister-directory.html">But</a> first:</p>

<blockquote>
  <p>To allow WorkSpaces to use an existing AWS Directory Service directory, you must register it with WorkSpaces. After you register a directory, you can launch WorkSpaces in the directory.</p>
</blockquote>

<p>And <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/workspaces-access-control.html#create-default-role">even</a> “firster”:</p>

<blockquote>
  <p>Before you can register a directory using [Terraform], you must verify that a role named <code class="language-plaintext highlighter-rouge">workspaces_DefaultRole</code> exists. This role is created by the Quick Setup or if you launch a WorkSpace using the AWS Management Console, and it grants Amazon WorkSpaces permission to access specific AWS resources on your behalf.</p>
</blockquote>

<p>If you’ve previously experimented with WorkSpaces in the same AWS account, this role may already exist. If not, though, you can easily create it with Terraform, then attach the relevant<sup id="fnref:pools"><a href="#fn:pools" class="footnote" rel="footnote" role="doc-noteref">20</a></sup> AWS managed policies:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># workspaces_DefaultRole role (required before doing workspaces stuff via api/tf)</span>
<span class="c1"># https://docs.aws.amazon.com/workspaces/latest/adminguide/workspaces-access-control.html#create-default-role</span>
<span class="k">data</span> <span class="s2">"aws_iam_policy_document"</span> <span class="s2">"workspaces"</span> <span class="p">{</span>
  <span class="nx">statement</span> <span class="p">{</span>
    <span class="nx">actions</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"sts:AssumeRole"</span><span class="p">]</span>
    <span class="nx">principals</span> <span class="p">{</span>
      <span class="nx">type</span>        <span class="o">=</span> <span class="s2">"Service"</span>
      <span class="nx">identifiers</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"workspaces.amazonaws.com"</span><span class="p">]</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>
<span class="k">resource</span> <span class="s2">"aws_iam_role"</span> <span class="s2">"workspaces_default"</span> <span class="p">{</span>
  <span class="nx">name</span>               <span class="o">=</span> <span class="s2">"workspaces_DefaultRole"</span>
  <span class="nx">assume_role_policy</span> <span class="o">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">aws_iam_policy_document</span><span class="p">.</span><span class="nx">workspaces</span><span class="p">.</span><span class="nx">json</span>
<span class="p">}</span>
<span class="k">resource</span> <span class="s2">"aws_iam_role_policy_attachment"</span> <span class="s2">"workspaces_default_service_access"</span> <span class="p">{</span>
  <span class="nx">role</span>       <span class="o">=</span> <span class="nx">aws_iam_role</span><span class="p">.</span><span class="nx">workspaces_default</span><span class="p">.</span><span class="nx">name</span>
  <span class="nx">policy_arn</span> <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonWorkSpacesServiceAccess"</span>
<span class="p">}</span>
<span class="k">resource</span> <span class="s2">"aws_iam_role_policy_attachment"</span> <span class="s2">"workspaces_default_self_service_access"</span> <span class="p">{</span>
  <span class="nx">role</span>       <span class="o">=</span> <span class="nx">aws_iam_role</span><span class="p">.</span><span class="nx">workspaces_default</span><span class="p">.</span><span class="nx">name</span>
  <span class="nx">policy_arn</span> <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonWorkSpacesSelfServiceAccess"</span>
<span class="p">}</span>
<span class="k">resource</span> <span class="s2">"aws_iam_role_policy_attachment"</span> <span class="s2">"workspaces_default_pool_service_access"</span> <span class="p">{</span>
  <span class="nx">role</span>       <span class="o">=</span> <span class="nx">aws_iam_role</span><span class="p">.</span><span class="nx">workspaces_default</span><span class="p">.</span><span class="nx">name</span>
  <span class="nx">policy_arn</span> <span class="o">=</span> <span class="s2">"arn:aws:iam::aws:policy/AmazonWorkSpacesPoolServiceAccess"</span>
<span class="p">}</span>
</code></pre></div></div>
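
<p>If the role <em>does</em> already exist, creating it a second time will fail with a name conflict. One way around that, sketched here under the assumption that you’re on Terraform 1.5 or later, is an <code class="language-plaintext highlighter-rouge">import</code> block that adopts the existing role into your state:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: adopt a pre-existing workspaces_DefaultRole instead of creating it
# assumption: terraform 1.5 or later (import blocks); iam roles are imported by name
import {
  to = aws_iam_role.workspaces_default
  id = "workspaces_DefaultRole"
}
</code></pre></div></div>

<p>Only the role itself is covered here, so check the plan output for the policy attachments before applying.</p>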

<p>Another step you should perform before registering your directory for use with WorkSpaces is setting up a security group for your WorkSpaces instances – that’s because such a security group should be specified during the registration stage. It can be switched out for a different one later, but, quoting from Amazon’s <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/amazon-workspaces-security-groups.html">documentation</a>:</p>

<blockquote>
  <p>You can add a default WorkSpaces security group to a directory. After you associate a new security group with a WorkSpaces directory, [only] new WorkSpaces that you launch or existing WorkSpaces that you rebuild will have the new security group.</p>
</blockquote>

<p>If you don’t specify a security group, Amazon will auto-generate one (it’ll, in fact, <em>always</em> auto-generate one but that one won’t be used if you specify your own) whose name consists of the directory identifier followed by <code class="language-plaintext highlighter-rouge">_workspacesMembers</code> and which AWS warns against modifying.</p>
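
<p>Should you want to inspect the auto-generated group from Terraform, a lookup by name along these lines might do. This is only a sketch: it relies on the naming scheme described above and assumes the group exists once the directory registration further down has gone through.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: look up the auto-generated security group by its naming scheme
data "aws_security_group" "workspaces_members" {
  name   = "${aws_directory_service_directory.adconnector.id}_workspacesMembers"
  vpc_id = module.vpc.vpc_id

  # assumption: the group only exists after the directory has been registered
  depends_on = [aws_workspaces_directory.adconnector]
}

output "workspaces_members_security_group_id" {
  value = data.aws_security_group.workspaces_members.id
}
</code></pre></div></div>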

<p>Interestingly enough, to enable clients to initiate a connection to your WorkSpaces, you’ll need to add inbound rules for PCoIP and Amazon’s WSP protocol to your security group. This isn’t documented anywhere (I figured it out by trial and error and later found a relevant <a href="https://repost.aws/questions/QUQ1brLyutQPiTPKZMj0yXSg/can-t-access-my-personal-workspace-from-any-client">re:Post question and answer</a>) and these rules also aren’t automatically attached to the auto-generated <code class="language-plaintext highlighter-rouge">_workspacesMembers</code> security group, which is a bit puzzling.</p>

<p>Anyway, enough <a href="https://www.urbandictionary.com/define.php?term=Yapping">yapping</a> – here’s the code:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="s2">"security_group_workspaces"</span> <span class="p">{</span>
  <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"terraform-aws-modules/security-group/aws"</span>
  <span class="nx">version</span> <span class="o">=</span> <span class="s2">"5.1.2"</span>

  <span class="nx">name</span>   <span class="o">=</span> <span class="s2">"workspaces-sg"</span>
  <span class="nx">vpc_id</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">vpc</span><span class="p">.</span><span class="nx">vpc_id</span>

  <span class="c1"># can constrain this depending on what your workspaces need to access (plus you can reference this security group in other security group rules, of course)</span>
  <span class="nx">egress_rules</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"all-all"</span><span class="p">]</span>

  <span class="c1"># might be able to restrict this https://docs.aws.amazon.com/workspaces/latest/adminguide/workspaces-port-requirements.html#network-interfaces</span>
  <span class="c1"># also compare https://docs.aws.amazon.com/workspaces/latest/adminguide/architecture.html</span>
  <span class="nx">ingress_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"0.0.0.0/0"</span><span class="p">]</span>

  <span class="c1"># allow inbound traffic via the two streaming protocols supported by workspaces: pcoip, wsp</span>
  <span class="c1"># note that this is required (tried without and couldn't connect to workspaces), yet not documented</span>
  <span class="c1"># see also: https://repost.aws/questions/QUQ1brLyutQPiTPKZMj0yXSg/can-t-access-my-personal-workspace-from-any-client</span>
  <span class="nx">ingress_with_cidr_blocks</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">4172</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">4172</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"PCoIP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">4172</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">4172</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"PCoIP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">4195</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">4195</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"tcp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"WSP"</span><span class="p">},</span>
    <span class="p">{</span><span class="nx">from_port</span> <span class="o">=</span> <span class="mi">4195</span><span class="p">,</span> <span class="nx">to_port</span> <span class="o">=</span> <span class="mi">4195</span><span class="p">,</span> <span class="nx">protocol</span> <span class="o">=</span> <span class="s2">"udp"</span><span class="p">,</span>  <span class="nx">description</span> <span class="o">=</span> <span class="s2">"WSP"</span><span class="p">},</span>
  <span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
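
<p>As hinted at in the first comment, this security group can be referenced from other security group rules. A minimal sketch, in which the <code class="language-plaintext highlighter-rouge">internal_app</code> security group and the HTTPS port are purely hypothetical stand-ins for whatever your WorkSpaces need to reach:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical security group of an internal web app that workspaces should reach
resource "aws_security_group" "internal_app" {
  name   = "internal-app-sg"
  vpc_id = module.vpc.vpc_id
}

# allow https from anything carrying the workspaces security group
resource "aws_vpc_security_group_ingress_rule" "internal_app_from_workspaces" {
  security_group_id            = aws_security_group.internal_app.id
  referenced_security_group_id = module.security_group_workspaces.security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 443
  to_port                      = 443
  description                  = "HTTPS from WorkSpaces"
}
</code></pre></div></div>

<p>Referencing the group rather than a CIDR range means the rule keeps working if your WorkSpaces subnets change.</p>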

<p>After this rather lengthy prelude, registering your AD Connector with WorkSpaces works as follows. Note that you can configure certain settings for <em>all</em> WorkSpaces you’ll create in your directory during this step – <em>e.g.</em>, which client applications can connect to your WorkSpaces, whether your users can resize their WorkSpaces by themselves, and others. I recommend taking a look at the <a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/workspaces_directory.html">documentation of the <code class="language-plaintext highlighter-rouge">aws_workspaces_directory</code> Terraform resource</a> and also the <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/update-directory-details.html">relevant Amazon docs</a>.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"aws_workspaces_directory"</span> <span class="s2">"adconnector"</span> <span class="p">{</span>
  <span class="nx">directory_id</span> <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">subnet_ids</span>   <span class="o">=</span> <span class="nx">aws_directory_service_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">connect_settings</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nx">subnet_ids</span>

  <span class="nx">self_service_permissions</span> <span class="p">{</span>
    <span class="nx">change_compute_type</span>  <span class="o">=</span> <span class="kc">false</span>
    <span class="nx">increase_volume_size</span> <span class="o">=</span> <span class="kc">false</span>
    <span class="nx">rebuild_workspace</span>    <span class="o">=</span> <span class="kc">false</span>
    <span class="nx">restart_workspace</span>    <span class="o">=</span> <span class="kc">true</span>
    <span class="nx">switch_running_mode</span>  <span class="o">=</span> <span class="kc">false</span>
  <span class="p">}</span>

  <span class="c1"># steer which client software your users can access workspaces through</span>
  <span class="c1"># for this example, just allow macos (i.e. osx)</span>
  <span class="nx">workspace_access_properties</span> <span class="p">{</span>
    <span class="nx">device_type_android</span>    <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_chromeos</span>   <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_ios</span>        <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_linux</span>      <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_osx</span>        <span class="o">=</span> <span class="s2">"ALLOW"</span>
    <span class="nx">device_type_web</span>        <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_windows</span>    <span class="o">=</span> <span class="s2">"DENY"</span>
    <span class="nx">device_type_zeroclient</span> <span class="o">=</span> <span class="s2">"DENY"</span>
  <span class="p">}</span>

  <span class="nx">workspace_creation_properties</span> <span class="p">{</span>
    <span class="nx">custom_security_group_id</span> <span class="o">=</span> <span class="k">module</span><span class="p">.</span><span class="nx">security_group_workspaces</span><span class="p">.</span><span class="nx">security_group_id</span>
    <span class="c1"># can also change the ou workspaces will be created in via default_ou</span>
    <span class="nx">enable_internet_access</span>              <span class="o">=</span> <span class="kc">false</span> <span class="c1"># since we use a nat gateway in our vpc</span>
    <span class="nx">enable_maintenance_mode</span>             <span class="o">=</span> <span class="kc">true</span>  <span class="c1"># true = automatic weekly windows updates, see https://docs.aws.amazon.com/workspaces/latest/adminguide/workspace-maintenance.html</span>
    <span class="nx">user_enabled_as_local_administrator</span> <span class="o">=</span> <span class="kc">false</span> <span class="c1"># no way!</span>
  <span class="p">}</span>

  <span class="c1"># can also restrict access to workspaces based on ip groups, see https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/workspaces_ip_group</span>

  <span class="c1"># make sure directory is not registered before iam role set up</span>
  <span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_iam_role_policy_attachment</span><span class="p">.</span><span class="nx">workspaces_default_service_access</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
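
<p>The IP group restriction mentioned in the comment near the end could look roughly like this. Treat it as a sketch: the group name and the address range (taken from a block reserved for documentation) are placeholders.</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: only allow client connections from a given address range
# assumption: 203.0.113.0/24 is a documentation-only placeholder, substitute your own range
resource "aws_workspaces_ip_group" "office" {
  name        = "office"
  description = "Office network"
  rules {
    source      = "203.0.113.0/24"
    description = "Placeholder range"
  }
}

# then, inside the aws_workspaces_directory resource above:
#   ip_group_ids = [aws_workspaces_ip_group.office.id]
</code></pre></div></div>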

<p>With that, we’re ready to set up…</p>

<h2 id="a-first-workspace">A first WorkSpace</h2>

<p>Assuming the directory was successfully registered with WorkSpaces, it’s now almost trivially easy to set up a WorkSpace for your first user – which, if you followed my instructions, you’ve already created after setting up the service user for AD Connector. If you haven’t, now’s the time!</p>

<p>Note that WorkSpaces, somewhat differently from EC2 instances and AMIs, bundle the compute configuration – CPUs and RAM – with disk images and a streaming protocol (industry-quasi-standard <a href="https://en.wikipedia.org/wiki/Teradici#PCoIP_Protocol">PCoIP</a> or Amazon’s own WSP, which <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/amazon-workspaces-protocols.html">may not be a coin toss depending on your needs</a>) and, in the case of Windows, whether to preinstall a license-included Microsoft Office distribution or not. Such a bundle is called a, uh, <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/amazon-workspaces-bundles.html">bundle</a>:</p>

<blockquote>
  <p>A <em>WorkSpace bundle</em> is a combination of an operating system, and storage, compute, and software resources [and streaming protocol]. When you launch a WorkSpace, you select the bundle that meets your needs. The default bundles available for WorkSpaces are called <em>public bundles</em>.</p>
</blockquote>

<p>(Once you’ve got a WorkSpace up and running, you can – after installing software for your users – <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/create-custom-bundle.html">snapshot it to create a <em>custom bundle</em></a> which can then be deployed on further WorkSpaces.)</p>

<p>You can browse available bundles in the <a href="https://eu-central-1.console.aws.amazon.com/workspaces/v2/workspaces/create-desktops">WorkSpace setup flow</a> in the WorkSpaces Console – try it and note down the bundle ID(s) you wish to try out. <em>(Note that if you change your mind later, you can – if <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/migrate-workspaces.html#migration-limits">certain conditions</a> are met – <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/migrate-workspaces.html">migrate to another bundle</a> while retaining user data.)</em></p>

<p><img src="/static/sadwsp-bundles.png" alt="" /></p>

<p>While you can use the <a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/workspaces_bundle"><code class="language-plaintext highlighter-rouge">aws_workspaces_bundle</code> data source</a> to dynamically determine bundle IDs based on names, I advise against doing that: at the time of writing, there’s <a href="https://github.com/hashicorp/terraform-provider-aws/issues/33445">no option to specify the desired streaming protocol</a> this way. However, unlike EC2 image IDs, WorkSpace bundle IDs rarely change, so hardcoding them is fairly safe. I like to give them names as follows:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">data</span> <span class="s2">"aws_workspaces_bundle"</span> <span class="s2">"standard_server_2022_wsp"</span> <span class="p">{</span>
  <span class="nx">bundle_id</span> <span class="o">=</span> <span class="s2">"wsb-93xk71ss4"</span> <span class="c1"># "Standard with Windows 10 (Server 2022 based)", WSP</span>
<span class="p">}</span>
</code></pre></div></div>
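
<p>Since a hardcoded ID says little by itself, it can be reassuring to have Terraform report what it resolves to. A small sketch (the output name is arbitrary) that surfaces a few of the data source’s attributes after each apply:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: surface what the pinned bundle id currently resolves to
output "standard_server_2022_wsp_bundle" {
  value = {
    name        = data.aws_workspaces_bundle.standard_server_2022_wsp.name
    description = data.aws_workspaces_bundle.standard_server_2022_wsp.description
    owner       = data.aws_workspaces_bundle.standard_server_2022_wsp.owner
  }
}
</code></pre></div></div>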

<p>Then, a WorkSpace using my <code class="language-plaintext highlighter-rouge">noah</code> user created earlier can be set up like this:</p>

<div class="language-terraform highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"aws_workspaces_workspace"</span> <span class="s2">"noah_workspace_image"</span> <span class="p">{</span>
  <span class="nx">directory_id</span> <span class="o">=</span> <span class="nx">aws_workspaces_directory</span><span class="p">.</span><span class="nx">adconnector</span><span class="p">.</span><span class="nx">directory_id</span>
  <span class="c1"># could also reference aws_directory_service_directory.adconnector.id instead but then there's no terraform dependency between workspace and the directory registration, leading to issues on terraform apply/destroy</span>

  <span class="nx">bundle_id</span> <span class="o">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">aws_workspaces_bundle</span><span class="p">.</span><span class="nx">standard_server_2022_wsp</span><span class="p">.</span><span class="nx">bundle_id</span>
  <span class="nx">user_name</span> <span class="o">=</span> <span class="s2">"noah"</span>

  <span class="c1"># since bitlocker encryption is not supported on workspaces, you can encrypt volumes using aws kms keys</span>
  <span class="c1"># this has drawbacks, though, so take a look at the docs to make an informed decision</span>
  <span class="c1"># https://docs.aws.amazon.com/workspaces/latest/adminguide/encrypt-workspaces.html</span>
  <span class="nx">root_volume_encryption_enabled</span> <span class="o">=</span> <span class="kc">false</span>
  <span class="nx">user_volume_encryption_enabled</span> <span class="o">=</span> <span class="kc">false</span>

  <span class="c1"># other options, see https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/workspaces_workspace.html</span>
  <span class="c1"># in my testing, there seems to be a bug (?) where you have to either set values for all these arguments or leave this block out completely</span>
  <span class="nx">workspace_properties</span> <span class="p">{</span>
    <span class="nx">compute_type_name</span>                         <span class="o">=</span> <span class="s2">"STANDARD"</span>
    <span class="nx">user_volume_size_gib</span>                      <span class="o">=</span> <span class="mi">20</span>
    <span class="nx">root_volume_size_gib</span>                      <span class="o">=</span> <span class="mi">80</span>
    <span class="nx">running_mode</span>                              <span class="o">=</span> <span class="s2">"AUTO_STOP"</span> <span class="c1"># or "ALWAYS_ON"</span>
    <span class="nx">running_mode_auto_stop_timeout_in_minutes</span> <span class="o">=</span> <span class="mi">60</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In my experience, it takes about 10-12 minutes for <code class="language-plaintext highlighter-rouge">terraform apply</code> to deploy the WorkSpace and join it to your Simple AD. Afterwards, you can log in using the <a href="https://clients.amazonworkspaces.com">client application of your choice</a> (assuming you’ve enabled it during directory registration).</p>

<p><em>Pricing (as of September 2024)</em>: Depends on the bundle and whether a given WorkSpace was set up <a href="https://www.reddit.com/r/aws/comments/g0no4f/comment/fne7kvx/">with monthly or usage-based billing</a> (via the <code class="language-plaintext highlighter-rouge">running_mode</code> argument). Expect $50-$100 per month for relatively standard configurations. Little bit more with Microsoft Office, little bit less with Linux instead of Windows. A lot more if you go for <a href="https://aws.amazon.com/about-aws/whats-new/2023/10/amazon-workspaces-graphics-g4dn-bundles-ubuntu-desktops/">“GPU-enabled” configurations</a>.</p>

<h2 id="it-works">It works!</h2>

<p>Go download the <a href="https://clients.amazonworkspaces.com">WorkSpaces client for the platform you’re using</a> and log in using your registration code (which is the same for all WorkSpaces authenticated against a given directory, see <code class="language-plaintext highlighter-rouge">aws_workspaces_directory.adconnector.registration_code</code> or in the AWS Management Console) and AD user credentials.</p>

<p><img src="/static/sadwsp-itworks.png" alt="" /></p>

<p class="caption">This website displayed in Firefox running on a WorkSpace accessed through Amazon’s WorkSpaces client for macOS.</p>

<p>So there you go, a WorkSpace in Frankfurt managed<sup id="fnref:gp"><a href="#fn:gp" class="footnote" rel="footnote" role="doc-noteref">21</a></sup> by a Simple AD in Ireland. Not necessarily cost-effective (largely due to the Route 53 Resolver endpoint required for proper DNS routing – though, as mentioned, there’s cheaper alternatives), but it <em>functions</em> just swell.</p>

<p>Did that warrant 8000ish words, though? Who<sup id="fnref:who"><a href="#fn:who" class="footnote" rel="footnote" role="doc-noteref">22</a></sup> knows, but <strong>I hope you’ve learned a thing or three</strong>. I know I have.</p>

<hr />

<h2 id="-appendix-a-non-exhaustive-set-of-pointers-for-debugging-seamless-ad-join-failures"><a name="debugging-seamless-ad-join-failures"></a> Appendix: A non-exhaustive set of pointers for debugging seamless AD join failures</h2>

<p>As promised earlier, here’s a few<sup id="fnref:adcomplex"><a href="#fn:adcomplex" class="footnote" rel="footnote" role="doc-noteref">23</a></sup> seamless AD join failure scenarios I encountered during work on this project, with tips on how to figure out what’s wrong and on approaches for finding your way toward a solution.</p>

<p>Generally, it’s a good idea to familiarize yourself with Microsoft’s <a href="https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/active-directory-domain-join-troubleshooting-guidance">Active Directory domain join troubleshooting guidance</a>. Make sure to also take a look at Amazon’s <a href="https://repost.aws/knowledge-center/ssm-seamless-domain-join-windows">Knowledge Center article on this topic</a> – it lists common reasons why a seamless domain join on AWS might fail and how to resolve the underlying issue in each case.</p>

<h3 id="checking-whether-the-join-failed">Checking <em>whether</em> the join failed</h3>

<p>At the most basic level, knowing what indicates a failed directory join is an important first step.</p>

<ul>
  <li>
    <p>If you can’t RDP into a freshly created instance using the <code class="language-plaintext highlighter-rouge">ad\administrator</code> credentials, chances are the join didn’t work out. It’s <em>technically</em> also possible that the join initially succeeded and connectivity has been lost since, though.</p>
  </li>
  <li>
    <p>After a successful login with the local administrator account (you’ll need the relevant EC2 keypair for this), open PowerShell and run <code class="language-plaintext highlighter-rouge">systeminfo</code>. This command takes a few seconds to gather information, then prints out a list of information about your<sup id="fnref:remotesysteminfo"><a href="#fn:remotesysteminfo" class="footnote" rel="footnote" role="doc-noteref">24</a></sup> machine. The line starting with “Domain:” contains either your directory’s FQDN (like <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code>) – indicating a successful join – or <code class="language-plaintext highlighter-rouge">WORKGROUP</code>, which is the default on computers not joined to a domain.</p>
  </li>
  <li>
    <p>Alternatively, run <code class="language-plaintext highlighter-rouge">%SystemRoot%\system32\control.exe sysdm.cpl</code> to open the “System Properties” dialog box, select the “Computer Name” tab, then look at the name displayed next to “Workgroup:”.</p>
  </li>
</ul>
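
<p>If you’d rather not scan the <code class="language-plaintext highlighter-rouge">systeminfo</code> output by eye, the same check can be sketched in PowerShell via CIM. This is a minimal sketch that assumes the standard <code class="language-plaintext highlighter-rouge">Win32_ComputerSystem</code> class is available (it should be on any recent Windows Server image); on a machine that isn’t joined, <code class="language-plaintext highlighter-rouge">PartOfDomain</code> should come back as <code class="language-plaintext highlighter-rouge">False</code> and <code class="language-plaintext highlighter-rouge">Domain</code> as <code class="language-plaintext highlighter-rouge">WORKGROUP</code>.</p>

<pre><code class="language-powershell"># Ask CIM for the machine's domain membership instead of parsing systeminfo output
$cs = Get-CimInstance -ClassName Win32_ComputerSystem

# Domain holds the directory's FQDN after a successful join, or the workgroup name otherwise
$cs | Select-Object -Property Name, Domain, PartOfDomain
</code></pre>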

<h3 id="where-to-look-for-error-messages">Where to look for error messages</h3>

<p>There’s several places where you can find hints as to what went wrong during the domain join:</p>

<ul>
  <li>
    <p>The first step in debugging the underlying issue is checking the log events<sup id="fnref:ssmcloudwatch"><a href="#fn:ssmcloudwatch" class="footnote" rel="footnote" role="doc-noteref">25</a></sup> created during the join attempt by Amazon’s software. Navigate to the relevant section in <a href="https://learn.microsoft.com/en-us/shows/inside/event-viewer">Event Viewer</a>: “Application and Services Logs” &gt; “EC2ConfigService”.</p>

    <p>There, you should see a list of events in reverse-chronological order, usually culminating in an “Error” event, commonly surrounded by warnings. Take a look at the details of these and of the immediately preceding “Information” events – this tells you at which stage the failure occurred.</p>

    <p><img src="/static/sadwsp-eventviewer.png" alt="" /></p>

    <p class="caption">A successful seamless AD join’s entries in Event Viewer – what you <em>want</em> to see.</p>
  </li>
  <li>
    <p>For further details on which step of the domain join failed and <em>why</em>, it’s often illuminating to read through <code class="language-plaintext highlighter-rouge">C:\Windows\debug\NetSetup.LOG</code> – Windows <a href="https://mattwv.wordpress.com/2013/08/06/debugging-domain-join-issues-with-netsetup-log/">details the entire process of joining the domain</a> in this file. In the documentation, you’ll find a <a href="https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/active-directory-domain-join-troubleshooting-guidance#common-issues-and-solutions">list of common error codes and what to do about them</a>.</p>
  </li>
  <li>
    <p>Attempting a <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/join_windows_instance.html">manual domain join</a> produces detailed error messages that, in my experience, tend to be more immediately helpful than what’s logged to <code class="language-plaintext highlighter-rouge">NetSetup.LOG</code>, so it’s worth a shot if you’re not getting anywhere. (It’s also possible that a manual domain join succeeds when a seamless one didn’t.)</p>
  </li>
</ul>
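
<p>To skim <code class="language-plaintext highlighter-rouge">NetSetup.LOG</code> without opening it in an editor, something like the following can help. It’s a rough sketch: the function name <code class="language-plaintext highlighter-rouge">Get-NetSetupErrorLine</code> and the <code class="language-plaintext highlighter-rouge">$Tail</code> default are hypothetical choices made up for illustration, and the pattern merely looks for hexadecimal status codes (such as <code class="language-plaintext highlighter-rouge">0x54b</code>) near the end of the file rather than understanding the log format.</p>

<pre><code class="language-powershell"># Hypothetical helper: list recent NetSetup.LOG lines that mention a hexadecimal status code
function Get-NetSetupErrorLine {
    param(
        [string]$Path = 'C:\Windows\debug\NetSetup.LOG',
        [int]$Tail = 200 # only inspect the most recent lines
    )

    # Select-String keeps only the lines matching the regular expression
    Get-Content -Path $Path -Tail $Tail | Select-String -Pattern '0x[0-9a-fA-F]+'
}
</code></pre>

<p>Calling <code class="language-plaintext highlighter-rouge">Get-NetSetupErrorLine</code> without arguments reads the default location. Expect some noise, since lines reporting a successful status may match as well.</p>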

<h3 id="scenario-i--ad-connector-cannot-be-created">Scenario I – AD Connector cannot be created</h3>

<p>Amazon provides a <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html#connect_verification">port test application</a> designed to aid in debugging failures when setting up AD Connector:</p>

<blockquote>
  <p>Download and unzip the <code class="language-plaintext highlighter-rouge">DirectoryServicePortTest</code> test application. The source code and Visual Studio project files are included so you can modify [it] if desired. […] This test app determines if the necessary ports are open from the VPC to your domain, and also verifies the minimum forest and domain functional levels.</p>
</blockquote>

<p>Also double-check that the <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html#prereq_connector">AD Connector prerequisites</a> are met in your scenario.</p>

<h3 id="scenario-ii--join-not-initiated-due-to-missing-iam-policy-or-ssm-document">Scenario II – Join not initiated due to missing IAM policy or SSM document</h3>

<p>If you forgot to add the <code class="language-plaintext highlighter-rouge">arn:aws:iam::aws:policy/AmazonSSMDirectoryServiceAccess</code> policy to your instance’s IAM instance profile <em>or</em> haven’t associated the <code class="language-plaintext highlighter-rouge">aws:domainJoin</code> SSM document defined above with your instance, the join may not be attempted to begin with (which results in not a whole lot being logged, leaving you scratching your head).</p>

<h3 id="scenario-iii--dns-resolution-failures">Scenario III – DNS resolution failures</h3>

<p>If the Route 53 outbound endpoint (or some custom DNS redirection setup) routes DNS queries incorrectly or you haven’t set up redirection of AD-related DNS queries to your AD DNS servers to begin with, you’re going to be presented with an error right after the “Evaluate that we can resolve the directory name” step:</p>

<p><img src="/static/sadwsp-eventviewer-dns.png" alt="" /></p>

<p>When attempting a manual join in this scenario, the error message should tell you which subdomain couldn’t be resolved and what DNS server was queried for it:</p>

<p><img src="/static/sadwsp-manualjoin-dns.png" alt="" /></p>

<p>To eliminate the possibility of closed ports (or VPC peering problems, though I haven’t had any of those) causing your DNS issues, you can configure Windows to directly query the Simple AD DNS servers by their IP addresses, after which a manual join <em>should</em> succeed:</p>

<ol>
  <li>Open the start menu and type <code class="language-plaintext highlighter-rouge">ncpa.cpl</code> to open the “Network Connections” section of Control Panel.</li>
  <li>Right-click the “Ethernet 3” network adapter (it might be named differently, but there should only be one) and select “Properties”.</li>
  <li>Double-click the “Internet Protocol Version 4” list entry.</li>
  <li>Select “Use the following DNS server addresses” and change the “Preferred DNS server” and “Alternate DNS server” addresses to the IP addresses of your Simple AD DNS servers.</li>
  <li>Choose “OK”.</li>
</ol>
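
<p>The steps above can also be carried out from an elevated PowerShell prompt. A sketch, assuming the adapter really is called “Ethernet 3” and using placeholder IP addresses that you’d replace with those of your Simple AD DNS servers:</p>

<pre><code class="language-powershell"># Placeholders: substitute your adapter's name and your Simple AD DNS server IPs
$adapter    = 'Ethernet 3'
$dnsServers = '10.0.42.42', '10.0.42.43'

# List the adapters to confirm the name
Get-NetAdapter | Select-Object -Property Name, Status

# Point the adapter at the directory's DNS servers
Set-DnsClientServerAddress -InterfaceAlias $adapter -ServerAddresses $dnsServers

# To undo this later and return to the DHCP-provided DNS servers:
# Set-DnsClientServerAddress -InterfaceAlias $adapter -ResetServerAddresses
</code></pre>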

<p>If you’re confident that DNS forwarding works correctly <em>in principle</em>, make sure that the security group associated with your Simple AD allows inbound TCP and UDP traffic on port 53. Conversely, your EC2 instance’s security group must allow outbound TCP and UDP traffic on port 53.</p>

<p><em>Note:</em> As an alternative to venerable old <code class="language-plaintext highlighter-rouge">nslookup</code>, you can use <a href="https://learn.microsoft.com/en-us/powershell/module/dnsclient/resolve-dnsname?view=windowsserver2022-ps">the <code class="language-plaintext highlighter-rouge">Resolve-DnsName</code> PowerShell cmdlet</a> to test DNS resolution. For example, <code class="language-plaintext highlighter-rouge">Resolve-DnsName ad.ourcooldomain.com</code> tries to resolve that subdomain using the default options, while <code class="language-plaintext highlighter-rouge">Resolve-DnsName ad.ourcooldomain.com -Server 10.0.42.42</code> queries that DNS server accordingly. Supply an argument like <code class="language-plaintext highlighter-rouge">-Type A</code> to specifically query for A records (and analogously for CNAME, NS etc.).</p>
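
<p>During a domain join, Windows locates domain controllers through SRV records below the directory’s FQDN, so it can be worth querying one of those directly instead of just the FQDN itself. A sketch, assuming the example FQDN and DNS server IP used in this post:</p>

<pre><code class="language-powershell"># Example values from this post; replace them with your directory's FQDN and a DNS server IP
$fqdn   = 'ad.ourcooldomain.com'
$server = '10.0.42.42'

# SRV record that clients use to locate the domain controllers of an AD domain
Resolve-DnsName -Name "_ldap._tcp.dc._msdcs.$fqdn" -Type SRV -Server $server
</code></pre>

<p>If this query fails while the plain FQDN resolves, that hints at forwarding rules that don’t cover subdomains.</p>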

<h3 id="scenario-iv--ad-connector-user-doesnt-have-the-correct-permissions">Scenario IV – AD Connector user doesn’t have the correct permissions</h3>

<p>On a test run just prior to publishing this post, I had forgotten to add the <code class="language-plaintext highlighter-rouge">ad\adconnector</code> user to the <code class="language-plaintext highlighter-rouge">Connectors</code> group. This resulted in an error as shown below.</p>

<p><img src="/static/sadwsp-eventviewer-permissions.png" alt="" /></p>

<p>Simply fixing that group membership oversight and recreating the EC2 instance allowed the seamless join to succeed.</p>

<h3 id="scenario-v--join-failure-after-computer-account-creation-dns-issues">Scenario V – Join failure after computer account creation (DNS issues)</h3>

<p>At some point, I had defined an A record on our AD’s FQDN referencing one of the Simple AD DNS servers. This had the effect of letting the seamless join progress a little further than before – a computer account was created in the directory before faulty DNS resolution of subdomains of <code class="language-plaintext highlighter-rouge">ad.ourcooldomain.com</code> brought the process to a standstill <a href="https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/active-directory-domain-join-troubleshooting-guidance#error-code-0x54b">with an error code 0x54b</a>.</p>

<p>So keep in mind that successful creation of a computer account doesn’t preclude the possibility of a DNS misconfiguration.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:knock">
      <p>Knock on wood. <a href="#fnref:knock" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:backupsdocs">
      <p>And plenty of documentation. And, of course, multiple layers of backups – infrastructure’s not all that useful if there’s no data flowing through it. <a href="#fnref:backupsdocs" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:wsfra">
      <p>WorkSpaces itself also <a href="https://docs.aws.amazon.com/workspaces/latest/adminguide/azs-workspaces.html">isn’t universally available</a>, but it is in Frankfurt. <a href="#fnref:wsfra" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:snipe">
      <p>Being <a href="https://xkcd.com/356/">nerd sniped</a> may have played a role in all this. <a href="#fnref:snipe" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:lotr">
      <p>Please appreciate my restraint in not referencing the title of the second <em>Lord of the Rings</em> movie. (Welp.) <a href="#fnref:lotr" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:boilerplate">
      <p>A few notes on using third-party Terraform modules: 1. Always read the code, both for security reasons (even a 1000-star GitHub repo can be compromised) and to learn stuff – perhaps what you’re trying to do is so simple that it’d make more sense to just, uh, <em>raw dog</em> the relevant <code class="language-plaintext highlighter-rouge">resource</code>s; 2. pin the module version for stability and, again, security reasons; and 3. set yourself a regular reminder to check for updates (also of Terraform itself and any providers you use). <a href="#fnref:boilerplate" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:prof">
      <p>Now that’s some professional network topologizin’! <a href="#fnref:prof" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tfkey">
      <p>Instead of by uploading a self-generated public key to AWS <a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/key_pair">using Terraform</a>. <a href="#fnref:tfkey" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:s3">
      <p>You could, say, define and attach a policy that grants read permissions for an S3 bucket, which would then be accessible from that instance (<em>e.g.</em>, <a href="https://cyberduck.io/s3/">with Cyberduck</a> – see the “S3 (Credentials from Instance Metadata) connection profile” configuration) without further authentication. <a href="#fnref:s3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:archive">
      <p><a href="https://web.archive.org/web/20250124120917/https://docs.aws.amazon.com/directoryservice/latest/admin-guide/ad_connector_getting_started.html#connect_delegate_privileges">Mirrored</a> on the <a href="/posts/harlond.html">excellent</a> Internet Archive. <a href="#fnref:archive" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:secdocs">
      <p>It’s <a href="https://docs.aws.amazon.com/directoryservice/latest/admin-guide/simple_ad_best_practices.html">mentioned in the documentation</a>, though, and making changes to it isn’t discouraged all that harshly. <a href="#fnref:secdocs" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:outicmp">
      <p>As long as that instance’s security group allows outbound ICMP. <a href="#fnref:outicmp" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:familiar">
      <p>Not being deeply familiar with how Active Directory <em>works</em> had me scratching my head for a while there! <a href="#fnref:familiar" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tried">
      <p>I’ve tried and, for starters, couldn’t remote into EC2 instances anymore, leading me to quickly revert this change. <a href="#fnref:tried" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:authoritsource">
      <p>Quoting <a href="https://aws.amazon.com/blogs/networking-and-content-delivery/integrating-your-directory-services-dns-resolution-with-amazon-route-53-resolvers/">from an AWS blog post</a>: “[I]n December 20th 2022 we introduced a change in the default behavior of the DNS resolver in AWS Managed AD. Starting this date, all new directories are created with a forwarder for all non-authoritative queries to the <em>+2 IP address</em> of the VPC”, <em>i.e.</em>, the Amazon-provided Route 53 Resolver. <a href="#fnref:authoritsource" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:primer">
      <p><a href="https://sjramblings.io/route53_resolver_magic">Here’s</a> a good primer on the Route 53 Resolver and what’s possible with endpoints. <a href="#fnref:primer" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:whysurpr">
      <p>A cursory search for implementation details regarding Route 53 Endpoints didn’t get me very far as to why this conceptually-simple feature costs so much. But I assume, considering the alternative outlined further down, that AWS sets up a set of “EC2-instance-shaped” DNS servers that implement your forwarding rules, which are likely dimensioned for far larger workloads than what’s required here. <a href="#fnref:whysurpr" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:registered">
      <p>Before you can create individual WorkSpaces instances, a directory must be registered for use with WorkSpaces. <a href="#fnref:registered" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:insanity">
      <p><a href="https://www.quora.com/Why-do-people-say-The-definition-of-insanity-is-doing-the-same-thing-over-and-over-again-expecting-different-results-when-that-isnt-even-close-to-the-actual-definition">“The definition of insanity is doing the same thing over and over again expecting different results”</a>, but if that ends up <em>working</em>, doesn’t that mean our computers are insane? (No, complex systems just interact in inconsistent ways which (along with the fact that <em>we</em> can make such systems from <a href="https://news.ycombinator.com/item?id=28205823">rocks and lightning</a>) is amazing but occasionally annoying. (I’ll stop short of turning this footnote into a <a href="https://www.youtube.com/watch?v=dBQG-Dcm4VQ">John Green video coercing nigh-randomness into beautiful metaphors for the human condition</a>. Which is a type of John Green video I enjoy immensely. But this is supposed to be a technical blog post, dang it.)) <a href="#fnref:insanity" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:pools">
      <p>We won’t need the last one, but without it, you might run into issues later on if you decide to explore <a href="https://aws.amazon.com/about-aws/whats-new/2024/06/amazon-workspaces-pools-amazon-workspaces/">WorkSpaces Pools</a>. <a href="#fnref:pools" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:gp">
      <p>“Managed” since even (as opposed to Microsoft Managed AD) with Simple AD, group policies function – at least in a <a href="https://www.reddit.com/r/aws/comments/tghhk2/comment/i15divb/">limited way</a>. I haven’t explored what those limitations <em>are</em> in depth, but rolling out environment variables works, for one. <a href="#fnref:gp" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:who">
      <p>In retrospect, I, for one, am not convinced, but having already written those words, I might as well hit publish (actually <code class="language-plaintext highlighter-rouge">git commit &amp;&amp; git push</code> – if you’re still receptive to input after all this waffling, you can read about how I deploy this blog <a href="/posts/deploy.html">here</a>). <a href="#fnref:who" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:adcomplex">
      <p>AD joins are somewhat complex (especially if the directory servers, as is the case here, are located in a different part of the world than the rest of your infrastructure), so I’m sure there’s many kinds of failure scenarios I was lucky enough to avoid here. (And, conversely, many I encountered that you’ll be skilled enough to know how to circumvent in your sleep.) <a href="#fnref:adcomplex" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:remotesysteminfo">
      <p><code class="language-plaintext highlighter-rouge">systeminfo</code> can also <a href="https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/systeminfo#examples">retrieve information about other computers joined to your domain</a>. <a href="#fnref:remotesysteminfo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:ssmcloudwatch">
      <p>Good to know: You can <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/monitoring-ssm-agent.html">configure SSM Agent to log to CloudWatch</a> – very useful for larger deployments. <a href="#fnref:ssmcloudwatch" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[Long title, but whatcha gonna do.]]></summary></entry><entry><title type="html">One Million Hits Per Month</title><link href="https://excessivelyadequate.com/posts/1m.html" rel="alternate" type="text/html" title="One Million Hits Per Month" /><published>2024-09-03T11:40:00+02:00</published><updated>2024-09-03T11:40:00+02:00</updated><id>https://excessivelyadequate.com/posts/1m</id><content type="html" xml:base="https://excessivelyadequate.com/posts/1m.html"><![CDATA[<p>…is surprisingly (to me, anyway) relatively few – about one per 2.5 seconds. Something<sup id="fnref:spikey"><a href="#fn:spikey" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> to keep in mind when speccing out systems.</p>

<p><em>Relatedly:</em> There’s a lot of hours in a year, especially for systems running 24/7. A cloud server<sup id="fnref:m4large"><a href="#fn:m4large" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> that costs $0.12/hour seems pretty inexpensive, yet it’ll rack up a $1000 bill at the end of each year.</p>

<p>Doing the math isn’t hard, of course, but your<sup id="fnref:weird"><a href="#fn:weird" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> brain might not always do it if left unprompted.</p>
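
<p>For what it’s worth, here’s that math spelled out in PowerShell. It’s a back-of-the-envelope sketch that assumes a 30-day month, a 365-day year and perfectly uniform traffic; the variable names are just for illustration.</p>

<pre><code class="language-powershell"># One million hits spread evenly over a 30-day month
$secondsPerMonth = 30 * 24 * 60 * 60        # 2,592,000 seconds
$secondsPerHit   = $secondsPerMonth / 1e6   # roughly 2.6 seconds between hits

# A server billed at 0.12 dollars per hour, running around the clock for a year
$hoursPerYear = 24 * 365                    # 8,760 hours
$costPerYear  = 0.12 * $hoursPerYear        # a little over 1,050 dollars
</code></pre>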

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:spikey">
      <p>Of course, access patterns won’t ever be distributed uniformly and different workloads are spikey to varying degrees. <a href="#fnref:spikey" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:m4large">
      <p>An example from the AWS universe: <code class="language-plaintext highlighter-rouge">m4.large</code> with 200 GB of EBS storage. (Disregarding savings plans or other cost-cutting measures for always-on instances.) <a href="#fnref:m4large" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:weird">
      <p>Again, <em>maybe I’m wired weird</em> and you’re able to translate from “<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> per second” to “<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi></mrow><annotation encoding="application/x-tex">y</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">y</span></span></span></span> per hour” to “<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi></mrow><annotation encoding="application/x-tex">z</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.04398em;">z</span></span></span></span> per year” easily. But, unless you need to do that a lot, I’d guess it’s not instinctive. <a href="#fnref:weird" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[…is surprisingly (to me, anyway) relatively few – about one per 2.5 seconds. Something1 to keep in mind when speccing out systems. Of course, access patterns won’t ever be distributed uniformly and different workloads are spikey to varying degrees. &#8617;]]></summary></entry><entry><title type="html">Zeroing a Column of a CSV File With PowerShell</title><link href="https://excessivelyadequate.com/posts/zeroing.html" rel="alternate" type="text/html" title="Zeroing a Column of a CSV File With PowerShell" /><published>2024-07-26T20:30:00+02:00</published><updated>2024-07-26T20:30:00+02:00</updated><id>https://excessivelyadequate.com/posts/zeroing</id><content type="html" xml:base="https://excessivelyadequate.com/posts/zeroing.html"><![CDATA[<p>Earlier today at <a href="https://www.suedweststrom.de">work</a>, a coworker called in with a little data munging task for me to solve: a column of a CSV file supplied by a third party on a regular basis needed to be redacted (for technical rather than sneaky reasons) before the file was passed along to another third party.</p>

<p>For more complex operations on tabular data (transposing rows/columns or, say, inflicting math upon them), we usually rely on Python scripts (…anything’s better than Excel macros). But this “problem” seemed simple enough to make the overhead of setting up a virtual environment plus the usual CSV input/output boilerplate seem like <a href="https://en.wiktionary.org/wiki/mit_Kanonen_auf_Spatzen_schießen">shooting at sparrows with cannons</a> – so I ignored my reflexive aversion<sup id="fnref:aversion"><a href="#fn:aversion" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> to Microsoft technology and looked into how to get this done in PowerShell, which is conveniently built into our task automation software<sup id="fnref:wontlink"><a href="#fn:wontlink" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> of choice.</p>

<h2 id="importing-a-csv-file-into-a-powershell-session">Importing a CSV file into a PowerShell session</h2>

<p>The CSV files in question come in a format that has its roots in <a href="https://portal.m7.energy/plpx/documentation/">ComTrader</a>, a popular frontend to the <a href="https://en.wikipedia.org/wiki/European_Energy_Exchange">European Energy Exchange</a>, making it a quasi-standard for exchanging power trading data within parts of Germany’s energy industry.</p>

<pre><code class="language-csv">Area;Type;B/S;Accnt;Product;Ctrct;Qty;Prc;BG;Txt;PQty;ValRes;ValDate;ExeRes
TNG;REG;S;P;XBID_Quarter_Hour_Power;23Q1;0.5;-500;Standard;Comment;;GTD;04.07.2024 21:50:00;NON
AMP;REG;S;P;XBID_Quarter_Hour_Power;23Q1;1.5;-500;Standard;Comment;;GTD;04.07.2024 21:50:00;NON
</code></pre>

<p>Reading a semicolon-separated CSV file into a PowerShell variable can be accomplished using the <code class="language-plaintext highlighter-rouge">Import-Csv</code> cmdlet:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Import-Csv</span><span class="w"> </span><span class="nt">-Delimiter</span><span class="w"> </span><span class="s1">';'</span><span class="w"> </span><span class="s1">'C:\path\to\file.csv'</span><span class="w">
</span></code></pre></div></div>

<p>This’ll yield, as described in <a href="https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/import-csv?view=powershell-7.4">the documentation</a>, a “table-like custom object from the items in the CSV file. Each column in the CSV file becomes a property of the custom object and the items in rows become the property values.”</p>

<h2 id="zeroing-a-column">Zeroing a column</h2>

<p>The cool thing about these table-like objects appears to be that they’re query-able by, among<sup id="fnref:group"><a href="#fn:group" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> others, the rather powerful <code class="language-plaintext highlighter-rouge">Select-Object</code> cmdlet. You could, for example, extract the columns<sup id="fnref:properties"><a href="#fn:properties" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> <code class="language-plaintext highlighter-rouge">Qty</code> (<em>power quantity</em> if you’re curious) and <code class="language-plaintext highlighter-rouge">Prc</code> (<em>price</em>) like this:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Select-Object</span><span class="w"> </span><span class="s1">'Qty'</span><span class="p">,</span><span class="s1">'Prc'</span><span class="w">
</span></code></pre></div></div>

<p>That’d yield a two-column table, ready for re-serialization or further processing. Relatedly, here’s how to extract <em>all columns except</em> <code class="language-plaintext highlighter-rouge">Prc</code>:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Select-Object</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nt">-ExcludeProperty</span><span class="w"> </span><span class="s1">'Prc'</span><span class="w">
</span></code></pre></div></div>

<p>You can also use <code class="language-plaintext highlighter-rouge">Select-Object</code>, albeit in a slightly-more-syntactically-convoluted manner, to generate <em>new</em> columns <a href="https://community.spiceworks.com/t/replace-blank-values-with-null-in-csv-for-specific-column-all-columns/954877">based on values of existing columns</a>. If you wanted to, say, append a column <code class="language-plaintext highlighter-rouge">HalfPrc</code> containing the values of the <code class="language-plaintext highlighter-rouge">Prc</code> column divided by 2, you could run this command:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Select-Object</span><span class="w"> </span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="p">@{</span><span class="nx">Name</span><span class="o">=</span><span class="s1">'HalfPrc'</span><span class="p">;</span><span class="w"> </span><span class="nx">Expression</span><span class="o">=</span><span class="p">{</span><span class="o">.</span><span class="nf">5</span><span class="o">*</span><span class="p">[</span><span class="n">float</span><span class="p">]</span><span class="bp">$_</span><span class="o">.</span><span class="nf">Prc</span><span class="p">}}</span><span class="w">
</span></code></pre></div></div>

<p>Similarly, to empty the <code class="language-plaintext highlighter-rouge">Prc</code> column (which is what my colleague was after), I arrived at the following command which, as above, selects everything except the preexisting <code class="language-plaintext highlighter-rouge">Prc</code> column, then adds a new <code class="language-plaintext highlighter-rouge">Prc</code> column containing only empty strings, storing the result in a variable <code class="language-plaintext highlighter-rouge">$csv_fixed</code>:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv_fixed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">$csv</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Select-Object</span><span class="w"> </span><span class="nt">-ExcludeProperty</span><span class="w"> </span><span class="s1">'Prc'</span><span class="w"> </span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="p">@{</span><span class="nx">Name</span><span class="o">=</span><span class="s1">'Prc'</span><span class="p">;</span><span class="w"> </span><span class="nx">Expression</span><span class="o">=</span><span class="p">{</span><span class="s1">''</span><span class="p">}}</span><span class="w">
</span></code></pre></div></div>

<p>Note that new columns created this way are <em>appended</em> to the table – so the new <code class="language-plaintext highlighter-rouge">Prc</code> column ends up, visually speaking, at the right end of the table, not in the same location as the previous <code class="language-plaintext highlighter-rouge">Prc</code> column.</p>

<h2 id="writing-the-result-out">Writing the result out</h2>

<p>Knowing that there’s an <code class="language-plaintext highlighter-rouge">Import-Csv</code> cmdlet, you won’t have trouble guessing how to export <code class="language-plaintext highlighter-rouge">$csv_fixed</code> back into a CSV file:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$csv_fixed</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Export-Csv</span><span class="w"> </span><span class="nt">-Delimiter</span><span class="w"> </span><span class="s1">';'</span><span class="w"> </span><span class="nt">-NoTypeInformation</span><span class="w"> </span><span class="s1">'C:\path\to\output_file.csv'</span><span class="w">
</span></code></pre></div></div>

<p>By default, Windows PowerShell (up to version 5.1) adds a <em>type information header</em> like <code class="language-plaintext highlighter-rouge">#TYPE Selected.System.Management.Automation.PSCustomObject</code> to the generated CSV file, which can (and probably should) be suppressed via the <code class="language-plaintext highlighter-rouge">-NoTypeInformation</code> switch. PowerShell 6 and later omit this header by default, so there the switch is harmless but redundant.</p>

<p>The resulting CSV file, then, looks like this:</p>

<pre><code class="language-csv">"Area";"Type";"B/S";"Accnt";"Product";"Ctrct";"Qty";"BG";"Txt";"PQty";"ValRes";"ValDate";"ExeRes";"Prc"
"TNG";"REG";"S";"P";"XBID_Quarter_Hour_Power";"23Q1";"0.5";"Standard";"Comment";"";"GTD";"04.07.2024 21:50:00";"NON";""
"AMP";"REG";"S";"P";"XBID_Quarter_Hour_Power";"23Q1";"1.5";"Standard";"Comment";"";"GTD";"04.07.2024 21:50:00";"NON";""
</code></pre>

<p>Notice that PowerShell wraps each value in quotes, which, while unnecessary in this case, is good practice and won’t confuse <a href="https://datatracker.ietf.org/doc/html/rfc4180">standards</a>-compliant CSV consumers. (If you’re running PowerShell 7 or later, you <a href="https://stackoverflow.com/questions/60678901/how-to-remove-all-quotations-mark-in-the-csv-file-using-powershell-script/60680265#60680265">can</a> add <code class="language-plaintext highlighter-rouge">-UseQuotes AsNeeded</code>, which is delightfully self-explanatory, to the <code class="language-plaintext highlighter-rouge">Export-Csv</code> call.)</p>
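<p>For comparison, the Python route mentioned at the top would look roughly like the following. This is a minimal sketch using only the standard library’s <code class="language-plaintext highlighter-rouge">csv</code> module; the function name <code class="language-plaintext highlighter-rouge">blank_column</code> and the in-memory sample are made up for illustration, and a real script would open actual files (with <code class="language-plaintext highlighter-rouge">newline=""</code>) instead.</p>

```python
import csv
import io

def blank_column(src, dst, column, delimiter=";"):
    """Copy CSV rows from src to dst, emptying every value in `column`."""
    reader = csv.DictReader(src, delimiter=delimiter)
    # Reusing the reader's field names keeps the original column order.
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = ""
        writer.writerow(row)

# One row of the sample data shown earlier, held in memory for the demo.
sample = (
    "Area;Type;B/S;Accnt;Product;Ctrct;Qty;Prc;BG;Txt;PQty;ValRes;ValDate;ExeRes\n"
    "TNG;REG;S;P;XBID_Quarter_Hour_Power;23Q1;0.5;-500;Standard;Comment;;GTD;04.07.2024 21:50:00;NON\n"
)
out = io.StringIO()
blank_column(io.StringIO(sample), out, "Prc")
result = out.getvalue()
print(result)
```

<p>One difference worth noting: because the writer reuses the reader’s field names, the emptied <code class="language-plaintext highlighter-rouge">Prc</code> column stays where it was instead of moving to the end.</p>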
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:aversion">
      <p>Although I just learned that PowerShell is also available on macOS and Linux – and there’s <a href="https://community.jumpcloud.com/t5/radical-admin-blog/powershell-for-the-mac-admin-part-5-pivot/ba-p/2755">some nifty stuff</a> you can do with it. Hmm! <a href="#fnref:aversion" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:wontlink">
      <p>Whose name I shan’t utter for opsec reasons. <a href="#fnref:wontlink" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:group">
      <p>Another cmdlet worth <a href="https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/group-object?view=powershell-7.4">looking at</a> is <code class="language-plaintext highlighter-rouge">Group-Object</code>. But wait, <code class="language-plaintext highlighter-rouge">Select-...</code>? <code class="language-plaintext highlighter-rouge">Group-...</code>? That almost sounds <a href="/posts/matrix.html">like SQL</a>! <a href="#fnref:group" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:properties">
      <p>What I call “columns” are really <em>properties</em> in PowerShell, but I’ll be sticking to CSV terminology. <a href="#fnref:properties" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[Earlier today at work, a coworker called in with a little data munging task for me to solve: a column of a CSV file supplied by a third party on a regular basis needed to be redacted (for technical rather than sneaky reasons) before the file was passed along to another third party.]]></summary></entry><entry><title type="html">Determining Which Country a Location Is in Without an API (…But With MariaDB)</title><link href="https://excessivelyadequate.com/posts/country.html" rel="alternate" type="text/html" title="Determining Which Country a Location Is in Without an API (…But With MariaDB)" /><published>2024-07-19T11:25:00+02:00</published><updated>2024-07-19T11:25:00+02:00</updated><id>https://excessivelyadequate.com/posts/country</id><content type="html" xml:base="https://excessivelyadequate.com/posts/country.html"><![CDATA[<p>A few days ago, I wrote about <a href="/posts/mark.html">implementing search term highlighting</a> as part of a tool my partner and I have long been using to track our shared purchases. Another set of improvements deals with location data – towards the tail end of our <a href="https://mastodon.social/@doersino/112780830112857382">recent vacation to South Korea</a>, on a whim, I added a feature where logging a purchase on a device with geolocation support also captures the current location<sup id="fnref:refine"><a href="#fn:refine" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> (as a <a href="https://en.wikipedia.org/wiki/Geographic_coordinate_system">latitude-longitude pair</a>), storing it along with the rest of the purchase data.</p>

<p>But what to do with that data? Displaying it on a map<sup id="fnref:leaflet"><a href="#fn:leaflet" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> is kind of neat, of course.</p>

<p class="wide"><img src="/static/country.jpg" alt="" /></p>

<p class="caption">Places we ate and shopped at in central Seoul. Monami Curry, whose half-half curry is <a href="https://www.instagram.com/explore/locations/1026862483/">instagrammable to the max</a>, seemingly hasn’t survived the pandemic (in that location, anyway – there’s another branch in Suwon, a bit south of Seoul).</p>

<p>But what else might location data be useful for?</p>

<p>Well, while implementing the search feature I wrote about last time, I got the idea of <em>filtering purchases by country</em> – providing a dropdown menu of country names as part of the search form, selecting one of which constrains search results to purchases 1. with a location that’s 2. located in that country.</p>

<p>The challenge, then, is <strong>determining which country a latitude-longitude coordinate pair falls into</strong>. There’s roughly two ways to solve this:</p>

<ol>
  <li>
    <p>Calling out to any of the dozens (hundreds?) of <a href="https://en.wikipedia.org/wiki/Reverse_geocoding">reverse geocoding</a> APIs available. That’s easy to do and, given that we’re unlikely to log more than a couple dozen purchases a month, ought to be well within the free tier of most API providers. Slight privacy concerns, but eh. And if the API were to be <a href="https://www.tumblr.com/ourincrediblejourney">sunset</a> in a few years, just switch to another one. No-brainer, really.</p>
  </li>
  <li>
    <p>Saying “I don’t need no stinkin’ API”, procuring a GeoJSON file containing country outlines, massaging it into a format and detail level appropriate for the task, importing it into a data structure, building a way to query that data for polygon-point intersections, then bulk-processing existing purchases to add country names.</p>
  </li>
</ol>

<p>No points for guessing which option I picked!</p>

<h2 id="disclaimer-countries-are-weird-but-it-doesnt-really-matter">Disclaimer: Countries are weird (but it doesn’t really matter)</h2>

<p>There’s 200<a href="https://www.youtube.com/watch?v=3nB688xBYdY">ish</a> countries. Countries can be divided into multiple parts, <a href="https://en.wikipedia.org/wiki/French_Polynesia">sometimes separated by roughly half the planet</a>.
A few countries have <a href="https://99percentinvisible.org/article/northwest-angle-inside-nesting-geography-exclaves-enclaves/">(potentially nested) exclaves</a> in other countries. There’s places that are sort of <a href="https://www.youtube.com/watch?v=KwHj4lj3F-k">shared between multiple countries</a>. Other places are <a href="https://en.wikipedia.org/wiki/Bir_Tawil">claimed by no one</a>. Many borders are disputed, so which country a location belongs to depends on who you ask. Similarly, some countries wholly <a href="https://en.wikipedia.org/wiki/Taiwan%2C_China">don’t exist</a> according to other countries. And coastlines <a href="https://en.wikipedia.org/wiki/Coastline_paradox">tend to be fractals</a>.</p>

<p>…but my partner and I are unlikely to make purchases in most of those areas, so correct treatment of edge cases like these wasn’t<sup id="fnref:geopolitical"><a href="#fn:geopolitical" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> a priority when selecting a dataset.</p>

<h2 id="geojson-and-lists-of-lists-of-lists-of-lists">GeoJSON and lists of lists of lists (of lists)</h2>

<p>There’s <a href="https://en.wikipedia.org/wiki/GIS_file_format#Example_vector_file_formats">a bunch of formats</a> that geographical data like country outlines commonly come in – for example, <a href="https://github.com/doersino/aerialbot">ærialbot</a>, a Mastodon bot I wrote a few years back, utilizes a <a href="https://en.wikipedia.org/wiki/Shapefile">Shapefile</a> to generate random locations in the non-ocean parts of the world. These days, <a href="https://geojson.org">GeoJSON</a> seems to be more popular (and thus more widely supported), and it’s easier<sup id="fnref:bin"><a href="#fn:bin" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> to inspect and edit in its raw form.</p>

<p>That’s because a GeoJSON file is basically just a list of polygons (or <em>multipolygons</em> – handy for encoding non-contiguous shapes, <em>e.g.</em>, Greece), each of which can be annotated with data like, say, a country name. Other geometry types like points are also supported, but they’re not relevant for storing country borders.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
    <span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">FeatureCollection</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">features</span><span class="dl">"</span><span class="p">:</span> <span class="p">[</span>                       <span class="c1">// a list...</span>
        <span class="p">{</span>
            <span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Feature</span><span class="dl">"</span><span class="p">,</span>          <span class="c1">// ...of features...</span>
            <span class="dl">"</span><span class="s2">geometry</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>               <span class="c1">// ...each of which has a geometry...</span>
                <span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">MultiPolygon</span><span class="dl">"</span><span class="p">,</span> <span class="c1">// ...here, of type multipolygon...</span>
                <span class="dl">"</span><span class="s2">coordinates</span><span class="dl">"</span><span class="p">:</span> <span class="p">[</span>        <span class="c1">// ...which is just a list of polygons...</span>
                    <span class="p">[</span>                   <span class="c1">// ...where each polygon is a list starting with a main shape...</span>
                        <span class="p">[</span>               <span class="c1">// ...specified as a list of longitude-latitude coordinate pairs...</span>
                            <span class="p">[</span><span class="o">-</span><span class="mf">17.2448353</span><span class="p">,</span> <span class="mf">21.3521298</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mf">17.5584441</span><span class="p">,</span> <span class="mf">21.2683253</span><span class="p">],</span> <span class="p">...</span>
                        <span class="p">],</span>
                        <span class="p">[</span>               <span class="c1">// ...followed by zero or more holes (which are also polygons)...</span>
                            <span class="p">...</span>
                        <span class="p">],</span>
                        <span class="p">...</span>             <span class="c1">// ...more holes go here...</span>
                    <span class="p">],</span>
                    <span class="p">[</span>                   <span class="c1">// ...another polygon...</span>
                        <span class="p">[</span>               <span class="c1">// ...with a main shape (no holes this time)...</span>
                            <span class="p">[...,</span> <span class="p">...],</span> <span class="p">...</span>
                        <span class="p">]</span>
                    <span class="p">],</span>
                    <span class="p">...</span>                 <span class="c1">// ...even more polygons...</span>
                <span class="p">]</span>
            <span class="p">},</span>
            <span class="dl">"</span><span class="s2">properties</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>             <span class="c1">// ...and some data</span>
                <span class="dl">"</span><span class="s2">osm_id</span><span class="dl">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">5441968</span><span class="p">,</span>
                <span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">República Árabe Saharaui Democrática الجمهورية العربية الصحراوية الديمقراطية</span><span class="dl">"</span><span class="p">,</span>
                <span class="dl">"</span><span class="s2">name_en</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Sahrawi Arab Democratic Republic</span><span class="dl">"</span><span class="p">,</span>
                <span class="p">...</span>
            <span class="p">}</span>
        <span class="p">},</span>
        <span class="p">...</span>                             <span class="c1">// more features, each like the one above!</span>
    <span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If you’re confused about the difference between polygons, multipolygons, and holes (…I was!), <a href="https://gis.stackexchange.com/questions/225368/understanding-difference-between-polygon-and-multipolygon-for-shapefiles-in-qgis">here’s a Stack Exchange post</a> explaining and illustrating these concepts really well.</p>
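<p>To make the nesting a bit more tangible, here’s a small Python sketch that walks a geometry and yields each polygon as a main shape plus its holes. The helper name <code class="language-plaintext highlighter-rouge">iter_polygons</code> and the toy feature “Toyland” are invented for this example; only the <code class="language-plaintext highlighter-rouge">Polygon</code>/<code class="language-plaintext highlighter-rouge">MultiPolygon</code> layout is taken from the structure shown above.</p>

```python
def iter_polygons(geometry):
    """Yield (outer_ring, holes) for each polygon in a GeoJSON geometry."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]   # wrap so both types look alike
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError("unsupported geometry type: " + geometry["type"])
    for rings in polygons:
        yield rings[0], rings[1:]              # main shape first, then holes

# Made-up toy feature: a square with a square hole, plus a separate triangle.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
            [
                [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
                [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],
            ],
            [
                [[20, 0], [25, 0], [20, 5], [20, 0]],
            ],
        ],
    },
    "properties": {"name_en": "Toyland"},
}

# (number of points in the main shape, number of holes) per polygon
summary = [(len(outer), len(holes))
           for outer, holes in iter_polygons(feature["geometry"])]
print(feature["properties"]["name_en"], summary)  # Toyland [(5, 1), (4, 0)]
```

<p>Note that each ring repeats its first point at the end, which is why the square counts five points.</p>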

<h2 id="foraging-for-or-depending-on-your-disposition-hunting-down-country-outline-data">Foraging for (or, depending on your disposition, hunting down) country outline data</h2>

<p>Using the search engine of your choice, you can find a bunch of sites providing freely-downloadable GeoJSON files of the world’s borders at varying levels of detail – <a href="https://github.com/datasets/geo-countries/blob/master/data/countries.geojson">here’s one</a> I initially considered. (You don’t want too much detail because that makes for larger files and slower computation, and you don’t want too little either because that’ll lead to inaccuracies.)</p>

<p>Most datasets I found delimit countries by their coastlines, which makes a lot of sense for your typical world map! But that’s not ideal for a reverse geocoding use case since 1. when you’re near the coast, GPS inaccuracies can place your location just barely in the sea (and thus beyond the country outline), 2. depending on the resolution of a given GeoJSON file, small islands and peninsulas might not be included, 3. land reclamation is a thing, so coastlines change relatively rapidly in certain areas, and 4. accurately tracing the coastlines significantly increases the volume of data for some countries (<em>e.g.</em>, again, Greece) when simple shapes around small islands would suffice for this use case.</p>

<p class="double"><img src="/static/country-coastlines.jpg" alt="" /><img src="/static/country-territorialwaters.jpg" alt="" /></p>

<p class="caption">The difference between a coastline-delimited dataset (fairly low-res, mind you) and one based on territorial waters. (Rendered using <a href="https://geojson.io">geojson.io</a>, background map courtesy of Mapbox/OpenStreetMap.)</p>

<p>So I looked for a dataset that includes each country’s territorial waters. It’s possible to generate such a dataset using <a href="https://www.openstreetmap.org">OpenStreetMap</a>’s Overpass API (via the <a href="https://overpass-turbo.eu">Overpass Turbo</a> web frontend), but following a link from <a href="https://gis.stackexchange.com/questions/379757/how-to-generate-smallest-possible-country-boundaries-file">a Stack Exchange post</a> that explains how to do that, I instead came across <a href="https://osm-boundaries.com">osm-boundaries.com</a>, a service offering ready-made country-plus-territorial-waters outlines for download. Selecting all countries yielded a 125 MB GeoJSON file, which seemed<sup id="fnref:intuit"><a href="#fn:intuit" class="footnote" rel="footnote" role="doc-noteref">5</a></sup> a bit larger and more detailed than what I actually needed.</p>

<p>Luckily, there’s <a href="https://mapshaper.org">Mapshaper</a>, an excellent web-based tool for editing geospatial data (and converting it between various formats). Among other features, it comes with a simplification feature that allowed me to remove excessive detail while keeping land borders within about 10 meters of their actual locations, with the resulting file<sup id="fnref:dl"><a href="#fn:dl" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> weighing in at a more-manageable 23 MB.</p>

<h2 id="checking-if-a-polygon-contains-a-point-and-why-i-didnt-need-to-implement-that">Checking if a polygon contains a point (and why I didn’t need to implement that)</h2>

<p>Given the GeoJSON file I prepped above and a location from my tool’s database, how can I determine which country contains that location? Easy: For all polygons in the GeoJSON file, do a point-in-polygon check (while essentially inverting that check for holes) and return the (hopefully single) polygon that matches.</p>

<p>…so how to do a point-in-polygon check, then?</p>

<p>Since that’s a common task in computer graphics (among other fields), there’s a <a href="https://en.wikipedia.org/wiki/Point_in_polygon">whole bunch of algorithms</a> tackling it.</p>

<ul>
  <li><a href="https://en.wikipedia.org/wiki/Point_in_polygon#Ray_casting_algorithm">Ray casting algorithm</a>: Casting a ray from outside the polygon towards the point and <a href="https://observablehq.com/@tmcw/understanding-point-in-polygon">counting how many times</a> it intersects the edge of the polygon – if odd, the point’s inside the polygon.</li>
  <li><a href="https://wrfranklin.org/Research/Short_Notes/pnpoly.html">PNPoly</a>: As far as I can tell, this is just a battle-tested implementation of the ray casting algorithm.</li>
  <li><a href="https://en.wikipedia.org/wiki/Point_in_polygon#Winding_number_algorithm">Winding number algorithm</a>: Computing the point’s winding number with respect to the polygon – <em>i.e.</em>, by how many degrees the edge of the polygon, considered segment by segment, travels around the point. If non-zero, the polygon contains the point.</li>
</ul>
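<p>To give an idea of how little code the first of these needs, here’s a minimal Python sketch of even-odd ray casting, including the “invert the check for holes” part. It treats coordinates as plain planar x/y values (x being longitude, y latitude) and ignores points exactly on an edge, the antimeridian, and the poles. The function names and the test shape are made up for illustration.</p>

```python
def point_in_ring(x, y, ring):
    """Even-odd ray casting: count crossings of a ray going right from (x, y)."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge between points j and i straddle the horizontal
        # line through the point?
        if (yi > y) != (yj > y):
            # x coordinate where that edge crosses the horizontal line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside   # one more crossing to the right
        j = i
    return inside

def point_in_polygon(x, y, rings):
    """rings[0] is the main shape, any further rings are holes."""
    if not point_in_ring(x, y, rings[0]):
        return False
    return not any(point_in_ring(x, y, hole) for hole in rings[1:])

# Made-up shape: a 10x10 square with a 2x2 hole in the middle.
square_with_hole = [
    [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)],
]
print(point_in_polygon(2, 2, square_with_hole))   # True
print(point_in_polygon(5, 5, square_with_hole))   # False (inside the hole)
print(point_in_polygon(12, 5, square_with_hole))  # False (outside entirely)
```

<p>For a multipolygon, the same check would simply be repeated for each of its polygons until one matches.</p>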

<p>When checking multiple polygons (a list of countries, say), I assume (but haven’t read up on it) that you could do some preprocessing before dropping into one of the algorithms listed above, <em>e.g.</em>, using some kind of spatial tree structure to narrow down the number of polygons to test, then first testing whether the point falls into a given polygon’s <a href="https://en.wikipedia.org/wiki/Minimum_bounding_box">axis-aligned bounding box</a>, which is computationally inexpensive.</p>
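<p>The bounding box part of that idea might be sketched as follows in Python. This is an assumption-laden illustration rather than something I’ve benchmarked; the outlines, their names, and the helper functions are all hypothetical, and the spatial tree is left out entirely.</p>

```python
def bounding_box(ring):
    """Axis-aligned bounding box of a ring as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)

def in_bounding_box(x, y, box):
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

# Hypothetical outlines keyed by name; the boxes are computed once up front.
outlines = {
    "Squareland": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    "Triangulia": [(20, 0), (25, 0), (20, 5), (20, 0)],
}
boxes = {name: bounding_box(ring) for name, ring in outlines.items()}

def candidates(x, y):
    """Names whose box contains the point; only these need the exact test."""
    return [name for name, box in boxes.items() if in_bounding_box(x, y, box)]

print(candidates(3, 3))    # ['Squareland']
print(candidates(24, 4))   # ['Triangulia']
print(candidates(15, 5))   # []
```

<p>The second call shows why this is only a pre-filter: the point (24, 4) lies inside Triangulia’s bounding box but outside the triangle itself, so an exact point-in-polygon test still has to run on whatever candidates remain.</p>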

<p>Since my tool is built with PHP (and, as custom dictates, MariaDB<sup id="fnref:mysql"><a href="#fn:mysql" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> – this’ll be relevant in the next paragraph), I started looking for PHP libraries providing point-in-polygon algorithms. <a href="https://phpgeo.marcusjaschen.de/Calculations/Geofence.html">There’s</a> <a href="https://gist.github.com/paulofreitas/a6f742b63decf5874c53074865eb6dbf">a</a> <a href="https://assemblysys.com/php-point-in-polygon-algorithm/">few</a> implementations, but the ones I found don’t natively support GeoJSON input, requiring at least a modicum of data munging.</p>

<h2 id="who-needs-a-geospatial-library-if-youve-got-a-database">Who needs a geospatial library if you’ve got a database?</h2>

<p>Remembering that PostgreSQL – my usual database of choice – has excellent geospatial capabilities thanks to <a href="https://postgis.net">PostGIS</a>, I looked into whether there’s a similar extension for MariaDB. There isn’t – because <a href="https://mariadb.com/kb/en/geographic-geometric-features/">all kinds of geospatial functions</a> are just built in, including one for point-in-polygon checking! And what’s more, MariaDB provides a <a href="https://dev.mysql.com/doc/refman/8.4/en/spatial-geojson-functions.html">function that converts GeoJSON data</a><sup id="fnref:mysqldocs"><a href="#fn:mysqldocs" class="footnote" rel="footnote" role="doc-noteref">8</a></sup> into its internal representation. How handy is that?!</p>

<p>So I wrote a quick little PHP command-line script that…</p>

<ol>
  <li>…creates a database table <code class="language-plaintext highlighter-rouge">countries</code> with two columns <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">outline</code>, the latter of which will hold the corresponding (multi)polygon encoding the country-plus-territorial-waters border.</li>
  <li>…imports a GeoJSON file formatted as shown above into that table (due to limited familiarity with GeoJSON files, I assume other files might not work – luckily, as discussed, you can inspect them easily and make the necessary adjustments to the code below).</li>
  <li>…goes through my preexisting <code class="language-plaintext highlighter-rouge">purchases</code> table, annotating any rows that have location data with the matching country name – here, you’ll see how to query the <code class="language-plaintext highlighter-rouge">countries</code> table.</li>
</ol>

<h2 id="talk-is-cheap-show-me-the-code">“Talk is cheap. Show me the code.”</h2>

<p>Okay, okay, Linus Torvalds, here it is, step by step, with quite-possibly-redundant explanations below each code block. First, a few lines of setup.</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="no">GEOJSON_FILE</span> <span class="o">=</span> <span class="s2">"OSMB-ab12e27d758279c6311e8bd945a358d9a594bc44.json"</span><span class="p">;</span>

<span class="kc">PHP_SAPI</span> <span class="o">===</span> <span class="s1">'cli'</span> <span class="k">or</span> <span class="k">die</span><span class="p">(</span><span class="s1">'run via cli only'</span><span class="p">);</span>  <span class="c1">// allow execution via cli only</span>

<span class="k">require_once</span> <span class="s2">"db.class.php"</span><span class="p">;</span>  <span class="c1">// import meekrodb</span>
<span class="no">DB</span><span class="o">::</span><span class="nv">$user</span> <span class="o">=</span> <span class="s2">"..."</span><span class="p">;</span>
<span class="no">DB</span><span class="o">::</span><span class="nv">$password</span> <span class="o">=</span> <span class="s2">"..."</span><span class="p">;</span>
<span class="no">DB</span><span class="o">::</span><span class="nv">$dbName</span> <span class="o">=</span> <span class="s2">"..."</span><span class="p">;</span>
</code></pre></div></div>

<p>A little less than ten years ago, when I built the initial version of our purchase tracker, it was common practice to access MariaDB databases using a library like <a href="https://meekro.com">MeekroDB</a> instead of directly utilizing the functions built into PHP – and since my PHP knowledge has atrophied in the intervening years (and MeekroDB was already in my project directory, anyway), I’m just doing the same here.</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">DB</span><span class="o">::</span><span class="nf">query</span><span class="p">(</span><span class="s2">"DROP TABLE IF EXISTS `countries`;"</span><span class="p">);</span>
<span class="no">DB</span><span class="o">::</span><span class="nf">query</span><span class="p">(</span><span class="s2">"CREATE TABLE `countries` (`name` text, `outline` geometry);"</span><span class="p">);</span>
</code></pre></div></div>

<p>Creating the table is fairly straightforward. What’s neat is that MariaDB’s <code class="language-plaintext highlighter-rouge">geometry</code> type encompasses all kinds of geospatial data – so there’s no need to insert polygons differently into the table than multipolygons, say.</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$geojson</span> <span class="o">=</span> <span class="nb">json_decode</span><span class="p">(</span><span class="nb">file_get_contents</span><span class="p">(</span><span class="no">GEOJSON_FILE</span><span class="p">));</span>
<span class="k">foreach</span> <span class="p">(</span><span class="nv">$geojson</span><span class="o">-&gt;</span><span class="n">features</span> <span class="k">as</span> <span class="nv">$feature</span><span class="p">)</span> <span class="p">{</span>
    <span class="nv">$name</span> <span class="o">=</span> <span class="nv">$feature</span><span class="o">-&gt;</span><span class="n">properties</span><span class="o">-&gt;</span><span class="n">name_en</span><span class="p">;</span>
    <span class="nv">$outline</span> <span class="o">=</span> <span class="nb">json_encode</span><span class="p">(</span><span class="nv">$feature</span><span class="o">-&gt;</span><span class="n">geometry</span><span class="p">);</span>
    <span class="no">DB</span><span class="o">::</span><span class="nf">query</span><span class="p">(</span><span class="s2">"INSERT INTO `countries` (`name`, `outline`) VALUES (%s, ST_GeomFromGeoJSON(%s));"</span><span class="p">,</span> <span class="nv">$name</span><span class="p">,</span> <span class="nv">$outline</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>These six lines import the data contained in the <code class="language-plaintext highlighter-rouge">GEOJSON_FILE</code> into the <code class="language-plaintext highlighter-rouge">countries</code> table. First, the GeoJSON data is parsed into a PHP object representation, each feature (<em>i.e.</em>, country) of which is then processed in succession: its name is extracted, its geometry is “re-JSON-encoded” for transfer to the database, then these two values are <code class="language-plaintext highlighter-rouge">INSERT</code>ed into the <code class="language-plaintext highlighter-rouge">countries</code> table, taking advantage of <a href="https://mariadb.com/kb/en/st_geomfromgeojson/">the <code class="language-plaintext highlighter-rouge">ST_GeomFromGeoJSON()</code> function</a> to convert the GeoJSON geometry into MariaDB’s internal representation.</p>

<p>With country outlines now persisted in the database, all that was left to do was to update existing purchases (and to teach my tool to determine the country of newly-logged purchases, though there’s no point in showing that here):</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$purchasesWithLocation</span> <span class="o">=</span> <span class="no">DB</span><span class="o">::</span><span class="nf">query</span><span class="p">(</span><span class="s2">"SELECT * FROM `purchases` WHERE `latitude` IS NOT NULL AND `longitude` IS NOT NULL"</span><span class="p">);</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">purchases</code> table has <code class="language-plaintext highlighter-rouge">latitude</code> and <code class="language-plaintext highlighter-rouge">longitude</code> columns which are <code class="language-plaintext highlighter-rouge">NULL</code> on purchases with<em>out</em> location data.</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">foreach</span> <span class="p">(</span><span class="nv">$purchasesWithLocation</span> <span class="k">as</span> <span class="nv">$p</span><span class="p">)</span> <span class="p">{</span>
    <span class="nv">$point</span> <span class="o">=</span> <span class="s1">'{"type": "Point", "coordinates": ['</span> <span class="mf">.</span> <span class="nv">$p</span><span class="p">[</span><span class="s2">"longitude"</span><span class="p">]</span> <span class="mf">.</span> <span class="s1">', '</span> <span class="mf">.</span> <span class="nv">$p</span><span class="p">[</span><span class="s2">"latitude"</span><span class="p">]</span> <span class="mf">.</span> <span class="s1">']}'</span><span class="p">;</span>  <span class="c1">// careful: lon, lat!</span>
    <span class="no">DB</span><span class="o">::</span><span class="nf">query</span><span class="p">(</span><span class="s2">"UPDATE `purchases`
               SET `country` = (SELECT `name`
                                FROM `countries`
                                WHERE ST_Contains(`outline`, ST_GeomFromGeoJSON(%s))
                                ORDER BY `name`
                                LIMIT 1)
               WHERE `id` = %i"</span><span class="p">,</span> <span class="nv">$point</span><span class="p">,</span> <span class="nv">$p</span><span class="p">[</span><span class="s2">"id"</span><span class="p">]);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This code snippet assembles a GeoJSON geometry object of type <code class="language-plaintext highlighter-rouge">Point</code> representing each purchase’s location. The <code class="language-plaintext highlighter-rouge">UPDATE</code> statement’s subquery over the fancy new <code class="language-plaintext highlighter-rouge">countries</code> table utilizes <a href="https://mariadb.com/kb/en/st-contains/">the <code class="language-plaintext highlighter-rouge">ST_Contains()</code> function</a> to check if any given country’s <code class="language-plaintext highlighter-rouge">outline</code> contains the specified point, yielding either a single value (the country’s <code class="language-plaintext highlighter-rouge">name</code>, or the alphabetically first one should several outlines match, courtesy of <code class="language-plaintext highlighter-rouge">ORDER BY</code> and <code class="language-plaintext highlighter-rouge">LIMIT 1</code>) or <code class="language-plaintext highlighter-rouge">NULL</code> if no match was found. The result of the subquery is then patched into the relevant row of the <code class="language-plaintext highlighter-rouge">purchases</code> table.</p>
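
<p>For some intuition about what a containment check like this involves, here’s a minimal JavaScript sketch of the classic ray-casting point-in-polygon test on GeoJSON geometries. It is not a description of how MariaDB implements <code class="language-plaintext highlighter-rouge">ST_Contains()</code>, it glosses over edge cases such as points sitting exactly on a border, and <code class="language-plaintext highlighter-rouge">pointInRing</code>, <code class="language-plaintext highlighter-rouge">polygonContains</code>, <code class="language-plaintext highlighter-rouge">geometryContains</code> and the toy <code class="language-plaintext highlighter-rouge">square</code> are all made up for illustration.</p>

```javascript
// Ray casting: count how often a horizontal ray starting at the point crosses
// the ring's edges. An odd number of crossings means the point is inside.
// As in GeoJSON, coordinates are [lon, lat] pairs.
function pointInRing([x, y], ring) {
    let inside = false;
    for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
        const [xi, yi] = ring[i];
        const [xj, yj] = ring[j];
        const crosses = (yi > y) !== (yj > y) && x < (xj - xi) * (y - yi) / (yj - yi) + xi;
        if (crosses) inside = !inside;
    }
    return inside;
}

// A GeoJSON polygon is an outer ring followed by zero or more holes.
function polygonContains(point, rings) {
    const [outer, ...holes] = rings;
    return pointInRing(point, outer) && !holes.some(hole => pointInRing(point, hole));
}

// Handles the two geometry types country outlines typically come in.
function geometryContains(geometry, point) {
    if (geometry.type === 'Polygon') return polygonContains(point, geometry.coordinates);
    if (geometry.type === 'MultiPolygon') return geometry.coordinates.some(rings => polygonContains(point, rings));
    return false;
}

// Toy data: a 10-by-10-degree square with a 2-by-2-degree hole in the middle.
const square = {
    type: 'Polygon',
    coordinates: [
        [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
        [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],
    ],
};

console.log(geometryContains(square, [2, 3]));   // true: inside the square
console.log(geometryContains(square, [5, 5]));   // false: inside the hole
console.log(geometryContains(square, [12, 3]));  // false: outside
```

<p>Treating polygons and multipolygons through one function mirrors the convenience of the <code class="language-plaintext highlighter-rouge">geometry</code> column type: the caller doesn’t need to care which of the two a country happens to be.</p>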

<p>How’s performance looking? On my <a href="https://uberspace.de">excellent shared hosting</a> plan and given the GeoJSON data prepared above, the <code class="language-plaintext highlighter-rouge">ST_Contains()</code> function takes at most a second per check – usually, it’s significantly quicker.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:refine">
      <p>After returning from vacation, I implemented location refinement by moving a marker on a map. This interface has also allowed me to add locations to some previous purchases. <a href="#fnref:refine" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:leaflet">
      <p>Using <a href="https://leafletjs.com">Leaflet</a> with <a href="https://www.openstreetmap.org/#map=7/51.330/10.453">OSM</a> tiles. <a href="#fnref:leaflet" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:geopolitical">
      <p>Which may change in future depending on geopolitical developments. <a href="#fnref:geopolitical" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:bin">
      <p>Shapefile’s a binary format, GeoJSON is just JSON (with a schema). <a href="#fnref:bin" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:intuit">
      <p>Based on nothing but intuition – at this point, I had yet to think about how to actually query this data. <a href="#fnref:intuit" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:dl">
      <p>I’d’ve been happy to provide a download link, but I’m not sure about licensing – so <a href="https://noahdoersing.com/#contact">drop me an email</a> if you’re interested. <a href="#fnref:dl" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:mysql">
      <p>A fork of MySQL. (Which still exists, but there’s little reason to use it over MariaDB these days.) <a href="#fnref:mysql" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:mysqldocs">
      <p>I’m linking to MySQL’s documentation here because it’s more detailed – I suppose (yet have some difficulty typing this) there’s <em>some</em> benefits to using Oracle products. <a href="#fnref:mysqldocs" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[A few days ago, I wrote about implementing search term highlighting as part of a tool my partner and I have long been using to track our shared purchases. Another set of improvements deals with location data – towards the tail end of our recent vacation to South Korea, on a whim, I added a feature where logging a purchase on a device with geolocation support also captures the current location (as a latitude-longitude pair), storing it along with the rest of the purchase data.]]></summary></entry><entry><title type="html">Case-insensitive Search Term Highlighting With JavaScript</title><link href="https://excessivelyadequate.com/posts/mark.html" rel="alternate" type="text/html" title="Case-insensitive Search Term Highlighting With JavaScript" /><published>2024-07-15T20:00:00+02:00</published><updated>2024-07-15T20:00:00+02:00</updated><id>https://excessivelyadequate.com/posts/mark</id><content type="html" xml:base="https://excessivelyadequate.com/posts/mark.html"><![CDATA[<p>My partner and I track our shared expenses using a tiny little web-based tool I wrote back around the time we started dating – so now, nine-ish years later, it was about time to add search functionality.</p>

<p>Not wanting to dive too deeply<sup id="fnref:deeply"><a href="#fn:deeply" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> into the terrible mess of long-untouched and possibly-gone-feral PHP code that makes up the tool’s<sup id="fnref:notpub"><a href="#fn:notpub" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> backend, I decided to do this purely client-side: shipping a list of our roughly 2500 shared purchases to the browser (with the assumption that the development of compute and networking speeds will outpace our accumulation of restaurant visits and things), then hiding those not matching the search term as it’s typed.</p>

<p class="wide"><img src="/static/mark.jpg" alt="" /></p>

<p class="caption">A screenshot of a search that just happens to show off case-insensitive search term highlighting. (Also, some semi-subliminal vacation humblebragging.)</p>

<p><em>(I’ve since published <a href="/posts/country.html">another blog post</a> about that mysterious “namely located anywhere” dropdown menu – it enables filtering by country and is powered by MariaDB’s geospatial functions.)</em></p>

<p>Anyway, one small challenge that’s “encapsulate-able” enough to write a blog post about was <strong>highlighting occurrences of the search term in a case-insensitive manner</strong> – so that searching for “jeon” would highlight both “Shin<mark>jeon</mark>” and “<mark>Jeon</mark>ju”.</p>

<h2 id="markup">Markup</h2>

<p>To set the scene, here’s some HTML closely matching how purchases are marked up (markupped?) in my tool.</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;input</span> <span class="na">id=</span><span class="s">"searchbar"</span> <span class="na">value=</span><span class="s">""</span><span class="nt">&gt;</span>
<span class="nt">&lt;ol</span> <span class="na">id=</span><span class="s">"purchases"</span><span class="nt">&gt;</span>
    <span class="nt">&lt;li</span> <span class="na">id=</span><span class="s">"p1337"</span><span class="nt">&gt;</span>
        <span class="nt">&lt;span</span> <span class="na">class=</span><span class="s">"date"</span><span class="nt">&gt;</span>2024-06-13 -<span class="nt">&lt;/span&gt;</span>
        <span class="nt">&lt;span</span> <span class="na">class=</span><span class="s">"who"</span><span class="nt">&gt;</span>Noah<span class="nt">&lt;/span&gt;</span> paid ⋯ for
        <span class="nt">&lt;span</span> <span class="na">class=</span><span class="s">"description"</span><span class="nt">&gt;</span>things and stuff<span class="nt">&lt;/span&gt;</span>.
    <span class="nt">&lt;/li&gt;</span>
    <span class="nt">&lt;li&gt;</span>
        ⋮
    <span class="nt">&lt;/li&gt;</span>
    ⋮
<span class="nt">&lt;/ol&gt;</span>
</code></pre></div></div>

<p>So there’s an input for the search term, then a list<sup id="fnref:ol"><a href="#fn:ol" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> of purchases, each with an <code class="language-plaintext highlighter-rouge">id</code> like <code class="language-plaintext highlighter-rouge">p1337</code> and a <code class="language-plaintext highlighter-rouge">span.description</code> that’ll be the only thing considered for search matches. <em>(In my case, the description won’t ever contain any HTML. Keep that in mind if your use case comes without this handy dandy simplifying precondition.)</em></p>

<p>Highlighting can be accomplished by wrapping the matching parts of a purchase’s description in <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tags.</p>

<h2 id="searching">Searching</h2>

<p>Given the markup above, we can implement a basic<sup id="fnref:performance"><a href="#fn:performance" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> search function as follows:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">document</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">#searchbar</span><span class="dl">'</span><span class="p">).</span><span class="nf">addEventListener</span><span class="p">(</span><span class="dl">'</span><span class="s1">input</span><span class="dl">'</span><span class="p">,</span> <span class="nx">event</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">searchTerm</span> <span class="o">=</span> <span class="nx">event</span><span class="p">.</span><span class="nx">target</span><span class="p">.</span><span class="nx">value</span><span class="p">;</span>
    <span class="nb">document</span><span class="p">.</span><span class="nf">querySelectorAll</span><span class="p">(</span><span class="dl">'</span><span class="s1">#purchases li</span><span class="dl">'</span><span class="p">).</span><span class="nf">forEach</span><span class="p">(</span><span class="nx">purchaseLi</span> <span class="o">=&gt;</span> <span class="p">{</span>
        <span class="kd">const</span> <span class="nx">descriptionSpan</span> <span class="o">=</span> <span class="nx">purchaseLi</span><span class="p">.</span><span class="nf">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">span.description</span><span class="dl">'</span><span class="p">);</span>
        <span class="nf">clearHighlight</span><span class="p">(</span><span class="nx">descriptionSpan</span><span class="p">);</span>  <span class="c1">// remove highlights from previous searches</span>

        <span class="kd">const</span> <span class="nx">match</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">toLowerCase</span><span class="p">().</span><span class="nf">includes</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">.</span><span class="nf">toLowerCase</span><span class="p">());</span>
        <span class="k">if </span><span class="p">(</span><span class="nx">match</span><span class="p">)</span> <span class="p">{</span>
            <span class="nx">purchaseLi</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">display</span> <span class="o">=</span> <span class="dl">''</span><span class="p">;</span>  <span class="c1">// if previously hidden, show</span>
            <span class="nf">highlight</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">,</span> <span class="nx">descriptionSpan</span><span class="p">);</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="nx">purchaseLi</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">display</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">none</span><span class="dl">'</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">});</span>
<span class="p">});</span>
</code></pre></div></div>

<p>There’s nothing too fancy going on yet: On each keystroke (or other manner of <code class="language-plaintext highlighter-rouge">input</code>) in the <code class="language-plaintext highlighter-rouge">#searchbar</code>, the code runs through the list of purchases, checking if each one’s description contains the search term (case-insensitively by means of converting both to lowercase first). If so, a yet-unimplemented <code class="language-plaintext highlighter-rouge">highlight(searchTerm, descriptionSpan)</code> function is called. Non-matching purchases are hidden and highlights from previous<sup id="fnref:clear"><a href="#fn:clear" class="footnote" rel="footnote" role="doc-noteref">5</a></sup> searches cleared.</p>

<h2 id="highlighting">Highlighting</h2>

<p>Without even considering case (in)sensitivity, my first shot at the <code class="language-plaintext highlighter-rouge">highlight</code> function was a two-liner:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nf">highlight</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">,</span> <span class="nx">descriptionSpan</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">nonHighlightedBits</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">);</span>
    <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">=</span> <span class="nx">nonHighlightedBits</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">`&lt;mark&gt;</span><span class="p">${</span><span class="nx">searchTerm</span><span class="p">}</span><span class="s2">&lt;/mark&gt;`</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Two lines, three issues. (And I’ve written worse code than this.)</p>

<ol>
  <li>Of course this implementation won’t highlight matches that differ from the search term casing-wise – <code class="language-plaintext highlighter-rouge">String.split</code> is case-sensitive; there’s no case-insensitive equivalent available in any browser today.</li>
  <li>Even if there were, filling in the correctly-cased variant in each <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tag in line 2 would require somehow capturing those variants in line 1. Imagine searching for “a” in “Aardvark” – the first match needs to remain a capital A and the two other ones need to remain lowercase. Can’t just use the search term.</li>
  <li>Unrelated to functionally-correct highlighting but important nonetheless, adding content to a page by setting an element’s <code class="language-plaintext highlighter-rouge">.innerHTML</code> property to a value containing bits of straight user input opens the door to code injection. Despite this not<sup id="fnref:not"><a href="#fn:not" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> being exploitable here, I’ll get back to it after addressing 1 &amp; 2.</li>
</ol>
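
<p>The first two issues are easy to reproduce in a console. Here’s a quick sketch in which <code class="language-plaintext highlighter-rouge">naiveHighlight</code> is a hypothetical, string-only stand-in for the two-liner above (no DOM involved, and square brackets take the place of the <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tags):</p>

```javascript
// String-only stand-in for the two-liner: split on the term, then re-join
// with the bracketed term in between.
const naiveHighlight = (text, term) => text.split(term).join('[' + term + ']');

// Issue 1: splitting on a plain string is case-sensitive, so the capitalized
// occurrence is missed entirely.
console.log(naiveHighlight('Shinjeon, Jeonju', 'jeon'));  // Shin[jeon], Jeonju

// Issue 2: pretend a case-insensitive split existed by lowercasing the text first.
// Re-joining with the search term then overwrites the original casing.
const parts = 'Aardvark'.toLowerCase().split('a');
console.log(parts.join('[a]'));  // [a][a]rdv[a]rk (the capital A is gone)
```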

<h3 id="case-insensitive-highlighting">Case-insensitive highlighting</h3>

<p>JavaScript’s <code class="language-plaintext highlighter-rouge">String.split()</code> function can – instead of a string to split on – also accept a regular expression where each match then yields a split. And, conveniently, JavaScript’s regular expression objects can be created with a case insensitivity flag. So instead of <code class="language-plaintext highlighter-rouge">⋯.split(searchTerm)</code>, we can write <code class="language-plaintext highlighter-rouge">⋯.split(new RegExp(searchTerm, 'ig'))</code>, quickly and easily resolving issue 1.</p>

<p>…but with a bit of an asterisk – literally: Imagine searching for “A*” in the string “A* Algorithm” both with and without regular expression support. A basic search would of course only match “A*”, but a regex search<sup id="fnref:astar"><a href="#fn:astar" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> would instead match the two capital “A”s (plus a whole bunch of empty strings).</p>

<p>Considering this accidental regular expression support a bug rather than a feature (your opinion may well differ), I <a href="https://stackoverflow.com/a/67227435">figured out</a> that backslash-escaping all characters carrying a special meaning in regular expressions will again make the highlighter as dumb as it ought to be:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">regEscape</span> <span class="o">=</span> <span class="nx">v</span> <span class="o">=&gt;</span> <span class="nx">v</span><span class="p">.</span><span class="nf">replace</span><span class="p">(</span><span class="sr">/</span><span class="se">[</span><span class="sr">-[</span><span class="se">\]</span><span class="sr">{}()*+?.,</span><span class="se">\\</span><span class="sr">^$|#</span><span class="se">\s]</span><span class="sr">/g</span><span class="p">,</span> <span class="dl">'</span><span class="se">\\</span><span class="s1">$&amp;</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">nonHighlightedBits</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="k">new</span> <span class="nc">RegExp</span><span class="p">(</span><span class="nf">regEscape</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">),</span> <span class="dl">'</span><span class="s1">ig</span><span class="dl">'</span><span class="p">));</span>
</code></pre></div></div>
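
<p>To see the difference escaping makes, here’s a small sketch that runs the “A*” example through both variants. The <code class="language-plaintext highlighter-rouge">regEscape</code> helper is the one from above; the variable names <code class="language-plaintext highlighter-rouge">unescaped</code> and <code class="language-plaintext highlighter-rouge">escaped</code> are just for illustration.</p>

```javascript
// Same escaping helper as above: prefixes regex metacharacters with a backslash.
const regEscape = v => v.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&');

const text = 'A* Algorithm';

// Unescaped, "A*" means "zero or more As": both capital As act as separators
// (the empty matches in between remove nothing, they merely chop up the rest).
const unescaped = text.split(new RegExp('A*', 'ig'));
console.log(unescaped.join(''));  // * lgorithm

// Escaped, the pattern becomes A\* and only matches the literal "A*".
console.log(regEscape('A*'));  // A\*
const escaped = text.split(new RegExp(regEscape('A*'), 'ig'));
console.log(escaped);  // ['', ' Algorithm']
```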

<h3 id="correctly-cased-highlighting">Correctly-cased highlighting</h3>

<p>Now onto issue 2 (<em>i.e.</em>, filling in the correctly-cased variant of the match in each <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tag). Having switched to splitting by a regular expression instead of a plain string just so happens to help resolve this one, too, since <code class="language-plaintext highlighter-rouge">String.split(RegExp)</code> will <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split#splitting_with_a_regexp_to_include_parts_of_the_separator_in_the_result">include any capturing groups</a> at odd indices of the resulting array. So all that’s needed is wrapping the <code class="language-plaintext highlighter-rouge">regEscape(searchTerm)</code> bit in a capturing group…</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">bits</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="k">new</span> <span class="nc">RegExp</span><span class="p">(</span><span class="dl">'</span><span class="s1">(</span><span class="dl">'</span> <span class="o">+</span> <span class="nf">regEscape</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">)</span> <span class="o">+</span> <span class="dl">'</span><span class="s1">)</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">ig</span><span class="dl">'</span><span class="p">));</span>
</code></pre></div></div>

<p>…and then, when putting things back together, placing a <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tag around <a href="https://stackoverflow.com/a/22312556">each odd-indexed element</a> of that array. With all that, the <code class="language-plaintext highlighter-rouge">highlight</code> function gains a line, but loses two issues (the code injection one remains – read on):</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nf">highlight</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">,</span> <span class="nx">descriptionSpan</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">regEscape</span> <span class="o">=</span> <span class="nx">v</span> <span class="o">=&gt;</span> <span class="nx">v</span><span class="p">.</span><span class="nf">replace</span><span class="p">(</span><span class="sr">/</span><span class="se">[</span><span class="sr">-[</span><span class="se">\]</span><span class="sr">{}()*+?.,</span><span class="se">\\</span><span class="sr">^$|#</span><span class="se">\s]</span><span class="sr">/g</span><span class="p">,</span> <span class="dl">'</span><span class="se">\\</span><span class="s1">$&amp;</span><span class="dl">'</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">bits</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="k">new</span> <span class="nc">RegExp</span><span class="p">(</span><span class="dl">'</span><span class="s1">(</span><span class="dl">'</span> <span class="o">+</span> <span class="nf">regEscape</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">)</span> <span class="o">+</span> <span class="dl">'</span><span class="s1">)</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">ig</span><span class="dl">'</span><span class="p">));</span>
    <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">=</span> <span class="nx">bits</span><span class="p">.</span><span class="nf">map</span><span class="p">((</span><span class="nx">s</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="nx">i</span> <span class="o">&amp;</span> <span class="mi">1</span> <span class="p">?</span> <span class="s2">`&lt;mark&gt;</span><span class="p">${</span><span class="nx">s</span><span class="p">}</span><span class="s2">&lt;/mark&gt;`</span> <span class="p">:</span> <span class="nx">s</span><span class="p">).</span><span class="nf">join</span><span class="p">(</span><span class="dl">''</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
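
<p>Why the matches reliably end up at odd indices is perhaps easiest to see with the “Aardvark” example from earlier. The following sketch assumes exactly one capturing group, as in the function above; with more groups, the index arithmetic would change.</p>

```javascript
// With a capturing group, split() keeps each match in the result array.
const bits = 'Aardvark'.split(/(a)/ig);
console.log(bits);  // ['', 'A', '', 'a', 'rdv', 'a', 'rk']

// Non-matching text sits at even indices (as empty strings where matches are
// adjacent or at the very start), matches sit at odd indices, casing intact.
const matches = bits.filter((s, i) => i % 2 === 1);
console.log(matches);  // ['A', 'a', 'a']

// Joining everything back together reproduces the input exactly.
console.log(bits.join('') === 'Aardvark');  // true
```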

<h3 id="code-injection-prevention">Code injection prevention</h3>

<p>To guard against code injection, we need to construct a collection of DOM nodes ourselves instead of, as above, letting the browser do that by assigning HTML code to the <code class="language-plaintext highlighter-rouge">.innerHTML</code> property of the <code class="language-plaintext highlighter-rouge">descriptionSpan</code> element. “Packaging” anything depending on user input (<em>i.e.</em>, the matches) within text nodes will prevent parsing and execution of <code class="language-plaintext highlighter-rouge">&lt;script&gt;</code> tags someone might’ve sneaked in there.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">highlighted</span> <span class="o">=</span> <span class="nx">bits</span><span class="p">.</span><span class="nf">map</span><span class="p">((</span><span class="nx">s</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">i</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="kd">const</span> <span class="nx">e</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">createElement</span><span class="p">(</span><span class="dl">'</span><span class="s1">mark</span><span class="dl">'</span><span class="p">);</span>
        <span class="nx">e</span><span class="p">.</span><span class="nf">appendChild</span><span class="p">(</span><span class="nb">document</span><span class="p">.</span><span class="nf">createTextNode</span><span class="p">(</span><span class="nx">s</span><span class="p">));</span>
        <span class="k">return</span> <span class="nx">e</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="nb">document</span><span class="p">.</span><span class="nf">createTextNode</span><span class="p">(</span><span class="nx">s</span><span class="p">);</span>
<span class="p">});</span>
<span class="nx">descriptionSpan</span><span class="p">.</span><span class="nf">replaceChildren</span><span class="p">(...</span><span class="nx">highlighted</span><span class="p">);</span>
</code></pre></div></div>

<p>This code “manually” assembles a subtree of text and <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> nodes and hooks it in below the description element, replacing<sup id="fnref:cuckoo"><a href="#fn:cuckoo" class="footnote" rel="footnote" role="doc-noteref">8</a></sup> its previous contents.</p>

<h2 id="putting-it-all-together">Putting it all together</h2>

<p>Here’s the full <code class="language-plaintext highlighter-rouge">highlight</code> function for your copy-pasting pleasure. There’s also a little <a href="/static/mark.html">demo</a> if you’d like to try it out first!</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nf">highlight</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">,</span> <span class="nx">descriptionSpan</span><span class="p">)</span> <span class="p">{</span>

    <span class="c1">// nothing to do if the search field was empty</span>
    <span class="k">if </span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">.</span><span class="nx">length</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// case-insensitive split while capturing split string</span>
    <span class="kd">const</span> <span class="nx">regEscape</span> <span class="o">=</span> <span class="nx">v</span> <span class="o">=&gt;</span> <span class="nx">v</span><span class="p">.</span><span class="nf">replace</span><span class="p">(</span><span class="sr">/</span><span class="se">[</span><span class="sr">-[</span><span class="se">\]</span><span class="sr">{}()*+?.,</span><span class="se">\\</span><span class="sr">^$|#</span><span class="se">\s]</span><span class="sr">/g</span><span class="p">,</span> <span class="dl">'</span><span class="se">\\</span><span class="s1">$&amp;</span><span class="dl">'</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">bits</span> <span class="o">=</span> <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nx">textContent</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="k">new</span> <span class="nc">RegExp</span><span class="p">(</span><span class="dl">'</span><span class="s1">(</span><span class="dl">'</span> <span class="o">+</span> <span class="nf">regEscape</span><span class="p">(</span><span class="nx">searchTerm</span><span class="p">)</span> <span class="o">+</span> <span class="dl">'</span><span class="s1">)</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">ig</span><span class="dl">'</span><span class="p">));</span>

    <span class="c1">// put back together while surrounding split strings (always at odd indices) with &lt;mark&gt;</span>
    <span class="kd">const</span> <span class="nx">highlighted</span> <span class="o">=</span> <span class="nx">bits</span><span class="p">.</span><span class="nf">map</span><span class="p">((</span><span class="nx">s</span><span class="p">,</span> <span class="nx">i</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
        <span class="k">if </span><span class="p">(</span><span class="nx">i</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
            <span class="kd">const</span> <span class="nx">e</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nf">createElement</span><span class="p">(</span><span class="dl">'</span><span class="s1">mark</span><span class="dl">'</span><span class="p">);</span>
            <span class="nx">e</span><span class="p">.</span><span class="nf">appendChild</span><span class="p">(</span><span class="nb">document</span><span class="p">.</span><span class="nf">createTextNode</span><span class="p">(</span><span class="nx">s</span><span class="p">));</span>
            <span class="k">return</span> <span class="nx">e</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="k">return</span> <span class="nb">document</span><span class="p">.</span><span class="nf">createTextNode</span><span class="p">(</span><span class="nx">s</span><span class="p">);</span>
    <span class="p">});</span>

    <span class="c1">// finally, write back onto page</span>
    <span class="nx">descriptionSpan</span><span class="p">.</span><span class="nf">replaceChildren</span><span class="p">(...</span><span class="nx">highlighted</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:deeply">
      <p>Which I did anyway once I got a bit of a flow going, adding a map view and – also overdue – the ability to edit purchases. <a href="#fnref:deeply" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:notpub">
      <p>Which is also why I haven’t open-sourced it. <a href="#fnref:notpub" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:ol">
      <p><code class="language-plaintext highlighter-rouge">&lt;ol&gt;</code> instead of <code class="language-plaintext highlighter-rouge">&lt;ul&gt;</code> (despite, in the screenshot, <code class="language-plaintext highlighter-rouge">list-style-type: none</code>) because the list’s ordered by date. Semantic markup, y’all! <a href="#fnref:ol" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:performance">
      <p>Basic as in “I didn’t consider performance for a second”, but anecdotally, it’s quick enough to support thousands of entries without noticeable delay (<em>i.e.</em>, faster than a round trip to the backend). <a href="#fnref:performance" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:clear">
      <p>The <code class="language-plaintext highlighter-rouge">clearHighlight(descriptionSpan)</code> function is just <code class="language-plaintext highlighter-rouge">descriptionSpan.textContent = descriptionSpan.textContent</code>, which effectively replaces any <code class="language-plaintext highlighter-rouge">&lt;mark&gt;</code> tags with the text they contain. <a href="#fnref:clear" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:not">
      <p>For two reasons: 1. The only extant instance of this tool is behind a login that only the two of us have access to, and (more importantly) 2. since purchase descriptions don’t contain HTML, searching with a string containing a <code class="language-plaintext highlighter-rouge">&lt;script&gt;</code> tag wouldn’t match any purchases, so the <code class="language-plaintext highlighter-rouge">highlight</code> function wouldn’t insert it into the page (nor run at all). <a href="#fnref:not" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:astar">
      <p>Where “A*” means “any number (including zero) of successive ‘A’s”. <a href="#fnref:astar" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:cuckoo">
      <p><a href="https://www.youtube.com/watch?v=ZXdZZAf0AU0">Cuckoo, cuckoo!</a> <a href="#fnref:cuckoo" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[My partner and I track our shared expenses using a tiny little web-based tool I wrote back around the time we started dating – so now, nine-ish years later, it was about time to add search functionality.]]></summary></entry><entry><title type="html">20 Blogs Worth Binging</title><link href="https://excessivelyadequate.com/posts/blogs.html" rel="alternate" type="text/html" title="20 Blogs Worth Binging" /><published>2024-04-24T21:00:00+02:00</published><updated>2024-04-24T21:00:00+02:00</updated><id>https://excessivelyadequate.com/posts/blogs</id><content type="html" xml:base="https://excessivelyadequate.com/posts/blogs.html"><![CDATA[<p>…20 times this one, of course! Obvious falsehoods aside, here’s a listicle (is that still what people call them?) of blogs I can wholly recommend<sup id="fnref:blogroll"><a href="#fn:blogroll" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> reading, if you’re so inclined, from beginning to end. My deeply immoral goal here is to gently knock you into blog-shaped rabbit holes you’ll spend hours reading your way through until you emerge, with a bunch of knowledge (or just having had a good time), at the other end. <em>I’m not strict about what constitutes a “blog” here – not all of these sites have RSS feeds, the entries of some aren’t dated, and a couple aren’t even updated<sup id="fnref:timeless"><a href="#fn:timeless" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> anymore.</em></p>

<ul>
  <li>
    <p>There’s only a small chance you don’t know about Randall Munroe’s <a href="https://what-if.xkcd.com/archive/">What if?</a> series, but if you’ve been living under a rock (not throwing any shade – geology’s fascinating!), it’s where he’s been answering questions like “What would happen if you tried to hit a baseball pitched at 90% the speed of light?” with equal (large!) amounts of humor and scientific accuracy.</p>
  </li>
  <li>
    <p>If you’re into the whole <em>Lord of the Rings</em> thing, know that Ian McKellen wrote a series <a href="https://mckellen.com/cinema/lotr/journal.htm">of</a> <a href="https://mckellen.com/cinema/lotr/wb/index.htm">blog</a> <a href="https://mckellen.com/cinema/hobbit-movie/blog.htm">posts</a> (that’s three separate links!) on his experience portraying Gandalf during the <em>LotR</em> and <em>Hobbit</em> movie trilogies.</p>
  </li>
  <li>
    <p>Another series of blog posts discussing <em>LotR</em>, more specifically taking a historian’s look at the Battle of Helm’s Deep, has been <a href="https://acoup.blog/2020/05/01/collections-the-battle-of-helms-deep-part-i-bargaining-for-goods-at-helms-gate/">published by Bret C. Devereaux</a>, the rest of whose blog is also worthy of your consideration. Quoting from the last entry:</p>

    <blockquote>
      <p>Tolkien presents a world where it is often necessary to employ violence, but just as often necessary to restrain it.  Jackson may miss some of the details and opportunities, but he captures this spirit – where most modern ‘war’ movies and certainly most adaptations (looking at you, Game of Thrones) miss it entirely.  And that’s worth taking a deeper look at.</p>
    </blockquote>
  </li>
  <li>
    <p>It’s less of a blog and more of a digital garden, but on the <a href="https://100r.co/site/home.html">website of Hundred Rabbits</a> (<em>i.e.</em>, Rek Bell and Devine Lu Linvega) you’ll find accounts of their travels on a sailboat (from Canada to Mexico to New Zealand to Japan <a href="https://100r.co/site/busy_doing_nothing.html">and back again</a>), permacomputing explorations, boat maintenance tips, solar cooking recipes, and a whole lot more. There’s an RSS feed with monthly updates.</p>
  </li>
  <li>
    <p><a href="https://craigmod.com/ridgeline/">Craig Mod</a> walks, walks, walks – mostly in Japan. And on these walks, he writes about octogenarian-run coffee shops and roads meandering past Pachinko parlors and stiflingly-cedar-covered mountain passes and what he calls his pizza toast obsession, occasionally weaving connections between Japan’s rapidly-depopulating countryside and his own past. Here’s a long pull quote from the introduction of his almost-book-y mini site <a href="https://walkkumano.com/iseji/">“Ise-ji: Walk With Me”</a>:</p>

    <blockquote>
      <p>Allow me to share what I love about a good walk in Japan: I love the small villages, coming across a rusted and worn down <em>kissa</em>, sipping a ¥200 cup of the dankest coffee around, listening to an 80 year old <em>mama</em> relay debauched stories of love lost over a slice of pizza toast. I love the Japan-walking clichés, those moments in the forest, alone, <em>uguisu</em> birdsong above, winds shifting the bamboo treetops like a Ghibli film loop, stopping to catch my breath next to a grave marking the spot where a loyal horse conked out two hundred years ago. I love coming across the remnants of teahouses at mountain passes, foundation stones blocking off volume for the mind to fill in. I love cutting through rice fields, greeting suspicious farmers in all their various stages of planting or prepping or harvesting depending on season, photographing their weather-gouged faces, dubious dentistry, the clockwork movements of steam-punk planters dropping seedlings into the shallow ponds of their fields. I love walking past an abandoned and roofless forest shrine in May, returning in December only to find it glowing with fresh <em>hinoki</em> wood – <em>Whoa, someone still cares</em>. And I love the plainness of life on display: The bedsheets and museum-grade underwear drying in the sun, cars washed before jagged mountain backdrops, the maintenance on homes, plaster walls, <em>kayabuki</em> thatched roofs, the squat pulling of weeds from moss gardens. I love all these seemingly insignificant details, but details that, en masse, form the fullness of a time and place, both in the historical aggregate and of that very moment in which you’re stepping. It’s a helluva thing, the gift of walking the world.</p>
    </blockquote>
  </li>
  <li>
    <p>On <a href="https://heredragonsabound.blogspot.com">Here Dragons Abound</a>, Scott Turner details his journey building a procedural fantasy map generator – he doesn’t show much code (most posts contain precisely none), instead exploring and illustrating the concepts involved. I especially recommend the <a href="https://heredragonsabound.blogspot.com/2017/12/city-symbols-part-1-what-im-trying-to-do.html">series on city symbols</a>.</p>
  </li>
  <li>
    <p>Bird photographs from a city here in southwestern Germany, not much more (and why would you want more), on: <a href="https://longingforrotkehlchen.tumblr.com/">Longing for Rotkehlchen</a>.</p>
  </li>
  <li>
    <p><a href="https://vienna-pyongyang.blogspot.com">The forbidden railway</a> details (with lots of pictures) a 2008 rail trip from Vienna to Pyongyang, including an only-sorta-legal border crossing. Unthinkable now, but apparently possible back then!</p>
  </li>
  <li>
    <p>Until 2020 (unrelated to the pandemic), pseudonymous trauma and general surgeon <a href="http://www.docbastard.net">DocBastard</a> maintained a blog where he shared stories from his workplace and regularly weighed in on healthcare-related happenings on the internet.</p>
  </li>
  <li>
    <p>Jimmy Maher’s writing not one, but two blogs worth reading from top to bottom, publishing a long-form article on one of them each week – if you’re into ’80s and ’90s video games (say, <a href="https://www.filfre.net/2020/02/myst-or-the-drawbacks-to-success/">Myst</a>), operating systems, or consumer technology in general, you’ll read your fill on <a href="https://www.filfre.net/">The Digital Antiquarian</a>. And if that’s all a bit too recent for your tastes, over on <a href="https://analog-antiquarian.net/2024/04/12/chapter-4-the-expedition/">The Analog Antiquarian</a>, he starts with <a href="https://analog-antiquarian.net/2019/01/11/chapter-1-the-charlatan-and-the-gossip/">the pyramids</a>.</p>
  </li>
  <li>
    <p>There’s good architecture, there’s bad architecture, and then there’s McMansions. “By alternating comedy-oriented takedowns of individual houses with weekly informative essays about architecture, urbanism, sociology, and design, <a href="https://mcmansionhell.com">McMansionHell</a> hopes to open readers’ eyes to the world around them, and inspire them to make it a better one.”</p>
  </li>
  <li>
    <p>Read a few articles by <a href="https://admiralcloudberg.medium.com">Admiral Cloudberg</a> and you’ll feel safer flying – she writes 30-to-50-minute pieces on historic airplane crashes, describing what happened engagingly but without undue dramatization, outlining the engineering and piloting decisions that led to things unraveling (at <em>just</em> the right level of technical detail), and usually closing with what’s been done to make sure any particular crash won’t reoccur.</p>
  </li>
  <li>
    <p>On <a href="https://techreflect.org/">Tech Reflect</a>, a former Apple employee shares <a href="https://techreflect.org/2020/10/steve-jobs-vs-me-on-my-bicycle/">stories</a> from his time there alongside <a href="https://techreflect.org/2019/11/macos-tips-old-dock-tips-that-are-still-useful/">macOS tips and tricks</a>.</p>
  </li>
  <li>
    <p>If you speak German and are interested in learning what it’s like to live on a sailboat slowly making your way around the world, read about the travels of <a href="https://www.muktuk.de/">Muktuk</a> and her crew. There’s the time <a href="https://www.muktuk.de/2023/02/10/fuenf-wochen-auf-see/">a racing pigeon became a stowaway</a>, having turned up 250 nautical miles from the nearest island. At the time of writing, they’re traveling (sans pigeon) up and down the coast of Japan.</p>
  </li>
  <li>
    <p>Stretching the definition of “blog” to include “multiple novels available chapter-by-chapter plus a bunch of short stories and also an actual blog”, I’d be remiss not to recommend <a href="https://qntm.org/">qntm</a>’s site hosting works such as <a href="https://qntm.org/ra">Ra</a>…</p>

    <blockquote>
      <p>Discovered in the 1970s, magic is now a bona fide field of engineering. There’s magic in heavy industry and magic in your home. It’s what’s next after electricity.</p>
    </blockquote>

    <p>…and his series of <a href="https://qntm.org/nanowrimo">short</a> <a href="https://qntm.org/more">stories</a> originally written during NaNoWriMo. Don’t miss <a href="https://qntm.org/responsibilit">“I Don’t Know, Timmy, Being God Is a Big Responsibility”</a>.</p>
  </li>
  <li>
    <p>If you’re into that particular flavor of semi-paranormal sci-fi, you might already know about the <a href="https://scp-wiki.wikidot.com">SCP Wiki</a> – but if you don’t, you’re in for a deep dive into 7000+ articles cataloging strange <a href="https://scp-wiki.wikidot.com/scp-134">creatures</a>, <a href="https://scp-wiki.wikidot.com/scp-093">places</a>, and <a href="https://scp-wiki.wikidot.com/scp-2816">phenomena</a>.</p>

    <blockquote>
      <p>Staircases that go on forever, mechanical gods from the beginning of time, otherwise regular humans who reshape reality with their mind: these are the kinds of things that, if known to the public, could cause mass hysteria and start wars on scales unprecedented. Due to that, there exists an organization called the SCP Foundation, whose job is to research paranormal activity, keep these creatures and objects concealed from the public, and protect humanity from the horrors of the dark.</p>
    </blockquote>

    <p>And that’s not to mention thousands of <em>stories</em> taking place in that same universe, for example the breathtakingly-high-quality <a href="https://scp-wiki.wikidot.com/antimemetics-division-hub">“There Is No Antimemetics Division”</a> by previous-entry-in-this-list qntm.</p>
  </li>
  <li>
    <p>As far as regular here’s-a-bunch-of-interesting-links newsletters go, <a href="https://www.tomscott.com/newsletter/">Tom Scott’s</a> is my favorite. Given my particular interests, there’s an unusually high signal-to-noise ratio here – then again, I don’t think I missed any of the weekly videos Tom published from 2014 until he was <a href="https://www.youtube.com/watch?v=7DKv5H5Frt0">carried into the sunset by a helicopter</a> earlier this year.</p>
  </li>
  <li>
    <p>Fancy reading about cool places from the comfort of your home? Like, <em>cold</em> cool places? The aptly-named <a href="https://brr.fyi/">brr.fyi</a> is a recently-concluded blog written by an anonymous IT worker initially deployed to Antarctica’s McMurdo Station, where it was apparently too warm and cozy, so they switched to the Amundsen-Scott South Pole Station halfway through.</p>
  </li>
  <li>
    <p>Last but <em>certainly</em> not least, there’s the writings of <a href="http://woodblock.com/front.html">Dave Bull</a>, a woodblock printmaker who’s been living and working in Japan since the 1980s.</p>

    <p>His website is labyrinthine in the best possible way – apart from heaps and bounds of <a href="http://woodblock.com/encyclopedia/outline.html">woodblock printmaking knowledge</a>, you’ll find <a href="http://www.asahi-net.or.jp/~xs3d-bull/essays/essays.html">multiple</a> <a href="http://astoryaweek.com/en/contents.php">collections</a> <a href="http://www.asahi-net.or.jp/~xs3d-bull/hyaku-nin-issho/index.html">of essays</a> (that’s three links to three different pages) on various subjects: a <a href="http://www.asahi-net.or.jp/~xs3d-bull/essays/other/poetry.html">visit to the Imperial Palace</a> to meet the emperor, <a href="http://www.asahi-net.or.jp/~xs3d-bull/hyaku-nin-issho/1992/autumn/9.html#anchor_okunono">raising his daughters</a> back in the 90s, his <a href="http://www.asahi-net.or.jp/~xs3d-bull/hyaku-nin-issho/1994/winter/18.html">own upbringing as a British-born Canadian</a>, <a href="http://woodblock.com/contact/seseragi_frame_index.php?file=intro">building a woodblock printmaking studio</a> in the sub-basement of his house (which, unlike most sub-basements, has a river view), <a href="http://woodblock.com/roundtable/archives/1997/01/sing_for_your_s.html">busking in front of London’s Royal Festival Hall to make rent</a> in his early 20s, <a href="http://woodblock.com/scroll/frame_index.php?file=intro">reproducing a large scroll-mounted woodblock print</a>, <a href="http://www.asahi-net.or.jp/~xs3d-bull/hyaku-nin-issho/2001/autumn/45.html#anchor_bio">programming early computers</a>, general <a href="http://www.asahi-net.or.jp/~xs3d-bull/essays/1995/off_to_market.html">observations on Japanese culture as an outsider</a>, and <em>so</em> much more. Five quotes picked almost at random:</p>

    <blockquote>
      <p>As with most of us, my acquaintance with <a href="http://www.asahi-net.or.jp/~xs3d-bull/essays/1994/unusual_guest.html">bats</a> has been necessarily a rather distant one. That is, up until one day last spring, when I finally had a chance to meet one at close range. I nearly stepped on her (I’m quite sure it was a ‘her’) while coming in to our apartment. A small little brown ball, about two centimeters across, just at the edge of the concrete sidewalk. I thought it was a dead mouse at first, but when I looked closer, I saw that it was a tiny bat, and then when it moved slightly, realized that it was alive.</p>
    </blockquote>

    <blockquote>
      <p>Quite a number of people I meet seem to be interested in my eldest daughter’s <a href="http://www.asahi-net.or.jp/~xs3d-bull/essays/1996/whats_in_name.html">name</a> … at least I do get plenty of comments and questions about it. It is, as far as I can tell, unique, but unlike other unique ‘made-up’ names that I have heard, it has no ‘strange’ feeling. Her name is ‘Himi’, and how she got it is an interesting story … my computer did it!</p>
    </blockquote>

    <blockquote>
      <p>I suppose the students are happy in their clean, bright, not to say warm classrooms. I suppose the insurance company is happy, secure in the knowledge that this building will not catch fire one cold night. I suppose the village parents are happy, knowing that their children have a facility the equal of those in the big cities. But if everybody is so happy, then <a href="http://astoryaweek.com/en/display_story.php?story_file=200707000340196">why are my eyes full of water</a>?</p>
    </blockquote>

    <blockquote>
      <p>Case open on the ground in front of me, I started. I can still remember the first piece I played; a showy little etude full of cascading arpeggios and runs. A million notes packed into just a few bars. I was astonished at the sound that came out. Each note seemed to hang in the air, and travel for miles. Perhaps the water was acting as a sounding board, or perhaps there was some kind of echo from the buildings across the river … It was a magnificent location. No matter how opulent the concert hall at my back may have been inside, it couldn’t have <a href="http://woodblock.com/roundtable/archives/1997/01/sing_for_your_s.html">sounded</a> as good as this!</p>
    </blockquote>

    <blockquote>
      <p>Every summer I leave my apartment in the city and come and stand <a href="http://www.asahi-net.or.jp/~xs3d-bull/hyaku-nin-issho/1992/autumn/9.html#anchor_okunono">here</a> in these fields. I look about me and note the changes; another tree blocking the path, a new hole in the roof of the farmhouse, another stone fallen from a wall …. And I ask myself, “Why am I allowing this to happen?”</p>
    </blockquote>

    <p>Now in his seventies, Dave doesn’t have much time to write lately, instead growing his printmaking company <a href="https://mokuhankan.com">Mokuhankan</a>, making <a href="https://www.youtube.com/watch?v=ij9KXgiyDAc">YouTube videos</a> and <a href="http://twitch.tv/japaneseprintmaking">streaming thrice-weekly on Twitch</a>.</p>
  </li>
</ul>

<hr />

<p>The compilation of this post has benefited greatly from being able to crawl through the archives of <a href="https://github.com/doersino/read">ReAD</a>, my homegrown<sup id="fnref:read"><a href="#fn:read" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> read-it-later tool where I’ve logged over 44,000 articles read across the last decade or so.  Somewhere in my drafts and raring to be finished, there’s an embryo of an article exploring what I’ve learned about my reading habits from this admittedly-obsessive level of record-keeping.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:blogroll">
      <p>This post is kind of an expanded <a href="https://en.wikipedia.org/wiki/Blogroll">blogroll</a>, I suppose. <a href="#fnref:blogroll" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:timeless">
      <p>…they’re all sufficiently “timeless” that a lack of updates won’t matter until you’ve finished binging the existing posts (and are thinking: “now what am I going to do with my life?”). <em>Fun fact: This post has been in my drafts since at least 2020, so some of the “dead” blogs were still being updated when I wrote about them!</em> <a href="#fnref:timeless" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:read">
      <p>Owing to the fact that I started building ReAD in mid-2013, just a few months into properly learning to program, it’s a terrible mess of PHP spaghetti code. I’ve extended it a few times since then with statistics pages, powerful search functionality, and the ability to add quotes – this work has been just about bearable enough to never warrant a full rewrite. <a href="#fnref:read" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[…20 times this one, of course! Obvious falsehoods aside, here’s a listicle (is that still what people call them?) of blogs I can wholly recommend reading, if you’re so inclined, from beginning to end. My deeply immoral goal here is to gently knock you into blog-shaped rabbit holes you’ll spend hours reading your way through until you emerge, with a bunch of knowledge (or just having had a good time), at the other end. I’m not strict about what constitutes a “blog” here – not all of these sites have RSS feeds, the entries of some aren’t dated, and a couple aren’t even updated anymore.]]></summary></entry><entry><title type="html">Secure Backups of Important Files to Insecure Locations</title><link href="https://excessivelyadequate.com/posts/documents.html" rel="alternate" type="text/html" title="Secure Backups of Important Files to Insecure Locations" /><published>2024-04-21T16:00:00+02:00</published><updated>2024-04-21T16:00:00+02:00</updated><id>https://excessivelyadequate.com/posts/documents</id><content type="html" xml:base="https://excessivelyadequate.com/posts/documents.html"><![CDATA[<p>I’ve mostly<sup id="fnref:loseblattsammlung"><a href="#fn:loseblattsammlung" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> done away with keeping binders of important documents, instead storing everything in a directory on my computer. In addition to my usual backup routine, which comprises</p>

<ul>
  <li>hourly <a href="https://support.apple.com/en-us/104984">Time Machine</a> backups to an external SSD taped to the back of my display,</li>
  <li>monthly <a href="https://shirt-pocket.com/SuperDuper/SuperDuperDescription.html">SuperDuper!</a> clones to a disk in a desk drawer, and (…now transitioning from “backup” to “medium-term archival”)</li>
  <li>quarterly-ish exports to a fairly inscrutable storage scheme distributed across a frankly deranged number<sup id="fnref:nas"><a href="#fn:nas" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> of unlabeled external drives,</li>
</ul>

<p>these documents seem to warrant another “layer” or two of disaster proofing, my implementation of which I’ll describe in this post. Notably, those previous layers of my proverbial backup onion all live in my apartment<sup id="fnref:backblaze"><a href="#fn:backblaze" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> – and since it’s <a href="https://www.backblaze.com/blog/whats-the-diff-3-2-1-vs-3-2-1-1-0-vs-4-3-2/">best practice</a> to keep at least one backup off-site, enabling just that in a secure (<em>i.e.</em>, encrypted) manner was a primary goal here.</p>

<p>Here’s a heavily-redacted screenshot of the files<sup id="fnref:german"><a href="#fn:german" class="footnote" rel="footnote" role="doc-noteref">4</a></sup> in question (because images break up the monotony of mediocre prose):</p>

<p class="wide"><img src="/static/documents.png" alt="" /></p>

<h2 id="compression">Compression</h2>

<p>The first step of the backup solution I came up with zips up the directory using <code class="language-plaintext highlighter-rouge">tar -cz ".../path/to/the/documents/"</code>, yielding a <code class="language-plaintext highlighter-rouge">.tar.gz</code> bitstream. That’s less to save on storage space (with the files being mostly scans, screenshots, and PDFs, trusty old <a href="https://jvns.ca/blog/2013/10/23/day-15-how-gzip-works/">Gzip</a> can only shrink them by about 20%) and more because a single file is easier to work with than the original directory.</p>
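
<p>To check how much compression actually buys for a given directory, the size of the plain <code class="language-plaintext highlighter-rouge">tar</code> stream can be compared against the gzipped one. What follows is a minimal sketch rather than the actual backup script: the scratch directory and the sample file are made up for illustration, and the deliberately repetitive sample compresses far better than scans or PDFs would.</p>

```sh
#!/bin/sh
set -eu

# Hypothetical scratch directory standing in for the real documents folder.
workdir=$(mktemp -d)
mkdir "$workdir/documents"

# Deliberately repetitive sample data; real scans and PDFs shrink far less.
i=0
while [ "$i" -lt 1000 ]; do
    echo "the same line over and over"
    i=$((i + 1))
done > "$workdir/documents/example.txt"

# Size of the plain tar stream vs. the gzipped one, in bytes.
# "-f -" writes the archive to stdout explicitly instead of relying on tar's default.
raw=$(tar -cf - -C "$workdir" documents | wc -c)
zipped=$(tar -czf - -C "$workdir" documents | wc -c)

echo "uncompressed: $raw bytes, gzipped: $zipped bytes"
```

<p>Pointing the two <code class="language-plaintext highlighter-rouge">tar</code> invocations at a real directory instead of the sample one gives a quick idea of whether the <code class="language-plaintext highlighter-rouge">-z</code> flag is pulling its weight.</p>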

<h2 id="encryption">Encryption</h2>

<p>There are confoundingly many ways to encrypt a file – pick your poison. I find that in this kind of context, OpenSSL tends to come in handy – it’s an industry standard, can be found preinstalled on many systems, and is reasonably easy to use, even if you want to use a password instead of a <a href="https://opensource.com/article/21/4/encryption-decryption-openssl">keypair</a>.</p>

<p>After a bit of searching around, together with the <code class="language-plaintext highlighter-rouge">tar</code> command from above, the following invocation…</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">tar</span> <span class="nt">-cz</span> <span class="s2">".../path/to/the/documents/"</span> | openssl enc <span class="nt">-aes-256-cbc</span> <span class="nt">-md</span> sha512 <span class="nt">-pbkdf2</span> <span class="nt">-iter</span> 1000000 <span class="nt">-pass</span> pass:<span class="s2">"correct-horse-battery-staple"</span> <span class="nt">-out</span> <span class="s2">".../temporary/documents.tar.gz.bin"</span>
</code></pre></div></div>

<p>…seems to do the trick (that’s not my actual <a href="https://xkcd.com/936/">password</a>, of course). I found it in a <a href="https://unix.stackexchange.com/a/507132">Unix &amp; Linux Stack Exchange answer</a> which succinctly explains the meaning of the <code class="language-plaintext highlighter-rouge">openssl</code> arguments:</p>

<blockquote>
  <ul>
    <li>
      <p><code class="language-plaintext highlighter-rouge">-aes-256-cbc</code> is what you <em>should</em> use for maximum protection […],</p>
    </li>
    <li>
      <p><code class="language-plaintext highlighter-rouge">-md sha512</code> is a bit the faster variant of SHA-2 functions family compared to SHA-256 while it might be a bit more secure […],</p>
    </li>
    <li>
      <p><code class="language-plaintext highlighter-rouge">-pbkdf2</code>: use PBKDF2 (Password-Based Key Derivation Function 2) algorithm,</p>
    </li>
    <li>
      <p><code class="language-plaintext highlighter-rouge">-iter 1000000</code> is overriding the default count of iterations (which is 10000) for the password […].</p>
    </li>
  </ul>
</blockquote>

<p>There’s no standard file extension for thusly encrypted data, so I chose <code class="language-plaintext highlighter-rouge">.tar.gz.bin</code> to signify that it’s some sort of binary blob containing <code class="language-plaintext highlighter-rouge">.tar.gz</code> data.</p>

<p><em>Note on sensitive data and command-line applications:</em> Because I only ever run this command on my local machine, I don’t worry greatly about passing the encryption password within a command-line argument – on a shared server, where other users might see it briefly pop up in the process list, <a href="https://superuser.com/a/724987">another</a> solution (<em>e.g.</em>, environment variables or a file<sup id="fnref:xargs"><a href="#fn:xargs" class="footnote" rel="footnote" role="doc-noteref">5</a></sup>) would be advisable.</p>
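
<p>As a rough sketch of the environment-variable route: <code class="language-plaintext highlighter-rouge">openssl</code> accepts <code class="language-plaintext highlighter-rouge">-pass env:NAME</code> in place of <code class="language-plaintext highlighter-rouge">-pass pass:…</code>, in which case the password is read from the named variable and never shows up among the command-line arguments. The variable name <code class="language-plaintext highlighter-rouge">BACKUP_PASSWORD</code>, the scratch directory, and the sample file below are assumptions made for illustration only.</p>

```sh
#!/bin/sh
set -eu

# Hypothetical scratch directory with a stand-in document (not the real setup).
workdir=$(mktemp -d)
mkdir "$workdir/documents"
echo "sample document" > "$workdir/documents/example.txt"

# Assumption: the password is provided via an environment variable of our choosing.
# It is hard-coded here only to keep the sketch self-contained.
BACKUP_PASSWORD="correct-horse-battery-staple"
export BACKUP_PASSWORD

# "-pass env:BACKUP_PASSWORD" makes openssl read the password from the
# environment, so it is not part of the argument list of the process.
tar -czf - -C "$workdir" documents \
    | openssl enc -aes-256-cbc -md sha512 -pbkdf2 -iter 1000000 \
        -pass env:BACKUP_PASSWORD -out "$workdir/documents.tar.gz.bin"

echo "encrypted archive written to $workdir/documents.tar.gz.bin"
```

<p>In real use, the variable would be set from a password manager or an interactive prompt instead of a literal in the script. The analogous <code class="language-plaintext highlighter-rouge">-pass file:…</code> form reads the password from the first line of a file.</p>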

<h2 id="distribution">Distribution</h2>

<p>I just copy the resulting <code class="language-plaintext highlighter-rouge">.tar.gz.bin</code> file to a couple of locations – presently, that’s my iCloud Drive<sup id="fnref:paranoia"><a href="#fn:paranoia" class="footnote" rel="footnote" role="doc-noteref">6</a></sup> and the server this website is running on (transferred via <code class="language-plaintext highlighter-rouge">scp</code>). Since the file is encrypted, there’s technically no reason (apart from common sense) not to distribute it widely, assuming that the password’s going to remain<sup id="fnref:passman"><a href="#fn:passman" class="footnote" rel="footnote" role="doc-noteref">7</a></sup> private. <del>Ask me for a copy and I might well send you one!</del></p>

<hr />

<h2 id="and-back-again">…and back again</h2>

<p>In the event of a disaster where my computer (and plethora of hard drives (plus a few other copies I didn’t mention (yes, I’m a digital prepper))) were magically deleted from reality <em>but</em> I’d still be around to retrieve the <code class="language-plaintext highlighter-rouge">.tar.gz.bin</code> file and recall the associated password, the following pipeline would yield the original directory.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openssl enc <span class="nt">-d</span> <span class="nt">-aes-256-cbc</span> <span class="nt">-md</span> sha512 <span class="nt">-pbkdf2</span> <span class="nt">-iter</span> 1000000 <span class="nt">-pass</span> pass:<span class="s2">"correct-horse-battery-staple"</span> <span class="nt">-in</span> <span class="s2">".../temporary/documents.tar.gz.bin"</span> | <span class="nb">tar</span> <span class="nt">-xzf</span> -
</code></pre></div></div>
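<p>A backup is only worth something if it can be restored, so a dry run that lists the archive's contents without extracting anything might be a sensible habit. The sketch below swaps <code class="language-plaintext highlighter-rouge">tar -xzf -</code> for <code class="language-plaintext highlighter-rouge">tar -tzf -</code>. The temporary directory, the sample file and the encryption command (assumed to mirror the decryption flags above) are stand-ins for illustration.</p>

```sh
#!/bin/sh
# Hypothetical stand-ins for the real documents directory and password.
workdir="$(mktemp -d)"
mkdir "$workdir/documents"
printf 'sample\n' > "$workdir/documents/example.txt"
password="correct-horse-battery-staple"

# Assumed encryption step, using the same parameters as the decryption command.
tar -czf - -C "$workdir" documents \
    | openssl enc -aes-256-cbc -md sha512 -pbkdf2 -iter 1000000 \
        -pass pass:"$password" -out "$workdir/documents.tar.gz.bin"

# Dry run: decrypt, then list (-t) the archive's contents instead of extracting (-x).
listing="$(openssl enc -d -aes-256-cbc -md sha512 -pbkdf2 -iter 1000000 \
    -pass pass:"$password" -in "$workdir/documents.tar.gz.bin" | tar -tzf -)"
echo "$listing"
```

<p>If the password or any of the parameters were off, the decryption step would typically fail or <code class="language-plaintext highlighter-rouge">tar</code> would complain about the garbage it receives, so an intact listing is a decent sign that the backup is restorable.</p>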
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:loseblattsammlung">
      <p>With the exception of a few paper originals I’m legally required to keep (or not totally confident I’m <em>not</em>), which I store in order of receipt in a file folder. (Of course, there’s a German word for keeping loose collections of papers: <em>Loseblattsammlung</em>.) <a href="#fnref:loseblattsammlung" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:nas">
      <p>Every once in a while, I entertain the idea of getting a NAS, but noise and power consumption (and feature creep – with an always-on computer, I really ought to set up <a href="https://www.plex.tv">Plex</a> for my definitely-not-<a href="https://en.wiktionary.org/wiki/fall_off_the_back_of_a_truck">fallen-off-the-back-of-a-truck</a> video collection, maybe get into <a href="https://homebridge.io">Homebridge</a>, automate a few random things etc.) tend to shut any fantasies down relatively quickly – so far… <a href="#fnref:nas" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:backblaze">
      <p>In addition to my recurring NAS thought experiments described in the previous footnote, I keep thinking about just setting up <a href="https://www.backblaze.com/cloud-backup">Backblaze</a> (or similar), but apart from the not-insignificant price tag, the cloud backup solutions I’ve looked at don’t implement encryption satisfactorily: I’d only use one that encrypts the data on my machine (most do!) but <em>doesn’t</em> <a href="https://www.backblaze.com/computer-backup/docs/encryption">send the decryption keys to the backup service</a> for convenience’s sake (because then what’s the point of encryption beyond in-transit security?). <a href="#fnref:backblaze" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:german">
      <p>99% of what I do on computers is in English, so in retrospect, I’m not sure why I stuck to German for the naming of subdirectories here. Then again, translating terms like <em>Lohnsteuerbescheinigungen</em>, <em>Sozialversicherungsausweis</em> and <em>Steuerliche Identifikationsnummer</em> into another language might be akin to denying my cultural heritage. <a href="#fnref:german" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:xargs">
      <p>If you’re going to store the password (and nothing else) in a file, be careful not to include any whitespace at the beginning or end of the file (<em>e.g.</em>, a newline character) – or, more resiliently to future inattention when, say, changing the password, just strip off any whitespace. In the command above, <code class="language-plaintext highlighter-rouge">... -pass pass:"$(cat ".../path/to/your/password.file" | xargs)" ...</code> would do the trick. <a href="#fnref:xargs" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:paranoia">
      <p>Where, as far as I can tell, Apple (and whatever American government agencies might be interested) can access it, hence the encryption. <em>I’m not paranoid, you’re paranoid!</em> <a href="#fnref:paranoia" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:passman">
      <p>I keep the encryption password in my password manager (whose <a href="https://keepass.info">KeePass</a>-compatible database is also backed up in various locations). <a href="#fnref:passman" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[I’ve mostly done away with keeping binders of important documents, instead storing everything in a directory on my computer. In addition to my usual backup routine, which comprises…]]></summary></entry><entry><title type="html">Something Wrong With Your `venv`? Just Reset It!</title><link href="https://excessivelyadequate.com/posts/revenv.html" rel="alternate" type="text/html" title="Something Wrong With Your `venv`? Just Reset It!" /><published>2024-04-06T20:00:00+02:00</published><updated>2024-04-06T20:00:00+02:00</updated><id>https://excessivelyadequate.com/posts/revenv</id><content type="html" xml:base="https://excessivelyadequate.com/posts/revenv.html"><![CDATA[<p>Most folks developing Python applications will have experienced instances of a virtual environment <em>(for non-snake-charmers: a directory containing a specific version of the Python interpreter and relevant libraries bound to a project, avoiding dependency hell when working on multiple projects in parallel)</em> that’s been chugging along nicely for months on their local development machine just kind of breaking – be it due to a botched dependency downgrade, a change in the Python setup<sup id="fnref:brew"><a href="#fn:brew" class="footnote" rel="footnote" role="doc-noteref">1</a></sup>, or an even more arcane reason.</p>

<p>It’s not a common occurrence, but it happens.</p>

<p>The easiest way of fixing things, if you’ve been keeping track of your project’s dependencies, tends to be simply deleting the old virtual environment, creating a fresh one, and reinstalling the dependencies. <strong>That’s a sequence of three steps you need to perform in the correct order – so it’s ripe for automation!</strong></p>

<p>Depending on where you keep your virtual environments (in my case: alongside each project, <em>i.e.</em>, created via <code class="language-plaintext highlighter-rouge">python3 -m venv .</code>), how you install dependencies (commonly <code class="language-plaintext highlighter-rouge">pip3 install -r requirements.txt</code>) and any other specifics, a Bash script similar to the following should do<sup id="fnref:alsosetup"><a href="#fn:alsosetup" class="footnote" rel="footnote" role="doc-noteref">2</a></sup> the trick:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

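<span class="c"># 'deactivate' is a shell function, so it only exists when this code runs in the</span>
<span class="c"># shell that activated the venv (i.e., when sourced or wrapped in a function).</span>
<span class="k">if</span> <span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$VIRTUAL_ENV</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">declare</span> <span class="nt">-F</span> deactivate <span class="o">&gt;</span> /dev/null<span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"Deactivating..."</span>
    deactivate
<span class="k">else
    </span><span class="nb">echo</span> <span class="s2">"No venv active in this shell, skipped 'deactivate' step."</span>
<span class="k">fi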
if</span> <span class="o">[</span> <span class="nt">-f</span> <span class="s2">"pyvenv.cfg"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"Nuking old virtual environment..."</span>
    <span class="nb">rm </span>pyvenv.cfg
    <span class="nb">rm</span> <span class="nt">-r</span> bin
    <span class="nb">rm</span> <span class="nt">-r</span> include
    <span class="nb">rm</span> <span class="nt">-r</span> lib
<span class="k">else
    </span><span class="nb">echo</span> <span class="s2">"No 'pyvenv.cfg' file present, skipped nuking step."</span>
<span class="k">fi
</span><span class="nb">echo</span> <span class="s2">"Setting up a fresh virtual environment..."</span>
python3 <span class="nt">-m</span> venv <span class="nb">.</span>
<span class="nb">echo</span> <span class="s2">"Activating..."</span>
<span class="nb">source </span>bin/activate
<span class="k">if</span> <span class="o">[</span> <span class="nt">-f</span> <span class="s2">"requirements.txt"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"Reinstalling from requirements.txt..."</span>
    pip3 <span class="nb">install</span> <span class="nt">-r</span> requirements.txt
<span class="k">else
    </span><span class="nb">echo</span> <span class="s2">"No 'requirements.txt' found, skipped reinstall step."</span>
<span class="k">fi</span>
</code></pre></div></div>

<p>Paste these lines of code (<a href="https://en.wiktionary.org/wiki/modulo">modulo</a> any modifications needed to match your workflow) into a file named <code class="language-plaintext highlighter-rouge">revenv</code>, place it in a directory that’s on<sup id="fnref:path"><a href="#fn:path" class="footnote" rel="footnote" role="doc-noteref">3</a></sup> your <code class="language-plaintext highlighter-rouge">$PATH</code> and make sure the file’s executable: run <code class="language-plaintext highlighter-rouge">chmod u+x revenv</code>, for instance. Then, when you’re in need of resetting a virtual environment, simply <code class="language-plaintext highlighter-rouge">cd</code> to your project’s directory and run <code class="language-plaintext highlighter-rouge">revenv</code>.</p>
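<p>One caveat worth knowing: when <code class="language-plaintext highlighter-rouge">revenv</code> runs as a standalone executable, it gets its own child process, so its <code class="language-plaintext highlighter-rouge">source bin/activate</code> line doesn't carry over to the terminal session you started it from. The following minimal demo (hypothetical file and variable names, no actual virtual environment involved) shows the difference between executing and sourcing a script:</p>

```sh
#!/bin/sh
# Hypothetical demo: a tiny script whose only job is to set a variable.
demo_dir="$(mktemp -d)"
printf 'MARKER="set by child"\n' > "$demo_dir/child.sh"

MARKER="untouched"

sh "$demo_dir/child.sh"     # executed: runs in a child process
after_exec="$MARKER"

. "$demo_dir/child.sh"      # sourced ('.' is the portable spelling of 'source')
after_source="$MARKER"

echo "after executing: $after_exec"    # prints "after executing: untouched"
echo "after sourcing: $after_source"   # prints "after sourcing: set by child"
```

<p>So if you'd like the fresh environment to be active in your session afterwards, running <code class="language-plaintext highlighter-rouge">source revenv</code> (Bash looks up sourced files without a slash in their name on the <code class="language-plaintext highlighter-rouge">$PATH</code>) or wrapping the script's body in a shell function should do it.</p>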

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:brew">
      <p>In my case, this tends to happen when I upgrade some <a href="https://brew.sh">Homebrew</a> package (say, <code class="language-plaintext highlighter-rouge">yt-dlp</code>) whose new version depends (whether technically required or not) on a newer-than-installed Python version. During this transitive upgrade process, previous Python versions sometimes end up being uninstalled (or, at the very least, relevant symlinks get borked). <a href="#fnref:brew" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:alsosetup">
      <p>Since it’s designed to skip the deletion of the old virtual environment if none is present, I’ve also found it handy for bootstrapping a new project. <a href="#fnref:alsosetup" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:path">
      <p>A common choice of directory is <code class="language-plaintext highlighter-rouge">~/local/bin</code> – which you can put on your <code class="language-plaintext highlighter-rouge">$PATH</code> by adding <code class="language-plaintext highlighter-rouge">export PATH="$HOME/local/bin:$PATH"</code> to your <a href="https://linuxize.com/post/bashrc-vs-bash-profile/"><code class="language-plaintext highlighter-rouge">.bashrc</code> and/or <code class="language-plaintext highlighter-rouge">.bash_profile</code></a> – but I tend to instead maintain small utilities like this as functions <a href="https://github.com/doersino/dotfiles/blob/master/.bashrc">in my <code class="language-plaintext highlighter-rouge">.bashrc</code></a>. To each their own. <a href="#fnref:path" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Noah Doersing</name></author><summary type="html"><![CDATA[Most folks developing Python applications will have experienced instances of a virtual environment (for non-snake-charmers: a directory containing a specific version of the Python interpreter and relevant libraries bound to a project, avoiding dependency hell when working on multiple projects in parallel) that’s been chugging along nicely for months on their local development machine just kind of breaking – be it due to a botched dependency downgrade, a change in the Python setup, or an even more arcane reason.]]></summary></entry></feed>