Website Design Choices

While I was building my website, I made some design choices that may affect your experience. For example, I chose to use the Volkorn type face, partially because I enjoy it and partially because it has some cool features.

Other choices will have a stronger effect, and will mostly affect AI fans. Yet others will have a minor effect on everyone.

An Explanation of the Creative Commons #

Before I explain what I did, I’ll start with why I did it, and before I can do that I need to explain Creative Commons licenses.

The Creative Commons is a system of licenses that enable creators to waive some (or, in the case of CC0, nearly all) of their rights to their work, allowing others to use what they made as a foundation for other ideas without fear of lawsuits or the like. It is, as they say, “a ‘some rights reserved’ option for content that would otherwise be closed off.”

Creatives building off of the ideas of others is an ancient practice – that’s why what Tolkein called Mythral was frequently renamed Mythril, Richard Sharpe Shaver’s Dero were retitled Derro – but it’s only recently attained a method of preemptive legality. It’s a tale as old as copyright.

Putting your works in the Creative Commons allows this copyright dance to be skipped, and allows people to properly cite what gave them their ideas. It also requires them to properly cite what gave them their ideas.

As people know, AI data scraping crawlers don’t always (or even usually) cite who made whatever they’re throwing into the black box. With the Creative Commons’ legal requirement to cite the creator of the work, even in the most lenient license variant, this creates an outright violation of the law that opens AI companies up to a lawsuit1. Quote the legal code:

If You Share the Licensed Material (including in modified form), You must:

  • retain the following if it is supplied by the Licensor with the Licensed Material:
    • identification of the creator(s) of the Licensed Material and any others designated to receive attribution, in any reasonable manner requested by the Licensor (including by pseudonym if designated);
    • a copyright notice;
    • a notice that refers to this Public License;
    • a notice that refers to the disclaimer of warranties;
    • a URI or hyperlink to the Licensed Material to the extent reasonably practicable;
  • indicate if You modified the Licensed Material and retain an indication of any previous modifications; and
  • indicate the Licensed Material is licensed under this Public License, and include the text of, or the URI or hyperlink to, this Public License.

Additionally, the legal code for the ND (no derivative) licenses (along with removing the “including in modified form” statement) explicitly states “For the avoidance of doubt, You do not have permission under this Public License to Share Adapted Material” immediately after the above section. Logically, an AI regurgitating parts of these texts would qualify, under the CC definition:

Adapted Material means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image.

Given the habitual lack of attribution for ANY work used to train any AI that I know of, this is an inevitable legal problem that hangs over AI production, held at by only by the single fraying thread of nobody bothering to actually sue them.

What I am doing to help (and why you might not be linked straight to my newest stories) #

Given how I am a Kind And Generous Person, I have elected to help these companies to not break the law. If you look at my robots.txt file, you will see that it is full of AI scraping tools and AI browser crawlers.

This allows me to:

  1. Warn AI scraper tools that Here There Be (legal) Dragons, and
  2. Ask the (presumably) web crawling bots to wait a few months (or forever) before looking at certain high-risk scraping pages.

It’s like clearly labeling my garbage and recycling bins to make sure woodland critters don’t shove their necks into those six-pack rings and start strangling – a great service to society.

And, like making sure your things go in the right bins, sometimes we need to watch out for bears.

How I’m saving the scrapers at risk #

Some scraper creators don’t realize the importance of robots.txt, or chose to intentionally disregard its warnings in the short-sighted hunt for content. Like bears, they will violate seemingly reasonable barriers, and risk health problems from the consumption of junk food/illegally harvested data.

Bears that become aclimatized to human food lose their healthy fear of humans, and will likely be shot. Scraper bots that become aclimatized to illegally harvested data will be shut down when they inevitably get their owners sued.

So, what can be done to prevent this? Simple: we store the dangerous material in a container these creatures cannot breach, so that these creatures can live long, healthy lives. Yet in the case of some scrapers (and some bears), they have found some safe targets – thus, they must be retrained to fear what once fed them.

And that’s all I’m going to say.


  1. Note: I am not actually a lawyer. I just know how to read, as well as that fair dealing policies are far less lenient than the US fair use policies. For example, India’s fair dealing policy (according to wikipedia) only applies in cases of “private or personal use, including research[;] criticism or review[; or] reporting of current events and current affairs, including the reporting of a lecture delivered in public.” Austrailia adds Legal Advice and Parody and Satire to this list. None of these apply to AI content generatiors like chatGPT. ↩︎