Data Hoarding with Friends

I have a collection of images without any copyright. I suspect a lot of people who write books which should have pictures do the same thing.

Searching for the right types of images is boring. Collecting these images is boring (right click, save, copy author name, paste author name into directory, move image to directory, copy image name...). Sifting through these images is also extremely boring.

It's time to get organized.

This Much, I Know...

  • Images should be searchable by tags.
  • All tags should be embedded into the image[1] so that the tags stay with the image, even if the image ends up getting lost somewhere on the internet.
  • The top-level structure must be the licence.
├── CC-BY-4.0
│   ├── Artist_1
│   │   └── image_1.jpg
│   ├── Artist_2
│   │   └── image_2.jpg
│   └── Artist_3
│       └── image_3.jpg
├── CC-BY-SA-4.0
│   ├── Artist_4
│   │   └── image_1.jpg
│   ├── Artist_5
│   │   └── image_2.jpg
│   └── Artist_6
│       └── image_3.jpg
├── CC0
│   ├── Artist_7
│   │   └── image_1.jpg
│   ├── Artist_8
│   │   └── image_2.jpg
│   └── Artist_9
│       └── image_3.jpg
└── GPLv3
    ├── Artist_10
    │   └── image_1.jpg
    ├── Artist_11
    │   └── image_2.jpg
    └── Artist_12
        └── image_3.jpg

Making the licence the first 'door' people walk through makes possible uses clear from the start. So when someone wants old images, no strings attached, they can limit their search easily.
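
Because the licence is the first directory level, limiting a search to certain licences is just a matter of choosing which top-level folders to walk. Here is a minimal Python sketch, assuming the licence/artist/image layout shown above. The function name `images_under` and the list of image extensions are made up for illustration, not part of any existing tool.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the collection holds.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_under(root, licences):
    """Return image paths (relative to root) found under the given licence folders."""
    root = Path(root)
    found = []
    for licence in licences:
        licence_dir = root / licence
        if not licence_dir.is_dir():
            continue  # this collection holds nothing under that licence
        for path in licence_dir.rglob("*"):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                found.append(path.relative_to(root))
    return sorted(found)
```

Someone who wants no strings attached would call something like `images_under("collection", ["CC0"])` and never see the other folders.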

Dividing Art Projects

I'd like to see many people look after different batches of art. Perhaps one collection for 'high fantasy', and another for images which might do well in Call of Cthulhu games.

If even two people work together, we can have a collection twice as large, or twice as well organized.

Separating the work also allows it to become more modular, so that anyone who wants to take part of the collection can just download that part.

Dividing by Genre

So the images might divide like this:

├── CC-BY-SA-4.0
│   ├── CoC
│   │   ├── image_1.jpg
│   │   └── image_2.jpg
│   ├── DnD
│   │   ├── image_1.jpg
│   │   └── image_2.jpg
│   └── Vampire_Dark_Ages
│       ├── image_1.jpg
│       └── image_2.jpg
└── CC0
    ├── CoC
    │   ├── image_1.jpg
    │   └── image_2.jpg
    ├── DnD
    │   ├── image_1.jpg
    │   └── image_2.jpg
    └── Vampire_Dark_Ages
        ├── image_1.jpg
        └── image_2.jpg

But this structure has problems:

  • Call of Cthulhu and Vampire: Dark Ages overlap. Many images will fit both, so we could get repetition.
  • The structure isn't terribly inviting for non-RPG people, and it seems better to leave this project open to anyone who wants to collect art.

Keeping an artist folder in this directory tree would mean three levels of subfolders. That's far too many; better to leave the artist's name to the exif data embedded in the image.
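
As a rough idea of what leaving the artist to the embedded data could look like, here is a small Python sketch that builds an exiftool command line. The function names are hypothetical. It assumes exiftool's usual `-TAG=VALUE` assignment syntax, with `Artist` as the EXIF field and `Keywords` (strictly an IPTC field, not EXIF) holding the tags. Treat it as a sketch, not the project's actual tooling.

```python
import shutil
import subprocess


def exiftool_write_args(image, artist, tags):
    """Build an exiftool command line that embeds the artist and tags in `image`."""
    args = ["exiftool", f"-Artist={artist}"]
    # Repeating -Keywords= writes one list entry per tag.
    args += [f"-Keywords={tag}" for tag in tags]
    args.append(str(image))
    return args


def embed_metadata(image, artist, tags):
    """Run exiftool if it is installed; return True when the command was run."""
    if shutil.which("exiftool") is None:
        return False  # nothing written; exiftool is not on the PATH
    subprocess.run(exiftool_write_args(image, artist, tags), check=True)
    return True
```

Keeping the command-building separate from the running means the arguments can be inspected, or printed for a dry run, before any image is touched.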

Divide by Century of the Art

This has its problems.

  1. We'll have to split some artists across multiple centuries.
  2. Images of the Middle Ages come from every century - people have never stopped thinking about knights, princesses and dragons.

Divide by Century Depicted

This sounds about the best so far, but with one major issue: Aesop. A great many old images have 'whimsical' depictions of educated mouse-meals and well-dressed crabs dancing. No matter what century these belong to, most people don't want these images for their serious work, but those who do want them will want nothing else.

  "I'm trying to write a serious war-game for serious adults, with elves casting fireballs at dragons. I can't have these silly animals in here without a wizard clearly casting his 'Talk with Animals' spell with a magic wand."

    - Me (very serious game designer)

Answers on a Postcard

In summary, I'm still thinking of a way to divide art which allows:

  • Multiple collections, which
  • minimize intersections, and
  • let people focus on some particular work, such as an RPG book, but
  • should not actually be limited to RPG books.

Also, what should the filename of each image be?

  • Name_of_the_Image.jpg?
  • Artist_-_Name_of_the_Image.jpg?
  • Long_description_of_image.jpg?
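
Whichever option wins, the names could be generated rather than typed. Below is a minimal Python sketch covering the first two options. The helper names `slug` and `image_filename` are invented for this example, and the rule of collapsing anything that is not a letter or digit into one underscore is an assumption, not a settled convention.

```python
import re


def slug(text):
    """Collapse runs of non-alphanumeric characters into single underscores."""
    return re.sub(r"[\W_]+", "_", text).strip("_")


def image_filename(title, artist=None, suffix=".jpg"):
    """Build 'Title.jpg', or 'Artist_-_Title.jpg' when an artist is given."""
    name = slug(title)
    if artist:
        name = f"{slug(artist)}_-_{name}"
    return name + suffix
```

With this, `image_filename("Name of the Image")` gives `Name_of_the_Image.jpg`, and passing `artist="Artist 1"` gives `Artist_1_-_Name_of_the_Image.jpg`.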

Technical Details

  • git-lfs
  • Front-ends are a separate matter.
  • If people sign commits with GPG keys, then the project technically does everything NFTs said they were going to do, but didn't. More details here.
  • GitLab lets you upload 1 Gig a month, so this is totally sustainable.
  • exiftool to put data in, and I'll probably make bash scripts to help label things.
  • Databases should never touch the git repository, because we'll get two sources of truth. Better to generate any external records from the files' exif data.
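
To make the last point concrete, here is a hedged Python sketch of generating a throwaway tag index from the embedded data instead of storing one. It assumes exiftool's `-json` and `-r` options for recursive JSON output, and that tags live in the `Keywords` field. The function names are hypothetical, and Python stands in for the bash scripts only for illustration.

```python
import json
import subprocess
from collections import defaultdict


def read_metadata(root):
    """Ask exiftool for the metadata of every file under root, as a list of dicts."""
    out = subprocess.run(
        ["exiftool", "-json", "-r", str(root)],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def tag_index(records):
    """Map each keyword to the sorted list of files that carry it."""
    index = defaultdict(set)
    for record in records:
        keywords = record.get("Keywords", [])
        if not isinstance(keywords, list):
            keywords = [keywords]  # a lone keyword may arrive as a bare value
        for keyword in keywords:
            index[str(keyword)].add(record["SourceFile"])
    return {tag: sorted(files) for tag, files in index.items()}
```

Since the index is rebuilt from the files on demand (`tag_index(read_metadata("collection"))`), it can be deleted at any time and the images remain the only source of truth.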

  [1] This can be accomplished through exif data.