maintenance for reference management

I use Zotero as my reference manager. I use my Zettelkasten to maintain the references that are actually in use, but Zotero holds lots of unused and unmaintained references (i.e. not in my ZK), especially webpages that I store offline just in case the source changes or becomes unavailable.

In Zotero it is easy to find duplicates, which is the only maintenance I do. Zotero allows creating folders and tags to help with search, but I find that maintaining two separate systems is too much effort. I also don't worry much about unused references because I don't see them as a backlog; they're just collecting dust. But I know there is good stuff in there, and occasionally I'd simply like to dig into it and add something interesting to my Zettelkasten.

I think what would help me most is a list of unused references sorted by creation date or, more generally, some way for the two systems to communicate for maintenance purposes only.

In previous discussions, others mentioned that they use the Internet Archive to store webpages, or that they don't store webpages in their RMS at all and instead paste the URL into the note. Some maintain a reading list.

Any thoughts on maintenance?

my first Zettel uid: 202008120915

• @zk_1000 Not something I've thought about much or had cause to use. However, I wanted to "second" the idea of using the Internet Archive and their "Wayback Machine"; they are very useful. I do have zettels with external references to web articles; if longevity is important, I should really be referencing a page stored in the Internet Archive rather than referencing a "live" page.

• @zk_1000

I too use Zotero in a similar way, but I barely need to do any maintenance. Maybe my system could serve as inspiration.

I keep two collections: Deviation and Project Support Material.

The latter contains subcollections. Each is named after a project. I also include a note with the project's name inside. That way, I can find the collection easily. I just look for the note, click on it, hold Alt, and the collection gets highlighted. If I have few projects, I skim instead.

By name, I actually mean an ID. I use an ID to name and locate projects. For example, 10WorkflowPlayingSkullgirls.

The former is for anything else. For example, stuff I picked up online, material I'd like to read, RSS feeds, etc.

I always make a formal Zotero item when I add sources. The exception is if I think that I won't be reading the source in a while. In that case, I create a webpage item, and add the URL and title. This usually happens with Deviation.

Depending on how things are in my life, I work on processing Deviation. For example, right now, I'm taking a year off to solve several life problems. So I spend most of my time doing projects. I still work a little every so often on Deviation so it doesn't pile up too much.

I use two tags: read and unread. Everything I will read gets unread. I swap the tag for read once I've processed the source. More on how this is useful later.

Also, I almost always attach a PDF to my items. I use wkhtmltopdf or Firefox's print option to make PDFs out of webpages. PDFs are useful for making highlights. I keep the PDFs in one folder and link to them from Zotero. The only exception is dictionary entries. I don't need to make any highlights in those cases.
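For anyone who wants to script the wkhtmltopdf step, here is a minimal sketch. It assumes wkhtmltopdf is installed and on the PATH; the file-naming scheme and the function names (`pdf_name_for`, `build_command`, `save_as_pdf`) are hypothetical and not part of the workflow described above.

```python
import re
import subprocess
from pathlib import Path


def pdf_name_for(title: str) -> str:
    """Turn a page title into a safe file name (hypothetical naming scheme)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return (slug or "untitled") + ".pdf"


def build_command(url: str, title: str, pdf_dir: Path) -> list[str]:
    """Build the wkhtmltopdf call: input URL first, output file second."""
    return ["wkhtmltopdf", url, str(pdf_dir / pdf_name_for(title))]


def save_as_pdf(url: str, title: str, pdf_dir: Path) -> Path:
    """Run wkhtmltopdf and return the path of the PDF it was asked to write."""
    command = build_command(url, title, pdf_dir)
    pdf_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)  # raises if wkhtmltopdf fails
    return Path(command[-1])
```

Calling `save_as_pdf("https://example.com", "Example Page", Path("pdfs"))` would write `pdfs/example-page.pdf`; linking that file from the Zotero item remains a manual step in this sketch.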

And then, I do some weekly cleaning. I do four things:

First, remove PDFs for read material. I do a search for anything tagged as read and remove their PDFs. I do this manually, and it is usually fast, but I'll see if I can automate this in the future.

Second, deal with duplicated items. Psych! I don't know how to do that yet. I will eventually.

Third, remove completed projects. I remove the note and the collection. Unread material will either get removed or sent to Deviation. When I remove the collection, I make sure to not remove the remaining items too.

And finally, I clean the trash. Zotero is set to do this automatically, but it doesn't. I'll look into this when I can.
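The first cleaning step (removing PDFs of read material) could be automated roughly as follows. This is only a sketch under several assumptions: the library has been exported to CSV, the export has columns named `Manual Tags` and `File Attachments`, and both use semicolons as separators. Check those names against your own export; the function names are hypothetical.

```python
import csv
import io
from pathlib import Path


def pdfs_of_read_items(csv_text, tag="read",
                       tag_col="Manual Tags", file_col="File Attachments"):
    """Return PDF paths attached to items that carry exactly the given tag."""
    paths = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are assumptions; missing cells are treated as empty.
        tags = {t.strip() for t in (row.get(tag_col) or "").split(";")}
        if tag not in tags:
            continue
        for attachment in (row.get(file_col) or "").split(";"):
            attachment = attachment.strip()
            if attachment.lower().endswith(".pdf"):
                paths.append(Path(attachment))
    return paths


def remove_pdfs(paths, dry_run=True):
    """Delete existing files; with dry_run=True only report what would go."""
    removed = []
    for path in paths:
        if path.exists():
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Running with `dry_run=True` first shows which files would be deleted. Note that this only touches the files on disk; the now-broken links inside Zotero would still have to be removed there.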

My system looks like a pain in the ass to maintain, but it's not. It takes nearly no effort or time for me. So, I hope that this is useful to you in some way.

Ah, before winding up. Sascha Fast told me about their research.org file a while ago in a recent post of mine. It reminded me of Tiago Forte's archive. Both seemed useful to improve Deviation. Maybe you would like to look into this too?

• edited November 28

I agree that files without an item are definitely unread. It is interesting to see how others deal with this.

Thank you for the feedback.

Post edited by zk_1000 on


• @zk_1000 said:

You're welcome.

• I recommend saving a local copy of an online source AND saving a copy in the Internet Archive Wayback Machine. That is what I do, and I consider it a best practice, since if the original source goes offline, then I (and everyone else) will have public evidence in the Internet Archive that the source existed in a certain state at the time of access.

Every year or two I export a list of URLs from my reference manager and run them through a link checker, and if any URLs have permanently disappeared, then I replace them with an Internet Archive link.
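A rough sketch of that check-and-replace routine, using only the Python standard library. It is not the link checker mentioned above. It assumes that a 404 or 410 status is a reasonable signal for "permanently disappeared", and it uses the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) with the response shape as publicly documented; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

AVAILABILITY_API = "https://archive.org/wayback/available"


def status_of(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def looks_gone(status):
    """Assumption: only 404 and 410 count as permanently disappeared."""
    return status in (404, 410)


def availability_query(url):
    """Build the request URL for the Wayback Machine availability API."""
    return AVAILABILITY_API + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(payload):
    """Pick the archived URL out of an availability response, if there is one."""
    closest = payload.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None


def archived_replacement(url, timeout=10):
    """Look up an Internet Archive link for a URL that has disappeared."""
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

For each exported URL, one would call `status_of`, and for those where `looks_gone` is true, try `archived_replacement`. Some servers reject HEAD requests, and unreachable hosts (`None`) may only be down temporarily, so those cases deserve a manual look before replacing anything.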

There are a few websites that I cite that I have learned are notorious for changing their URL structure without using redirects, resulting in frequent broken URLs, and for those websites I now cite an Internet Archive link directly, since I don't trust the original URL to last long.

I don't worry about "unused references" in my reference manager either. I have no idea, nor do I care, which references are "unused" in my general personal hypertext system. However, I do keep a folder in my reference manager for each writing project, so I know which items have been cited in which projects.

• @Andy said:

I don't worry about "unused references" in my reference manager either. I have no idea, nor do I care, which references are "unused" in my general personal hypertext system. However, I do keep a folder in my reference manager for each writing project, so I know which items have been cited in which projects.

It may be helpful if I describe further how I create a folder (collection) in my reference manager for the items that I have cited in each writing project.

The reference manager that I use is BibDesk, but you can also use this technique with Better BibTeX for Zotero. In both BibDesk and Better BibTeX for Zotero, there is a command that shows you all the items that are cited in an .aux file, which is a file that is generated when you compile a LaTeX document. In BibDesk, the relevant command is "Database → Select Publications from .aux File" (or drag the .aux file onto the main list of publications). In Better BibTeX for Zotero, the relevant command is "Tools → Scan BibTeX AUX file for references".
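To make it clearer what those commands look for: with classic BibTeX, the .aux file records each citation as a `\citation{...}` line, sometimes with several comma-separated keys. The sketch below reads those lines. It is not how BibDesk or Better BibTeX implement their commands, the function name `cited_keys` is hypothetical, and biblatex setups may record citations differently.

```python
import re

# Matches \citation{key} and \citation{key1,key2} entries in an .aux file.
CITATION = re.compile(r"\\citation\{([^}]*)\}")


def cited_keys(aux_text):
    """Collect cite keys from \\citation{...} entries, in first-seen order."""
    seen = {}
    for group in CITATION.findall(aux_text):
        for key in group.split(","):
            key = key.strip()
            if key:
                seen.setdefault(key, None)  # dict keeps insertion order
    return list(seen)
```

Passing the contents of a compiled document's .aux file to `cited_keys` gives the list of keys that the document cites, without duplicates.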

If you write in LaTeX, you already have an .aux file when you compile your document. If you are like me, you write in Markdown and cite sources using BibTeX cite keys, so to generate the .aux file you need to use Pandoc to convert the Markdown document to a LaTeX document, and then compile the LaTeX document to get the .aux file.

If you wish (I don't), you could use this technique to find out which items in your reference manager are unused in your zettelkasten. Assuming that you use BibTeX cite keys to cite sources in your zettelkasten, you would need to concatenate all your zettels into one file (for example with the Unix cat command), or else write a script to extract all the cite keys into one file, and then use the technique described above. In BibDesk, after the items from the .aux file are selected, you can use the command "Edit → Invert Selection" to select all the items that are not cited in the .aux file, which would be all the items that are unused in your zettelkasten.
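The "script to extract all the cite keys" could look something like the sketch below. It assumes the zettels are Markdown files in one folder and use Pandoc-style `@citekey` citations. As a shortcut, it writes a minimal file of `\citation{...}` lines instead of running Pandoc and LaTeX; whether BibDesk or Better BibTeX accept such a stripped-down .aux file is an untested assumption. All function names are hypothetical.

```python
import re
from pathlib import Path

# "@key" not preceded by a word character, so e-mail addresses are skipped.
# Punctuation (: . -) is only accepted inside a key, not at its end.
CITE_KEY = re.compile(r"(?<![\w@])@([A-Za-z0-9_]+(?:[:.\-][A-Za-z0-9_]+)*)")


def keys_in_zettels(folder, pattern="*.md"):
    """Collect every cite key that appears in the zettels of one folder."""
    keys = set()
    for path in Path(folder).glob(pattern):
        keys.update(CITE_KEY.findall(path.read_text(encoding="utf-8")))
    return keys


def write_minimal_aux(keys, aux_path):
    """Write one \\citation{key} line per key (assumed to be enough)."""
    lines = [f"\\citation{{{key}}}" for key in sorted(keys)]
    Path(aux_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def unused_keys(library_keys, zettel_keys):
    """Keys that exist in the reference manager but in no zettel."""
    return sorted(set(library_keys) - set(zettel_keys))
```

The regular expression will also pick up other `@word` tokens such as user mentions, but those simply fail to match any item in the reference manager. If the library's cite keys are available as a list, `unused_keys` gives the unused items directly, without going through the .aux route at all.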