Algorithmic link review
I've only begun incorporating zettelkasten into my daily life recently, but I'm a long-time user of spaced repetition and credit it with much of the success I had reaching fluency in Japanese a few years back. At least in my experience thus far, zettelkasten works well alongside the practice of spaced repetition, but I was recently struck by a possibly fruitful way to integrate the core spaced repetition concept into zettelkasten itself.
In the healthcare space, there is a type of software called an MPI (Master Patient Index), which collects information about a healthcare organization's patients and makes it available to the physicians, nurses, technicians, and administrators who may need it. Historically, you needed a staff of people to groom the MPI: a sufficiently large organization, spread across many locations and practitioners, will intake data on the same patients in many different forms. The MPI steward's role was to figure out which records referred to the same person, ensure that the disparate details got compiled and reconciled, and clearly demarcate records that appear similar at first glance but really aren't.
Modern MPIs do as much of this clean-up algorithmically as possible, but you usually still end up with a non-trivial number of record pairs the algorithm is not confident it can evaluate, and these it presents to the steward for a final review and decision. There is, of course, no spaced repetition component here: the steward wants to spend as little time as possible getting deeply familiar with the particulars of any one patient's records--that is the doctor's job once the steward has sorted things out.
But consider how this might apply to zettelkasten. Review is already a critical part of the refinement of zettels, and it has already led me to see links that were not immediately obvious when I made the notes in the first place. What about a mechanism whereby, every day, you are given a certain number of note pairs to evaluate as potential links (or potential unlinks)? You would assign each pair a confidence value--broadly 'Strong Non-Link', 'Weak Non-Link', 'Weak Link', or 'Strong Link'--and presumably take whatever action is appropriate to that judgment. The value would then apply a modifier to the time interval after which you are shown that pair again.
For example: today I am prompted to evaluate the link potential of two pairs, A and B, and A and C. I decide that A and B look like a strong link. I apply the link where and how I believe is appropriate, then mark the pair as a strong link. Because my confidence in this assessment is high, the algorithm will significantly postpone showing me this pairing again. On the other hand, say I think A and C are a weak non-link: I don't see them as linked now, but I also see no compelling reason to be sure I would never link them. The algorithm doesn't postpone showing me that pair again as long as it postpones A and B.
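To make the mechanism above concrete, here is a minimal sketch of such a pair scheduler in Python. Everything in it is an assumption of mine, not something from an existing tool: the judgment names mirror the four confidence values above, and the interval multipliers (strong judgments stretching the interval far more than weak ones) are placeholder numbers an implementer would tune.

```python
from datetime import date, timedelta

# Hypothetical interval multipliers per judgment. Strong judgments
# (link or non-link alike) push the pair far into the future; weak
# ones bring it back comparatively soon. These numbers are illustrative.
MULTIPLIERS = {
    "strong_link": 8.0,
    "strong_nonlink": 8.0,
    "weak_link": 1.5,
    "weak_nonlink": 1.5,
}

def next_review(judgment, current_interval_days=1, today=None):
    """Return (new_interval_days, next_review_date) for a note pair,
    scaling the current interval by the judgment's multiplier."""
    today = today or date.today()
    interval = max(1, round(current_interval_days * MULTIPLIERS[judgment]))
    return interval, today + timedelta(days=interval)
```

Running the A-B/A-C example through this sketch: a 'strong link' verdict on a pair last seen at a 10-day interval reschedules it 80 days out, while a 'weak non-link' verdict on the same interval brings that pair back in 15 days.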
Assuming a well-implemented algorithm, I could see this system saving quite a bit of time over a more 'brute force' review process where one simply goes over a set number of notes every day in some arbitrary order. The associations, or lacks thereof, about which you are most sure need not cost you much time, while those that give you pause appear more regularly. It would also be far more pointed than a straight read-through of the notes: suppose I went through 5,000 zettels at a rate of 10 per day. Even if there were a really striking connection lurking between zettels 9 and 4,328, what's the likelihood that the substance of 9 would still be fresh enough in my head to notice it? If I'm explicitly shown 9 and 4,328 side by side, on the other hand, it might jump out at me.
For cons: with a sufficiently large zettelkasten, the number of pairings would vastly outstrip the review burden of just going through all the cards on their own. You would also see each card n-1 times (where n is the number of zettels in your collection) in the course of one 'cycle' of the review, so especially up front, when the algorithm doesn't have much feedback to work with, you may be beaten over the head with the same piece of information in a way that's frustrating and tedious. There is also the possibility, though I'm not sure how to weigh it, that there is an advantage in 'fuzzy comparison': comparatively faint memory traces of zettel 9 might be fruitful in forming a different sort of association with zettel 4,328 that this approach would inhibit. Finally, the algorithm's efficacy would depend heavily on the user honestly and accurately assigning confidence values, and even then the judgments might be overly self-reinforcing.
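The scale concern is easy to quantify: with n zettels there are n(n-1)/2 distinct unordered pairs, which grows quadratically. A two-line illustration:

```python
def pair_count(n):
    """Number of distinct unordered pairs among n zettels: n(n-1)/2."""
    return n * (n - 1) // 2
```

For the 5,000-zettel collection mentioned earlier, that is 12,497,500 candidate pairs to cycle through, against only 5,000 individual notes in a plain read-through, so aggressive pruning or postponement by the algorithm would be essential, not optional.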
Anyway, this isn't anything I've been working on, just an idea I'd be interested to get feedback on. For my part, if I wanted to build it, I'd probably look to implement it in emacs.