Zettelkasten Forum


Algorithmic link review

I've only begun incorporating zettelkasten into my daily life recently, but I'm a long-time user of spaced repetition and credit it with much of the success I had reaching fluency in Japanese a few years back. At least in my experience thus far, zettelkasten works well alongside the practice of spaced repetition, but I was recently struck by a possibly fruitful way to integrate the core spaced repetition concept into zettelkasten itself.

In the healthcare space, there is a type of software called an MPI (Master Patient Index) which serves to collect information about that healthcare organization's patients and make it available to other physicians/nurses/technicians/administrators who may have specific need for it. Historically, you'd need a staff of people for grooming the MPI, as a sufficiently large organization spread across various locations and practitioners will intake data on the same patients in many different forms. The MPI steward's role would be to figure out which records referred to the same person and ensure that the disparate details get compiled and reconciled, and those that may appear to be similar at first glance but really aren't are clearly demarcated as such.

Modern MPIs do as much of this clean-up algorithmically as possible, but you usually still end up with non-trivial numbers of record pairs that the algorithm is not confident it can evaluate, and these it presents to the steward for a final review and decision. Now, obviously, since the steward would like to spend as little time getting deeply familiar with the particulars of one patient's records as possible--that is the doctor's job once the steward has sorted things--there is no spaced repetition component here.

But consider how this might apply to zettelkasten. Review is already a critical part of the refinement of zettels, and it has already led me to see links that were not immediately obvious when I made the notes in the first place. What about a mechanism whereby every day, you are given a certain number of note pairs to evaluate as potential links (or potential unlinks)? You would then assign some confidence value--broadly 'Strong Non-Link', 'Weak Non-Link', 'Weak Link', or 'Strong Link' (and presumably, take action appropriate to that judgment)--which would then apply a modifier to the time interval after which you would be shown it again.

For example, today I am prompted to evaluate the link potential of two pairs: A and B, and A and C. I decide that A and B look like a strong link. I apply the link where and how I believe is appropriate, and then mark the pair as a strong link. Because my confidence in this assessment is high, the algorithm will significantly postpone ever showing me this pairing again. On the other hand, let's say I think A and C are a weak non-link. I'm not seeing them as linked now, but I don't see any compelling reason to be sure I would never link them. The algorithm doesn't postpone showing me that pair again quite as long as it postpones A and B.
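A minimal sketch of that scheduling step in Python, just to make the idea concrete. The multipliers, the one-day starting interval, and the names `PairState` and `review` are assumptions made for illustration and are not taken from any existing tool; the only property that matters is that a high-confidence judgment pushes the pair further out than a low-confidence one.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical multipliers: strong judgments postpone the pair far more
# than weak ones. The exact numbers are placeholders to be tuned.
MULTIPLIER = {
    "strong_non_link": 6.0,
    "weak_non_link": 2.0,
    "weak_link": 2.0,
    "strong_link": 6.0,
}


@dataclass
class PairState:
    """Review state for one unordered pair of notes."""
    note_a: str
    note_b: str
    interval_days: float = 1.0  # assumed starting interval
    due: date = date.min


def review(pair: PairState, judgment: str, today: date) -> PairState:
    """Stretch the pair's interval by the judgment's multiplier and reschedule it."""
    pair.interval_days *= MULTIPLIER[judgment]
    pair.due = today + timedelta(days=round(pair.interval_days))
    return pair


today = date(2021, 5, 29)
ab = review(PairState("A", "B"), "strong_link", today)
ac = review(PairState("A", "C"), "weak_non_link", today)
print(ab.due, ac.due)  # 2021-06-04 2021-05-31
```

Because the interval is multiplied on every review, a pair that keeps being judged 'strong' drifts out of the queue quickly, while a 'weak' pair keeps coming back at a slower-growing pace.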

Assuming a well-implemented algorithm, I could see this system potentially saving quite a bit of time over a more 'brute force' review process where one just goes over a set number of their notes every single day in some arbitrary order. Those associations or lacks thereof about which you are most sure, you need not lose much time considering, while those that give you pause appear more regularly. It would also be much more pointed than just a straight read-through of the notes: supposing I went through 5,000 zettels at a rate of 10 per day, even if there were a really striking connection lurking between zettels 9 and 4,328, what's the likelihood that the substance of 9 will be fresh enough in my head to notice it? On the other hand, if I'm explicitly shown 9 and 4,328 side-by-side, it might jump out at me.

As for cons: with a sufficiently large zettelkasten, the number of pairings (n(n-1)/2 for n zettels) would vastly outstrip the review burden of just going through all the cards on their own. You would also see each card n-1 times (where n is the number of zettels in your collection) in the course of one 'cycle' of the review, so especially up front, when the algorithm doesn't have a lot of feedback to work with, you may be getting beaten over the head with the same piece of information in a way that's frustrating and tedious. There is also the possibility, though I'm not sure how to weigh this, that there is an advantage in 'fuzzy comparison': comparatively faint memory traces of zettel 9 might be fruitful in forming a different sort of association with zettel 4,328, one that this approach would inhibit. Finally, the algorithm's efficacy would depend heavily on the user honestly and accurately assigning confidence values, and even then the judgments might be overly self-reinforcing.
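To put a rough number on the first of those cons, here is a quick count of unordered pairs for the 5,000-zettel collection used in the earlier example. The budget of 10 pairs per day is an assumption, borrowed from the 10-notes-per-day read-through figure.

```python
from math import comb

n = 5000                 # zettels, as in the example above
pairs = comb(n, 2)       # unordered pairs: n * (n - 1) / 2
pairs_per_day = 10       # assumed daily review budget

print(pairs)                   # 12497500
print(pairs // pairs_per_day)  # 1249750 days for one full pass
```

At that size a single exhaustive pass is out of reach, so some way of narrowing down candidate pairs would presumably be needed before the interval scheduling could do any good.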

Anyway, this isn't anything I've been doing any work on, but just an idea I had that I would be interested to get feedback on. For my part, I'd probably look to implement it in emacs if I wanted to do it.

Comments

  • edited May 29

    From my understanding, OP is looking for a method to add deeper
    connections within OP's Zettelkästen using less time. Currently, OP
    makes new connections wherever he sees a strong connection between
    ideas. I'd like to share my thoughts from the three-level human memory
    point of view.

    The first level lasts only a few seconds and holds only four to seven
    chunks at a time. The second can last a couple of hours. The third
    is what is usually called long-term memory.

    There's a lot of evidence that some part of the human brain works
    like a daemon whose existence our conscious mind may not notice.
    Everyone can connect ideas using first-level memory. Most of the
    time, the connection is simple, but some people do report that they
    experience an "aha" moment when their mind is wandering during less
    intense mental work, e.g. walking, showering, or napping. In my
    view, the "aha" moment is the instant at which you see a strong
    connection between two ideas.

    To cultivate those highlight moments, the second level of memory must
    be engaged, which requires much more concentration on a narrow topic.
    This type of memory is usually hard to develop, especially in this
    era of information overload. I do not have a standard procedure for
    linking my Zettelkästen, but I try to train myself to work in a way
    that engages the second level of memory and properly documents the
    key ideas.

    If the second level of memory is engaged over a couple of weeks, our
    brain will change significantly, such that the third level, i.e.
    long-term memory, forms. I would say that the writing process,
    especially with Zettelkästen, is a hack for forming long-term
    memories. By operating at the second level of memory and properly
    documenting the structure of the ideas, one can gain a much deeper
    understanding of the selected topic.

    The diffuse mode, i.e. a working mode that requires breaking away from
    the study material, is critical for creating novel ideas. Only after
    the second level of memory has been engaged can the diffuse mode help
    to bridge connections between the pieces of information that one has
    interacted with.

    To sum up, my algorithm for linking ideas is to train myself to work
    with second-level memory, document the ideas with structured
    writing, and blend in diffuse mode between those hard sessions.

  • @elem said:
    Assuming a well-implemented algorithm...

    I've been stewing a bit over this for the last couple of days. The main issue I have is figuring out parameters to determine which notes might be interesting to look at. Also, having an algorithm figure out what to show me goes against how I use my Zettelkasten: to help me think. Interacting with any related note is a feature, not something I want to get rid of or hand over to an algorithm. I already have my lattices of thought to work with my Zettelkasten, so finding related notes is really simple.

    That being said, there might be something complementary in being fed -say 3- notes at the start of the day to look at. Bringing us back to the problem, how to identify notes to review?

    What I'm really interested in is "cross-pollination" of ideas. I've found in the past that linking a note somewhere in the middle of a lattice will create one or more new lattices leading to notes that I never would have considered. What's important here is that a lattice is a line of thought, composed of related notes. This also means that a lattice has a start and an end, at least in my Zettelkasten. Loops do occur, but can be broken in a deterministic manner.

    Combining the ideas above, I can do the following:

    • sample three random notes from my Zettelkasten;
    • find all lattices a note is a part of by checking if the start of a lattice has a path to the random note;
    • create a list of all notes that are the end of a lattice. Using only a list of "end notes" is a deliberate choice: figuring out all possible lattices and then eliminating all associated notes is too computationally intensive (takes too long). Using only the end notes gives enough context for me to review the notes;
    • remove all end notes for lattices that the random note is a part of, leaving only "end notes" for lattices that the note is not a part of;
    • sample three random notes from the remaining "end notes".

    This way I end up with three notes for each random note that have no relationship to that note. This allows for review and checking if there might be possible relationships to add. Generation of this list is something that can be added to a daily generated list, like the one created by @Will.
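The steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the actual script: the Zettelkasten is modelled as a dict mapping each note ID to the notes it links to, a lattice "start" is taken to be a note with no incoming links, an "end" is a note with no outgoing links, and the function names are hypothetical.

```python
import random


def reachable(graph, start):
    """Return every note reachable from `start` by following links (including `start`)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:  # the visited set also keeps loops from recursing forever
                seen.add(nxt)
                stack.append(nxt)
    return seen


def unrelated_end_notes(graph, note, k=3, rng=random):
    """Sample up to k end notes of lattices that `note` is not a part of."""
    linked_to = {t for targets in graph.values() for t in targets}
    notes = set(graph) | linked_to
    starts = notes - linked_to                     # assumed: no incoming links
    ends = {n for n in notes if not graph.get(n)}  # assumed: no outgoing links

    own_ends = set()
    for start in starts:
        lattice = reachable(graph, start)
        if note in lattice:            # this start has a path to the note
            own_ends |= lattice & ends

    candidates = sorted(ends - own_ends - {note})
    return rng.sample(candidates, min(k, len(candidates)))


# Toy graph with made-up IDs: 1 -> 2 -> {3, 4}, 5 -> 6 -> 7, 8 -> 4
graph = {"1": ["2"], "2": ["3", "4"], "5": ["6"], "6": ["7"], "8": ["4"]}
print(unrelated_end_notes(graph, "2"))          # ['7']
print(sorted(unrelated_end_notes(graph, "6")))  # ['3', '4']
```

Working only with end notes keeps the cost at one traversal per start note, which matches the reason given above for not enumerating every lattice.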

    Unfortunately, this is a very limited use case, but it's already pretty complex. I like the idea, so I did add it to my own Zettelkasten-script, just to give it a try. I'll try it for a bit and see if it helps me in creating better/more links (haven't yet figured out how to measure that though ...).

    Anything more complex, as described by @elem, will probably mean more programming and keeping state, just to keep track of what the steward has been provided for review. Any hints on what such an algorithm should look like would be appreciated; it seems like fun to try and add that to the script as well.
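As a hedged illustration of how little state might be needed, the sketch below keeps one record per note pair in a JSON file and picks the pairs that are due today. The file layout, the field names `interval_days` and `due`, and all function names are assumptions made for this example; the note IDs are only borrowed from the sample output in this thread.

```python
import json
from datetime import date
from pathlib import Path


def pair_key(a: str, b: str) -> str:
    """Order-independent key, so (A, B) and (B, A) share one record."""
    return "|".join(sorted((a, b)))


def load_state(path: Path) -> dict:
    """Read the review state, or start empty if the file does not exist yet."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_state(path: Path, state: dict) -> None:
    """Write the review state back to disk as readable JSON."""
    path.write_text(json.dumps(state, indent=2, sort_keys=True))


def due_pairs(state: dict, today: date, limit: int = 3) -> list:
    """Return up to `limit` pair keys whose due date has arrived, oldest first."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    due = [k for k, rec in state.items() if rec["due"] <= today.isoformat()]
    return sorted(due, key=lambda k: state[k]["due"])[:limit]


state = {
    pair_key("399", "331"): {"interval_days": 6, "due": "2021-06-07"},
    pair_key("305", "129"): {"interval_days": 2, "due": "2021-06-01"},
}
print(due_pairs(state, date(2021, 6, 2)))  # ['129|305']
```

A daily script would then call `load_state`, show the pairs returned by `due_pairs`, update each record according to the judgment given, and finish with `save_state`.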

  • And below is a quick example of what my "today" output looks like (thank you @Will for the awesome idea!), including the notes to review ("Recensie met context"). Let me know if I should translate; my Zettelkasten is in Dutch, with the exception of the English terminology contaminating everything :-).

    ---
    date: '2021-06-01T21:58:58.654585'
    title: Vandaag
    ---
    
    712 dagen totdat ik 40 ben.
    13898 dagen sinds dat ik geboren ben.
    460 dagen sinds de eerste COVID-besmetting in Nederland.
    
    ---
    
    505 *notities*
    1190 *relaties* (gemiddeld 4 relaties per notitie)
    68619 *woorden*
    
    ---
    
    **5 notities in de INBOX**
    
    ## Recensie met context
    
    1. |01| [Aandacht ombuigen met desinformatie](399)
        * |00| [Qubes OS: a reasonably secure operating system](331)
        * |00| [Risico, defense in depth en kaas](147)
        * |00| [Metacommunicatie](51)
    2. |00| [Uitdagingen van CRM-programma's](305)
        * |00| [Plaatjes voor praatjes](304)
        * |00| [Zelfvoedend uitstelgedrag](129)
        * |00| [Lockpicking](85)
    3. |00| [Verbeteren van de nauwe blik op waarde](335)
        * |00| [Shakshuka](402)
        * |00| [Uitschudden van de kaartenbak](124)
        * |00| [Datamodel voor omnichannel transactieverwerking](155)
    
    ## 6 nieuwe/aangepaste notitie(s) in de afgelopen zeven dagen.
    
    1. |15| [Stakeholder](209)
    2. |05| [Governance](213)
    3. |04| [Streaming Architecture](496)
    4. |03| [Strengholt, P. (2020): Data Management at Scale](461)
    5. |02| [Verantwoord digitaliseren via digitale architectuur](378)
    6. |02| [Instemming en consensus](521)
    
  • edited June 2

    @r1tger said:
    That being said, there might be something complementary in being fed -say 3- notes at the start of the day to look at. Bringing us back to the problem, how to identify notes to review?

    @r1tger, does your zettelkasten have a single focus, or does it reflect varied interests?

    If I had only time to review three notes at the start of the day, I’d choose three recently created notes. Three from last week. I reason that energy and momentum are reinforced if you continue working within your current ideas and memes. And expanding on the work you are already immersed in feels like it has a greater payoff than occasional random serendipity. Choosing to review recent notes aligns with the adage that reading, studying, and note-taking should follow interests. But interests do migrate, so you might not want to review notes from 28 weeks ago in your daily review. Your area of study might have moved on.

    Random notes can lead to threads that no longer align with current areas of ideation (interests). While it may be interesting and fruitful to review an old random note or two once in a while, more value is gained by looking at and reviewing what was created recently.

    For example, I would rather review a note from last week, Phrasing Ideas Clearly Is Work And Has Learning Value, because it relates to ideation, which is a current study area and so dovetails with my current interests, than a random note like The Methuselah Forest, as interesting as bristlecone pines are.

    Picking three notes from last week will show, over time, how interest migrates. I see this on a longer time scale. Reviewing the notes from this day last year and two years ago, I get a nostalgic view of the past, and sometimes I'm almost embarrassed by what I thought noteworthy back then.

    I review yesterday’s new notes each morning, and sometimes there might be 10 notes if I had a productive day yesterday, onboarding an interesting book or paper. Some mornings I am greeted with zero new notes from yesterday. It's very random, but I've averaged 2.2451820128 new notes/day.


    Will Simpson
    I'm a zettelnant.
    Research areas: Attention Horizon, Productive Procrastination, Dzogchen, Non-fiction Creative Writing
    kestrelcreek.com

  • @Will said:
    @r1tger, does your zettelkasten have a single focus, or does it reflect varied interests?

    I have multiple interests, ranging from work-related (mostly IT and change management stuff), personal (favorite restaurants and recipes) and some beginner-level philosophy. This creates insight that is sometimes farfetched, and sometimes valuable. I'm still very happy it produces anything useful at all :-).

    If I had only time to review three notes at the start of the day, I’d choose three recently created notes.

    This one really resonates with me! You are absolutely right, reinforcing what I'm already working with does make much more sense.

    I review yesterday’s new notes each morning, and sometimes there might be 10 notes if I had a productive day yesterday, onboarding an interesting book or paper. Some mornings I am greeted with zero new notes from yesterday. It's very random, but I've averaged 2.2451820128 new notes/day.

    I already have a list of new/modified notes for the last 7 days, so adding three notes for review to each new/modified note should not produce an enormous list. I'm still building the habit of working with my Zettelkasten on a regular basis. I tend to have peaks of productivity. I'll give this a try and see where I end up.
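To make that combination concrete, here is a small Python sketch: take the notes changed in the last seven days and attach up to three review candidates to each. Everything in it is an assumption for illustration, including the function names, the made-up timestamps, and the plain random choice of candidates, which merely stands in for whatever selection of unrelated notes the real script uses.

```python
import random
from datetime import datetime, timedelta


def recent_notes(modified, now, days=7):
    """Note IDs changed within the last `days` days, newest first."""
    cutoff = now - timedelta(days=days)
    recent = [n for n, ts in modified.items() if ts >= cutoff]
    return sorted(recent, key=lambda n: modified[n], reverse=True)


def review_list(modified, now, per_note=3, rng=random):
    """Attach up to `per_note` review candidates to each recent note."""
    result = {}
    for note in recent_notes(modified, now):
        others = sorted(set(modified) - {note})
        # Placeholder selection: a plain random sample of the other notes.
        result[note] = rng.sample(others, min(per_note, len(others)))
    return result


# Made-up modification times for a handful of note IDs.
modified = {
    "209": datetime(2021, 5, 31),
    "213": datetime(2021, 5, 28),
    "51": datetime(2021, 1, 4),
    "85": datetime(2021, 2, 9),
    "124": datetime(2021, 3, 15),
}
for note, candidates in review_list(modified, datetime(2021, 6, 2)).items():
    print(note, candidates)
```

With six new or modified notes in a week, as in the sample output earlier in the thread, this yields at most 18 candidates, which is why the list stays manageable.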

    Thank you so much for your input!

  • @r1tger said:
    I already have a list of new/modified notes for the last 7 days, so adding three notes for review to each new/modified note should not produce an enormous list.

    I'm not sure I understand.

    I'm still building the habit of working with my Zettelkasten on a regular basis. I tend to have peaks of productivity. I'll give this a try and see where I end up.

    This is huge! Try different things. Seeing where they lead, jettisoning the crap, and stealing the valuable nuggets is the second key to habit development. The first is to start small and let the habit grow at its own rate. 80% of habit formation is showing up. The other 20% is the routine of showing up regularly.

    Will Simpson
    I'm a zettelnant.
    Research areas: Attention Horizon, Productive Procrastination, Dzogchen, Non-fiction Creative Writing
    kestrelcreek.com

