Measuring the ZK

System · October 2022

This discussion was created from comments split from: A milestone of sorts, what a ride it's been.

Will · October 2022

@Sascha said:
If you want to go down the rabbit hole, you might measure the "median depth of connection" by measuring the median length of trails of connections.

"The rabbit hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well."
~ Alice's Adventures in Wonderland

I'm a little unclear, as usual. This time about what is meant by "median depth of connection."
Let me explain my understanding, and maybe you can set me straight.

I have a zettel titled Workload Management 202210060815, and it has six links in it.

Link	Number of connections
1	2
2	15
3	8
4	10
5	10
6	8

So in my thinking, the zettel Workload Management 202210060815 has a median depth connection of NINE.

I randomly picked another zettel, Crisis In Environments Of Enclosure 202106180709, and it has six links.

Link	Number of connections
1	4
2	6
3	9
4	9
5	6
6	10

So in my thinking, the zettel Crisis In Environments Of Enclosure 202106180709 has a median depth connection of SEVEN point FIVE.

Is this what you are getting at? This process goes only one layer deep. What does the "median depth of connection" reveal about the zettel? How does knowing the "median depth of connection" of these two zettels help in understanding them?
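As a quick check of the arithmetic in the two tables, here is a minimal Python sketch. The variable names are made up for illustration; the counts are copied from the tables.

```python
from statistics import median

# Connection counts of the six links in each zettel, copied from the tables.
workload_management = [2, 15, 8, 10, 10, 8]
crisis_in_environments = [4, 6, 9, 9, 6, 10]

# With an even number of values, statistics.median averages the two middle ones.
print(median(workload_management))     # 9.0
print(median(crisis_in_environments))  # 7.5
```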

Sascha · October 2022

Oh, I forgot my explanation.

If you use this image for reference:

"Ring" has one trail with an average length of 6
"Baum" has three trails with an average length of 2 (each trail is two edges long)

If you collect all possible trails in your ZK and then take the median, you could measure the median length of thought trails.

So, I don't mean it as a trait of an individual note but as a trait of your entire ZK.
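One possible reading of this, sketched in Python under loud assumptions: a "trail" is taken to be a path that starts at a note nobody links to and follows links until it hits a dead end, and the tree-shaped example graph is hypothetical (a guess at a "Baum"-like shape with three trails of two edges each). A ring has no starting note under this reading, so it is not covered here.

```python
from statistics import median

def maximal_trails(links):
    """Return every path that starts at a note without incoming links and
    follows links until it cannot continue without revisiting a note."""
    targets = {t for outs in links.values() for t in outs}
    sources = [n for n in links if n not in targets]
    trails = []

    def walk(node, path):
        nexts = [t for t in links.get(node, []) if t not in path]
        if not nexts:
            trails.append(path)  # dead end: the trail is complete
        for t in nexts:
            walk(t, path + [t])

    for s in sources:
        walk(s, [s])
    return trails

# Hypothetical tree: note -> notes it links to.
baum = {"root": ["a", "b"], "a": ["c", "d"], "b": ["e"], "c": [], "d": [], "e": []}

trails = maximal_trails(baum)
lengths = [len(t) - 1 for t in trails]   # length = number of edges
median_trail_length = median(lengths)
print(lengths, median_trail_length)      # [2, 2, 2] 2
```

The exhaustive walk grows quickly on densely linked collections, so this is only meant to pin down the definition on small examples.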

ZettelDistraction · October 2022

By a trail, do you mean a directed path $(a_0,\ldots,a_n)$ where $a_i$ links to $a_{i+1}$?

Perhaps the average shortest path-length would be useful--there are several implementations in R, python, SageMath etc. It should be routine to assemble a data structure from a ZK to input to one of these algorithms. Suppose while gaining experience with the algorithmic libraries available, you began a project to collect, compute and display network statistics of ZKs. A project to collect and present ZK network statistics would be a magnet for network researchers if it were hosted here. A future rollout of The Archive could facilitate this. I'm not aware of any other ZK or ZK-related software development effort to compute these statistics.^a
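To give an idea of how routine the assembly step could be, here is a rough Python sketch. It assumes that note IDs are 12-digit timestamps (like 202210060815) and that any such ID appearing in a note's text counts as a link; the function name and the note bodies are hypothetical.

```python
import re

# Assumption: note IDs are 12-digit timestamps such as 202210060815.
ID_PATTERN = re.compile(r"\b\d{12}\b")

def build_link_graph(notes):
    """Map each note ID to the sorted IDs of other known notes mentioned in its text."""
    graph = {}
    for note_id, text in notes.items():
        mentioned = set(ID_PATTERN.findall(text))
        graph[note_id] = sorted(m for m in mentioned if m != note_id and m in notes)
    return graph

# Hypothetical note bodies, for illustration only.
notes = {
    "202210060815": "Workload Management. See [[202106180709]] and [[202201010000]].",
    "202106180709": "Crisis In Environments Of Enclosure. Related: [[202210060815]].",
    "202201010000": "A note without outgoing links.",
}

graph = build_link_graph(notes)
for note_id, outgoing in graph.items():
    print(note_id, "->", outgoing)
```

The resulting dictionary of adjacency lists is a shape most graph libraries can ingest as a directed graph.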

It's hard to say without trying whether these statistics will help someone decide what to add and what to connect in their ZK, given what they want to get out of it. Often what they want is writing.

Why create a ZK? My answer now is that I want the ZK to support certain research projects. How so? It should help me to answer the 29 sets of questions from Carl Wieman's How to become a successful physicist. What kind of help? Help with the development of a predictive framework for deciding the answers to the 29 sets of questions that must be addressed for such projects to be successful.^‡

We found that all the experts organized their disciplinary knowledge in a way that was optimized for making decisions. We describe that knowledge-organization structure as a “predictive framework.”

For a ZK to be useful, it ought to facilitate the development of such predictive frameworks for writing, problem solving, etc. I'm assuming the obvious: that these activities require decision making, and that "... knowledge-free problem-solving is a meaningless concept."^† At least, I can't think of a better process for my own purposes than that given in How to become a successful physicist. A ZK had better add something to this effort.

Perhaps a project to gather network statistics of ZKs would offer some evidence. Such a project should address the 29 sets of questions of How to become a successful physicist. A better test would be whether following the guidelines with and without a ZK makes a difference.

^a In addition to the average shortest path-length, there are other measures: László Gulyás, Gábor Horváth, Tamás Cséri, and George Kampis, "An Estimation of the Shortest and Largest Average Path Length in Graphs of Given Density." But I would start by collecting graphs and computing network statistics from them with the available libraries.

^†You don't need a Nobel Laureate to state the obvious, but it can help to have their endorsement. In this case, the process is not obvious, and you need the Nobel Laureate to state it.

^‡For some projects, a subset of the 29 will do.

GeoEng51 · October 2022

I'm not sure what "median depth" would actually mean, although someone who is mathematically talented (like @ZettelDistraction ) could likely come up with a mathematical definition. It seems there are a couple of qualities, though:

The number of independent "chains" of zettels connected to a particular zettel.
The greatest length of one of the chains.
The average length of all the chains (maybe this is what @Will meant by median depth?)
The complexity of the network of all chains (not even sure how you would determine that, but maybe using patterns such as those shown by @Sascha )
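The first three qualities in the list above could be computed per zettel roughly as follows. This is a sketch only: a "chain" is assumed to be a path of links that starts at the chosen zettel, never repeats a note, and cannot be extended further, and the small example graph is hypothetical.

```python
def chains_from(links, start):
    """List every chain that starts at `start`, never repeats a note,
    and cannot be extended any further."""
    chains = []

    def walk(node, path):
        nexts = [t for t in links.get(node, []) if t not in path]
        if not nexts:
            chains.append(path)  # nothing left to follow
        for t in nexts:
            walk(t, path + [t])

    walk(start, [start])
    return chains

# Hypothetical links: zettel -> zettels it links to.
links = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["E"], "E": []}

lengths = [len(c) - 1 for c in chains_from(links, "A")]  # edges per chain
number_of_chains = len(lengths)
longest = max(lengths)
average = sum(lengths) / len(lengths)
print(number_of_chains, longest, average)  # 2 3 2.0
```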

There could certainly be other characteristics of a network.

Is this an algebraic topology problem? Not that I know anything about that; I just happened across the term in this article a while back:

https://www.technologyreview.com/2016/08/24/107808/how-the-mathematics-of-algebraic-topology-is-revolutionizing-brain-science/

Mike_Sanders · October 2022

Wasn't it Drucker who wrote: 'You can't manage what you can't measure'? I don't remember. Another method that (speaking only for myself) makes sense to the end user:

/*
   pseudo code (this is language dependent)

   increment node count within a given hub...
   for each node in hub y++;

   hmm... should that be recursive?

*/

then use formula below for tracking change...

   x = previous node count
   y = current node count
   z = percent of change

   z = (x - y) / y * 100

example 1 (decrease)...

   4 node count last month
   3 node count this month
   -25% = (3 - 4) / 4 * 100

example 2 (increase)...

   3 node count last month
   5 node count this month
   +66.67% = (5 - 3) / 3 * 100

example 3 (equilibrium)...

   5 node count last month
   5 node count this month
   0% = (5 - 5) / 5 * 100

commandline example...

   echo 5 3 | awk '{printf "%.2f%%\n", (($1 - $2) / $2) * 100}'

javascript example...

   function metrics(x, y) {

   /*
     formula for tracking change...
     x = node count last month
     y = node count this month
     z = percent of change (2 decimal places)
   */

     var z = (((x - y) / y) * 100).toFixed(2) + "%";

     return z;

   }

   console.log(metrics(5, 3));

Mike_Sanders · October 2022

Typo in my post above (x & y transposed in description ) ought to read...

/*
   pseudo code (this is language dependent)

   increment node count within a given hub...
   for each node in hub x++;

   hmm... should that be recursive?

*/

then use formula below for tracking change...

   x = current node count
   y = previous node count
   z = percent of change

   z = (x - y) / y * 100

example 1 (decrease)...

   4 node count last month
   3 node count this month
   -25% = (3 - 4) / 4 * 100

example 2 (increase)...

   3 node count last month
   5 node count this month
   +66.67% = (5 - 3) / 3 * 100

example 3 (equilibrium)...

   5 node count last month
   5 node count this month
   0% = (5 - 5) / 5 * 100

commandline example...

   echo 5 3 | awk '{printf "%.2f%%\n", (($1 - $2) / $2) * 100}'

javascript example...

   function metrics(x, y) {

   /*
     formula for tracking change...
     x = node count this month
     y = node count last month
     z = percent of change (2 decimal places)
     note: y must be non-zero, otherwise z is "Infinity%" or "NaN%"
   */

     const z = (((x - y) / y) * 100).toFixed(2) + "%";

     return z;

   }

   console.log(metrics(5, 3)); // prints "66.67%"

Will · October 2022

@iamaustinha said:
@Will, looks like I jumped back into the Forums at just the right time to congratulate you! Looking forward to when my Zettelkasten is as mature as yours!

@iamaustinha, thank you for your kind comments. Welcome back. A zettelkasten does mature as it grows from infancy to old age. I'm currently parenting a three-year-old with all the classical pleasures, surprises, and sorrows.

Sascha · October 2022

For further explanation @GeoEng51 @ZettelDistraction:

I am not sure what the metric will actually tell us. Right now, I feel, the community is left with the number of notes as the single metric for the Zettelkasten.

But there are quite a few other possible metrics that could help to make some justified judgements about the nature of one's Zettelkasten.

I called the median length "depth of connection" just based on my intuition. I suspect that thinking in that direction could yield something usable.

I wonder what can be said about one's Zettelkasten when you have a number of those metrics and collect them automatically. Perhaps there is something more sophisticated than my clunky way of finding that structure notes improved my note production.

GeoEng51 · October 2022

@Sascha I think what really takes me the most time and most improves my ZK is finding all the "good" links between zettels. That requires constant work and review but pays the most dividends. The more time I spend on that, I believe the more complex my ZK web becomes, which one can view from a graphical map of all connections. Perhaps there could be a metric that is based simply on the apparent complexity of a connection map (e.g., an automated visual assessment of the map)? I'm thinking a computer program that "looks" at the map and then assesses its complexity.

ZettelDistraction · October 2022

@GeoEng51 said:
I'm not sure what "median depth" would actually mean, although someone who is mathematically talented (like @ZettelDistraction ) could likely come up with a mathematical definition. It seems there are a couple of qualities, though:

The average length of a path is well known enough to have a definition on Wikipedia.
https://en.wikipedia.org/wiki/Average_path_length
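For reference, the definition given there is $l_G = \frac{1}{n(n-1)} \sum_{i \ne j} d(v_i, v_j)$, where $d$ is the shortest-path distance and $n$ the number of nodes. A minimal pure-Python sketch of it follows, using breadth-first search on a directed link graph. It assumes that pairs with no connecting path contribute 0 to the sum; the function name and the ring example are illustrative.

```python
from collections import deque

def average_path_length(links):
    """Average shortest-path length over all ordered pairs of distinct notes.
    Pairs with no connecting path contribute 0 to the sum."""
    nodes = set(links) | {t for outs in links.values() for t in outs}
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0
    for source in nodes:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for t in links.get(node, []):
                if t not in dist:
                    dist[t] = dist[node] + 1
                    queue.append(t)
        total += sum(dist.values())
    return total / (n * (n - 1))

# Directed ring of six notes: each note links to the next one.
ring = {i: [(i + 1) % 6] for i in range(6)}
print(average_path_length(ring))  # 3.0
```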

There could certainly be other characteristics of a network.

Is this an algebraic topology problem? Not that I know anything about that; I just happened across the term in this article a while back:

https://www.technologyreview.com/2016/08/24/107808/how-the-mathematics-of-algebraic-topology-is-revolutionizing-brain-science/

The networks in the brain are orders of magnitude more complicated than every other Zettelkasten network except for @Sascha's, which exceeds that of the most interconnected human brain by the same ratios.

The algebraic topology in the paper is nice--the novelty is in the application more than in the mathematics.

Sascha · October 2022

@GeoEng51 I am still sceptical about the graph view since I have never witnessed any convincing example of its use. However, I am too ignorant of the possibilities of what can be achieved by computers.

I am still tinkering and collecting with all the measuring because I think it is way too early to come out with definitive claims. (I don't know how to judge the median length of thought trails. It could be "more is better" or a domain-specific optimal length or even "shorter is better".)

I am not even sure what complexity means regarding the ZK if one leaves the thankful realm of normal language.

I think what really takes me the most time and most improves my ZK is finding all the "good" links between zettels.

Perhaps I should backtrack a little from my position. Perhaps there is a use case in which connections between huge note clusters are exceptionally promising to review? A similar use case might be to look at the graph view, spot clusters that are not interconnected, and review whether one missed something.

But I am very biased to think that the on-the-ground view is paramount. I am focussing on the individual connection since I cannot build a mental bridge from the individual connection between two ideas to some general trait of connections that could be used to access the content of the ZK in a meaningful (knowledge creation) way. To me, the graphical view is one step too far into the realm of abstraction.

I accumulate tidbits from the most extreme end of the spectrum (like general traits of networks) in the hope that something emerges when I don't have so many initial biases available to me.

emps · December 2022

Does anyone know what's the difference between Bus and Vollvermascht?
To me they look like the same topology - "everything is connected to everything, 1 link apart".

Sascha · December 2022

The difference seems to be that in the Bus all nodes are connected by one edge. So, the Bus seems more similar to Stern (to me): the only difference is that there is no node in the middle. (I focus more on the gestalt than on particular traits.)

Please, correct me dear mathematicians!

emps · December 2022

That's correct if graph edges are objects themselves, not just a way to map connections.
I hadn't considered the possibility of edges being objects.

ZettelDistraction · December 2022

Bus is a 6-uniform hypergraph with a single "hyperedge," whereas Vollvermascht, the complete graph on 6 nodes, has 15 edges. All of them are hypergraphs, and all but Bus are ordinary graphs: Bus is a hypergraph but not a graph. The ordinary graphs are 2-uniform (each of their edges has exactly 2 nodes), whereas Bus is 6-uniform: it has one edge with six nodes, so all of its edges have six nodes.
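The contrast can be made concrete with a small Python sketch that stores each topology as a list of edges, each edge being a set of nodes. The node labels 0 to 5 and the variable names are just for illustration.

```python
from itertools import combinations

nodes = range(6)

# Vollvermascht: the complete graph, one 2-node edge per pair of nodes.
vollvermascht = [frozenset(pair) for pair in combinations(nodes, 2)]

# Bus: a single hyperedge that contains all six nodes.
bus = [frozenset(nodes)]

# Number of edges, followed by the set of edge sizes that occur.
print(len(vollvermascht), {len(e) for e in vollvermascht})  # 15 {2}
print(len(bus), {len(e) for e in bus})                      # 1 {6}
```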

r1tger · December 2022

@ZettelDistraction said:
The average length of a path is well known enough to have a definition on Wikipedia.
https://en.wikipedia.org/wiki/Average_path_length

Huh, turns out NetworkX has an algorithm built-in already for this.

Taking all the weakly connected components of the directed graph produced by my Zettelkasten and calculating the average (weighted) shortest path length I've now learned that for my Zettelkasten, it is 26. So, hurray? :-).

My Zettelkasten is completely interconnected (~700 notes now) with 4 unconnected notes, which are not taken into account.
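For readers without NetworkX at hand (it ships `weakly_connected_components` and `average_shortest_path_length`), the component step can be sketched in plain Python. This is an illustration of the idea, not the code used for the numbers above; the function name and the example links are hypothetical.

```python
def weakly_connected_components(links):
    """Group notes into components, ignoring the direction of links."""
    neighbours = {}
    for src, outs in links.items():
        neighbours.setdefault(src, set())
        for dst in outs:
            neighbours[src].add(dst)
            neighbours.setdefault(dst, set()).add(src)

    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical links: note -> notes it links to. "D" is an unconnected note.
links = {"A": ["B"], "B": ["C"], "C": [], "D": [], "E": ["A"]}

components = weakly_connected_components(links)
connected = [c for c in components if len(c) > 1]  # leave out unconnected notes
print(len(components), len(connected))  # 2 1
```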

Sascha · December 2022

Taking all the weakly connected components of the directed graph produced by my Zettelkasten and calculating the average (weighted) shortest path length I've now learned that for my Zettelkasten, it is 26. So, hurray? :-).

I think in some midterm future we should actually perform a study on various Zettelkastens.

GeoEng51 · December 2022

@Sascha said:
I think in some midterm future we should actually perform a study on various Zettelkastens.

I support that idea!

Mike_Sanders · December 2022

@Sascha said: I think in some midterm future we should actually perform a study on various Zettelkastens.

Using GNUPlot[1] to visualize my indices. Very handy tool if one's software stack can emit an index. Available for Nix/Mac/Win. Example scripts[2] & older but handy cheatsheet[3].

Here's a quick screenshot, have fun!
