# Tinybase

edited October 2022

Just gauging interest...

In the coming weeks I'm considering releasing my hyper-minimal take on Zettelkasten to a wider audience. My rationale has always been to squash complexity in favor of a quick & lightweight method of retrieving notes (nothing at all wrong with other apps that have all the bells & whistles btw). The same codebase runs on any of Windows, Linux, BSD (Apple likely to be included soon as well) with zero issues.

Some screen shots...

The interface as you noticed is commandline driven & features a REPL (Read Evaluate Print & Loop) to process commands. As it stands now, the source code is merely 27 Kilobytes in size & would be great to toss on a USB drive so your notes are easy to access. No smartphone interface as I'm not much interested in that domain. At any rate, if this sounds like something you might be interested in, check in every now & again & I'll hopefully post a download link eventually.

«1

• Okay, some great progress the last couple of days (screen shot at end of post - click for full image). Thus far...

• compound queries with a single invocation when comma-delimited: note x, note y, note z

• executive summary (I simply call it a header) at the top of each result

• line numbers (these can be toggled on/off within the REPL)

coming up...

• file metrics

• load & parse multiple files

• drag & drop (windows only?)

• I quench my thirst for plain text interfaces with Emacs nowadays, but this looks lovely and brings back DOS memories I haven't even had

Author at Zettelkasten.de • https://christiantietze.de/

• Looks very nice. Is this .txt only? No .md?

.Erdős #2. GitHub

• @ctietze said:
I quench my thirst for plain text interfaces with Emacs nowadays, but this looks lovely and brings back DOS memories I haven't even had

Many thanks & good ol' emacs, probably the original digital assistant. Went through my LISP phase at least a decade ago. Recalls a quote I once read...

I'm always delighted by the light touch and stillness of early programming languages. Not much text; a lot gets done. Old programs read like quiet conversations between a well-spoken research worker and a well-studied mechanical colleague, not as a debate with a compiler. - Jocelyn Ireson-Paine

You know, I visited your personal website awhile back. Very interesting reading.

• @ZettelDistraction said:
Looks very nice. Is this .txt only? No .md?

Hi ZD, nice to speak with you again.

No markdown support (but doesn't choke on it either). Best to think of a tinybase data file as a notebook or container of sorts that wraps your data. Here's an example...

tag: 001, project x, notes

line 1
line 2
line 3

tag: 002, project y, to do

line 1
line 2
line 3


Screen shot below shows current iteration...

I'm using bit fields to control the REPL. For instance, to toggle profile settings (included manual explains all), simply flip a bit on/1 or off/0:

invoking .p110 translates as:

1st bitfield 1 = headers on
2nd bitfield 1 = line numbers on
3rd bitfield 0 = tag index unsorted


File metrics are complete now as well (even a nifty md5sum to verify shared files).

Hope to be posting a detailed explanation of Tinybase soon. 🙂

• File metrics complete...

(1st screen shot shows metrics, 2nd screen shot for fun). md5sum is handy when you need to confirm with another that a shared file is byte-for-byte identical.

Coming up...

. Wildcard expansion when loading multiple files (unix like OSs only): tinybase *.txt

. Drag-n-drop, File associations, 'Send To' functionally (windows only)

. Paginated output with inbuilt default or set it yourself (windows: set PAGER=less.exe) (unix: export PAGER=less)

And introducing scripting support...

echo query1, query2 | tinybase file1 file2 file-nth

or...

cat script | tinybase file1 file2 file-nth

or...

type script | tinybase file1 file2 file-nth

or...

tinybase file1 file2 file-nth < script

or...

tinybase <<EOF file1, file2, file-nth

query1, query2 # text after a hash char is ignored
.t             # display tag index
.e             # display tinybase environment

EOF


Release date coming soon.

• Re: Wildcard expansion...

Done.

Re: Drag-n-drop...

Yup, all of it, complete as well. Here, I've plainly & simply described in the manual how to do these tasks for the end user (right click file & select 'open with', etc.) as it takes less verbiage to describe it than coding it!

And finally, the hexdump routine is included. I'd at first thought I'd wait till the next version, but since I finished it a while back, why not?

The hexdump function will likely be seldom used, but there are times when its quite handy, for instance...

Imagine you have some embedded control codes in your data such as a formfeed (control code ^L). A formfeed as we all know, instructs the printer to stop printing, eject the current page, & recommence printing on the next page. But some terminals, even lots of editors (ahem ~ Windows apps I'm looking at you) substitute control codes with a generic box character, which does nothing to describe the character's actual value. But we're going to solve that problem with Tinybase's inbuilt hexdump feature.

Three steps required...

1st consult the included manual for the hex-table as shown below...

2nd find the formfeed's hexadecimal value where the row & column intersect for ^L, transcribe its hex value from the left most column & top most row - 0C.

3rd confirm the existence of 0C in the middle pane within the hexdump (highlighted in blue in the image above).

/*

7 bit ascii/hex conversion table

printable characters: 20 to 2F through 70 to 7E

0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F

0  ^@  ^A  ^B  ^C  ^D  ^E  ^F  ^G  ^H  ^I  ^J  ^K  ^L  ^M  ^N  ^O
1  ^P  ^Q  ^R  ^S  ^T  ^U  ^V  ^W  ^X  ^Y  ^Z  ^[  ^\  ^]  ^^  ^_
2  SP   !   "   #   $% & ' ( ) * + , - . / 3 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 4 @ A B C D E F G H I J K L M N O 5 P Q R S T U V W X Y Z [ \ ] ^ _ 6  a b c d e f g h i j k l m n o 7 p q r s t u v w x y z { | } ~ DEL ^@ = NUL ^L = FF ^H = BS SP = SPACE ^Z = EOF ^[ = ESC ^G = BEL ^M = CR ^I = TAB ^J = LF Example: n = 6E */  Darn, I've been hard at this a good number of days now, off to take break & then, knock on wood, finish up the manual & this project is up & running. Wanted to offer up an earnest thanks for everyone's patience with the me as I'm acutely aware I just sort of tumbled into the forum of the blue, & even barged into a thread or two (sorry about that - still learning the netiquette in this domain). At any rate, about to be around the corner. • tag index updates... unsorted (as written by user): amemba dog, mutt, mammal cat, persian, mammal dog, beagle, mammal lizard, reptile sorted A-Z (duplicates merged): amemba beagle cat dog lizard mammal mutt persian reptile sorted by number of occurrences (1st field number of tag entries, 2nd field percentage relative to total tag count, 3rd field tag name): 3 025.00% mammal 2 016.67% dog 1 008.33% mutt 1 008.33% amemba 1 008.33% cat 1 008.33% persian 1 008.33% beagle 1 008.33% lizard 1 008.33% reptile  • Okay, I'm freezing the addition of any more 'features'. About 40% of the manual left (well honestly, I'd most of that written already). You know, seems to me in some respects documentation is the hardest part, nevertheless, I press onwards. Coffee... Latest screenshots: • Ah yes, I've grown to appreciate the form feed character thanks to the Emacs community, too, and configured the editor to display a horizontal rule To me, this shows how gracefully Markdown solves this stuff with "ASCII art"-drawing of an actual line with e.g. dashes --------------  That's arguable less problematic than the form feed character and other control codes. Author at Zettelkasten.de • https://christiantietze.de/ • Windows editors are the main culprit IMO. Rather than showing the control char, they simply supply a generic (often incorrect character). • This post describes some unique things you can do with tinybase. The 1st example shows you how to execute content (build your own code library!) with a tag query. And the 2nd example illustrates dynamic content generation. Ready? Lets do this.. Given a tinybase file named code-library.txt, comprising these tagged blocks... tag: 001, unix example ls *.txt tag: 002, win example @echo off & dir *.txt  In a unix-like OS we would invoke this expression to list all .txt files.. echo unix example | tinybase code-library.txt | /bin/sh  Under Win we would invoke this expression to list all .txt files.. echo win example | tinybase code-library.txt > tmp.cmd & tmp.cmd  The 2nd example shows you how to generate a html file. Assuming a tinybase file containing these tagged blocks: tag: 001, html-start <html> <body> tag: 002, x <p>Hello world from block x! tags: 03, y, ignore <p>Hello world from block y! <p>Eve though this tagged block matched the query, the astute reader will have noticed the keyword 'ignore' in the tagline, thus this block will not be streamed... tag: 004, html-stop </body> </html>  Now then we'll invoke tinybase as shown next to create an html file... echo html-start, x, y, html-stop | tinybase my-notes.txt > example.html  • Really wish I could edit out my typos, but that ability seems to not be granted to me... • Here's the results of my markdown testing here. formfeed (key ^L / hex x0C): bel (key ^G / hex x07): tab (key ^I / hex x09): eof (key ^Z / hex x1A):  formfeed: only emits a new line bel: nothing (but displays a black square with a white X within the forum editor) tab: works or is it simply passed along? I cant tell. eof: nothing But in markdown's defense: neither are these printable characters either so who knows. Me? I don't think markdown should have to worry about these items since html doesn't. • Slowly (still) rewriting the manual & also considering the addition of IPC. Example below shows the output of an unrelated command (just a random blurb) piped into the titlebar of Tinybase, perhaps for status messages? Must think about more... One of theses days, soon, this project will be complete. My take on what I'll dub the 'minimum viable product'. • Fuzzy matching (or I can't remember the confounded card's name) using the Levenshtein distance formula: . query (case insensitive) is: pack . tags are: shack, knack, pak, zap, city . let integer 'd' be the maximum distance between two words pack/shack (d == 2, or two chars max difference till common root 'ack') pack/knack (d == 2, ibid) pack/pak (d == 1, one char difference 'c') pack/zap (d == 3, or two chars max difference till common root 'a') pack/city (d == 4 & since the length of both query & tag also equals 4, no match in the least)  so... fuzzy/approximate matching (poor mans regex!) looks doable on this end. Now about the Haversine formula... • After a bit of wrangling, fuzzy search is going to make the cut... And a screen grab the of environment panel... Really happy with the way this is coming together. As my grandfather often said: 'That dog will hunt!' • Out of curiosity, have you benchmarked your algorithm with a large data set? I dabbled with this some years back but it was unusable for my actual notes. There's always 10k Markdown files if you want to check it out https://github.com/Zettelkasten-Method/10000-markdown-files Author at Zettelkasten.de • https://christiantietze.de/ • Yes, against hurdreds of thousands of tagged blocks megabytes in size. Checkout the metrics in 1st screen shot below where each file is doubled in size per iteration. The 'trick' for speed with Levenshtein (or at least it works for me) is to only compare a user's query against the taglist not the entire body of the file. 2nd screen shot below depicts my thinking, just the tagline (highlighted in blue). Hey nice! Thank for the test data Christian. • Gosh, I'm getting so darn close... & yet wouldn't be wise to rush things. • edited November 2022 Update, Tinybase site is here: https://xyz-software.github.io/index.html Earnest thanks to everyone. Post edited by Mike_Sanders on • Metrics update... Starting to get exited on my end. • Kicking around a subset of regular expressions... /* ^ Match beginning of a buffer$       Match end of a buffer
()      Grouping and substring capturing
\s      Match whitespace
\S      Match non-whitespace
\d      Match decimal digit
\n      Match new line character
\r      Match line feed character
\f      Match form feed character
\v      Match vertical tab character
\t      Match horizontal tab character
\b      Match backspace character
+       Match one or more times (greedy)
+?      Match one or more times (non-greedy)
*       Match zero or more times (greedy)
*?      Match zero or more times (non-greedy)
?       Match zero or once (non-greedy)
x|y     Match x or y (alternation operator)
\meta   Match one of the meta character: ^\$().[]*+?|\
\xHH    Match byte with hex value 0xHH, e.g. \x4a
[...]   Match any character from set. Ranges like [a-z] are supported
[^...]  Match any character but ones from set

*/


• edited November 2022

Okay, more progress on the documentation. Hoping for a completion of the 1st draft by late next week.

Next screen shot shows what a user sees when launching tinybase...

And here's a screen shot of the profile panel...

Tinybase now features 3 modes of parsing tags:

mode exact: 'abc' matches abc

mode phonetic: 'luze' sounds like lucy

mode regex (a subset): '(c|b|h)at' matches any of cat, bat, hat


Optional shell access & calculator (precision to 8 decimals points) ready too.

Here's a scripting example with tinybase invoked as:

tinybase < input.txt puzzles.txt INVALID-FILE-EXAMPLE > output.txt


Here's the contents of input.txt:

.s notepad         # shell test
.p                 # display profile (colors should be off while scripting)
fastfude, colers   # compound query with misspelled words...
.e (3.08^2+4.19^2) # pythagora's theorem
.e cat + dog       # test calc for invalid input


And finally here's the contents of output.txt

> .s notepad

shell access unavailable while scripting is in effect

-- end output ----------------------------------------------------------------

> .p

tinybase profile...

host          DESKTOP-P9O93R7
build         tinybase-5.00p
ansi support  not detected
block  color  default
number color  default
precision     3 decimal places
scan engine   phonetic
index         sorted A-Z
line numbers  on
shell         C:\WINDOWS\system32\cmd.exe
pager         @more

puzzles.txt
INVALID-FILE-EXAMPLE

-- end output ----------------------------------------------------------------

> fastfude, colers

file:  puzzles.txt
query: fastfude, colers
scan:  phonetic match found
block: 5

76  TAGS: 005, PUZZLE, FASTFOOD
77
78  FASTFOOD
79
80  Z V R S E I R F K E T T U T A  BURITTO
81  S T U N O D N P I E A F N K R  CHEESEBURGER
82  X O K R E M G O C Q C U B I Q  DONUTS
83  M T X B R E X N R G O D T O H  FRIES
84  C H E E S E B U R G E R C Y J  HOTDOG
85  N N R B Z O N I O N R I N G S  ICECREAM
86  F A U X U N R O T T I R U B M  NUGGETS
87  L Z V G Z Q K N I R D T F O S  ONION RINGS
88  P Z T T G F I C E C R E A M P  PIZZA
89  G I A U C E H K E E K A H S W  SHAKE
90  T P S M S U T F M A X I C Q S  SOFTDRINK
91  J Z D H D P V S L H N V S H S  TACO
92

-- end output ----------------------------------------------------------------

file:  puzzles.txt
query: fastfude, colers
scan:  phonetic match found
block: 13

212  TAGS: 013, PUZZLE, COLORS
213
214  COLORS
215
216  K F B U Y S R G D J P Y G V K  AZURE
217  N O D E E A Q P W L C A R M F  BLACK
218  A X W T Q U K O B W I P J L J  BLUE
219  Y E H V Y B L A C K E B I C H  CYAN
220  C H I W Q L K B S E H R K A I  GREEN
221  H V T G E I O C L O A Y U N D  INDIGO
222  L L E Y C P X R T C E W D Z F  MAGENTA
223  O B G A T N E G A M X I A M A  ORANGE
224  V I O L E T A R B N G P F O Z  RED
225  E D H X C O D E X O G V F X X  VIOLET
226  V F J B C L J E M A E E D X V  WHITE
227  G Z Z K J F G N R C F N Y H L  YELLOW
228

-- end output ----------------------------------------------------------------

file:  INVALID-FILE-EXAMPLE
query: fastfude, colers
scan:  can't open file

-- end output ----------------------------------------------------------------

> .e (3.08^2+4.19^2)

27.043

-- end output ----------------------------------------------------------------

> .e cat + dog

invalid expression...


Off to enjoy the weekend, grandkids are in town for a day or two

• edited November 2022

In 218 BC during the 2nd Punic wars, Hannibal, when questioned by his lieutenants on the rationality of crossing the Alps in the dead of winter with a herd of elephants, silenced naysayers with this reply: 'We will either find a way, or make a way'.

Just think of it, a 'packed' variable... A single digit can yield 10 differing 'settings' (0-9). That's a whole lot of value for very nearly no memory consumption. Example, profile = 75032211

/*

field     description     range  default

0.......  block color     0-8    0
..0.....  numbers color   0-8    0
...0....  calc precision  0-8    0
....0...  scan engine     0-2    0
.....0..  index engine    0-2    0
......0.  line numbers    0-1    0
.......0  shell           0-1    0

fields 1-3: colors

default  0
black    1
red      2
green    3
yellow   4
blue     5
magenta  6
cyan     7
white    8

field 4: calc precision

0  0
1  0.0
2  0.00
3  0.000
4  0.0000
5  0.00000
6  0.000000
7  0.0000000
8  0.00000000

field 5: scan engine

exact     0
phonetic  1
regex     2

field 6: index engine

unsorted              0
sorted A-Z            1
sorted by occurrence  2

field 7: line numbers

on   1
off  0

field 8: shell

on   1
off  0

*/
`
Post edited by Mike_Sanders on
• edited November 2022

What constitutes a word?

'a' is a word, 'i' as well...

And consider non-alphabetic or even mixed groupings: '876-fubar@example.com'

Me thinks that's a valid word too. So for myself, one or more printable characters surrounded by whitespace (space, tab, newline, etc.) is a word. Nice to have an easy solution when you're banging out code.

Also rediscovered a great quote I'd forgotten I had, good stuff...

• @Mike_Sanders Haha! I like that quote - fits with many similar quotes I have seen and my own (engineering) experience. Thanks for sharing.

• The bolt example reminds me of a sociological study that stuck with me: "Crime and Punishment in the Factory: The Function of Deviancy in Maintaining the Social System"

https://onwork.edu.au/bibitem/1963-Bensman,Joseph-Gerver,Israel-Crime+and+Punishment+in+the+Factory+The+Function+of+Deviancy+in+Maintaining+the+Social+System/

I bet you find copies of that to read. The crux is that even though drilling new threading into bolt holes is officially forbidden and punished, there's a rather complex informal agreement on when and how to re-thread holes, if needed.

Author at Zettelkasten.de • https://christiantietze.de/

• @GeoEng51 said:
@Mike_Sanders Haha! I like that quote - fits with many similar quotes I have seen and my own (engineering) experience. Thanks for sharing.

Excuse my delayed reply Geo, I've been out of town...

Chuckle, & glad you enjoyed it. Always feel welcome to post some of your own if you wish. Here's another nice one...

The chief cause of problems is solutions. - Eric Sevarei