An easy to use Markdown bulk editing tool

Der_Alte_Fritz · September 2024

I currently use Zkn3 as my Zettelkasten but as part of my disaster planning I like to be able to export the ZK to Markdown and then get it to work on other programmes. Just in case the day comes.

I really need some help, being a programming newbie, to find an easy to use bulk editor for Markdown files.

For instance my current task is to find this part of the .md file

Cross references

123, 567, 1240, 221

and convert it to Wikilinks ie:

Cross references

[[123]] [[567]] [[1240]] [[221]]

I have tried Visual Studio with some help from Copilot AI and can identify the heading, blank line and numbers, but cannot get it to make the changes that I want.

GeoEng51 · September 2024

@Der_Alte_Fritz I'd suggest writing a Python script, but that would only help if you know some Python. That's how I converted one set of Zettelkasten files to work in Logseq.

If you are on Windows, Notepad++ has some interesting capabilities, but not to the extent that you'll need.

ctietze · September 2024

@Der_Alte_Fritz How far did you get? Can you share what you have?

A task like this consists of at least these sequential sub-tasks that you can focus on and then combine:

1. Open a file in a scripting language of your choice (to do it with every file in your folder).
2. Scan ahead/look for the "Cross references" heading. 'readline' or some such are your friends to check line by line. Collect all lines up to that point in the process (to reassemble the file contents again later!). If the heading is not found, abort/skip the next steps (and 'continue' with the next file).
3. Take the line after the heading and split it by ", " (literal comma and space) into a list of numbers.
4. For each split-apart number, wrap the number in "[[" and "]]". This transforms the list of numbers into a list of wikilinks. Pro tip: trim the pieces to remove extra whitespace just in case.
5. Combine ('join') the list of wikilinks into one line of text again, separated with a space.
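
For anyone who wants to try the steps above in Python, here is a minimal sketch. It is only one way to do it: the function names `convert_text` and `convert_folder` are made up for illustration, the heading is assumed to be "Cross references" with or without leading "#" characters, and the numbers are assumed to sit on the first non-blank line after the heading.

```python
from pathlib import Path

HEADING = "Cross references"  # assumed heading text (leading "#" marks are ignored)


def convert_text(text: str) -> str:
    """Return text with the number list under HEADING turned into wikilinks."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        # Compare the line without "#" marks and surrounding whitespace.
        if line.strip().lstrip("#").strip() != HEADING:
            continue
        # The numbers are assumed to be on the first non-blank line after the heading.
        for j in range(i + 1, len(lines)):
            if lines[j].strip():
                pieces = [p.strip() for p in lines[j].split(",")]
                # Only touch the line if every piece is a plain number.
                if all(p.isdigit() for p in pieces):
                    body = lines[j].rstrip("\r\n")
                    ending = lines[j][len(body):]  # keep the original line ending
                    lines[j] = " ".join(f"[[{p}]]" for p in pieces) + ending
                break
        break  # only the first matching heading is handled
    return "".join(lines)


def convert_folder(folder: str) -> None:
    """Rewrite every .md file in folder in place. Run this on a copy first."""
    for path in Path(folder).glob("*.md"):
        original = path.read_text(encoding="utf-8")
        converted = convert_text(original)
        if converted != original:
            path.write_text(converted, encoding="utf-8")
```

Files without the heading, or with something other than numbers on that line, are left untouched.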

Amontillado · September 2024

BBEdit has a feature called text factories that might be a good fit. I made one that ran a "replace all".

This will look bizarre if you aren't familiar with regular expressions, but bear with me. It's not bad.

I searched for:

([0-9]+),?\s

That's a string of numbers followed optionally with a comma, followed by a space. I replaced with:

[[\1]]

I put a space after "]]" to make things a little tidier.

That turned a line containing:

123, 567, 1240, 221

into:

[[123]] [[567]] [[1240]] [[221]]

Unfortunately, my simple pattern didn't preserve the line ending, so the contents of the following line appeared after the final hyperlink.

That would be an easy fix, but I'm out of time. Probably search for ([0-9]+),?([\s\n]+), replace with [[\1]]\2
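
The two patterns above can be compared outside BBEdit. This is a small sketch using Python's `re` module rather than a text factory, so treat it as an illustration of the patterns, not of BBEdit itself; the sample text is invented.

```python
import re

text = "123, 567, 1240, 221\nNext line\n"

# First pattern: \s also matches the newline after the last number,
# so the line break is replaced by a space and the next line is pulled up.
first = re.sub(r"([0-9]+),?\s", r"[[\1]] ", text)
# first == "[[123]] [[567]] [[1240]] [[221]] Next line\n"

# Suggested fix: capture the whitespace (space or newline) and put it back.
fixed = re.sub(r"([0-9]+),?([\s\n]+)", r"[[\1]]\2", text)
# fixed == "[[123]] [[567]] [[1240]] [[221]]\nNext line\n"
```

Note that both patterns match any number followed by whitespace anywhere in the file, not only the ones under the "Cross references" heading.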

BBEdit text factories can be run on file folders and their contents and/or descendants, or on open files. It's a very handy utility.

Hope that helps.

Andy · September 2024

@Amontillado said:

BBEdit has a feature called text factories that might be a good fit. I made one that ran a "replace all".

This will look bizarre if you aren't familiar with regular expressions, but bear with me. It's not bad.

I agree with @Amontillado that you should be able to do what you want with a single regular expression (regex) pattern. I use regex every day to transform text quickly and couldn't live without it.

I use and love BBEdit as well, although you can use any editor that does multi-file find-and-replace with regex. BBEdit's text factories are an easy-to-use graphical interface for creating, saving, and running scripts for text munging. But if you are simply doing a one-time find-and-replace with regex, you can also use BBEdit's multi-file search command if you don't need to retain the regex. Run tests on a copy of your data, of course.

msteffens · September 2024

If your Markdown files are consistently formatted, i.e. if your numbers to be replaced are:

always delimited by ", "
always prefixed with the fixed string: "Cross references" followed by two return characters

then you could simply use a tool like BBEdit to perform multiple multi-file search & replace actions (with "Grep" enabled) using these patterns:

Find: (?<=Cross references\r{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace: \01[[\02]]

Note that you have to repeatedly run these as a multi-file search & replace action until there's nothing to replace anymore.

If you want something that's more elegant and/or flexible, then you'll likely need to write a script instead.

Andy · September 2024

@msteffens said:

Find: (?<=Cross references\r{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace: \01[[\02]]

Note that you have to repeatedly run these as a multi-file search & replace action until there's nothing to replace anymore.

Look at those lookbehind/lookahead assertions—a sign of an advanced regex user!

Der_Alte_Fritz · September 2024

Step 4 Rename Note Sequence and Cross Reference data within the Note.md

OK my attempts so far have not succeeded in making the changes in the way that I need them to work. To change the .md file from

Cross references

(blank line)
123, 456, 1296 (last number lacks a comma)

to this changed file:

Cross references

(blank line)
[[123]] [[456]] [[1296]]

The expression needs to find the heading ## Cross references then a blank line and then change all the numbers on the third line and then stop and move onto the next file. In some cases this heading may not be present in which case it needs to move onto the next file.

Copilot:
Let’s refine the regular expression to ensure it only targets the specific line with numbers after the ## Cross references heading and handles cases where the heading might not be present. Here’s a more precise approach:
2. Open the Find and Replace Dialog: Press Ctrl + H to open the Find and Replace dialog in Visual Studio.
3. Enable Regular Expressions: Click on the “Use Regular Expressions” option (usually a small .* button).
4. Find Pattern: Use the following pattern to match the numbers on the third line after the ## Cross references heading, ensuring it stops after the line and handles cases with no numbers:
(## Cross references\s\n\s\n)(\d+(?:, \d+)*)

This part works and successfully identifies the Heading line, blank line and numbers line.

Attempt 1
The replacement part does not work
5. Set Up the Replace:
○ In the “Replace” field, enter the following to format the numbers as [[number]]:
$1[[\2]]

Run the Replace:
○ Click on the “Replace All” button to apply the changes to all matching files.
This will find all instances of the heading “## Cross references” followed by a blank line and a line of numbers, and replace each number with the format [[number]].
Instead all it does is replace all the numbers with a single instance of [[\2]]
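
A note on why this happens: in the Find pattern, group 2 captures the whole list ("123, 456, 1296") as one piece, so a plain replacement string can only wrap it once, not once per number. The literal \2 in the output also suggests the editor expects $2 rather than \2 for the second group. In a script the per-number step can be done with a replacement function. The following is a Python sketch, not something Visual Studio can run; `to_wikilinks` and `convert` are hypothetical names, and the whitespace part of the pattern has been loosened (an assumption on my part) so that it matches both Windows and Unix line endings.

```python
import re

# Heading, one blank line, then the comma-separated numbers.
PATTERN = re.compile(
    r"(## Cross references[ \t]*\r?\n[ \t]*\r?\n)(\d+(?:, \d+)*)"
)


def to_wikilinks(match: re.Match) -> str:
    """Build the replacement for one match: heading part plus wrapped numbers."""
    numbers = match.group(2).split(", ")
    return match.group(1) + " ".join(f"[[{n}]]" for n in numbers)


def convert(text: str) -> str:
    """Convert the numbers line under the heading in a single pass."""
    return PATTERN.sub(to_wikilinks, text)
```

Text without the heading comes back unchanged, which covers the files where the section is missing.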

msteffens · September 2024

I don't have Visual Studio, but considering your additional info, I think that these patterns should also work in Visual Studio (*):

Find: (?<=## Cross references[\n\r]{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace: $1[[$2]]

As mentioned above, with these patterns, you'd need to run multiple multi-file search & replace actions – until there's nothing to replace anymore.

(*) maybe you need to enable the "Use PCRE2" option in User Settings -> Features -> Search settings.

Der_Alte_Fritz · September 2024

Second try:

1. Open the Find and Replace dialog: Press Ctrl + H.
2. Enable Regular Expressions: Click on the .* button to enable regex mode.
3. Find what: Use the following regex pattern to match the heading, blank line, and the line of numbers:

(## Cross references\s\n\s\n)(\d+(?:, \d+))
4. Replace with: Use the following replacement pattern to remove the commas:
$1\2
Since Visual Studio’s regex engine might not handle this directly, let’s break it down into a more controlled approach:
Step 1: Match the Heading and Blank Lines
1. Find what:
(## Cross references\s\n\s\n)
2. Replace with:
$1
Step 2: Match and Replace Commas in the Third Line
1. Find what:
(## Cross references\s\n\s\n)(\d+(?:, \d+))
2. Replace with:
$1\2
Step 3: Remove Commas from the Numbers Line
1. Find what:
(\d+),\s*(\d+)
2. Replace with:
$1 $2
Repeat Step 3 until all commas are removed from the numbers line.
This should ensure that only the commas in the third line are removed.

Der_Alte_Fritz · September 2024

This second approach does not work either as it replaces just the first number with \2.

Der_Alte_Fritz · September 2024

@msteffens said:
I don't have Visual Studio, but considering your additional info, I think that these patterns should also work in Visual Studio (*):

Find: (?<=## Cross references[\n\r]{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace: $1[[$2]]

As mentioned above, with these patterns, you'd need to run multiple multi-file search & replace actions – until there's nothing to replace anymore.

(*) maybe you need to enable the "Use PCRE2" option in User Settings -> Features -> Search settings.

I am afraid that the search does not find anything in Visual Studio.

msteffens · September 2024

I am afraid that the search does not find anything in Visual Studio.

I've tested them locally, so these patterns should work in general for any modern PCRE-compatible regex engine. But, of course, there could be some special behavior with Visual Studio that I'm not aware of. Did you search for a "PCRE2" option or similar in the app's search settings?

msteffens · September 2024

If that doesn’t work, it would be helpful if you could provide a concrete sample file, or wrap an exact sample text as "Code" so that we could double check this sample.

Otherwise, the free version of BBEdit includes the multi-file search & replace features, and it’s a general-purpose tool that's always useful to have on your machine. So why not try that?

Also, its text factories feature (mentioned by others above) allows you to gather & execute all your search & replace actions with a single click. This will greatly help you in the future to also convert newly created .md files in the same fashion.

Der_Alte_Fritz · September 2024

I think that PCRE2 is working but really this is beyond my competence.

Der_Alte_Fritz · September 2024

Looked at BBEdit but it appears to be only available for Macs while I need a Windows 11 programme.

Sublime and Notepad++ seem to be alternative programmes for BBEdit using Windows.

In the meantime, I have attached an example Zettel. This one contains all the data fields that I am trying to convert to standard .md from the original Zkn3 export.

Der_Alte_Fritz · September 2024

This forum category will not allow me to upload a sample file.

msteffens · September 2024

Ah, I missed that you're using Windows. So maybe it was just the end-of-line characters in my patterns causing the failure. Could you try these patterns instead:

Find: (?<=## Cross references\s\n\s\n)(.*?)(?<=\s)(\d+)(?:,|$)
Replace: $1[[$2]]

I did also overlook that you want to get rid of the commas, thus the additional change at the end of the Find pattern.

As before, you need to run them multiple times, until there's nothing left to replace.

Der_Alte_Fritz · September 2024

This forum category will not allow me to upload a sample file. But here is a screenshot of the exported Markdown file and these are the things that I am trying to change to be working Markdown links, etc.

msteffens · September 2024

Did you try the revised patterns proposed above?

As an alternative, here are patterns that don't use any advanced syntax:

Find: (## Cross references\s\n\s\n.*?)( )?(\d+)(?:,|$)
Replace: $1$2[[$3]]

These should work as well.

As before, you need to run them multiple times, until there's nothing left to replace.
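
The reason for the repeated runs is that every match has to start at the heading, so each pass can convert only one number per file. In a script, the "run until nothing is left" part can be a loop. Below is a Python sketch of that idea under a few assumptions: the pattern is adapted from the one above (`[ \t]*\n` instead of `\s\n`, and multi-line mode switched on so that `$` matches at the end of the numbers line), the text uses plain `\n` line endings as Python delivers them when reading a file in text mode, and `replace_until_stable` is a made-up name.

```python
import re

# Adapted pattern: heading, blank line, then lazily skip what is already
# converted on the numbers line and grab the next bare number.
PATTERN = re.compile(
    r"(## Cross references[ \t]*\n[ \t]*\n.*?)( )?(\d+)(?:,|$)",
    re.MULTILINE,
)


def replace_until_stable(text: str) -> tuple[str, int]:
    """Apply the replacement repeatedly; return the text and the number of passes."""
    passes = 0
    while True:
        text, count = PATTERN.subn(r"\1\2[[\3]]", text)
        if count == 0:  # nothing left to replace
            return text, passes
        passes += 1


result, passes = replace_until_stable("## Cross references\n\n123, 456, 1296\n")
# result == "## Cross references\n\n[[123]] [[456]] [[1296]]\n", passes == 3
```

Three numbers need three passes, which matches the "run it multiple times" advice.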

Der_Alte_Fritz · September 2024

I am trying an alternative programme, Sublime Text, as a means of running the expressions, as Visual Studio does not seem to be running them.

Der_Alte_Fritz · September 2024

@msteffens said:
Ah, I missed that you're using Windows. So maybe it was just the end-of-line characters in my patterns causing the failure. Could you try these patterns instead:

Find: (?<=## Cross references\s\n\s\n)(.*?)(?<=\s)(\d+)(?:,|$)
Replace: $1[[$2]]

I did also overlook that you want to get rid of the commas, thus the additional change at the end of the Find pattern.

As before, you need to run them multiple times, until there's nothing left to replace.

Well done!
This pattern replaces all the numbers, with the correct [[number]] except for the last number in the series which has no ','

msteffens · September 2024

That's good to hear, though actually the pattern is designed to also replace the last number which has no trailing comma.

With an app that uses a PCRE-compatible regex engine, both of my last given patterns should replace all numbers.

Der_Alte_Fritz · September 2024

@msteffens said:
Did you try the revised patterns proposed above?

As an alternative, here are patterns that don't use any advanced syntax:

Find: (## Cross references\s\n\s\n.*?)( )?(\d+)(?:,|$)
Replace: $1$2[[$3]]

These should work as well.

As before, you need to run them multiple times, until there's nothing left to replace.

Another success! This version gives the same outcome: after multiple runs it replaces all the numbers with [[number]], except for the last one, which has no comma.

msteffens · September 2024

Good, though I'm at a loss to explain why the last number doesn't get replaced for you as well. The patterns reliably do this for me locally at least.

Note that, as mentioned before, you need to run your search & replace actions multiple times, until there's nothing left to replace.

msteffens · September 2024

Ah, I think I understand your issue now: I forgot that my editor (BBEdit) has the regex "multi-line" mode (which allows ^ and $ to match on every line) enabled by default, and yours probably has not. So please try this pattern instead:

Find: (## Cross references\s\n\s\n.*?)( )?(\d+)(?m:,|$)
Replace: $1$2[[$3]]
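
The effect of multi-line mode on `$` can be seen in a small Python sketch. Python's `re` module is not the engine used by BBEdit or Sublime Text, so this only illustrates the general behaviour; the sample text is invented.

```python
import re

text = "123, 456, 1296\nNext line\n"

# Default mode: $ only matches at the very end of the text, so the last
# number on the first line (followed by a line break) is not found.
default = re.findall(r"(\d+)(?:,|$)", text)
# default == ['123', '456']

# With (?m:...), $ also matches at the end of each line.
multiline = re.findall(r"(\d+)(?m:,|$)", text)
# multiline == ['123', '456', '1296']
```

This is the same symptom as "all numbers are replaced except the last one": the last number has no comma, so it depends entirely on `$` matching at the end of its line.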

Der_Alte_Fritz · September 2024

The bonus is that this expression works for the next step ## Note Sequences as well.

Der_Alte_Fritz · September 2024

@msteffens said:
Ah, I think I understand your issue now: I forgot that my editor (BBEdit) has the regex "multi-line" mode (which allows ^ and $ to match on every line) enabled by default, and yours probably has not. So please try this pattern instead:

Find: (## Cross references\s\n\s\n.*?)( )?(\d+)(?m:,|$)
Replace: $1$2[[$3]]

No, I am afraid that it does not find anything even on an initial unmodified data set.

Der_Alte_Fritz · September 2024

@msteffens said:
Good, though I'm at a loss to explain why the last number doesn't get replaced for you as well. The patterns reliably do this for me locally at least.

Note that, as mentioned before, you need to run your search & replace actions multiple times, until there's nothing left to replace.

I ran the expression multiple times until it returned zero results. I then ran it on a full dataset of 2,500 .md files and it did the same thing, replacing all the numbers but the last one.

Der_Alte_Fritz · September 2024

On the plus side, I imported the 2,500 markdown zettels into Obsidian as a test and the links are all there and working well.
