An easy to use Markdown bulk editing tool
I currently use Zkn3 as my Zettelkasten but as part of my disaster planning I like to be able to export the ZK to Markdown and then get it to work on other programmes. Just in case the day comes.
I really need some help, being a programming newbie, to find an easy to use bulk editor for Markdown files.
For instance my current task is to find this part of the .md file
Cross references
123, 567, 1240, 221
and convert it to wikilinks, i.e.:
Cross references
[[123]] [[567]] [[1240]] [[221]]
I have tried Visual Studio with some help from Copilot AI and can identify the heading, blank line and numbers, but cannot get it to make the changes that I want.
Comments
@Der_Alte_Fritz I'd suggest writing a Python script, but that would only help if you know some Python. That's how I converted one set of Zettelkasten files to work in Logseq.
If you are on Windows, Notepad++ has some interesting capabilities, but not to the extent that you'll need.
@Der_Alte_Fritz How far did you get? Can you share what you have?
A task like this consists of at least these sequential sub-tasks that you can focus on and then combine:
Split the line of numbers at ", " (literal comma and space) into a list of numbers.
Wrap each number in "[[" and "]]". This transforms the list of numbers into a list of wikilinks. Pro tip: trim the pieces to remove extra whitespace just in case.
Author at Zettelkasten.de • https://christiantietze.de/
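The split-and-wrap sub-tasks could be sketched in Python roughly like this. It is only a minimal sketch: the function name to_wikilinks is made up for illustration, and it splits at the bare comma and trims each piece rather than splitting at ", " exactly.

```python
def to_wikilinks(numbers_line):
    """Turn '123, 567, 1240, 221' into '[[123]] [[567]] [[1240]] [[221]]'."""
    # Split at the comma, then trim each piece to drop stray whitespace.
    pieces = [piece.strip() for piece in numbers_line.split(",")]
    # Wrap every non-empty piece in "[[" and "]]" and join with single spaces.
    return " ".join(f"[[{piece}]]" for piece in pieces if piece)


print(to_wikilinks("123, 567, 1240, 221"))
```

Finding the right line in each file is a separate sub-task; this only covers the line of numbers itself.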
BBEdit has a feature called text factories that might be a good fit. I made one that ran a "replace all".
This will look bizarre if you aren't familiar with regular expressions, but bear with me. It's not bad.
I searched for:
([0-9]+),?\s
That's a string of digits, optionally followed by a comma, followed by one whitespace character (a space or a line break). I replaced with:
[[\1]]
I put a space after "]]" to make things a little tidier.
That turned a line containing:
123, 567, 1240, 221
into:
[[123]] [[567]] [[1240]] [[221]]
Unfortunately, my simple pattern didn't preserve the line ending, so the contents of the following line appeared after the final hyperlink.
That would be an easy fix, but I'm out of time. Probably search for ([0-9]+),?([\s\n]+), replace with [[\1]]\2
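For anyone who wants to try that second pattern outside BBEdit, here is a small Python sketch of the same search and replace. The sample text is invented, and Python's re module stands in for BBEdit's engine, so treat it as an illustration rather than a tested BBEdit recipe.

```python
import re

# Pattern from the post: digits, an optional comma, then the whitespace
# (captured as group 2 so that it can be put back, line ending included).
PATTERN = re.compile(r"([0-9]+),?([\s\n]+)")

sample = "123, 567, 1240, 221\nNext line of the note\n"
converted = PATTERN.sub(r"[[\1]]\2", sample)
print(converted)
```

Note that this pattern matches any number followed by whitespace anywhere in the file, so it would also wrap years or counts in the body text. It needs to be limited to the cross reference line before running it on real notes.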
BBEdit text factories can be run on file folders and their contents and/or descendants, or on open files. It's a very handy utility.
Hope that helps.
@Amontillado said:
I agree with @Amontillado that you should be able to do what you want with a single regular expression (regex) pattern. I use regex every day to transform text quickly and couldn't live without it.
I use and love BBEdit as well, although you can use any editor that does multi-file find-and-replace with regex. BBEdit's text factories are an easy-to-use graphical interface for creating, saving, and running scripts for text munging. But if you are simply doing a one-time find-and-replace with regex, you can also use BBEdit's multi-file search command if you don't need to retain the regex. Run tests on a copy of your data, of course.
If your Markdown files are consistently formatted, i.e. if your numbers to be replaced are:
then you could simply use a tool like BBEdit to perform multiple multi-file search & replace actions (with "Grep" enabled) using these patterns:
Find:
(?<=Cross references\r{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace:
\01[[\02]]
Note that you have to repeatedly run these as a multi-file search & replace action until there's nothing to replace anymore.
If you want something that's more elegant and/or flexible, then you'll likely need to write a script instead.
@msteffens said:
Look at those lookbehind/lookahead assertions—a sign of an advanced regex user!
Step 4 Rename Note Sequence and Cross Reference data within the Note.md
OK, my attempts so far have not succeeded in making the changes in the way that I need them to work. I want to change the .md file from
Cross references
(blank line)
123, 456, 1296 (last number lacks a comma)
to this changed file:
Cross references
(blank line)
[[123]] [[456]] [[1296]]
The expression needs to find the heading ## Cross references then a blank line and then change all the numbers on the third line and then stop and move onto the next file. In some cases this heading may not be present in which case it needs to move onto the next file.
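As an alternative to a single expression, the requirement just described could be sketched as a short Python script. This is a rough sketch under assumptions: the function names are hypothetical, only the first "## Cross references" heading per file is handled, and the numbers line is converted only if it consists purely of comma-separated numbers.

```python
from pathlib import Path

HEADING = "## Cross references"


def convert_cross_references(text):
    """Wikilink the numbers on the line after the heading and a blank line."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines[:-2]):
        # Heading, then a blank line, then the line of numbers.
        if line.strip() == HEADING and lines[i + 1].strip() == "":
            numbers = lines[i + 2]
            body = numbers.rstrip("\r\n")
            ending = numbers[len(body):]  # keep the original line ending
            pieces = [piece.strip() for piece in body.split(",")]
            # Skip lines that are not plain numbers (e.g. already converted).
            if all(piece.isdigit() for piece in pieces):
                links = " ".join(f"[[{piece}]]" for piece in pieces)
                lines[i + 2] = links + ending
            break  # stop after the first heading in this file
    return "".join(lines)


def convert_folder(folder):
    """Rewrite every .md file in `folder` in place (run it on a copy first)."""
    for path in Path(folder).glob("*.md"):
        # newline="" keeps Windows or Unix line endings exactly as they are.
        with open(path, encoding="utf-8", newline="") as handle:
            original = handle.read()
        updated = convert_cross_references(original)
        if updated != original:
            with open(path, "w", encoding="utf-8", newline="") as handle:
                handle.write(updated)
```

Files without the heading come back unchanged, so they are simply skipped, and running the script twice does no harm because an already converted line no longer looks like plain numbers.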
Copilot:
Let’s refine the regular expression to ensure it only targets the specific line with numbers after the ## Cross references heading and handles cases where the heading might not be present. Here’s a more precise approach:
2. Open the Find and Replace Dialog: Press Ctrl + H to open the Find and Replace dialog in Visual Studio.
3. Enable Regular Expressions: Click on the “Use Regular Expressions” option (usually a small .* button).
4. Find Pattern: Use the following pattern to match the numbers on the third line after the ## Cross references heading, ensuring it stops after the line and handles cases with no numbers:
(## Cross references\s\n\s\n)(\d+(?:, \d+)*)
This part works and successfully identifies the Heading line, blank line and numbers line.
Attempt 1
The replacement part does not work
5. Set Up the Replace:
○ In the “Replace” field, enter the following to format the numbers as [[number]]:
$1[[\2]]
○ Click on the “Replace All” button to apply the changes to all matching files.
This will find all instances of the heading “## Cross references” followed by a blank line and a line of numbers, and replace each number with the format [[number]].
Instead all it does is replace all the numbers with a single instance of [[\2]]
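A likely explanation (an assumption, not verified in Visual Studio itself): the second group captures the whole list "123, 456, 1296" as one piece, so there is only one match and one replacement per heading, and as far as I know Visual Studio writes group references as $2, so \2 is inserted as literal text. A replacement string cannot loop over the numbers inside a single group, but a script can. A minimal Python sketch using Copilot's find pattern and a replacement function (wrap_numbers is a made-up name):

```python
import re

# Copilot's find pattern from above, unchanged.
PATTERN = re.compile(r"(## Cross references\s\n\s\n)(\d+(?:, \d+)*)")


def wrap_numbers(match):
    # Group 2 holds the whole list, e.g. "123, 456, 1296", as ONE string,
    # so it has to be split before each number can get its own brackets.
    links = " ".join(f"[[{number}]]" for number in match.group(2).split(", "))
    return match.group(1) + links


# Sample with Windows line endings, which is what \s\n appears to target.
sample = "## Cross references\r\n\r\n123, 456, 1296\r\n"
result = PATTERN.sub(wrap_numbers, sample)
print(result)
```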
I don't have Visual Studio, but considering your additional info, I think that these patterns should also work in Visual Studio (*):
Find:
(?<=## Cross references[\n\r]{2})(.*?)(?<=\s)(\d+)(?=,|$)
Replace:
$1[[$2]]
As mentioned above, with these patterns, you'd need to run multiple multi-file search & replace actions – until there's nothing to replace anymore.
(*) maybe you need to enable the "Use PCRE2" option in User Settings -> Features -> Search settings.
Second try:
(## Cross references\s\n\s\n)(\d+(?:, \d+))
4. Replace with: Use the following replacement pattern to remove the commas:
$1\2
Since Visual Studio’s regex engine might not handle this directly, let’s break it down into a more controlled approach:
Step 1: Match the Heading and Blank Lines
1. Find what:
(## Cross references\s\n\s\n)
2. Replace with:
$1
Step 2: Match and Replace Commas in the Third Line
1. Find what:
(## Cross references\s\n\s\n)(\d+(?:, \d+))
2. Replace with:
$1\2
Step 3: Remove Commas from the Numbers Line
1. Find what:
(\d+),\s*(\d+)
2. Replace with:
$1 $2
Repeat Step 3 until all commas are removed from the numbers line.
This should ensure that only the commas in the third line are removed.
This second approach does not work either as it replaces just the first number with \2.
I am afraid that the search does not find anything in Visual Studio.
I’ve tested them locally, so these patterns should work in general for any modern PCRE-compatible regex engine. But, of course, there could be some special behavior with Visual Studio that I'm not aware of. Did you search for a "PCRE2" option or similar in the app's search settings?
If that doesn’t work, it would be helpful if you could provide a concrete sample file, or wrap an exact sample text as "Code" so that we could double check this sample.
Otherwise, the free version of BBEdit includes the multi-file search & replace features, and it’s a general-purpose tool that's always useful to have on your machine. So why not try that?
Also, its text factories feature (mentioned by others above) allows you to gather & execute all your search & replace actions with a single click. This will greatly help you in the future to also convert newly created .md files in the same fashion.
I think that PCRE2 is working but really this is beyond my competence.
Looked at BBEdit but it appears to be only available for Macs while I need a Windows 11 programme.
Sublime and Notepad++ seem to be alternative programmes for BBEdit using Windows.
In the meantime, I have attached an example Zettel. This one contains all the data fields that I am trying to convert to standard .md from the original Zkn3 export.
This forum category will not allow me to upload a sample file.
Ah, I missed that you're using Windows. So maybe it was just the end-of-line characters in my patterns that caused the failure. Could you try these patterns instead:
Find:
(?<=## Cross references\s\n\s\n)(.*?)(?<=\s)(\d+)(?:,|$)
Replace:
$1[[$2]]
I did also overlook that you want to get rid of the commas, thus the additional change at the end of the Find pattern.
As before, you need to run them multiple times, until there's nothing left to replace.
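To illustrate the end-of-line point, here is a small Python check of how the heading part of the earlier and the revised pattern behaves with Unix and Windows line endings. It is a sketch only: the lookbehinds are turned into plain searches with a trailing \d, the sample strings are invented, and the third, tolerant variant is my own assumption rather than something proposed in the thread.

```python
import re

unix_text = "## Cross references\n\n123, 456"
windows_text = "## Cross references\r\n\r\n123, 456"

# Earlier pattern: exactly two line-break characters after the heading.
two_chars = re.compile(r"## Cross references[\n\r]{2}\d")
# Revised pattern: two "whitespace + newline" pairs, i.e. \r\n\r\n.
crlf_pairs = re.compile(r"## Cross references\s\n\s\n\d")
# Assumption: a variant that accepts either kind of line ending.
either = re.compile(r"## Cross references(?:\r?\n){2}\d")

for name, text in [("unix", unix_text), ("windows", windows_text)]:
    print(name,
          bool(two_chars.search(text)),
          bool(crlf_pairs.search(text)),
          bool(either.search(text)))
```

A blank line is two characters in a Unix file but four in a Windows file, which is why a pattern written for one can find nothing in the other.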
Here is a screenshot of the exported Markdown file; these are the things that I am trying to change into working Markdown links, etc.
Did you try the revised patterns proposed above?
As an alternative, here are patterns that don't use any advanced syntax:
Find:
(## Cross references\s\n\s\n.*?)( )?(\d+)(?:,|$)
Replace:
$1$2[[$3]]
These should work as well.
As before, you need to run them multiple times, until there's nothing left to replace.
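The "run it until nothing is left" routine can also be automated. Below is a hedged Python sketch of that loop with the find pattern above. It assumes Windows line endings and a sample that ends right after the last number, and Python's re module may differ in details from the engine in your editor.

```python
import re

# The "no advanced syntax" find pattern from above.
PATTERN = re.compile(r"(## Cross references\s\n\s\n.*?)( )?(\d+)(?:,|$)")

text = "## Cross references\r\n\r\n123, 456, 1296"
passes = 0
while True:
    # Each pass converts only the first unconverted number after the heading.
    text, count = PATTERN.subn(r"\1\2[[\3]]", text)
    if count == 0:
        break  # nothing left to replace
    passes += 1

print(passes)
print(text)
```

In Python specifically, a \r\n directly after the last number would keep $ from matching there, because $ then sits after the \r, and the last number would stay unconverted.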
I am trying an alternative programme, Sublime Text, as a means of running the expressions, as Visual Studio does not seem to be running them.
Well done!
This pattern replaces all the numbers with the correct [[number]], except for the last number in the series, which has no ','.
That's good to hear, though actually the pattern is designed to also replace the last number which has no trailing comma.
With an app that uses a PCRE-compatible regex engine, both of my last given patterns should replace all numbers.
Another success! This version gives the same outcome: after multiple runs it replaces all the numbers with [[number]], except for the last one, which has no comma.
Good, though I'm at a loss to explain why the last number doesn't get replaced for you as well. The patterns reliably do this for me locally at least.
Note that, as mentioned before, you need to run your search & replace actions multiple times, until there's nothing left to replace.
Ah, I think I understand your issue now: I forgot that my editor (BBEdit) has the regex "multi-line" mode (which allows ^ and $ to match on every line) enabled by default, and yours probably has not. So please try this pattern instead:
Find:
(## Cross references\s\n\s\n.*?)( )?(\d+)(?m:,|$)
Replace:
$1$2[[$3]]
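For readers who have not met multi-line mode before, here is a tiny Python illustration of what it changes. The sample text is invented, and other regex engines may have different defaults.

```python
import re

lines = "123, 456\n789, 1296\n"

# Default mode: "$" matches only at the end of the text
# (or just before a final line break).
default_mode = re.findall(r"\d+$", lines)

# Multi-line mode switched on for the whole pattern via a flag ...
flag_mode = re.findall(r"\d+$", lines, flags=re.MULTILINE)

# ... or for one group only, as with (?m:...) in the pattern above.
scoped_mode = re.findall(r"\d+(?m:$)", lines)

print(default_mode, flag_mode, scoped_mode)
```

Without multi-line mode only the number at the very end of the text is found; with it, the last number on each line is found.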
The bonus is that this expression works for the next step ## Note Sequences as well.
No, I am afraid that it does not find anything even on an initial unmodified data set.
I ran the expression multiple times until it returned zero results. I then ran it on a full dataset of 2,500 .md files and it did the same thing, replacing all the numbers but the last one.
On the plus side, I imported the 2,500 markdown zettels into Obsidian as a test and the links are all there and working well.