My very noobish attempt at making atomic extracts from a PDF
I work with PDFs a lot in my academic work, mostly reading journal articles, and I usually highlight the important parts of the text. To support my writing, I want to extract each highlight as an individual, atomic text file. Doing this by hand would be tedious. I have very novice coding skills, but I know enough Python and command line to be dangerous. In any case, I set a minimal goal of getting the highlights out of a PDF annotated in Skim and writing each one to its own file. Happily, I did that. The code is crude and there is a lot more to do, but I want to share it and invite anyone who is interested to contribute or give feedback.
Here it is: skim_to_md
With this basic functionality in place, I will keep layering more on. My ultimate goal is to format the text with additional fields and metadata using Markdown. I could probably hard-code this into the script, but I'm thinking that a templating engine like Django's template language or Jinja2 would make the workflow more flexible and extensible in the future. So that's where I'll go next. I'm totally open to changing the plan.
I realize that these extract notes are information, not knowledge @ctietze. But they are the beginning of knowledge. I have different kinds of notes in my workflow, and this is the raw material. My knowledge zettel are where I string pieces together in an overview/outline note, linking to the raw extracts I create here.
Here is an example of what I want a full-fledged note to look like:
[The highlighted text]
Reference: Smith et al. 2017 Journal of Neuroscience
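To make the templating idea concrete, here is a minimal sketch of how that layout could be filled in. It is not what skim_to_md does today. It uses Python's built-in string.Template as a stand-in for Jinja2 so that it runs without installing anything, and the names NOTE_TEMPLATE, render_note, quote and reference are made up for illustration.

```python
from string import Template

# Hypothetical note layout modeled on the example above.
# The field names (quote, reference) are assumptions, not part of skim_to_md.
NOTE_TEMPLATE = Template("$quote\n\nReference: $reference\n")


def render_note(quote: str, reference: str) -> str:
    """Fill the template with one highlight and its reference line."""
    # substitute() raises KeyError if a field is missing, which catches
    # typos in field names early.
    return NOTE_TEMPLATE.substitute(quote=quote.strip(), reference=reference)


if __name__ == "__main__":
    print(render_note("The highlighted text",
                      "Smith et al. 2017 Journal of Neuroscience"))
```

Moving to Jinja2 later would mostly mean replacing the template string and the substitute call. The rest of the script would not need to know which engine is behind render_note.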
Stuff I need to add:
- Work on naming the files more informatively
- Make different Markdown fields using templates
- Add reference information/citekey
- Clean up the quoted text more
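A rough sketch of how the naming and clean-up items might look, assuming the highlights are already available as a list of strings and that a citekey is supplied by hand. All of the function names here (clean_quote, make_filename, write_extracts) and the filename pattern are hypothetical, not taken from skim_to_md.

```python
import re
from pathlib import Path


def clean_quote(text: str) -> str:
    """Rejoin words split by a hyphen at a line break and collapse whitespace.

    Caveat: a genuine compound such as "well-\nknown" is also joined,
    so this is a heuristic, not a perfect fix.
    """
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    return re.sub(r"\s+", " ", text).strip()


def make_filename(citekey: str, index: int, quote: str, max_words: int = 5) -> str:
    """Build a name like 'smith2017-001-first-few-words.md' (assumed pattern)."""
    words = re.findall(r"[a-z0-9]+", quote.lower())[:max_words]
    slug = "-".join(words) or "untitled"
    return f"{citekey}-{index:03d}-{slug}.md"


def write_extracts(highlights, citekey, out_dir):
    """Write each highlight to its own Markdown file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, raw in enumerate(highlights, start=1):
        quote = clean_quote(raw)
        path = out / make_filename(citekey, i, quote)
        path.write_text(f"{quote}\n\nCitekey: {citekey}\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_extracts(["first highlight", "second highlight"], "smith2017", "extracts") would create one file per highlight in an extracts folder. Putting the citekey and a running number in the name keeps the files sorted by source, and the first few words make them easier to recognize in a file list.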
Interacting with these notes
And of course, once you make these, you need to work with them. At a very basic level, I could just use the Finder. But a very fast and user-friendly interface could help me make the most of these notes. nvALT.app is a great example. In the end, it's just a nice wrapper for text files in a folder, but that wrapper makes working with the files that much better. I'm trying to make Sublime Text 3 work for me. @rene's add-on gives a lot of the feel of nvALT, which is great, so I may go that route. Or if Bitwriter.app ever comes out, that might be the best way to interact with the notes, since I hear it will support folders.