Zettelkasten Forum


Techniques: PDF Annotation?

I know there are a bunch of folks here who do more academic work than I, who have probably developed workflows around making annotations in PDF files. I'd like to know what y'all are doing in this space.

My Wants:

  • Notes must be in the Zettel, and thus in markdown.
  • I'd like to have references to pages
  • I'd really like to have something like "highlight some text and write a note referring to it", and this is the big challenge I see. How do I mark where in a pdf I'm referencing while using plain text stored outside the pdf?

My current method:
I just quote the line I want, write my comments, and put a [@bibtex] reference. Mostly it's one note per book, but if something inspires a break out note that's a thing I do too. Also I barely understand bibtex, so seeing how other people use it might be helpful.

Thanks!

[EDIT: After writing this I scrolled down and saw this forum post, which addresses some of the questions I have. But I'm still interested in talking about general techniques here.]

Comments

  • References to pages can be easy. You just have to decide:

    • do you want one cite-key per reference (PDF + page)
    • can you live with just one cite-key per PDF

    in your bib file?

    The latter is more or less the standard way if I am correct. With pandoc, you can add page numbers, chapter numbers, etc, just after a comma (see http://pandoc.org/demo/example19/Extension-citations.html)

    Blah blah [see @doe99, pp. 33-35; also @smith04, chap. 1].
    Blah blah [@doe99, pp. 33-35, 38-39 and passim].
    Blah blah [@smith04; @doe99].

    pandoc-citeproc detects locator terms in the CSL locale files. Either abbreviated or unabbreviated forms are accepted. In the en-US locale, locator terms can be written in either singular or plural forms, as book, bk./bks.; chapter, chap./chaps.; column, col./cols.; figure, fig./figs.; folio, fol./fols.; number, no./nos.; line, l./ll.; note, n./nn.; opus, op./opp.; page, p./pp.; paragraph, para./paras.; part, pt./pts.; section, sec./secs.; sub verbo, s.v./s.vv.; verse, v./vv.; volume, vol./vols.; ¶/¶¶; §/§§. If no locator term is used, “page” is assumed.
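To make the syntax above concrete, here is a minimal Python sketch that pulls citekeys and their locators out of pandoc-style citations in a note. It is an illustration only, not how pandoc itself parses citations: the function name `parse_citations` is made up, citekeys are assumed to consist of letters, digits and underscores (optionally joined by `:`, `.` or `-`), and everything after the first comma is kept as one locator string, including suffixes such as "and passim".

```python
import re

# One bracketed citation group that contains at least one "@",
# e.g. "[see @doe99, pp. 33-35; also @smith04, chap. 1]".
GROUP = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# Inside one ";"-separated item: the citekey, then an optional ", locator" part.
# Assumption: citekeys are word characters, optionally joined by ":", "." or "-".
ITEM = re.compile(r"@(\w+(?:[:.-]\w+)*)(?:,\s*(.+))?")


def parse_citations(text):
    """Return a list of (citekey, locator-or-None) tuples found in text."""
    results = []
    for group in GROUP.finditer(text):
        for item in group.group(1).split(";"):
            match = ITEM.search(item)
            if match:
                locator = (match.group(2) or "").strip() or None
                results.append((match.group(1), locator))
    return results
```

Run over a Zettel note, this gives a quick list of which pages of which source the note refers to, while the .bib file keeps a single cite-key per PDF.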

    When working with Zettel notes you might want to conserve the exact location that you refer to directly in your bib file (you would end up with more than 1 citekey per PDF/book/... ). I can see why you'd want to do that. The sources just happen to be from the same PDF but are separate sources for separate thoughts - and instead of just doing it in the Zettel note(s) you may want this clear separation to manifest directly in your bib file. In that case you'd have to use a work-around. I read somewhere (https://tex.stackexchange.com/questions/302098/citation-with-page-number-does-not-work) that adding a note = "p. 14" field would do the trick, but that only seems to apply to certain LaTeX bibliography styles such as \bibliographystyle{unsrt}.

    If your .bib entries are of type @article, for instance, you can add a pages = {1--27} field (the field conventionally holds just the range, without "pp."). It won't show up in the in-text citation, only in the bibliography at the end. I don't recommend putting page numbers in titles.

    Just my 2 cents.

    If you find a cunning way, please let me know :smile:

    • I'd really like to have something like "highlight some text and write a note referring to it", and this is the big challenge I see. How do I mark where in a pdf I'm referencing while using plain text stored outside the pdf?

    And if you find a cunning way for that, then also let us know here :smile:

  • Can you give us an exact example as a use case?

    I am a Zettler

  • @Sascha said:
    Can you give us an exact example as a use case?

    Well, I can tell you what I'm doing currently. NB, again, I am doing fiction work and not academic, so this is a lot about idea development, and I am not sure what my personal best practice is going to be. In other words: I have no idea what I need here, I'm just trying to put a technical system in place for consistency as I develop my conceptual system. I know that's probably backwards :)

    So, right now, as background for a project I'm working on, I'm reading The Headmap Manifesto (pdf; background page is here). My project involves developing ideas spinning off the text, so I have, right now, a Zettel document that looks like

    # 201711282014 Headmap Manifesto
    ## Links
    
    [Headmap Manifesto and Redux](http://technoccult.net/technoccult-library/headmap/)
    
    [manifesto-5.indd - headmap-manifesto.pdf](http://technoccult.net/wp-content/uploads/library/headmap-manifesto.pdf)
    
    All quotes below are [@russel_headmap_1999] 
    
    ## See Also
    [Augmented Reality]([[201711282151]])
    
    ## Notes
    
    
      everything in the world, animate and inanimate, abstract and concrete, has thoughts attached 

    etc etc etc.

    The [@russel_headmap_1999] comes from a .bib file that was generated by Zotero, which lives in my Zettel directory. Also I know for "pure Zettel" I should probably break each of those ideas into individual files. I wasn't worried about that at first, but maybe that is the correct solution? It seems overkill though, as, e.g., one of my notes reads, in its entirety, "Daaaammn."

    Thoughts?

  • Sooo.... Another way of referencing, one that keeps the source and the page reference together, would be to use pandoc references that include the page numbers. Not sure how much that helps you:

    ## Notes
    
    
      everything in the world, animate and inanimate, abstract and concrete, has thoughts attached 

  • @rene said:
    Sooo.... Another way of referencing, one that keeps the source and the page reference together, would be to use pandoc references that include the page numbers. Not sure how much that helps you:

    Great! If that's a standard that other software is going to recognize, that sounds like the right thing here (failing some kind of per-line reference). Thanks!

  • edited November 2017

    Since Rene advocates the Pandoc way here, let me chime in with MultiMarkdown:

    [3][#russel_headmap_1999]
    

    I picked (and still prefer) MMD over Pandoc for Zettel notes because it's easy to make the notes self-contained by defining the citekey and its contents at the bottom of the note.

    See Manage Citations for a Zettelkasten from 2013 for some screenshots.
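As a rough illustration of what "self-contained" means here, the Python sketch below checks that every MMD citation of the form `[locator][#citekey]` in a note has a matching `[#citekey]: ...` definition line somewhere in the same note. The function name `undefined_citekeys` is invented for this example, and the patterns only cover the simple forms shown in this thread, not the full MultiMarkdown grammar.

```python
import re

# MMD citation: [locator][#citekey]; the locator part may be empty.
CITATION = re.compile(r"\[([^\[\]]*)\]\[#([^\[\]]+)\]")
# MMD citation definition at the start of a line: "[#citekey]: full reference".
DEFINITION = re.compile(r"^\[#([^\[\]]+)\]:", re.MULTILINE)


def undefined_citekeys(note_text):
    """Return the sorted citekeys that are cited but not defined in the note."""
    cited = {match.group(2) for match in CITATION.finditer(note_text)}
    defined = set(DEFINITION.findall(note_text))
    return sorted(cited - defined)
```

An empty result means the note carries all of its own reference data and can be read without the .bib file.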

    Pandoc can store the reference info, too, though you'd have to put it into the header (which supports YAML). MMD is closer to Markdown, Pandoc is more versatile and a really great format to manage publication data.

    Author at Zettelkasten.de • https://christiantietze.de/

  • @ctietze Very smooth, the MMD way.
    I hate to put references into the YAML header.

    Currently I just put [@pandoc] references where I need them and for self-containment (though not able to link, as in hyper-link, references to the self-contained zettel-bibliography) I put in a comment block:

    <!-- references (auto)
    
    [@Projekt2017]: “Projekt.” 2017. _Wikipedia_, October. https://de.wikipedia.org/w/index.php?title=Projekt\&oldid=170400845.
    
    -->
    
  • So the question is how to reference a text? Then I would prefer the MMD approach. I still use the same thing Christian posted.

    I thought that your question was about the arrangement of the notes. In my opinion, you should never annotate a text but rather create a self-sufficient note that is understandable even in isolation from the text.

    Anyhow, I made two screenshots (while misunderstanding your question):

    A structure note on a book

    This is a novel from one of my favourite authors, William Quindt.

    But I can collect all the notes via the search for "#quindt1997" which is the citekey.

    This is handy if the text itself is important.

    A note on novel

    This is a note that just contains the question of the possibility of evil divinity. A concept that I find very interesting for later use for writing and even ethics.

    I made this screenshot in case the question was how to make notes out of literature.

    I am a Zettler

  • @ctietze said:
    Since Rene advocates the Pandoc way here, let me chime in with MultiMarkdown:

    [3][#russel_headmap_1999]
    

    I picked (and still prefer) MMD over Pandoc for Zettel notes because it's easy to make the notes self-contained by defining the citekey and its contents at the bottom of the note.

    Oho! I had not realized there was an MMD technique for this. I already use MMD by default so maybe I'll try to make this work instead.

    See Manage Citations for a Zettelkasten from 2013 for some screenshots.

    Of course there's already a post on this. Thanks!

    Pandoc can store the reference info, too, though you'd have to put it into the header (which supports YAML). MMD is closer to Markdown, Pandoc is more versatile and a really great format to manage publication data.

    I've been using @rene's auto-bib function in Sublime Text to insert refs in code comments as he mentions below. Is there any programmatic reason they need to go in the header instead?

  • @mediapathic
    I've been using @rene's auto-bib function in Sublime Text to insert refs in code comments as he mentions below. Is there any programmatic reason they need to go in the header instead?

    They would need to go into the header only if you didn't use a .bib file. See https://pandoc.org/MANUAL.html#citations :

    As an alternative to specifying a bibliography file using --bibliography or the YAML metadata field bibliography, you can include the citation data directly in the references field of the document’s YAML metadata. The field should contain an array of YAML-encoded references, for example:

    ---
    references:
    - type: article-journal
      id: WatsonCrick1953
      author:
      - family: Watson
        given: J. D.
      - family: Crick
        given: F. H. C.
      issued:
        date-parts:
        - - 1953
          - 4
          - 25
      title: 'Molecular structure of nucleic acids: a structure for deoxyribose
        nucleic acid'
      title-short: Molecular structure of nucleic acids
      container-title: Nature
      volume: 171
      issue: 4356
      page: 737-738
      DOI: 10.1038/171737a0
      URL: http://www.nature.com/nature/journal/v171/n4356/abs/171737a0.html
      language: en-GB
    ...
    

    I think it's immediately obvious why you wouldn't want to put this into a Zettel note.

    Now, when using a .bib file, you don't need to put the bibliography into the markdown document for citations and bibliography to work. That's why the plugin puts it into a comment block. It's just there so that you have all the references inside the note for your own use. Good if you ever lose that .bib file.
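If the .bib file does go missing, the comment block is plain text and easy to read back. The following Python sketch collects the entries from a `<!-- references (auto) ... -->` block like the one shown earlier in this thread. It is not part of the plugin: the function name `embedded_references` is made up, and it assumes one `[@citekey]: reference` (or `[#citekey]: reference`) entry per line.

```python
import re

# The comment block that holds the auto-generated references.
BLOCK = re.compile(r"<!--\s*references \(auto\)(.*?)-->", re.DOTALL)
# One entry per line: "[@citekey]: reference" or "[#citekey]: reference".
ENTRY = re.compile(r"^\s*\[[@#]([^\]]+)\]:\s*(.+)$", re.MULTILINE)


def embedded_references(note_text):
    """Return a dict mapping citekey to the reference text stored in the note."""
    references = {}
    for block in BLOCK.finditer(note_text):
        for key, reference in ENTRY.findall(block.group(1)):
            references[key] = reference.strip()
    return references
```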

    I don't have multimarkdown on my system(s) but I can try to play with it.

    I think it would be a neat extension to the plugin to make the citation method configurable. Currently it's 'pandoc'. But 'multimarkdown' style with [#citekey] completion is easily doable, and maybe there is a way to do the autobib with [#citekey]: full-blown reference without pandoc.

  • @mediapathic
    I haven't updated the README yet. But: if you put a line

        "citations-mmd-style": true,
    

    into the plugin's settings, then you can refresh all auto-bibs by re-running the auto-bib command. They will be converted to multimarkdown style.

    In addition, citekey-completion will mark citekeys with a # (in the autocompletion suggestion list) and insert [][#citekey] into your documents. You just have to fill in the page numbers if you like. Multimarkdown makes no distinction between page numbers, line numbers, paragraph numbers, etc. You can put anything you want into the first pair of brackets, as far as I understand (which is not much, mmd is rather new to me).

    For this to work, pandoc is still needed. It is used to create the (auto-) bibliographies. So don't delete it yet :smile:
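For notes that already contain pandoc-style citations, the switch can be approximated with a small text substitution. The Python sketch below is not the plugin's code, just an illustration under stated assumptions: the name `pandoc_to_mmd` is invented, and it only rewrites simple single-key citations such as `[@key]` or `[@key, p. 3]`, leaving multi-key groups like `[@smith04; @doe99]` and citations with a prefix untouched.

```python
import re

# A single-key pandoc citation with an optional ", locator" part.
# Assumption: citekeys are word characters, optionally joined by ":", "." or "-".
SIMPLE_CITATION = re.compile(r"\[@(\w+(?:[:.-]\w+)*)(?:,\s*([^\[\];]+))?\]")


def pandoc_to_mmd(text):
    """Rewrite [@key, locator] as [locator][#key]; other text is left as-is."""
    def replace(match):
        locator = (match.group(2) or "").strip()
        return f"[{locator}][#{match.group(1)}]"
    return SIMPLE_CITATION.sub(replace, text)
```

For example, `[@russel_headmap_1999, p. 3]` becomes `[p. 3][#russel_headmap_1999]`, and a bare `[@russel_headmap_1999]` becomes `[][#russel_headmap_1999]`, the same shape the citekey completion inserts.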

  • I kinda lost track of what the topic is about:

    My Wants:

    • Notes must be in the Zettel, and thus in markdown.

    -- this is about how to take notes, I guess? Does Sascha's stuff tackle this?

    • I'd like to have references to pages

    -- initially, this confused me, but it seems you didn't know about the page parameters you can pass to pandoc citations and used to create 1 entry per citation, so ✓

    • I'd really like to have something like "highlight some text and write a note referring to it", and this is the big challenge I see. How do I mark where in a pdf I'm referencing while using plain text stored outside the pdf?

    -- yeah, that'd be nice, and the best thing I know so far is the PDF reader Skim. Tighter Zettelkasten integration would be helpful, but then again, writing Zettel notes while you try to read and understand a text is not a good idea, so you need 2 passes anyway: one for marking text and one for extracting Zettel notes.

    Author at Zettelkasten.de • https://christiantietze.de/
