mdlbear: (technonerdmonster)
[personal profile] mdlbear

I suppose I could have titled this one "Data-mining the to-DO loG", but since the log is kept in a directory called Dog...

My to.do file is somewhere in between a bullet journal and a logbook. Since its start as a pure to-do list in 2006, it has come to take on an increasingly important role in my life. (Some people might say that that's because my memory is deteriorating; they might be right.)

If you haven't already seen my How to.do it post, you might want to read that first. Or look under the cut in any of my "done since" posts. I mentioned a new tagging convention in this post. I have since extended it to make it easier to extract information, and also written a more general-purpose search tool. Because tool-using bear.

The net effect is that I can now easily answer questions like "when was the last time Colleen was discharged from a hospital" (answer: September 10), "what else did I do that day?" (answer: fix a messed-up fstab on Nova, and start to make a list of things I avoid doing, among other things), and so on.

As long as the search term and the date reliably appear on the same line, grep works pretty well. My convention is to put the mmdd date in parentheses right in front of one of the words "Admitted", "Discharged", or "Transferred" (or just the initial letter), so I can get "the last time Colleen was in the hospital" from:

grep '[0-9])D' 2*/*.done | tail -1

and the number of hospital stays in 2018 with

grep '[0-9])A' 2018/*.done | wc -l

Requiring a digit before the right parenthesis keeps me from getting false positives on things like "(gastroenterologist)Dr.". Other queries are equally simple. With a date somewhere on the line, I can find things like "CPAP" and "litter".
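To see why the digit matters, here is a minimal sketch that runs both patterns against a throwaway file. The sample lines are invented for illustration and are not taken from the real log.

```shell
# Build a throwaway sample file; these lines are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0905:   / (0905)Admitted to the hospital
0907:   / call (gastroenterologist)Dr. Example
0910:   / (0910)Discharge instructions:
EOF

# Without the digit, ")D" also matches "(gastroenterologist)Dr.".
loose=$(grep -c ')D' "$sample")

# With the digit, only the dated discharge line matches.
strict=$(grep -c '[0-9])D' "$sample")

echo "loose=$loose strict=$strict"
rm -f "$sample"
```

On this sample the loose pattern counts two lines and the strict one counts one.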

Of course, I had to go back searching for things like "admit" and "hospital" and put them into the correct format. But none of that helps much with queries like "what else was I doing?", because grep returns only a filename and (with -n) a line number along with the lines that it finds. Then I have to go to emacs or less and navigate down to the line. It's possible to do better.

The solution was a script called dgrep, where the "d" stands for "done" or something like that. It does a couple of things differently:

  • Mainly, it knows that dates are four digits starting in column 1, so it can print them with the hit.
  • It knows where my to-do archive is, so I don't need to tell it what directories to search if I just want to search all of them.

so I can do the following:

 dgrep '[0-9]\)D' | tail -1
2019/09.done:247: 0910:   / (0910)Discharge instructions:

But there's one more trick. Instead of a filename and line number (which emacs and other editors can parse), the '--less' option prints a command that you can use to search for that date:

 dgrep --less '[0-9]\)D' | tail -1
less -p ^0910 2019/09.done 247:   / (0910)Discharge instructions:

I just select the command, and click the middle mouse button to paste it into the command line. The help message also tells you the command line you need to look at each of the hits in succession.
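The core idea (remember the most recent date seen in column 1, then print it with each hit) can be sketched in a few lines of shell and awk. This is a hypothetical stand-in, not the real dgrep: the function name `dgrep_sketch`, the environment variables, and the "0910Tu" date-header format in the sample are all assumptions, and unlike dgrep it has to be told which files to search. The pattern is written `[0-9][)]D` because awk uses extended regular expressions, where a bare right parenthesis is not reliably literal.

```shell
# dgrep_sketch [--less] PATTERN FILE...
# Hypothetical stand-in for dgrep; the real script is Perl and may differ.
dgrep_sketch() {
    mode=grep
    if [ "$1" = "--less" ]; then mode=less; shift; fi
    pattern=$1; shift
    # Pass the pattern through the environment so awk does not
    # reinterpret backslashes the way "-v" assignments do.
    DGREP_PAT=$pattern DGREP_MODE=$mode awk '
        BEGIN { pat = ENVIRON["DGREP_PAT"]; mode = ENVIRON["DGREP_MODE"] }
        FNR == 1 { date = "" }    # new file: forget the previous date
        /^[0-9][0-9][0-9][0-9]/ { date = substr($0, 1, 4) }  # date in column 1
        $0 ~ pat {
            if (mode == "less")
                printf "less -p ^%s %s %d: %s\n", date, FILENAME, FNR, $0
            else
                printf "%s:%d: %s: %s\n", FILENAME, FNR, date, $0
        }
    ' "$@"
}

# Try it on an invented one-month archive.
demo=$(mktemp -d)
mkdir -p "$demo/2019"
cat > "$demo/2019/09.done" <<'EOF'
0909Mo
  / fix fstab
0910Tu
  / (0910)Discharge instructions:
EOF
hit=$(cd "$demo" && dgrep_sketch '[0-9][)]D' 2019/*.done | tail -1)
cmd=$(cd "$demo" && dgrep_sketch --less '[0-9][)]D' 2019/*.done | tail -1)
echo "$hit"
echo "$cmd"
rm -rf "$demo"
```

The first echo prints the hit in file:line: date: form, and the second prints a less command that opens the file at the matching date.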

The dgrep script is written in Perl and necessarily uses regular expressions, both of which are well into "now you have two problems" territory if you're not careful. But it works.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

NaBloPoMo stats:
   7582 words in 11 posts this month (average 689/post)
    618 words in 1 post today

Date: 2019-11-12 08:33 pm (UTC)
amaebi: black fox (Default)
From: [personal profile] amaebi
Offhand, I'm specially grateful that I have no need to keep track of hospital admissions and discharges, and sorry that it's so relevant in your lives.
