Probability #2

The Monty Hall Problem

You’ve surely heard of this one. It’s a really tough one to wrap your head around. I’ll describe the problem in order to solve it, but it’s worth looking into the background of the “Ask Marilyn” incident if you’re not familiar with it.

Behind Monty Hall’s Doors

The story shows that it’s not just hacks like myself that find this stuff non-intuitive. Lots of highly educated, very smart people got tricked by this one.

The problem is based on an old game show, “Let’s Make a Deal”. The host of the show was named Monty Hall. The contestant is shown three doors. Behind two doors are goats. Behind one is A NEW CAR!!!

The contestant chooses a door. If he chooses the one with THE NEW CAR, he gets to keep the car. If he chooses a goat door, he gets a goat. The unspoken assumption being that he’d rather have a car.

But… before the chosen prize is revealed, Monty Hall opens one of the two remaining doors, revealing a goat!

There are now two doors remaining. The one that the contestant chose, and one other one. Monty asks the contestant if he wants to stay with his choice, or if he’d like to switch to the other door.

What should he do?

To clarify, what should he do in order to optimize his chances of winning THE NEW CAR? Are the odds of winning the car better if he sticks with his original choice? Are they better if he switches? Does it just not matter?

Intuition tells us that it doesn’t matter. There are two doors. One has a goat, one has a car. The odds are 50/50. Doesn’t really matter if he sticks with the first choice or switches. The odds are the same.

The truth though? He should absolutely switch. It will significantly improve his odds of winning. In fact, he’s twice as likely to win if he switches as he is if he sticks with his original choice.

But why???

If you want a deep explanation… I mean really deep, just head over to the Monty Hall Problem page on Wikipedia. But I’ll give my take on it.

When the contestant first chose a door, he had a 1 in 3 chance of winning the car. That’s pretty straightforward.

When Monty opens a door, he is not randomly opening a door. He’s opening a door with a goat behind it, always. So he is using his knowledge of what’s behind each door to change the nature of the game.

Let’s say we have doors A, B, and C. A and B have goats, C has a car.

There are three possible scenarios:

  1. Contestant chooses door A. Monty will open door B. If contestant switches doors, he wins.
  2. Contestant chooses door B. Monty opens A. If contestant switches, he wins.
  3. Contestant chooses door C. Monty opens either A or B. It doesn’t matter. If contestant switches, he loses.

So in two of the three scenarios, switching is the winning move.
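The same counting can be done exhaustively in code. This is just a sketch of the argument above, not a simulation: it walks through every combination of car position and first pick (nine in all, each equally likely) and tallies which strategy wins. The function name enumerateMontyHall is my own.

```javascript
// Enumerate every (car position, first pick) pair. All 9 are equally likely.
function enumerateMontyHall() {
  let stayWins = 0;
  let switchWins = 0;
  let total = 0;
  for (let car = 0; car < 3; car++) {
    for (let pick = 0; pick < 3; pick++) {
      total++;
      if (pick === car) {
        // First pick was the car: staying wins, switching loses.
        stayWins++;
      } else {
        // First pick was a goat: Monty must open the other goat door,
        // so the one remaining door has the car and switching wins.
        switchWins++;
      }
    }
  }
  return { stayWins, switchWins, total };
}

const exact = enumerateMontyHall();
console.log(`stay wins ${exact.stayWins} of ${exact.total}`);     // stay wins 3 of 9
console.log(`switch wins ${exact.switchWins} of ${exact.total}`); // switch wins 6 of 9
```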

Even though I’ve worked out the logic of this myself many times and done it in code more than once, it still feels vaguely magical.

Bring on the Code

Let’s start by playing the game 1000 times in a for loop. And we’ll start by setting up the three doors with goats behind them and swapping out a car on one of them.

// we'll play the game 1000 times.
for (let i = 0; i < 1000; i++) {
  // create 3 doors with goats behind them.
  let doors = ["goat", "goat", "goat"];

  // randomly choose one door and put a car behind it.
  let winner = Math.floor(Math.random() * 3);
  doors[winner] = "car";
  console.log(doors);
}

This outputs the 1000 random door configurations. Something like this:

[ 'car', 'goat', 'goat' ]
[ 'car', 'goat', 'goat' ]
[ 'goat', 'car', 'goat' ]
[ 'car', 'goat', 'goat' ]
[ 'goat', 'car', 'goat' ]
[ 'car', 'goat', 'goat' ]
[ 'goat', 'car', 'goat' ]
[ 'car', 'goat', 'goat' ]
[ 'car', 'goat', 'goat' ]
...

Now let’s have the contestant make a choice – 0, 1 or 2. Just to check ourselves, let’s see whether he wins a car and add up how many times he wins. Out of 1000 games, he should win somewhere close to 333.

let wins = 0;

// we'll play the game 1000 times.
for (let i = 0; i < 1000; i++) {
  // create 3 doors with goats behind them.
  let doors = ["goat", "goat", "goat"];
 
  // randomly choose one door and put a car behind it.
  let winner = Math.floor(Math.random() * 3);
  doors[winner] = "car";
  // console.log(doors);
 
  // choose a door and count how many wins we get
  let choice = Math.floor(Math.random() * 3);
  if (doors[choice] === "car") {
    wins++;
  }
}
// should be around 333
console.log(wins);

That checks out for me.

Next, we let Monty open a door to reveal a goat. There are probably plenty of clever ways to code this, but I’ll just loop through the three doors and choose one that is not the same as what the contestant chose, and make sure that it’s a goat door.

  ...
  // now monty chooses another door (must be a goat!)
  let montyChoice;
  for (let j = 0; j < 3; j++) {
    if (j != choice && doors[j] === "goat") {
      montyChoice = j;
      break;
    }
  }
  console.log(doors[montyChoice]);
  ...

Just to check myself, I logged what’s behind Monty’s door. Sure enough, nothing but goats.
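One thing worth noting about that loop: when the contestant has picked the car, it always opens the lowest-numbered goat door rather than a random one. That shouldn't change the win counts, but if you'd rather have Monty choose at random between the two goats, something like this would do it. The helper name montyOpens is made up for this sketch.

```javascript
// Return the index of the door Monty opens: never the contestant's
// choice, never the car. If two doors qualify, pick one at random.
function montyOpens(doors, choice) {
  const candidates = [];
  for (let j = 0; j < doors.length; j++) {
    if (j !== choice && doors[j] === "goat") {
      candidates.push(j);
    }
  }
  return candidates[Math.floor(Math.random() * candidates.length)];
}

const sampleDoors = ["car", "goat", "goat"];
console.log(montyOpens(sampleDoors, 0)); // 1 or 2, chosen at random
console.log(montyOpens(sampleDoors, 1)); // always 2
```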

Now we have the contestant’s choice and Monty’s choice. We can add the code back in that calculates how many times the contestant wins the car…

let wins = 0;
// we'll play the game 1000 times.
for (let i = 0; i < 1000; i++) {
  // create 3 doors with goats behind them.
  let doors = ["goat", "goat", "goat"];

  // randomly choose one door and put a car behind it.
  let winner = Math.floor(Math.random() * 3);
  doors[winner] = "car";
  // console.log(doors);

  // choose a door and count how many wins we get
  let choice = Math.floor(Math.random() * 3);

  // now monty chooses another door (must be a goat!)
  let montyChoice;
  for (let j = 0; j < 3; j++) {
    if (j != choice && doors[j] === "goat") {
      montyChoice = j;
      break;
    }
  }
  // console.log(doors[montyChoice]);
 
  // contestant does not switch
  if (doors[choice] === "car") {
    wins++;
  }
}
console.log(wins);

This isn’t any different than the first time we counted the wins. I consistently get numbers in the low 300s. That makes sense: Monty opening a door doesn’t change the fact that the contestant’s original pick had a 1 in 3 chance of winning.

But let’s see what happens if the contestant switches doors. Again, you can get fancy here, but I’ll go brute force, looping through the doors till I find the one that is neither the contestant’s choice nor Monty’s choice. And I’ll count how many times he wins with that choice.

let wins = 0;
// we'll play the game 1000 times.
for (let i = 0; i < 1000; i++) {
  // create 3 doors with goats behind them.
  let doors = ["goat", "goat", "goat"];

  // randomly choose one door and put a car behind it.
  let winner = Math.floor(Math.random() * 3);
  doors[winner] = "car";
  // console.log(doors);

  // choose a door and count how many wins we get
  let choice = Math.floor(Math.random() * 3);

  // now monty chooses another door (must be a goat!)
  let montyChoice;
  for (let j = 0; j < 3; j++) {
    if (j != choice && doors[j] === "goat") {
      montyChoice = j;
      break;
    }
  }
  // console.log(doors[montyChoice]);
 
  // contestant does not switch
  // if (doors[choice] === "car") {
  //   wins++;
  // }
 
  // contestant switches
  let newChoice;
  for (let j = 0; j < 3; j++) {
    if (j != choice && j != montyChoice) {
      newChoice = j;
      break;
    }
  }
  if (doors[newChoice] === "car") {
    wins++;
  }
}
console.log(wins);

Running this I get numbers in the upper 600s! Winning roughly 2 out of 3 times, exactly as predicted.

Summary

So there you go. The code proves things out. But it still feels a bit magical.

I did think of another thought experiment that helps to make it make sense. Let’s say there were four doors – three goats, one NEW CAR. Contestant chooses one, and Monty Hall opens two of them to reveal two goats. Or better yet, there are ten doors. Contestant chooses one and Monty opens eight. The only way he’d lose by switching is if he had chosen the car to begin with. And there was only a 1 in 10 chance he did that. So if he switches now, he’s got a 9 out of 10 chance of winning! That helps my brain a bit. But still seems a bit magical.
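The many-doors version is easy enough to simulate too. Here's a rough sketch, assuming Monty always opens every door except the contestant's pick and one other, and never reveals the car. The function name switchWinRate and the game count are my own choices.

```javascript
// Play `games` rounds with `numDoors` doors; Monty opens every door except
// the contestant's pick and one other, never revealing the car.
// Returns the fraction of games won by switching.
function switchWinRate(numDoors, games) {
  let wins = 0;
  for (let i = 0; i < games; i++) {
    const car = Math.floor(Math.random() * numDoors);
    const choice = Math.floor(Math.random() * numDoors);
    // The one door Monty leaves closed: the car if the contestant missed it,
    // otherwise some goat door (which one doesn't matter for the result).
    const remaining = choice === car ? (car + 1) % numDoors : car;
    if (remaining === car) {
      wins++;
    }
  }
  return wins / games;
}

console.log(switchWinRate(3, 100000));  // roughly 0.667
console.log(switchWinRate(10, 100000)); // roughly 0.9
```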

Probability #1

I started re-reading The Drunkard’s Walk again today.

I read it a few years ago and remember really liking it. There are lots of examples in there of situations that seem to defy logic, or at least defy our sense of what is logical. But these are provable mathematically using basic probability.

Whenever I come across a problem like this, even once I get my head around it as much as I can, I like to write some code to prove it out.

One of my favorite such problems is the two children problem. I haven’t gotten to it in the book yet, but I know it’s coming up. It’s a classic. Here it is, my paraphrase:

A woman says she has two children. She says one of them is a boy. What are the odds that the other one is a boy?

The obvious answer is 50%. There’s a child you don’t know about. It’s either a girl or a boy. Everything else is irrelevant, right?

Nope. Actually the odds are 1 in 3 that the other child is a boy, 2 in 3 that it’s a girl.

And the really odd part of it is you can change the wording in a way that seems to make no difference, but totally changes it:

A woman says she has two children. She says the first born one is a boy. What are the odds that the other one is a boy?

In this case, yes, the odds are 50% that the other child is a boy.

To understand the odds of a particular situation occurring, such as the genders of two children, you have to consider all possible arrangements and then how many of those arrangements satisfy the criteria. Then divide.

In this case, we have a family who had one child, then another child. They may have had a boy first, then a girl. Or maybe a boy, then another boy. Or a girl and then a boy. Or a girl and another girl. That’s four possibilities:

  • boy boy
  • boy girl
  • girl boy
  • girl girl

So in the first problem, the mother says that one of her children is a boy. This narrows us down to just three possibilities.

  • boy boy
  • boy girl
  • girl boy

In all three of those, one of the children is a boy. In two of them the other child is a girl and in only one, the other child is also a boy. So, 1 in 3 for boy, 2 in 3 for girl.

But let’s look at the second problem. Mom says that her first child is a boy. That gives us only two possibilities:

  • boy boy
  • boy girl

In one of those, the other child is a girl and in one, it’s a boy. 50/50.
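Before simulating anything, the counting above can be written out directly. This is only a sketch of the "list all arrangements, filter, then divide" idea, using the four equally likely birth orders. The variable names are mine.

```javascript
// The four equally likely birth orders.
const allFamilies = ["bb", "bg", "gb", "gg"];

// Version 1: "one of them is a boy" (at least one boy).
const atLeastOneBoy = allFamilies.filter(f => f.includes("b"));
const bothBoys = atLeastOneBoy.filter(f => f === "bb");
console.log(`${bothBoys.length} in ${atLeastOneBoy.length}`); // 1 in 3

// Version 2: "the first born is a boy".
const firstIsBoy = allFamilies.filter(f => f.charAt(0) === "b");
const secondAlsoBoy = firstIsBoy.filter(f => f.charAt(1) === "b");
console.log(`${secondAlsoBoy.length} in ${firstIsBoy.length}`); // 1 in 2
```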

Prove it with Code

I’m doing this with JavaScript using node.js. But use whatever you want.

For probability situations, it helps to have a large number of samples. So let’s make 1000 families. Each family will have two children. We can represent these by strings: “bb” means they had a boy, then another boy. “gb” means they had a girl, then a boy, in that order. Likewise, “bg” means the opposite order and “gg” means they had two girls. We’ll store all the families in an array, and we’ll actually go through one by one, randomly choosing the gender of the first child, and then the second child.

let families = [];

// create 1000 families with two randomly gendered children.
for (let i = 0; i < 1000; i++) {
  // no kids yet.
  let family = "";

  // first child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  // second child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  families.push(family);
}
console.log(families);

This should give you something like the following output, with 1000 total strings.

['gb', 'gb', 'gg', 'gb', 'gb', 'gg', 'bb', 'bg', 'bb', 'gb',
'bg', 'gg', 'gg', 'bg', 'bg', 'gg', 'gg', 'bg', 'gb', 'gg',
'gg', 'gb', 'bb', 'bg', 'gg', 'gb', 'gg', 'gb', 'bb', 'gb',
'bg', 'bb', 'gg', 'gb', 'gb', 'bb', 'bg', 'bg', 'gb', 'bb',
'gb', 'gb', 'gg', 'gg', 'bb', 'bb', 'bb', 'gb', 'gg', 'gb',
'gb', 'bg', 'gg', 'bg', 'bg', 'gb', 'bg', 'gg', 'gg', 'bg',
'bb', 'gb', 'bg', 'gb', 'gg', 'gg', 'bg', 'bg', 'gg', 'bb',
'gb', 'bg', 'gg', 'gb', 'bg', 'bg', 'bg', 'gg', 'bb', 'gb',
'gg', 'bb', 'bb', 'gg', 'gg', 'bb', 'gg', 'gg', 'bg', 'bb',
'bb', 'gg', 'gg', 'gg', 'gg', 'bg', 'gg', 'gg', 'bg', 'gg',
...
]

We have multiples of every type of family in there: “bb”, “bg”, “gg”, “gb”. Now, the mother said that one of the children was a boy. So let’s filter this down to only the families that have a “b” in them.

// now let's get all the families who have at least one boy
let oneBoyFamilies = families.filter(family => family.indexOf("b") > -1);
console.log(oneBoyFamilies);

My var name isn’t the greatest. Consider oneBoyFamilies to mean “at least one boy”. This should give you something like the following output.

['gb', 'gb', 'gb', 'gb', 'bb', 'bg', 'bb', 'gb', 'bg', 'bg',
'bg', 'bg', 'gb', 'gb', 'bb', 'bg', 'gb', 'gb', 'bb', 'gb',
'bg', 'bb', 'gb', 'gb', 'bb', 'bg', 'bg', 'gb', 'bb', 'gb',
'gb', 'bb', 'bb', 'bb', 'gb', 'gb', 'gb', 'bg', 'bg', 'bg',
'gb', 'bg', 'bg', 'bb', 'gb', 'bg', 'gb', 'bg', 'bg', 'bb',
'gb', 'bg', 'gb', 'bg', 'bg', 'bg', 'bb', 'gb', 'bb', 'bb',
'bb', 'bg', 'bb', 'bb', 'bg', 'bg', 'bg', 'bb', 'bb', 'bb',
'gb', 'bb', 'bb', 'bb', 'bg', 'bg', 'bg', 'gb', 'bb', 'bb',
'gb', 'bb', 'bg', 'bg', 'bb',
...
]

I don’t see any “gg”s in there, but just to be sure, we can say:

// validate that there are no families with two girls here
console.log(oneBoyFamilies.indexOf("gg"));

If we’ve done that right, we should get a -1 here. I do, so I’m satisfied.

Now, to find families where the other child is a boy, we need to look for families consisting of “bb”. But for families where the other child is a girl, we need to look for either “gb” or “bg”. Already you can see why it’s 2 to 1 in favor of the other child being a girl. But here’s the code:

let otherChildIsABoy = oneBoyFamilies.filter(family => family === "bb");
let otherChildIsAGirl = oneBoyFamilies.filter(family => family === "bg" || family === "gb");
console.log("boy: " + otherChildIsABoy.length);
console.log("girl: " + otherChildIsAGirl.length);

Alternately, for the girls, you could do something like:

let otherChildIsAGirl = oneBoyFamilies.filter(family => family.indexOf("g") > -1);

Shouldn’t make any difference. With either method, I consistently get in the mid-200s for boys and around 500 for girls. Total would be around 750-ish, which makes sense since we filtered out one of the four arrangements (“gg”). So, 1 in 3 for boys, 2 in 3 for girls. Spot on.

Finally, let’s do the second version. Here, the mom says her first child is a boy. So for that we have to search for families where the first character is a “b”. Then we just get the count of the resulting families where the second character is a “b” and the count where it’s “g”.

// this time, let's get all the families where the FIRST child is a boy
let firstBoyFamilies = families.filter(family => family.charAt(0) === "b");
console.log(firstBoyFamilies);

let secondChildIsABoy = firstBoyFamilies.filter(family => family.charAt(1) === "b");
let secondChildIsAGirl = firstBoyFamilies.filter(family => family.charAt(1) === "g");

console.log("boy: " + secondChildIsABoy.length);
console.log("girl: " + secondChildIsAGirl.length);

The results you see here will vary. Sometimes you’ll get more boys than girls as the second child. Sometimes the opposite. Occasionally they’ll be dead even. We’ll call that 50%.
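Raw counts bounce around a fair bit with only 1000 families. If you want to see the ratios settle down, one option is to bump up the sample size and print fractions instead. A sketch of that follows. The helpers makeFamilies and fraction are names I made up for this example, and 100000 is an arbitrary sample size.

```javascript
// Build `count` random two-child families as strings like "bg".
function makeFamilies(count) {
  const result = [];
  for (let i = 0; i < count; i++) {
    const first = Math.random() < 0.5 ? "b" : "g";
    const second = Math.random() < 0.5 ? "b" : "g";
    result.push(first + second);
  }
  return result;
}

// Of the families passing `condition`, the fraction where `outcome` also holds.
function fraction(families, condition, outcome) {
  const matching = families.filter(condition);
  return matching.filter(outcome).length / matching.length;
}

const sample = makeFamilies(100000);
const isTwoBoys = f => f === "bb";

// "One of them is a boy": the other is a boy about a third of the time.
console.log(fraction(sample, f => f.includes("b"), isTwoBoys));     // roughly 0.333

// "The first born is a boy": the other is a boy about half the time.
console.log(fraction(sample, f => f.charAt(0) === "b", isTwoBoys)); // roughly 0.5
```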

Why am I doing this?

This is firmly in the realm of recreational mathematics. I just find it really interesting to prove these things out even when my brain doesn’t agree 100% all the time. I enjoy doing it and I’ll probably do some more.

PS:

Here’s all the code in one spot:

let families = [];
// create 1000 families with two randomly gendered children.
for (let i = 0; i < 1000; i++) {
  // no kids yet.
  let family = "";

  // first child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  // second child
  if (Math.random() < 0.5) {
    family += "b";
  } else {
    family += "g";
  }

  families.push(family);
}
console.log(families);

// now let's get all the families who have at least one boy
let oneBoyFamilies = families.filter(family => family.indexOf("b") > -1);
console.log(oneBoyFamilies);

// validate that there are no families with two girls here
console.log(oneBoyFamilies.indexOf("gg"));

let otherChildIsABoy = oneBoyFamilies.filter(family => family === "bb");
let otherChildIsAGirl = oneBoyFamilies.filter(family => family === "bg" || family === "gb");
// alternately…
// let otherChildIsAGirl = oneBoyFamilies.filter(family => family.indexOf("g") > -1);

console.log("boy: " + otherChildIsABoy.length);
console.log("girl: " + otherChildIsAGirl.length);

// this time, let's get all the families where the FIRST child is a boy
let firstBoyFamilies = families.filter(family => family.charAt(0) === "b");
console.log(firstBoyFamilies);

let secondChildIsABoy = firstBoyFamilies.filter(family => family.charAt(1) === "b");
let secondChildIsAGirl = firstBoyFamilies.filter(family => family.charAt(1) === "g");

console.log("boy: " + secondChildIsABoy.length);
console.log("girl: " + secondChildIsAGirl.length);

Learning CNC and Making a MediaBox

Earlier this year I talked about my “Bit-Box” – a custom keyboard, program launcher, Stream Deck clone, device. https://www.bit-101.com/blog/2020/07/bit-box/

The box was handmade, but I had purchased a 3d printed plate to hold the switches. A little later I had the idea of making my own plate with wood. Initial tests, chiseling out a square hole for a single switch worked pretty well, but as soon as I tried to cut out several adjacent holes, the wood between the holes kept chipping out.

I started thinking about using a CNC to do this, and eventually picked up a Sainsmart 3018 Prover.

It took a couple of hours to assemble. Pretty easy actually. And it only came with some relatively useless v-shaped engraving bits, so I ordered a set of flat endmills in different sizes. Since then I’ve picked up a bunch of different bits.

In terms of software, I’ve tried a few different options.

One is Inventables Easel. This is a web app made for the Inventables X-Carve cnc machine. But it can export gcode that can be used with the 3018. Easel has some decent features for free, but you have to pay for full functionality.

The other one I’ve used is Carbide Create. This is made for Carbide3D’s Shapeoko machines. It’s desktop software and is totally free. It also exports gcode. I like Carbide Create a lot better.

The basic flow is to create a set of simple 2d vector shapes – rectangles, circles, paths – then apply tool paths to each shape. For example, you’d specify that you want to use this rectangle as an outline shape that is cut 1/4″ deep. Or you want to use this circle as a pocket, 1/8″ deep. A pocket cut cuts the entire inner area of a shape to a certain depth. You can also do boolean operations to combine or subtract different shapes. It’s super basic, but really does most of what you’d need.

If you want to really go crazy, you can get into 3d modeling with something like FreeCAD or Fusion360, and then create tool paths from those models. A much bigger learning curve and probably overkill until you get into some really complex stuff.

I use Candle to send the gcode to the machine itself.

MediaBox

My goal was to create a “MediaBox”. This is just what I call a custom mini keyboard with media keys – play/pause, next, previous tracks, volume up/down/mute. Six keys in all. Here’s an overview of all my attempts from one of the original hand-cut versions, some test cuts, a couple of failed attempts, through the final working build:

The initial holes for the keys worked perfectly. A 0.555 inch square hole is all you need. Spacing is something like 0.205 inches between keys.
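If you want to work out where the holes land before drawing them, the arithmetic is simple. Here's a small sketch using the two measurements above. The 3 by 2 grid is an assumption for illustration (the post only says six keys), and positions are measured from the corner of the first hole, with no border included.

```javascript
const HOLE = 0.555; // square hole side, inches (from the post)
const GAP = 0.205;  // wood left between holes, inches (approximate, from the post)
const COLS = 3;     // assumed layout: 3 columns
const ROWS = 2;     // by 2 rows = six keys

// Top-left corner of each hole, measured from the first hole's corner.
function holeOrigins(cols, rows) {
  const pitch = HOLE + GAP; // distance from one hole to the same edge of the next
  const origins = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      origins.push({ x: col * pitch, y: row * pitch });
    }
  }
  return origins;
}

const origins = holeOrigins(COLS, ROWS);
for (const o of origins) {
  console.log(`x=${o.x.toFixed(3)} y=${o.y.toFixed(3)}`);
}

// Overall footprint of the hole grid, before adding any border.
const gridWidth = COLS * HOLE + (COLS - 1) * GAP;
const gridHeight = ROWS * HOLE + (ROWS - 1) * GAP;
console.log(gridWidth.toFixed(3), gridHeight.toFixed(3)); // 2.075 1.315
```

The hole-to-hole pitch works out to 0.76 inches with these numbers.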

The main design issue beyond that was where to fit the Arduino board and how to route the usb cable. I was initially using 1/2″ black walnut. On the top were the holes for the keys. I then flipped it over and created a recess on the bottom. But the half inch depth was really too shallow. And my original design was just too small once I attached the cable.

So I switched over to 3/4″ walnut and made the whole thing just a bit larger.

Wired it up much the same as I did for the BitBox. Did some finish sanding and applied some tung oil, glued on a leather bottom.

The software presented a bit of a problem. The Arduino keyboard library does not provide a way to send media key codes. Luckily there is another 3rd party library, HID-Project.

You can add this library to your project by going to sketch / manage libraries and searching for “hid project”.

Here’s the code I came up with:


#include <HID-Settings.h>
#include <HID-Project.h>

// Define Arduino pin numbers for the buttons
#define VOL_DOWN 2
#define VOL_MUTE 4
#define VOL_UP 3
#define PLAY_PREV 5
#define PLAY_PAUSE 6
#define PLAY_NEXT 7

const long debounceTime = 30;
unsigned long lastPressed = 0;
boolean A, a, B, b, C, c, D, d, E, e, F, f;

void setup() {
  pinMode(VOL_DOWN, INPUT_PULLUP);
  pinMode(VOL_MUTE, INPUT_PULLUP);
  pinMode(VOL_UP, INPUT_PULLUP);
  pinMode(PLAY_PREV, INPUT_PULLUP);
  pinMode(PLAY_PAUSE, INPUT_PULLUP);
  pinMode(PLAY_NEXT, INPUT_PULLUP);

  a = b = c = d = e = f = false;
  Consumer.begin();
  BootKeyboard.begin();
}

void loop() {
  if (millis() - lastPressed  <= debounceTime) {
    return;
  }

  lastPressed = millis();

  A = digitalRead(VOL_DOWN) == LOW;
  B = digitalRead(VOL_MUTE) == LOW;
  C = digitalRead(VOL_UP) == LOW;
  D = digitalRead(PLAY_PREV) == LOW;
  E = digitalRead(PLAY_PAUSE) == LOW;
  F = digitalRead(PLAY_NEXT) == LOW;
  if (A && !a) {
    Consumer.write(MEDIA_VOL_DOWN);
  }
  if (B && !b) {
    Consumer.write(MEDIA_VOL_MUTE);
  }
  if (C && !c) {
    Consumer.write(MEDIA_VOL_UP);
  }
  if (D && !d) {
    Consumer.write(MEDIA_PREV); // alternately MEDIA_REWIND
  }
  if (E && !e) {
    Consumer.write(MEDIA_PLAY_PAUSE);
  }
  if (F && !f) {
    Consumer.write(MEDIA_NEXT); // alternately MEDIA_FAST_FORWARD
  }
  a = A;
  b = B;
  c = C;
  d = D;
  e = E;
  f = F;
}

This was adapted from a few other sample projects I found, as well as the code I had for the BitBox. It works great.

Want one?

I made this for myself, but I’d love to make some more. The materials aren’t cheap though. Well over $30 for the wood, leather, Arduino, keys and key caps. Then the time for cutting, finishing, soldering. I’ve got to work out pricing and different options, and the best way to sell them, but contact me if you’re interested.

I’d also be open to selling just the wooden box, either finished or straight off the mill and you can buy the other parts and put it together yourself. It’s a fun project.

Or… if you have a cnc already, I’m going to post the Carbide Create file I used, with instructions, for free. Check back soon for that.

I Think Bluetooth is Finally OK

Bluetooth was introduced on May 7, 1998. I think I first heard of it in the mid-2000s. People would use it to try to send contact info or other files between feature phones. As I recall, it had about a 50% chance of actually working. All of my attempts fell squarely in the failing 50%. So I ignored it for a few more years.

Then there were smart phones with Bluetooth and laptops had Bluetooth. There were Bluetooth mice and eventually Bluetooth fitness devices and smart(ish) watches. And they all SUCKED.

Bluetooth and Me: A History

Mice

Every Bluetooth mouse I had was slammed down on the desk in frustration at least once. And only very narrowly avoided being hurled across the room. When you’re using something all day every day, 99% uptime is unacceptable. I’d be in the middle of something and the mouse would just stop responding and I’d have to spend a minute or so reconnecting it. Then it might be fine for several more hours. I tried several and finally quit. I’m firmly in the wireless USB dongle camp now as far as mice go. Logitech’s MX Master 3 is glorious. It actually supports Bluetooth AND wireless. I think I tried an earlier version of the MX Master on Bluetooth and quit the first time it disconnected. The wireless dongle has never once failed me, and I’ve used many.

Headphones

Specifically, I’m talking about “earbuds” or what the kids call “IEMs” (in-ear monitors) these days. I’ve had multiple sets of these. Historically, they suffer from five issues:

  1. Poor audio quality.
  2. Discomfort due to weight.
  3. Poor battery life.
  4. Connectivity issues.
  5. Cost.

You could probably come up with something where you could say you get to choose three out of those five points. Maybe. The point is, they play off each other. Better battery life means more weight and cost. Anyway, I never had a pair that I was happy with. In the end, the hassle of a cord (and these days a USB-C adapter) has always been less than the hassle of battery, discomfort, poor sound, and connection problems.

Speakers

I’ve also had multiple Bluetooth speakers. And I’ll even throw my car stereo system into this category. These have been so-so. Connectivity has often been an issue. Some good, some not so good. My car in particular is really bad. It always takes a minute or so and at least two tries to actually connect my phone.

The other thing that has killed me with Bluetooth speakers is that they’ve always had horrible performance on listening to voice audio sources. Music is ok, but just about every one I’ve had cuts out in the silence between words. It will pick up again when it hears the next set of words, but routinely a few words will be lost on almost every sentence. I listen to a lot of podcasts and audiobooks, and this was always impossible with every Bluetooth speaker I had.

Fitness Devices / Smartwatches

I’ve had multiple running watches that had Bluetooth, as well as several Fitbits and an Android Wear watch. Generally, the Bluetooth has worked great. Until it stopped working great. When they decided to stop connecting via Bluetooth, it seemed like there was nothing I could do to get them to reconnect, not even rebooting the device and whatever device it was trying to connect to. But then at some point it would just start working again for however many days.

All this is to say that I’m not just someone who hates something they’ve never tried. I’ve had dozens of Bluetooth devices and every single one of them has caused me some level of frustration. And yet, I keep buying them, holding out hope. (Except mice. I’ve eternally given up on Bluetooth mice.)

But wait!

In the last couple of months, I’ve purchased three Bluetooth devices that I’m actually quite happy with!

Galaxy Buds Plus

For some reason, I decided to take another leap of faith and got another set of Bluetooth ear buds. I checked out a ton of reviews on these things and these seemed like a solid buy. The cost was $139 on Amazon, which isn’t cheap, but not exorbitant. I’ve been amazed at how happy I am with these things. There’s nothing I can say about these that is negative.

Battery life is great. They have the charging case, which itself has wireless charging. I already have wireless chargers scattered around the house, so it’s super easy to just toss it on one of them.

Connectivity has been flawless. They connect instantly, never lose the connection.

They are comfortable. I use them with foam tips, which I always get for any earbuds. Never get uncomfortable. I’ve used them while running and they stay put and feel fine.

Sound is quite good. Most of the time I’m listening to podcasts and audiobooks on my phone. They sound great for that. To be honest, for music, I stick with my Sony Walkman NW-A55 and wired Ikko OH-1 IEMs. That’s been a life changing combination. But if I’m running with my phone and want to listen to music, I’ll use the Buds for that, as the music is just background at that point.

I’ve had these for two and a half months now and I can’t say enough good about them. These are the items that have finally sold me on the idea that Bluetooth has made it.

JBL Flip 5

Speaking of sound, I recently picked up a Bluetooth speaker. To be completely transparent, I got this for free. A while back I switched to Verizon Fios and out of the blue they sent me this $100 coupon for the Verizon store as thanks for switching. Lots of phones and phone cases, chargers and headphones, none of which I really needed. I didn’t really need a Bluetooth speaker, but this had pretty good reviews and came to $95 with tax, so why not?

It sounds good, connectivity even on multiple devices has been great, and it works flawlessly with audiobooks and podcasts. Huge battery with lots of listening time. Also, you can turn off the power on/off and Bluetooth connect/disconnect sounds, which has been a big annoyance on every other speaker I’ve had.

Garmin Forerunner 235

In the last month I started running again. I pulled out my old Garmin running watch, which I hadn’t used in … sadly, years. After a full day of charging and trying to get it running, with no success, I ordered a new Garmin watch, the Forerunner 235.

It’s very nice. It’s a full on smartwatch (not Android), which you can add apps and watch faces to. I did set up a better watch face, but not really interested in other apps. It does all day heart rate and sleep tracking. Battery lasts a week if you’re not running. GPS while running will suck it down faster, but will still let you run for many hours without a problem.

It connects to the Garmin Connect phone app via Bluetooth and that’s been nearly perfect. When I finish a run, if I have my phone on me, it nearly instantly syncs to the cloud via Bluetooth and phone. If I don’t have my phone on me, it often syncs as soon as I walk into my driveway, with my phone inside the house. Downright impressive.

Summary

Bluetooth may have won me over. I look forward to seeing other quality implementations, though I’m not holding my breath on the mouse situation.

My Wireguard Setup

Disclaimer

Someone has been submitting my recent posts to online tech news aggregators, where they are criticized for not being cutting edge or paradigm shifting enough. If you’ve been led to believe that this post will awe and amaze you, complain to the person who submitted it, not me. This is just my personal blog where I write about stuff that I’m doing, mostly technology based. It will not change your life. That said…

Background

I’ve had a “home server” for close to ten years now. It’s a Linux-based desktop pc. It acts as a file server, media server, backup server and a place to try out different things. I guess it’s what is now popularly called a “home lab”. All that’s great when I’m at home on my home network. I can stream movies and music, get files, ssh into the server and do whatever I need to do.

But when I’m out and about, traveling, working (when we used to go out and do stuff like that), I’d also like to have that same access. That’s all simple enough. You go into your router settings, do some port forwarding to that box and then you can stream, ssh, ftp, vnc, whatever. I’ve certainly done just that often enough. But as I became more security conscious, this started to worry me more and more. Having all those ports open into my main machine made me nervous. Yeah, they are behind passwords, or hopefully keys. I locked down ssh pretty tightly, but still worried about it, and all those other services. When I was on Xfinity for home internet, their management app provided a security section which listed all the various attempts to access different ports on the network with their IPs and locations. It was shocking. It became something that was not just theoretical. People were (and are) actually trying to hack into my network. That’s when I shut everything down.

Enter Wireguard

I’d heard quite a bit about Wireguard and it sounded like what I needed. I came upon this tutorial which described exactly what I wanted to do and in pretty clear terms:

https://zach.bloomqu.ist/blog/2019/11/site-to-site-wireguard-vpn.html

This all went together really well. It took a bit of learning and messing things up and fixing them, but I eventually got it all working really nicely and doing exactly what I need. Here’s my current setup:

  • Main wireguard server hosted on an inexpensive VPS in the cloud.
    • ufw set up to block all traffic other than specific ports from specific wireguard clients.
    • rinetd to forward any needed ports to my home server. Currently, that’s just the port that my airsonic server is running on.
  • wireguard client running on my home server.
    • airsonic music streaming server running there.
  • wireguard clients running on a couple of laptops, my Android phone and tablet. Each client has its own private key and the public key of the server. The server has its own private key and the public keys of each client.

With this setup I can ssh into the VPS from anywhere in the world, provided I’m doing it from one of the configured clients. Once I’m into the VPS, I can then ssh into any one of the other clients that has an ssh server running. I could use rinetd to forward ssh on specific ports to specific clients. But for now, that use case is not that common. When the world gets back to normal and I’m out of the house more, that will be useful.

I’ve got my airsonic server running on a specific port of my home server, let’s say it’s 1234. rinetd is set up to forward port 1234 on the VPS to port 1234 on the home server. So I can access my music in the browser from any wireguard client, or I can use any one of many subsonic-compatible Android apps and have my music streaming to my phone or tablet no matter where I am.

This setup is pretty flexible, and I will be able to add other services to it just by opening up a port in ufw and forwarding it as needed using rinetd. Important thing to remember is that when I say “opening up a port in ufw” I mean a wireguard client accessible port. Nothing is open on the VPS except via wireguard. Nothing is open on my home server except via the VPS or local LAN.

Monitoring and Recovery

One downside to this setup is that to access my music for example, I’m relying on a chain of multiple links: wireguard on VPS, ufw, rinetd, wireguard on home server, airsonic. If any one of those doesn’t function just right, I’m listening to silence. This has happened a couple of times, especially when I first set things up and had some things not quite right. Actually, if ufw goes down, I’ll still be able to listen to my music, but my VPS will be open. So I wanted to get some monitoring in place. When things were down early on, I’d be making assumptions about which piece was broken and spending time trying to fix it, only to find out it was one of the other links. With correct monitoring, I can now tell exactly what is up and down.

Monitoring with Healthchecks

I’ve been a big fan of Healthchecks.io. You set up “checks” which provide you with a url to ping. If a check doesn’t get a ping within a specified time period, it notifies you via email, sms, or through more than twenty other integrated services. I’ve been using it to monitor my daily backups. If a backup doesn’t happen at a specified time, I know about it.

So I set up a cron job that runs a script every 10 minutes on my VPS, and a similar one on my home server. This script first checks the status of wireguard. If it’s up, it pings Healthchecks. It does the same for rinetd and ufw. My home server checks wireguard and airsonic. Each of these five services is set up as a separate check in Healthchecks so I can see the status of each of them separately. The cron job runs every 10 minutes, so I give it one extra minute leeway – if Healthchecks doesn’t get a new ping after 11 minutes, that service is marked as down.
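Here’s a minimal sketch of what one of these checks could look like. It is not the actual script. hc-ping.com is the Healthchecks ping endpoint, but the check UUIDs are placeholders, and the wg0 interface name, the systemd unit names and the script path in the cron line are all assumptions.

```shell
# Base URL for Healthchecks pings; each check has its own UUID.
HC_BASE="https://hc-ping.com"

# ping_if_up <check-uuid> <command...>
# Runs the command and pings Healthchecks only if it succeeds.
ping_if_up() {
    uuid="$1"
    shift
    if "$@" > /dev/null 2>&1; then
        curl -fsS -m 10 --retry 3 -o /dev/null "$HC_BASE/$uuid"
    else
        # No ping is sent, so Healthchecks will eventually notice.
        return 1
    fi
}

# Example calls on the VPS (UUIDs are placeholders, unit names are assumptions):
# ping_if_up "<wireguard-check-uuid>" systemctl is-active --quiet wg-quick@wg0
# ping_if_up "<rinetd-check-uuid>"    systemctl is-active --quiet rinetd
# ping_if_up "<ufw-check-uuid>"       sh -c 'ufw status | grep -q "Status: active"'
#
# Crontab entry to run the script every 10 minutes (path is hypothetical):
# */10 * * * * /usr/local/bin/check-services.sh
```

Passing the status command in as arguments keeps one function usable for every service, whatever its "is it up?" test happens to be.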

Recovery

Eventually I realized that if a particular service was down, once I became aware of it, I’d just go to whatever machine and restart it, so why not just do that automatically. So I built that into each of my checks.

If, say, wireguard is down on the VPS, it will NOT send the ping to Healthchecks. So a minute or so later it will be flagged as being down. But in this case, the script will also automatically try to restart wireguard. The next time it runs (10 minutes later), hopefully it sees that wireguard is up and sends the ping.
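As a rough sketch of that check-or-restart logic, assuming the service runs as a systemd unit (the unit name and the UUID in the example are placeholders, and check_or_restart is a made-up name):

```shell
# check_or_restart <systemd-unit> <ping-url>
# Pings Healthchecks if the unit is active. Otherwise it skips the ping
# and tries a restart, so the next run has a chance of finding it up.
check_or_restart() {
    unit="$1"
    url="$2"
    if systemctl is-active --quiet "$unit"; then
        curl -fsS -m 10 --retry 3 -o /dev/null "$url"
    else
        systemctl restart "$unit"
        return 1
    fi
}

# Example (unit name and UUID are placeholders):
# check_or_restart wg-quick@wg0 "https://hc-ping.com/<wireguard-check-uuid>"
```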

Healthchecks also has a “grace period” configuration. Once it notices something is down, it will not alert you until that grace period is done. I set this to 10 minutes. This results in the following sequence if something goes down:

  1. Service X is up and Healthchecks gets pinged at 10:00 pm.
  2. Service X goes down at 10:05 pm.
  3. At 10:10 pm, the script sees that Service X is down and fails to ping Healthchecks.
  4. The script also attempts to restart Service X.
  5. At 10:11 pm Healthchecks has not had a ping in 11 minutes and marks Service X as down.
  6. At 10:20 pm, the script runs again. Service X is up so it pings Healthchecks, which marks Service X as up again.
  7. Alternately, the restart didn’t work and at 10:20 pm no ping is sent.
  8. In this alternate case, at 10:21 pm, Healthchecks emails and texts me about the fact that Service X is down.

A potential improvement to this is that after step 4, when Service X is restarted, I could verify that it’s now working and ping Healthchecks immediately. This way, if the restart works, nothing is marked as down. But I’m going to run it as is for a while and see how this works out. So far, so good.
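That improvement might look something like the sketch below. It is untested against real services. The function name, the five second default wait and the use of systemd are all assumptions.

```shell
# restart_and_verify <systemd-unit> <ping-url> [wait-seconds]
# Restarts the unit, waits a moment, and pings Healthchecks right away
# if the unit came back up.
restart_and_verify() {
    unit="$1"
    url="$2"
    wait_secs="${3:-5}"
    systemctl restart "$unit" || return 1
    sleep "$wait_secs"
    if systemctl is-active --quiet "$unit"; then
        curl -fsS -m 10 --retry 3 -o /dev/null "$url"
    else
        # Still down: send nothing and let Healthchecks raise the alert.
        return 1
    fi
}
```

The short wait is there because a service can take a moment to report as active after a restart. How long is long enough depends on the service.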

I’ve gone through and tested each one of these checks, turning the service off and leaving it off. Within 11 minutes it was marked as down and restarted. And shortly thereafter marked as back up. All automatically.

If this were some kind of public service or mission critical workflow, I could easily set up the pings for every minute or so. But the 10 minutes seems perfectly adequate for my purposes.

More Details?

This post is pretty high level. Most of what went into the wireguard setup is covered in the above link. If you want to set up something similar, I’d be happy to go into more detail on any specific points. Just let me know.

version 1.3

Me again, talking about this silly version program still.

Actually, there are some pretty cool updates over the past few point releases. They came fast and on the heels of each other. The idea was posed to use the Linux package manager – apt or pacman or whatever – to get data on a program instead of relying on a hard-coded list.

Background

After some back and forth I warmed up to the idea, but as a backup to the known program list, not as a replacement. My reasoning is that you might have multiple versions of foo installed. Maybe one was through the default package manager, one through some download-and-run-an-install-script method. They might get installed to different locations in your PATH. But when you call foo on the command line, you’ll only get one of them.

If you query the package manager, it’s going to tell you about the one that it knows, which may or may not be the default. But when you run foo -v on the command line, you will get the one that’s going to be actually run in most cases. So that should be the first place we look. If version doesn’t know about foo then it can turn to the package manager.

Details

I decided to tackle two of the major Linux package managers first – apt (used on Ubuntu and most other Debian derivatives) and pacman (used on Manjaro and other Arch derivatives).

On apt, to find info about a package, say neovim, you’d type:

apt list neovim --installed

This will give you something like:

neovim/focal,now 0.4.3-3 amd64 [installed]

That 0.4.3-3 is the version number that we’re looking for. It took a bit of regex trickery, but I was able to parse that bit out of it.
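One possible way to pull that out with sed is sketched below. This is not necessarily the regex that version uses. apt_version is a hypothetical name, and the pattern assumes the output always has the name/suite, version, architecture, [installed...] shape shown above.

```shell
# apt_version <package>
# Prints the installed version, or nothing if apt does not list the package.
apt_version() {
    apt list "$1" --installed 2>/dev/null |
        sed -nE 's#^[^/ ]+/[^ ]+ ([^ ]+) .*\[installed.*#\1#p'
}

# Example:
# apt_version neovim
```

The -n flag plus the trailing p means only lines that matched get printed, which quietly drops the "Listing..." header line that apt adds.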

On pacman you’d type pacman -Qi neovim and the result would look something like:

Name : neovim
Version : 0.4.4-1
Description : Fork of Vim aiming to improve user experience, plugins, and GUIs
Architecture : x86_64
URL : https://neovim.io
Licenses : custom:neovim
Groups : None
Provides : vim-plugin-runtime
Depends On : libtermkey libuv msgpack-c unibilium libvterm luajit libluv
Optional Deps : python-neovim: for Python 3 plugin support (see :help python)
xclip: for clipboard support on X11 (or xsel) (see :help clipboard) [installed]
xsel: for clipboard support on X11 (or xclip) (see :help clipboard) [installed]
wl-clipboard: for clipboard support on wayland (see :help clipboard)
Required By : None
Optional For : None
Conflicts With : None
Replaces : None
Installed Size : 20.45 MiB
Packager : Sven-Hendrik Haase svenstaro@gmail.com
Build Date : Wed 05 Aug 2020 04:16:43 AM EDT
Install Date : Fri 21 Aug 2020 07:37:52 AM EDT
Install Reason : Explicitly installed
Install Script : No
Validated By : Signature

So we can use grep and/or sed to find the one line of that which starts with Version: and grab the 0.4.4-1 part of it.
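A minimal sed-only sketch of that, assuming the Version line looks like the one above (pacman_version is a made-up name, not something from the version script):

```shell
# pacman_version <package>
# Prints the installed version, or nothing if pacman does not know the package.
pacman_version() {
    pacman -Qi "$1" 2>/dev/null |
        sed -n 's/^Version[[:space:]]*:[[:space:]]*//p'
}

# Example:
# pacman_version neovim
```

The substitution strips the "Version :" label and prints only the line where that happened, so no separate grep is needed.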

I then did basically the same thing for dnf which is the package manager on Redhat, Fedora, and derivatives.

So the process is:

  1. Check to see if version already knows about the program. If so, just do what it already does.
  2. If not, check apt, pacman and dnf. First we can just check to see if each one of those exists and only run the one that does exist. It’s unlikely that many people will have more than one of those. If we find one of those, we do the parsing and spit out the version it tells us about.
  3. If those all fail, then we can just tell the user we couldn’t find any information on that command.
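Steps 2 and 3 could be sketched roughly as below. Step 1, the lookup in version’s own list, is left out, and so is dnf, which would follow the same pattern. All the function names are hypothetical and the parsing patterns are assumptions based on the sample output above.

```shell
# One small parser per package manager.
from_apt() {
    apt list "$1" --installed 2>/dev/null |
        sed -nE 's#^[^/ ]+/[^ ]+ ([^ ]+) .*#\1#p'
}
from_pacman() {
    pacman -Qi "$1" 2>/dev/null |
        sed -n 's/^Version[[:space:]]*:[[:space:]]*//p'
}

# package_version <program>
# Tries each package manager that exists on this machine, in turn.
package_version() {
    for pm in apt pacman; do
        # Skip package managers that are not installed.
        command -v "$pm" > /dev/null 2>&1 || continue
        result=$("from_$pm" "$1")
        if [ -n "$result" ]; then
            echo "$result"
            return 0
        fi
    done
    echo "No version information found for $1" >&2
    return 1
}
```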

Can we do more?

There are all kinds of other package managers on both Linux and Mac. I started making a list of the different ways you can find and install software and came up with

  • snaps
  • flatpaks
  • pip
  • npm
  • homebrew / linuxbrew

There are others, but those all cover a huge amount of ground. And it turns out that most of them were able to be solved with the same general strategy:

  • Does this package manager exist?
  • Does it know about this program and what info does it have?
  • Parse out the version number from the info it returns.

So, now version supports all of those. It just looks at each one of them in turn until it finds one that gives an answer.
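To give a feel for how similar these are, here is a sketch for two of them. pip show prints a "Version: x.y.z" line and brew list --versions prints the name followed by the version. The function names are made up and this is not the code in version itself.

```shell
# pip_version <package>
pip_version() {
    pip show "$1" 2>/dev/null | sed -n 's/^Version: //p'
}

# brew_version <formula>
# Drops the leading name. If several versions are installed, all are printed.
brew_version() {
    brew list --versions "$1" 2>/dev/null | sed -n 's/^[^ ]* //p'
}
```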

This also has the added functionality of being able to return the version of more than just executable programs. Package managers know about various libraries and other assets that aren’t directly executable or don’t have any way of querying them directly for their version. But version can tell you about them. Want to know what version of libusb you have installed? Typing version libusb will tell you.

A personal perk of doing this project is that I was forced to really learn grep and sed. Two programs that ranged from confusing to very mysterious in my mind. Now I get them and really like them. I wrote something up about them too: https://www.bit-101.com/blog/2020/09/grep-and-sed-demystified/

Catalina VM

I’m not a big fan of Apple. Their products themselves are fine, for the most part. Not to my preference in a lot of ways, but that’s fine. I know plenty of people who love their iPhones and Macbooks and watches. I’m not going to argue. I definitely don’t like the company though. I perceive them as being overly controlling, developer hostile, and incredibly narcissistic. All this is just a pretext for saying that I don’t want to spend any money on Apple hardware.

But now and then I do want to be able to test something on Macos. Some program or script or utility I’m working on, like version or my C or Go based graphics and animation libraries. It’s not my main target, but since command line tools developed on Linux are usually trivial to get working on Mac, I’m happy to test them out and make some minor tweaks so they work there.

I’ve been considering building a “Hackintosh” system. But with Apple’s plans of going to ARM possibly by the end of this year, I don’t want to invest a lot in hardware that’s going to be obsolete soon.

This led me to see if it was possible to get Macos running in a VM. And I found this project:

https://github.com/foxlet/macOS-Simple-KVM

This is a git repo with a couple of scripts that create a qemu VM, download the official Macos installer image and run that installer in that VM. It runs on Linux, as far as I know.

  1. Install the dependencies.
  2. Check out the repo.
  3. Run the jumpstart.sh script. This even allows you to choose which version you want. Defaults to Catalina.
  4. Create a virtual disk image and add that to the basic.sh script.
  5. Run the basic.sh script and choose install.

The UI that comes up is called “Clover”

The ui actually confused me at first. I was trying to click things, but you can navigate around with arrow keys and use enter to choose things. Choose the default option shown here to install.

This boots up a Macos system right off the install media. The first thing you’ll need to do is format that virtual disk you created. Then run the installer and tell it to install Macos to that disk you just formatted.

The install takes a long time. Like close to an hour. At points it says it has one minute remaining and hangs there forever, before saying it’s calculating the remaining time, and hanging there forever. But be patient, it will finish after rebooting a couple times.

When it’s done, it will boot right into the OS, asking you to set up a username and password. And then you’re in a full Macos install. Use Control-Alt-F to toggle full screen and Control-Alt-G to toggle capture of the keyboard and mouse. You can shut down as you’d usually shut down a Mac or just close the VM window.

When you boot back in, you’ll get the Clover screen again. This time choose the last option in the top row to boot.

This is the one that messed me up the first time. I just hit enter and wound up in the install flow again.

Performance

By default, the basic.sh script allocates 2 GB RAM and a minimal amount of CPU resources. It’s also hard-coded to 1280×720 resolution I think. Read through the documentation to find out how to beef up the VM. I gave mine 8GB and a lot more CPU. I also got it running at full resolution on my 2560×1440 (or something like that) monitor. With the extra resources, it’s surprisingly performant. I mean, I’m not going to be doing gaming or video editing or trying to run XCode on it, but for browsing the web, regular apps, anything console-based, it’s perfectly adequate.

Once you go full screen with it, it’s honestly hard to tell it’s not the real thing. It is a bit laggy on my Thinkpad with an i5 CPU, but pretty zippy on my Ryzen 5 3600 desktop. I’ve installed various tools, utilities and other programs, as well as Homebrew and a bunch of packages from there. I haven’t had any problems with it so far at all.

Although the install was slow as hell, it now boots up fully in under a minute for me. That’s good because it’s not something I want running all the time. Giving it all those resources means the underlying Linux OS does not have access to them. You can pause the VM though, which stops it from using any CPU power. I haven’t checked if that affects the memory use though. I’d guess not so much.

Here is Catalina at 1920×1080 on my Thinkpad

Legality

Who knows. Use at your own risk. Just to avoid any problems, I did not sign in with my Apple ID. I’ve never heard of anyone getting sued for running an OS in a VM.

Summary

I’m pretty excited about this. This is perfect for my use cases of testing things out here and there. I’m not expecting it to be a full replacement for actual Apple hardware running Macos. And I don’t need that.

I imagine once Apple switches over to ARM, this project will be obsolete. Hopefully someone figures out a way to continue it with their new architecture.

A Cooler Cooler from Cooler Master

Some weeks ago I shared my new PC build. It’s been wonderful. Working perfectly. One thing I did recently was put in another 500GB drive to hold my VM images. While I was in there, I moved the front mounted drives around to the back and did some better cable arranging.

One thing I had planned to do for a while was add a new CPU cooler. After reading some reviews and watching some Youtube videos, I settled on the Cooler Master Hyper 212 Black Edition.

I probably didn’t really need this. But I think it’s a good investment. I’d been using the stock cooler that came with my Ryzen 5 3600. It did the job adequately. I’m not into any kind of crazy overclocking or anything, and wasn’t having any real heat problems. In general use, with a browser and a few tabs open, terminal and some music playing, maybe Slack open, the CPU would be in the high 30s to mid 40s C. Unless it was completely idle it wouldn’t stay in the 30s much. More in the low 40s. With Slack and some more active tabs open, it’d get into the 50s, maybe some short peaks into the 60s. Completely idle with nothing running, it would be mid-to-high 30s.

With the new cooler, I can see an immediate difference. Idle with just a browser and several tabs, it will settle down to 31-32. I never saw it go that low with the stock cooler. So it’s promising. I’ve only put it in this morning, so I’ll give it a few days to see how it performs under daily loads, but seems like a good improvement so far.

Installation wasn’t too bad. Took off the old cooler and cleaned up the thermal paste. You have to use their custom back plate, so I removed the old one and set up the new one. This was probably the most complex part. The cooler is designed to fit on a number of different socket types. You have to install various posts or screws and clips and move them to various positions depending on which socket you have, as well as a couple different types of brackets that go on the cooler itself. There’s a decent manual with Ikea-like diagrams for each configuration. I managed to get it right the first time without too much difficulty.

I came very close to forgetting to put on fresh thermal paste, but caught myself before tightening any screws. Getting the four screws on the brackets started was a bit tricky. They’re all spring mounted and wobble around. You have to apply a little pressure and get them at the right angle to get them going. But once they were started, easy to tighten up by alternate corners.

The cooler comes with a single 120mm Silencio fan. The instructions have you mount it in a configuration that pushes air across the cooler and towards the back fan, which makes sense. But you can also purchase and add a second fan to mount on the other side of the cooler, for a push-pull configuration. There’s also an RGB version, but I think I have enough RGB going on in there as it is.

The cooler itself is pretty tall and the reviews all said to make sure you have enough room in your case. My case is ludicrously cavernous, so there’s plenty of room to spare, but it’s good advice to check.

version 1.0

A while back I posted about a script I wrote called version.

https://github.com/bit101/version

You pass it the name of a program and it tells you what version of that program you have installed. Example:

version java

This saves you from having to remember if it’s java -v, java --version, java -V or something else (no spoilers).

version now knows how to get the version of 156 different programs (including itself). It has 9 contributors and 15 stars. Not exactly React, but it’s cool to have people contributing.

In the original proof of concept, I was using bash case statements. In fact, this was the entire first iteration:

#! /bin/bash

case $1 in

java)
$1 -version
;;

gcc | rustc)
$1 --version
;;

node | perl | lua)
$1 -v
;;

python)
$1 -V
;;

go)
$1 version
;;

esac

Once I started adding more programs though, it became obvious that this wasn’t going to work. I discovered that bash and zsh support a form of associative arrays. I thought that would be the perfect thing. It would look something like:

declare -A tools
tools[gcc]=--version
tools[java]=-version
tools[node]=-v

Sadly, these are not supported in bash 3, which is still in use. In fact, my MacBook Pro has bash 3 on it.

Plan C was to fake associative arrays. Essentially, you just make a bunch of variables, one for each tool, with a common prefix:

tools_gcc=--version
tools_java=-version
tools_node=-v

I just had to do a bit of fancy regex with grep and sed to get the argument from the name of the tool that was passed as an argument. The initial pass with this method was pretty ugly, and I didn’t really understand what I had done. One of the contributors made some nice changes, and this led me to learning a lot more about grep and sed and I was finally able to get rid of grep altogether and do it all in sed. I was pretty happy with that. sed has always seemed like one of those arcane tools that only wizards knew how to use.
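To show the general idea, here is a small sketch of a sed-only lookup. It is not the actual code from version. It keeps the entries in a string called tool_table (a made-up name) rather than in real shell variables, just so the example stands on its own.

```shell
# Hypothetical lookup table: one "tools_<name>=<flag>" entry per line.
tool_table='tools_gcc=--version
tools_java=-version
tools_node=-v'

# lookup_flag <tool>
# Prints the version flag for the tool, or nothing if it is not in the table.
lookup_flag() {
    printf '%s\n' "$tool_table" | sed -n "s/^tools_$1=//p"
}

# Example:
# flag=$(lookup_flag java)    # flag is now -version
# java "$flag"
```

The tool name goes straight into the regular expression, so a name with special characters in it would need escaping first.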

I also learned how to make man pages. And I made an install and uninstall script. One of the other contributors has been working on making a snap package, but from what I can tell that’s probably not going to work too well due to the strict confinement of snaps.

Anyway, I don’t think there’s a whole lot more to be done with the simple tool. Hopefully people will still find new programs to add to it. But I figured it was done enough to slap a 1.0.0 sticker on it.

Grep and Sed, Demystified

I’ve kind of half understood grep for a while, but assumed that I didn’t really get it at all. I thought I knew nothing at all about sed. I took some time this weekend to sit down and actually learn about these two commands and discovered I already knew a good deal about both of them and filled in some of what I didn’t know pretty easily. Both are a lot more simple and straightforward than I thought they were.

Grep

grep comes from “global regular expression print”. This is not really an acronym, but comes from the old time ed line editor tool. In that tool, if you wanted to globally search a file you were editing and print the lines that matched, you’d type g/re/p (where re is the regular expression you are using to search). The functionality got pulled out of ed and made into a standalone tool, grep.

Basics

grep, in its simplest use, searches all the lines of a given file or files and prints all the lines that match a regular expression. Syntax:

grep <options> <expression> <file(s)>

So if you want to search the file animals.txt for all the lines that contain the word dog, you just type:

grep dog animals.txt

It’s usually suggested that you include the expression in single quotes. This prevents a lot of potential problems, such as misinterpretations of spaces or unintentional expansion:

grep 'flying squirrel' animals.txt

It’s also a good idea to explicitly use the -e flag before the expression. This explicitly tells grep that the thing coming next is the expression. Say you had a file that was a list of items, each preceded with a dash, and you wanted to search for -dog:

grep '-dog' animals.txt

Even with the quotes, grep will try to parse -dog as a command line flag. This handles it:

grep -e '-dog' animals.txt

You can search multiple files at the same time with wildcards:

grep -e 'dog' *

This will find all the lines that contain dog in any file in the current directory.

You can also recurse directories using the -r (or -R) flag:

grep -r -e 'dog' *

You can combine flags, but make sure that e is the last one before the expression:

grep -re 'dog' *

A very common use of grep is to pipe the output of one command into grep.

cat animals.txt | grep -e 'dog'

This simple example is exactly the same as just using grep with the file name, so is an unnecessary use of cat, but if you have some other command that generates a bunch of text, this is very useful.

grep simply outputs its results to stdout – the terminal. You could pipe that into another command or save it to a new file…

grep -e 'dog' animals.txt > dogs.txt

Extended

When you get into more complex regular expressions, you’ll need to start escaping the special characters you use to construct them, like parentheses and brackets:

grep -e '\(flying \)\?squirrel' animals.txt

This can quickly become a pain. Time for extended regular expressions, using the -E flag:

grep -Ee '(flying )?squirrel' animals.txt

Much easier. Note that -E has nothing at all to do with -e. That confused me earlier. You should use both in this case. You may have heard of the tool egrep. This is simply grep -E. In some systems egrep is literally a shell script that calls grep -E. In others it’s a separate executable, but it’s just grep -E under the hood.

egrep -e '(flying )?squirrel' animals.txt

Other Stuff

The above covers most of what you need to know to use basic grep. There are some other useful flags you can check into as well:

-o prints only the text that matches the expression, instead of the whole line

-h suppresses the file name from printing

-n prints the line number of each printed line.
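Here is a quick demonstration of those three flags. The two sample files and their contents are made up for the example.

```shell
# Create two small sample files in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'dog' 'flying squirrel' 'hot dog' 'squirrel' > "$tmpdir/animals.txt"
printf '%s\n' 'bulldog' 'cat' > "$tmpdir/pets.txt"

# -n: prefix each matching line with its line number
grep -n -e 'dog' "$tmpdir/animals.txt"
# prints:
# 1:dog
# 3:hot dog

# -o: print only the matched text, not the whole line
grep -o -E -e '(flying )?squirrel' "$tmpdir/animals.txt"
# prints:
# flying squirrel
# squirrel

# -h: leave off the file names grep adds when searching several files
grep -h -e 'dog' "$tmpdir/animals.txt" "$tmpdir/pets.txt"
# prints:
# dog
# hot dog
# bulldog

# Clean up when done:
# rm -r "$tmpdir"
```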

Simple Text Search

If the thing you are searching for is simple text, you can use grep -F or fgrep. All the same, but you can’t use regular expressions, just search for a simple string.

grep -Fe 'dog' animals.txt
fgrep -e 'dog' animals.txt

Perl

There’s also grep with Perl syntax for regular expressions. This is a lot more powerful than normal grep regex syntax, but a lot more complex, so only use it if you really need it. It’s also not supported on every system. To use it, use the -P flag.

grep -Pe 'dog' animals.txt

In this simple case, using Perl syntax gives us nothing beyond the usual syntax. Also note that pgrep is NOT an alternate form of grep -P. So much for consistency.

Sed

I thought that I knew next to nothing about sed, but it turns out that I’ve been using it for a few years for text replacement in vim! sed stands for “stream editor” and also hails from the ed line editor program. The syntax is:

sed <options> command <file(s)>

The two main options you’ll use most of the time are -e and -E which work the same way they do in grep.

The most common use of sed is to replace text in files. There are other uses which can edit the files in other ways, but I’ll stick to the basic replacement use case.

Like grep, sed reads each line of text in a file or files and looks for a match. It then performs a replacement on the matched text and prints out the resulting lines. The expression to use for replacement is

s/x/y/

where x is the text you are looking for, and y is what to replace it with. So to replace cat with the word feline in animals.txt (by default, only the first match on each line gets replaced)

sed -e 's/cat/feline/' animals.txt

Note that sed will print every line of text in the file, whether or not it found a match. But the lines that it matched will be changed the way you specified.

After the final slash in the expression, you can add other flags like g for global (replace every match on a line, not just the first) or i for case insensitivity (not available in every version of sed).
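A tiny example of the difference the g flag makes, using echo for the input so there is no file involved:

```shell
# Without g, only the first match on each line is replaced.
echo 'cat and cat' | sed -e 's/cat/feline/'
# prints: feline and cat

# With g, every match on the line is replaced.
echo 'cat and cat' | sed -e 's/cat/feline/g'
# prints: feline and feline
```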

Like grep, sed just outputs to stdout. You can redirect that to another file using > or pipe it to another process using |. But do NOT save the output back to the original file. Try it out some time on a test file that you don’t care about and see what happens.
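The reason is that the shell empties the target of > before sed ever gets to read it. If you do want the changes to end up in the original file, one safe pattern is to go through a temporary file. This is just a sketch, and replace_in_file is a made-up helper name.

```shell
# replace_in_file <sed-expression> <file>
# Runs sed on the file and only overwrites it once sed has finished.
replace_in_file() {
    tmp=$(mktemp) || return 1
    status=0
    # sed reads the original and writes to the temporary file first...
    if sed -e "$1" "$2" > "$tmp"; then
        # ...then the finished result is copied over the original.
        cat "$tmp" > "$2" || status=1
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example:
# replace_in_file 's/cat/feline/' animals.txt
```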

There are lots of other options you can use with sed but the above will probably get you by for a while to come. As you need more, just read up.

Summary

The biggest thing I took away from the couple of hours I spent on this was actually how easy these two commands were to learn. I’d been avoiding doing so for a long time and now wish that I had spent the effort on this much earlier.