Category Archives: sysadmin

Let’s Encrypt – Entering Public Beta

Let’s Encrypt goes public beta. No more paying ridiculous amounts, year after year, for a simple SSL certificate.

The process is still somewhat rough around the edges for now. I expect it to improve by the time it hits 1.0. There’s another important thing to note when you’re using certificates from Let’s Encrypt: in the interest of transparency, they publish a list of the certificates they issue. So if you’re uncomfortable with your domain appearing on a public website, you may want to reconsider.
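
For reference, the rough shape of the process right now is to clone the official client and ask it for a certificate. The domain and webroot path below are placeholders, and the exact flags may change as the client matures:

git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
./letsencrypt-auto certonly --webroot -w /var/www/example -d example.com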

Let’s Encrypt is a free, automated, and open certificate authority brought to you by the Internet Security Research Group (ISRG). ISRG is a California public benefit corporation, and is recognized by the IRS as a tax-exempt organization under Section 501(c)(3) of the Internal Revenue Code.

Source: Entering Public Beta

Setting up DD-WRT on D-Link DIR-868L

Just got the great-looking D-Link DIR-868L for free recently from a broadband package that I signed up for.

It’s an amazing router with great features and performance. It also has solid hardware specs, which makes it a perfect candidate for trying custom firmware like dd-wrt or OpenWrt. My preference would be OpenWrt, but unfortunately it is not supported at the time of writing. So it’s on to dd-wrt.

Installation of dd-wrt firmware can be done by following this wiki. Try it at your own risk, and always have the stock firmware on hand in case it doesn’t work.

Assuming you got this far, what’s next? Packages, naturally! To do that, you first have to enable JFFS under the dd-wrt Administration tab. Next, let’s install something.

root@xxxxxxxx:/jffs/tmp# ipkg update
mkdir: can't create directory '//usr/local/lib/': Read-only file system

root@xxxxxxxx:~# ipkg install nano
root@xxxxxxxx:~# nano
-sh: nano: not found

Uh oh. Turns out ipkg is broken on this firmware, and a search turns up other users facing the same issue. Someone on the forums suggested opkg instead, so that’s where I went. There are many forum posts, blog posts and wikis on this topic. The one I’m using is this. However, it doesn’t work out of the box, otherwise there wouldn’t be this blog post :-).

Following the instructions, you should reach a step that tells you to download a script and execute it. Going for the “not so brave people” approach,

root@xxxxxxxx:/jffs/tmp# wget -q -O- http://debian.keithdunnett.net/ddwrt/optware_setup > optware_setup
root@xxxxxxxx:/jffs/tmp# chmod 700 optware_setup
root@xxxxxxxx:/jffs/tmp# ./optware_setup
Checking we can reach the repository...
./optware_setup: line 15: can't create /opt/usr/bin/optware_boottime: nonexistent directory
chmod: /opt/usr/bin/optware_boottime: No such file or directory
Making sure we have an initial opkg
Connecting to downloads.openwrt.org (78.24.191.177:443)
wget: server returned error: HTTP/1.1 404 Not Found
Connecting to dev.openwrt.org (217.115.15.26:443)
wget: can't open '/opt/lib/functions.sh': Read-only file system
tar: can't open 'opkg.ipk': No such file or directory
tar: can't open 'data.tar.gz': No such file or directory

Delving into the script, there are two problems. First, /opt needs to be bind-mounted to /jffs/opt. Second, line 32 of the script points to an outdated package link, which needs to be changed (look up the latest link here).

root@xxxxxxxx:/jffs/tmp# mount -o bind /jffs/opt /opt
root@xxxxxxxx:/jffs/tmp# vi optware_setup
change line 32 to:
`/usr/bin/wget https://downloads.openwrt.org/snapshots/trunk/bcm53xx/generic/packages/base/opkg_9c97d5ecd795709c8584e972bfdf3aee3a5b846d-10_bcm53xx.ipk -O opkg.ipk` \

Let’s try again.

root@xxxxxxxx:/jffs/tmp# ./optware_setup
Checking we can reach the repository...
Making sure we have an initial opkg
Connecting to downloads.openwrt.org (78.24.191.177:443)
opkg.ipk 100% |***********************************************************************************************************************| 59159 0:00:00 ETA
Connecting to dev.openwrt.org (217.115.15.26:443)
functions.sh 100% |***********************************************************************************************************************| 7274 0:00:00 ETA
Creating the opkg config file in /opt/etc/opkg
You are now ready to install packages using opkg (this session only).
I've installed a script, optware_boottime, to run on boot and make the opkg settings persistent.
I'll add this to the end of rc_startup in nvram for you.
Downloading http://downloads.openwrt.org/snapshots/trunk/bcm53xx/generic/packages/base/Packages.gz.
Updated list of available packages in var/opkg-lists/chaos_calmer_base.
Downloading http://downloads.openwrt.org/snapshots/trunk/bcm53xx/generic/packages/packages/Packages.gz.
Updated list of available packages in var/opkg-lists/chaos_calmer_packages.
Downloading http://downloads.openwrt.org/snapshots/trunk/bcm53xx/generic/packages/routing/Packages.gz.
Updated list of available packages in var/opkg-lists/chaos_calmer_routing.
Downloading http://downloads.openwrt.org/snapshots/trunk/bcm53xx/generic/packages/telephony/Packages.gz.
Updated list of available packages in var/opkg-lists/chaos_calmer_telephony.
Minimal setup is complete. You should now have a working opkg.
We have created some aliases in your ~/.profile to make everything work.
Please either 'source .profile' or LOG OUT and LOG IN AGAIN before proceeding.

Success!
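
With opkg working, the package install that failed with ipkg at the start should now go through. Something like this, after sourcing the profile as instructed (nano is, again, just an example):

source ~/.profile
opkg update
opkg install nano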

PS: Note that the bind mount doesn’t survive a reboot, so you’ll need to add /jffs/opt to your fstab or a startup script in order to mount /opt on startup.
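
One way to do that (a rough sketch I haven’t verified on this particular router) is to append the bind mount to the rc_startup commands stored in nvram, the same mechanism the optware script uses for its own boot-time settings:

nvram set rc_startup="$(nvram get rc_startup)
mount -o bind /jffs/opt /opt"
nvram commit
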
Disclaimer: I’m a vim user. nano is just an example 🙂

How and Why Swiftype Moved from EC2 to Real Hardware – High Scalability –

The hard truths – cloud is not always the answer.

Great comment from HN:

The reason why it is extremely hard to engineer robust large scale AWS cloud apps can be summarized under the umbrella of performance variance:

– machine latency varies more, you can’t control it
– network latency varies more
– storage latency varies more (S3, Redshift, etc.)
– machine outages are more frequent

How and Why Swiftype Moved from EC2 to Real Hardware – High Scalability –.

How PAPER Magazine’s web engineers scaled Kim Kardashian’s back-end (SFW) — The Message — Medium

I knew about the Gluster file system, but it’s the first time I’ve heard of Bees with Machine Guns! This article provides an insider’s view of how an online magazine company scaled up its back-end to prepare for Kim Kardashian’s backend ;-). If you are a sysadmin or web engineer, I bet some parts of the article will make you smile.
How PAPER Magazine’s web engineers scaled Kim Kardashian’s back-end (SFW) — The Message — Medium.

M1 routers misbehaving

Was doing a routine scan when I spotted an unfamiliar address on the network: 192.168.200.1. Strangely, arp doesn’t reveal its MAC address, which seems odd given that this is a private IP address that should only be in use internally.
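
For completeness, the check in question is just filtering the ARP table for that address and coming up empty, which suggests the address isn’t actually on the local segment at all:

> arp -a | findstr 192.168.200.1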

Traceroute reveals the truth:

> tracert 192.168.200.1

Tracing route to 192.168.200.1 over a maximum of 30 hops

1 3 ms 4 ms 3 ms 10.0.0.2
2 * * * Request timed out.
3 213 ms 5 ms 5 ms 158.210-193-4.unknown.qala.com.sg [210.193.4.158]
4 3 ms 3 ms 3 ms 157.210-193-4.unknown.qala.com.sg [210.193.4.157]
5 104 ms 4 ms 5 ms 217.203-211-158.unknown.qala.com.sg [203.211.158.217]
6 88 ms 5 ms 22 ms 214.203-211-158.unknown.qala.com.sg [203.211.158.214]
7 25 ms 5 ms 14 ms 192.168.200.1

Trace complete.

It seems someone upstream has a misconfigured or misbehaving router that’s exposing private IP addresses. Let’s hope it’s not handling anything incredibly important.

Migrating a failing hard disk


It happened. Or should I say, almost happened.

As we all know, the hard disk (the mechanical kind, that is) is the component with the highest chance of failure in any computer system. One day I was doing a routine backup of my notebook. My backup solution is rather simple, consisting of no more than rsync. I had left a full backup running in the background before I went out, expecting it to complete before I returned, since only differences are copied. To my surprise, when I returned it was still running and my notebook felt very hot. Much hotter than usual, and that says something, as my notebook reaches uncomfortably hot temperatures after long use. I blame it on the GPU/hard disk.

The copying appeared to be stuck at 76% on a particularly large file. After terminating it and manually copying the file to my backup hard disk, it remained stuck at 76%. First sign that something was wrong. To be sure it wasn’t my backup hard disk that was having problems, I made a copy of the file on the same drive. Yup, same thing happened. I immediately stopped any attempts to access the file to avoid aggravating the problem. Conventional wisdom in hard disk recovery says that when a hard disk is showing signs of failure, do not access the “bad” parts ’cos it could cause the problem to get worse.
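
(For the curious, the rsync “solution” mentioned above is little more than a single invocation along these lines, with made-up paths:)

rsync -avh --delete /home/me/ /mnt/backup/notebook/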

A hard disk replacement is imminent, which is not a big deal. Except that it could mean reinstalling everything from scratch. Or not. I’m really not looking forward to spending days fighting with a new OS. So cloning the existing hard disk is my plan.

Step 1: research

Before doing anything that could lead to further data loss, it is always good to read up. My concerns were 1) data integrity and 2) preservation of the Windows license. As the Windows license that came with the notebook is an OEM license, I wasn’t sure if it would survive the cloning process – with a retail Windows license you can reactivate up to X times, I think. The recommended way to back up a Windows machine is to use Windows System Image Backup. Unfortunately it can’t be used in my case. My second idea was to use dd. However, I’m aware that dd can run into trouble reading bad sectors. In the end I decided on ddrescue, as it appears to address what I need from dd, but with more features targeted towards hard disk recovery.
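
For comparison, a plain dd clone of the failing disk would look something like the line below (not what I used; the device names match the ones in the steps that follow). conv=noerror,sync makes dd skip read errors and pad the output with zeros, but there is no retry logic and no record of which sectors were bad, which is exactly what ddrescue adds with its mapfile.

dd if=/dev/sda of=/dev/sdc bs=64K conv=noerror,sync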

Step 2: execute

I got a larger hard disk as recommended by most articles. I also needed a way to attach the new hard disk to my notebook. Here’s where my trusty SATA to USB adapter comes in handy. For the benefit of others who may want to do the same, the steps are:


  • download Knoppix Linux ISO
  • burn to CD, or if you’re lazy like me, create a bootable USB thumbdrive with it using Rufus
  • boot up to Knoppix
  • select shell
  • lsusb to see what USB devices are attached
  • insert SATA to USB adapter
  • lsusb to see what’s added
  • dmesg to see the newly added device; note the new device name
  • (assuming old hard disk is /dev/sda and new hard disk is /dev/sdc) take a deep breath and type:
    ddrescue -f -n /dev/sda /dev/sdc /root/rescue.log
  • if there are no errors, hurray! you can stop here. Otherwise, type:
    ddrescue -d -f -r3 /dev/sda /dev/sdc /root/rescue.log
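
Either way, the third argument (/root/rescue.log, the mapfile) is a plain text file you can look at to see what is left unrecovered: blocks marked ‘+’ are finished, while ‘-’ marks bad sectors that ddrescue has given up on.

cat /root/rescue.log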

In my case there was 1 x 8192 bytes of error after the first command. After running the second command, it was reduced to 1024 bytes. Ok, it wasn’t as bad as I thought 🙂

Step 3: verify

  • Unscrew the hard disk compartment and replace the old hard disk with the new one. Replace cover.
  • Boot up.

At this point, if it works it should be pretty obvious. I’m glad to report that everything worked as planned. wmic diskdrive shows the new hard disk details. Oh, and Windows didn’t complain. An unexpected bit of good news is that after the upgrade, things are speedier and my notebook doesn’t feel as hot as before. Hurray! 😀
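
The check mentioned above is just something along these lines (pick whichever columns you care about):

> wmic diskdrive get model,size,status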