Posted in 2016
Install Python and SQLite from Source
I was writing some Python to pull text from PDF files and put it into a SQLite database so that I could run full-text searches for various keywords and phrases. I was able to extract the text and put it in the database. I was using Anaconda on Windows to do this, fully expecting to be able to do the same on Ubuntu (14.04). I had to replace the sqlite3.dll with the latest one from here because the sqlite3.dll included with Anaconda didn’t have FTS4 or FTS5 enabled. This was as simple as copying the new DLL over top of the old one and running this script to verify the changes:
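A check along these lines can be sketched as follows. This is not the original script, just a minimal probe that assumes the simplest test is to try creating a virtual table with each full-text module; the function name `fts_support` and the `probe_*` table names are made up for illustration.

```python
import sqlite3


def fts_support():
    """Return a dict telling whether FTS4 and FTS5 are usable in this build."""
    results = {}
    conn = sqlite3.connect(":memory:")
    try:
        for module in ("fts4", "fts5"):
            try:
                # Creating a virtual table fails with OperationalError
                # ("no such module") when the extension is not compiled in.
                conn.execute(
                    f"CREATE VIRTUAL TABLE probe_{module} USING {module}(body)"
                )
                results[module] = True
            except sqlite3.OperationalError:
                results[module] = False
    finally:
        conn.close()
    return results


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print(fts_support())
```

Running it before and after swapping the DLL should show the flags flip from False to True if the new library was picked up.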
Use WinSCP to sync files from Windows to Linux
Recently I upgraded my work laptop to Windows 7. At that time I didn’t want to use the previous sync methods that I have blogged about. I wanted something simpler (read: easier to install and maintain between different machines). After doing some research I settled on WinSCP. WinSCP supports folder sync operations through a command line and takes a simple text file listing the commands that it is to execute. This process can be automated on Windows using two batch files, one to pull changes and the other to push changes.
Updated Mercurial Batch Pull/Update Python Script
It has been a while since I last posted. Here is an update to the Mercurial push, pull and update scripts I had posted earlier. The code is much better than the original scripts. All of the functionality is wrapped into one script instead of being spread across a few. It should run on Windows without alteration (I’ll get a chance to test it out on Tuesday).
Tolerance Testing - Determining an Appropriate Tolerance
Programming with floating point values leads to numerical round-off errors due to the nature of binary numbers. For a more detailed discussion see this article or this one. Basically it boils down to the fact that not all real numbers can be represented by a finite binary sequence. Because of this phenomenon, comparing floating point values directly is strongly discouraged, as the results can be unexpected. Normally, the absolute value of the difference is taken, and if it is less than some tolerance value the two numbers are accepted as a match.
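The comparison described above can be sketched in a few lines of Python. The helper name `nearly_equal` and the default tolerance of 1e-9 are illustrative assumptions, not values taken from the post; `math.isclose` is the standard library's own version of the same idea.

```python
import math


def nearly_equal(a, b, tol=1e-9):
    """Treat a and b as equal when they differ by less than tol."""
    return abs(a - b) < tol


total = 0.1 + 0.2

# Direct comparison fails because 0.1 and 0.2 have no exact binary form.
print(total == 0.3)                            # False
# Comparing the absolute difference against a tolerance succeeds.
print(nearly_equal(total, 0.3))                # True
# The standard library offers a relative-tolerance check as well.
print(math.isclose(total, 0.3, rel_tol=1e-9))  # True
```

An absolute tolerance suits values of a known magnitude; a relative tolerance scales with the numbers being compared.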
Team Combinations from a Limited Pool of Players
I had to determine an arrangement of teams from a player pool. Specifically, there were 9 players that needed to be organized into teams. It seemed straightforward to arrange them into 3 teams of 3 players. The caveat was that the teams needed to be as fair as possible. Some players were highly skilled while others were not. It wouldn’t be fair to stack the best players on a single team. In order to determine a fair arrangement I had to figure out how many combinations of teams were possible. This would allow me to iterate through all of the combinations and apply a metric to each one. The combination that produced the minimal value would be the optimal arrangement.
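The search described above can be sketched as follows. The skill ratings and the fairness metric (the gap between the strongest and weakest team totals) are assumptions made for illustration; the post does not say which metric was used. For 9 players in teams of 3 there are 9! / (3!^3 * 3!) = 280 distinct arrangements, which is small enough to enumerate.

```python
from itertools import combinations


def team_partitions(players, size):
    """Yield every way to split distinct players into unordered teams of `size`."""
    players = list(players)
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    # Pinning the first remaining player to the next team prevents the same
    # arrangement from being produced again with the teams in another order.
    for mates in combinations(rest, size - 1):
        team = (first,) + mates
        remaining = [p for p in rest if p not in mates]
        for others in team_partitions(remaining, size):
            yield [team] + others


def spread(teams, skill):
    """Hypothetical fairness metric: strongest team total minus weakest."""
    totals = [sum(skill[p] for p in team) for team in teams]
    return max(totals) - min(totals)


# Made-up skill ratings for nine players.
skill = {f"P{i}": i for i in range(1, 10)}

arrangements = list(team_partitions(skill, 3))
print(len(arrangements))  # 280

best = min(arrangements, key=lambda teams: spread(teams, skill))
print(best, spread(best, skill))
```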
Sync files from Windows to Linux using SSH
Over the weekend I decided to figure out how to sync files between Windows-based computers and Linux-based computers, specifically Ubuntu. On Windows I investigated a number of technologies and finally settled on cwRsync. The reason for the choice is that I really like rsync. I have a number of scripts that work really well (and are fast) that I use on my Linux boxes on a regular basis. There is an rsync available in Cygwin, but a full Cygwin install is far too heavy for simple file synchronization. cwRsync is the best of both worlds. It packages the Cygwin DLL and rsync binaries in a form that is easy to use on Windows.
Speed Up Factor
I watch a lot of Coursera videos and usually view them at 1.25x or 1.5x normal viewing speed. I started thinking about how much that would translate into viewing time.
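As a rough sketch of the arithmetic: playing a video at speed s takes 1/s of its normal duration, so the fraction of time saved is 1 - 1/s. The function names below are illustrative.

```python
def viewing_time(duration_minutes, speed):
    """Minutes needed to watch a video of the given length at `speed`x."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return duration_minutes / speed


def fraction_saved(speed):
    """Fraction of the normal viewing time saved at `speed`x."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return 1 - 1 / speed


for speed in (1.25, 1.5):
    minutes = viewing_time(60, speed)
    saved = fraction_saved(speed)
    print(f"{speed}x: a 60 minute lecture takes {minutes:.0f} minutes "
          f"({saved:.0%} saved)")
```

A 60 minute lecture takes 48 minutes at 1.25x and 40 minutes at 1.5x.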
Sequential Generation from an Index Value
I needed to be able to generate a sequence of letters from a specific index value. Basically, I wanted to loop through a sequence of values and retrieve the corresponding string value. For example, 0 would be A; 4 would be E; 36 would be AK; etc.
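One way to get this mapping, sketched here under the assumption that the sequence follows spreadsheet-style column names (A to Z, then AA, AB, and so on), is to treat the index as a bijective base-26 number. The function name `index_to_letters` is made up for illustration.

```python
def index_to_letters(index):
    """Convert a zero-based index to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # work with a one-based value so that 1 maps to 'A'
    while n > 0:
        # Subtracting 1 before dividing is what makes the scheme bijective:
        # there is no "zero" digit, so 26 maps to Z rather than rolling over.
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


for i in (0, 4, 25, 26, 36):
    print(i, index_to_letters(i))
```

This reproduces the examples in the text: 0 gives A, 4 gives E, and 36 gives AK.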
Rsync between Windows Folders
Following from the last post, here is an example script that uses cwrsync to sync a network share and another folder. I had to map the network share to a drive before I could use it properly.
Yesterday, everything was working well with my Ubuntu installation. I had to go and mess that up! I thought that I would go and remove packages that I no longer needed. After pruning the files from synaptic everything seemed OK till I restarted the computer. I couldn’t boot into the desktop. I figure I removed something critical. I spent a couple of hours trying to recover.
I have been planning on moving from Windows for a while now. I just hadn’t really analyzed what was keeping me on Windows. I finally got around to it and realized that I really only use VB.NET and C#. With the Mono framework, there should really be nothing to hold me back. So I made the decision to switch my home computer over. I had previously installed Ubuntu on my son’s computer and was very impressed with it.
Python Script to Parse pfSense DHCP Log
I have a captive portal set up on my pfSense box which allows my laptops and various other devices to connect through wifi. I was looking at the DHCP logs provided by pfSense the other day and realized that I needed a way to verify the MAC addresses that were requesting IP addresses. I put together a Python script that parses the log and attempts to match the MAC addresses that I know with the ones in the log. Enjoy the code and note that the MAC addresses have been changed.
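The matching step can be sketched roughly as follows. This is not the original script: the log line format, the known-device table, and every address in it are made-up assumptions for illustration. The sketch simply pulls anything that looks like a colon-separated MAC address out of each line and checks it against a known list.

```python
import re

# Six colon-separated pairs of hex digits, e.g. 00:11:22:33:44:55.
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")

# Hypothetical known devices; the addresses are invented.
KNOWN_MACS = {
    "00:11:22:33:44:55": "laptop",
    "66:77:88:99:AA:BB": "phone",
}


def classify_macs(lines, known):
    """Return (seen_known, unknown) for the MAC addresses found in lines."""
    known_lower = {mac.lower(): name for mac, name in known.items()}
    seen_known, unknown = {}, set()
    for line in lines:
        for mac in MAC_RE.findall(line):
            mac = mac.lower()  # compare case-insensitively
            if mac in known_lower:
                seen_known[mac] = known_lower[mac]
            else:
                unknown.add(mac)
    return seen_known, unknown


# Sample lines in an assumed dhcpd-style format.
sample_log = [
    "Jan  1 12:00:00 dhcpd: DHCPREQUEST for 192.168.1.10 from 00:11:22:33:44:55 via em1",
    "Jan  1 12:05:00 dhcpd: DHCPDISCOVER from de:ad:be:ef:00:01 via em1",
]

seen, unknown = classify_macs(sample_log, KNOWN_MACS)
print("known:", seen)
print("unknown:", sorted(unknown))
```

For a real log, the lines would be read from the exported file instead of the `sample_log` list.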
Python Reading Material
A while back I finished a pretty good book on Python: Python Scripting for Computational Science by Hans Petter Langtangen (link). It was a pretty good introduction to Python. I really liked the slant towards the sciences and engineering. The problem sets were good.
I have moved the blog from WordPress to a new hosting provider, Cloud-A. They provide a server and I configure it to run. So far the process has been pretty straightforward. I have the blog running on Ubuntu 14.04 and hosted using Nginx. It really didn’t take too long to get things up and running. The longest part was converting my old posts to reStructuredText. I decided to use a so-called static blog generator called Nikola as opposed to another WordPress implementation.
Mercurial and TortoiseHG on Ubuntu
I like Mercurial as a version control system because it is cross-platform (written in Python) and is distributed (meaning it doesn’t require a central server to function). I use it on Windows quite extensively, and it was one of the pieces of software that I needed on Linux. The other piece that I needed was TortoiseHG. It is a graphical front end to Mercurial and works well.
Mercurial Push/Pull script with status checking
This is a modification to the original script that I published a while back. It now checks the status (hg status) of the repository before doing anything. If there are uncommitted changes, a message is printed and the repository is skipped by the pull/update mechanism. The same check is made before pushes. It is a very nice improvement to the script.
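The status check can be sketched roughly like this. It is not the original script: the function names, the assumption that every repository sits directly under one root directory, and the injectable `run` parameter (there only so the logic can be exercised without Mercurial installed) are all illustrative. Note that plain `hg status` also lists untracked files, so a repository with stray files is treated as dirty here.

```python
import subprocess
from pathlib import Path


def has_uncommitted_changes(repo, run=subprocess.run):
    """Return True when `hg status` prints anything for the repository."""
    result = run(["hg", "status"], cwd=repo,
                 capture_output=True, text=True, check=True)
    return bool(result.stdout.strip())


def pull_and_update(root, run=subprocess.run):
    """Pull and update every clean repository under root; return the skipped ones."""
    skipped = []
    for repo in sorted(Path(root).iterdir()):
        if not (repo / ".hg").is_dir():
            continue  # not a Mercurial repository
        if has_uncommitted_changes(repo, run):
            print(f"Skipping {repo}: uncommitted changes")
            skipped.append(repo)
            continue
        # `hg pull -u` pulls and then updates the working directory.
        run(["hg", "pull", "-u"], cwd=repo, check=True)
    return skipped


# Example call (the path is hypothetical):
# pull_and_update("/home/user/repos")
```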
Mercurial Push/Pull and Update scripts
I like Mercurial as a version control system. It has a number of advantages over more traditional systems such as Subversion. I won’t go into details; they are easy to find on the internet. I organize all of my Mercurial repos under a root directory. I also use TortoiseHG as a graphical client that manages the commits and push/pull cycles. It works well for a single repository. Unfortunately it doesn’t work as well for a large number of repositories; that is, it can’t do batch push/pull or updates.
File and Folder Permissions
As I get my Ubuntu system running the way I like, I find I am copying files over from my old Windows partitions (MP3s, documents, pictures, etc.). I was looking at the permissions of my pictures and they were set to 777. I didn’t understand why. I think it has to do with the fact that I copied them from a Windows NTFS partition. I could understand if they were set to 666, but having the executable bit set really threw me. I wanted to change my pictures to permissions of 644. I tried running the chmod command in my home folder on my pictures.
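A small Python sketch of one way to reset the permissions is shown below. It assumes the goal is 644 for files and 755 for directories; directories need the execute bit to remain traversable, so they are handled separately. The function name and the example path are hypothetical.

```python
import os


def reset_permissions(root, file_mode=0o644, dir_mode=0o755):
    """Set file_mode on regular files and dir_mode on directories under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Directories keep the execute bit so they can still be entered.
        os.chmod(dirpath, dir_mode)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks (and their targets) alone
            os.chmod(path, file_mode)


# Example call (the path is hypothetical):
# reset_permissions(os.path.expanduser("~/Pictures"))
```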
Copy Pictures from a Digital Camera and Automatically Rename to Date and Time Taken
Most digital cameras use some sort of naming scheme that leaves a lot to be desired. The names usually consist of something like:
Convert MTS (AVCHD) Files to xvid
I have a Panasonic Lumix camera that generates MTS (AVCHD) movie files. These files are 720p HD files and are really large. I want to store them in a smaller file format without sacrificing quality. It is pretty straightforward to convert an MTS (AVCHD) movie file to xvid using ffmpeg. The following command will accomplish the goal nicely:
Convert MTS (AVCHD) Files to mkv
Here is a simple shell script that will use ffmpeg to convert MTS files to MKV format using the H.264 codec to compress them.
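For readers who prefer Python, the same idea can be sketched as below. This is not the shell script from the post: the CRF value of 20, the choice to copy the audio stream unchanged, and the function names are assumptions made for illustration. It requires ffmpeg built with libx264 to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_command(source, crf=20):
    """Build an ffmpeg command that re-encodes the video stream as H.264 in MKV."""
    source = Path(source)
    target = source.with_suffix(".mkv")
    return [
        "ffmpeg",
        "-i", str(source),
        "-c:v", "libx264",   # H.264 video encoder
        "-crf", str(crf),    # quality setting (assumed; lower means larger files)
        "-c:a", "copy",      # keep the original audio stream as-is
        str(target),
    ]


def convert_folder(folder):
    """Convert every .MTS file in folder, skipping ones already converted."""
    for source in sorted(Path(folder).glob("*.MTS")):
        if source.with_suffix(".mkv").exists():
            continue
        subprocess.run(build_command(source), check=True)


# Example call (the path is hypothetical):
# convert_folder("/home/user/videos")
```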
Convert MP3s to iPod Audio Book format (M4B)
I needed to convert a group of MP3 files into a format that was suitable for playing on my iPod. Of course the MP3s could be played directly on the iPod without any trouble. This is great for songs, but an audio book is significantly longer. In my case I have a 40 minute commute each way, and most audio books are too long to listen to during a single commute. The iPod supports M4B files, which are audio book files that remember where they were stopped, so you can resume listening after putting the iPod to sleep or listening to your music collection. The audio book format also supports changing the playback speed so the book can be read to you much faster.
Configuring MathJax on Ghost
I am going to add MathJax support. In the code injection portion of your settings, add the following code to the header injection mechanism:
Configure Syntax Highlighting on Ghost
For syntax highlighting I am going to use highlight.js because I don’t have to install anything. Simply add the following code to the blog header code injection in the settings:
5-pin Bowling Statistics Calculator
My son plays 5-pin bowling and is a member of YBC Canada. I used to keep track of his average and some statistics using a spreadsheet. I would enter the data after the end of every series of games and then copy the cells down so that the formulas were applied and the correct statistics were calculated. This process worked well enough, except I started to notice small discrepancies between my calculations and the posted results.
I have switched platforms again. I have moved to the blogging platform.