Pilot’s Flying Logbook

AFE Pilot's Flying Logbook

I bought myself a Pilot’s Flying Logbook on Sunday after an interesting lesson in which I learnt more about Straight and Level Flight, saw two de Havilland Tiger Moth biplanes, did my first take-off and experienced my first go-around (an aborted landing, due to a plane taxiing onto the runway we were about to land on).

It has three entries so far; I look forward to adding a lot more in the next few months whilst the weather is so nice.

Danny


Open Source JavaScript talk at JavaScript Cambridge

Presenting at JavaScript Cambridge

I presented a talk about my experiences developing Open Source JavaScript applications for CS Blogs today at the JavaScript Cambridge user group.

I covered:

  • how your intended developer audience should affect your technical decisions
  • how a good application doesn’t always become a good open source project
  • how to structure an application or system to make it easy to contribute to
  • how the CS Blogs workflow fits together

After having received some feedback from Charlotte, who watched the whole thing, I intend to make some improvements to the structure of the presentation and submit it to some other conferences.

I’m not sure how much sense the slides will make without me talking around them, but they’re available here.

Danny

Braces

There are a few things that any seasoned Software Engineer will have had arguments (sorry, discussions) about: Windows vs Linux, Merge vs Rebase and, inevitably, code indentation style.

Just today Rob and I discussed whether we should diverge from the “One True Brace Style” (1TBS) decreed by the Airbnb JavaScript Style Guide toward the Stroustrup style of indentation. The only difference? Stroustrup does not use a “cuddled else”; instead, else keywords are on their own line.
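To make the difference concrete, here is the same trivial condition in both styles (the function names are placeholders, not CS Blogs code):

if (train.isDelayed()) {
  offerRefund();
} else {            // 1TBS: the else is “cuddled” up to the closing brace
  thankCustomer();
}

if (train.isDelayed()) {
  offerRefund();
}
else {              // Stroustrup: the else gets a line of its own
  thankCustomer();
}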

Does such a minor difference matter? I would argue it does. If being able to read code in a certain style increases a programmer’s productivity then that is no bad thing. However, this increase in productivity can easily be offset by having to change style when working in different codebases. Consistency is important.

To maintain consistency in the CS Blogs codebase every component would have to be updated. This would mean hundreds of lines changing for style alone, reducing the effectiveness of git blame and muddying the commit history. Even if we were willing to do that, eslint-config-airbnb was downloaded 399,657 times in the last month, and I would wager most of the projects using it are sticking with the suggested 1TBS style. The advantage of having code that looks like the “standard” for an open source project is that it enables potential contributors to get involved that bit more easily.

My theory about code style guidelines is that in a team of n people, n-1 of them will be unhappy with at least part of the guidelines. The only person who will be completely happy with them will be the person who decided upon the rules. Programming is merely transcribing processes and thoughts into a language a computer can understand; in that sense it is very personal, and everyone is therefore likely to have strong feelings about how those thoughts look on screen.

As with so many things in Software Engineering, in many ways the style you choose doesn’t matter, but sticking to it and enforcing consistency does. This is why I am against changing the CS Blogs codebase, even though I agree with Rob that the Stroustrup style is nicer on the eye.

So, what can Rob do in this situation? The first option would be to just keep writing in the 1TBS style until it feels natural (this took me a few days of writing). Alternatively, he could use an automated code formatter to change how his local code looks, and have it automatically changed back to the prescribed style before any commit to version control. Any mistakes made by the automated formatter would be caught by the ESLint commit hook.
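Incidentally, this divergence would be a one-line change to police, since ESLint’s brace-style rule accepts “1tbs”, “stroustrup” and “allman” as options. A minimal .eslintrc sketch, not our actual configuration:

{
  "extends": "airbnb",
  "rules": {
    "brace-style": ["error", "stroustrup"]
  }
}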

Danny

Re-architecting CS Blogs

Where are we now?

As I mentioned in my previous post, the current CS Blogs system grew out of a prototype. This meant that the requirements of the system were discovered in parallel with designing and implementing it, resulting in the slightly weird architecture shown below.

Old CSBlogs Architecture

I say it’s weird because the `web-app` component isn’t really just a web application — it’s also an API server for the Android application (and, in theory, any other app) and includes all the business logic of the system.

The decision to use MongoDB was born partly out of the desire to be “JavaScript all the way down” and partly out of the desire to use what was cool at the time. Unfortunately, at the time of building the system MongoDB wasn’t supported as a SaaS offering on Microsoft Azure — where CS Blogs is currently hosted — so the database was hosted on mLab, making database calls more expensive in terms of networking time than necessary.

The `feed-aggregator` is a small Node.js application run as an Azure WebJob. It was hacked together in a few days and really only supports certain RSS and ATOM feeds. For example, it works great for feeds using <description> tags, but not ones which use <content> tags. These oversights crept in because the software wasn’t developed against much real data (essentially only my own feed) and because of the homogeneous nature of our users’ blogs; they’re mainly all Blogger or WordPress.com.

Despite the obvious and numerous flaws of the system, it has worked well for the past year or so. However, when I wanted to add the concept of organisations to the system — a way of seeing blogs only written by people at a certain company or university — I found the system to be a hodge-podge of technical debt, to the point where adding new features was going to take longer than developing a good, modular, expandable system. It was time to pay down the technical debt.

Requirements

The first thing to do was to determine which parts of the old system were good — and try to ensure that these positive things didn’t regress in the new system — which things were in need of improvement, and what new features we should add at the same time.

Fortunately CS Blogs does do a number of things well:

  • Short lead time — New posts appear in the system within 15 minutes
  • Good Web App — The front end works well both on desktop and on mobile and is very performant due to its lack of scripts. The work Rob did on the styling makes it a joy to use
  • Good Authentication — Users enjoy being able to use Github, Stack Exchange or WordPress to sign in and I enjoy not having to look after their passwords

A few things it could improve on are:

  • Support for a larger range of RSS and ATOM feeds — ATOM support in particular isn’t great in the current system
  • A lot of functionality only works in the web app — any method which requires authentication, such as signing up to the system, isn’t available through the API
  • Feed aggregation downloads every author’s feed every 15 minutes; this is a lot of data and wouldn’t be economical to scale to hundreds of users
  • Code maintainability is poor due to a complete lack of automated testing and linting

The additional user-facing features I want to implement are:

  • Notifications of new blog posts for CS Blogs applications on Android/iOS
  • Support for the aforementioned organisations feature

Designing a Distributed System

The system you can see in the diagram below was designed with the intention of fulfilling the requirements which I outlined above. You’ll notice the use of Amazon Web Services icons, as I have recently switched hosting from Azure to AWS. There are enough reasons for this decision to warrant its own blog post, so I won’t go into detail here.


The new CS Blogs Architecture

In the new system all applications are treated as first-class citizens, meaning there is nothing that the web application can do that any other application can’t. This is achieved by having all of the business logic, authentication and database interaction handled by the `api-server` — which is accessible by anything that can make HTTPS requests and handle JSON responses.
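To illustrate what “anything that can make HTTPS requests” means in practice, here is a minimal Node.js consumer sketch. The hostname and the /bloggers endpoint are hypothetical placeholders, not the final API surface:

const https = require('https');

// Any client that speaks HTTPS and JSON can be a first-class citizen
https.get('https://api.csblogs.com/bloggers', (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => {
    const bloggers = JSON.parse(body);
    console.log(`The API knows about ${bloggers.length} bloggers`);
  });
}).on('error', (err) => console.error(err));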

This means that the mobile applications will be able to perform actions such as registering a user and editing their details, which they cannot do under the current system. Another benefit to the mobile applications that isn’t shown on this diagram is that the `feed-downloader` calls Amazon SNS with information about how many new blog posts it has found every time it runs; this in turn is relayed to the mobile applications in the form of notifications.
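The SNS call itself is only a few lines with the AWS SDK for Node.js. A minimal sketch, assuming a pre-created topic whose ARN lives in an environment variable (the variable name and message shape are illustrative):

const AWS = require('aws-sdk');

const sns = new AWS.SNS({ region: 'eu-west-1' });

// Tell subscribers how many new posts this run found; the mobile apps
// receive this as a push notification further downstream
sns.publish({
  TopicArn: process.env.NEW_POSTS_TOPIC_ARN,
  Message: JSON.stringify({ newPostCount: 3 }),
}, (err) => {
  if (err) console.error('Failed to publish to SNS', err);
});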

Whereas in the old system we used MongoDB, I’ve opted to use PostgreSQL — via the Sequelize Node.js ORM — this time around. Some of the features I want to implement in the future, such as organisations, make more sense as relations rather than as documents in my mind, and the ecosystem of applications for interacting with SQL databases, and in particular PostgreSQL, is much more mature than MongoDB’s.
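To give a flavour of why organisations feel relational, here is a minimal Sequelize sketch. The model names and fields are illustrative rather than the final schema:

const Sequelize = require('sequelize');

const db = new Sequelize(process.env.DATABASE_URL, { dialect: 'postgres' });

// Illustrative models, not the real CS Blogs schema
const Organisation = db.define('organisation', { name: Sequelize.STRING });
const Author = db.define('author', { displayName: Sequelize.STRING });

// “All blogs written by people at a certain university” becomes a simple join
Organisation.hasMany(Author);
Author.belongsTo(Organisation);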

The `feed-downloader` is portable, but contains an entry point so that it can be used as an infrastructureless AWS Lambda function (and I suppose this entry point would also work for the newly released Azure Functions system). It’s a bit more clever than the old `feed-aggregator` in that it uses If-Modified-Since HTTP requests to only download and parse RSS or ATOM feeds that purport to have changed since the last time an aggregation was run.
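A conditional GET of this kind is straightforward with Node’s built-in https module. A sketch, assuming we have stored the timestamp of the previous successful fetch for each feed:

const https = require('https');
const url = require('url');

function fetchIfChanged(feedUrl, lastFetched, onBody) {
  const options = url.parse(feedUrl);
  options.headers = { 'If-Modified-Since': lastFetched.toUTCString() };

  https.get(options, (res) => {
    if (res.statusCode === 304) {
      return; // Feed claims nothing has changed, so skip download and parse
    }
    let body = '';
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => onBody(body));
  });
}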

Implementation

The implementation of the `feed-downloader`, `api-server` and `web-app` components follows my guide to writing better quality Node.js applications. Node.js was chosen due to its abundance of good quality libraries, its ease of interaction with JSON objects and the author’s familiarity with it in production scenarios.

ES2015 JavaScript features including the module system, string interpolation and destructuring are used throughout to aid readability of the system — therefore Babel is required for transpilation.
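For example, a contrived snippet showing all three features together (this is illustrative, not taken from the codebase):

// ES2015 module syntax, transpiled by Babel when running under Node
import assert from 'assert';

const feed = { title: 'CS Blogs', posts: [{}, {}] };

// Destructuring pulls named fields out in one statement...
const { title, posts } = feed;
assert(posts.length === 2);

// ...and template literals interpolate values without string concatenation
console.log(`Parsed "${title}" and found ${posts.length} posts`);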

Just some of the feed-downloader tests

In order to meet the requirement of good maintainability the `feed-downloader` was built using the test-driven development methodology and currently has 99% test coverage. These tests use real data (feeds from actual CS Blogs authors), including feeds from Blogger, WordPress.com, WordPress.org, Ghost and Jekyll.

There’s still a lot to be done before the new CS Blogs can be released, so why not hit up the contribution guide and get involved?

Danny

Writing better quality Node.js applications

In February last year I started writing my first Node.js app, csblogs.com, alongside Rob Crocombe. The application has run without too many issues since then, serving around 1,100 unique visitors per month. However, because it started out as a prototype we didn’t follow many of the best practices we should have done, and it’s starting to show now that we want to extend the application.

Since writing a more complicated application at Trainline — which provides an API to clients and consumes many Windows Services, RESTful APIs and Redis Caches itself — I’ve realised how important it is to be using good software engineering techniques and tools from the very beginning of development.

Whilst most of the concepts in this post are language independent, the example tools I explain are all geared towards Node.

The Basics

These first few things are obvious, and are things you should be doing in all your projects.

Source Control

Source control everything, even prototypes. The minuscule amount of disk space a git repository will require, and the few seconds every so often to write a commit message, will be incomparable to the amount of time you will save by being able to revert a change or check when changes occurred.

I’ve taken to using GitHub’s variant of the git flow pattern, in which branches are deployed to production and only merged into master once they have been tested “in the wild”. This means that whatever code is in master is always certified as working in production and can be relied upon to be rolled back to in the event that a new branch doesn’t work as intended. I like the WordPress Calypso branch naming scheme, which makes it easy to understand what is being developed in each branch.

Branches use the following naming conventions:

  • add/{something} — When you are adding a completely new feature
  • update/{something} — When you are iterating on an existing feature
  • fix/{something} — When you are fixing something broken in a feature
  • try/{something} — When you are trying out an idea and want feedback

If you don’t like that naming convention or it doesn’t suit your needs, that’s fine. Choose a naming convention and stick with it. As with so many stylistic choices in Software Engineering it isn’t the style that is important but the uniformity and consistency it brings.

Documentation

There are few things as annoying when developing software as opening a repository to find no information on how to build the project, how to run tests against it, or what data and functions its code exposes to consumers. When developing your code you should try to keep your documentation up to date with at least this information:

  • How to build
  • How to run tests
  • How to deploy
  • Data and functions exposed

Things like how to run your build, test and deployment scripts shouldn’t change so often as to be a pain to keep up to date. The data and functions exposed, however, may change reasonably often, especially if you are iterating whilst developing an API, so to make that task easier I suggest you use something like Swagger.io.

Testing

Testing code is one of the things that, five years into learning to develop software, I’m still learning about and still eager to learn more about. Good quality testing can be the difference between a bug rearing its head in production and costing you thousands of pounds, and it being caught, thought about, and fixed earlier in the development cycle. Automated testing also means that you can be confident that any changes you make won’t cause regression bugs.

When writing a new class, I first sketch out the interface — e.g. the public constructors, functions and data that the class will expose.

class Train {
  constructor(name, britishRailCode) {

  }

  getTopSpeed() {

  }

  determineLocation() {

  }
}

Then I spend some time thinking about all the potential edge cases, as well as the ‘happy path’ through each of the functions. For example, what should happen when an invalid BR code is provided to the constructor? What happens if GPS coordinates cannot be determined due to faulty hardware — or, in the case of HTML5 geolocation, a lack of user permission — in the determineLocation() call? What data should I get back from each of these functions? Is there a timeout after which the function should return an error if it hasn’t completed?

Once I have an idea of the expected behaviour of the class I start writing tests, in a Behaviour Driven Development way, using the Chai assertion library and the Mocha test framework. There are many advantages to using BDD; one of my favourites is that you don’t write names for your unit tests, you state the expected behaviour in a full sentence. This makes it much easier to realise the intention of the test, and means that test code can, in some senses, document the application code. Another great attribute of BDD is that it allows you to think in terms of how you want your code to behave, rather than implementation details.
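A minimal sketch of what this looks like for the Train class above, with the expected behaviour invented for illustration:

const { expect } = require('chai');
const Train = require('../src/train'); // hypothetical path

describe('Train', () => {
  it('should throw when constructed with an invalid BR code', () => {
    expect(() => new Train('Flying Scotsman', 'not-a-code')).to.throw();
  });

  it('should report a positive top speed', () => {
    const train = new Train('Flying Scotsman', '60103');
    expect(train.getTopSpeed()).to.be.above(0);
  });
});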

Test Coverage

Test coverage is a highly discredited method of determining the quality of a set of unit tests. Covering every line of code doesn’t mean you have thought of every conceivable edge case and therefore doesn’t guarantee your code is free of defects — indeed no testing can, as testing only shows the existence of bugs, not their absence. However, at minimum you should be covering every line of code — and every branch. (Yes, you can have a branch of code that doesn’t include any lines — I leave working out how as an exercise for the reader)

Linting

JavaScript allows you to make decisions on many areas of syntax. To use semi-colons, or not to use semi-colons — that is just one of the questions. When writing code in a team it’s easier if all of the code is formatted in the same way, so you don’t waste time reformatting it to the way you prefer code to be written. One way of doing this is to have everyone memorise your project’s coding conventions and hope they stick to them; a much better way is to use a linter which will warn the programmer if they break any of the project’s rules — this ensures that everybody writes in the same style.

An additional benefit of linting is that, depending on which linter you use, you get static analysis of your code provided to you too. This means the linter can point out any variables you have defined, but haven’t later used, for example.

I personally prefer to use ESLint as it allows different rules to be configured through the use of plugins. This means that you can have one set of rules for React JSX code and a different, more suitable, set of rules for your Mocha tests. For the bulk of my application code I use the official Airbnb style guide ESLint plugin — I like Airbnb’s focus on using modern JavaScript constructs and on having code be as explicit as possible. They also provide lint rules for React code.
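Getting started is small: install eslint and eslint-config-airbnb (plus its peer dependencies) and point a config file at it. A minimal .eslintrc sketch, with a second .eslintrc in your test directory being one way to give Mocha tests their own rules:

{
  "extends": "airbnb",
  "env": {
    "node": true
  }
}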

Commit Hooks

Linting and testing are all well and good, but you need the members of your team to buy in to using them. And, even on a one-man team, I often forget to run tests and linting before I commit code, resulting in broken builds and ugly code ending up in the master branch of my repository.


Pre-commit hooks to the rescue!

Commit hooks can be used to ensure that your unit tests and linter are always run before a developer can commit their code. This means they can’t forget to lint or unit test, which actually saves time in the long run. I use the pre-commit package to provide this functionality. In combination with good unit tests and linting, pre-commit hooks can help ensure that the code in your repository is always working and readable. (Note: a developer can decide to skip hooks if they’re in a rush to deploy a hot fix, but this should be avoided under normal working conditions.)
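Wiring this up with the pre-commit package is just a matter of listing which npm scripts must pass before a commit is allowed. A minimal package.json sketch, with illustrative script bodies:

{
  "scripts": {
    "lint": "eslint src test",
    "test": "mocha"
  },
  "pre-commit": ["lint", "test"]
}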

Configuration

Configuration is an important part of any application. It can be as simple as wanting to change which database you connect to on your local machine versus the one you connect to in production. Whatever it is, you don’t want this information to get into the public domain!

I used to use JSON files for configuration. However, these could easily be accidentally committed to git, making the secrets they contain public knowledge. Recently I’ve opted to use environment variables, for all the reasons outlined by Twelve Factor. The dotenv module for Node.js makes environment variables easy to change in development. In the CS Blogs applications I’ve been writing I provide a sample.env file which includes the names of all the environment variables developers should set to get the app working in their local environment.
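As a sketch, sample.env just lists names with harmless placeholder values, and a single line at the top of the application pulls a developer’s real (git-ignored) .env into process.env. The variable names here are illustrative:

// sample.env (committed to git, contains no secrets):
//   DATABASE_URL=postgres://user:password@localhost/csblogs
//   SESSION_SECRET=change-me

// app.js: load .env before anything reads configuration
require('dotenv').config();

const databaseUrl = process.env.DATABASE_URL;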

So, there it is. A quick run-down of some very basic steps to give yourself a nice place to work in JavaScript world. Now get developing!

Danny

Graduating from York


On January 23rd I graduated with an MSc in Advanced Computer Science from The University of York. It was a nice occasion to get together and be merry with the family, and of course for some photos in my cap and gown — this time they were grey and blue.

My parents took my degree home with them and hung it up next to my undergraduate degree and departmental award. I think they look nice together.


For some more nice pictures you can check out my dad’s blog post about it.

Danny

HackTrain 2.0


It seems like a lifetime ago, but the first HackTrain event was actually only 9 months ago — it probably feels so long ago to me because so much has changed in the interim, including me getting a job as a result of the first event.

When an opportunity came up to represent Trainline at the second HackTrain I jumped at it. I thought it would be nice to come full circle in a sense, going back with a job in the industry, and I simply loved the friendly, competitive atmosphere of the hackathon.

HackTrain 2.0 was a much larger event than the original, with 3 trains of hackers rather than 1. I was assigned to the Big Data Train, in which participants would focus on using the APIs made available by the rail industry.

I originally thought I would be mentoring teams and helping them get set up with any RailTech APIs I knew about; however, I actually ended up participating in a team consisting of myself, two Computer Science students and an engineer from the French national rail operator SNCF. We worked together to build a web application called TrainGuard.

Desktop view of TrainGuard

TrainGuard’s tagline is “Calmer, Drier Journeys”. This is because the aim of the application, which is written in Node.js, is to allow customers to make more informed choices about their journeys. When using the travel booking services currently available you can see what time you’ll depart from your origin and what time you’re expected to arrive at your destination, but nothing is said about what will happen in between.

When we were conducting market research on our South West Trains service to Bournemouth, a lot of people told us they disliked being on trains which were full of football fans, or loud people coming back from a gig. The small sample we took rated those experiences as bad as or worse than being on a delayed train, and said that to avoid such circumstances they would be willing to reschedule their booking. Weather factored into people’s travel decisions less, but was still important.

Mobile View of TrainGuard

On match days, or at other particularly busy periods of rail travel, the Train Operating Companies (TOCs) would much rather that their load of passengers was more evenly distributed throughout the day’s services. This is because for each minute a train is delayed on the British rail network the train operating company gets fined — in some cases hundreds or thousands of pounds per minute, depending on where the tardiness occurs and how many other train services are affected by it.

The idea therefore occurred to me that we could aid both the traveller and the TOCs by giving passengers who wish to avoid such events a chance to reschedule their journey and change their tickets. Initially we started out with the idea of making this an API that ticket distributors, such as Trainline, could tap into — however, we realised that at a hackathon client-facing applications are more likely to do well.

As a team we implemented a subsystem that could connect to the SilverRail travel planning API, which had been specially opened up to the public for HackTrain 2.0. This subsystem would accept a journey plan in the form of an origin station code, a destination station code and a departure time, and return a list of stations comprising the complete route. Using this information we then checked for events within a given distance of each station en route on Eventful, which is itself an aggregation of many other event websites — using specific filters we could find only the events that we felt could cause disruption to the end user.
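The shape of that pipeline was roughly as follows. This is a sketch with hypothetical wrappers: getRouteStations and findDisruptiveEvents stand in for the real SilverRail and Eventful calls, whose details I won’t reproduce here:

// Hypothetical wrappers around the SilverRail and Eventful APIs
const { getRouteStations } = require('./silverrail'); // resolves to [{ code, lat, lon }, ...]
const { findDisruptiveEvents } = require('./eventful'); // filtered by category and radius

function checkJourney(originCode, destinationCode, departureTime) {
  return getRouteStations(originCode, destinationCode, departureTime)
    .then((stations) => Promise.all(
      // Look for potentially disruptive events near every station en route
      stations.map((station) => findDisruptiveEvents(station, { radiusMiles: 2 }))
    ))
    .then((eventsPerStation) => [].concat(...eventsPerStation));
}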

In a perfect world this information would be shown to the user at or before the time of booking, or, if circumstances changed, they could be alerted via a notification on their phone — however, for our prototype they would have to visit a webpage hosted by us. The webpage was written using semantic HTML5 and styled using CSS3. It was designed to be responsive and, as you can see above, it works well on both desktop and mobile. Mobile is especially important, of course, as the act of travelling means you’re most likely to be on a phone rather than a desktop when you check this page.

When designing the page I wanted to make it beautiful to look at, but also really easy to grok what the page was trying to tell you — therefore I went with a design that included Bootstrap Jumbotrons with parallax-scrolling, high-quality image backgrounds.

As with all hackathons, the most nerve-racking, but equally exciting, part of the weekend was the pitching at the end. We felt we had a good product that was mutually beneficial for both the industry and its customers, but we had to convince 2 panels of judges of that. As I mentioned before, the event this time round took place on 3 different trains, so in order to get to the final we had to have one of the top 3 pitches out of the 10 teams on our train.

Excitingly, we did make it to the final, in which we pitched to judges from the rail industry: SilverRail, Trainline, Great Western Trains and the Ministry of Transport. Unfortunately we didn’t come in the top 3 teams at the end, but we had a lot of fun over the weekend and were proud that we got to the final and were given an opportunity to tell the industry how we felt they could deliver Smarter Journeys — which is something Trainline is currently focused on.

Thanks to everyone at HackPartners who made the event possible, and to all of the sponsors for making it so fun and for giving 120 of the best hackers a chance to develop much-needed improvements for the rail industry.

Danny
