Categories
Software Engineering

Re-architecting CS Blogs

Where are we now?

As I mentioned in my previous post, the current CS Blogs system grew out of a prototype. This meant that the requirements of the system were discovered in parallel with designing and implementing it, resulting in the slightly weird architecture shown below.

Old CSBlogs Architecture

I say it’s weird because the `web-app` component isn’t really just a web application: it’s also an API server for the Android application (and in theory any other app), and it includes all the business logic of the system.

The decision to use MongoDB was born partly out of the desire to be “JavaScript all the way down” and partly out of the desire to use what was cool at the time. Unfortunately, when the system was built MongoDB wasn’t available as a SaaS offering on Microsoft Azure, where CS Blogs is currently hosted, so the database was hosted on mLab. This makes database calls more expensive in terms of network time than necessary.

The `feed-aggregator` is a small Node.js application run as an Azure WebJob. It was hacked together in a few days and really only supports certain RSS and ATOM feeds. For example, it works great for ATOM feeds using <description> tags, but not for ones which use <content> tags. These oversights were made because the software wasn’t developed against much real data (essentially only my own feed) and because of the homogeneous nature of our users’ blogs, which are mainly on Blogger or WordPress.com.
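One way to cope with feeds that put the post body in different tags is to try several fields in order. The sketch below is not the code from `feed-aggregator`. It assumes each entry has already been parsed into a plain object whose keys (`content`, `summary`, `description`) mirror the feed’s tag names, and `extractEntryBody` is a hypothetical helper name.

```javascript
// Hypothetical helper: pick the body text of an already-parsed feed entry.
// The key names below are assumptions that mirror common feed tag names.
function extractEntryBody(entry) {
  // Prefer the fullest field first, then fall back to shorter ones.
  const candidates = [entry.content, entry.summary, entry.description];

  // Use the first candidate that is a non-empty string.
  const body = candidates.find(
    (value) => typeof value === 'string' && value.trim() !== ''
  );

  // Return null rather than undefined when no usable field exists.
  return body === undefined ? null : body;
}
```

With a fallback chain like this, an entry that only has a `description` and one that only has `content` both yield a body, and an entry with neither gives `null` so the caller can decide what to do.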

Despite the obvious and numerous flaws of the system, it has worked well for the past year or so. However, when I wanted to add the concept of organisations to the system (a way of seeing only blogs written by people at a certain company or university), I found the system to be a hodge-podge of technical debt, to the point where adding new features was going to take longer than developing a good, modular, expandable system. It was time to pay down the technical debt.

Requirements

The first thing to do was to determine which parts of the old system were good (and to ensure that these positive things didn’t regress in the new system), which things were in need of improvement, and what new features we should add at the same time.

Fortunately CS Blogs does do a number of things well:

  • Short lead time: New posts appear in the system within 15 minutes
  • Good Web App: The front end works well both on desktop and on mobile and is very performant due to its lack of scripts. The work Rob did on the styling makes it a joy to use
  • Good Authentication: Users enjoy being able to use GitHub, Stack Exchange or WordPress to sign in and I enjoy not having to look after their passwords

A few things it could improve on are:

  • Support for a larger range of RSS and ATOM feeds: ATOM support in particular isn’t great in the current system
  • A lot of functionality only works in the web app: Any method which requires authentication, such as signing up to the system, isn’t available through the API
  • Feed aggregation downloads every author’s feed every 15 minutes. This is a lot of data and wouldn’t be economical to scale to 100s of users
  • Code maintainability is poor due to a complete lack of automated testing and linting

The additional user-facing features I want to implement are:

  • Notifications of new blog posts for CS Blogs applications on Android/iOS
  • Support for the aforementioned organisations feature

Designing a Distributed System

The system you can see in the diagram below was designed with the intention of fulfilling all of the requirements which I outlined above. You’ll notice the use of Amazon Web Services icons, as I have recently switched hosting from Azure to AWS. There are enough reasons for this decision to warrant its own blog post, so I won’t go into detail here.

The new CS Blogs Architecture

In the new system all applications are treated as first class citizens, meaning there is nothing that the web application can do that any other application can’t. This is achieved by having all of the business logic, authentication and database interaction handled by the `api-server`, which is accessible by anything that can make HTTPS requests and handle JSON responses.

This means that the mobile applications will be able to perform actions such as registering a user and editing their details, which they cannot under the current system. Another benefit to the mobile applications that isn’t shown on this diagram is that the `feed-downloader` calls Amazon SNS with information about how many new blog posts it has found every time it runs; this in turn is relayed to the mobile applications in the form of notifications.

Whereas the old system used MongoDB, I’ve opted to use PostgreSQL, via the Sequelize Node.js ORM, this time around. Some of the features I want to implement in the future, such as organisations, make more sense to me as relations than as documents, and the ecosystem of applications for interacting with SQL databases, and PostgreSQL in particular, is much more mature than MongoDB’s.

The `feed-downloader` is portable, but contains an entry point so that it can be used as an infrastructureless AWS Lambda function (and I suppose this entry point would also work for the newly released Azure Functions system). It’s a bit more clever than the old `feed-aggregator` in that it uses If-Modified-Since HTTP requests to only download and parse RSS or ATOM feeds that purport to have changed since the last time an aggregation was run.
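To make the If-Modified-Since idea concrete, here is a minimal sketch of a conditional download. It is an illustration under assumptions, not the actual `feed-downloader` code: the function names are hypothetical, and `downloadIfChanged` assumes a `fetch`-compatible function is available (it is built into recent Node.js versions, or one can be passed in).

```javascript
// Hypothetical helpers; the real feed-downloader may be structured differently.

// Build headers asking the server to send the feed only if it has changed
// since `lastFetched` (a Date). With no previous fetch, send no condition.
function buildConditionalHeaders(lastFetched) {
  if (!lastFetched) {
    return {};
  }
  // Date#toUTCString produces the HTTP date format,
  // e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
  return { 'If-Modified-Since': lastFetched.toUTCString() };
}

// A 304 Not Modified status means the feed is unchanged, so skip parsing.
function feedHasChanged(statusCode) {
  return statusCode !== 304;
}

// Download the feed body only when the server reports a change.
// Resolves to null when the feed is unchanged.
async function downloadIfChanged(url, lastFetched, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    headers: buildConditionalHeaders(lastFetched),
  });
  if (!feedHasChanged(response.status)) {
    return null; // nothing new, and no body to download or parse
  }
  if (!response.ok) {
    throw new Error(`Feed request failed with status ${response.status}`);
  }
  return response.text();
}
```

The saving comes from the 304 path: the server answers with headers only, so an unchanged feed costs one small round trip instead of a full download and parse. Servers that ignore the header simply return the whole feed with a 200, so the approach degrades gracefully.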

Implementation

The implementation of the `feed-downloader`, `api-server` and `web-app` components follows my guide to writing better quality Node.js applications. Node.js was chosen due to its abundance of good quality libraries, ease of interaction with JSON objects and the author’s familiarity with it in production scenarios.

ES2015 JavaScript features including the module system, string interpolation and destructuring are used throughout to aid readability of the system — therefore Babel is required for transpilation.
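As a small illustration of two of those features, the sketch below uses destructuring and string interpolation together. The `describePost` function and the shape of the post object are made up for this example, and the module syntax (`import`/`export`) is left out so the snippet runs without transpilation.

```javascript
// Hypothetical example; the field names are assumptions for illustration only.
function describePost(post) {
  // Destructuring pulls out top-level and nested fields in one statement.
  const { title, author: { firstName } } = post;

  // A template literal interpolates the values directly into the string.
  return `${title} by ${firstName}`;
}
```

Calling `describePost({ title: 'Re-architecting CS Blogs', author: { firstName: 'Danny' } })` returns `'Re-architecting CS Blogs by Danny'`.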

Just some of the feed-downloader tests

In order to meet the requirement of good maintainability, the `feed-downloader` was built using the test-driven development methodology and currently has 99% test coverage. These tests use real data: feeds from actual CS Blogs authors, including feeds from Blogger, WordPress.com, WordPress.org, Ghost and Jekyll.

There’s still a lot to be done before the new CS Blogs can be released, so why not hit up the contribution guide and get involved?

Danny

Categories
University

Year 3 Semester 2 Results

Today I received my final set of grades for my BSc (Hons) in Computer Science from the University of Hull. This included my two second semester modules, Mobile Devices and Applications and Distributed Systems Programming, as well as my Final Year Project.

I achieved a grade of 85% in Mobile Devices and Applications, and 89% in Distributed Systems Programming.

The final year project was worth twice as many credits as each second semester module, and so had more of an effect on the final grade. Thankfully I did quite well in the final year project, achieving a grade of 86%.

My overall weighted average for this year, including my first semester module grades, is 86.5%.

This grade, weighted with my second year grades, means that my final grade for my degree as a whole is 86% – a very high first! I am of course over the moon with this.

I’d like to again say thank you to everyone who has made my time at university not only great for learning, but truly the best three years of my life (so far! :P). Particularly, but not limited to:

  • Rob Crocombe
  • Simon Watkins
  • Hayley Hatton
  • Russell Billingsley
  • Toby Russell
  • Jon Rich
  • Tom “Jeff” Procter
  • Special mention to “our American foreign exchange students”

 

  • Dr Martin Walker
  • Eur Ing Brian Tompsett
  • Rob Miles
  • Dr David Parker
  • Dr Peter Robinson

And of course anyone I spent time with in the labs or on any of the many, many nights out in the first two years. Last, but by no stretch of the imagination least, thanks to my Mum, Dad, Brother and Sister for supporting me throughout the last 3 years.

I’m looking forward to trying to maintain this good score next year at York! Of course I will continue to do this blog throughout my time there too.

Danny

Categories
University

Semester 1, Year 3 draws to a close

Yesterday I had my last exam of the semester, and handed in my interim report for my final year project. With those two things done, the first semester of the third and final year of my Bachelor’s degree is over. Exciting times.

This semester has been an interesting blend of very challenging, incredibly interesting and quite good fun, and though there have been a few times when I’ve felt slightly overwhelmed by work, I’m glad I took the modules I did and feel I have learnt and achieved a lot!

I will receive the results for both “Languages and Their Compilers” and “Data Mining and Decision Systems” on the 24th of February. I will of course update the blog when I know what grades I have achieved.

Looking Forward

Year 3 Semester 2 Timetable

As you can see from the above timetable I expect to be spending a lot of my second semester of this year working on my final year project, an IDE for the programming language PHP.

Alongside my project and studies I will also be continuing in my role as an undergraduate demonstrator for the Department of Computer Science. In the forthcoming semester I have been tasked with helping out students on the 1st year module “Programming 2”, which teaches object orientation and other concepts in C#.

The two modules I will be taking in semester 2 are “Mobile Devices and Applications” and “Distributed Systems Programming”.

Mobile Devices and Applications is the module concerned with developing mobile apps with a good user experience, knowledge of different mobile platforms — such as iOS, Android and Windows Phone — and technologies — such as 3G, 4G and WiFi. I am aiming to do really well in this module as I have already developed quite a few mobile apps.

Distributed Systems Programming is a module about the “architectures, technologies and programming paradigms used in implementing and deploying distributed computing applications”. A distributed system is “a software system in which components located on networked computers communicate and coordinate their actions by passing messages”. I’m looking forward to this module because I really enjoyed networking in year 2.

I will of course keep the blog updated throughout the upcoming semester. Bring it on!

Danny