All aboard the Employment Train

thetrainline.com

One of the questions I had to answer for both myself and interviewers was “What do you want to achieve in your first job?”. My answer was always a quote I read on a blog by a programmer hero of mine, Jeff Atwood. He said you should, as a junior software engineer, “endeavor to be the dumbest guy in the room” which simply means place yourself around intelligent experienced programmers and learn! This is what I wanted to do and I’ve been fortunate enough to be hired by a company that has an environment that will allow me to do that – thetrainline.com.

thetrainline.com sells train tickets in the UK and Europe both from its own website and by providing the software for Train Operating Companies such as Virgin Trains and First. They operate out of an office in Farringdon, one tube stop from London Kings Cross station.

I start in late September and honestly cannot wait to learn and earn with them. My official job title will be “Agile Developer”.

A big thank you is in order to Teck Loy Low who I met on The Hacktrain and subsequently put me in contact with thetrainline. If this isn’t a good advert for getting yourself involved with hackathons and the like I don’t know what is.

Danny

From Academia to The Real World™

My CV - This only gets you the interview, you have to do the rest

My CV – This only gets you the interview, you have to do the rest

It feels like yesterday, but it has now been 50 weeks since I graduated from The University of Hull. I know this because its my Birthday in 2 weeks — and theres the small matter of Charlotte graduating (Well done!). The ever-accelerating passage of time has meant that in recent months I have turned my attentions to planning out what I intend to do once my MSc is over.

This blog is primarily written for me to look back on in a few years and hopefully say “look how far I’ve come” or quite possibly ask where it all went wrong from here. However, it may include some useful tidbits of information for future graduates looking for jobs.

My job hunt started when a company well known for their search engine approached me offering me an interview for a Graduate Software Engineering position in either Mountain View or London. This was exciting. I was invited for a final interview having completed a telephone interview with an software engineer/team leader. At this point I had to decide if I wanted to work in either London or Moutain View — opting to work in Moutain View meant that my interview had to take place within a week due to certain visa requirements — in retrospect I should have waited and gone for the London job with more time to prepare, however the draw of a Californian life-style was too much! I can’t actually write about what I was asked in the interview, however I can say that it was a fantastic learning experience. Anyone who gets a chance to interview at a large tech firm certainly should, even if its only for the chance to better their skills.

In the end I wasn’t offered the job. I was dissapointed but the lady from HR told me I was close to getting it and should reapply in a year or two once I had more experience under my belt. This was both reassuring and something nice to work towards.

Around the same time as that interview I was engadged in a process alongside Dr. Dan Franks to apply for funding for a PhD program, in which I would work on an Evolutionary Computation project dealing with issues of crowd behaviour in evolved systems and comparisons to the real-world. This was an awesome project and I appreciate the work Dr. Franks put into my application with me. I also appreciate his honesty when it came to discussing whether doing a PhD was right for me at this time in my life — in the end I came to the decision it wasn’t. Whilst I love research and doing new and exciting things I was more interested in getting my teeth into some real-world software engineering projects and improving my skills in that area. I hope that a PhD is something that I come back to at some point later in life. I was offered the PhD on a full stipend and fees paid, but turned it down after a lot of reflection.

A few months passed as I knuckled down on the last few modules of the taught portion of my course but now, as I wrote previously, I am in the research semester of my Masters Degree — whilst this is in some ways no less busy than the taught portion of the course it does have the advantange of being a period of time in which I don’t have to be physically located in York for anything other than a weekly supervisor meeting — this has given me the oppertunity to go job seeking again.

Without enumerating over each interview process I’ve gone through over the last 3 weeks, because there been rather a few, I want to make the following observations:

  • You will probably recieve a few emails thanking you for your time but informing you that the company doesn’t wish to move ahead with your application. A few rejections in a row can get you a bit down, but…
  • There are many reasons for not being offered a job. As many, if not more, than there are to be offered one. It could be something as simple as not being as enthusiastic about their technical platform as you are about say, Node.js. So don’t take anything personally and behave professionally, don’t take it to heart.
  • Doing interviews will undoutedly make you a better Software Engineer. After a few weeks of Data Structure, Algorithm and Software Architecture questions coming at you in a high pressure environment you’ll notice how much your thought processes have changed and how much you’ve improved.
  • Attending interviews is a great way to discover if you think you could handle commuting every day. In the past 2 weeks I’ve travlled over 3,000 miles doing interviews (York to London return is 431 and a half miles). I came to the conclusion I could, quite happily, travel a (shorter) distance each day
  • You will know if you would work for a company within an hour of being there. I remember one interview in particular where I knew very quickly there was no way I would want to work there (the best thing to do in this situation is continue as normal and be professional). Other times you’ll experience what you percieve to be your dream work environment. Remember, you’re interviewing the company just as much as they’re interviewing you. You’ll be spending 8 hours a day there, 5 days a week for the foresable future. There are plenty of CS jobs avaliable so make sure you get one which is good for you. If you like a job that also means you’ll work better for them, so its good for them too if you turn down a job you wouldn’t love.
  • Every company interviews in a slightly different way. I personally found a mixture of programming on a PC as well as ‘whiteboarding’ data structure and algorithms gave me the best environment in which I could show what I percieve to be my skills. Some companies only do one, or the other, which is a shame.
  • You should read “Code Complete” and “Programming Interviews Exposed” and brush up on your Data Structures and Algorithms before going to any interviews (I wish I’d done this before going to that search engine company), you’ll thank yourself for it.
  • Companies love enthusiasm. Proving you’re enthusiastic about Computer Science is quite easy, especially if you have a blog like this one, however as quite a stoic person I found it hard at times to show enthusiasm for a companies product — especially if that product was not public facing and therefore I’d never had a chance to use it. I’m not really sure what to say or do to help these situations.
  • You will be asked about anything, no matter how minor it is, that is on your CV. Fortunately everything on my CV is true so it wasn’t too bad, I can only imagine how hard it is if it isn’t.
  • There are some questions you will be asked by every company. You should work on refining your answers over time. The two I got asked everywhere were “Why don’t you have an A-Level in Maths?” and “Why did you decide to do a masters degree at York?”.

Just yesterday I formally accepted one of the 4 job offers I had received in this process at a fantastic company, for whom I’m very excited to be working for. I’ll write more about that in my next blog post!

Danny

A More Semantic Web with Schema.org, The Open Graph Protocol and HTML5

Semantic Web with html5, schema.org and the open graph protocol

One of the most important things for any modern business is its internet presence. If you’re not on the internet, or not active and visible on the internet, you might as well not exist to a large group of people. Search Engine Optimisation is the process of improving ones website so that it might appear higher up the Google Search rankings, where more people are likely to find it.

At the same time, one of the most interesting elements of modern software and services  is its openness. Everyone from local councils to The Association of Train Operating Companies is currently in the process of opening up their data to the world and hoping someone innovative, or with a different set of skills and resources, can make something they either couldn’t imagine themselves or didn’t have the time and money to build — for mutual benefit.

One possible enhancements to SEO and Openness for an organisation is to make their website semantic. The definition of Semantics, according to The Oxford Dictionary, is:

The branch of linguistics and logic concerned with meaning. The two main areas are logical semantics, concerned with matters such as sense and reference and presupposition and implication, and lexical semantics, concerned with the analysis of word meanings and relations between them.

The main takeaway point is that things, in this case HTML markup for websites, have meaning. We need to make sure that the meanings we are making visible to the world actually mean what we want them to mean. A nice side-effect of this is that web pages become a lot easier to parse or screen-scrape and extract information from.

HTML5

Prior to HTML5 the best way to give meaning to a tag was to use an id. So if you were to markup a simple website with a header and a list of news stories you might come up with something like this:

<div id="header">
	<h1>News Website</h1>
	<img src="logo.png" alt="logo"/>
</div>
<div id="newslist">
	<div class="story">
		<h2>News Title</h2>
		<p>Here is some exciting news!</p>
	</div>
	<div class="story">
		<h2>Another bit of news</h2>
		<p>A shame, as no news is good news!</p>
	</div>
</div>

Whilst this is relatively clean code, it does come with some issues. How is a screen-reader or search engine spider meant to know the meaning of a “story” element for example? Whilst it seems simple viewing it as a human being, we must remember that there are literally thousands of possibilities for element id names that mean “story”.

HTML5 provides some new Semantic Tags which allow us to bake meaning into elements themselves. Check out the example below which simplifies and improves the previous code using the new HTML 5 semantic tags.

<header>
	<h1>News Website</h1>
	<img src="logo.png" alt="logo"/>
</header>
<main>
	<article>
		<h2>News Title</h2>
		<p>Here is some exciting news!</p>
	</article>
	<article>
		<h2>Another bit of news</h2>
		<p>A shame, as no news is good news!</p>
	</article>
</main>

This implementation allows a browser, spider or screen reader to accurately understand what each element is for as the tag names used have been standardized by the W3C. In case you’re wondering the `<article>` tag is what is detected by browsers like IE and Safari to show a Reading View.

Wherever possible you should aim to use the semantic tags over generatic tags such as `<div>`. It makes code easier to read in addition to being more semantically correct. A full list of the HTML5 semantic tags and their meanings can be found on DiveIntoHTML5.

The Open Graph Protocol

Whilst I had been using HTML5 semantic elements for some time, I wanted to do more as part of the CS Blogs project both in terms of SEO and improving user experience through semantics.

I started with the Open Graph Protocol. The Open Graph protocol was developed by Facebook to allow websites to integrate better with Facebook, both in app and on the web, however other Social Media services also take advantage of open graph, including Pintrest, Twitter and even Google+.

The Open Graph protocol is implemented as a series of `<meta>` tags that you place in the head of your HTML pages. Each page can describe itself as identifying a Person, Movie, Song or other graph object using code such as that shown below for a Blogger on CS Blogs.com

<meta property="og:title" content="The Computer Science Blogs profile of Daniel Brown" />
<meta property="og:site_name" content="Computer Science Blogs"/>
<meta property="og:type" content="profile"/>
<meta property="og:locale" content="en_GB"/>
<meta property="og:image" content="https://avatars.githubusercontent.com/u/342035" />
<meta property="profile:first_name" content="Daniel"/>
<meta property="profile:last_name" content="Brown"/>
<meta property="profile:username" content="dannybrown"/>

As you can see most open graph properties start with an `og:` suffix, except those particular to the type of content you are making available, which are suffixed with the type name. The documentation for what tags are available can be found on the Open Graph Website.

This code will then be used by Facebook when someone links to that particular web page in their messages, or on their newsfeed. Here’s an example:

Open Graph element displayed on Facebook newsfeed

Open Graph element displayed on Facebook newsfeed

Whilst open graph is great for this purpose it does have some limitations. Each page can only be of one type, and you cannot add semantics for more than one element. This limitation is a problem for pages such as csblogs.com/bloggers which represents multiple people.

Despite its limitations its still worth implementing open graphs on pages for which it makes sense, especially if those pages are likely to be shared on social media.

Facebook, as usual, have some great development tools for open graph including the Open Graph Debugger, which allows you to see how Facebook interprets your page (but because Open Graph is a standard it’ll also help you debug any issues with Pintrest, Twitter etc.)

Schema.org

Schema.org is a standard developed in a weird moment of collaboration between the 3 search engine giants — Google, Microsoft and Yahoo. It allows you to specify the meaning of certain elements of content. You can technically do this using 3 different types of syntax, however in this blog post I will focus on micro data, partly because its the easiest to understand, fits inline with your pages and is an official part of the HTML5 spec, but also because its the only format currently fully supported by the Google search engine.

To begin with here is the HTML 5 structure of a blog post before it has been marked up with schema.org micro data. It should be pretty simple to understand if you’ve checked out the HTML 5 semantic elements mentioned previously.

<article>
    <header>
        <h2><a href="dannybrown.net">A Blog Post</a></h2>
    </header>
    <img src="dannybrown.net/image.png" alt="Featured Image"/>
    <p>This is an exert... <a class="read-more" href="dannybrown.net">Read more →</a></p>
    <footer>
        <div class="article-info">
            <a class="avatar" href="/bloggers/dannybrown">
                <img class="avatar" src="dannybrown.net/danny.png" alt="Avatar"/>
            </a>
            <a class="article-author" href="/bloggers/dannybrown">Daniel Brown</a>
            <p class="article-date">1 day ago</p>
        </div>
    </footer>
</article>

In order to markup our html with Schema.org we need to do a few things:

  1. Determine which Schema.org schema best suits the element we are describing.
  2. Determine the scope of that element
  3. Add the microdata attributes to our HTML

For our blog post example above the most relevant schema is BlogPosting. You can see all of the different types in a hierarchy at schema.org. The scope of the BlogPosting is the entire block contained within the `<article>` tags.

The scope of an item is delimited on the opening tag of our scope using the `itemscope` attribute. Read it as “Every bit of micro data within this element is about one item”. When we define the `itemscope` we also need to give it is type — this is done with the `itemtype` attribute. The value of the `itemtype` is the url of the schema.org schema — in our case `http://schema.org/BlogPosting`.

The values of fields that make up our schema, for example the “headline” of a blogpost are either other schemas or the values of elements. Here’s a fully schema’d up blog post:

<article itemscope itemtype="http://schema.org/BlogPosting">
    <header>
        <h2 itemprop="headline"><a href="dannybrown.net">A semantic blog post</a></h2>
    </header>
    <img itemprop="image" src="dannybrown.net/image.png" alt="Featured Image"/>
    <p itemprop="articleBody">This is an exert... <a itemprop="url" class="read-more" href="dannybrown.net">Read more →</a></p>
    <footer>
        <div class="article-info">
			<div itemscope itemprop="author" itemtype="https://schema.org/Person">
                <a class="avatar" href="/bloggers/dannybrown">
                    <img class="avatar" itemprop="image" src="dannybrown.net/danny.png" alt="Avatar"/>
                </a>
                <a class="article-author" itemprop="sameAs" href="/bloggers/dannybrown"><span itemprop="givenName">Daniel</span> <span itemprop="familyName">Brown</span></a>
			</div>
            <p class="article-date" itemprop="datePublished">1 day ago</p>
        </div>
    </footer>
</article>

Here we can see that just by assigning an `itemprop` attribute to a tag, the textual content it contains becomes the value of the named field. We can also see that a Person schema can be nested inside our BlogPosting schema to give us a rich author ‘object’.

One other thing worth noting here is that I elected to add `<span>` elements (which don’t change the visual layout of the HTML page) around the first and last names of the author so as to be able to correctly mark them up with `givenName` and `familyName` itemprops.

Any elements which you mark up with schema.org should be visible to the end user. Writing schema elements into your page and then hiding them via css or JavaScript will actually result in your SEO ratings being reduced, and could impare applications which rely on schema properties. (For example if a screen reader used schema.org properties, which to my knowledge none do yet)

Google provides a debugger for Schema.org, which came in great use whilst I was added in support for CS Blogs, its called the Structured Data Testing Tool. The output for a the home page of csblogs.com is shown below:

Google Structured Data Testing Tool Output

Google Structured Data Testing Tool Output

As you can see using Schema.org means that the Google search engine can actually understand what is on the page, and therefore its semantic meaning. csblogs.com is therefore more likely to go up in search terms that include the word blog, or search for the names of the authors mentioned for example.

Wrapping Up

Hopefully this blog post will have made you think about what you can do to make your websites more semantic — and therefore better for search engines, accessibility and in terms of openness. You can use all three of the technologies above at the same time, and I would implore you to do so. In return you’ll benefit from better Search Engine rankings, your users will benefit from better Social Media integration and screen reading for those with disabilities, and search engines can point people to web pages with a better understanding of what that page represents rather than just scanning for keywords.

Danny

Computer Science Blogs Beta

CSBlogs.com Homepage - Desktop

Rob and I have both been doing a lot of work on CS Blogs since the last time I blogged about it. Its now in a usable state, and the public is now welcome to sign up and use the service, as long as they are aware there may be some bugs and changes to public interfaces at any time.

The service has been split up into 4 main areas, which will be discussed below:

csblogs.com – The CS Blogs Web App

CSBlogs.com provides the HTML5 website interface to Computer Science Blogs. The website itself is HTML5 and CSS 3 compliant, supports all screen sizes through responsive web design and supports high and low DPI devices through its use of scalable vector graphics for iconography.

Through the web app a user can read all blog posts on the homepage, select a blogger from a list and view their profile — including links to their social media, github and cv — or sign up for the service themselves.

One of the major flaws with the hullcompsciblogs system was that to sign up a user had to email the administrator and be added to a database manually. Updating a profile happened in the same way. CSBlogs.com times to entirely remove that pain point by providing a secure, easy way to get involved. Users are prompted to sign in with a service — either GitHub, WordPress or StackExchange — and then register. This use of OAuth services means that we never know a users password (meaning we can’t lose it) and that we can auto-fill some of their information upon sign in, such as email address and name, saving them precious time.

As with every part of the site a user can sign up, register manage and update their profile entirely from a mobile device.

api.csblogs.com – The CS Blogs Application Programming Interface

Everything that can be viewed and edited on the web application can be viewed and edited from any application which can interact with a RESTful JSON API. The web application itself is actually built onto of the same API functions.

We think making our data and functions available for use outside of our system will allow people to come up with some interesting applications for a multitude of platforms that we couldn’t support on our own. Alex Pringle has already started writing an Android App.

docs.csblogs.com – The CS Blogs Documentation Website

docs.csblogs.com is the source of information for all users, from application developers consuming the API to potential web app and feed aggregator developers. Alongside pages of documentation on functions and developer workflows there are live API docs and support forums.

In the screenshot below you can see a screenshot of a docs.csblogs.com page which shows a developer the expected outcome of an API call and actually allows them to test it, in a similar way to the Facebook graph explorer, live on the documentation page.

CS Blogs API Documentation

CS Blogs API Documentation

Thanks to readme.io for providing our documentation website for free due to us being an open source project they are interested in!

The CS Blogs Feed Aggregator

The feed aggregator is a node.js application which, every five minutes, requests the RSS/ATOM feed of each blogger and adds any new blogs to the CSBlogs database.

The job is triggered using a Microsoft Azure WebJob, however it is written so that it could also be triggered by a standard UNIX chronjob.

Whilst much of the actual RSS/ATOM parsing is provided by libraries it has been interesting to see inconsistencies between different platforms handling of syndication feeds. Some give you links to images used in blog posts, some don’t, some give you “Read more here” links, some don’t. A reasonable amount of code was written to ensure that all blog posts appear the same to end-users, no matter their original source.

Try it!

I welcome anyone who wants to to try to service now at http://csblogs.com. We would also love any help, whether that be submitting bugs via GitHub issues or writing code over at our public repository.

Danny

Research Semester Begins

Most people don't do much more than a simple `git log`

My final semester at The University of York has officially begun. Over the next 5 months I will be conducting 100 credits worth of (hopefully) novel research in the area of Git Source Control Querying and Analytics through the use of Model-Driven Engineering under the supervision of Dr. Dimitris Kolovos.

The original project proposal is shown below:

Advanced Querying of Git Repositories

Git is a distributed version control system that is widely used both in academia and industry. Git provides a command-line API through which basic queries can be evaluated against local repositories (e.g. git log) but lacks facilities for expressing complex queries in a concise manner. The aim of this project is to support such complex high-level queries on Git repositories. In the context of this project, the student will need to

1) identify the metadata stored in a Git repository and extract it to an object-oriented representation (e.g. using JGit)

2) develop a driver that will allow languages of the Epsilon platform to query extracted metadata at a high level of abstraction. For example, the following query would select all files larger than 200 lines and which were last modified by joe@foo.com on a Wednesday:

File.all.select(f|f.lines > 200 and 
  (f.lastModifiedBy = "joe@foo.com" or f.lastModifiedDay = "Wednesday"))

Such an advanced query facility would enable the development of advanced Git repository analytics and visualisation services (e.g. using Epsilon’s EGL as a server-side scripting language).

I’m currently in the very early stages of literature review and finding out what other git analytics programs are available so there isn’t too much to talk about. However, I will as ever keep the blog up to date with my progress over the next few months.

Danny

Dollar IDE Screenshots

I realised today that I never got round to posting any screenshots of Dollar IDE, the PHP Integrated Development Environment I made for my Final Year Project at The University of Hull, once I’d finished developing the feature complete version for submission. This meant I couldn’t show it to anyone when I was talking about it, so I’ve posted some below.

Dollar IDE Start Screen

Dollar IDE Start Screen

This is the first screen a user sees when opening the Application. They can create a new project or open one from a git repository or the local computer — a recent project list makes it easy to get back into a project you’ve been working on.

When making a new project inputs such as “Project Name” and “Save Location” are validated as-you-type so a user always knows how to resolve any problems (e.g. invalid characters or selected a directory that you don’t have write permissions for).

The “Project Type” drop down allows you to select templates for your project. E.g. A web template which has an index.html and ‘images’, ‘styles’ and ‘js’ folders included. The idea was to allow this to be extended so you that you could select, for example, a CakePHP project type and DollarIDE would download CakePHP and resolve all the dependencies, however this has not yet been implemented.

Dollar IDE Open Project from Github

Dollar IDE Open Project from Github

DollarIDE integrates with any git repository through LibGit2Sharp, but has enhanced integration with Github through their API. When you create a project you can have Dollar IDE automatically make and initialize a repository on Github for you and even set if you want it to be public or private. In the above screenshot you can see how Dollar IDE allows you to log in with you Github credentials (which are stored securely using Windows DPAPI) and then select from a dropdown which repository you wish to open and start editing.

Dollar IDE Syntax Highlighting & Autocomplete

Dollar IDE Syntax Highlighting & Autocomplete

Of course most of a developers time is spent in the code editing window itself. In the screenshot above you can see DollarIDE’s tabs, auto-completion and syntax highlighting.

Dollar IDE Settings

Dollar IDE Settings

Seen as developers spend a lot of time in their IDE I felt it was important to ensure that Dollar IDE could be customized to suit their needs. For example, in the Colour Scheme Settings window shown above the user can change both the accent colours and background colour of Dollar. This includes the obligatory dark theme.

You can also see the project pane on the right hand side of the background window in this screenshot. The project pane allows the developer to manage folders and files, and open them for editing — all from inside the same window as the code itself. Due to PHP often being deployed in a CGI setting, file locations are especially important.

Finally, this screenshot also shows that Dollar IDE currently makes no attempt to syntax highlight HTML, which is a great shame as PHP is often intermixed with HTML. This will be one of the first features I add when I eventually open source the project.

Danny

Formal Specification Module Result – 96%

Z Notation PDF Render

The other module I was working on alongside “Topics in Privacy and Security” was “Formal Specification”.

In this module I used Z notation to formally specify a navigation system for Robot Vacuum Cleaners (A bit of a computer science obsession it seems). I then used the technique of promotion to allow for more than 1 robot vacuum cleaner to be in a room and observed some emergent interaction.

Comparison of the two movement algorithms I formally specified.

Comparison of the two movement algorithms I formally specified.

I really enjoyed this module, I like the challenge of working more on a mathematics level rather than a programming level — you have to work with different ‘data structures’ and work with different operators and structures of notation.

I would recommend for anyone to look in Z as it really makes you think about methods and functons in a different way. In terms of pre-conditions (what must be true for the function to start?), post-conditions (what must be true at the end of the function?) and invariants (what must always hold true?). It strikes me as a very good way to think about testing programs, even if you don’t formally specify them before producing them.

A visualisation of some of the emergent behaviour that happened with two robots -- one robot could effectively trap the other in the middle of the room.

A visualisation of some of the emergent behaviour that happened with two robots — one robot could effectively trap the other in the middle of the room.

Formal specification is probably a bit over the top for most software engineering projects, but if I were to ever work on a Planes autopilot system, or a system which moved the control rods in a Nuclear Power Station, I’d want to know formal specification had taken place. So it’s a useful skill to have learnt. It was also a nice refresh on discrete mathematics.

Danny

Follow

Get every new post delivered to your Inbox.

Join 241 other followers

%d bloggers like this: