Blog posts tagged with web

This blog post covers an open-source timetable parsing project I released a couple of months ago. It is available at https://timetable.josephduffy.co.uk and the source is available on GitHub. The post won't go too in-depth on the technical side of the project; instead it tells the story of how I discovered it was possible.

Since starting my studies at the University of Huddersfield I've always wanted an easy way to see my timetable on my phone. The timetable available on the website isn't responsive and relies on POST data to display future weeks' timetables, two things that don't work well on mobile, especially when the page is kept open in the background.

To get around this I would manually add each of my lectures and practicals to my calendar. These events could be set as recurring, but they would often need removing on specific days (such as during holidays) or have different information on another date, such as a room change. All of this eventually led me to think about the famous XKCD Automation comic, so I started work on a way of automatically adding my timetable to my calendar.

Automating

My first idea for how to automate the process was to create a Google Chrome extension. To try and figure out if this was possible I loaded up my timetable.

Here I noticed that the URL didn't have any information about the timetable. My next thought was that my student number must be stored in a cookie, so I opened up the Developer Tools to inspect the request. To my surprise it was a POST request... with the student number as part of the form data.

But surely it would just be using that to validate that I was the user I said I was, right? I loaded up my favourite HTTP utility, DHC, and created a basic request to load my own timetable.

It worked! Just to double check, I messaged one of my friends explaining the situation and asking him for his student number. He sent me the number and, again, it worked! My first thought was that I was happy that I'd found a way that I might be able to easily scrape the data I needed. My second thought was that it was a bit worrying that by only knowing someone's student number you could find out where they were likely to be. Despite this I started thinking of how I could truly automate this.

EaaS - Exploitation as a Service

Since it was so easy to access the data I thought it'd be a good idea to make the service available to others. Creating a Google Chrome extension would prevent some users from using the service and could make it a little harder to get it in to a user's calendar. The calendar would also not automatically update. The overall user experience would be worse.

Look at the state of this place!

Getting the current week's timetable is easy, but what about future weeks? To figure this out I loaded up my timetable again and changed the value in the "Week beginning" dropdown.

As you can see, there's a bit more going on this time. However, having worked with ASP.NET before, I could see that it wouldn't be too hard to make the request work. So I made another request:

Now we have where people will be for the rest of the academic year, or rather, all of my future timetables!

Any application that can be written in JavaScript, will eventually be written in JavaScript

As per Atwood's Law, any application that can be written in JavaScript, will eventually be written in JavaScript, so naturally I turned towards Node.js.

I stuck with Express and found jsdom, a lovely framework for working with a DOM on the server. This would then allow me to pull the information and traverse the DOM on the server side. It might not handle errors too well but my timetable's markup doesn't appear to have changed since I started University, so it'll do.

Since I've got the DOM on the server side, getting future timetables is a matter of taking the values from the hidden inputs __VIEWSTATE and __EVENTVALIDATION and sending them with the request. Simple!
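As a rough illustration of that round trip, here is a minimal sketch. It uses a regular expression instead of jsdom purely to stay dependency-free, and the `weekBeginning` field name is a made-up placeholder rather than the real form's field; only `__VIEWSTATE` and `__EVENTVALIDATION` come from the page itself.

```javascript
// Find the value of an <input> in raw HTML by its name attribute.
// Assumption: attributes are double-quoted, as ASP.NET WebForms renders them.
// Returns null when no input with that name exists.
function hiddenInputValue(html, name) {
  const tags = html.match(/<input\b[^>]*>/gi) || [];
  for (const tag of tags) {
    const nameMatch = tag.match(/\bname="([^"]*)"/i);
    if (nameMatch && nameMatch[1] === name) {
      const valueMatch = tag.match(/\bvalue="([^"]*)"/i);
      return valueMatch ? valueMatch[1] : '';
    }
  }
  return null;
}

// Build a form-encoded POST body that echoes the hidden fields back
// alongside whatever other fields the form needs (hypothetical names).
function buildWeekRequestBody(html, extraFields) {
  const body = new URLSearchParams();
  for (const name of ['__VIEWSTATE', '__EVENTVALIDATION']) {
    const value = hiddenInputValue(html, name);
    if (value === null) {
      throw new Error('Missing hidden input: ' + name);
    }
    // URLSearchParams percent-encodes the "+", "/" and "=" found in base64.
    body.append(name, value);
  }
  for (const [key, value] of Object.entries(extraFields)) {
    body.append(key, value);
  }
  return body.toString();
}
```

Calling `buildWeekRequestBody(pageHtml, { weekBeginning: '...' })` returns a string suitable for a request sent with `Content-Type: application/x-www-form-urlencoded`. Encoding matters here because the hidden values are base64 and an unescaped `+` would be read as a space.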

Who doesn't love a good RFC?

Now came the tedious part: extracting the data and converting it to a format that calendar applications will understand. I've created single calendar events in the past, but never a full calendar with lots of extra fields, such as the VALARM. Overall the iCalendar specification is rather long and complicated, but it's easy enough to focus on the parts needed for the project. Primarily I had to ensure that events would trigger at the right time, independent of daylight saving time, which means adding ;TZID=Europe/London to all event dates.

Add a couple of options for adding alarms prior to events and set the correct Content-Type headers and you're set! There were a few kinks to work out but I've been using it for a few weeks now and love it.
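To make the output format concrete, here is a minimal sketch of generating such a calendar. The function names, the event object shape, and the PRODID value are all invented for illustration and are not taken from the project's source.

```javascript
// Escape a TEXT value as RFC 5545 requires: backslash, semicolon,
// comma and newline all need a backslash escape.
function escapeText(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/;/g, '\\;')
    .replace(/,/g, '\\,')
    .replace(/\r?\n/g, '\\n');
}

// Format a wall-clock time as YYYYMMDDTHHMMSS with no trailing "Z",
// so that the TZID parameter decides the UTC offset.
function localStamp(t) {
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return pad(t.year, 4) + pad(t.month) + pad(t.day) +
    'T' + pad(t.hour) + pad(t.minute) + '00';
}

// Build the lines for one VEVENT, with an optional display alarm.
function buildEvent(event, alarmMinutes) {
  const lines = [
    'BEGIN:VEVENT',
    'UID:' + event.uid,
    'DTSTAMP:' + event.stamp, // creation time in UTC, e.g. 20151101T120000Z
    'DTSTART;TZID=Europe/London:' + localStamp(event.start),
    'DTEND;TZID=Europe/London:' + localStamp(event.end),
    'SUMMARY:' + escapeText(event.summary),
    'LOCATION:' + escapeText(event.location),
  ];
  if (alarmMinutes > 0) {
    lines.push(
      'BEGIN:VALARM',
      'ACTION:DISPLAY',
      'DESCRIPTION:' + escapeText(event.summary),
      'TRIGGER:-PT' + alarmMinutes + 'M', // fire this many minutes before the start
      'END:VALARM'
    );
  }
  lines.push('END:VEVENT');
  return lines;
}

// Wrap the events in a VCALENDAR; iCalendar lines end with CRLF.
function buildCalendar(events, alarmMinutes) {
  const lines = [
    'BEGIN:VCALENDAR',
    'VERSION:2.0',
    'PRODID:-//Example//Timetable Sketch//EN', // placeholder identifier
  ];
  for (const event of events) {
    lines.push(...buildEvent(event, alarmMinutes));
  }
  lines.push('END:VCALENDAR');
  return lines.join('\r\n') + '\r\n';
}
```

The result would be served with a `Content-Type` of `text/calendar`. Two simplifications are worth flagging: the sketch does not fold lines longer than 75 octets, and it omits the VTIMEZONE component that RFC 5545 strictly requires for each TZID used.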

What you don't know can't hurt you

Before I released the code or created the website I spoke to one of my lecturers to ask whether he knew if I was breaking any rules. Apparently he (along with other members of staff) has told the team responsible for the timetable website about the security issue and they've decided not to do anything about it and essentially ignore the problem. That's up to them, but personally I think it's a little creepy that someone could make a website where people can view anyone at the University's timetable. But who'd do that?


It's a Duffy Thing t-shirt

I recently released a major overhaul for this website. The old website used an old version of Node.js and used Ghost to power the blog. I didn't find it very easy to maintain and wanted more flexibility. While the new website may not have the best design, I'm a lot happier with it overall. Along with the rewrite of the website itself, I also gave it a new name: It's a Duffy Thing. This was inspired by a shirt that my Dad bought me.

In this blog post I want to go over a few of the technologies used to power the website. Partially so it's all together and in one place, but also as a kind of "behind the curtain" look at the website. If you want to dive in even further, take a look at the source code on GitHub.

The new version of the website is built using Node.js, SASS, and Handlebars. There's no front-end JavaScript on the website other than a couple of little external scripts (such as Google Analytics), so there's nothing to discuss when it comes to front-end JavaScript.

Node.js

I've liked the idea of Node.js for a while. For simple websites (such as this) that receive little traffic (such as this) and can be used to experiment with new technologies (such as this) I think it's great. My previous projects have used io.js, which was recently merged in to Node.js, but this project uses the latest stable version of Node at the time, version 4.2.2.

Gulp

Using Node has a few other advantages. One of those is being able to use gulp. There are a lot of workflow automators out there now, and which is "best" seems to change fairly rapidly. I'm happy with my workflow using gulp so that's what I stick to. I use gulp to automate all of the following tasks:

  • Compile my SASS files, including Bootstrap
  • Automatically add browser vendor prefixes to support the latest 2 versions of major browsers
  • Remove any CSS rules that aren't used anywhere on the website
  • Minify the output CSS
  • Generate sourcemaps for development

This is all done in a single gulp file. I've also added a watch task so that changes made to SASS files automatically trigger a recompile. I'll likely extend this to the views directory, too, so that any CSS rules added or removed from the HTML will be added or removed from the compiled CSS.

Poet

When looking for a blogging engine I was torn between one which provided a lot of management and functionality out of the box (such as Ghost), or one which offered more of a basic set of scaffolding to work from. After trying Ghost for a while I opted for the simpler approach, which led me to Poet. Poet takes in a directory containing markdown files and converts them to HTML, pulling out some extra metadata, such as the URL slug, publish date, and tags, from a JSON structure at the top of each file.

I've really loved working with Poet. It doesn't get in the way and lets me choose how things should be. I actually override most of the routes and mainly use it as a markdown to HTML converter, but it works great for me.
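To show what "a JSON structure at the top of the file" can look like in practice, here is a small sketch that splits a post into metadata and markdown body. It assumes the front matter is wrapped in triple braces and holds flat JSON; the `parsePost` function and the `slug`, `date` and `tags` keys are illustrative, and this is not Poet's own parsing code.

```javascript
// Split a post into its JSON front matter and the markdown that follows.
// Assumption: the front matter is wrapped in {{{ ... }}} at the very top
// and contains no nested "}}}" sequence of its own.
function parsePost(source) {
  const match = source.match(/^\s*\{\{\{([\s\S]*?)\}\}\}\s*/);
  if (!match) {
    // No front matter: treat the whole file as the body.
    return { meta: {}, body: source };
  }
  // Restore the outer braces so the captured text is a valid JSON object.
  const meta = JSON.parse('{' + match[1] + '}');
  return { meta: meta, body: source.slice(match[0].length) };
}
```

Given a file that starts with `{{{ "slug": "hello-world", "tags": ["web"] }}}`, `parsePost` returns the parsed object in `meta` and everything after the closing braces in `body`, ready to hand to a markdown renderer.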

SASS

I've used SASS on and off for a while but I really wanted to dive in a bit further this time. The design and layout for this website is fairly simplistic so I can get away with using Bootstrap and just adding a few additional styles. SASS is perfect for this, and as described in the gulp section, it ends up being a really nice workflow which produces a fairly small file.

Handlebars

Along with SASS, Handlebars is probably one of my favourite things that has changed about my workflow when working with the web in recent years. I still find it hard not to put logic in to the Handlebars files (thanks, PHP), but I'm loving partials and inheritance. Magical!

Helmet

Helmet is a great piece of middleware for Express which helps improve the security of a website. There's always more that can be done, but it's been very easy to add things like the Content Security Policy and setting the X-XSS-Protection header.
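For a sense of what those two headers amount to, here is a hand-rolled sketch that sets them on a Node response object without Helmet. The `applySecurityHeaders` name and the policy values are illustrative assumptions, not this website's actual configuration.

```javascript
// Set two security headers by hand on a Node.js response-like object.
// This only illustrates the headers themselves; it is not Helmet's code.
function applySecurityHeaders(res) {
  // Illustrative policy: only allow resources from the site's own origin.
  // A real policy would also need to allow any external scripts in use.
  res.setHeader('Content-Security-Policy', "default-src 'self'");
  // Ask browsers with a built-in XSS filter to block the page on detection.
  res.setHeader('X-XSS-Protection', '1; mode=block');
}
```

Helmet wraps this kind of header-setting in Express middleware, so the policy lives in one place instead of being repeated in every route.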

Open Source

I decided to open source the website for a couple of reasons:

  • It makes deployment via PM2 or just the command line a little easier
  • It makes a good portfolio piece
  • It forces me to write slightly better code since people might look at it

I'm happy with my decision to open source the website. I can already see that I'm writing better commit messages and separating my commits up further.

Future Posts

I've got a few other post ideas (some of which are already 90% written), so there's going to be more activity on here soon. You can subscribe to the RSS feed, follow me on Twitter, or simply check the website soon to see new posts.