Can you be too agile?

The European Youth Online Chess Championships was a great opportunity to showcase the rich features of Tornelo to players and organisers around the world. But…

The first time you do anything is a bit nerve-wracking. No matter how much testing you do, the real world always manages to find a situation, or series of interactions, that you hadn’t considered and that results in unexpected behaviour. This event had a lot of firsts for Tornelo:

  • First event with over 700 players
  • First event with spectators (at one point >1500 simultaneous spectators)
  • First high-stakes event with European Titles and medals at stake
  • First event with GMs, IMs and other titled players
  • First event spread through multiple European countries
  • First mission-critical integrations (PGN broadcast, TRFx import, pairing import)

Our technical challenges were stability, performance, player experience and arbiter features. Everything was mission-critical and there were a LOT of unknowns. Could our servers even handle the load? If something failed we’d have 750 disappointed players all over Europe and 80 angry arbiters with a lot of wasted effort. It would be a very public failure.

In the weeks leading up to the event our team had to focus on the main feature requirement for the event: the ability for arbiters to pause clocks during a game. Tied to this was the ability for the Black player to start White’s clock before White had arrived at the board or played a move. This involved a significant refactor of our code, because starting the clock had been directly linked to making a move!
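To illustrate the kind of separation that refactor needed, here is a minimal sketch of a clock where starting, moving and pausing are distinct actions. It is not Tornelo’s actual code: the `GameClock` class, its method names and the 600-second starting time are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GameClock:
    """Hypothetical two-player clock; all times are in seconds."""
    remaining: Dict[str, float] = field(
        default_factory=lambda: {"white": 600.0, "black": 600.0}
    )
    running: Optional[str] = None      # side whose clock is ticking, or None
    paused_side: Optional[str] = None  # side to resume after an arbiter pause
    last_tick: float = 0.0

    def _settle(self, now: float) -> None:
        # Charge the elapsed time to whichever side is currently running.
        if self.running is not None:
            self.remaining[self.running] -= now - self.last_tick
        self.last_tick = now

    def start(self, side: str, now: float) -> None:
        # Starting a clock is its own action, not a side effect of a move.
        self._settle(now)
        self.running = side
        self.paused_side = None

    def move(self, side: str, now: float) -> None:
        # A move only hands the running clock over to the opponent.
        if self.running != side:
            raise ValueError(f"clock is not running for {side}")
        self.start("black" if side == "white" else "white", now)

    def pause(self, now: float) -> None:
        # Arbiter stops the clocks and remembers who was on move.
        if self.running is None:
            return
        self._settle(now)
        self.paused_side, self.running = self.running, None

    def resume(self, now: float) -> None:
        # Arbiter restarts the clock of the side that was on move.
        if self.paused_side is not None:
            self.start(self.paused_side, now)
```

With this shape, Black pressing the clock before White arrives is just `clock.start("white", now)`, and an arbiter pause does not need a move to be in progress.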

Our next priority after the key “stop clock” feature was scale and performance. With 3 rounds per day we just couldn’t afford to have something fail – there was no time!

There were also logistical issues in integrating with Swiss Manager, Chess24, FIDE Ratings and other systems. The organisers wanted to be able to import player lists rather than have players self-register. How would we create 750 accounts, hand out passwords and provide support so that everyone could sign in properly on the day? How could organisers manage online games on Tornelo, but do the pairings in an external pairing program?

One last thing…

It’s really dangerous to rush out just “one last thing”, but I just can’t help myself.

After running some small events with our new “pause clock” feature it became clear that once the clocks stopped it would be really useful to be able to communicate with players. Originally the event had planned to do this via Zoom, but we set the ambitious goal of creating an on-board chat feature, stabilising it, testing it, releasing it and using it during the tournament.

With only 3 days to go we gave ourselves a 50% chance of achieving this, but thanks to our amazing team we released the feature literally hours before the event started. It ended up being one of the most used features!

Agile development

Day 1 – two immediate issues appeared

First, one Section Lobby lost the connection status of all the players. Not a disaster, but inconvenient for the arbiter.

Second, one player saw her move deliver checkmate, so she left the board. On checking her result, she noticed it was still recorded as a non-result! She went back into the game and saw, to her horror, her clock ticking and no mate on the board! She quickly played the move again and won, this time with her result appearing.

What could have caused this behaviour? Clearly the move had been played on her side but had never reached the server. Perhaps a disconnection at just the wrong time? How would an arbiter have dealt with it if she had lost on time?

Release Notes: Between Round 1 and 2
– Fixed bug which allowed lobby connection status to fail
– Updated clocks to prevent game-end before server confirmation
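
One way to picture “no game-end before server confirmation” is to keep a locally played move in a pending state until the server acknowledges it. The sketch below is an assumption about the general technique, not a description of Tornelo’s implementation; `ClientGame` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClientGame:
    """Hypothetical client-side view of a game."""
    confirmed_moves: List[str] = field(default_factory=list)
    pending_move: Optional[str] = None
    result: Optional[str] = None

    def play(self, move: str) -> None:
        # Show the move locally, but treat it as unconfirmed.
        if self.result is not None:
            raise ValueError("game is already over")
        self.pending_move = move

    def on_server_ack(self, move: str, result: Optional[str] = None) -> None:
        # Only the server's acknowledgement commits a move or ends the game.
        if move != self.pending_move:
            return  # stale or unexpected acknowledgement, ignore it
        self.confirmed_moves.append(move)
        self.pending_move = None
        self.result = result

    def on_reconnect(self) -> Optional[str]:
        # After a dropped connection, return the unconfirmed move to resend.
        return self.pending_move

    @property
    def is_over(self) -> bool:
        return self.result is not None
```

In the Day 1 scenario, the checkmating move would have stayed pending, the board would not have shown a finished game, and the move could have been resent once the connection came back.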

Release Notes: Overnight, before Day 2 started
– Improved connection management and timeouts for unstable local internet connections
– Modified tokens to prevent games losing permissions in edge cases
– Updated clocks with a disconnection indicator

Release Notes: Overnight, before Day 3 started
– Updated UI to make it easier to see and accept a draw offer
– Updated PGN import due to edge case failure impacting 1% of pairings
– Added a filter on PGN export so Chess24 could deal with the large number of games
– Clock updates to mitigate user issues with bad computer internal clock behaviour
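
The post does not say how the last clock fix worked. A common approach to unreliable computer clocks, shown here purely as an assumed illustration, is to measure elapsed time with a monotonic timer and to re-sync the remaining time from the server whenever it reports a value. The `Countdown` class and its `sync` method are hypothetical.

```python
import time
from typing import Callable


class Countdown:
    """Hypothetical display clock that never reads the wall clock."""

    def __init__(self, remaining: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self._now = now
        self._remaining = remaining
        self._synced_at = now()

    def sync(self, server_remaining: float) -> None:
        # Trust the server's figure and restart local measurement from here.
        self._remaining = server_remaining
        self._synced_at = self._now()

    def remaining(self) -> float:
        # Elapsed time comes from a monotonic source, so jumps in the
        # system clock cannot add or remove thinking time.
        elapsed = self._now() - self._synced_at
        return max(0.0, self._remaining - elapsed)
```

The `now` parameter exists only so the timer can be replaced with a fake in tests; by default it uses `time.monotonic` from the standard library.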

We do love being agile, but 9 updates to the code-base WHILE a mission-critical event was in progress is maybe just a little over-excited!

Lucky we have such an awesome team that could pull this off … hats off and thanks to David, Tobi, Simon and Frank who didn’t sleep for 3 days!
