Bughunting

A feat of JavaScript engineering, with its profitability eaten into by an unreproducible bug.


“OK, so which levels does this occur on?”

“1, 2, and 4.”

“Not 3?”

“Not 3.”

“And this only happens on mobile?”

“Yes, but only some mobile devices.”

Well shit. Every level of the maze uses the same logic: each one checks for collisions based on whether the player’s “pawn” has come into contact with a certain color. Why would this JavaScript maze game fail on certain levels on mobile? In-house, we couldn’t replicate the issue no matter which device we used. Although the clients aren’t as tech-savvy as their company advertises them to be, they had a history of honestly reporting the issues they faced.

This was a strange issue. Because we couldn’t replicate it, we couldn’t exactly test it. Although we would have loved to dismiss the client’s issues, they had done us the favor of sending screenshots of something that clearly should not have been happening: the player’s pawn moving through walls. We figured that the only possible causes were a bug in the JavaScript engine or its associated drawing engine, or a bug in our own code.
We spent a number of hours puzzling over the issue, trying every move possible in the maze in a vain attempt to break the collision detection. Since we had the client’s screenshots, I took a look at them and found that some of the colors were very slightly off. Because we were doing collision detection against a single hex color, it became clear what the problem was, but not what was causing it. The colors were changing on some of the levels, allowing the pawn to blast through the walls like the Kool-Aid Man.
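To make the failure mode concrete, here is a minimal sketch of what an exact-match color check might look like. This is not our actual game code: the names `WALL_COLOR`, `pixelToHex`, and `hitsWall` are hypothetical, the color values are illustrative, and the pixel data is assumed to have the same layout as the `ImageData` object returned by a canvas context's `getImageData`.

```javascript
// Hypothetical wall color, chosen only for illustration.
const WALL_COLOR = "#fdfefc";

// Convert one pixel of an ImageData-like object ({ width, data }) to "#rrggbb".
function pixelToHex(imageData, x, y) {
  const i = (y * imageData.width + x) * 4; // 4 bytes per pixel: R, G, B, A
  const [r, g, b] = imageData.data.slice(i, i + 3);
  return "#" + [r, g, b].map((c) => c.toString(16).padStart(2, "0")).join("");
}

// Exact-match collision test: any deviation in any channel counts as "not a wall".
function hitsWall(imageData, x, y) {
  return pixelToHex(imageData, x, y) === WALL_COLOR;
}

// A 2x1 test image: the wall color as authored, then the same color
// shifted by one step in the red and green channels.
const sample = {
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([0xfd, 0xfe, 0xfc, 255, 0xfc, 0xff, 0xfc, 255]),
};

console.log(hitsWall(sample, 0, 0)); // true
console.log(hitsWall(sample, 1, 0)); // false: the pawn walks straight through
```

With a check like this, a color that is off by a single step in one channel is indistinguishable from open floor, which matches what the screenshots showed.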

Judging by the wall colors in the screenshots the client gave us, the color values were being “rounded” somehow. A color like #fdfefc would end up as something like #fcfffc. Because this only occurred on mobile for our clients, we guessed that their country’s mobile phone provider was responsible for the changes, perhaps through some sort of compression used to reduce bandwidth.
That was never proven, but by changing the wall colors to the “rounded” values from the screenshots, we were able to fix the collision detection issue.
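Our fix was simply to swap in the rounded colors. A different approach, which we did not use, would be to compare colors with a small per-channel tolerance so that slight shifts still register as a wall. The sketch below assumes hypothetical helpers `hexToRgb` and `colorsMatch` and an arbitrary default tolerance of 2; the right threshold would depend on how far apart the wall and floor colors really are.

```javascript
// Parse "#rrggbb" into [r, g, b] integers in the range 0-255.
function hexToRgb(hex) {
  const n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// Treat two colors as equal if every channel differs by at most `tolerance`.
// The default of 2 is an assumption for illustration, not a measured value.
function colorsMatch(a, b, tolerance = 2) {
  const rgbA = hexToRgb(a);
  const rgbB = hexToRgb(b);
  return rgbA.every((c, i) => Math.abs(c - rgbB[i]) <= tolerance);
}

console.log(colorsMatch("#fdfefc", "#fcfffc"));    // true: channels differ by 1, 1, 0
console.log(colorsMatch("#fdfefc", "#fcfffc", 0)); // false: exact match required
console.log(colorsMatch("#fdfefc", "#000000"));    // false: clearly a different color
```

The trade-off is that a tolerance only works if no walkable color sits within that distance of the wall color, so the palette has to be chosen with that gap in mind.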

Sometimes bug hunting can take a large number of hours, but that time can be reduced by asking the right questions:

  • What happened, and what should have happened?
  • In what environment did this happen? (OS, network, browser and version)
  • Can this issue be replicated? What are the steps?
  • How often does this issue occur? (Always, usually, sometimes)

While debugging this issue, we spent too much time assuming that our mobile phones would behave the same way as our client’s. That was not the case, and we only made progress once the client provided screenshots.

 
