Saturday, 20 June 2015

A Letter from DASH's Chief Engineer

My name is Bill, and I'm the fellow who runs DASH. I appreciate that you may have experienced some issues with our site over the past several weeks. It has been personally and professionally awful for us as well, and I am sorry for the frustration and inconvenience this may have caused you and many other users.

I wanted to give you a summary of the situation. We have been live with the DASH Marketplace for about 2 years now. We have had 4 interrupted auction days in total over that period, but 3 of those were unfortunately in the last 3 weeks. That is 4 too many interruptions, of course, but we're doing the best we can.

The fact of the matter is that we have grown very strongly over these past two years, and this growth caught up with us. Three weeks ago, we began working to improve our capacity to handle the load. We were not able to roll out our first improvement in time to prevent our first auction interruption. Once we did roll it out, we discovered that we needed a second enhancement. While we were working on that, our DNS provider (completely unrelated to the load issues) decided that we needed to re-work how users reached our servers via their ISPs. That caused some "downtime" even though our systems were fully functional.

Since we implemented that second enhancement for handling load, it has resulted in a 300% performance improvement for many of our users while at the same time reducing our server load. Or so we thought...

An unintended by-product of our enhancement was additional load on our servers each time a listing closed. This past Sunday evening, we were closing a listing about every 2.5 seconds for an hour. After about 45 minutes of this, the actual load on our servers was astronomical, and the fault lay in the code of the very thing that should have made everyone's experience so much better. Two steps forward, but a frustrating step backward.

All I can say is we have now corrected this issue and that we will be "all hands on deck" to ensure that everyone's experience is seamless this week. As you can imagine, it is extremely difficult to simulate these circumstances. Though it may not seem like it, the team at DASH is very accomplished and has managed systems for literally millions of users.

Thanks to everyone for your patience and I hope to see you online throughout the week!
