Talking Sober’s maintenance is now complete and the community is available as usual again.
As you may have noticed, Talking Sober was under an exceptionally high load over the last few days. It then went offline and was unavailable from yesterday until now. Due to these technical difficulties, I had no choice but to completely reinstall Talking Sober and restore it from a backup.
For those interested, here is the story with more detail:
Talking Sober started taking a heavy performance hit over the last few days. This eventually caused the community to go into high volume mode, where it was unavailable due to a lack of system resources. Normally this should never happen, as the community’s infrastructure is more than adequate. When I went through the logs to see what was going on, I found a large-scale attack from many different devices and IP addresses trying to breach the system’s admin account by brute force. They were unsuccessful, but the scale of the attack slowed the site down, as it constantly had to process these authentication attempts. I am not sure whether this was a manual attempt or an automated one by bots.
After seeing this, I made several improvements to the system’s security. However, these changes required a system reboot and a Discourse reboot. Here I made the mistake of trying to fix things too quickly, because I was nervous about the attacks. The reboots triggered an upgrade of Discourse. Normally this would not be an issue, but for some reason a few duplicate database entries crept in and prevented the database from starting correctly. The database was thus unusable.
Fortunately, we had a relatively recent full backup of the community, which I was able to restore after some manual tweaks. It took me a good few hours, but I finally got the site back up. Unfortunately, this means that some data has been lost: I was only able to restore data up until the 16th of June 2021.
Here I would like to give massive thanks to our @patrons. It was thanks to their support that I was able to set up and maintain the automatic backup system. These backups honestly saved the day in a big way.
I’d also like to thank our @moderators who helped communicate and keep things calm.
Here are some after-effects to be aware of:
- You may need to log in again
- If you don’t know your username or password, make use of the forgotten password feature. Be sure to keep an eye on your spam and junk folders too.
- The server might be a bit slow for the next few hours as it ‘warms up’
- There might be a few configuration items that I missed. If something seems broken, feel free to mention it here.
- Going forward, the server will be faster and more stable with improved security
- Unfortunately your posts from the last day or two might have been lost
As always, reach out if you have any concerns. I’ve replied to several emails, but I’m sure there are more questions from the community that I haven’t gotten to yet. For now I am going to bed to get some rest. The last 24 hours have been brutal.