How to analyze incoming traffic following recent site crashes

Hi Guys,

Can anyone give me any advice on how to analyse why my site has been getting a lot of downtime recently?

My setup is as in Craig's videos: an Ubuntu server with 4GB RAM.

To get the site going again I need to restart the server and then restart Redis with:
sudo systemctl restart redis-server
However, often as soon as I restart Redis the site goes down again…
Then eventually it works fine for some reason… Sometimes after 5 minutes, sometimes after an hour…

In the past I've already upped the memory limit on Redis to 2GB and it seemed to be working fine.
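(If the box is running out of memory, the kernel's OOM killer may be what's taking Redis down. A quick way to check, assuming systemd and the standard `redis-server` unit name:)

```shell
# Look for kernel OOM-killer activity (a common cause of services
# dying unexpectedly on a 4GB box)
dmesg | grep -i 'out of memory'

# Recent log entries for the Redis service itself
journalctl -u redis-server --since "1 hour ago" | tail -50
```

If `dmesg` shows "Out of memory: Killed process … (redis-server)", the crashes are a memory-sizing problem rather than a traffic problem.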

Recently I've noticed that I've been getting a lot of spam subscriptions to the newsletter, so I activated Captcha.
So I was thinking that it may be a good idea to try blocking the incoming traffic from certain IP addresses on my server to see if that helps.
If the incoming traffic is from certain foreign countries, I have no need for any traffic from there…
Is this a good idea?
And any idea how I can see what IP addresses have been coming to the website just before my site goes down?
Are there any good tools or commands I can use to see this through the SSH interface?

Also, I did recently disable a search bar module, as I thought the problem might be coming from that.
Could any cron tasks have anything to do with it if a module has been disabled?

Any advice would be great.
Thanks Andy.

I have also just noticed on the search terms list that there are a few search terms whose counts have gone up by crazy numbers. Captcha doesn't seem to be doing anything to prevent it. Any ideas?

You’ll probably end up blocking Search Engines from crawling your site.

Also, check out Restrict specific countries from accessing my Magento store

They’ll be in your apache logs. Look for a file with “ssl” and “access” in the name within /var/log/apache2. There are tools you can use to make reading these files easier, so you might wanna do some Google searches to find what they are.
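As a rough sketch (the log path and file name may differ on your setup), a one-liner like this will show which client IPs are hitting the site most often:

```shell
# Print the 10 most frequent client IPs in the access log.
# In Apache's default combined log format the client IP is the first field.
awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -10
```

A handful of IPs with counts far above everything else is usually a bot or a scraper.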

As for tools and stuff to help you monitor/administer traffic and resources, you'd be better off asking on one of the following forums (as this question mostly leans towards DevOps):

Thanks Craig!
I’ll take a look.

I'm still having a bit of a nightmare with the site at the moment, as it still keeps going down every day.

To see the last 100 log entries, as I understand it, I used the following command:

sudo tail -100 /var/log/apache2/access.log

From what I could see, I've got a bot called SemrushBot with IP addresses from Moldova that has been spamming my search bar. I've added this to the bottom of my
.htaccess file and blocked all of the IP addresses from Moldova.

<Limit GET HEAD POST>
  order allow,deny
  allow from all
  deny from 5.32.168.0/21
  deny from IP ADDRESS RANGE
  deny from IP ADDRESS RANGE
  etc etc...
</Limit>

I find this makes the .htaccess file look a bit of a mess… Especially if I have to add more IP addresses in the future…
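(For what it's worth: if your Apache is 2.4 or newer, the `Require` syntax from mod_authz_core reads a little more cleanly than the old `order allow,deny` style. A sketch, with your own ranges substituted in:)

```apache
# Apache 2.4+ equivalent of an allow-all-except-these-ranges block
<RequireAll>
    Require all granted
    Require not ip 5.32.168.0/21
    # add further "Require not ip" lines for other ranges
</RequireAll>
```

Note that mixing old `order`/`allow`/`deny` directives with `Require` in the same scope can behave unexpectedly, so pick one style.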

I've tried other methods that I found on other forums, but this one seems to prevent the bot from spamming the search bar, as in the Magento back office the count for a certain search term is no longer increasing.

However, in the logs I can still see the bot, and the site still goes down often.
So if the bot is no longer able to access the site, then I don't understand why I can still see it in the logs…
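(One thing worth knowing here: a `deny` in .htaccess doesn't stop requests from reaching Apache at all; it just answers them with a 403 response, so the bot will still appear in the access log. A rough way to confirm the block is actually working, assuming the default combined log format where the status code is the 9th field:)

```shell
# Count the HTTP status codes returned to SemrushBot requests.
# If the block is working you should see 403s rather than 200s.
grep 'SemrushBot' /var/log/apache2/access.log | awk '{print $9}' | sort | uniq -c
```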

Also, to get the site going again I need to restart the server and then Redis. But as soon as I restart Redis, the site goes down again after a short time.
I don't understand why, after restarting the server and Redis a few times, the site works OK for a while until it goes down again…

Any advice any one has will be more than welcome :pray: !!

I found one of the tools that I mentioned in my last post (apacheviewer.com). You'll find it much easier to trawl through your Apache logs with that.

SemrushBot is legit, like. It's just a web crawler. Check out this article about Good vs Bad bots.

Also, I've been told that using .htaccess to block traffic isn't a good idea. Either use a managed firewall or integrate Cloudflare into your website. Both of those are considered good practice.
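(On Ubuntu, one simple host-level option, assuming UFW is available as it is on a stock install, is to push the blocks down into the firewall so Apache never has to process the traffic at all:)

```shell
# Block a whole CIDR range at the firewall instead of in .htaccess
sudo ufw deny from 5.32.168.0/21

# Review the current rules
sudo ufw status numbered
```

Rules added this way survive Apache restarts and don't clutter the site config.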

As a final thought, it’s entirely possible that you’re barking up the wrong tree. It could simply be that your config files for Redis, Apache, MySQL, etc are not tweaked for the type of traffic that you’re receiving. For more help on configuration optimisation, you’ll want to turn to serverfault.com. That forum is tailored for your question.

Hi Craig,
Yeah, the story of my life. I always seem to be barking up the wrong tree :joy:
Anyway the log view on apacheviewer.com is a great tool. I managed to find where the bad traffic was coming from.

Even though SemrushBot is legit, it was bombarding my search bar, so I had no choice but to block it. And in the good vs bad bots link you gave, it is listed as a 'bad' bot.
I've also had some traffic trying to find the URL of my back office, which is nice :face_with_symbols_over_mouth: They were overloading my server and had to be blocked too…
So I imagine that monitoring the incoming traffic is going to be a constant battle…
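(One gentler option alongside the IP blocking: SemrushBot documents that it honours robots.txt, so a couple of lines there should stop the crawling without any firewall rules, though it can take a while for the bot to re-read the file:)

```text
# robots.txt — disallow SemrushBot entirely
User-agent: SemrushBot
Disallow: /
```

Well-behaved crawlers stop on their own; anything that ignores this is worth blocking outright.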

At the moment I'm still using the .htaccess to block the traffic, but it's calmed things down for the moment, as the website's stable again.
So I'm now researching how to set up a firewall with AWS.
Thanks again for the advice.


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.