Problem with indexing - Timeout occurred

maraka
Posts: 2
Joined: Thu Aug 10, 2023 5:31 am

Post by maraka »

Hello.

Whenever I index any website, the server crashes after about a minute and an error pops up: "A timeout occurred". I'm using CloudFlare.
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Post by captquirk »

I have no experience with CloudFlare. However, I do know CloudFlare addresses web security, and one of those issues is DDoS.
The timeout you are getting is a 524, which is specific to CloudFlare.
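
To make the distinction between status codes concrete, here is a minimal Python sketch that tells standard HTTP timeout statuses apart from CloudFlare-specific ones. It is not part of Sphider; the function name `describe_timeout` and the lookup table are illustrative assumptions.

```python
from http import HTTPStatus

# CloudFlare-specific statuses; these are not part of the HTTP standard,
# so they are listed by hand here (hypothetical helper table).
CLOUDFLARE_TIMEOUTS = {
    522: "Connection Timed Out",
    524: "A Timeout Occurred",
}


def describe_timeout(status: int) -> str:
    """Return a short description of a timeout-related status code."""
    if status in CLOUDFLARE_TIMEOUTS:
        return f"{status} {CLOUDFLARE_TIMEOUTS[status]} (CloudFlare-specific)"
    if status in (HTTPStatus.REQUEST_TIMEOUT, HTTPStatus.GATEWAY_TIMEOUT):
        return f"{status} {HTTPStatus(status).phrase} (standard HTTP)"
    return f"{status} is not a timeout status"
```

A 524 in particular means CloudFlare reached the origin server but did not get a response in time, so the slow part is on the server side, not in the browser.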

Sooo, as someone who has no knowledge of the workings of CloudFlare, I am going to make a wild guess! Sphider by default has no delay between requests during the indexing process. MAYBE this is being interpreted as a DDoS attack??? On the "Settings" tab, in the "Sphider settings" section, the "Minimal delay between page downloads" item... The default is 0. Try increasing this, say to 30. This will slow Sphider down, but IF this works it is WAY better than a timeout.

If this has no effect, we have to look elsewhere, probably do some research, or find someone experienced with CloudFlare.
maraka
Posts: 2
Joined: Thu Aug 10, 2023 5:31 am

Post by maraka »

It's not even about CloudFlare; I turned CF off. When I start indexing a website again, a white page appears after about a minute and the index never completes. I use shared hosting, and when I start indexing, the server can't keep up and crashes... Is it possible to somehow throttle the indexing?
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Post by captquirk »

When the blank page shows, look at the top of the browser. Does the tab indicate activity, perhaps with a spinning circle or some other graphic? That would mean there is still activity. Sphider often looks like it is doing nothing, then shows a burst of text followed by more waiting. If there is no indication that the browser is "busy", then yes, it crashed. If there is activity, wait it out until the activity has ceased. You should see something other than just a blank page. Also, check the log files afterward.

If it IS crashing because Sphider is too aggressive, you can slow Sphider down. On the "Settings" tab, in the "Sphider settings" section, the "Minimal delay between page downloads" item... The default is 0. Try increasing this, say to 30.
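
To illustrate what a minimum delay between page downloads does, here is a small Python sketch of a throttled fetch loop. It is NOT Sphider's code (Sphider is written in PHP); the function `crawl_with_delay`, its parameters, and the choice of seconds as the unit are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, List


def crawl_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Fetch each URL, keeping at least min_delay seconds between request starts.

    `fetch` is supplied by the caller (hypothetical downloader); `sleep` and
    `clock` are injectable so the loop can be tested without real waiting.
    """
    pages: List[str] = []
    last_start = None
    for url in urls:
        if last_start is not None:
            # Wait only for the part of the delay that has not already elapsed.
            remaining = min_delay - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        pages.append(fetch(url))
    return pages
```

With `min_delay=0` the loop requests pages back to back, which is what can overwhelm a small shared host. Raising the delay trades indexing speed for a lighter load on the server.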

If you STILL can't get Sphider to stay alive, you need to talk to your host support. Depending on the provider, some hosting companies severely limit users on shared hosting.
Equinoxe58
Posts: 14
Joined: Fri Oct 06, 2023 8:02 pm

Post by Equinoxe58 »

Good morning,
I have the same problem. I get a blank page and then after about a minute, I get the "connection timeout" message.
I tried your solution with no results.
Do you have another lead?
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Post by captquirk »

Timeout errors are USUALLY 504 HTTP errors, or sometimes a generic 500.
Since these errors come from the browser's connection to the web server, I have found that they can be avoided by running the indexing from a command prompt.

Now, if you happen to have Sphider installed locally and you are indexing a remote site, this is no issue at all.

If Sphider is installed on the same host as the web site, then depending on your host and host plan type, you MIGHT have command prompt access, in which case you can index from there.

If you have no command prompt access, I am (currently) at a loss for solutions.

That said, I have heard from others with similar (non-Sphider) timeout issues that increasing the default PHP timeout (the max_execution_time setting) can help.

Personally, I have had increasing timeout issues with my own host on my personal site. I am NOT on shared hosting and my host support has been very helpful, but even they have not found a solution. My thought is that in this day of malicious attackers on the internet, there may be some security layer that considers many requests from a single IP to be an attack and cuts them off.
Another thought I have considered (but not yet tried) is that while my site's IP itself is not shared, the SQL server is. Sometime I may try (but it costs money) to get my own virtual SQL server. MAYBE that will help?

There are just SO MANY factors that go into timeouts... :|
Equinoxe58
Posts: 14
Joined: Fri Oct 06, 2023 8:02 pm

Post by Equinoxe58 »

Good morning,
I managed to index my remote site with Sphider installed locally on WampServer. It worked well.
Now I have 2 questions.
French has many accented letters, like é, è, à, ù, ô...
If someone searches for eleve (without the accents) instead of élève, can Sphider still return élève?
Do you understand my request?
And finally, is it possible to index just a small part of the site, for example after adding a short text, without having to re-index everything?
Thank you in advance for your answers.
And congratulations again for your work. This is a very nice program!
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Post by captquirk »

For your first question, the search term must be an EXACT match, so 'eleve' will not match 'élève'.
It MIGHT be possible to get a match with wildcards, such as '*ve' (* is a wildcard), but no guarantee. (I think you may need 3 valid characters in addition to the wild card.)
Another possibility is to index the site (original index, not re-index) with word stemming enabled. No promises that will work, but it might be possible. I am no expert on the specifics of word stemming, but it is supposed to find similar words when matching. There is a French version of word stemming included with Sphider.
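
For reference, the usual technique for this kind of request is "accent folding": both the query and the indexed word are reduced to an unaccented form before comparing. The Python sketch below shows the idea only. It is NOT something Sphider does out of the box, and the function names `fold_accents` and `accent_insensitive_match` are made up for the example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text lowercased with accents removed, e.g. 'élève' -> 'eleve'."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def accent_insensitive_match(query: str, word: str) -> bool:
    """Compare two words while ignoring accents and letter case."""
    return fold_accents(query) == fold_accents(word)
```

One trade-off to be aware of: folding also makes distinct words such as 'élève' and 'élevé' compare as equal, so a search using it would return both.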

As to the second question, you can try to index a specific page. It MIGHT be necessary to also include that page as a MUST INCLUDE to avoid indexing links contained on that page.