Search found 302 matches

by captquirk
Fri Dec 15, 2023 4:37 pm
Forum: Sphider Help
Topic: Question about the sitemap
Replies: 4
Views: 2715

Re: Question about the sitemap

:D Thanks!
by captquirk
Fri Dec 15, 2023 4:36 pm
Forum: Sphider Help
Topic: indexed with the message "Page contains less than 10 word BUG???
Replies: 3
Views: 2936

Re: indexed with the message "Page contains less than 10 word BUG???

First off, the bug got fixed. But on reflection and testing, the ability to index decimals may be pointless! Why? You can't search ("and", "or") for a word containing a period (.) or comma (,)! For example, a search for 1.06 (or 1,06) will translate into a search for "1 06...
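To illustrate the effect described above, here is a minimal sketch of a query tokenizer that treats periods and commas as separators. This is not Sphider's actual code: `splitQuery()` is a hypothetical helper, and the set of separator characters is an assumption made for the example.

```php
<?php
/**
 * Hypothetical helper (not part of Sphider): split a query into terms,
 * treating whitespace, periods and commas as separators.
 */
function splitQuery(string $query): array
{
    // PREG_SPLIT_NO_EMPTY drops empty strings caused by leading or trailing separators.
    return preg_split('/[\s.,]+/', $query, -1, PREG_SPLIT_NO_EMPTY);
}

// Both spellings of the decimal collapse to the same two terms.
echo implode(' ', splitQuery('1.06')), PHP_EOL; // 1 06
echo implode(' ', splitQuery('1,06')), PHP_EOL; // 1 06
```

With a splitter like this, an indexed keyword "1.06" can never be matched as a single term, because the query side never produces it.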
by captquirk
Thu Dec 14, 2023 12:38 am
Forum: Sphider Help
Topic: Question about the sitemap
Replies: 4
Views: 2715

Re: Question about the sitemap

Tentative enhancement to allow reading xml.gz files:

/**
 * Read the sitemap.xml file on the server
 *
 * @param string $input_file Sitemap file name
 *
 * @return array $links Array of links found in sitemap.xml
 */
function getSiteMap($input_file)
{
    $links = '';
    $sitemap = simplexml_load_file($input_fil...
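Since the snippet above is cut off, here is a separate, minimal sketch of one way to read a sitemap that may be either plain XML or gzip-compressed. It is not the enhancement from the post: `readSitemapLinks()` is a hypothetical name, and detecting gzip by its leading magic bytes is an assumption chosen for the example. It needs PHP's zlib and SimpleXML extensions.

```php
<?php
/**
 * Hypothetical helper (not Sphider's getSiteMap()): return the <loc> URLs
 * from a sitemap that may be plain XML or gzip-compressed (.xml.gz).
 *
 * @param string $input_file Path or URL of the sitemap
 *
 * @return array List of URLs; empty if the file cannot be read or parsed
 */
function readSitemapLinks(string $input_file): array
{
    $data = @file_get_contents($input_file);
    if ($data === false) {
        return [];
    }

    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($data, "\x1f\x8b", 2) === 0) {
        $data = @gzdecode($data); // requires the zlib extension
        if ($data === false) {
            return [];
        }
    }

    // Collect libxml errors internally instead of emitting warnings;
    // a malformed document simply yields no links.
    $previous = libxml_use_internal_errors(true);
    $sitemap = simplexml_load_string($data);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($sitemap === false) {
        return [];
    }

    $links = [];
    // A <urlset> holds <url> entries; a <sitemapindex> holds <sitemap> entries.
    foreach ([$sitemap->url, $sitemap->sitemap] as $entries) {
        foreach ($entries as $entry) {
            $links[] = trim((string) $entry->loc);
        }
    }
    return $links;
}
```

Checking the content rather than the file extension means a server that serves a gzipped body under a plain .xml name is still handled.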
by captquirk
Wed Dec 13, 2023 4:08 pm
Forum: Sphider Help
Topic: Question about the sitemap
Replies: 4
Views: 2715

Re: Question about the sitemap

A short history of Sphider... Originally, Sphider read ONLY a sitemap.xml file. This worked fine on small websites, which Sphider was intended for. I did expand this so that Sphider could use sitemap.xml as an index, and now will accept further xml files. HOWEVER, I did NOT take xml.gz into consider...
by captquirk
Mon Dec 11, 2023 7:34 pm
Forum: Sphider Help
Topic: indexed with the message "Page contains less than 10 word BUG???
Replies: 3
Views: 2936

Re: indexed with the message "Page contains less than 10 word BUG???

Confirmed! Checking "Index decimals" DOES indeed cause Sphider to report "Page contains less than 10 words." In fact, in my test case, "less than 1 word!" Initial look at the code does not reveal any smoking guns, so this needs to be studied more to find a fix. This may b...
by captquirk
Sun Dec 10, 2023 3:22 am
Forum: Sphider Help
Topic: Problem with indexing - Timeout occurred
Replies: 7
Views: 3632

Re: Problem with indexing - Timeout occurred

For your first question, the search term must be an EXACT match, so 'eleve' will not match 'élève'. It MIGHT be possible to get a match with wildcards, such as '*ve' (* is a wildcard), but no guarantee. (I think you may need 3 valid characters in addition to the wildcard.) Another possibility is to...
by captquirk
Thu Dec 07, 2023 4:04 pm
Forum: Sphider Help
Topic: Problem with indexing - Timeout occurred
Replies: 7
Views: 3632

Re: Problem with indexing - Timeout occurred

Timeout errors are USUALLY 500 or 504 HTTP errors. Since these errors are related to browsers and their connections, I have found that they can be avoided by running the indexing from a command prompt. Now, if you happen to have Sphider installed locally and you are indexing a remote site, this is n...
by captquirk
Thu Dec 07, 2023 3:49 pm
Forum: Sphider Help
Topic: indexed with the message "Page contains less than 10 word BUG???
Replies: 3
Views: 2936

Re: indexed with the message "Page contains less than 10 word BUG???

A BUG! in MY code? Heaven forbid!

Well, to be honest, it is possible (because it has happened so many times before). :lol:

And considering your report on a "less than 10 words" solution, this is DEFINITELY something I will be looking into.

Thanks for the tip... and the tip-off.
by captquirk
Mon Nov 13, 2023 10:34 pm
Forum: Sphider MODS
Topic: mod for small parameter
Replies: 1
Views: 3581

Re: mod for small parameter

The search finds all results for a query. It displays the top result for each domain found, followed by the second-best result within that domain, indented. Sometimes there is no second result. There is no way to modify the code to show more. I think I understand what you are trying to accomplish, which is to ...
by captquirk
Thu Nov 09, 2023 11:54 pm
Forum: Sphider Help
Topic: Indexing a NEW site from the Index Tab fails in 5.4.0
Replies: 0
Views: 3337

Indexing a NEW site from the Index Tab fails in 5.4.0

When trying to index a NEW (not already in the database) site from the Index tab in 5.4.0, you just get a white screen. This will, of course, be corrected in the next release. In the meantime, however, this can be corrected by editing spider.php, line 844 to read: ."? , ? , ? , ? , ?, ?)" ...
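The corrected fragment above is a list of six `?` placeholders. Assuming, and this is only an assumption since the post is cut off, that the failure came from the number of placeholders not matching the number of values, the sketch below shows one way to generate the placeholder list so the two cannot drift apart. `buildInsertSql()` and the table and column names are hypothetical and are not taken from spider.php.

```php
<?php
/**
 * Hypothetical helper (not from spider.php): build an INSERT statement whose
 * placeholder count always equals the number of columns.
 */
function buildInsertSql(string $table, array $columns): string
{
    // One "?" per column, joined with commas.
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));

    return sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        $placeholders
    );
}

// Table and column names are made up for illustration.
echo buildInsertSql('example_sites', ['url', 'title', 'short_desc']), PHP_EOL;
// INSERT INTO example_sites (url, title, short_desc) VALUES (?, ?, ?)
```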