
Can't indexing

Posted: Thu Nov 02, 2023 9:18 pm
by Equinoxe58
Hi,
I have a problem with indexing.
Most of my links (around 95%) are not indexed; the log shows the message "Page contains less than 10 words".
Doesn't Sphider look at the page contents that come from the MySQL tables?
Is there something I have configured incorrectly?
Thank you in advance for your help.
(My links use http because I am currently working with WampServer.)

See the log below:

65. Retrieving: http://la-machine/divers/forum-jeu-de-piste.php at 17:18:23.
Size of page: 8.89kb. Starting indexing at 17:18:23. Page contains less than 10 words
Links found: 50. New links: 0
66. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=1 at 17:18:23.
Size of page: 11.84kb. Starting indexing at 17:18:23. Page contains less than 10 words
Links found: 63. New links: 0
67. Retrieving: http://la-machine/divers/histoires-mach ... .php?id=10 at 17:18:23.
Size of page: 17.49kb. Starting indexing at 17:18:24. Page contains less than 10 words
Links found: 63. New links: 0
68. Retrieving: http://la-machine/divers/histoires-mach ... .php?id=11 at 17:18:24.
Size of page: 13.18kb. Starting indexing at 17:18:24. Page contains less than 10 words
Links found: 63. New links: 0
69. Retrieving: http://la-machine/divers/histoires-mach ... .php?id=12 at 17:18:24.
Size of page: 16.70kb. Starting indexing at 17:18:24. Page contains less than 10 words
Links found: 63. New links: 0
70. Retrieving: http://la-machine/divers/histoires-mach ... .php?id=13 at 17:18:24.
Size of page: 13.96kb. Starting indexing at 17:18:24. Page contains less than 10 words
Links found: 63. New links: 0
71. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=2 at 17:18:24.
Size of page: 12.23kb. Starting indexing at 17:18:24. Page contains less than 10 words
Links found: 63. New links: 0
72. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=3 at 17:18:24.
Size of page: 12.28kb. Starting indexing at 17:18:25. Page contains less than 10 words
Links found: 63. New links: 0
73. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=4 at 17:18:25.
Size of page: 10.65kb. Starting indexing at 17:18:25. Page contains less than 10 words
Links found: 63. New links: 0
74. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=5 at 17:18:25.
Size of page: 11.58kb. Starting indexing at 17:18:25. Page contains less than 10 words
Links found: 63. New links: 0
75. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=6 at 17:18:25.
Size of page: 17.97kb. Starting indexing at 17:18:25. Page contains less than 10 words
Links found: 63. New links: 0
76. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=7 at 17:18:25.
Size of page: 17.40kb. Starting indexing at 17:18:25. Page contains less than 10 words
Links found: 63. New links: 0
77. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=8 at 17:18:25.
Size of page: 13.55kb. Starting indexing at 17:18:26. Page contains less than 10 words
Links found: 63. New links: 0
78. Retrieving: http://la-machine/divers/histoires-machinoises.php?id=9 at 17:18:26.
Size of page: 11.67kb. Starting indexing at 17:18:26. Page contains less than 10 words
Links found: 63. New links: 0

Re: Can't indexing

Posted: Fri Nov 03, 2023 3:11 am
by captquirk
Sphider looks for the actual page content.
I would need to see the actual page to determine the issue.
If a page consists only of links and/or images and no actual text, it is possible there will be "fewer than 10 words".
One way to see for sure is to clear the site, go into settings and change the minimum number of words to 0, then try indexing again.
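
To illustrate how a page of several kilobytes can still come out at "fewer than 10 words", here is a minimal Python sketch of the general idea: strip the markup, then count what is left. It is not Sphider's actual code (Sphider is written in PHP and its extraction rules may differ); the names TextExtractor, count_words and MIN_WORDS are made up for this example.

```python
from html.parser import HTMLParser
import re

MIN_WORDS = 10  # hypothetical threshold, mirroring the "10 words" in the log


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def count_words(html):
    """Return the number of word-like tokens in the visible text of html."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(re.findall(r"\w+", " ".join(parser.parts)))


# A page made only of links and images: plenty of markup, almost no text.
links_only = (
    '<html><body><a href="a.php"><img src="a.png"></a>'
    '<a href="b.php">Next</a></body></html>'
)
print(count_words(links_only), count_words(links_only) < MIN_WORDS)
```

Running something like this against the saved HTML of a failing page gives a rough idea of how much plain text survives once tags are removed.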

Re: Can't indexing

Posted: Sat Nov 04, 2023 1:49 pm
by Equinoxe58
Hi,
My pages do not contain only links and images.
There is a lot of text in those pages.
I have set the minimum number of words to 0 for testing.
My pages are now indexed, but in the database the fulltext field is empty when description is filled, and fulltext is filled when description is empty. Do you understand?
Thank you for your help.

Here are two examples from my database:

Re: Can't indexing

Posted: Sat Nov 04, 2023 4:10 pm
by captquirk
I think I understand. A page gets a title and description but no fulltxt
OR
fulltxt but no title or description.

Do keywords get indexed for pages with fulltxt?
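
A quick way to see how the rows split between the two cases is a query that labels each link by which fields are filled. The sketch below is only an illustration: it uses an in-memory SQLite database in place of MySQL, and the table name links, its columns (url, title, description, fulltxt) and the three sample rows are assumptions and placeholders, so check them against your own schema and table prefix before adapting the query.

```python
import sqlite3

# Assumed table and column names; placeholder rows, not real data.
QUERY = """
SELECT url,
       CASE
         WHEN COALESCE(fulltxt, '') = '' AND COALESCE(description, '') <> ''
           THEN 'description only'
         WHEN COALESCE(fulltxt, '') <> '' AND COALESCE(description, '') = ''
           THEN 'fulltxt only'
         WHEN COALESCE(fulltxt, '') <> ''
           THEN 'both'
         ELSE 'neither'
       END AS state
FROM links
ORDER BY url
"""


def classify_links(conn):
    """Return (url, state) pairs describing which text fields are filled."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links (url TEXT, title TEXT, description TEXT, fulltxt TEXT)"
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?)",
    [
        ("http://example.test/a.php", "Title A", "Some description", ""),
        ("http://example.test/b.php", "", "", "Body text of page b"),
        ("http://example.test/c.php", "Title C", "Description C", "Body text"),
    ],
)

for url, state in classify_links(conn):
    print(url, state)
```

The SELECT statement itself is plain SQL and should carry over to MySQL once the table and column names match.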

I would like to see the source code for a page from each type. Example: sport/resultatclub-2023?id=96 and recits/fanfare-et-harmonies.php. DO NOT POST THESE TO THE FORUM! Email them to me (compressed, .gz or .zip) at an address I will leave for you in a private message. This is a rather odd problem and I want to be sure there is no unusual issue in the page while keeping this information from the public.