Using Sphider 3.3.0-MB, I added a new site:
http://www.marquis-kyle.com.au/
I set index level to: full
In settings, I checked "Index pdf".
Minimum words per page is 10, minimum word length is 3.
Using these settings, I ran the spider. There were 1668 links discovered, a couple being duplicates.
The end result was 989 pages indexed with 18,686 keywords.
466 pages were not indexed due to a no index flag in the meta tag.
A couple of pages reported having fewer than 10 words. I looked at one, which was actually a pdf containing only images, so no surprise there.
It is strange that you are only getting 146 pages. When looking at the "Sites" tab, is there any indication that indexing was not completed? If so, try resuming. The only other thing I can think of is a possible timeout issue. Try indexing from the command prompt instead of through the browser.
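For anyone wondering what a "no index flag" looks like, it is a robots meta tag in the page head. Here is a rough Python sketch that checks a page for one. This is NOT Sphider's own code (Sphider is PHP); the class and function names are just made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Looks for <meta name="robots" content="... noindex ..."> (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex
```

Running has_noindex() over the source of a page that was skipped should show whether the flag is really there.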
Now, concerning the problem with "./../aboutblog.htm" or "./../autobio.htm"...
Your method of referencing these pages is somewhat indirect, although COMPLETELY VALID! Any browser worth its salt will properly interpret the reference. Looking at the Sphider code, it SEEMS it should be functioning the same way. Why it is not doing so for this site is something I need to look at more deeply.
As I said, the method IS VALID, but if I may borrow from one of your pages, "Do as much as necessary, as little as possible."
Consider replacing "./../aboutblog.htm" with simply "/aboutblog.htm". Both a browser and Sphider will understand.
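To show that the two forms point at the same place, here is a small Python sketch using the standard library's URL resolver. The base page "/blog/index.htm" is only my assumption for illustration (I have not checked which pages actually carry these links); it just needs to be a page one directory below the site root.

```python
from urllib.parse import urljoin

# Hypothetical page one directory below the root (an assumption, not a real path I verified).
base = "http://www.marquis-kyle.com.au/blog/index.htm"

# The indirect reference: "./" stays in the current directory, "../" climbs one level.
indirect = urljoin(base, "./../aboutblog.htm")

# The root-relative reference: always resolved from the site root.
direct = urljoin(base, "/aboutblog.htm")

print(indirect)
print(direct)
```

Both calls resolve to the same address for a page at that depth. The root-relative form has the added advantage that it resolves the same way no matter how deep the referring page sits.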
Back to the issue of only getting 146 pages indexed....
Sphider can index from a sitemap, but it has to be in the form of "sitemap.xml". This is a different format than your "sitemap.shtml". During normal indexing, of course, Sphider will read and pick up links from the shtml page just as it would from any other valid html page. Just out of curiosity, I ran a sitemap generator on
http://www.marquis-kyle.com.au/, and got one with 1008 links, including 6 pdf files. (I used Sitemap Generator 9 from Microsys.
http://www.microsystools.com/products/s ... generator/)
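If you would rather not install a generator, a bare-bones "sitemap.xml" in the standard sitemaps.org format is easy to produce yourself. Below is a minimal Python sketch, assuming you already have a list of page URLs. The function name and the two sample URLs are only illustrative, and I have not verified whether Sphider looks at anything beyond the loc entries.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text with one <url><loc>...</loc></url> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = address
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Sample URLs for illustration only.
sitemap_text = build_sitemap([
    "http://www.marquis-kyle.com.au/",
    "http://www.marquis-kyle.com.au/aboutblog.htm",
])

# Write it out as UTF-8 to match the XML declaration.
with open("sitemap.xml", "w", encoding="utf-8") as handle:
    handle.write(sitemap_text)
```

The resulting file would go in the site root next to your existing "sitemap.shtml".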
Let me know how you make out, and if you are still having issues I'll see what else I can come up with.