Limitations at scale?

Posted: Sat Dec 17, 2022 8:21 am
by kraisor
Hey again,

I'm looking at using this again (I very much appreciate the great job you've done keeping it updated!) and wanted to get your thoughts on its upper limits.

My plan involves crawling and indexing multiple domains with a few million URLs. Do you feel that the software would be able to keep up with that, including cases where content gets updated frequently?

Thanks

Re: Limitations at scale?

Posted: Sat Dec 17, 2022 8:53 am
by kraisor
I forgot to ask: how is Sphider doing currently with JavaScript rendering? Some of the sites I'd like to index were built with JavaScript.

Re: Limitations at scale?

Posted: Sat Apr 08, 2023 7:53 pm
by captquirk
First, sorry for the long delay in responding. I was having some health issues.

Sphider does not do very well at indexing JavaScript generated content.

As to scalability, Sphider is mainly intended for indexing personal sites. HOWEVER, you can index many websites. I have no idea what the upper limits may be, but they will most likely depend on the machine setup: available disk space, MySQL settings regarding memory, swap file sizes, etc. I have successfully indexed as many as 25 sites (of varying sizes, tiny to OMG!) and it stayed functional. If you get really big, you may notice it runs slower, especially when you perform a maintenance function. Backups were a concern at one time, but I THINK I have that straightened out. (One can hope, anyway.)
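
For anyone trying to size up the "few million URLs" question, here is a rough back-of-the-envelope sketch in Python. It is not part of Sphider and says nothing about what Sphider can actually sustain. It only works out the fetch rate that a given re-index schedule would imply. The function name and the example figures (3 million URLs, weekly re-index) are made-up assumptions for illustration.

```python
def required_pages_per_second(total_urls: int, reindex_interval_days: float) -> float:
    """Sustained fetch rate needed to revisit every URL once per interval."""
    if total_urls < 0:
        raise ValueError("total_urls must be non-negative")
    if reindex_interval_days <= 0:
        raise ValueError("reindex_interval_days must be positive")
    seconds_in_interval = reindex_interval_days * 24 * 60 * 60
    return total_urls / seconds_in_interval


if __name__ == "__main__":
    # Hypothetical workload: 3 million URLs, each re-indexed once a week.
    rate = required_pages_per_second(3_000_000, 7)
    print(f"Sustained rate needed: {rate:.2f} pages/second")
```

Comparing a number like that with the rate you actually observe when indexing a small test site on your own machine should give a hint as to whether the plan is realistic for your setup.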