Re: ssl redirect

Posted: Tue Oct 09, 2018 9:53 pm
by gaddcasey1
So this is the solution I came up with: if a request comes from the server's internal address (10.1.0.2), the redirect will not happen.

Code: Select all

RewriteCond %{REMOTE_ADDR} !^10\.1\.0\.2$
RewriteCond %{HTTPS} !on
RewriteCond %{HTTP_HOST} ^www\.okanoganpud\.org*
RewriteRule ^(.*)$$1 [L,R=301]
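Note that the RewriteRule above appears to have lost its substitution URL when the post was rendered. A minimal sketch of the same idea follows, assuming the intended target is the HTTPS version of the requested URL. That target, the `[NC]` flag, the anchored host pattern, and the use of `%{REQUEST_URI}` are assumptions, not taken from the original post.

```apache
RewriteEngine On

# Skip the redirect for requests coming from the server's internal address
RewriteCond %{REMOTE_ADDR} !^10\.1\.0\.2$
# Only redirect requests that arrived over plain HTTP
RewriteCond %{HTTPS} !=on
# Only act on the www host (anchored, case-insensitive)
RewriteCond %{HTTP_HOST} ^www\.okanoganpud\.org$ [NC]
# ASSUMPTION: send the client to the same host and path over HTTPS
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```

Using `%{REQUEST_URI}` rather than `$1` should keep the rule working in both .htaccess and virtual host context, since the path matched by the pattern has a leading slash in one and not in the other.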

Re: ssl redirect

Posted: Tue Oct 09, 2018 10:11 pm
by captquirk
I tested, as you did. Again, Ubuntu 18.04, MySQL, Sphider 2.0.0c-PDO.

Where you got "NO HOST", I got:
[Back to admin]
Disallowed files and directories in robots.txt: ... act=Search ... ?act=Login ... hp?act=Reg ... t=calendar ... hp?act=Msg ... p?act=Mail ... ct=Forward ... ct=forward ... ?act=Track ... p?act=Post ... p?act=post ... p?showuser ... ?act=Print ... php?act=ST ... act=UserCP ... act=usercp ... act=Arcade ... t=findpost ... php?act=SF ... ?act=Stats ... act=Online ... ct=Members ... p?act=Help ... ct=Profile ... ct=profile ... act=Attach ... act=attach ... ?act=Print ... ?act=print ... Cookies&k=* ... &do=logout* ... ype=all&k=* ... oderate&f=* ... ule=search* ... sharelink=****&view=findpost**&view=old*&view=new*&mode=linear ... linearplus*&mode=threaded*&mode=threaded ... linearplus*&mode=linear ... ?act=print ... act=friend

1. Retrieving: at 15:40:39.
Size of page: 88.76kb. Starting indexing at 15:40:42.
Links found: 103. New links: 103
2. Retrieving: at 15:40:52.
already in database
3. Retrieving: at 15:40:52.
Size of page: 46.68kb. Starting indexing at 15:40:54.
Links found: 65. New links: 1
4. Retrieving: at 15:40:58.
Size of page: 54.36kb. Starting indexing at 15:40:59.
Links found: 61. New links: 1
5. Retrieving: at 15:41:06.
Size of page: 58.10kb. Starting indexing at 15:41:08.
Links found: 86. New links: 25
6. Retrieving: ... ce-abrams/ at 15:41:14.
Size of page: 60.09kb. Starting indexing at 15:41:17.
Links found: 81. New links: 15
7. Retrieving: at 15:41:20.
Size of page: 58.44kb. Starting indexing at 15:41:21.
Links found: 83. New links: 19
8. Retrieving: at 15:41:25.
Size of page: 38.65kb. Starting indexing at 15:41:27.
Links found: 67. New links: 7
9. Retrieving: at 15:41:28.
Size of page: 54.26kb. Starting indexing at 15:41:30.
Links found: 68. New links: 8
10. Retrieving: at 15:41:33.
Size of page: 70.12kb. Starting indexing at 15:41:36.
Links found: 148. New links: 88
11. Retrieving: ... -security/ at 15:41:41.
Size of page: 68.93kb. Starting indexing at 15:41:43.
Links found: 135. New links: 4
12. Retrieving: at 15:41:46.
Size of page: 68.88kb. Starting indexing at 15:41:48.
Links found: 138. New links: 8
13. Retrieving: at 15:41:51.
Size of page: 66.63kb. Starting indexing at 15:41:53.
Links found: 136. New links: 5
14. Retrieving: ... scan-tool/ at 15:41:54.
Size of page: 65.34kb. Starting indexing at 15:41:56.
Links found: 135. New links: 5
Now THIS is really perplexing! So I have to wonder... is it something about CentOS? CentOS definitely does have a GOOD reputation. I have to agree that the problem, given the newest information, is NOT with your setup.

I admit I am grasping at straws here. Perhaps what I need to do is install CentOS as an OS in VirtualBox, install Sphider, PHP, Apache, and MySQL into that, and retest. That I am indexing just fine, while you are not, is baffling. Are there ANY https sites you CAN index?

It is good you have found a work-around, but I agree it is NOT an ideal fix.

Re: ssl redirect

Posted: Wed Oct 10, 2018 3:31 pm
by gaddcasey1
I have tried some other SSL sites and I am getting the same results :( . I set up an Ubuntu Sphider server and verified that it works on Ubuntu and not on CentOS. I thought it might be SELinux, but disabling it had no effect.

Re: ssl redirect

Posted: Wed Oct 10, 2018 4:17 pm
by captquirk
It's going to take me a bit, but I'm going to try to get CentOS running in VirtualBox so I can test that.
Another differing factor is Drupal, something with which I have ZERO experience.
I'll post when I have some results.

Changing the subject to work-arounds...
You could try spidering your site from the command prompt instead of via a browser. When I first started playing around with Sphider, there was no manual whatsoever and it took a bit of experimenting to get the syntax correct, but it worked. The manual I created is no Pulitzer masterpiece, but I think it should be useful if you ever want to give that a try.

Re: ssl redirect

Posted: Wed Oct 10, 2018 4:57 pm
by gaddcasey1
We are on the same path lol. I have been working on the command line this morning. I will let you know if I find anything. If you want to test Drupal, Bitnami offers a prebuilt OVA.

Re: ssl redirect

Posted: Tue Nov 06, 2018 1:38 am
by captquirk
Okay. This has been an adventure!
I finally got CentOS 7 set up and working in VirtualBox. Took a few tries to get the configuration right, but got it running. Many years ago, I was a system administrator on a couple of UNIX System V boxes. More recently, I have become comfortable using Ubuntu. CentOS is still Linux, but there are differences. I had a heck of a time getting the right permissions set up so Sphider could write to log files, do backups, etc. When I FINALLY realized that I wasn't just dealing with Linux, but SELinux, I was able to get things working there as well. Apparently, when you install CentOS, YOU GET SELinux!

Once Sphider was installed and operational, I set up a site. Spidering was successful, with the process following the robots.txt.

I guess I'm going to have to give Drupal a whirl to see if that makes a difference. I would really like to know what is causing the problems you have had.

UPDATE: Nov. 6. I have Drupal installed, although I really haven't used it yet. But I now have a much better idea of just what Drupal is (a CMS), and I have to believe that while it may affect Sphider search operations, it should have ZERO impact on Sphider's spidering (building the database to be searched) operations. But I will play around and see what happens...

Re: ssl redirect

Posted: Fri Nov 23, 2018 3:43 pm
by ReddWebDev
Set up Sphider via MySQL db on CentOS 7.5 v76.0.8 and everything runs fine. The only problems I've had so far were with Wordpress .htaccess configs.
Http:// or https:// doesn't matter - Wordpress auto configs upon install. The trouble I see here is when the downstream site admins get in and add to the original .htaccess that Wordpress has set up.

My local boxes are all Linux Mint 19 Cinnamon. Firefox 63.0

As far as file/folder permissions are concerned - I just unpack on the server itself into the desired directory - All of the write permissions are assigned automatically that way.

Best way to run this thing in a CMS would be to wrap the arrays with the same CSS that the CMS original comes with -- I'm always sure to use absolute linking so as to be sure all of the parts and pieces are talking to each other as they should.