ssl redirect

Come here for help or to post comments on Sphider
gaddcasey1
Posts: 12
Joined: Wed Aug 15, 2018 4:48 pm

Re: ssl redirect

Post by gaddcasey1 »

So this is the solution I came up with: if a request comes from the server's internal address (10.1.0.2), the redirect will not happen.

Code:

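# Skip the redirect for requests from the server's own internal address
RewriteCond %{REMOTE_ADDR} !^10\.1\.0\.2$
# Only redirect requests that are not already HTTPS
RewriteCond %{HTTPS} !on
RewriteCond %{HTTP_HOST} ^www\.okanoganpud\.org
RewriteRule ^(.*)$ https://www.okanoganpud.org/$1 [L,R=301]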
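
For anyone trying to follow the logic of those conditions, here is a rough Python sketch of the same decision. It is only an illustration, not anything Apache or Sphider runs: the function name `should_redirect` and the sample client address are made up for the example, and the host check is treated as a simple prefix match.

```python
import re

# Patterns corresponding to the three RewriteCond lines in the .htaccess snippet.
INTERNAL_ADDR = re.compile(r"^10\.1\.0\.2$")
HOST = re.compile(r"^www\.okanoganpud\.org")


def should_redirect(remote_addr: str, https: str, host: str) -> bool:
    """Return True when a request would be sent to the https:// URL.

    Hypothetical helper: all three conditions must hold, just as
    consecutive RewriteCond lines are ANDed together by default.
    """
    if INTERNAL_ADDR.search(remote_addr):
        return False  # request came from the server itself (the spider)
    if https == "on":
        return False  # already on HTTPS, nothing to do
    return HOST.search(host) is not None


# An outside visitor on plain HTTP is redirected; the internal address is not.
print(should_redirect("203.0.113.5", "off", "www.okanoganpud.org"))
print(should_redirect("10.1.0.2", "off", "www.okanoganpud.org"))
```

The point of the first condition is that the spider, running on the server itself, keeps getting plain HTTP while everyone else is pushed to HTTPS.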
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Post by captquirk »

I tested https://www.bleepingcomputer.com, as you did. Again: Ubuntu 18.04, MySQL, Sphider 2.0.0c-PDO.

Where you got "NO HOST", I got:
[Back to admin]
Spidering https://www.bleepingcomputer.com/
Disallowed files and directories in robots.txt:
https://www.bleepingcomputer.com/cgi-bin/
https://www.bleepingcomputer.com/forums/style_images/
https://www.bleepingcomputer.com/forums ... act=Search
https://www.bleepingcomputer.com/forums ... ?act=Login
https://www.bleepingcomputer.com/forums ... hp?act=Reg
https://www.bleepingcomputer.com/forums ... t=calendar
https://www.bleepingcomputer.com/forums ... hp?act=Msg
https://www.bleepingcomputer.com/forums ... p?act=Mail
https://www.bleepingcomputer.com/forums ... ct=Forward
https://www.bleepingcomputer.com/forums ... ct=forward
https://www.bleepingcomputer.com/forums ... ?act=Track
https://www.bleepingcomputer.com/forums ... p?act=Post
https://www.bleepingcomputer.com/forums ... p?act=post
https://www.bleepingcomputer.com/forums ... p?showuser
https://www.bleepingcomputer.com/forums ... ?act=Print
https://www.bleepingcomputer.com/forums ... php?act=ST
https://www.bleepingcomputer.com/forums ... act=UserCP
https://www.bleepingcomputer.com/forums ... act=usercp
https://www.bleepingcomputer.com/forums ... act=Arcade
https://www.bleepingcomputer.com/forums ... t=findpost
https://www.bleepingcomputer.com/forums ... php?act=SF
https://www.bleepingcomputer.com/forums ... ?act=Stats
https://www.bleepingcomputer.com/forums ... act=Online
https://www.bleepingcomputer.com/forums ... ct=Members
https://www.bleepingcomputer.com/forums ... p?act=Help
https://www.bleepingcomputer.com/forums ... ct=Profile
https://www.bleepingcomputer.com/forums ... ct=profile
https://www.bleepingcomputer.com/forums ... act=Attach
https://www.bleepingcomputer.com/forums ... act=attach
https://www.bleepingcomputer.com/forums ... ?act=Print
https://www.bleepingcomputer.com/forums ... ?act=print
https://www.bleepingcomputer.com/forums ... Cookies&k=*
https://www.bleepingcomputer.com/forums ... &do=logout*
https://www.bleepingcomputer.com/forums ... ype=all&k=*
https://www.bleepingcomputer.com/forums ... oderate&f=*
https://www.bleepingcomputer.com/forums ... ule=search*
https://www.bleepingcomputer.com/forums ... sharelink=*
https://www.bleepingcomputer.com/forums/u/*
https://www.bleepingcomputer.com/forums/user-*
https://www.bleepingcomputer.com/forums/*&view=findpost*
https://www.bleepingcomputer.com/forums/*&view=old
https://www.bleepingcomputer.com/forums/*&view=new
https://www.bleepingcomputer.com/forums/*&mode=linear
https://www.bleepingcomputer.com/forums ... linearplus
https://www.bleepingcomputer.com/forums/*&mode=threaded
https://www.bleepingcomputer.com/forums/*&mode=threaded
https://www.bleepingcomputer.com/forums ... linearplus
https://www.bleepingcomputer.com/forums/*&mode=linear
https://www.bleepingcomputer.com/forums/help.html
https://www.bleepingcomputer.com/forums/lastpost
https://www.bleepingcomputer.com/resources/link
https://www.bleepingcomputer.com/tutori ... ?act=print
https://www.bleepingcomputer.com/tutori ... act=friend
https://www.bleepingcomputer.com/cmsimages/
https://www.bleepingcomputer.com/logreply/

1. Retrieving: https://www.bleepingcomputer.com/ at 15:40:39.
Size of page: 88.76kb. Starting indexing at 15:40:42.
Indexed
Links found: 103. New links: 103
2. Retrieving: https://www.bleepingcomputer.com/ at 15:40:52.
already in database
3. Retrieving: https://www.bleepingcomputer.com/about/ at 15:40:52.
Size of page: 46.68kb. Starting indexing at 15:40:54.
Indexed
Links found: 65. New links: 1
4. Retrieving: https://www.bleepingcomputer.com/advertise/ at 15:40:58.
Size of page: 54.36kb. Starting indexing at 15:40:59.
Indexed
Links found: 61. New links: 1
5. Retrieving: https://www.bleepingcomputer.com/author/ionut-ilascu/ at 15:41:06.
Size of page: 58.10kb. Starting indexing at 15:41:08.
Indexed
Links found: 86. New links: 25
6. Retrieving: https://www.bleepingcomputer.com/author ... ce-abrams/ at 15:41:14.
Size of page: 60.09kb. Starting indexing at 15:41:17.
Indexed
Links found: 81. New links: 15
7. Retrieving: https://www.bleepingcomputer.com/author/mayank-parmar/ at 15:41:20.
Size of page: 58.44kb. Starting indexing at 15:41:21.
Indexed
Links found: 83. New links: 19
8. Retrieving: https://www.bleepingcomputer.com/changelog/ at 15:41:25.
Size of page: 38.65kb. Starting indexing at 15:41:27.
Indexed
Links found: 67. New links: 7
9. Retrieving: https://www.bleepingcomputer.com/contact/ at 15:41:28.
Size of page: 54.26kb. Starting indexing at 15:41:30.
Indexed
Links found: 68. New links: 8
10. Retrieving: https://www.bleepingcomputer.com/download/ at 15:41:33.
Size of page: 70.12kb. Starting indexing at 15:41:36.
Indexed
Links found: 148. New links: 88
11. Retrieving: https://www.bleepingcomputer.com/downlo ... -security/ at 15:41:41.
Size of page: 68.93kb. Starting indexing at 15:41:43.
Indexed
Links found: 135. New links: 4
12. Retrieving: https://www.bleepingcomputer.com/download/adwcleaner/ at 15:41:46.
Size of page: 68.88kb. Starting indexing at 15:41:48.
Indexed
Links found: 138. New links: 8
13. Retrieving: https://www.bleepingcomputer.com/download/combofix/ at 15:41:51.
Size of page: 66.63kb. Starting indexing at 15:41:53.
Indexed
Links found: 136. New links: 5
14. Retrieving: https://www.bleepingcomputer.com/downlo ... scan-tool/ at 15:41:54.
Size of page: 65.34kb. Starting indexing at 15:41:56.
Indexed
Links found: 135. New links: 5
Now THIS is really perplexing! So I have to wonder... is it something about CentOS? CentOS definitely does have a GOOD reputation. I have to agree that the problem, given the newest information, is NOT with your setup.

I admit I am grasping at straws here. Perhaps what I need to do is install CentOS as an OS in VirtualBox, install Sphider, PHP, Apache, and MySQL into that, and retest. That I am indexing bleepingcomputer.com just fine, while you are not, is baffling. Are there ANY https sites you CAN index?

It is good you have found a workaround, but I agree it is NOT an ideal fix.

Post by gaddcasey1 »

I have tried some other SSL sites and I am getting the same results :( . I set up an Ubuntu Sphider server and verified that it works on Ubuntu but not on CentOS. I thought it might be SELinux, but disabling it has no effect.

Post by captquirk »

It's going to take me a bit, but I'm going to try to get CentOS running in VirtualBox so I can test that.
Another differing factor is Drupal, something with which I have ZERO experience.
I'll post when I have some results.

Changing the subject to workarounds...
You could try spidering your site from the command prompt instead of via a browser. When I first started playing around with Sphider, there was no manual whatsoever and it took some experimenting to get the syntax correct, but it worked. The manual I created is no Pulitzer masterpiece, but I think it should be useful if you ever want to give that a try.

Post by gaddcasey1 »

We are on the same path lol. I have been working on the command line this morning. I will let you know if I find anything. If you want to test Drupal, Bitnami offers a prebuilt OVA: https://bitnami.com/stack/drupal/virtual-machine

Post by captquirk »

Okay. This has been an adventure!
I finally got CentOS 7 set up and working in VirtualBox. It took a few tries to get the configuration right, but I got it running. Many years ago, I was a system administrator on a couple of UNIX System V boxes. More recently, I have become comfortable using Ubuntu. CentOS is still Linux, but there are differences. I had a heck of a time getting the right permissions set up so Sphider could write to log files, do backups, etc. When I FINALLY realized that I wasn't just dealing with Linux, but SELinux, I was able to get things working there as well. Apparently, when you install CentOS, YOU GET SELinux!

So then, once Sphider was installed and operational, I set up a site, https://www.okanaganpud.org. Spidering was successful, with the process following the robots.txt.
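
In case it helps anyone reading along, "following the robots.txt" boils down to checking each URL against the Disallow rules before fetching it. Below is a small Python sketch of that idea using the standard library. It is not Sphider's own PHP code: the agent name `ExampleSpider` and the `User-agent: *` line are assumptions, and the rules are just three of the plain-prefix entries from the bleepingcomputer.com log earlier in the thread (the standard-library parser does not expand `*` wildcards in paths, so those are left out).

```python
from urllib.robotparser import RobotFileParser

# A few prefix rules copied from the spider log; the User-agent line is assumed.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /cmsimages/",
    "Disallow: /logreply/",
]


def allowed(url: str, agent: str = "ExampleSpider") -> bool:
    """Return True if the rules above permit fetching the given URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


# The /about/ page is fair game; anything under /cgi-bin/ is skipped.
print(allowed("https://www.bleepingcomputer.com/about/"))
print(allowed("https://www.bleepingcomputer.com/cgi-bin/test"))
```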

Soooo....
I guess I'm going to have to give Drupal a whirl to see if that makes a difference. I would really like to know what is causing the problems you have had.

UPDATE: Nov. 6. I have Drupal installed, although I really haven't used it yet. But I now have a much better idea of just what Drupal is (a CMS), and I have to believe that while it may affect Sphider search operations, it should have ZERO impact on Sphider's spidering (building the database to be searched) operations. But I will play around and see what happens...
ReddWebDev
Posts: 13
Joined: Fri Nov 23, 2018 3:09 pm
Location: Great Falls, Montana

Post by ReddWebDev »

Set up Sphider with a MySQL db on CentOS 7.5 v76.0.8 and everything runs fine. The only problems I've had so far were with WordPress .htaccess configs.
http:// or https:// doesn't matter - WordPress auto-configures on install. The trouble I see here is when the downstream site admins get in and add to the original .htaccess that WordPress has set up.

My local boxes are all Linux Mint 19 Cinnamon. Firefox 63.0

As far as file/folder permissions are concerned - I just unpack on the server itself into the desired directory - All of the write permissions are assigned automatically that way.

Best way to run this thing in a CMS would be to wrap the arrays with the same CSS that the CMS originally comes with -- I'm always sure to use absolute linking so as to be sure all of the parts and pieces are talking to each other as they should.