Extended UA String

Come here for help or to post comments on Sphider
ReddWebDev
Posts: 13
Joined: Fri Nov 23, 2018 3:09 pm
Location: Great Falls, Montana

Extended UA String

Post by ReddWebDev » Fri Nov 23, 2018 10:53 pm

Just a note:

I've taken the liberty of widening the user agent text field from 20 to 50 characters in configset.php on the admin side:

echo "<tr>\n";
// The stored value ($row[25]) is echoed into the field. An example entry would be:
// Mozilla/5.0 (compatible; yourBot/build; +https://www.XXXX.com/XXX/)
echo "<td class='left1'><input name='_user_agent' value='";
echo $row[25];
echo "' type='text' id='user_agent' size='50'></td>\n";
echo "<td> User agent string</td>\n";
echo "</tr>\n";

If you're running a site-parsing agent on the open web, it's important to let site owners know who you are. Otherwise they may mistake you for a scraper or some other nefarious bot and block it or, even worse, block your IP range in their .htaccess.
The address provided in your UA String should point to a page that explains who you are and what your bot does.
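To illustrate the point about a self-identifying UA string, here is a minimal sketch that assembles one in the common "compatible; name/version; +URL" form. The function name build_user_agent, the bot name ExampleBot, and the example.com address are all made up for the example and are not part of Sphider.

```php
<?php
// Hypothetical helper (not part of Sphider): build a crawler user agent
// string in the common "Mozilla/5.0 (compatible; Name/version; +URL)" form.
function build_user_agent(string $botName, string $version, string $infoUrl): string
{
    // Keep the product tokens simple: letters, digits, dot, underscore, hyphen.
    $name = preg_replace('/[^A-Za-z0-9._-]/', '', $botName);
    $ver  = preg_replace('/[^A-Za-z0-9._-]/', '', $version);

    // The info URL is what lets a site owner find out who runs the bot.
    if ($name === '' || filter_var($infoUrl, FILTER_VALIDATE_URL) === false) {
        throw new InvalidArgumentException('A bot name and a valid info URL are required.');
    }

    return sprintf('Mozilla/5.0 (compatible; %s/%s; +%s)', $name, $ver, $infoUrl);
}

// Placeholder values; substitute your own bot name and info page.
echo build_user_agent('ExampleBot', '1.0', 'https://www.example.com/bot.html'), "\n";
```

The resulting string is what you would paste into the user agent field, provided the database column is long enough to hold it.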
Continental breakfasts should be served on tectonic plates

captquirk
Site Admin
Posts: 119
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Re: Extended UA String

Post by captquirk » Fri Nov 23, 2018 11:48 pm

Actually, the user agent string is defined in the database as varchar(15). Changing the field length in configset.php really just makes a longer field to enter a string which will be limited to 15 characters!
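A rough way to see the mismatch is to compare the string length against the column limit before saving. This is only a sketch: fits_column is a made-up helper, the sample UA is a placeholder, and strlen counts bytes, which matches the character count only for plain ASCII strings such as typical user agents. Whether MySQL silently truncates or rejects an over-long value depends on the server's SQL mode.

```php
<?php
// Hypothetical check (not part of Sphider): does a value fit a VARCHAR(n) column?
// strlen() counts bytes, which equals characters for plain ASCII user agents.
function fits_column(string $value, int $maxChars): bool
{
    return strlen($value) <= $maxChars;
}

// Placeholder user agent; the limit of 15 is the column size mentioned above.
$ua    = 'Mozilla/5.0 (compatible; ExampleBot/1.0; +https://www.example.com/bot.html)';
$limit = 15;

if (!fits_column($ua, $limit)) {
    printf("UA is %d characters; only %d fit in the column.\n", strlen($ua), $limit);
    // Show what would survive if the value were cut at the column limit.
    printf("Leading %d characters: %s\n", $limit, substr($ua, 0, $limit));
}
```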

Still, point taken. That would be a good change to include in the next release.

Re: Extended UA String

Post by ReddWebDev » Sat Nov 24, 2018 12:05 am

Would it be wise for many users to share the same UA? This isn't a distributed utility the way Nutch, for instance, is. Nutch is banned/blocked for its many abuses over the years. Most of the webmasters and programmers I've known over the past 20-some-odd years block agents that don't identify themselves.

Re: Extended UA String

Post by captquirk » Sat Nov 24, 2018 12:23 am

Would it be wise? Absolutely not! The more people who download and use Sphider, the greater the likelihood of one of them getting a bad reputation. If they, and many others, use the default "Sphider" as a user agent, that could ruin it for everyone. If, however, a user only uses Sphider to build an index just for their own site, a user agent of Sphider would work just as well as "Unique_to_the_world."

So what one enters for a user agent depends entirely upon the intended use of the application.

Re: Extended UA String

Post by captquirk » Mon Nov 26, 2018 9:38 pm

In the next release of Sphider, the maximum size of the user_agent column in the database will be increased to varchar(50).

The text box for User Agent in configset.php will remain at 20 characters. Like any text box, it accepts more characters than its visible width; you just won't see ALL the text at the same time. The critical factor is how much of the entered text is stored in the database.
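The difference between box width and accepted length can be sketched in HTML terms: the size attribute only sets the visible width, while maxlength caps what the browser lets you type. The helper below, render_ua_input, is hypothetical and not Sphider's actual code; the defaults of 20 and 50 simply mirror the numbers in this thread, and adding maxlength is my own suggestion rather than something announced for the release.

```php
<?php
// Hypothetical renderer (not Sphider's code): size controls the visible
// width of the box, maxlength caps how many characters can be entered.
function render_ua_input(string $value, int $size = 20, int $maxLength = 50): string
{
    return sprintf(
        "<input name='_user_agent' id='user_agent' type='text' size='%d' maxlength='%d' value='%s'>",
        $size,
        $maxLength,
        // Escape quotes so a stored value cannot break out of the attribute.
        htmlspecialchars($value, ENT_QUOTES)
    );
}

// Placeholder user agent value.
echo render_ua_input('ExampleBot/1.0 (+https://www.example.com/bot.html)'), "\n";
```

Matching maxlength to the column size would keep the form from accepting text the database cannot store.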
