HOW can I get the ER diagram of sphider?

conf
Posts: 3
Joined: Tue May 14, 2019 9:44 pm

HOW can I get the ER diagram of sphider?

Post by conf »

Firstly, it is an amazing project that helps me a lot.

I have to produce an ER diagram of Sphider, but I get lost among the many entities and the complex relationships between them.

Since you develop this project, you know it very well. Could anyone give me an ER diagram of Sphider, please?

I am using Sphider 2.4 PDO.

Is there any way to generate an ER diagram from an existing database? It would be very useful (some program, maybe?).
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Re: HOW can I get the ER diagram of sphider?

Post by captquirk »

[Attachment: erd.png]
This is nowhere near a proper ER diagram, but it should give an idea of what's what.
The settings table stands on its own and controls the workings of the admin, spidering, and search functionality.

The sites table has the details of each site in the catalog. The site_id connects it to the images table and the links table. It also connects to the site_category table, which is in turn connected to the categories table. The images, links, and site_category tables can all have multiple records for one site_id. Site_category can also have multiple links to categories by way of the category_id.

The domains table derives its content from the url column of the sites table, although there is no other direct relationship.

The keywords table is a unique list of keywords derived from all the indexed links. It has no direct relationship to either links or sites. It is just that: a list of words. Each word in the table is hashed, and the final character in the hash (0-f) determines which of the 16 link_keywordX tables references it. A single keyword can occur multiple times in its link_keywordX table, and each occurrence relates the keyword to one individual link containing that word. The word is also associated with a domain (a subset of the site url) and assigned a weight. Notice, link_id is unique in the links table, but has many occurrences in the link_keywordX tables. Keyword_id is unique in the keywords table, but may occur multiple times in one SPECIFIC link_keywordX table (determined by the hash). HOWEVER, within the link_keywordX tables, each keyword_id/link_id pair IS unique.
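As a rough illustration of the routing described above, here is a minimal Python sketch. It assumes MD5 as the hash function and takes the final hex character, as described in this post; the function name keyword_table is made up for the example and is not part of Sphider, so check the indexing code for the exact hash and character actually used.

```python
import hashlib


def keyword_table(word: str) -> str:
    """Pick the link_keywordX table for a keyword.

    Assumptions (not confirmed against the Sphider source):
    the hash is MD5 and the final hex character selects the table.
    """
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    # The last hex digit is one of 0-9 or a-f, giving 16 possible tables.
    return "link_keyword" + digest[-1]


if __name__ == "__main__":
    for word in ("spider", "search", "index"):
        print(word, "->", keyword_table(word))
```

The same word always lands in the same table, which is why a keyword_id can repeat inside one specific link_keywordX table but never shows up in the other fifteen.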

The query_log table is another standalone table, used to record queries made by users during searches.

The rss_sites and rss_links are independent also, rss_sites having a 1 to many relationship to rss_links.

The pending table is used during indexing to build a list of links found, and the temp table has information on the link currently being processed.

MySQL Workbench has a reverse engineering tool which is SUPPOSED to build a more proper ER diagram, but it fails to draw the actual relationships.

This may not be exactly what you were looking for, but hopefully will aid in understanding the relationships a little better.
captquirk
Site Admin
Posts: 299
Joined: Sun Apr 09, 2017 8:49 pm
Location: Arizona, USA

Re: HOW can I get the ER diagram of sphider?

Post by captquirk »

A note concerning the prior post:
MySQL Workbench has a reverse engineering tool which is SUPPOSED to build a more proper ER diagram, but it fails to draw the actual relationships.
MySQL Workbench is unable to draw the relationships because the SQL statements which build the tables do NOT contain any foreign keys. The current create statements have been modified somewhat since the early days of Sphider (1.3.6 and preceding), but the basic structure has never been changed.
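
Without foreign keys, one possible workaround is to infer the relationships from shared key column names and draw them with Graphviz. The Python sketch below is only an illustration: the column lists contain just the key columns mentioned in the prior post, the primary key assignments are assumptions, and the helper names are invented for the example. In practice the column lists would be read from the live database instead of being typed in.

```python
# Key columns per table, limited to those named in the prior post.
# These lists are illustrative, not the full Sphider schema.
SCHEMA = {
    "sites": ["site_id", "url"],
    "links": ["link_id", "site_id"],
    "images": ["site_id"],
    "categories": ["category_id"],
    "site_category": ["site_id", "category_id"],
}

# Assumed primary key of each "parent" table.
PRIMARY_KEYS = {
    "sites": "site_id",
    "links": "link_id",
    "categories": "category_id",
}


def infer_relationships(schema, primary_keys):
    """Return (parent, child, column) for every table that reuses
    another table's primary key column name."""
    owner = {column: table for table, column in primary_keys.items()}
    edges = []
    for table, columns in sorted(schema.items()):
        for column in columns:
            parent = owner.get(column)
            if parent is not None and parent != table:
                edges.append((parent, table, column))
    return edges


def to_dot(edges):
    """Render the inferred relationships as Graphviz DOT text."""
    lines = ["digraph sphider {"]
    for parent, child, column in edges:
        lines.append(f'  "{parent}" -> "{child}" [label="{column}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(infer_relationships(SCHEMA, PRIMARY_KEYS)))
```

Saving the output as erd.dot and running `dot -Tpng erd.dot -o erd.png` (Graphviz must be installed) produces a picture. Name matching is only a guess at the real relationships, so the result needs a sanity check against how the tables are actually used.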

I DO have a test create SQL which DOES include foreign keys, and thus MySQL Workbench can build an ER diagram with relationships shown. At some point, Sphider may ship with the modified build SQL, but I have no intention of trying to alter any current databases out there. That 1) is probably unnecessary, and 2) would take one heck of an update script!

Now, concerning the ER diagram from the modified script.... it is one busy little guy! A png small enough to post would be too small to make any sense of. It will probably take a lot of work just to make it somewhat usable by rearranging the layout.