Sphider 1.5.3 and Sphider 1.5.3.PDO have been released

Updates to the Sphider search engine have been made. The latest version is 1.5.3. Sphider 1.5.3 is for use when both the MySQLi and MySQLnd modules are available in PHP. For individuals whose host does NOT provide MySQLnd support, but DOES provide PDO support, Sphider 1.5.3.PDO is also available. You may find both on the Downloads page. (Click the Downloads tab at the top of this page.)

To avoid confusion concerning versions, the PDO version no longer contains a “.1” at the end of the version number, but a simple “.PDO” to distinguish it from the non-PDO version. (Some people thought 1.5.2.1 was a minor update from 1.5.2 when it actually was identical but coded for PDO instead of MySQLnd.)

Changes in 1.5.3 from 1.5.2 are:
Better support for https sites.
Ability to better recognize and follow the directives in a robots.txt file.
Correction of a potential problem when using the CleanDomains function in the event there was only a single domain to clean.
Fixed a number of errors which could appear when a database table prefix contains a hyphen.
Fixed a potential error when running under PHP 7.

Sphider Help Forum is now available

The new Sphider Help Forum for help concerning Sphider 1.4.2 or later is now open, at least on a trial basis. Out of necessity, ALL posts will be moderated. This is because of the tremendous amount of blog, forum, and guestbook spam present on the internet. Apologies for the inconvenience, but that’s life.

Hopefully, this forum can be used by the slowly growing community of users of the updated Sphider. The original Sphider Forum (located at sphider.eu) has become steadily less help and more sales pitch for Sphider-Plus. We have no gripe about Sphider-Plus, per se, but the original Sphider was free and just because the original developer moved on to other interests several years ago, we don’t see why the original can’t live on and evolve with the rest of technology.

The original (1.3.6 and before) has problems with anything later than PHP 5.4, and here we are, most platforms on 5.5 or 5.6 and the trend well underway towards PHP 7.  Any internet technology which simply stands still for 4 to 7 years is going to become lost in the cloud of dust.

Anyway, hopefully the forum will be a better place to air problems and find solutions than blog comments.

Considering another Sphider improvement

The original version of Sphider had very erratic support for indexing HTTPS pages, and wouldn’t even look at the robots.txt file on an HTTPS site. That failing has never been addressed, and even the latest version, 1.5.2, has the same failings when it comes to HTTPS. This has never really been an issue for me before, and even now it is more annoyance than issue as I can work around it.

Still, the “problem” does seem intriguing. After a bit of experimenting, a fix may not be all that difficult. (Famous last words, right?)

I am debating now whether or not to continue investigating alternatives and make more code changes which would improve HTTPS support in Sphider, not only to ensure more reliable connectivity but to enable the robots.txt to be utilized as well. I don’t know that there is that big of a need. We’ve never received any complaints or comments on the issue…

Anyway, at this point there is a POSSIBILITY, but no definite plans one way or the other.

*******************************

UPDATE (Apr 6): I was able to get the robots.txt file read from an https site. First problem: regardless of http or https, the parsing of allowed or disallowed user agents and disallowed files/directories was iffy. If the robots.txt file had lines like “user-agent” or “disallow”, they were parsed, but “User-agent” or “Disallow” were not. It was a case issue. That is now fixed (on my side, not published yet). Second problem: now that I know the file IS being read and parsed, Sphider will STILL index some files in disallowed directories!

If you have any files or directories listed as “url_not_inc” in your settings, that will work, but not the robots.txt disallows, even though that SHOULD be the case. Well, this situation certainly has gotten my interest!
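The case issue described above can be illustrated with a small sketch. This is not Sphider's code (Sphider is written in PHP); it is a minimal Python example, assuming a hypothetical crawler name "sphider" and plain prefix matching, that shows why lowercasing the field name before comparing makes “User-agent” and “user-agent” behave the same.

```python
def parse_robots(text, agent="sphider"):
    """Return the Disallow paths that apply to `agent` or to '*'.

    `agent` is a hypothetical crawler name chosen for this example.
    """
    disallowed = []
    applies = False         # does the current record apply to us?
    in_agent_lines = False  # inside a run of consecutive User-agent lines?
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()        # the fix: compare in lower case
        value = value.strip()
        if field == "user-agent":
            if not in_agent_lines:
                applies = False              # a new record starts here
            in_agent_lines = True
            if value == "*" or value.lower() == agent.lower():
                applies = True
        else:
            in_agent_lines = False
            if field == "disallow" and applies and value:
                disallowed.append(value)
    return disallowed


def is_allowed(path, disallowed):
    """True if `path` does not start with any disallowed prefix."""
    return not any(path.startswith(prefix) for prefix in disallowed)


sample = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "disallow: /tmp/\n"
    "\n"
    "USER-AGENT: other\n"
    "Disallow: /x/\n"
)
rules = parse_robots(sample)
```

With the sample above, both the “Disallow” and “disallow” lines in the first record are picked up, while the record addressed to a different agent is ignored. Parsing is only half the job, though: the crawler still has to call something like is_allowed() for every link before fetching it.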

*******************************

UPDATE (Apr 7): I have begun the process of troubleshooting the code to see what is going awry and where. Working alone and having other things to do in life, this can be both time consuming and frustrating. So far, I do know the robots.txt is read and parsed properly. Just where and why the instructions are not acted upon is another matter. At least the question of whether or not I will be attempting another modification has been answered!

*******************************

UPDATE (Apr 8): GOT IT! Preliminary tests show robots.txt is now being followed in both http and https. More testing to follow (found a couple other misc issues and fixed them). Once everything is validated, there will be a 1.5.3. Stay tuned.

Sphider 1.5.2 and 1.5.2.1 (the PDO version) have been released

The newest version(s) of the Sphider search tool have been released and are available from the Downloads tab above. While there isn’t really anything NEW in these releases, they do address a couple of problems encountered. Of most importance, the problem of having Sphider exit during indexing due to web page coding errors on the site being indexed has been addressed. Instead of issuing a fatal error and stopping, only warnings are generated and indexing continues on its merry way. A potential database error when updating the settings has also been thwarted.

Also, the previous PDO version had a bug in which descriptions could disappear from search results listings. This has been fixed.
If you had the previous PDO version (1.5.1.1) and have lost the descriptions, you will need to restore them after upgrading to 1.5.2.1. Go into the Settings tab, go down to the “Search settings” section where it says “Maximum length of page summary displayed in search results”, change the selection to 250 and click “Save settings”. (Updating the settings before would change this from the default 250 to either 0 or 1!)

Happy Holidays and Happy indexing!

Sphider 1.5.2 – coming soon

The next version of the Sphider search tool is now in testing. Sphider 1.5.2 (and its companion PDO version, 1.5.2.1) is not very different from the previous version, except for a couple of minor fixes on the Settings tab and the fact that the indexing portion has been toned down to issue warnings only when an improperly coded web page is encountered. Sphider 1.5.1 exits with a fatal error instead of continuing to index the site. While improper coding in a web page (commonly having to do with some offbeat special character the database has no idea how to interpret) is rare, it sure was a monkey wrench when it came to indexing a web site. A couple of other page conditions which could have produced a fatal exit now simply issue warnings (like the url exceeding the length the database could store).
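The warn-and-continue idea can be sketched roughly as follows. This is an illustration in Python, not Sphider's PHP code: the 255-character limit, the function name, and the choice to skip the offending page are all assumptions made for the example.

```python
import warnings

MAX_URL_LENGTH = 255  # assumed column width, for illustration only


def index_pages(pages):
    """Index (url, text) pairs; warn about a bad page instead of aborting."""
    indexed = []
    for url, text in pages:
        if len(url) > MAX_URL_LENGTH:
            warnings.warn(f"URL too long, skipped: {url[:60]}...")
            continue
        try:
            # Stand-in for "a character the database cannot interpret".
            text.encode("utf-8")
        except UnicodeEncodeError as exc:
            warnings.warn(f"Bad characters in {url}, skipped: {exc}")
            continue
        indexed.append(url)  # a real indexer would store keywords here
    return indexed
```

The point of the design is that one bad page costs a warning line in the log rather than the whole indexing run.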

At any rate, both the PDO and non-PDO varieties are now being tested to make sure the intended fixes work properly, and that we haven’t introduced any new problems. Expected arrival at this time is early December.

Blue Origin does it yet again. One booster, three launches, three landings.

On April 2, Blue Origin launched its New Shepard booster for the third successful West Texas landing after a suborbital flight. Previous landings of the same booster took place on November 23, 2015 and January 22, 2016.

The crew capsule successfully landed by parachute shortly after the booster landed.

SpaceX has been successful only once (so far), but it has to be noted that the Falcon 9 is larger and, being orbital, has a greater velocity to contend with. SpaceX hopes to be able to recover and reuse a booster sometime in 2016.

Whether it is Blue Origin or SpaceX, recovering a booster is no simple matter. It is, after all, rocket science!

PDO version of Sphider

Sphider 1.5.1 has proven to be a good, stable version of Sphider. HOWEVER, it seems some people can’t use it because their host chooses not to support MySQLnd, typically for shared hosting. It isn’t because it can’t be done, but because they don’t want to do it. In those instances, if you want MySQLnd, you have to upgrade to VPS, at an additional charge of course. Sphider users in that scenario now have an option.

We have taken Sphider 1.5.1 and converted the sql to PDO (PHP Data Objects). PDO support is virtually guaranteed. The PDO version is referred to as Sphider 1.5.1.1. PDO has some advantages over MySQLi/MySQLnd, but there are also disadvantages.

MySQLi/MySQLnd is SPECIFIC to a MySQL database, whereas PDO is a generic interface supporting a variety of databases, one of which is MySQL. There is an overhead involved. For Sphider, we STILL prefer the MySQLnd prepared statement methodology over PDO prepared statements. Reality dictates a PDO version be made available. Our recommendation is that you install the PDO version only if the standard MySQLi/MySQLnd option is not available. If you already have a working Sphider 1.5.1, DO NOT install 1.5.1.1.

One issue encountered was that PDO has no need to use the real_escape_string function…. EXCEPT WHERE IT IS NEEDED!!! The backup and restore functions failed without it. All research indicated “You don’t need real_escape_string, just use PDO prepared statements!” Dogmatic statements like that can come back to bite you. Well, our scenario wasn’t executing sql, it was CREATING sql, specifically, an sql string. Real_escape_string was necessary to create a valid string, and a prepared statement was not possible. We had ALREADY run a query, now we were manipulating the queried data to create a string for LATER use in a different kind of query. So we had to create an emulation for real_escape_string, which was a bit of trial and error. So much for “PDO NEVER needs real_escape_string”.
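To make the backup scenario concrete, here is a rough sketch of what such an emulation might look like. It is written in Python for illustration and is not the code Sphider uses. It escapes the characters that MySQL's real_escape_string handles (backslash, NUL, newline, carriage return, single quote, double quote, and Ctrl-Z), and the helper names are hypothetical. It ignores character-set subtleties, so treat it as a sketch for building dump files, not as a defense against SQL injection.

```python
# Characters escaped by MySQL's real_escape_string, and their replacements.
_ESCAPES = {
    "\\": "\\\\",
    "\0": "\\0",
    "\n": "\\n",
    "\r": "\\r",
    "'": "\\'",
    '"': '\\"',
    "\x1a": "\\Z",
}


def escape_string(value):
    """Escape one value, character by character, so nothing is escaped twice."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def insert_statement(table, row):
    """Build an INSERT statement as text, for later use in a restore file."""
    values = ", ".join(
        "NULL" if v is None else "'" + escape_string(str(v)) + "'"
        for v in row
    )
    return f"INSERT INTO `{table}` VALUES ({values});"


# Example: a row that was already fetched from the database.
statement = insert_statement("links", [1, "it's", None])
```

Here the data has already been queried, and the goal is a text file of statements to replay later, which is exactly the situation where a prepared statement cannot help.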

Working beta, Sphider for WordPress

I now have a working beta version of Sphider for WordPress. You can see the beta in action by clicking on the Search tab. This isn’t a very large blog, so there isn’t much to search for, but you can get an idea.

Suggestions STILL do not work. Accessing the suggest mechanism in test mode shows it IS responding and building a proper json, but the json is not being passed on as in the normal implementation. Suspect it is something to do with a collision with a WordPress json?

There are probably still some rough edges, but that is what a beta is for… to find those rough edges and smooth them out. Even rough, it functions, which is something the last Sphider for WordPress no longer does!

If you want to give it a whirl, drop us a line and we’ll get the files to you. And, yes, instructions…

THE MORNING AFTER: After a couple false starts, I finally got a package assembled with everything you need. The first package was done late at night and didn’t include everything it should have. I rushed a second version with an addition. There should have been additions!!!


How do you drop us a line? Use the Contact Us form on WorldSpaceflight.com home page, found in Links to the left.

Sphider 1.5.1 released

Sphider 1.5.0 was a major departure from older versions of Sphider in that it incorporated prepared statements, adding significantly to the security of Sphider. It performed very nicely.

But we did not like the database backup and restore procedures. Backup was quick enough, but restore was S-L-O-W! The larger the database, the worse it got. There had to be a better way. There was, and we found it. We grew our database to include:

    10 sites
    10 categories (5 top level, 5 sub-categories)
    10,641 links (pages)
    70,317 keywords
    40,006 kb of cached text
    171,495 kb total size

A backup, producing a gzip file of 14,079 kb, was accomplished in 16 seconds.
A total restoration took 32 seconds. This was a definite improvement over the 6 1/2 HOURS for a smaller database.

Also, as we were no longer looking for coding errors, we began concentrating on the results (or outcomes of admin actions) looking for anything that just was not exactly what we expected to see. We found several bugs which were repaired and tested. Nothing earth-shattering, but bugs nonetheless. Sphider 1.5.1 is the result.

Since Sphider 1.5.1 seems to be the achievement of what we originally set out to do, namely, dispensing with deprecated code, improving security, fixing a few bugs in the original releases, etc., this will probably be the last release for a while. In the event of some operational problem of immediate concern, a simple patch should be sufficient instead of a whole new release.

Now despite the hours of testing and line-by-line code reviews and results analysis, Murphy’s Law still reigns. We’ll leave it at that.

Sphider for WordPress?

Several years ago, there was a Sphider for WordPress introduced. It was based on the 1.3.4 version of Sphider. Time moved forward, Sphider for WordPress did not. You can still find it. It just most probably isn’t going to work.

A few months back we tried to update it. THAT was a lost cause! So now we have taken our newest Sphider and have started to convert it. It does work, mostly. Still having a few issues, such as suggest doesn’t work and we aren’t sure why not. Also having trouble getting the search integrated into WordPress, although there has been some progress there.

Naturally, since this is a tiny blog, there isn’t much we can thoroughly test it on. Give us a bit more time to get the integration part down and maybe we’ll put it out as a beta, even without suggest working. But maybe we’ll find the problem there, too.

That would be nice, a working version of Sphider for WordPress.


UPDATE: December 15. Integration with WordPress has been accomplished. Suggestions still are not working. Being able to spider and search from WordPress is still a significant achievement. The MAJOR components have been tested and are functional. Still need all the minor branches to be tested.


UPDATE: December 23. Suggestions STILL not working, but Sphider now does a re-index when a post is added or edited. Duplicate domains are being entered in the domains table, but that should be an easy fix. Getting closer to being generally usable.