Saturday, June 02, 2007

parseMe 20070602 Update

Here's another update to parseMe (back story), my little GPL'ed PHP-based RSS/Atom feed reader for mobile phones and other web-capable devices.

You can find the appropriate links below:


Release notes:
  • Moved my CVS repo to Subversion (svn), hence the revision number differences. I considered moving to a distributed revision control system, since they're gaining in popularity, but I got lazy after the major rewrite. ;) Maybe for the next release.

  • This is a quasi-complete code rewrite. In this release, I have moved away from the initial goal of keeping within the 500 lines limit (including comments) and having an "educational" flavour, to focus instead on the code structure, the features, further increased security, etc. The security aspect does account for a lot of the extra lines, when coupled with the new features.

  • The parseMe class has now been substracted from the index.php script and has been moved to lib/php/parseMe.class.php.

  • One of the most significant features, on the user end, is that you can now request any number of feeds to be parsed at once. Keeping in mind that the main target audience for this tool is the mobile market (usually slow, tiny screens, low RAM, etc), the usual total number of feeds offered does not lead to major performance hits, unless of course the sources themselves are slow to answer the tool's request(s). You can of course still set your feed selection in the cookie-based preferences, which now allow for multiple choices.

  • With the multiple feeds feature, the next logical step was to enable some sort of sorting options. You can sort the entries by feeds, or from new to old (descending) or from old to new (ascending). Your favourite sort order can be saved.

  • You can now opt in or out of using the Google Mobile Gateway for destination links, right from the query form, and save your preferred choice.

  • On the server end, self-contained caching is now done through PHP data serialization, since there is no point in reparsing the same XML at every page load, after all.

  • On the security front, and primarily with the concern that we do have an application-writable directory (cache), there are quite a few improvements. Since the data contained in the cache files is not very sensitive by design (and if it is, I'd suggest using ssl and password protecting the app), this is really more of an exercise in good coding practices. And there is of course the concern of php injection attacks.

    • The cache filenames are now generated as a sha1 sum, with the help of an admin-defined shared secret so that they cannot be easily guessed.

    • All cache files now start with a dot (.) so that most web servers will not even serve them, and to be invisible when directory listing is enabled at the server level.

    • On the other hand, there is still a very strong emphasis on user input sanitazation and usage in the logic itself (EG: no client-defined source URL, source validity tests, etc).

  • Fully valid class documentation can be leveraged in IDEs such Eclipse, auto-documentation tool such as phpDocumentor, etc.

No comments: