RFC: Interface for search agents

nekrad 25 post(s)

Hi folks,

I already wrote some lines about my search agent (http://auctionwatch.metux.de) and my plan for
JBW integration.

Now I’d like to talk about an open and efficient
robot interface for that. Some first ideas:

  • Everything starts with some URL prefix
    (e.g. http://auctionwatch.metux.de/ROBOT/)
  • HTTP reply compression SHOULD be supported
  • Authentication key is passed via the next “subdir”.
    Structure and semantics are specific to the agent;
    clients just see it as verbatim text, which just
    has to fit in a single (URL-encoded) directory name
    (e.g. an md5). A special “ANONYMOUS” is reserved
    for anonymous (non-logged-in) operations.
  • The next directory level specifies operations
  • Listing just the item-IDs (no further data) is
    done via “list-ids” op. It expects the box name as
    next directory level. The result is formatted as a list
    of “:”+
  • more commands can be defined later :)
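
A minimal sketch of how a client might assemble such a URL, assuming the layout
described above (prefix, then auth key, then operation, then its arguments).
The function name `robot_url` and the box name "incoming" are only illustrative;
each path component is percent-encoded so it fits into a single directory name.

```python
from urllib.parse import quote

# Example prefix taken from the post; any agent could use its own.
ROBOT_PREFIX = "http://auctionwatch.metux.de/ROBOT/"
# Reserved key for anonymous (non-logged-in) operations.
ANONYMOUS = "ANONYMOUS"


def robot_url(op, *args, auth_key=ANONYMOUS, prefix=ROBOT_PREFIX):
    """Build <prefix>/<auth key>/<operation>/<args...> (hypothetical helper)."""
    parts = [auth_key, op, *args]
    # safe="" also encodes "/", so each part stays one directory level.
    encoded = [quote(part, safe="") for part in parts]
    return prefix.rstrip("/") + "/" + "/".join(encoded)


print(robot_url("list-ids", "incoming"))
```

With the defaults this yields
http://auctionwatch.metux.de/ROBOT/ANONYMOUS/list-ids/incoming.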


Morgan Schweers Administrator 1,204 post(s)


First a few notes…

I don’t know what you mean by ‘box name’, offhand.

Don’t pass authentication in the URL. Pass it as an ancillary header of some sort, or via basic-auth.

This sounds like a good place for a REST api.

One core idea of REST is that reads should be GET operations, writes should be POST (or PUT, POST, and DELETE, though PUT and DELETE are more rarely supported in HTTP APIs). The resource is identified by the path, and authentication is handled out-of-band. (That said, basic authentication works quite well for this initially, as do cookies.)
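
To illustrate keeping the credentials out of the path, here is a rough sketch that builds a GET request carrying HTTP Basic authentication in a header. It only constructs the request and does not send it; the helper name `authed_get` and the user/password values are made up for the example.

```python
import base64
import urllib.request


def authed_get(url, user, password):
    """Build (but do not send) a GET request with a Basic auth header."""
    # Basic auth is base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", "Basic " + token)
    return req


# Hypothetical credentials; the URL is the example one used in this thread.
req = authed_get("http://www.example.com/searches", "alice", "secret")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually perform the call. Note that Basic auth is only encoded, not encrypted, so it should go over HTTPS in practice.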

REST defines (in practice, not the theory of REST, which I’d have to go back and dig through) a set of simple operations defined as CRUD (Create, Read, Update, Delete). Listing items is just an index or Read operation.

From a Rails background, this translates pretty easily. So, http://www.example.com/searches would return the list of searches you have defined (presuming you’re authenticated by cookie or Basic Auth), and http://www.example.com/searches/1 would return the results of the first search. You could also use named resources, so you’d have http://www.example.com/searches/favorite_books which would find the search named ‘favorite_books’ and return the results.

You’d edit the record by POSTing (or PUTting) to something like: http://www.example.com/searches/update/1 (or http://www.example.com/searches/update/favorite_books). You create new searches by POST to http://www.example.com/searches/create .
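
The URL scheme above can be sketched as a tiny dispatcher that maps an HTTP method plus path to a CRUD-style action. This is not how Rails implements routing; it is just a plain-Python illustration, and the action names (`index`, `show`, `create`, `update`) and the `resolve` function are assumptions chosen for the example.

```python
# Path words that name operations and so cannot be used as search names here.
RESERVED = {"create", "update"}


def resolve(method, path):
    """Return (action, search_id) for a /searches request, or None."""
    parts = [p for p in path.strip("/").split("/") if p]
    if not parts or parts[0] != "searches":
        return None
    rest = parts[1:]
    if method == "GET" and not rest:
        return ("index", None)            # list all defined searches
    if method == "GET" and len(rest) == 1 and rest[0] not in RESERVED:
        return ("show", rest[0])          # results of one search, by id or name
    if method == "POST" and rest == ["create"]:
        return ("create", None)           # define a new search
    if method in ("POST", "PUT") and len(rest) == 2 and rest[0] == "update":
        return ("update", rest[1])        # edit an existing search
    return None


print(resolve("GET", "/searches/favorite_books"))
print(resolve("POST", "/searches/update/1"))
```

Reads stay on GET and writes on POST or PUT, so a plain GET can never modify a search.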

It’s all using existing methodologies, instead of trying to create a new one, mostly. REST stands for REpresentational State Transfer which is a little overkill as acronyms go, but the core of it is taking the mental approach that made the web work well, and applying it to application interfaces.

Sorry; brain-dumping. I hope it’s interesting… :)

— Morgan Schweers, Cyber*FOX*!

nekrad 25 post(s)


“Box name” refers to the different directories (aka “boxes”) a search result record may lie in.
One of auctionwatch’s key features is the ability to automatically sort results. In fact it currently
has two dimensions: a) the “box” (new results go to “incoming” by default, but the
user may specify another one; the “watch” button moves a result to the “WATCH” box, blacklisted
ones go to “BLACKLIST”, those that reached the price limit to “MAXPRICE”) and b) the feed name
(which is just a user-given name/tag; many feeds/searches may share the same name).
Both box name and feed tag have a special value “*” which (obviously) means no filtering at all.

So I intend to map them into a directory hierarchy. Maybe the time scope filtering too
(“ends today”, “ends later”, “ended”) will make up another level. I’m not quite sure about that.
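
As a rough illustration of the two filter dimensions, the sketch below selects
results by box and feed tag, treating “*” as “no filtering”. The record layout
(`item_id`, `box`, `feed`) and the sample data are invented for the example and
are not taken from auctionwatch itself.

```python
from dataclasses import dataclass

WILDCARD = "*"  # special value meaning "do not filter on this dimension"


@dataclass
class Result:
    item_id: str  # hypothetical field names
    box: str
    feed: str


def select(results, box=WILDCARD, feed=WILDCARD):
    """Keep results matching the given box and feed tag; "*" matches all."""
    return [
        r for r in results
        if box in (WILDCARD, r.box) and feed in (WILDCARD, r.feed)
    ]


# Made-up sample records.
results = [
    Result("1001", "incoming", "books"),
    Result("1002", "WATCH", "books"),
    Result("1003", "incoming", "lenses"),
]
print([r.item_id for r in select(results, box="incoming")])
print([r.item_id for r in select(results, box="*", feed="books")])
```

A directory hierarchy such as box/feed would map directly onto the two
arguments of `select`.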

The idea of putting everything into GET is to make it a bit easier for scripting and maybe
save a few bits of traffic.

If I understood REST right, it’s a kind of filesystem subset. Modeling everything as a
(synthetic) filesystem has its charm; it makes many things easier. The AW appserver
could actually be a 9P server, and all frontends (including the web frontend) would just access
it via the 9P filesystem :)