Grokking resources and relationships

A project I’m currently working on, a media repository for internal use at Synapse, requires that files in the repository can be marked as related to each other in one of several ways. A few examples:

If I have a file in two different formats (say, TIFF and JPEG), both files should be marked as identical to each other.

If the image was modified in some way so that it is different from the original, but not an entirely new image, then the two files should be marked as variations of each other.

If a person’s photograph is used in an advertisement, then one file should be marked as being used within the other. In this case, the relationship is asymmetric: “uses” at one end and “used in” at the other.
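
The relationships in these examples can be sketched as a small data model. This is only an illustration under assumptions: the names `Relation`, `INVERSE` and `Repository`, and the file names, are hypothetical and not taken from the actual project. The idea is that every link is stored together with its inverse, so symmetric relationships map to themselves and “uses” maps to “used in”.

```python
from enum import Enum


class Relation(Enum):
    IDENTICAL = "identical to"
    VARIATION = "variation of"
    USES = "uses"
    USED_IN = "used in"


# Symmetric relationships are their own inverse; the asymmetric pair swaps ends.
INVERSE = {
    Relation.IDENTICAL: Relation.IDENTICAL,
    Relation.VARIATION: Relation.VARIATION,
    Relation.USES: Relation.USED_IN,
    Relation.USED_IN: Relation.USES,
}


class Repository:
    """Hypothetical in-memory store of (source, relation, target) links."""

    def __init__(self):
        self._links = set()

    def relate(self, source, relation, target):
        # Record the link in both directions so either file can be queried.
        self._links.add((source, relation, target))
        self._links.add((target, INVERSE[relation], source))

    def related(self, name):
        # Return (relation label, other file) pairs for one file, sorted.
        return sorted(
            (rel.value, other) for (src, rel, other) in self._links if src == name
        )


repo = Repository()
repo.relate("portrait.tiff", Relation.IDENTICAL, "portrait.jpg")
repo.relate("ad.jpg", Relation.USES, "portrait.tiff")
print(repo.related("portrait.tiff"))
# [('identical to', 'portrait.jpg'), ('used in', 'ad.jpg')]
```

Storing both directions costs a second entry per link, but it means the “used in” end never has to be derived at query time.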

I’ve spent the last month trying to build a structurally sound model for storing properties and defining relationships, and I have to admit, the model remains just as hazy no matter how hard I try to define it.

I’ve lately taken to reading up on other attempts at classification and at defining relationships:

Iconclass is a subject-specific international classification system for iconographic research and the documentation of images. Here’s an explanation of how it works. Iconclass was designed for classifying western art and isn’t well tuned for an archive of modern media, but it still does an excellent job. Unfortunately, Iconclass is copyrighted and requires an expensive license even to use the classification system. Maybe it’s my open source evangelist background, but I fail to understand why a resource like this is not available for free.

ISO 2788, the ISO standard for a thesaurus, defines nine relationships between words: synonym, related term, alternate term, narrower term, broader term, narrower term instantive, narrower term partitive, broader term instantive, broader term partitive. (Once again, you have to pay for a copy of the specification.)

Resource Description Framework (RDF) is a comprehensive framework for describing a resource, most often written in an XML-based syntax, but is unfortunately weighed down by its own complexity (the RDF-based RSS 1.0 vs. the simpler RSS 2.0 being a prominent example). FOAF is an example of an RDF-based vocabulary for defining relationships between people, using a SHA-1 hash of the email address (the mbox_sha1sum property) to identify a person.
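
To make the FOAF identifier concrete: the `foaf:mbox_sha1sum` property holds the SHA-1 digest of a person’s mailbox URI, including the `mailto:` prefix. Here is a minimal sketch using Python’s standard `hashlib`. The function name and the example address are made up for illustration, and stripping surrounding whitespace is my own assumption, not something the vocabulary requires.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the SHA-1 hex digest of the mailto: URI for an email address."""
    # Assumption: surrounding whitespace is stripped; no other normalization.
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("alice@example.org")
print(digest)  # a 40-character hexadecimal string
```

Two descriptions that carry the same digest can be treated as referring to the same person without either one publishing the address itself.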

The Semantic Web project, from what I understand of it so far, wants all resources to be identifiable with a URI, but is stuck with the more fundamental problem of defining what a resource is. Shelley Powers, author of O’Reilly’s Practical RDF, has an excellent explanation. Here’s another from Richard MacManus explaining why the Semantic Web is like Moby Dick.

My current reading: Practical RDF and Information Architecture for the World Wide Web, both from O'Reilly.
  • Anonymous, Oct 6, 2003 6:18:53 PM

    Use Directory to store the information
    Try to use a robust Directory with an intelligent schema (OpenLDAP
    is a good start). Looks like an interesting problem to attack.
    Have not given much thought to the classification issue. Intriguing!

    ~T

    PS: Not a member of LiveJournal
