Published in Brighton, UK

Clagnut

Machine tags and ISBNs

About a year ago I wrote about how tagging had become more than just a way to annotate content with keywords. Tags had become the wires by which disparate content could be connected. Through consistent and detailed tagging, I could begin associating my blog posts with my content on other web sites, such as my Flickr photos and my ma.gnolia links. This is, of course, the joy of APIs in action, and it made me think differently about how I was tagging my stuff.

Part of that thought was the inclusion of tags such as geo:lat=50.382, which enabled me to associate a physical location with a blog post. Called triple tags at the time, we now know this format as machine tags thanks to Flickr. A year on, other services such as Upcoming and Last.fm are encouraging their users to tag Flickr photos with machine tags such as upcoming:event=144002 and lastfm:event=97947, thereby associating photos with events and gigs, and enabling these services to illustrate content with relevant photography.

When it comes to searching and API goodness, I believe Flickr still leads in what can be done with machine tags. When tagging, the format is crucial and must be of the form namespace:predicate=value. Flickr recognises the three parts and stores them separately, thus enabling one to search within any namespace, for any predicate and/or for any value. Powerful stuff, as explained in Rev Dan Catt’s post earlier this year.
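To make the three-part format concrete, here is a minimal sketch of splitting a tag into namespace, predicate and value. The `MachineTag` type and `parse_machine_tag` function are hypothetical names of my own, and the grammar in the regular expression (letters, digits and underscores for the first two parts) is an assumption rather than Flickr's actual implementation.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed grammar: namespace and predicate start with a letter and contain only
# letters, digits and underscores; the value is everything after the first "=".
_MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split a tag into its three parts, or return None if it is a plain tag."""
    match = _MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


if __name__ == "__main__":
    for tag in ("geo:lat=50.382", "upcoming:event=144002", "isbn=0713998393"):
        print(tag, "->", parse_machine_tag(tag))
```

Under these assumptions a tag such as isbn=0713998393 is rejected because it has no namespace, which is why it cannot be searched by its parts.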

Anyone can invent any namespace and nomenclature. For example, I have tagged this photo with clagnut:blogid=1900 to explicitly associate it with this blog post. Flickr recognised the machine tag and now I can use the API to search for any photo tagged with the clagnut namespace.
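As a rough illustration of that search, the sketch below only builds the request URL and makes no network call. It assumes Flickr's `flickr.photos.search` method and its `machine_tags` parameter behave as I remember from the API documentation, including the query syntax (`clagnut:` for a whole namespace, `clagnut:blogid=` for a predicate). The endpoint URL, the helper names and the `YOUR_API_KEY` placeholder are all assumptions to check against the current docs.

```python
from urllib.parse import urlencode

# Assumed REST endpoint; verify against the current Flickr API documentation.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def machine_tag_query(namespace: str = "*", predicate: str = "", value: str = "") -> str:
    """Build a machine tag query string such as 'clagnut:' or 'clagnut:blogid='."""
    query = f"{namespace}:"
    if predicate or value:
        # A missing predicate becomes a wildcard when only a value is given.
        query += f"{predicate or '*'}="
    if value:
        query += f'"{value}"'
    return query


def machine_tag_search_url(api_key: str, namespace: str = "*",
                           predicate: str = "", value: str = "") -> str:
    """Return a search URL for photos matching the machine tag query."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,  # placeholder; a real key is needed to run the request
        "machine_tags": machine_tag_query(namespace, predicate, value),
        "format": "json",
        "nojsoncallback": "1",
    }
    return FLICKR_REST + "?" + urlencode(params)


if __name__ == "__main__":
    # Everything in the clagnut namespace, then one specific blog post.
    print(machine_tag_search_url("YOUR_API_KEY", "clagnut"))
    print(machine_tag_search_url("YOUR_API_KEY", "clagnut", "blogid", "1900"))
```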

The ability for anyone to create a namespace opens up an interesting future as we begin to see patterns and conventions emerge: amazon:asin=B000AA4I1M anyone? Which brings me to the point of this post. This time last year I was tagging photos and posts which discussed books in the following manner: isbn=0713998393. In and of itself this works fine, but it is not a valid machine tag, which means we cannot make use of the aforementioned API goodness within Flickr (and where Flickr is leading so others will follow). We therefore need a triple-tag version of the ISBN tag, and here’s my suggestion: iso:isbn=0713998393. ISBN is a standard recognised by the International Organization for Standardization (ISO), so I thought it made a certain sense for ISO to be the namespace. Other standardised entities could be tagged in a similar way, such as iso:issn=15340295.
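Migrating last year's tags to the proposed form could look something like the sketch below. The function names are hypothetical, the `iso` namespace is only the suggestion made above rather than an agreed convention, and the check applies the standard ISBN-10 check digit rule (weights 10 down to 1, total divisible by 11) so that mistyped numbers are left untouched.

```python
import re


def is_valid_isbn10(isbn: str) -> bool:
    """Check the ISBN-10 check digit: weights 10..1, total divisible by 11."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        return False
    # A final "X" stands for the value 10.
    digits = [10 if c == "X" else int(c) for c in chars]
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def upgrade_isbn_tag(tag: str, namespace: str = "iso") -> str:
    """Rewrite a plain 'isbn=...' tag as '<namespace>:isbn=...'; leave other tags alone."""
    key, sep, value = tag.partition("=")
    if sep and key.strip().lower() == "isbn" and is_valid_isbn10(value):
        return f"{namespace}:isbn={value.strip()}"
    return tag


if __name__ == "__main__":
    for tag in ("isbn=0713998393", "typography", "geo:lat=50.382"):
        print(tag, "->", upgrade_isbn_tag(tag))
```

Passing a different `namespace` argument lets the same helper produce any alternative prefix, should a different convention win out.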

Is an ISO namespace something likely to be used in the future? Is it fatally flawed, and if so how would you suggest machine tagging ISBNs instead? Discuss.

Update: Further to Peter Firminger’s comment, it seems Machinetags.org might become a good place to propose, develop and define new namespaces.

30 March 2007

§ Blogging · Flickr

18 comments

Related photos

  • Veerle blogging
  • Chapter 7
  • Page 326
  • Page 297
  • Page 298

Keywords

Machine tags

Comments

  1. 1

    The use of ‘iso’ makes lots of sense to me.

    However, the fact that namespaces can consist of a single keyword that anyone may create seems potentially problematic to me. Cases like ‘amazon’, ‘flickr’ or even ‘clagnut’ are perhaps not controversial, but what happens when five online photo galleries all decide to start populating a ‘photo’ namespace with tags? I could see a case arising where you had several different sites endorsing, for example, ‘photo:category=...’, all with wildly-varying values and semantics.

    Java uses domain names for package namespaces. Would it make sense to adopt a similar convention, at least for tags in a taxonomy that is specifically linked to a particular domain (such as the ‘lastfm’ and ‘upcoming’ examples)?

    Angus McIntyre
    30 Mar 2007
    13:52 GMT
  2. 2

    Aaron from Flickr has this to say on the subject of multiple namespaces:

    if you are concerned about colliding namespaces you should consider adding an additional machine tag to define your namespace.

    For example :

    dc:subject=tags
    xmlns:dc=http://purl.org/dc/elements/1.1/

    Rich
    30 Mar 2007
    14:40 GMT
  3. 3

    I had the same thought as Angus about namespace collisions. At first, Aaron’s solution of tagging the namespace seems a little kludgey, but it’s actually probably more sensible than

    clagnut.com:blogid=1900

    or (Java-style)

    com.clagnut:blogid=1900

    if you’re going to be using unambiguous namespaces predominantly, as I think XML has shown us is the common usage.

    As to whether iso is a valid namespace for isbn, it seems to be so. (Perhaps those with greater knowledge of ISO can offer greater enlightenment)

    I had not previously really encountered this machine tagging phenomenon but if it takes off it is surely one of the more powerful approaches to the Semantic Web. HTML has shown us it is not sufficient to assume that user agents will be able to understand human-oriented content in a universally agreed manner, so it seems we must bow to the needs of our boxes.

    Bruce Boughton
    30 Mar 2007
    16:52 GMT
  4. 4

    Perhaps we could replace co-comment (which I found to not be that good at its job) with comment tagging which associates a blog comment with an OpenID?

    blogcomment:openid=http://myopenid…

    Get your blog pinging a comment aggregator and you’ve got decentralised comment tracking (or maybe not)?

    Bruce Boughton
    30 Mar 2007
    16:56 GMT
  5. 5

    IIRC ISBN are URNs, so why not urn:isbn=12345 ?

    lqd
    30 Mar 2007
    20:07 GMT
  6. 6

    ISBNs are indeed URNs (I know Wikipedia isn’t an authoritative source but check http://en.wikipedia.org/wiki/Uniform_Resource_Name )

    Even more interesting about tagging ISBNs is the fact that ISBNs aren’t just randomly generated unique identifiers, but within each ISBN is encoded the language of the book, the publisher, and a unique identifier for that book (plus a checksum digit). That’s quite a bit of machine-readable information!

    Paul Watson
    31 Mar 2007
    21:55 GMT
  7. 7

    For books I’d be tempted to keep the namespace as general as possible, you know, like “book”:

    • book:isbn=0976072696
    • book:author=Sun Tzu
    • book:title=The Art of War
    • book:lang=en
    • book:translator=Lionel Giles
    • book:text=http://www.gutenberg.org/files/132/132.txt
    • book:pdf=http://www.puppetpress.com/classics/ArtofWarbySunTzu.pdf

    The generality increases the possible richness of tags, but at a trade off with the specificity of standards based tags like “iso:” or “urn:”. But maybe the concept “book” is a little more enduring than any standard?

    Paul Wib
    1 Apr 2007
    00:46 GMT
  8. 8

    I’d second Paul’s suggestion of using book:isbn= as the format. Neither iso: nor urn: adds any real meaning/specificity to the tag.

    Or, to think of it another way, how often will you want to search for all photos which have been marked up with absolutely any ISO standard? (iso:*=*)

    kellan
    1 Apr 2007
    01:32 GMT
  9. 9

    I like Paul Wib’s suggestions – but I’d point out that “book:lang=en” is just duplicated information. The 1st digit of a 10 digit ISBN (or the 4th digit of the new 13 digit ISBNs which have become the standard since Jan 1st this year) identifies the language.

    Using Paul’s example of “0976072696” – in this case the first digit “0” identifies it as an English language book (a 0 or 1 indicates English language, a 2 indicates French, a 3 indicates German etc.; full list at http://www.isbn-international.org/en/identifiers/allidentifiers.html).

    I think any tagging standard should definitely use the new 13 digit ISBN standard (introduced because the number of available ISBNs was going to fall short of the expected number of new books), so 0976072696 becomes 9780976072690 (note: the final digit of the ISBN is a checksum digit, so you can’t just prefix a 10 digit ISBN with “978” – there’s a conversion script at http://www.isbn.org/converterpub.asp).

    Paul Watson
    1 Apr 2007
    09:46 GMT
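The checksum point in the comment above can be sketched in a few lines. This is an illustration only, with a hypothetical function name: it drops the old check digit, prefixes 978 and recomputes the final digit using the ISBN-13 rule of alternating weights 1 and 3. It does not verify that the incoming ISBN-10 is itself valid.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 by prefixing 978 and recomputing the check digit."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10 or not (chars[:9].isascii() and chars[:9].isdigit()):
        raise ValueError(f"not a 10-character ISBN: {isbn10!r}")
    body = "978" + chars[:9]  # the old ISBN-10 check digit is discarded
    # ISBN-13 weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0976072696"))  # prints 9780976072690
```

Simply prefixing would give 9780976072696, which fails the ISBN-13 check, so the recomputed final digit matters.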
  10. 10

    The problem with book: is that it doesn’t actually make semantic sense for everything. Not all things with ISSNs are books, for instance (doesn’t ALA have an ISSN now?).

    Also, we don’t need book:title et al, because (as per Rich’s example in one of the comments) we have Dublin Core covering a lot of this. I’d favour using either iso: or urn: for the ISBN/ISSN – the former is probably slightly better because URNs don’t have the traction they perhaps should have, but ISO is pretty well recognised.

    book:text and book:pdf are more interesting. I’m not yet convinced they’re worth it. Tags are (generally) for expressing things in a machine-readable and human-understandable way that we can’t do otherwise – but we have URIs and links. (Which gets us into content negotiation territories, but that’s not something machine tags should try to solve.)

    There’s another problem I have, with the semantics: book:pdf:<uri> is linking to the text of the book, just as book:text:<uri> is, merely in a different format (and possibly a different fidelity). These are, at the end of the day, only representations of the resource that is the book (urn:isbn:FOO as a URI, possibly iso:isbn:FOO as a tag). There have been discussions of ISBN-URI to dereferenceable-URI services in the past. Machine tags shouldn’t be used to boil the ocean :-)

    James Aylett
    1 Apr 2007
    12:27 GMT
  11. 11

    James, ISSN stands for International Standard Serial Number and is used for things published in successive parts, like journals, comics or possibly a website (although apparently the British Library, who control ISSNs in the UK, are not too keen on giving them to websites).

    ISBN stands for International Standard Book Number and is for any one-off publication which (in the UK) the British Library deems worthy of one. So yes, an ISSN can be for many things, but an ISBN is always a book (hence my example).

    I totally agree the semantics in my example weren’t rigorously thought out, but isn’t that exactly why tagging is so successful – it’s an easy way for your average human to create and understand meta-data without having to think too hard or know anything about schemas and standards? This is not OWL/RDF! Maybe book:pdf=url would emerge as a convention, maybe not, it doesn’t really matter when you don’t have hard semantics up front.

    So in this context I’d still argue ‘book’ makes the most sense as a namespace. Most people I know don’t have a clue what a URN is and think Dublin Core is some new style of breakbeat with fiddles, but I’d wager most of them know what a book is :)

    Paul Wib
    1 Apr 2007
    16:31 GMT
  12. 12

    The format of the tag needs to be dependent on how it’s used. If the function of the tag is to link to the electronic form of a book then that’s one thing, but what if the tag is going to be more commonly used in a blog post to link to a place to buy a book (e.g. a link to the product record page on Amazon)?

    I’m sure the latter would be far more commonly used in blogs – and therefore far more quickly adopted. And besides, links to electronic versions of books can already be made machine-readable by the use of a DOI link.

    I guess one alternative is to use something like “book:buy” for a link to Amazon or the Publisher’s online shop.

    Paul Watson
    1 Apr 2007
    19:20 GMT
  13. 13

    My thought is this:
    ISO is an organisation that makes standards. Until they make this one, perhaps one shouldn’t make it look as if they had? Wouldn’t it be a better idea to do something under your own flag? Like Microformats or DC, and start something new up?

    Like Machinetags.com, and on that site standardize machine tags like these. For example mt:isbn=887r89w, mt standing for Machinetags.com

    It could perhaps even be a project in the scope of Microformats since machine tags are to tags what microformats are to html.

    Pelle
    1 Apr 2007
    20:07 GMT
  14. 14

    Paul Wib – I really like the idea of using ‘book’ as the namespace. It’s much more in the spirit of tagging (machine or otherwise).

    As you say, it seems unlikely that anyone would want to search for all stuff tagged with the namespace ‘iso’, whereas searching for stuff tagged with the namespace ‘book’ seems potentially more useful.

    Rich
    2 Apr 2007
    08:28 GMT
  15. 15

    Pelle - I think you’re referring to http://machinetags.org/wiki/ instead of .com.

    Submit the "book" or "isbn" namespace (and join the mailing list) if you think it’s important.

    Peter Firminger
    2 Apr 2007
    11:46 GMT
  16. 16

    I referred to the concept rather than a specific site. Nice to see that such a site exists – it or something like it should be built upon, but instead of defining different namespaces it should define predicates under a namespace shared by the complete project…

    Pelle
    2 Apr 2007
    16:13 GMT
  17. 17

    For ISSNs, there’s already the PRISM (Publishing Requirements for Industry Standard Metadata) standard namespace. You can see it in action in any of IOP Publishing’s RDF (RSS 1.0) feeds, e.g. Inverse Problems Latest Papers, along with DC and Taxonomy metadata.

    While I understand the need to take this stuff out of the ivory tower of academia and make it as usable as possible, it would make sense to minimise the number of wheel-reinventions as well.

    Tim Beadle
    4 Apr 2007
    11:49 GMT
  18. 18

    I’ve posted a draft book: namespace on machinetags.org to get the ball rolling there. All comments and input on the wiki and the machine tags mailing list very welcome.

    Dan Champion
    9 Apr 2007
    06:54 GMT

Add your comment

Comments are now closed on this post. If you have more to say please contact me directly.