Metabrowser: automated metatagging of web pages

By Jon: First published in Online Currents 2005 – 20(3) 6-7

Metabrowser Authoring Tool (also known as Tag Tool) is a program for viewing and editing metadata on websites. It comes with support for twelve standard metadata schemas, including Dublin Core (DC), the Australian Government Locator Service (AGLS), the New Zealand Government Locator Service (NZGLS) and the Australian Justice Sector schema. New schemas can also be developed and saved by the user.

Metabrowser is an Australian product and is available for download from http://metabrowser.spirit.net.au. The current version is 1.8101. There is a thirty-day trial period, after which the application can be registered online for $AU162.82 (with discounts for multiple installations). The authors also support a metadata server, which facilitates website searching, and provide consultancy and other assistance in establishing metadata.

Metabrowser users include a wide selection of Australian government bodies and several overseas organisations including the London School of Economics. Sales of the program have apparently slowed recently, but work on an upgraded version is in progress.

The website contains a set of online tutorials which can be accessed and run from within the program itself.

 


Metadata

Metadata in the context of HTML web authoring usually refers to ‘metatags’ incorporated in the head section of a page. These normally contain information about the page itself. Commonly used metadata fields include the author and the issuing authority, the latest revision date, a description and subject keywords. Each metatag contains a name field describing the information type and a content field with the actual information. Although originally proposed as a web-wide system, metadata is largely confined to intranet systems and to large corporate websites which can afford dedicated web managers.
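To make the name/content structure concrete, here is a minimal sketch in Python that lists the metatags found in the head section of a page. It has nothing to do with Metabrowser's own code; the sample page, the class name MetaTagParser and the function extract_metatags are all invented for illustration.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect (name, content) pairs from <meta> tags inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            # Only keep tags that carry both a name and a content field.
            if "name" in attr and "content" in attr:
                self.tags.append((attr["name"], attr["content"]))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def extract_metatags(html):
    """Return the metatags of a page as a list of (name, content) pairs."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.tags


# Hypothetical page used only for this example.
SAMPLE_PAGE = """<html><head>
<title>Annual report</title>
<meta name="author" content="Jane Citizen">
<meta name="description" content="Summary of the year's activities">
<meta name="keywords" content="reports; finance">
</head><body><p>Text</p></body></html>"""

for name, content in extract_metatags(SAMPLE_PAGE):
    print(f"{name}: {content}")
```

The title element is deliberately ignored here, since it is not a metatag in the name/content sense described above.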

The best-known metadata system is Dublin Core (DC), which was developed at a joint NCSA/OCLC conference in Dublin, Ohio during March 1995. DC contains sixteen optional elements, in no particular order: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights and Audience. Any element can appear multiple times – e.g. for multiple authors.

DC is a loose and entirely optional system but contains some recommendations. Thus the Type element has only 12 recommended terms for content, including sound, text and software. More detailed classifications can be obtained by adding recommended refinements to the core elements; thus Date can be refined to Date.Created, Date.Issued and so on. Once in place, DC metadata can be automatically checked against a Document Type Definition (DTD) to ensure that it is syntactically correct. Official DC-style metadata is usually indicated by a ‘DC.’ prefix on the element name.
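The naming convention can be sketched in a few lines of Python. This is an illustration only, assuming the dotted element.refinement notation used in this article; the function name dc_meta and the sample record are invented, and the sketch does not check refinements or content against any official list.

```python
from html import escape

# The sixteen element names listed in this article.
DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher", "Contributor",
    "Date", "Type", "Format", "Identifier", "Source", "Language",
    "Relation", "Coverage", "Rights", "Audience",
}


def dc_meta(element, content, refinement=None):
    """Build one metatag with a 'DC.' prefix and an optional refinement."""
    if element not in DC_ELEMENTS:
        raise ValueError(f"not a DC element: {element}")
    name = f"DC.{element}"
    if refinement:
        name += f".{refinement}"
    return f'<meta name="{escape(name)}" content="{escape(content)}">'


# Hypothetical record: elements may repeat, e.g. for multiple authors.
RECORD = [
    ("Title", None, "Metabrowser review"),
    ("Creator", None, "A. Author"),
    ("Creator", None, "B. Author"),
    ("Date", "Created", "2005-03-01"),
]

for element, refinement, content in RECORD:
    print(dc_meta(element, content, refinement))
```

Because every element is optional and repeatable, the record is kept as a plain list of entries rather than a dictionary keyed by element name.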

Other metadata systems have been developed to apply to limited domains: for instance the EDNA system applies to Australian educational resources.

How it works

The Metabrowser window is divided into four sections, all of which can be resized and hidden if necessary.

The Web Browser panel functions as an ordinary web browser; it brings up Web pages on to the screen. Minimal versions of the Home, Back, Next, Refresh, Reload and Print buttons appear in a small toolbar to the left of an address line which shows the URL.

The bottom section of the screen is occupied by the Meta Browser table, showing the metadata (if any) currently assigned to this page. Text from the page itself can be selected and dragged into these fields; e.g. to add a description line.

Double-clicking on a line in the Meta Browser table opens a dialog box (the Edit Panel) between the Web Browser and Meta Browser panels. This is where the user edits existing metadata tags.

Finally, at the left of the screen a hierarchical Tree View panel shows the tags available in the current schemas. More than one schema can be loaded at a time. Refinements on tags appear as branches under the tags. Clicking on a branch in the tree view causes a line for that tag to be added to the Meta Browser table (you can also drag and drop). If you plan to use the entire DC template this can be added with a single command rather than one line at a time. Users can enhance the Tree View with macros so that clicking a branch not only adds a line but also fills in that line with relevant data: for instance clicking on DC.Publisher might add a Publisher line already containing your company’s name.

The Tree View can also be used as a kind of Favorites list, assembling links to web pages that can then be accessed immediately by clicking. This may be useful if you are copying metadata repeatedly from the same web page, for instance. Items in one Tree View can be dragged and dropped on the Tree View in a second Metabrowser window.

Once a set of metadata has been created or modified the page is saved with the metadata embedded in the HEAD section. Users can work with files on their own hard disks or through a link to their online website.
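What embedding metadata in the HEAD section amounts to can be shown with a short Python sketch. This is not Metabrowser's method, which the program's documentation does not describe at this level; it simply inserts new metatag lines before the closing head tag of a page held as a string. The function name embed_metatags and the sample page are assumptions made for the example.

```python
import re
from html import escape


def embed_metatags(html, tags):
    """Insert one <meta> line per (name, content) pair just before </head>."""
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("page has no closing </head> tag")
    block = "".join(
        f'<meta name="{escape(name)}" content="{escape(content)}">\n'
        for name, content in tags
    )
    return html[:match.start()] + block + html[match.start():]


# Hypothetical page and metadata used only for this example.
PAGE = "<html><head><title>T</title></head><body></body></html>"
print(embed_metatags(PAGE, [("DC.Publisher", "Example Org")]))
```

A real tool would also need to update or remove metatags that are already present; this sketch only appends.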

Metabrowser screen shot

Templates and thesauruses

Metabrowser allows the user to set up a metadata ‘template’ which can then be applied to any page in a single operation. (Unfortunately this doesn’t appear to integrate with the powerful template system in Macromedia Dreamweaver, the most widely used web authoring package. Dreamweaver users will have to cut and paste Metabrowser output into their site templates.) The user can also establish their own controlled vocabulary by creating a text file with a .MBI suffix which lists the terms allowed in the metadata. This can be created and edited within Metabrowser itself, though the process is rather cumbersome, or from outside with a text editing program. Output from commercial thesaurus programs like Term Tree can be massaged into a form which Metabrowser will accept.
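The idea of a controlled vocabulary can be illustrated with a small Python sketch. The layout of a real .MBI file is not described here, so the sketch assumes a plain list with one allowable term per line, which may not match Metabrowser's actual format; the function names load_vocabulary and check_terms are invented for the example.

```python
def load_vocabulary(text):
    """Read allowable terms from text, assuming one term per line."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def check_terms(terms, vocabulary):
    """Return the terms that are not in the controlled vocabulary."""
    return [term for term in terms if term.strip().lower() not in vocabulary]


# Hypothetical vocabulary and keywords used only for this example.
VOCABULARY = load_vocabulary("Finance\nReports\n\nLegislation\n")
rejected = check_terms(["finance", "Sport", "Legislation"], VOCABULARY)
print("Not in vocabulary:", rejected)
```

Matching is case-insensitive in this sketch; a stricter vocabulary might insist on the exact preferred form of each term.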

Macros

Metabrowser comes with a limited macro capability which allows simple operations to be triggered by function keys. These might be loading or saving a schema or a template, inserting a particular element, or deleting all metadata. Standard macros can be shared between multiple users. Macros take the form of hyperlinks. As well as being triggered by Tree View selections, they can be embedded into web pages, so that the associated operation is carried out when the link is clicked.

Pros and cons

The program itself gave the impression of being reliable and well-written, and for use at a simple level appeared to be quite powerful and effective. I can’t comment on how it compares with alternative methods.

Written support is available through the website and a Word manual which can be downloaded. Both of these were fairly superficial. I was unable to find ways of doing relatively simple things like deleting macros or clearing the Tree View window. Since there is a bewildering collection of new terminology in the program – schemas, RSS channels, Metabrowser Server Records, Local Catalogs, Harvest Control Lists – a glossary would be useful to explain in more detail what these are and under what circumstances they might be useful. As it was, I felt I had blundered into an advanced class when I needed an introductory one. There were a few spelling errors on the website, which was otherwise well-presented though fairly brief.

The program does not always behave in familiar ways: actions like inserting a line in the Meta Browser, which I instinctively looked for under the Edit menu, can only be carried out by right-clicking. Other menu choices, like Edit Current Record, didn’t seem to do anything (what else would I edit?). The program optionally reassigns the F1 key, which should be reserved for Help. Some drag-and-drop activities didn’t work although the mouse pointer indicated that they should, and some of the dialog boxes that come up in response to an action are baffling and don’t make it clear whether the user has succeeded or failed. One feature was particularly frustrating: after navigating all the way through a complex network in order to find and open a web page, I had to go back and do it all over again for the next one in the same directory. Applications should remember the location they were in last.

On a more general level I am disappointed with the program’s one-at-a-time approach to page editing. Most large websites are now moving over to a partially or wholly automatic system, where pages are generated as required from stored information, and it is hard to see how Metabrowser might be applied in this situation. Within its niche, though – creating and editing metadata from formalised structures on individual web pages – Metabrowser does its job well.

I would like to acknowledge the help of Bruce McLeod, the author of Metabrowser, in responding to an earlier draft of this article.