<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>Coffee|Code : Dan Scott - Evergreen</title>
    <link>https://coffeecode.net:443/</link>
    <description>Caffeinated Librarian Geek</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.6.2 - http://www.s9y.org/</generator>
    
    

<item>
    <title>Querying Evergreen from Google Sheets with custom functions via Apps Script</title>
    <link>https://coffeecode.net:443/archives/310-Querying-Evergreen-from-Google-Sheets-with-custom-functions-via-Apps-Script.html</link>
            <category>Coding</category>
            <category>Evergreen</category>
    
    <comments>https://coffeecode.net:443/archives/310-Querying-Evergreen-from-Google-Sheets-with-custom-functions-via-Apps-Script.html#comments</comments>
    <wfw:comment>https://coffeecode.net:443/wfwcomment.php?cid=310</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>https://coffeecode.net:443/rss.php?version=2.0&amp;type=comments&amp;cid=310</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;Our staff were recently asked to check thousands of ISBNs to find out if we already have the corresponding books in our catalogue. They in turn asked me if I could run a script that would check it for them. It makes me happy to work with people who believe in &lt;em&gt;better living through automation&lt;/em&gt; (and saving their time to focus on tasks that only humans can really achieve).&lt;/p&gt;
&lt;p&gt;Rather than taking the approach that I normally would, which would be to just load the ISBNs into a table in our Evergreen database and then run some queries to take care of the task as a one-off, I opted to try for an approach that would enable others to run these sorts of ad hoc reports themselves. As with most libraries, I suspect, we work with spreadsheets a lot--and as our university has adopted Google Apps for Education, we are slowly using Google Sheets more to enable collaboration. So I was interested in figuring out how to build a custom function that would look for the ISBN and then return a simple &quot;Yes&quot; or &quot;No&quot; value according to what it finds.&lt;/p&gt;
&lt;p&gt;Evergreen has a robust SRU interface, which makes it easy to run complex queries and get predictable output back, and it normalizes ISBNs in the index so that a search for a 10-digit ISBN will return results for the corresponding 13-digit ISBN. That made figuring out the lookup part of the job easy; after that, I just needed to figure out how to create a custom function in Google Sheets.&lt;/p&gt;
&lt;p&gt;As it turns out, there&#039;s a dead-simple &lt;a href=&quot;https://developers.google.com/apps-script/quickstart/macros&quot;&gt;introductory tutorial for creating a custom function in Apps Script&lt;/a&gt; which tells you how to create a new function. And to make a call to a web service, there&#039;s the &lt;a href=&quot;https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app&quot;&gt;UrlFetchApp&lt;/a&gt; class. After that, it&#039;s a matter of basic JavaScript. In the end, my custom function looks like the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
* A custom function that checks for an ISBN in Evergreen
*
* Returns &quot;Yes&quot; if there is a match, or &quot;No&quot; if there is no match
*/
function checkForISBN(isbn) {
  var hostname = &#039;https://example.org&#039;;
  var urlBase = hostname + &#039;/opac/extras/sru&#039;;

  /* Supply a numeric or shortname library identifier 
   * to restrict the search to that part of the organization
   */
  var libraryID = &#039;103&#039;;
  if (libraryID) {
    urlBase += &#039;/&#039; + libraryID;
  }
  urlBase += &#039;?version=1.1&amp;operation=searchRetrieve&amp;maximumRecords=1&amp;query=&#039;;
  
  var q = encodeURIComponent(&#039;identifier|isbn:&#039; + isbn);
  var url = urlBase + q;
  var response = UrlFetchApp.fetch(url);
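  /* numberOfRecords is the total number of matches, so accept any count of 1 or more */
  if (response.getContentText().search(&#039;&amp;lt;numberOfRecords&amp;gt;[1-9][0-9]*&amp;lt;/numberOfRecords&amp;gt;&#039;) &gt; -1) {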
    return &quot;Yes&quot;;
  }
  return &quot;No&quot;;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then I just add a column beside the column with ISBN values and invoke the function as (for example) &lt;code&gt;=CheckForISBN(C2)&lt;/code&gt;.&lt;/p&gt;
&lt;!-- s9ymdb:379 --&gt;&lt;img class=&quot;serendipity_image_center&quot; width=&quot;462&quot; height=&quot;346&quot;  src=&quot;https://coffeecode.net:443/uploads/pics/check_for_isbn.png&quot;  alt=&quot;CheckForISBN() function being invoked in a Google Sheet&quot; /&gt;
&lt;p&gt;Given a bit more time, it would be easy to tweak the function to make it more robust, offer variant search types, and contribute it as a module to the &lt;a href=&quot;https://chrome.google.com/webstore&quot;&gt;Chrome Web Store&lt;/a&gt; &quot;Sheet Add-ons&quot; section, but for now I thought you might be interested in it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Caveats&lt;/strong&gt;&lt;/em&gt;: With thousands of ISBNs to check, occasionally you&#039;ll get an HTTP response error (&quot;&lt;code&gt;#ERROR&lt;/code&gt;&quot;) in the column. You can just paste the formula back in again and it will resubmit the query. The sheet also seems to resubmit the request on a periodic basis, so some of your &quot;Yes&quot; or &quot;No&quot; values might change to &quot;&lt;code&gt;#ERROR&lt;/code&gt;&quot; as a result.&lt;/p&gt; 
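&lt;p&gt;One way to soften those intermittent errors would be to retry the request a few times before giving up. The following is a minimal sketch under assumptions, not tested production code: &lt;code&gt;fetchTextWithRetry&lt;/code&gt; and its parameters are hypothetical names, and the fetcher is passed in as an argument so that the logic can be tried outside of Apps Script. In a sheet you would pass a function that wraps &lt;code&gt;UrlFetchApp.fetch(url).getContentText()&lt;/code&gt;, and possibly pause with &lt;code&gt;Utilities.sleep()&lt;/code&gt; between attempts.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Calls fetchText(url) up to maxAttempts times and returns the first
 * successful result. fetchText is any function that returns the response
 * body as a string or throws on failure (hypothetical helper signature).
 */
function fetchTextWithRetry(fetchText, url, maxAttempts) {
  var lastError = null;
  for (var remaining = maxAttempts; remaining > 0; remaining -= 1) {
    try {
      return fetchText(url);
    } catch (e) {
      /* Remember the failure and try again while attempts remain */
      lastError = e;
    }
  }
  throw lastError || new Error("No attempts were made");
}

/* In Apps Script (assumption: wrapping the call used in checkForISBN):
 *   var body = fetchTextWithRetry(function (u) {
 *     return UrlFetchApp.fetch(u).getContentText();
 *   }, url, 3);
 */
&lt;/code&gt;&lt;/pre&gt;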
    </content:encoded>

    <pubDate>Fri, 15 Apr 2016 14:36:35 -0400</pubDate>
    <guid isPermaLink="false">https://coffeecode.net:443/archives/310-guid.html</guid>
    <category>coding</category>
<category>evergreen</category>

</item>
<item>
    <title>Library catalogues and HTTP status codes</title>
    <link>https://coffeecode.net:443/archives/301-Library-catalogues-and-HTTP-status-codes.html</link>
            <category>Coding</category>
            <category>Evergreen</category>
    
    <comments>https://coffeecode.net:443/archives/301-Library-catalogues-and-HTTP-status-codes.html#comments</comments>
    <wfw:comment>https://coffeecode.net:443/wfwcomment.php?cid=301</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>https://coffeecode.net:443/rss.php?version=2.0&amp;type=comments&amp;cid=301</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p property=&quot;description&quot;&gt;I noticed in Google&#039;s &lt;a href=&quot;https://www.google.com/webmasters/tools/&quot;&gt;Webmaster Tools&lt;/a&gt; that our catalogue had been returning some &lt;em&gt;Soft 404s&lt;/em&gt;. Curious, I checked into some of the URIs suffering from this condition, and realized that Evergreen returns an HTTP status code of &lt;code&gt;200 OK&lt;/code&gt; when it serves up a record details page for a record that has been deleted. The HTML itself has a nice big red alert box warning users that the record has been deleted to help humans realize that what was once there is no longer, but machines typically don&#039;t read English. However, at some point in the past few months, Google started parsing the HTML and recognizing when HTTP status codes are misleading.&lt;/p&gt;
&lt;p&gt;That led me to wonder what happens when you request a record detail page by ID for a record that doesn&#039;t exist in Evergreen. As it turns out, it currently returns HTTP status code &lt;code&gt;200&lt;/code&gt; with a detail page devoid of any details. Also not good! Being a good little Evergreen community member, I &lt;a href=&quot;https://bugs.launchpad.net/evergreen/+bug/1406025&quot;&gt;opened a bug&lt;/a&gt; and put together a fairly simple fix so that the catalogue will return a &lt;code&gt;404 Not Found&lt;/code&gt; for non-existent records and &lt;code&gt;410 Gone&lt;/code&gt; for deleted records. Huzzah for HTTP standards compliance. We build a better web one small step at a time.&lt;/p&gt;
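&lt;p&gt;The decision behind that fix can be sketched in a few lines. This is an illustration only, not the actual Evergreen patch (see the linked bug for that): the function name and the shape of the &lt;code&gt;record&lt;/code&gt; argument (null for a missing record, an object with a &lt;code&gt;deleted&lt;/code&gt; flag otherwise) are assumptions made for the example.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Picks the HTTP status code for a record details page.
 * record is null when no record has the requested ID; otherwise it is
 * an object with a boolean "deleted" property (hypothetical shape).
 */
function statusForRecord(record) {
  if (record === null || record === undefined) {
    return 404; /* Not Found: no record with this ID */
  }
  if (record.deleted) {
    return 410; /* Gone: the record existed but has been deleted */
  }
  return 200; /* OK: serve the record details page */
}
&lt;/code&gt;&lt;/pre&gt;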
&lt;p&gt;That, in turn, led me to wonder what happens when you request record details for non-existent records in other library systems. Here&#039;s what I found:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Bibliocommons&lt;/strong&gt;: Status &lt;code&gt;302 Moved temporarily&lt;/code&gt; that then leads back to an empty search form. Not good.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Blacklight&lt;/strong&gt;: Status &lt;code&gt;404 Not Found&lt;/code&gt;. Good!&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Encore&lt;/strong&gt;: N/A - appears to serve up session-based URLs for records. Really?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;III&lt;/strong&gt;: Status &lt;code&gt;200 OK&lt;/code&gt;. Not good.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Koha&lt;/strong&gt;: Status &lt;code&gt;302 Found&lt;/code&gt; with a &lt;code&gt;Location:&lt;/code&gt; header leading to a page with a status &lt;code&gt;404 Not Found&lt;/code&gt;. That redirect probably makes it harder for the machines to recognize that the resource does not at all exist than if it directly returned a &lt;code&gt;404&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Polaris&lt;/strong&gt;: N/A - it seems that the normal web interface doesn&#039;t link directly to titles; instead it serves up titles in the context of search results by position. The mobile web interface offers persistent URLs, but requests for non-existent records return a status &lt;code&gt;302 Found&lt;/code&gt; that redirects back to an empty search form. Not good.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Primo (using a permalink)&lt;/strong&gt;: Status &lt;code&gt;302 Found&lt;/code&gt; that then leads to an empty record details page with a status &lt;code&gt;200 OK&lt;/code&gt;. Not good.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Symphony&lt;/strong&gt;: N/A - I tried a few systems (Houston Public Library, Oxnard Public Library) and it seems SirsiDynix still doesn&#039;t use persistent URLs, nor surface permalinks for records in the default interface.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Voyager&lt;/strong&gt;:  Status &lt;code&gt;200 OK&lt;/code&gt;. Not good.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VuFind&lt;/strong&gt;: Status &lt;code&gt;404 Not Found&lt;/code&gt;. Good!&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WorldCat&lt;/strong&gt;: Status &lt;code&gt;200 OK&lt;/code&gt;. Not good.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
Overall, this is a pretty dismal picture of the state of some of the most commonly used library catalogue systems when it comes to compliance with basic web standards. Kudos to Blacklight and VuFind for getting it right--and assuming that my branch gets integrated, Evergreen should join them in the near future.
&lt;/p&gt;
&lt;img src=&quot;https://coffeecode.net:443/uploads/files/404-web-standards-compliance.png&quot; property=&quot;image&quot; alt=&quot;404 Library Catalogue Web Standards Compliance Not Found&quot;&gt; 
    </content:encoded>

    <pubDate>Mon, 29 Dec 2014 11:50:50 -0500</pubDate>
    <guid isPermaLink="false">https://coffeecode.net:443/archives/301-guid.html</guid>
    <category>coding</category>
<category>evergreen</category>

</item>
<item>
    <title>Putting the &quot;Web&quot; back into Semantic Web in Libraries 2014</title>
    <link>https://coffeecode.net:443/archives/296-Putting-the-Web-back-into-Semantic-Web-in-Libraries-2014.html</link>
            <category>Coding</category>
            <category>Evergreen</category>
            <category>Libraries</category>
            <category>Structured data</category>
    
    <comments>https://coffeecode.net:443/archives/296-Putting-the-Web-back-into-Semantic-Web-in-Libraries-2014.html#comments</comments>
    <wfw:comment>https://coffeecode.net:443/wfwcomment.php?cid=296</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>https://coffeecode.net:443/rss.php?version=2.0&amp;type=comments&amp;cid=296</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;I was honoured to lead a workshop and speak at this year&#039;s edition of
&lt;a href=&quot;http://swib.org/swib14&quot;&gt;Semantic Web in Bibliotheken (SWIB)&lt;/a&gt; in Bonn, Germany. It was an amazing
experience; there were so many rich projects being described with obvious
dividends for the users of libraries. Once again, the European library
community fills me with hope for the future success of the semantic web.
&lt;/p&gt;

&lt;p&gt;
The subject of my talk &quot;Cataloguing for the open web with RDFa and schema.org&quot;
(&lt;a href=&quot;https://coffeecode.net/swib14/talk&quot;&gt;slides&lt;/a&gt; and &lt;a
href=&quot;http://www.scivee.tv/node/63282&quot;&gt;video recording&lt;/a&gt; - &lt;em&gt;gulp&lt;/em&gt;)
pivoted while I was preparing materials for the workshop. I was searching
library catalogues around Bonn looking for a catalogue with persistent URIs
that I could use for an example. To my surprise, catalogue after catalogue used
session-based URLs; it took me quite some time before I was able to find ULB,
who had hosted a VuFind front end for their catalogue. Even then, the
&lt;code&gt;robots.txt&lt;/code&gt; restricted crawling by any user agent. This reminded me
rather depressingly of my findings from current &quot;discovery layers&quot;, which
entirely restrict crawling and therefore put libraries into a black hole on the
web.
&lt;/p&gt;

&lt;p&gt;
These findings in the wild are so antithetical to the basic principles of
enabling discovery of web resources that, in a conference about the semantic
web, I opted to spend over half of my talk making the argument that libraries
need to pay attention to the old-fashioned web of documents first and foremost.
The basic building blocks that I advocated were, in priority order:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Persistent URIs, on which everything else is built&lt;/li&gt;
&lt;li&gt;Sitemaps, to facilitate discovery of your resources&lt;/li&gt;
&lt;li&gt;A robots.txt file to filter portions of your website that should not be
    crawled (for example, search results pages)&lt;/li&gt;
&lt;li&gt;RDFa, microdata, or JSON-LD only after you&#039;ve sorted out the first three&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
Only after setting that foundation did I feel comfortable launching into my
rationale for RDFa and schema.org as a tool for enabling discovery on the web:
a mapping of the access points that cataloguers create to the world of HTML
and aggregators. The key point for SWIB was that RDFa and schema.org can enable
full RDF expressions in HTML; that is, we can, should, and must go beyond
surfacing structured data to surfacing linked data through
&lt;code&gt;@resource&lt;/code&gt; attributes and 
&lt;a href=&quot;http://schema.org/sameAs&quot;&gt;schema:sameAs&lt;/a&gt; properties.
&lt;/p&gt;


&lt;blockquote&gt;
The Semantic Web is an extension of the current web in which information is
given well-defined meaning, better enabling computers and people to work in
cooperation. &lt;cite&gt;Tim Berners-Lee, Scientific American, 2001&lt;/cite&gt;
&lt;/blockquote&gt;

&lt;p&gt;
I also argued that using RDFa to enrich the document web was, in fact, truer to
Berners-Lee&#039;s 2001 definition of the semantic web, and that we should focus on
enriching the document web so that both humans and machines can benefit before
investing in building an entirely separate and disconnected semantic web.
&lt;/p&gt;

&lt;p&gt;
I was worried that my talk would not be well received; that it would be
considered obvious, or scolding, or just plain off-topic. But to my relief
I received a great deal of positive feedback. And on the next day, both Eric Miller
and Richard Wallis gave talks on a similar, but more refined, theme:
that libraries need to do a much, much better job of enabling their resources
to be found on the web--not by people who already use our catalogues, but by
people who are &lt;em&gt;not&lt;/em&gt; library users today.
&lt;/p&gt;

&lt;p&gt;
There were also some requests for clarification, which I&#039;ll try to address
generally here (for the benefit of anyone who wasn&#039;t able to talk with me, or
who might watch the livestream in the future).
&lt;/p&gt;

&lt;h4&gt;&quot;When you said anything could be described in schema.org, did you mean we should throw out MARC and BIBFRAME and EAD?&quot;&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;tldr:&lt;/em&gt; I intended &lt;strong&gt;and&lt;/strong&gt;, not &lt;strong&gt;instead of&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;
The first question I was asked was whether there was anything that I had not
been able to describe in schema.org, to which I answered &quot;No&quot;--especially since
the work that the W3C SchemaBibEx group had done to ensure that some of the core
bibliographic requirements were added to the vocabulary. It was not as
coherent or full a response as I would have liked to have made; I blame the
livestream camera &lt;img src=&quot;https://coffeecode.net:443/templates/default/img/emoticons/smile.png&quot; alt=&quot;:-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;
But combined with a part of the presentation where I countered a myth about
schema.org being a very coarse vocabulary by pointing out that it actually
contained 600 classes and over 800 properties, a number of the attendees
interpreted one of the takeaways of my talk as suggesting that libraries should
adopt schema.org as &lt;em&gt;the&lt;/em&gt; descriptive vocabulary, and that MARC,
BIBFRAME, EAD, RAD, RDA, and other approaches for describing library resources
were no longer necessary.
&lt;/p&gt;

&lt;p&gt;
This is not at all what I&#039;m advocating! To expand on my response, you
&lt;em&gt;can&lt;/em&gt; describe anything in schema.org, but you might lose significant
amounts of richness in your description. For example, short stories and poems
would best be described in schema.org as a &lt;a href=&quot;http://schema.org/CreativeWork&quot;&gt;CreativeWork&lt;/a&gt;.
You would have to look at the associated description or keyword properties to
be able to figure out the form of the work.
&lt;/p&gt;

&lt;p&gt;
What I was advocating was that you should map your rich bibliographic
description into corresponding schema.org classes and properties in RDFa at the
time you generate the HTML representation of that resource and its associated
entities. So your poem might be represented as a &lt;a
href=&quot;http://schema.org/CreativeWork&quot;&gt;CreativeWork&lt;/a&gt;, with a
&lt;a href=&quot;http://schema.org/name&quot;&gt;name&lt;/a&gt;,
&lt;a href=&quot;http://schema.org/author&quot;&gt;author&lt;/a&gt;,
&lt;a href=&quot;http://schema.org/description&quot;&gt;description&lt;/a&gt;,
&lt;a href=&quot;http://schema.org/keywords&quot;&gt;keywords&lt;/a&gt;, and
&lt;a href=&quot;http://schema.org/about&quot;&gt;about&lt;/a&gt; values and relationships. Ideally,
the &lt;code&gt;author&lt;/code&gt; will include at least one link (either via 
&lt;a href=&quot;http://schema.org/sameAs&quot;&gt;sameAs&lt;/a&gt;,
&lt;a href=&quot;http://schema.org/url&quot;&gt;url&lt;/a&gt;, or &lt;code&gt;@resource&lt;/code&gt;) to an
entity on the web; and you could do the same with &lt;code&gt;about&lt;/code&gt; if you are
using a controlled vocabulary.
&lt;/p&gt;
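&lt;p&gt;
To make that mapping concrete, here is a rough sketch of the kind of description I mean. It uses JSON-LD rather than RDFa purely because it is compact to show, and it covers only a subset of the properties listed above. The poem, the author, the &lt;code&gt;describePoem&lt;/code&gt; function name, and the example.org URI are all made up for illustration.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Builds a schema.org CreativeWork description for a poem.
 * All values passed in are illustrative; the URI is a placeholder.
 */
function describePoem(name, authorName, authorUri, keywords) {
  return {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": name,
    "author": {
      "@type": "Person",
      "name": authorName,
      /* Link the author to an entity on the web */
      "sameAs": authorUri
    },
    /* The form of the work ends up in the keywords */
    "keywords": keywords.join(", ")
  };
}

var poem = describePoem(
  "An Example Poem",
  "A. Poet",
  "http://example.org/authority/a-poet",
  ["poetry", "examples"]
);
var jsonLd = JSON.stringify(poem, null, 2);
&lt;/code&gt;&lt;/pre&gt;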

&lt;p&gt;
If you take that approach, then you can serve up schema.org descriptions of works
in HTML that most web-oriented clients will understand (such as search engines)
and provide basic access points such as name / author / keywords, while
retaining and maintaining the full richness of the underlying bibliographic
description--and potentially providing access to that, too, as part of the
embedded RDFa, via content negotiation, or &lt;code&gt;&amp;lt;link rel=&quot;&quot;&amp;gt;&lt;/code&gt;,
for clients that can interpret richer formats.
&lt;/p&gt;

&lt;h4&gt;&quot;What makes you think Google will want to surface library holdings in search results?&quot;&lt;/h4&gt;

&lt;p&gt;
There is a perception that Google and other search engines just want to sell
ads, or their own products (such as Google Books). While Google certainly does
want to sell ads and products, they also want to be the most useful tool for
satisfying users&#039; information needs--possibly so they can learn more about those
users and put more effective ads in front of them--but nonetheless, the
motivation is there.
&lt;/p&gt;

&lt;p&gt;
If you mark up your resources with the Product / Offer portion of schema.org,
you are able to provide search engines with availability information in the
same way that Best Buy, AbeBooks, and other online retailers do (as Evergreen,
Koha, and VuFind already do). That makes it much easier for the search engines
to use everything they may know about their users, such as their current
location, their institutional affiliations, their typical commuting patterns,
their reading and research preferences... to provide a link to a library&#039;s
electronic or print copy of a given resource in a knowledge graph box as one of
the possible ways of satisfying that person&#039;s information needs.
&lt;/p&gt;

&lt;p&gt;
We don&#039;t see it happening with libraries running Evergreen, Koha, and VuFind
yet, realistically because the open source library systems don&#039;t have enough
penetration to make it worth a search engine&#039;s effort to add that to their
set of possible sources. However, if we as an industry make a concerted effort
to implement this as a standard part of crawlable catalogue or discovery record
detail pages, then it wouldn&#039;t surprise me in the least to see such suggestions
start to appear. The best proof that we have that Google, at least, is
interested in supporting discovery of library resources is the continued
investment in Google Scholar.
&lt;/p&gt;

&lt;p&gt;
And as I argued during my talk, even if the search engines never add direct
links to library resources from search results or knowledge graph sidebars,
having a reasonably simple standard like the GoodRelations product / offer
pattern for resource availability enables new web-based approaches for building
applications. One example could be a fulfillment system that uses sitemaps to
intelligently crawl all of its participating libraries, normalizes the
item request to a work URI, and checks availability by parsing the offers at the
corresponding URIs.
&lt;/p&gt; 
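&lt;p&gt;
The last step of that hypothetical fulfillment system, checking availability from the offers, could look something like the following sketch. It assumes the offers have already been extracted from the page into plain objects with schema.org property names; the &lt;code&gt;availableOffers&lt;/code&gt; function name and the sample call numbers and barcodes are invented for the example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Returns the offers whose availability is schema.org InStock.
 * offers is an array of plain objects using schema.org property names
 * (assumed to have been extracted from the page already).
 */
function availableOffers(offers) {
  return offers.filter(function (offer) {
    return offer.availability === "http://schema.org/InStock";
  });
}

var sample = [
  { sku: "PS 8501 .T86", serialNumber: "30007001", availability: "http://schema.org/InStock" },
  { sku: "PS 8501 .T86", serialNumber: "30007002", availability: "http://schema.org/OutOfStock" }
];
var available = availableOffers(sample);
&lt;/code&gt;&lt;/pre&gt;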
    </content:encoded>

    <pubDate>Thu, 04 Dec 2014 16:15:15 -0500</pubDate>
    <guid isPermaLink="false">https://coffeecode.net:443/archives/296-guid.html</guid>
    <category>coding</category>
<category>evergreen</category>
<category>libraries</category>
<category>structured data</category>

</item>
<item>
    <title>How discovery layers have closed off access to library resources, and other tales of schema.org from LITA Forum 2014</title>
    <link>https://coffeecode.net:443/archives/294-How-discovery-layers-have-closed-off-access-to-library-resources,-and-other-tales-of-schema.org-from-LITA-Forum-2014.html</link>
            <category>Coding</category>
            <category>Evergreen</category>
            <category>Structured data</category>
    
    <comments>https://coffeecode.net:443/archives/294-How-discovery-layers-have-closed-off-access-to-library-resources,-and-other-tales-of-schema.org-from-LITA-Forum-2014.html#comments</comments>
    <wfw:comment>https://coffeecode.net:443/wfwcomment.php?cid=294</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>https://coffeecode.net:443/rss.php?version=2.0&amp;type=comments&amp;cid=294</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p property=&quot;description&quot;&gt;
At the LITA Forum yesterday, I accused (&lt;a href=&quot;http://stuff.coffeecode.net/2014/lita_forum&quot;&gt;presentation&lt;/a&gt;) most discovery layers of not solving the discoverability problems of libraries, but instead exacerbating them by launching us headlong to a closed, unlinkable world. Coincidentally, Lorcan Dempsey&#039;s opening keynote contained a subtle criticism of discovery layers. I wasn&#039;t that subtle.&lt;/p&gt;
&lt;p&gt;
Here&#039;s why I believe commercial discovery layers are not &quot;of the web&quot;: check out their &lt;a href=&quot;http://robotstxt.org&quot;&gt;&lt;code&gt;robots.txt&lt;/code&gt;&lt;/a&gt; files. If you&#039;re not familiar with robots.txt files, these are what search engines and other well-behaved automated crawlers of web resources use to determine whether they are allowed to visit and index the content of pages on a site. Here&#039;s what the &lt;code&gt;robots.txt&lt;/code&gt; files look like for a few of the best-known discovery layers:
&lt;/p&gt;
&lt;pre&gt;
User-Agent: *
Disallow: /
&lt;/pre&gt;
&lt;p&gt;
That effectively says &quot;Go away, machines; your kind isn&#039;t wanted in these parts.&quot; And that, in turn, closes off access to your library&#039;s resources to search engines and other aggregators of content, and is completely counter to the overarching desire to evolve to a linked open data world.
&lt;/p&gt;
&lt;p&gt;
During the question period, Marshall Breeding challenged my assertion as being unfair to what are meant to be merely indexes of library content. I responded that most libraries have replaced their catalogues with discovery layers, closing off open access to what have traditionally been their core resources, and he rather quickly conceded that that was indeed a problem.
&lt;/p&gt;
&lt;p&gt;
(By the way, a possible solution might be to simply offer two different URL patterns, something like &lt;code&gt;/library/*&lt;/code&gt; for library-owned resources to which access should be granted, and &lt;code&gt;/licensed/*&lt;/code&gt; for resources to which open access to the metadata is problematic due to licensing issues, and which robots can therefore be restricted from accessing.)
&lt;/p&gt;
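&lt;p&gt;
As a rough sketch of that idea, the following generates a &lt;code&gt;robots.txt&lt;/code&gt; that leaves the library-owned pattern crawlable and blocks only the licensed one. The path is the hypothetical one from the paragraph above, and the &lt;code&gt;buildRobotsTxt&lt;/code&gt; function name is made up for the example.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Builds robots.txt content that blocks only the given path prefixes
 * for all user agents; everything else stays crawlable by default.
 */
function buildRobotsTxt(disallowedPrefixes) {
  var lines = ["User-Agent: *"];
  disallowedPrefixes.forEach(function (prefix) {
    lines.push("Disallow: " + prefix);
  });
  return lines.join("\n") + "\n";
}

/* Block the licensed metadata, leave /library/ open to crawlers */
var robotsTxt = buildRobotsTxt(["/licensed/"]);
&lt;/code&gt;&lt;/pre&gt;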
&lt;p&gt;
Compared to commercial discovery layers on my very handwavy usability vs. discoverability plot, general search engines rank pretty high on both axes; they&#039;re the ready-at-hand tool in browser address bars. And they grok schema.org, so if we can improve our discoverability by publishing schema.org data, maybe we get a discoverability win for our users.
&lt;/p&gt;
&lt;p&gt;
But even if we don&#039;t (SEO is a black art at best, and maybe the general search engines won&#039;t find the right mix of signals that makes them decide to boost the relevancy of our resources for specific users in specific locations at specific times) we get access to that structured data across systems in an extremely reusable way. With sitemaps, we can build our own specialized search engines (Solr or ElasticSearch or Google Custom Search Engine or whatever) that represent specific use cases. Our more sophisticated users can piece together data to, for example, build dynamic lists of collections, using a common, well-documented vocabulary and tools rather than having to dip into the arcane world of library standards (Z39.50 and MARC21).
&lt;/p&gt;
&lt;p&gt;
So why not iterate our way towards the linked open data future by building on what we already have now?
As &lt;a href=&quot;http://kcoyle.blogspot.ca/2014/10/schemaorg-where-it-works.html&quot;&gt;Karen Coyle wrote&lt;/a&gt; in a much more elegant fashion, the transition looks roughly like:
&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;Stored data -&gt; transform/template -&gt; human readable HTML page&lt;/li&gt;
    &lt;li&gt;Stored data -&gt; transform/template (tweaked) -&gt; machine &amp;amp; human readable HTML page&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
That is, by simply tweaking the same mechanism you already use to generate a human readable HTML page from the data you have stored in a database or flat files or what have you, you can embed machine readable structured data as well.
&lt;/p&gt;
&lt;p&gt;
That is, in fact, exactly the approach I took with Evergreen, VuFind, and Koha. And they now expose structured data and generate sitemaps out of the box using the same old MARC21 data. Evergreen even exposes information about libraries (locations, contact information, hours of operation) so that you can connect its holdings to specific locations.
&lt;/p&gt;
&lt;p&gt;
And what about all of our resources outside of the catalogue? Research guides, fonds descriptions, institutional repositories, publications... I&#039;ve been lucky enough to be working with Camilla McKay and Karen Coyle on applying the same process to the Bryn Mawr Classical Review. At this stage, we&#039;re exposing basic entities (&lt;a href=&quot;http://schema.org&quot;&gt;Reviews&lt;/a&gt; and &lt;a href=&quot;http://schema.org/Person&quot;&gt;People&lt;/a&gt;) largely as literals, but we&#039;re laying the groundwork for future iterations where we link them up to external entities. And all of this is built on a Tcl + SGML infrastructure.
&lt;/p&gt;
&lt;p&gt;
So why schema.org? It has the advantage of being a de-facto generalized vocabulary that can be understood and parsed across many different domains, from car dealerships to streaming audio services to libraries, and it can be relatively simply embedded into existing HTML as long as you can modify the templating layer of your system.
&lt;/p&gt;
&lt;p&gt;
And schema.org offers much more than just static structured data; schema.org Actions are surfacing in applications like Gmail as a way of providing directly actionable links--and there&#039;s no reason we shouldn&#039;t embrace that approach to expose &quot;SearchAction&quot;, &quot;ReadAction&quot;, &quot;WatchAction&quot;, &quot;ListenAction&quot;, &quot;ViewAction&quot;--and &quot;OrderAction&quot; (Request), &quot;BorrowAction&quot; (Borrow or Renew), &quot;Place on Reserve&quot;, and other common actions as a standardized API that exists well beyond libraries (see Hydra for a developing approach to this problem).
&lt;/p&gt;
&lt;p&gt;
I want to thank Richard Wallis for inviting me to co-present with him; it was a great experience, and I really enjoy meeting and sharing with others who are putting linked data theory into practice.
&lt;/p&gt; 
    </content:encoded>

    <pubDate>Sat, 08 Nov 2014 11:41:30 -0500</pubDate>
    <guid isPermaLink="false">https://coffeecode.net:443/archives/294-guid.html</guid>
    <category>coding</category>
<category>evergreen</category>
<category>structured data</category>

</item>
<item>
    <title>DCMI 2014: schema.org holdings in open source library systems</title>
    <link>https://coffeecode.net:443/archives/293-DCMI-2014-schema.org-holdings-in-open-source-library-systems.html</link>
            <category>Coding</category>
            <category>Evergreen</category>
            <category>Libraries</category>
            <category>Structured data</category>
    
    <comments>https://coffeecode.net:443/archives/293-DCMI-2014-schema.org-holdings-in-open-source-library-systems.html#comments</comments>
    <wfw:comment>https://coffeecode.net:443/wfwcomment.php?cid=293</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>https://coffeecode.net:443/rss.php?version=2.0&amp;type=comments&amp;cid=293</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;My slides from DCMI 2014: &lt;a
href=&quot;http://stuff.coffeecode.net/2014/dcmi_schemabibex/#/&quot;&gt;schema.org in the
wild: open source libraries++&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;
Last week I was at the &lt;a
href=&quot;http://dcevents.dublincore.org/IntConf/dc-2014&quot;&gt;Dublin Core Metadata
Initiative 2014 conference&lt;/a&gt;, where Richard Wallis, Charles McCathie Nevile
and I were slated to present on schema.org and the work of the W3C Schema.org
Bibliographic Extension Community Group (#schemabibex). As a first-timer at
DCMI, I wasn&#039;t sure what kind of an audience to expect: there is a
peer-reviewed papers track, and a series of sessions on a truly intimidating
topic (RDF Application Profiles), but on the other hand our own topic was
fairly basic. As it turned out, there was an invigoratingly mixed set of
backgrounds present, and Eric Miller&#039;s opening keynote, which gave an oral
history of the origins of DCMI and a look towards the future challenges for the
organization, reassured me that I wasn&#039;t going to be out of my depth.
&lt;/p&gt;
&lt;p&gt;
Special kudos to Eric for his analogy of the Web to a credit card, which offers
both human-readable and machine-readable data. A nice, clean image!
&lt;/p&gt;
&lt;p&gt;
Richard, Charles and I opted to structure our 1.5 hour session as a series of
short talks followed by a long period of discussion. However, as often happens,
the excitement of speaking in front of a room that drew so many attendees that
we had to jam in more chairs led to that plan breaking down. I cut my own
materials back to illustrating how one of my primary contributions to the
#schemabibex effort--representing library holdings using schema.org&#039;s
GoodRelations-based Product/Offer model--had been implemented in free software
library systems, including Evergreen, Koha, and VuFind. I walked from a basic
bibliographic record (represented as a &lt;a
href=&quot;http://schema.org/Product&quot;&gt;Product&lt;/a&gt;), through to the associated
borrowable items (represented as &lt;a href=&quot;http://schema.org/Offer&quot;&gt;Offers&lt;/a&gt;
with a price of $0.00, call numbers as &lt;a
href=&quot;http://schema.org/sku&quot;&gt;SKUs&lt;/a&gt;, and barcodes as &lt;a
href=&quot;http://schema.org/serialNumber&quot;&gt;serialNumbers&lt;/a&gt;), that were offered by
a specific &lt;a href=&quot;http://schema.org/Library&quot;&gt;Library&lt;/a&gt; with its own set of
operating hours, address, and contact information... all published out of the
box as RDFa in modern Evergreen systems.
&lt;/p&gt;
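&lt;p&gt;
Expressed as JSON-LD (the systems themselves publish RDFa; JSON-LD is just more compact to show here), that model looks roughly like the following sketch. The title, call number, barcode, and library name are invented, the function names are hypothetical, and using &lt;code&gt;seller&lt;/code&gt; to connect the offer to the library is my choice for the example rather than a claim about the exact markup each system emits.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
/**
 * Describes one borrowable copy as a schema.org Offer with a price of
 * 0.00, the call number as sku, and the barcode as serialNumber.
 */
function describeCopy(callNumber, barcode, libraryName) {
  return {
    "@type": "Offer",
    "price": "0.00",
    "sku": callNumber,
    "serialNumber": barcode,
    "seller": { "@type": "Library", "name": libraryName }
  };
}

/** Describes a bibliographic record as a Product with its copies. */
function describeRecord(title, copies) {
  return {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": title,
    "offers": copies
  };
}

var record = describeRecord("An Example Title", [
  describeCopy("QA 76.9 .E94", "30007003", "Example Library")
]);
&lt;/code&gt;&lt;/pre&gt;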
&lt;p&gt;
I did stray a little to posit that the use case for schema.org is not and should
not be limited to &quot;search engine optimization&quot;, but that this very simple level
of structured data could fairly easily form the basis of an API. In the rather
limited discussion that we were able to hold at the end of the session (and
encroaching on break time), Charles counselled that libraries shouldn&#039;t really
bother with dumbing down their beautiful metadata simply to publish
schema.org... while I countered that the pursuit of publishing beautiful
metadata in the past has generally led librarians to publish no metadata at
all, and that schema.org was a great first step towards building a web of
cultural heritage metadata meant for machine consumption.
&lt;/p&gt;
&lt;p&gt;
I wish I could have stayed longer at DCMI, but it was Thanksgiving in Canada
and there were families to visit and feast with--not to mention children to
help take care of--so I had to depart after just a day and a half. I&#039;m
encouraged by the steps the organization is taking to renew itself, and I hope
to be able to participate again in the future.
&lt;/p&gt; 
    </content:encoded>

    <pubDate>Mon, 13 Oct 2014 21:07:13 -0400</pubDate>
    <guid isPermaLink="false">https://coffeecode.net:443/archives/293-guid.html</guid>
    <category>coding</category>
<category>evergreen</category>
<category>libraries</category>
<category>structured data</category>

</item>

</channel>
</rss>