Authentication/Authorization
Posted by odyssey2006 on May 16, 2006
Session presenter: Art Rhyno
Session notes: Geoff Sinclair
Art began by discussing the confusion around the terms authentication and authorization: two words that are often used together, but which have distinct conceptual meanings.
- Authentication: A system determines who you are…
- Authorization: …and then gives you access to the resources you are entitled to use.
The users need to access the available content. But not all the content is free, and content providers need to protect their assets. More and more of these assets are exposed through public sites, such as Google Scholar and Microsoft Academic Search. These needs can be balanced by injecting rights management into arbitrary web spaces.
Several methods of authentication were discussed:
- Password: This is simple, but does not scale well. People tend to share passwords. It’s difficult to manage patron access to databases that are only password-protected.
- Referer: There can be portability problems here if the vendor only accepts a single referring URL, rather than any page within a domain. This method of access can be easily spoofed, and does not scale well. This is perhaps a more common authentication method in public libraries than in academic libraries. Sometimes, this is the only option for a given database.
- IP Address: This is common in many libraries, and scales well. However, cracks are beginning to appear in IP address authentication because of network complexity. In IP address authentication, the vendor recognizes the IP range of the library or campus, and allows those connections through to protected resources. But an increasing number of patrons access resources from off-campus locations, so it’s necessary to proxy these connections, so that the vendor sees them as originating from the trusted network. Two types of proxy were discussed:
- Proxy within the browser: This is cumbersome. Users must change settings within their browser, engendering all the difficulties of multiple browser types, and the user must also remember to change the settings back when accessing publicly accessible sites.
- Reverse proxy: One example is EZproxy, which is easier for the user, because no settings need to be changed. The library links to protected resources through the reverse proxy. However, links in the HTML are modified to continue to keep the session proxied, and this is getting more difficult because the design of web-based resources is getting more complex, so it’s harder to do these re-writes. Some vendor sites are very problematic, and site changes can break the URL-rewriting algorithms.
- Attribute: This has not been widely adopted…yet. An attribute is a property of an object, such as a phone number or part of an address of a person, so multiple affiliations can be represented in an attribute-based authorization system. [Note: we've now moved away from authentication and into authorization...]
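To make the IP address method above concrete, here is a minimal sketch of the check a vendor might run on each incoming connection. It is an illustration only, not any vendor's actual code: the address ranges are reserved documentation ranges standing in for a library or campus network, and the names `TRUSTED_RANGES` and `is_trusted` are made up for this example.

```python
import ipaddress

# Hypothetical ranges registered with the vendor (reserved documentation
# addresses, standing in for a real library or campus network).
TRUSTED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_trusted(client_ip: str) -> bool:
    """Return True if the connecting address falls inside a trusted range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in TRUSTED_RANGES)


print(is_trusted("192.0.2.17"))   # on-campus workstation: True
print(is_trusted("203.0.113.5"))  # patron at home: False
```

The second call shows why off-campus patrons need a proxy: their home address fails the check, while a proxy server sitting inside one of the trusted ranges would pass it on their behalf.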
Enter Shibboleth
Shibboleth is a part of the Internet2 Middleware Initiative. It’s not an authentication mechanism per se. It’s a way to share attributes between organizations. Shibboleth is complex!
[At this point, a Shibboleth authorization transaction diagram was reviewed. For those interested, perhaps it would be better to review other documentation on the Web than try to reproduce the specifics here:
http://shibboleth.internet2.edu/tech-intro.html ]
The Shibboleth model has several advantages over traditional authentication methods, and raises some issues of its own:
- The target (Service Provider) never knows the ID of the user. The user gains access by means of a session handle, given at the origin (usually, the user’s home institution).
- Once a session handle is established, this may be transferable to many targets.
- Shibboleth federations are groups of institutions (libraries and vendors) that have established trust and common policies.
- The Shibboleth software supports standard directory database access protocols (LDAP, X.500?) and MySQL
- The patron database in most ILS software does not support these protocols, so batch exports, metadirectory software, or other kludgey solutions are required (cue the picture of Frankenstein’s monster)
- NCIP is a more complex and richer protocol. Not much support has appeared so far, but library vendors are eying NCIP for patron self-checkout
- NCIP is meant to accomplish a wide variety of tasks (two installations might have completely unrelated goals), so the standard is heavy.
- A related protocol, SIP, is used by Horizon, Dynix.
- Still, a bulk load may be the most expedient solution
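The first points in the list above (an opaque session handle issued at the origin, and a target that never learns the user's ID) can be sketched roughly as follows. This is a toy model under stated assumptions, not the Shibboleth protocol or its software: the directory contents, the attribute names, and every function name here are hypothetical, and the real exchange between origin and target is considerably more involved.

```python
import secrets

# Hypothetical patron directory at the origin (the home institution). In
# practice this might be an LDAP directory or a bulk load from the ILS.
DIRECTORY = {
    "jsmith": {"name": "J. Smith", "affiliation": ["student", "library-staff"]},
}

# Assumed release policy: only these attributes ever leave the origin.
RELEASED_ATTRIBUTES = ("affiliation",)

# Opaque handle -> user ID. This mapping is known only to the origin.
_handles = {}


def issue_handle(user_id: str) -> str:
    """Origin: after local authentication, mint an opaque session handle."""
    handle = secrets.token_urlsafe(16)
    _handles[handle] = user_id
    return handle


def attributes_for(handle: str) -> dict:
    """Origin: answer a target's attribute query for a given handle."""
    record = DIRECTORY[_handles[handle]]
    return {key: record[key] for key in RELEASED_ATTRIBUTES}


def target_allows(attributes: dict, required_affiliation: str) -> bool:
    """Target: authorize from attributes alone, without seeing a user ID."""
    return required_affiliation in attributes.get("affiliation", [])


handle = issue_handle("jsmith")
attributes = attributes_for(handle)
print(target_allows(attributes, "student"))  # True
print(target_allows(attributes, "faculty"))  # False
```

Because the record carries a list of affiliations, one patron can hold several at once, and any target that trusts the origin can make its own decision from the same handle.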
Several Ontario Universities are about to embark on a Shibboleth pilot project.
Knowledge Ontario will need provincial libraries to authenticate in order to access consortial resources. Some libraries may not yet have an authentication solution in place. [A straw poll of people in the room revealed we all have our own solutions in place, but perhaps there is a bias in the sample...]
The EZproxy lab session that wasn’t
[The session on authentication and authorization was to be a hands-on lab, but because the network at TPL was locked down, we couldn't install the software to do the lab section.]
Configuration is all. EZproxy can proxy by domain, but it has to know what to proxy. For example, you don’t want it to start proxying Google, using up the bandwidth of your proxy server unnecessarily. On the other hand, when a vendor adds a new server, you might find users breaking out of proxy and being asked for money to view subscribed resources. The EZproxy website has a lot of good information, but it’s not easy to browse, so much of it is hidden. Factiva is problematic. You can configure EZproxy to run on port 80. This will eliminate some technical issues, and your router might already be set to assure higher availability to traffic on port 80. There is a role for EZproxy in Shibboleth.
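The point about the proxy needing to know what to proxy can be illustrated with a small sketch of domain-based link rewriting. It shows the general idea only and is not EZproxy's implementation or configuration syntax: the proxy host, the vendor domains, and the function names are all hypothetical.

```python
from urllib.parse import quote, urlsplit

# Hypothetical proxy login prefix and vendor domains, for illustration only.
PROXY_LOGIN = "https://proxy.library.example/login?url="
PROXIED_DOMAINS = {"vendor.example", "journals.example"}


def should_proxy(url: str) -> bool:
    """True if the URL's host is a configured domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in PROXIED_DOMAINS
    )


def rewrite_link(url: str) -> str:
    """Route configured vendor links through the proxy; leave others alone."""
    if should_proxy(url):
        return PROXY_LOGIN + quote(url, safe="")
    return url


# A public site is left untouched, so it uses no proxy bandwidth.
print(rewrite_link("https://www.google.com/"))
# A configured vendor host is sent through the proxy.
print(rewrite_link("https://db.vendor.example/search?q=a"))
# A server the vendor added under a domain nobody configured is not proxied.
print(should_proxy("https://new-server.vendor-cdn.example/"))  # False
```

The last call is the "breaking out of proxy" case described above: until the new domain is added to the configuration, links to it go straight to the vendor, who sees an untrusted address.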
After session discussion
- Where have Shibboleth federations been established? A few were mentioned. A list is available on Wikipedia:
http://en.wikipedia.org/wiki/Shibboleth_%28Internet2%29
- Some vendors will never implement Shibboleth, so EZproxy will continue to be necessary.
- We’ve had a pretty good run with EZproxy, but the simple answers will not work in the future.
- Harvesters (such as wget) can slurp up all the content of a paid resource. Does Shibboleth improve the protection of Service Providers against this sort of automated downloading?
- Other libraries have implemented VPN over SSL (Queens, UBC), another mechanism for users to access paid resources off-campus. But what about the problem of the user ending up at Google on a VPN/SSL connection (bandwidth)?
- EZproxy uses redirects, and so do vendor sites. Some desktop security programs set a limit on the number of redirects.
- Unlike with bank card PINs, library users don’t perceive a vested interest in keeping their library passwords secret.
- Several librarians offered effusive praise for Chris Zagar, the man behind EZproxy. He has recently won the 2006 LITA/Brett Butler Entrepreneurship Award, and we feel that it was well-deserved!