-rs-serve - A remotestorage server implementation
-================================================
+krsd - A Kerberised RemoteStorage server implementation
+=======================================================
-Dependencies
-------------
+Contents:
+
+1. Introduction
+2. Overview
+2.1 remotestorage
+2.2 webfinger
+2.3 authorization tools
+2.4 storage system
+3. Installing
+3.1 Dependencies
+3.2 Getting the code
+3.3 Building
+3.4 Installing system-wide
+3.5 Setting options
+3.6 Integrating authorization
+
+1) Introduction
+---------------
+
+remotestorage is an open specification for personal data storage. It is supposed
+to replace the currently popular proprietary "cloud storage" protocols using an
+open standard and thereby promoting the seperation of applications and their
+data on the web.
+
+For more information, check out these links:
+ * http://remotestorage.io/ - Information about the remotestorage protocol
+ and current implementations.
+ * http://unhosted.org/ - Philosophy, hands-on Tutorials and App collection.
+
+2) Overview
+-----------
+
+krsd brings three things:
+* a HTTP endpoint implementing remotestorage: /storage/{user}
+* a HTTP endpoint implementing webfinger: /.well-known/webfinger
+
+User management is based on Kerberos identities. The SPNEGO mechanism
+specified in RFC 4559 is used to this end. Briefly put, this is a GSS-API
+exchange embedded as base64 in WWW-Authenticate and Authorization headers
+of the Negotiate type.
+
+User accounts are unrelated to accounts on the server hosting this service.
+Storage is both available to users and to services, and each is identified
+with their usual principal names, including forms like these:
+
+ xmpp/xmpp.arpa2.net@ARPA2.NET
+ john@ARPA2.NET
+ john/admin@ARPA2.NET
+
+These forms are translated to paths on the local filesystem as described
+below, under "Storage system".
+
+krsd is a fork of the useful work on rs-serve, and the respective
+locations of the original and derived work are:
+https://github.com/remotestorage/rs-serve
+https://github.com/arpa2/krsd
+The name has been changed to avoid confusion with application users. It is
+not clear to date if this project will live its own life or get integrated
+with the original branch. In the first case, a separate name brings
+clarity; in the second, it has been harmless.
+
+krsd is entirely written in C, using mostly POSIX library functions. It
+relies on a few portable libraries, see the list under "Dependencies" below.
+It does however currently use the signalfd() system call, which is only
+available on Linux. (this is a solvable problem though, if you want to
+be able to run on another system, please open an issue to ask for help)
+
+2.1) remotestorage
+------------------
+
+The currently implemented protocol version is "draft-dejong-remotestorage-01". It has been modified to permit implied, that is HTTP-level, authentication and authorisation. Demo code is available in a separate directory.
+
+Currently the following features are supported:
+* CORS support for all verbs (TODO: May not work at present)
+* GET, PUT, DELETE requests on files and folders
+* Opaque version strings (in directory listings and "ETag" header)
+* Conditional GET, PUT and DELETE requests ("If-Match", "If-None-Match" headers)
+* Protection of all non-public paths via authentication by the browser at the HTTP level, and authorisation based on a .k5remotestorage file in the destination file system
+* Special handling of public paths (i.e. those starting with /public/), such that
+ requests on non-directory paths succeed without authorization.
+* HEAD requests on files and folders with "Content-Length" header
+ (not part of remotestorage-01, only enabled when --experimental flag is given)
+
+2.2) webfinger
+--------------
+
+The webfinger implementation only serves information about remotestorage
+and is currently not extensible.
+The hostname part of user addresses is expected to be the hostname set for
+the rs-serve instance. This currently defaults to "local.dev" and can be
+overridden with the --hostname option.
+Virtual hosting (== hosting storage for multiple domains from a single
+instance) is currently not supported.
+
+The pathname returned ermits for the krsd to parse out two components of
+the pathname to which it stores: /storage/domain.tld/user/ should be at
+the beginning of the URI if it is to be accepted as a remoteStorage URI
+on krsd. Anything after this is taken as a path into the storage
+structures used.
+
+2.3) authorization
+------------------
+
+This version of rs-serve cannot employ the original authorization backend
+and frontend, because it removes the implicit bearer flow described by
+OAuth2, for reasons of security. Instead, it relies on the implicit and
+time-constrained access provided by Kerberos.
+
+It might have been possible to rely on a bearer token obtained from a
+Kerberos-specific OAuth2 node, which would have made minimal or no changes
+to rs-serve, but the downside of that would be that the intermediate form
+of the bearer tokens provide access for all times, whereas Kerberos tickets
+passed directly put a time constraint on the exchange. (This point however,
+is negotiable. It seems that OAuth2 permits it. Encryption to resource
+and information-rich handling in the IDP would be possible, in a site
+managed and controlled by the user's side of things. Let's talk?)
+
+Another option might have been to package Kerberos tickets in bearer tokens,
+but that would have introduced an extra intermediate party with no added
+value but unpacking / repacking the token, so it was deemed less attractive.
+
+The current implementation does not constraint access to particular users,
+or groups of users. It is very likely that such a facility will be added
+in a future version, possibly based on central management infrastructure.
+
+The --auth-uri option now causes a warning, but is not fatal.
+
+
+2.4) Storage system
+-------------------
+
+The payload data of the remotestorage endpoint is stored on the local filesystem
+within the respective user's target directory. Determining this directory is
+done by splitting the principal name at the @ sign and reshaping it according
+to these steps:
+
+ 0. Initially, the pathname of a realm is a fixed string that is shared by
+ all principals.
+
+ Example 0a.
+ A fallback default could be /var/lib/krs/
+
+ Example 0b.
+ When using OpenAFS as a backing store for this service, the starting
+ point /afs/ may be more useful.
+
+ 1. Firstly, the realm is removed and translated into its related domain name,
+ which should be supported in the surrounding infrastructure. In case of
+ generic hosting, the surrounding infrastructure could be the DNS, with
+ DNSSEC validated for sites that use it. The result from this mapping is
+ the DNS domain name, represented as lowercase characters without a
+ trailing dot.
+
+ Example 1a.
+ ARPA2.NET is setup in the local infrastructure, and its domain name
+ is found to be arpa2.net -- which is used as a pathname component.
+
+ Example 1b.
+ ARPA2.NET is not setup in local infrastructure, and it is looked up
+ in DNS under the same domain name, arpa2.net -- and if that site
+ thought it sufficiently important to sign with DNSSEC, then this
+ will be validated. What is looked up under the domain name is a
+ TXT record named _kerberos, holding what should match literally
+ with the realm name, case included. In case of a mismatch, fail to
+ help the user. Since the realm is being looked up, rather than a
+ DNS hostname, there will not be a trace up to parent domains.
+
+ What we now concatenate the DNS-name to the installation directory
+ as another subdirectory level.
+
+ 2. Secondly, the local part (before the @ sign) is treated literally as a
+ path, with one or more levels. Note that this means that a slash is
+ treated as a directory separator. The last level is, once again, a
+ directory name.
+
+ Example 2a.
+ john@ARPA2.NET will append john/ to the path found so far.
+
+ Example 2b.
+ john/admin@ARPA2.NET will append john/admin/ to the path found
+ so far.
+
+ Example 2c.
+ http/chitchat.arpa2.net will append http/chitchat.arpa2.net/ to
+ the path found so far.
+
+ 3. Thirdly, a fixed directory name such as remotestorage/ is appended to
+ the path name. This is done to separate remote storage from local
+ storage and from other remote storage components, and to make all the
+ other directories unavailable. This is of no use to local storage, but
+ it is immensely useful when dealing with storage on an OpenAFS share
+ that is also employed for other purposes.
+
+The result is a concrete path into a filesystem. A few complete examples may
+be helpful at this point.
+
+Example. The user john@ARPA2.NET could be found in DNS zone arpa2.net, and
+dependent on local settings his files could end up in mounted locations like:
+
+ /afs/arpa2.net/john/remotestorage/...
+ /var/lib/krs/arpa2.net/john/remotestorage/...
+
+Example. After changing user-ID to john/admin under the same realm, the
+files accessible to John are found in places like:
+
+ /afs/arpa2.net/john/admin/remotestorage/...
+ /var/lib/krs/arpa2.net/john/admin/remotestorage/...
+
+Example. When a server is not acting on behalf of a user (through S4U with
+Constrained Delegation) but on its own title, it depends on its principal
+name. For example, xmpp/xmpp.arpa2.net@ARPA2.NET could be found on:
+
+ /afs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...
+ /var/lib/krs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...
+
+The filesystem path is configured in the webfinger profile, and may or may
+not relate to the Kerberos Principal Name used to access the resource.
+For full flexibility, the remotestorage directory should hold a file named
+.k5remotestorage which must hold the Kerberos Principal Name on a line of
+its own, if it is to have read/write access to anything underneath this
+directory.
+
+The reliance on filesystem paths implies a few noteworthy restrictions:
+
+* The remotestorage endpoint cannot be used to store both a directory and a file
+ under the same path (ignoring the trailing slash). That means you cannot store
+ /foo/bar/baz and /foo/bar, but only one of them. This is a natural restriction
+ of traditional filesystems, that is currently well adhered to by all apps using
+ remotestorage (as far as I know).
+
+* MIME types may not be exact for files that were added "out-of-band", that is
+ not added via the remotestorage protocol, but by copying to the remotestorage/
+ directory by other means. krsd stores MIME type and character encoding
+ under the "user.mime_type" and "user.charset" extended attributes, given these
+ are supported by the underlying filesystem. When these attributes aren't set,
+ a MIME type is guessed using libmagic, which may not always yield desirable
+ results. (for example an empty file, created using "touch" will be transmitted
+ via remotestorage with a Content-Type header of "inode/x-empty; charset=binary")
+ If even libmagic fails to make sense of a file, the Content-Type is set to
+ "application/octet-stream; charset=binary".
+
+* Filesystem privileges must be setup to grant access to the user. In the
+ case of OpenAFS, this will be based on the user, and krsd will act
+ on behalf of the user when storing files. In this case, Constrained
+ Delegation must be permissive to S4U2Proxy use, and the principal ticket
+ for the user must probably be created proxiable. The details of this have
+ not been ironed out yet.
+
+3) Installing
+-------------
+
+These steps should enable you to install krsd.
+
+3.1) Dependencies
+-----------------
- GNU make
- pkg-config (or tweak the Makefile)
- gcc
- libc
-- libevent
+- libevent (>= 2.0)
- libmagic
+- libattr
+- BerkeleyDB
On Debian based systems, this should give you all you need:
- apt-get install build-essential libevent-dev libmagic-dev
+ apt-get install build-essential libevent-dev libmagic-dev libattr1-dev libssl-dev libdb-dev pkg-config
-If you want to develop, you may also want debug symbols and valgrind (required by leakcheck.sh script):
+If you want to develop, you may also want debug symbols and valgrind (required by
+leakcheck.sh script):
apt-get install libevent-dbg valgrind
-Building
---------
+3.2) Getting the code
+---------------------
+
+Given you are reading this file, you probably have the code already, but just to
+be sure:
+
+Currently the krsd code is hosted on github.
+
+You can browse it online, at:
+
+ https://github.com/arpa2/krsd
+
+or close it using git:
+
+ git clone git://github.com/arpa2/krsd.git
+
+Note that krsd itself is a clone of rs-serve, with the distinctive
+feature that krsd introduces Kerberos authentication.
+
+3.3) Building
+-------------
Given you have all dependencies installed, simply run
and you should be good to go.
-Installing
-----------
+3.4) Installing system-wide
+---------------------------
-To install the rs-serve binary to /usr/bin, type:
+To install the krsd binary to /usr/bin, run
make install
-To install it somewhere else, tweak the Makefile first.
+as a privileged user.
+
+To install somewhere else, tweak the Makefile first.
+
+This will also install an init script to /etc/init.d/krsd and a default
+configuration to /etc/default/krsd.
+
+On Debian based systems (i.e. when "update-rc.d" is present), "make install"
+will also install the krsd init script into /etc/rc*.d/.
+
+3.5) Setting options
+--------------------
+
+There are a variety of options
+
+If you want to use the init script, you can set options in /etc/default/krsd,
+otherwise just pass them on the command line.
+
+Run:
+
+ krsd --help
+
+to get a list of supported options.
+
+3.6) Integrating authorization
+------------------------------
+
+To integrate an authorization endpoint, you need to do two things:
+
+* configure endpoint URI
+
+ Set the --auth-uri option to a printf style format string. "%s" will be
+ replaced with the username.
+
+* configure your authorization endpoint to manage krsd tokens
+
+ krsd doesn't care where tokens come from, but it need to know them to
+ decide whether a given request is authorized or not. It maintains an internal
+ store for authorizations (i.e. structures of [user-name, token, scopes]),
+ which must be managed from the outside.
+
+ The tools to do this are:
+
+ * rs-add-token:
+
+ Usage: rs-add-token <user> <token> <scope1> [<scope2> ... <scopeN>]
+
+ - <user> is the login name of the user (krsd must be able to resolve
+ it using getpwnam() in order to find the home directory)
+ - <token> is the token string authenticating future requests. For krsd
+ it is an opaque string.
+ - <scope1>..<scopeN> are scope strings in the same form as described in
+ draft-dejong-remotestorage-01, Section 9.
+
+ * rs-remove-token:
+
+ Usage: rs-remove-token <user> <token>
+
+ <user> and <token> must both be given.
+ If the token cannot be found, rs-remove-token terminates with non-zero status.
+
+ * rs-list-tokens:
+
+ Lists all currently installed tokens and their respective scopes.
+
+ The output format is primarily meant for (human) debugging and subject to change.
+
+4) Contributing
+---------------
+
+* Note that krsd is a fork of the original admirable work named rs-serve.
+ Never blame the rs-serve authors for mistakes that I (Rick) have added.
+ And don't expect them to support my mistakes!
+
+* If you've found a bug, or have any questions, please open an issue on github:
+ https://github.com/remotestoage/rs-serve/issues
+
+* If you want to contribute, fork the project on github and send pull requests.
+
+* In any case, don't hesitate to talk with us on IRC:
+ #remotestorage and #unhosted, both on irc.freenode.org
+
+ Webchat links:
+ - #unhosted: http://webchat.freenode.net/?channels=unhosted
+ - #remotestorage: http://webchat.freenode.net/?channels=remotestorage