Every now and then I come across something that mentions how you should use PKI tokens in keystone as the cryptography gives it better security.
It happened today and so I thought I should clarify:
There is no added security benefit to using keystone with PKI tokens over UUID tokens.
There are advantages to PKI tokens:
Token validation without a request to keystone means less impact on keystone.
And there are disadvantages:
Larger token size.
Additional complexity to set up.
However the fundamental model, that this opaque chunk of data in the ‘X-Auth-Token’ header indicates that this request is authenticated, does not change between PKI and UUID tokens.
If someone steals your PKI token you are just as screwed as if they stole your UUID token.
In the last post I did on keystoneclient sessions there was a lot of hand waving about how they would work, because the code was not yet merged.
Standardizing clients has received some more attention again recently - and now that the sessions are more mature and ready it seems like a good opportunity to explain them and how to use them again.
For those of you new to this area the clients have grown very organically, generally forking off some existing client and adding and removing features in ways that worked for that project.
Whilst this is in general a problem for user experience (try to get one token and use it with multiple clients without reauthenticating) it is a nightmare for security fixes and new features as they need to be applied individually across each client.
Sessions are an attempt to extract a common authentication and communication layer from the existing clients so that we can handle transport security once, and keystone and deployments can add new authentication mechanisms without having to do it for every client.
Sessions and authentication plugins are user-facing objects that you create and pass to a client; they are public objects, not a framework internal to the existing clients.
They require a change in how you instantiate clients.
The first step is to create an authentication plugin. The plugins currently available cover the primary user/password and token authentication mechanisms that keystone supports in v2 and v3, plus the test case where you know the endpoint and token in advance.
The parameters will vary depending upon what is required to authenticate with each.
Plugins don’t need to live in keystoneclient; we are currently in the process of setting up a new repository for kerberos authentication so that it will be an optional dependency.
There are also some plugins living in the contrib section of keystoneclient for federation that will also likely be moved to a new repository soon.
Keystone and nova clients will now share an authentication token fetched with keystone’s v3 authentication.
The clients will authenticate on the first request and will re-authenticate automatically when the token expires.
This is a fundamental shift from the existing clients, which would authenticate internally and on creation, so by opting to use sessions you are acknowledging that some methods won’t work like they used to.
For example keystoneclient had an authenticate() function that would save the details of the authentication (user_id etc) on the client object.
This process is no longer controlled by keystoneclient and so this function should not be used; however it also cannot be removed, because we need to remain backwards compatible with existing client code.
In converting the existing clients we consider that passing a Session means that you are acknowledging that you are using new code and are opting-in to the new behaviour.
This will not affect the 90% of users who just make calls to the APIs. However, if you have hacks in place to share tokens between the existing clients, or you overwrite variables on the clients to force different behaviours, then these will probably break.
The above flow is useful for users who want their one token shared between one or more clients.
If you are an application that uses many authentication plugins (e.g. heat or horizon) you may want to take advantage of a single session’s connection pooling or caching whilst juggling multiple authentications.
You can therefore create a session without an authentication plugin and specify the plugin that will be used with that client instance, for example:
```python
global SESSION

if not SESSION:
    SESSION = ksc_session.Session()

auth = get_auth_plugin()  # you could deserialize it from a db,
                          # fetch it based on a cookie value...
keystone = keystone_v3.Client(session=SESSION, auth=auth)
```
Auth plugins set on the client will override any auth plugin set on the session - but I’d recommend you pick one method based on your application’s needs and stick with it.
Loading from a config file
There is support for loading session and authentication plugins from an oslo.config CONF object.
The documentation on exactly what options are supported is lacking right now and you will probably need to look at code to figure out everything that is supported.
I promise to improve this, but to get you started you need to register the options globally:
```python
group = 'keystoneclient'  # the option group

keystoneclient.session.Session.register_conf_options(CONF, group)
keystoneclient.auth.register_conf_options(CONF, group)
```
There is an ongoing effort to create a standardized CLI plugin that can be used by new clients rather than have people provide an --os-auth-plugin every time.
It is not yet ready, however clients can create and specify their own default plugins if --os-auth-plugin is not provided.
For Client Authors
To make use of the session in your client there is the keystoneclient.adapter.Adapter which provides you with a set of standard variables that your client should take and use with the session.
The adapter handles the per-client authentication plugins, along with region_name, interface, user_agent and similar client parameters that are not part of the more global (across many clients) state that sessions hold.
The adapter then has .get() and .post() and other http methods that the clients expect.
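To make the shape concrete, here is a self-contained sketch of that pattern. The FakeSession and the exact attribute handling are my illustration of the idea, not keystoneclient's actual code:

```python
class FakeSession(object):
    """Stand-in for the shared session object (illustration only)."""

    def request(self, url, method, **kwargs):
        # the real session would resolve the endpoint from the service
        # catalog and attach a token; here we just echo what we were asked
        return (method, url, kwargs)


class Adapter(object):
    """Sketch of the role keystoneclient.adapter.Adapter plays."""

    def __init__(self, session, service_type=None, interface=None,
                 region_name=None, user_agent=None):
        self.session = session
        self.service_type = service_type
        self.interface = interface
        self.region_name = region_name
        self.user_agent = user_agent

    def request(self, url, method, **kwargs):
        # fill in the client-specific defaults the session doesn't hold;
        # explicitly passed arguments win over the adapter's defaults
        for attr in ('service_type', 'interface', 'region_name',
                     'user_agent'):
            if getattr(self, attr) is not None:
                kwargs.setdefault(attr, getattr(self, attr))
        return self.session.request(url, method, **kwargs)

    def get(self, url, **kwargs):
        return self.request(url, 'GET', **kwargs)

    def post(self, url, **kwargs):
        return self.request(url, 'POST', **kwargs)


# a client author then only ever touches the adapter
adapter = Adapter(FakeSession(), service_type='identity',
                  interface='admin', region_name='RegionOne')
method, url, kwargs = adapter.get('/users')
```

The point of the indirection is that per-client state lives on the adapter while cross-client state (tokens, the catalog, connection pooling) stays on the one shared session.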
It’s great to have renewed interest in standardizing client behaviour, and I’m thrilled to see better session adoption.
The code has matured to the point it is usable and simplifies use for both users and client authors.
Having just released v0.5 of requests-mock, and having it used by both keystoneclient and novaclient with others in the works, I thought I’d finally do a post explaining what it is and how to use it.
I was the person who brought HTTPretty into the OpenStack requirements.
The initial reason for this was that keystoneclient was transitioning from the httplib library to requests and I needed to prove that there were no changes to the HTTP requests during the transition.
HTTPretty is a way to mock HTTP responses at the socket level, so it is not dependent on the HTTP library you use, and for this it was fairly successful.
As part of that transition I converted all the unit tests so that they actually traversed through to the requesting layer and found a number of edge case bugs because the responses were being mocked out above this point.
I have therefore advocated that the clients convert to mocking at the request layer rather than stubbing out returned values.
I’m pretty sure that this doesn’t adhere strictly to the unit testing philosophy of testing small isolated changes, but our client libraries aren’t that deep and I’d honestly prefer to just test the whole way through and find those edge cases.
Having done this made it remarkably easier to transition to using sessions in the clients as well, because we are testing the whole path down to making HTTP requests for all the resource tests, so again we have assurances that the HTTP requests being sent are equivalent.
At the same time we’ve had a number of problems with HTTPretty:
It was the lingering last requirement for getting Python 3 support. Thanks to Cyril Roelandt for finally getting that fixed.
For various reasons it is difficult for the distributions to package.
It has a bad habit of doing backwards incompatible, or simply broken releases. The current requirements string is: httpretty>=0.8.0,!=0.8.1,!=0.8.2,!=0.8.3
Because it acts at the socket layer it doesn’t always play nicely with other things using the socket. For example it has to be disabled for live memcache tests.
It pins its requirements on pypi.
Now I feel like I’m just ranting.
There are additional oddities I found in trying to fix these upstream but this is not about bashing HTTPretty.
requests-mock follows the same concepts allowing users to stub out responses to HTTP requests, however it specifically targets the requests library rather than stubbing the socket.
All the OpenStack clients have been converted to requests at this point, and for the general audience if you are writing HTTP code in Python you should be using requests.
Note: a lot of what is used in these examples is only available since the 0.5 release.
The current OpenStack requirements still have 0.4 so you’ll need to wait for some of the new syntax.
The intention of requests-mock is to work in as similar a way to requests itself as possible.
Hence all the variable names and conventions should be as close to a requests.Response as possible.
Note that because the callback was passed as the json parameter the return type is expected to be the same as if you had passed it as a predefined json=blob value.
If you wanted to return text the callback would be on the text parameter.
So rather than give a lot of examples I’ll just highlight some of the interesting things you can do with the library and how to do it.
Queue multiple responses for a URL; each element of the list is interpreted as the **kwargs for a response.
In this case every request other than the first will get a 401 error:
I am terrible at keeping my git branches in order.
Particularly since I work across multiple machines and forget where things are I will often have multiple branches with different names being different versions of the same review.
On a project I work on frequently I currently have 71 local branches which are a mix of my code, some code reviews, and some branches that were for trialling ideas.
git review at least prefixes branches it downloads with review/ but that doesn’t help to figure out what was happening with local branches labelled auth through auth-4.
However this post isn’t about me fixing my terrible habit; it’s about two git commands which help me work with the mess.
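The first is a listing of branches by last commit date. Set up here as a `git recent` alias of my own naming; the format string is my reconstruction of output like that described:

```shell
# `git recent`: local branches sorted by last commit, newest first,
# each prefixed with how long ago it was committed to
git config --global alias.recent \
  "for-each-ref --sort=-committerdate refs/heads/ --format='%(committerdate:relative)%09%(refname:short)'"
```

Run `git recent` inside any repository after setting the alias.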
This gives a nicely formatted list of branches in the project sorted by the last time they were committed to and how long ago it was.
So if I know I’m looking for a branch that I last worked on last week I can quickly locate those branches.
The next is a script to figure out which of my branches have made it through review and have been merged upstream which I called branch-merged.
Using git you can already call git branch --merged master to determine which branches are fully merged into the master branch.
However this won’t take into account if a later version of a review was merged, in which case I can probably get rid of that branch.
We can figure this out by using the Change-Id: field of our Gerrit reviews.
So print out the branches where all the Change-Ids are also in master.
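As a shell function it can be sketched like this (a reconstruction of the idea using Gerrit's Change-Id commit footer, not the author's exact script):

```shell
# branch-merged: print branches whose every Gerrit Change-Id already
# appears in master's history, i.e. some version of them was merged
branch_merged() {
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
        local merged=yes
        # every Change-Id introduced on this branch...
        for change in $(git log master.."$branch" --format='%b' |
                        sed -n 's/^Change-Id: //p'); do
            # ...must also appear somewhere in master's history
            git log master --format='%b' | grep -Fq "Change-Id: $change" ||
                merged=no
        done
        [ "$merged" = yes ] && echo "$branch"
    done
    return 0
}
```

Call `branch_merged` from inside the repository; branches it prints are candidates for deletion.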
It’s not greatly efficient and if you are working with code bases with long histories you might need to limit the depth, but given that it doesn’t run often it completes quickly enough.
There’s no guarantee that there wasn’t something new in those branches, but most likely it was an earlier review or test code that is no longer relevant.
I was considering a tool that could use the Commit-Id to figure out from gerrit if a branch is an exact match to one that was previously up for review and so contained no possibly useful experimenting code, but teaching myself to clean up branches as I go is probably a better use of my time.
This made sense in code when using httplib for communication where you use each of those independent pieces.
However we removed httplib a number of releases ago and now simply reconstruct the full URL in code in the form:
Keystoneclient has recently introduced a Session object.
The concept was discussed and generally accepted at the Hong Kong Summit: keystoneclient, as the root of authentication (and arguably security), should be responsible for transport (HTTP) and authentication across all the clients.
The majority of the functionality in this post is written and up for review but has not yet been committed.
I write this in an attempt to show the direction of clients as there is currently a lot of talk around projects such as the OpenStack-SDK.
When working with clients you would first create an authentication object, then create a session object with that authentication and then re-use that session object across all the clients you instantiate.
```python
from keystoneclient.auth.identity import v2
from keystoneclient import session
from keystoneclient.v2_0 import client

auth = v2.Password(auth_url='https://localhost:5000/v2.0',
                   username='user',
                   password='pass',
                   tenant_name='demo')
sess = session.Session(auth=auth,
                       verify='/path/to/ca.pem')
ksclient = client.Client(session=sess, region_name='RegionOne')
# other clients can be created sharing the sess parameter
```
Now whenever you want to make an authenticated request you just indicate it as part of the request call.
```python
# requests with authenticated=True are sent with a token
users = sess.get('http://localhost:35357/v2.0/users',
                 authenticated=True)
```
This was pretty much the extent of the initial proposal, however in working with the plugins I have come to realize that authentication is responsible for much more than simply getting a token.
A large part of the data in a keystone token is the service catalog.
This is a listing of the services known to an OpenStack deployment and the URLs that we should use when accessing those services.
Because of the disjointed way in which clients have been developed this service catalog is parsed by each client to determine the URL with which to make API calls.
With a session object in control of authentication and the service catalog there is no reason for a client to know its URL, just what it wants to communicate.
The values of service_type and endpoint_type are well known and constant for a client; region_name is generally passed in when instantiating (if required).
Requests made via the client object have these parameters added automatically, so given the client from above, a plain ksclient.get('/users') is exactly equivalent to the fully specified session call.
Where I feel that this will really begin to help though is in dealing with the transition between API versions.
Currently deployments of OpenStack put a versioned endpoint in the service catalog, e.g. http://localhost:5000/v2.0 for identity.
This made sense initially, however as we now try to transition people to the v3 identity API we find that there is no backwards compatible way to advertise both the v2 and v3 services.
The agreed long-term solution is that entries in the service catalog should not be versioned, e.g. http://localhost:5000, as the root path of a service will list the available versions.
So how do we handle this transition across the 8+ clients?
```python
try:
    users = sess.get('/users',
                     authenticated=True,
                     service_type='identity',
                     endpoint_type='admin',
                     region_name='RegionOne',
                     version=(2, 0))  # just specify the version you need
except keystoneclient.exceptions.EndpointNotFound:
    logging.error('No v2 identity endpoint available', exc_info=True)
```
This solution also means that when we have a suitable hack for the transition to unversioned endpoints it needs only be implemented in one place.
Reliant on this is a means to discover the available versions of all the OpenStack services.
Turns out that in general the projects are similar enough in structure that it can be done with a few minor hacks.
For newer projects there is now a definitive specification on the wiki.
A major advantage of this common approach is we now have a standard way of determining whether a version of a project is available in this cloud.
Therefore we get client version discovery pretty much for free:
```python
if sess.is_available(service_type='identity', version=(2, 0)):
    ksclient = v2_0.client.Client(sess)
else:
    logging.error("Can't create a v2 identity client")
```
That’s a little verbose as a client knows that information, so we can extract a wrapper:
```python
ksclient = keystoneclient.client.Client(session=sess, version=(2, 0))
if ksclient:
    ...  # do stuff
```
So the session object has evolved from a pure transport level object and this departure is somewhat concerning as I don’t like mixing layers of responsibility.
However in practice we have standardized on the requests library to abstract much of this away and the Session object is providing helpers around this.
So, along with standardizing transport, by using the session object like this we can:
reduce the basic client down to an object consisting of a few variables indicating the service type and version required.
finally get a common service discovery mechanism for all the clients.
shift the problem of API version migration onto someone else - probably me.
Disclaimers and Notes
The examples provided above use keystoneclient and the ‘identity’ service purely because this is what has been implemented so far.
In terms of CRUD operations keystoneclient is essentially the same as the other clients in that it retrieves its endpoint from the service catalog and issues requests to it, so the approach will work equally well.
Currently none of the other clients rely upon the session object, I have been waiting on the inclusion of authentication plugins and service discovery before making this push.
Region handling is still a little awkward when using the clients.
I blame this completely on the fact that region handling is awkward on the servers.
In Juno we should have hierarchical regions and then it may make sense to allow region_name to be set on a session rather than per client.
I have often found that when dealing with multiple branches and refactoring patches I get caught out by left over *.pyc files from python files that don’t exist on this branch.
This bit me again recently so I went looking for options.
A useful environment variable that I found via some stackoverflow questions is: PYTHONDONTWRITEBYTECODE which, when set, prevents python from writing .pyc and .pyo files.
This is not something that I want to set permanently on my machine but is great for development.
The other tool I use for all my python projects is virtualenvwrapper which allows you to isolate project dependencies and environments in what I think is a more intuitive way than with virtualenv directly.
Armed with the simple idea that these two concepts should be able to work together I found I was not the first person to think of this.
There are other guides out there but the basic concept is simply to set PYTHONDONTWRITEBYTECODE when we activate a virtualenv and reset it when we deactivate it.
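Concretely, that is a couple of one-line hook files (the paths are virtualenvwrapper's defaults; use $VIRTUAL_ENV/bin instead to apply them to a single env):

```shell
# $WORKON_HOME/postactivate -- run after any virtualenv is activated
export PYTHONDONTWRITEBYTECODE=1

# $WORKON_HOME/postdeactivate -- run after it is deactivated
unset PYTHONDONTWRITEBYTECODE
```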
With the Havana release of OpenStack, Keystone gains the ability to issue and verify tokens “bound” to some authentication mechanism.
To understand the reason for this feature we need to first consider the security model of the current token architecture.
OpenStack tokens are what we call “bearer tokens”.
The term seems to have come out of the OAuth movement and means that whoever holds the token has all the rights associated with it.
This is not an uncommon situation on the Internet; it is the way basic auth (username and password), cookies, and session ids all work, and one of the reasons that SSL is so important when authenticating against a website.
If an attacker was to get your token then they have all the rights of that token for as long as it is valid, including permission to reissue a token or change your password.
While all of these mechanisms are symmetric secrets, they are only shared between two endpoints.
Keystone tokens are shared across all of the public services in an OpenStack deployment.
As OpenStack grows and this token is presented to an ever increasing list of services the vulnerability of this mechanism increases.
So what can we do about it?
The typical answer, particularly for the enterprise, is to use Kerberos or x509 client certificates.
This is a great solution but we don’t want to have each service dealing with different authentication mechanisms, that’s what Keystone does.
What is a “bound token”?
A “bound token” is a regular keystone token with some additional information that indicates that the token may only be used in conjunction with the specified external authentication mechanism.
Taking the example of Kerberos: when a token is issued, Keystone embeds the name of the Kerberos principal into the token.
When this token is then presented to another service the service notices the bind information and ensures that Kerberos authentication was used and that the same user is making the request.
So how does this help to protect token hijacking?
To give an example:
Alice connects to Keystone using her Kerberos credentials and gets a token.
Embedded within this token is her Kerberos principal name alice@ACME.COM.
Alice authenticates to HaaS (hacked as a service) using her token and Kerberos credentials and is allowed to perform her operations.
Bob, who has privileged access to HaaS, records the token that Alice presented to the service (or otherwise gets Alice’s token).
Bob attempts to connect to Keystone as Alice to change her password.
He connects to keystone with his own Kerberos credentials bob@ACME.COM.
Because these credentials do not match the ones that were present when the token was created his access is disallowed.
It does not necessarily mean that the user initially authenticated themselves by their Kerberos credentials; they may have used their regular username and password.
It simply means that the user who created the token has said that they are also the owner of this Kerberos principal (note that it is tied to the principal, not a ticket, so it will survive ticket re-issuing) and the token should not be authenticated in future without it present.
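The checking side can be sketched roughly like this. This is an illustration of the model described above, not keystone's actual middleware code; the {'kerberos': principal} bind layout follows the example:

```python
class Unauthorized(Exception):
    pass


def check_token_bind(token, environ, enforce='permissive'):
    """Sketch of bind enforcement for an already-validated token."""
    bind = token.get('bind', {})

    if not bind:
        if enforce == 'required':
            raise Unauthorized('token bind information is required')
        return  # nothing to check

    for mechanism, identity in bind.items():
        if mechanism == 'kerberos':
            # the web server (e.g. mod_auth_kerb) puts the authenticated
            # principal in REMOTE_USER; it must match the token's owner
            if environ.get('REMOTE_USER') != identity:
                raise Unauthorized('kerberos principal mismatch')
        elif enforce in ('strict', 'required'):
            raise Unauthorized('unknown bind type: %s' % mechanism)
        # permissive: unknown bind types are ignored


# Alice presenting her own bound token succeeds
token = {'bind': {'kerberos': 'alice@ACME.COM'}}
check_token_bind(token, {'REMOTE_USER': 'alice@ACME.COM'})
```

With this in place, Bob presenting Alice's token alongside his own bob@ACME.COM credentials raises Unauthorized, exactly as in the story above.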
What is implemented?
Currently tokens issued from Keystone can be bound to a Kerberos principal.
Extending this mechanism to x509 client certificates should be a fairly simple exercise but will not be included in the Havana release.
A patch to handle bind checking in auth_token middleware is currently under review to bring checking to other services.
There are however a number of problems with enforcing bound tokens today:
Kerberos authentication is not supported by the eventlet http server (the server that drives most of the OpenStack web services), and so there is no way to authenticate to the server to provide the credentials.
This essentially restricts bind checking to services running in httpd, which to the best of my knowledge is currently only keystone and swift.
None of the clients currently support connecting with Kerberos authentication.
The option was added to Keystoneclient as a proof of concept but I am hoping that this can be solved across all clients by standardizing the way they communicate rather than having to add and maintain support in each individual client.
There will also be the issue of how to configure the servers to use these clients correctly.
Kerberos tickets are issued to users, not hosts, and typically expire after a period of time.
To allow unattended servers to have valid Kerberos credentials requires a way of automatically refreshing or fetching new tickets.
I am told that there is support for this scenario coming in Fedora 20 but I am not sure what it will involve.
Configuring Token Binding
The new argument to enable token binding in keystone.conf is:
# External auth mechanisms that should add bind information to token.
# eg kerberos, x509
bind = kerberos
As mentioned, only the value kerberos is currently supported here.
One of the next supported mechanisms will be x509 client certificates.
To enable enforcement of token binding, set in keystone.conf:
# Enforcement policy on tokens presented to keystone with bind information.
# One of disabled, permissive, strict, required or a specifically required bind
# mode e.g. kerberos or x509 to require binding to that authentication.
enforce_token_bind = permissive
As illustrated by the comments the possible values here are:
disabled: Disables token bind checking.
permissive: Token bind information will be verified if present.
If there is bind information for a token and the server does not know how to verify that information then it will be ignored and the token will be allowed.
This is the new default value and should have no effect on existing systems.
strict: Like permissive but if unknown bind information is present then the token will be rejected.
required: Tokens will only be allowed if bind information is present and verified.
A named bind mode: like required, but the token must be bound via that particular mechanism.
The only currently available value here is kerberos, indicating that a token must be bound to a Kerberos principal to be accepted.
For a deployment with access to a Kerberos or x509 infrastructure, token binding will dramatically increase your users’ security.
Unfortunately the limitations of Kerberos within OpenStack don’t really make this a viable deployment option in Havana.
Watch this space however as we add x509 authentication and binding, and improve Kerberos handling throughout.
Keystone has been slowly pushing away from being deployed with Eventlet and the keystone-all script in favour of the more traditional httpd mod_wsgi application method.
There has been discussion of Eventlet’s place in OpenStack before, and its (mis)use has led to numerous subtle bugs and problems; however, in my opinion the most important reasons for Keystone to move away from Eventlet are:
Eventlet does not support Kerberos authentication.
pyOpenSSL only releases the GIL around some SSL verification commands.
This leads to a series of hacks to prevent long running crypto commands blocking Eventlet threads and thus the entire Keystone process.
There are already a lot of httpd authentication/authorization plugins that we could make use of in Keystone.
It’s faster to have things handled by httpd modules in C than in Python.
Keystone has shipped with sample WSGI scripts and httpd configuration files since Folsom, and documentation for how to use them is available; however most guides and service wrappers (upstart, systemd etc) will use the keystone-all method.
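For reference, a minimal vhost of the kind those sample files describe might look like this. The paths, port and process names here are assumptions for illustration, not the shipped sample:

```apache
# keystone public API served by mod_wsgi (illustrative paths)
Listen 5000

<VirtualHost *:5000>
    WSGIDaemonProcess keystone-public user=keystone processes=2 threads=10
    WSGIProcessGroup keystone-public
    WSGIScriptAlias / /var/www/cgi-bin/keystone/main
    ErrorLog /var/log/httpd/keystone-public.log
</VirtualHost>
```

A second vhost on the admin port (35357) pointing at the admin WSGI script completes the picture.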
To get some wider adoption and understanding of the process I’ve just added Keystone with httpd support into devstack.
Put APACHE_ENABLED_SERVICES=key in your localrc or environment variables and re-run ./stack.sh to try it out.
P.S. Swift can also be deployed this way by adding swift to the (comma separated) services list.
There has been interest recently in porting novaclient’s authentication plugin system to the rest of the OpenStack client libraries and moving the plugins into keystoneclient.
At a similar time Alessio Ababilov started trying to introduce the concept of a common base client into keystoneclient.
This is a fantastic idea and one that is well supported by the Keystone, Oslo and I’m sure other teams.
I’ve been referring to this move as APIClient as that is the name of the folder in Oslo code.
At its core is a change in how clients communicate that will result in some significant changes to the base client objects and incorporate these plugins.
Keystone is interested in handling how communication is managed within OpenStack, not just for tokens but as we bring in client certificate and kerberos authentication it will need to have influence over the requests being sent.
After discussing the situation with Alessio he agreed to let me take his base work and start the process of getting these changes into keystoneclient with the intent that this pattern be picked up by other OpenStack clients.
This has unfortunately been a slower process than I would have liked and I think it is hampered by a lack of clear understanding in what is trying to be achieved, which I hope to address with this post.
What follows is in the vein of Alessio’s ideas and definitely a result of his work but is my own interpretation of the problem and the implementation has been rewritten from that initial work.
Most OpenStack clients have the concept of a HTTPClient which abstracts the basic communication with a server, however projects differ in what this object is and how it is used.
Novaclient creates an instance of a HTTPClient object which it saves as self.client (for yet another candidate for what a client object is).
Much of what the novaclient object does then in terms of setting and using authentication plugins is simply a wrapper around calls to the HTTPClient object.
Managers (the part of a client responsible for a resource, e.g. user or project) are provided with a reference to the base client object (this time saved as api) and so make requests in the form self.api.client.get.
Keystoneclient subclasses HTTPClient and managers make calls in the form self.api.get.
Other projects can go either way depending on which client they were using as reference.
My guess here is that when keystoneclient was initially split out from novaclient the subclassing of HTTPClient was intentional, such that keystoneclient would provide an authenticated HTTPClient that novaclient would use.
Keystoneclient however has its own managers and requirements and the projects have sufficiently diverged so that it no longer fits into this role.
To this day novaclient does not use keystoneclient (in any way) and introduced authentication plugins instead.
If there is going to be a common communication framework then there must be a decision between:
Standardizing on a common base client class that is capable of handling communication (as keystoneclient does).
Creating a standalone communication object that clients make use of (as novaclient does).
The APIClient design goes for the latter.
We create a communication object that can be used by any type of client and be reused by different instances of clients (which novaclient does not currently allow).
This communication object is passed between clients deprecating some of the ever increasing list of parameters passed to clients and changes the flow from authenticating a client to authenticating a channel that clients can make use of.
This centralizes authentication and token fetching (including kerberos and client certs), catalog management and endpoint selection and will let us address caching, HTTP session management etc in the future.
In the initial APIClient this object was the new HTTPClient, however this term was so abused I am currently using ClientSession (as it is built on the requests library and is similar to the requests.Session concept) but debate continues.
This is where authentication plugins will live so that any communication through a ClientSession object can request a token added from the plugin.
Maintaining the plugin architecture is preferred here to simply having multiple ClientSession subclasses to allow independent storing and caching of authentication, plugin discovery, and changing or renewing authentication.
It is obviously a little longer than the current method but I’m sure that the old syntax can be maintained for when you only need a single client.
Implementations of this are starting to go into review on keystoneclient.
For the time being some features from nova such as authentication plugins specifying CLI arguments are not being considered until we can ensure that the new system meets at least the current functionality.
The major problem found so far is maintaining API compatibility.
Much of what will be moved is currently defined publicly on keystoneclient and cannot simply be thrown away, even though it is typically attributes and abstract functions that a user should have no need of.
Hopefully this or something very similar will be coming to the various OpenStack clients soon.