title: Apache Mesos - Authorization layout: documentation
Authorization
In Mesos, the authorization subsystem allows the operator to configure the
actions that certain principals are allowed to perform. For example, the
operator can use authorization to ensure that principal foo can only register
frameworks subscribed to role bar, and no other principals can register
frameworks subscribed to any roles.
The reference implementation, the local authorizer, provides basic security for
most use cases. This authorizer is configured using Access Control Lists (ACLs).
Alternative implementations could express their authorization rules in
different ways. The local authorizer is used if the --authorizers flag is not
specified (or is manually set to the default value local) and ACLs are specified
via the --acls flag.
This document is divided into two main sections. The first section explores the concepts necessary to successfully configure the local authorizer. The second briefly discusses how to implement a custom authorizer; this section is not directed at operators but at engineers who wish to build their own authorizer back end.
HTTP Executor Authorization
When the agent's --authenticate_http_executors flag is set, HTTP executors are
required to authenticate with the HTTP executor API. When they do so, a simple
implicit authorization rule is applied. In plain language, the rule states that
executors can only perform actions on themselves. More specifically, an
executor's authenticated principal must contain claims with keys fid, eid,
and cid, with values equal to the currently-running executor's framework ID,
executor ID, and container ID, respectively. By default, an authentication token
containing these claims is injected into the executor's environment (see the
authentication documentation for more information).
Similarly, when the agent's --authenticate_http_readwrite flag is set, HTTP
executors are required to authenticate with the HTTP operator API when making
calls such as LAUNCH_NESTED_CONTAINER. In this case, executor authorization is
performed via the loaded authorizer module, if present. The default Mesos local
authorizer applies a simple implicit authorization rule, requiring that the
executor's principal contain a claim with key cid and a value equal to the
currently-running executor's container ID.
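The two implicit rules above amount to simple claim comparisons. The following is a minimal sketch, not Mesos code: the function names, the plain-dict representation of the claims, and the sample IDs are assumptions made for illustration. Only the claim keys fid, eid, and cid come from the text.

```python
def executor_api_allowed(claims, framework_id, executor_id, container_id):
    """Implicit rule for the HTTP executor API: all three claims must match."""
    expected = {"fid": framework_id, "eid": executor_id, "cid": container_id}
    return all(claims.get(key) == value for key, value in expected.items())


def operator_api_allowed(claims, container_id):
    """Implicit rule applied by the local authorizer to executor calls on the
    HTTP operator API: only the container ID claim is compared."""
    return claims.get("cid") == container_id


# Hypothetical IDs, for illustration only.
claims = {"fid": "framework-1", "eid": "executor-1", "cid": "container-1"}
print(executor_api_allowed(claims, "framework-1", "executor-1", "container-1"))
print(operator_api_allowed(claims, "container-2"))
```

The first call compares all three claims against the running executor; the second shows a mismatch on the container ID, which the rule rejects.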
Local Authorizer
Role vs. Principal
A principal identifies an entity (i.e., a framework or an operator) that interacts with Mesos. A role, on the other hand, is used to associate resources with frameworks in various ways. A useful analogy can be made with user management in the Unix world: principals correspond to usernames, while roles approximately correspond to groups. For more information about roles, see the roles documentation.
In a real-world organization, principals and roles might be used to represent various individuals or groups; for example, principals could correspond to people responsible for particular frameworks, while roles could correspond to departments within the organization which run frameworks on the cluster. To illustrate this point, consider a company that wants to allocate datacenter resources amongst multiple departments, one of which is the accounting department. Here is a possible scenario in which the accounting department launches a Mesos framework and then attempts to destroy a persistent volume:
- An accountant launches their framework, which authenticates with the Mesos master using its principal and secret. Here, let the framework principal be payroll-framework; this principal represents the trusted identity of the framework.
- The framework now sends a registration message to the master. This message includes a FrameworkInfo object containing a principal and roles; in this case, it will use a single role named accounting. The principal in this message must be payroll-framework, to match the one used by the framework for authentication.
- The master consults the local authorizer, which in turn looks through its ACLs to see if it has a RegisterFramework ACL which authorizes the principal payroll-framework to register with the accounting role. It does find such an ACL, so the framework registers successfully. Now that the framework is subscribed to the accounting role, any weights, reservations, persistent volumes, or quota associated with the accounting department's role will apply when allocating resources to this role within the framework. This allows operators to control the resource consumption of this department.
- Suppose the framework has created a persistent volume on an agent which it now wishes to destroy. The framework sends an ACCEPT call containing an offer operation which will DESTROY the persistent volume.
- However, datacenter operators have decided that they don't want the accounting frameworks to delete volumes. Rather, the operators will manually remove the accounting department's persistent volumes to ensure that no important financial data is deleted accidentally. To accomplish this, they have set a DestroyVolume ACL which asserts that the principal payroll-framework can destroy volumes created by a creator_principal of NONE; in other words, this framework cannot destroy persistent volumes, so the operation will be refused.
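The two ACLs in this scenario could look roughly like the following. This is a sketch assembled from the field names used in the examples later in this document (register_frameworks, destroy_volumes, creator_principals); the operators' actual configuration is not shown in the text, so the exact content is an assumption, and permissive is left at its default.

```python
import json

# Hypothetical ACLs for the accounting scenario described above.
acls = {
    "register_frameworks": [
        {
            "principals": {"values": ["payroll-framework"]},
            "roles": {"values": ["accounting"]},
        }
    ],
    "destroy_volumes": [
        {
            # Volumes created by NONE: the framework may destroy no volumes.
            "principals": {"values": ["payroll-framework"]},
            "creator_principals": {"type": "NONE"},
        }
    ],
}

# Serialize to the JSON text that would be supplied via the --acls flag.
acls_json = json.dumps(acls, indent=2)
print(acls_json)
```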
ACLs
When authorizing an action, the local authorizer proceeds through a list of
relevant rules until it finds one that can either grant or deny permission to
the subject making the request. These rules are configured with Access Control
Lists (ACLs) in the case of the local authorizer. The ACLs are defined with a
JSON-based language via the --acls flag.
Each ACL consists of an array of JSON objects. Each of these objects has two
entries. The first, principals, is common to all actions and describes the
subjects which wish to perform the given action. The second entry varies among
actions and describes the object on which the action will be executed. Both
entries are specified with the same type of JSON object, known as Entity. The
local authorizer works by comparing Entity objects, so understanding them is
key to writing good ACLs.
An Entity is essentially a container which can either hold a particular value
or specify the special types ANY or NONE.
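As a quick illustration of the Entity shapes used throughout this document, the sketch below checks whether a JSON object is one of them. It is an assumption-laden helper, not part of Mesos: is_valid_entity is a hypothetical name, and it only recognizes the forms that appear in this document (a "type" of ANY or NONE, or a "values" list of strings). The actual protobuf definition may accept other forms.

```python
def is_valid_entity(entity):
    """Accept {"type": "ANY"}, {"type": "NONE"}, or {"values": [str, ...]}."""
    if not isinstance(entity, dict):
        return False
    if set(entity) == {"type"}:
        # The special types: every value, or no value at all.
        return entity["type"] in ("ANY", "NONE")
    if set(entity) == {"values"}:
        values = entity["values"]
        return isinstance(values, list) and all(isinstance(v, str) for v in values)
    return False


print(is_valid_entity({"type": "ANY"}))
print(is_valid_entity({"values": ["foo", "bar"]}))
print(is_valid_entity({"type": "admin"}))
```

A check like this catches a common slip: putting a principal name in "type" instead of listing it under "values".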
A global field which affects all ACLs can be set. This field is called
permissive and it defines the behavior when no ACL applies to the request
made. If set to true (the default), all non-matching requests are allowed; if
set to false, all non-matching requests are rejected.
Note that when setting permissive to false, a number of standard operations
(e.g., run_tasks or register_frameworks) will require ACLs in order to work.
There are two ways to disallow unauthorized uses on specific operations:
- Leave permissive set to true and disallow ANY principal to perform actions on all objects except the ones explicitly allowed. Consider the example below for details.
- Set permissive to false but allow ANY principal to perform the action on ANY object. This needs to be done for all actions which should work without being checked against ACLs. A template doing this for all actions can be found in acls_template.json.
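The second approach can be sketched as a small generator. This is an illustration only and not a replacement for acls_template.json: allow_unchecked and OBJECT_FIELD are hypothetical names, and the mapping covers just a few actions whose object field names appear in the examples of this document.

```python
import json

# Object field per action, taken from the examples in this document.
OBJECT_FIELD = {
    "register_frameworks": "roles",
    "run_tasks": "users",
    "teardown_frameworks": "framework_principals",
    "get_endpoints": "paths",
}


def allow_unchecked(actions):
    """Build ACLs with permissive set to false, plus one ANY/ANY rule for
    each action that should keep working without further checks."""
    acls = {"permissive": False}
    for action in actions:
        acls[action] = [
            {
                "principals": {"type": "ANY"},
                OBJECT_FIELD[action]: {"type": "ANY"},
            }
        ]
    return acls


print(json.dumps(allow_unchecked(["register_frameworks", "run_tasks"]), indent=2))
```

Actions left out of the list have no rule, so with permissive set to false every request for them is rejected.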
More information about the structure of the ACLs can be found in their definition inside the Mesos source code.
ACLs are compared in the order that they are specified. In other words,
if an ACL allows some action and a later ACL forbids it, the action is
allowed; likewise, if the ACL forbidding the action appears earlier than the
one allowing the action, the action is forbidden. If no ACLs match a request,
the request is authorized if the ACLs are permissive (which is the default
behavior). If permissive is explicitly set to false, all non-matching requests
are declined.
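The first-match behavior can be modeled in a few lines. The sketch below is a simplified model inferred from the examples in this document, not the Mesos implementation: the function names are hypothetical, each request carries a single principal and a single object value, and a NONE entity is treated as applying to every request while never granting it.

```python
def applies(entity, value):
    """A rule entity applies to a request value if it is ANY or NONE,
    or if it lists the value explicitly."""
    if entity.get("type") in ("ANY", "NONE"):
        return True
    return value in entity.get("values", [])


def grants(entity, value):
    """ANY grants everything, NONE grants nothing, and a value list
    grants only its members."""
    if entity.get("type") == "ANY":
        return True
    if entity.get("type") == "NONE":
        return False
    return value in entity.get("values", [])


def authorize(acls, action, object_field, principal, obj):
    for rule in acls.get(action, []):
        subject, target = rule["principals"], rule[object_field]
        if applies(subject, principal) and applies(target, obj):
            # The first applicable rule decides; later rules are ignored.
            return grants(subject, principal) and grants(target, obj)
    # No rule applied: fall back to the permissive setting (default true).
    return acls.get("permissive", True)


deny_first = {"teardown_frameworks": [
    {"principals": {"type": "NONE"}, "framework_principals": {"type": "ANY"}},
    {"principals": {"values": ["admin"]}, "framework_principals": {"type": "ANY"}},
]}
allow_first = {"teardown_frameworks": list(reversed(deny_first["teardown_frameworks"]))}

args = ("teardown_frameworks", "framework_principals", "admin", "some-framework")
print(authorize(deny_first, *args))   # False: the NONE rule is reached first
print(authorize(allow_first, *args))  # True: the admin rule is reached first
```

Swapping the two rules changes the outcome for admin, which is the ordering effect described above.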
Authorizable Actions
Currently, the local authorizer configuration format supports the following entries, each representing an authorizable action:
Action Name | Subject | Object | Description |
---|---|---|---|
register_frameworks | Framework principal. | Resource roles of the framework. | (Re-)registering of frameworks. |
run_tasks | Framework principal. | UNIX user to launch the task as. | Launching tasks/executors by a framework. |
teardown_frameworks | Operator username. | Principals whose frameworks can be shut down by the operator. | Tearing down frameworks. |
reserve_resources | Framework principal or Operator username. | Resource role of the reservation. | Reserving resources. |
unreserve_resources | Framework principal or Operator username. | Principals whose resources can be unreserved by the operator. | Unreserving resources. |
create_volumes | Framework principal or Operator username. | Resource role of the volume. | Creating volumes. |
destroy_volumes | Framework principal or Operator username. | Principals whose volumes can be destroyed by the operator. | Destroying volumes. |
resize_volume | Framework principal or Operator username. | Resource role of the volume. | Growing or shrinking persistent volumes. |
create_block_disks | Framework principal. | Resource role of the block disk. | Creating a block disk. |
destroy_block_disks | Framework principal. | Resource role of the block disk. | Destroying a block disk. |
create_mount_disks | Framework principal. | Resource role of the mount disk. | Creating a mount disk. |
destroy_mount_disks | Framework principal. | Resource role of the mount disk. | Destroying a mount disk. |
get_quotas | Operator username. | Resource role whose quota status will be queried. | Querying quota status. |
update_quotas | Operator username. | Resource role whose quota will be updated. | Modifying quotas. |
view_roles | Operator username. | Resource roles whose information can be viewed by the operator. | Querying roles and weights. |
get_endpoints | HTTP username. | HTTP endpoints the user should be able to access using the HTTP "GET" method. | Performing an HTTP "GET" on an endpoint. |
update_weights | Operator username. | Resource roles whose weights can be updated by the operator. | Updating weights. |
view_frameworks | HTTP user. | UNIX user of whom frameworks can be viewed. | Filtering HTTP endpoints. |
view_executors | HTTP user. | UNIX user of whom executors can be viewed. | Filtering HTTP endpoints. |
view_tasks | HTTP user. | UNIX user of whom tasks can be viewed. | Filtering HTTP endpoints. |
access_sandboxes | Operator username. | Operating system user whose executor/task sandboxes can be accessed. | Accessing task sandboxes. |
access_mesos_logs | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny access to the log. | Accessing Mesos logs. |
register_agents | Agent principal. | Implicitly given. A user should only use types ANY and NONE to allow/deny agent (re-)registration. | (Re-)registration of agents. |
get_maintenance_schedules | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny viewing the maintenance schedule. | Viewing the maintenance schedule of the machines used by Mesos. |
update_maintenance_schedules | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny modifying the maintenance schedule. | Modifying the maintenance schedule of the machines used by Mesos. |
start_maintenances | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny starting maintenance. | Starting maintenance on a machine. This will make a machine and its agents unavailable. |
stop_maintenances | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny ending maintenance. | Ending maintenance on a machine. |
get_maintenance_statuses | Operator username. | Implicitly given. A user should only use types ANY and NONE to allow/deny viewing maintenance status. | Viewing whether a machine is in maintenance or not. |
Authorizable HTTP endpoints
The get_endpoints action covers:
/files/debug
/logging/toggle
/metrics/snapshot
/slave(id)/containers
/slave(id)/containerizer/debug
/slave(id)/monitor/statistics
Examples
Consider for example the following ACL: Only principal foo can register
frameworks subscribed to the analytics role. All principals can register
frameworks subscribing to any other roles (including the principal foo since
permissive is the default behavior).
{
"register_frameworks": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["analytics"]
}
},
{
"principals": {
"type": "NONE"
},
"roles": {
"values": ["analytics"]
}
}
]
}
Principal foo can register frameworks subscribed to the analytics and
ads roles and no other role. Any other principal (or framework without
a principal) can register frameworks subscribed to any roles.
{
"register_frameworks": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["analytics", "ads"]
}
},
{
"principals": {
"values": ["foo"]
},
"roles": {
"type": "NONE"
}
}
]
}
Only principal foo and no one else can register frameworks subscribed to the
analytics role. Any other principal (or framework without a principal) can
register frameworks subscribed to any other roles.
{
"register_frameworks": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["analytics"]
}
},
{
"principals": {
"type": "NONE"
},
"roles": {
"values": ["analytics"]
}
}
]
}
Principal foo can register frameworks subscribed to the analytics role
and no other roles. No other principal can register frameworks subscribed to
any roles, including *.
{
"permissive": false,
"register_frameworks": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["analytics"]
}
}
]
}
In the following example permissive is set to false; hence, principals can
only run tasks as operating system users guest or bar, but not as any other
user.
{
"permissive": false,
"run_tasks": [
{
"principals": { "type": "ANY" },
"users": { "values": ["guest", "bar"] }
}
]
}
Principals foo and bar can run tasks as the agent operating system user
alice and no other user. No other principal can run tasks.
{
"permissive": false,
"run_tasks": [
{
"principals": { "values": ["foo", "bar"] },
"users": { "values": ["alice"] }
}
]
}
Principal foo can run tasks only as the agent operating system user guest
and no other user. Any other principal (or framework without a principal) can
run tasks as any user.
{
"run_tasks": [
{
"principals": { "values": ["foo"] },
"users": { "values": ["guest"] }
},
{
"principals": { "values": ["foo"] },
"users": { "type": "NONE" }
}
]
}
No principal can run tasks as the agent operating system user root. Any
principal (or framework without a principal) can run tasks as any other user.
{
"run_tasks": [
{
"principals": { "type": "NONE" },
"users": { "values": ["root"] }
}
]
}
The order in which the rules are defined is important. In the following
example, the ACLs effectively forbid anyone from tearing down frameworks even
though the intention clearly is to allow only admin to shut them down:
{
"teardown_frameworks": [
{
"principals": { "type": "NONE" },
"framework_principals": { "type": "ANY" }
},
{
"principals": { "values": ["admin"] },
"framework_principals": { "type": "ANY" }
}
]
}
The previous ACL can be fixed as follows:
{
"teardown_frameworks": [
{
"principals": { "values": ["admin"] },
"framework_principals": { "type": "ANY" }
},
{
"principals": { "type": "NONE" },
"framework_principals": { "type": "ANY" }
}
]
}
The ops principal can tear down any framework using the
/teardown HTTP endpoint. No other principal can
tear down any frameworks.
{
"permissive": false,
"teardown_frameworks": [
{
"principals": {
"values": ["ops"]
},
"framework_principals": {
"type": "ANY"
}
}
]
}
The principal foo can reserve resources for any role, and no other principal
can reserve resources.
{
"permissive": false,
"reserve_resources": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"type": "ANY"
}
}
]
}
The principal foo cannot reserve resources, and any other principal (or
framework without a principal) can reserve resources for any role.
{
"reserve_resources": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"type": "NONE"
}
}
]
}
The principal foo can reserve resources only for roles prod and dev, and
no other principal (or framework without a principal) can reserve resources for
any role.
{
"permissive": false,
"reserve_resources": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["prod", "dev"]
}
}
]
}
The principal foo can unreserve resources reserved by itself and by the
principal bar. The principal bar, however, can only unreserve its own
resources. No other principal can unreserve resources.
{
"permissive": false,
"unreserve_resources": [
{
"principals": {
"values": ["foo"]
},
"reserver_principals": {
"values": ["foo", "bar"]
}
},
{
"principals": {
"values": ["bar"]
},
"reserver_principals": {
"values": ["bar"]
}
}
]
}
The principal foo can create persistent volumes for any role, and no other
principal can create persistent volumes.
{
"permissive": false,
"create_volumes": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"type": "ANY"
}
}
]
}
The principal foo cannot create persistent volumes for any role, and any
other principal can create persistent volumes for any role.
{
"create_volumes": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"type": "NONE"
}
}
]
}
The principal foo can create persistent volumes only for roles prod and
dev, and no other principal can create persistent volumes for any role.
{
"permissive": false,
"create_volumes": [
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["prod", "dev"]
}
}
]
}
The principal foo can destroy volumes created by itself and by the principal
bar. The principal bar, however, can only destroy its own volumes. No other
principal can destroy volumes.
{
"permissive": false,
"destroy_volumes": [
{
"principals": {
"values": ["foo"]
},
"creator_principals": {
"values": ["foo", "bar"]
}
},
{
"principals": {
"values": ["bar"]
},
"creator_principals": {
"values": ["bar"]
}
}
]
}
The principal ops can query quota status for any role. The principal foo,
however, can only query quota status for foo-role. No other principal can
query quota status.
{
"permissive": false,
"get_quotas": [
{
"principals": {
"values": ["ops"]
},
"roles": {
"type": "ANY"
}
},
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["foo-role"]
}
}
]
}
The principal ops can update quota information (set or remove) for any role.
The principal foo, however, can only update quota for foo-role. No other
principal can update quota.
{
"permissive": false,
"update_quotas": [
{
"principals": {
"values": ["ops"]
},
"roles": {
"type": "ANY"
}
},
{
"principals": {
"values": ["foo"]
},
"roles": {
"values": ["foo-role"]
}
}
]
}
The principal ops can reach all HTTP endpoints using the GET
method. The principal foo, however, can only use the HTTP GET on
the /logging/toggle and /monitor/statistics endpoints. No other
principals can use GET on any endpoints.
{
"permissive": false,
"get_endpoints": [
{
"principals": {
"values": ["ops"]
},
"paths": {
"type": "ANY"
}
},
{
"principals": {
"values": ["foo"]
},
"paths": {
"values": ["/logging/toggle", "/monitor/statistics"]
}
}
]
}
Implementing an Authorizer
In case you plan to implement your own authorizer module, the authorization interface consists of three parts:
First, the authorization::Request protobuf message represents a request to be
authorized. It follows the Subject-Verb-Object pattern, where a subject
(commonly a principal) attempts to perform an action on a given object.
Second, the
Future<bool> mesos::Authorizer::authorized(const mesos::authorization::Request& request)
interface defines the entry point for authorizer modules (and the local
authorizer). A call to authorized() returns a future that indicates the result
of the (asynchronous) authorization operation. If the future is set to true, the
request was authorized successfully; if it is set to false, the request was
rejected. A failed future indicates that the request could not be processed at
the moment and can be retried later.
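The three possible outcomes can be illustrated with a short analogy. The sketch below uses Python's concurrent.futures.Future as a stand-in for the C++ Future<bool>; it is not the Mesos API, and the interpret function and its return strings are hypothetical names chosen for illustration.

```python
from concurrent.futures import Future


def interpret(result):
    """Map the outcome of a completed authorization future to a decision."""
    if result.exception() is not None:
        # Failed future: the request could not be processed right now.
        return "retry"
    return "allowed" if result.result() else "denied"


granted, rejected, failed = Future(), Future(), Future()
granted.set_result(True)
rejected.set_result(False)
failed.set_exception(RuntimeError("authorizer unavailable"))

print([interpret(f) for f in (granted, rejected, failed)])
```

The point of the distinction is that a failure is not a denial: callers can treat "retry" differently from an explicit rejection.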
The authorization::Request message is defined in authorizer.proto:
message Request {
optional Subject subject = 1;
optional Action action = 2;
optional Object object = 3;
}
message Subject {
optional string value = 1;
}
message Object {
optional string value = 1;
optional FrameworkInfo framework_info = 2;
optional Task task = 3;
optional TaskInfo task_info = 4;
optional ExecutorInfo executor_info = 5;
optional MachineID machine_id = 11;
}
Subject and Object are optional fields; if they are not set, they
will only match an ACL with ANY or NONE in the
corresponding location. This allows users to construct requests such as:
Can everybody perform action A on object O? or Can principal Z
execute action X on all objects?
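One way to read this rule is sketched below, assuming that a request value of None stands for an unset Subject or Object. The function name rule_entity_matches is hypothetical, and the sketch models only the matching step described in this paragraph, not the full authorization decision.

```python
from typing import Optional


def rule_entity_matches(entity: dict, value: Optional[str]) -> bool:
    """Return True if an ACL entity matches a request value.

    A value of None models an unset Subject or Object in the request.
    """
    if entity.get("type") in ("ANY", "NONE"):
        return True   # matches set and unset request values alike
    if value is None:
        return False  # an unset value never matches an explicit value list
    return value in entity.get("values", [])


print(rule_entity_matches({"type": "ANY"}, None))
print(rule_entity_matches({"values": ["foo"]}, None))
print(rule_entity_matches({"values": ["foo"]}, "foo"))
```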
Object has several optional fields of which, depending on the action,
one or more fields must be set (e.g., the view_executors action expects the
executor_info and framework_info to be set).
The action field of the Request message is an enum. It is kept optional,
even though a valid action is necessary for every request, to allow for
backwards compatibility when adding new fields (see
MESOS-4997 for details).
Third, the ObjectApprover interface. In order to support efficient
authorization of large objects and multiple objects, a user can request an
ObjectApprover via
Future<shared_ptr<const ObjectApprover>> getApprover(const authorization::Subject& subject, const authorization::Action& action).
The resulting ObjectApprover provides
Try<bool> approved(const ObjectApprover::Object& object)
to synchronously check whether objects are authorized. The
ObjectApprover::Object follows the structure of the Request::Object above.
struct Object
{
const std::string* value;
const FrameworkInfo* framework_info;
const Task* task;
const TaskInfo* task_info;
const ExecutorInfo* executor_info;
const MachineID* machine_id;
};
Because the fields hold pointers to the entities, the ObjectApprover::Object
does not require the entities to be copied.
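The usage pattern behind this interface is to resolve the rules for one subject and action once, and then check many objects cheaply and synchronously. The sketch below illustrates that pattern in Python; it does not reproduce the C++ interface. UserListApprover, get_approver, the rule table, and the dict-shaped task objects are all hypothetical.

```python
class UserListApprover:
    """Hypothetical approver: approves objects whose 'user' is in a fixed set."""

    def __init__(self, allowed_users):
        self._allowed_users = frozenset(allowed_users)

    def approved(self, obj):
        # Synchronous and non-blocking: a pure in-memory lookup.
        return obj.get("user") in self._allowed_users


def get_approver(subject, action):
    """Hypothetical stand-in for getApprover(): resolve the rules for one
    (subject, action) pair once, up front."""
    rules = {("ops", "view_tasks"): {"alice", "guest"}}
    return UserListApprover(rules.get((subject, action), ()))


tasks = [{"name": "t1", "user": "alice"}, {"name": "t2", "user": "root"}]
approver = get_approver("ops", "view_tasks")

# The objects are passed by reference and only inspected, never copied.
visible = [task["name"] for task in tasks if approver.approved(task)]
print(visible)
```

Reusing one approver for the whole list avoids a separate asynchronous authorization request per object.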
An authorizer must ensure that ObjectApprovers returned by the getApprover(...)
method are valid throughout their whole lifetime. Parts of the Mesos code
(Scheduler API, Operator API events, and so on) rely on this because they need to
frequently authorize a limited number of long-lived authorization subjects.
This code on the Mesos side, for its part, must ensure that it does not store
an ObjectApprover for authorization subjects that it no longer uses (i.e., that
it does not leak ObjectApprovers).
NOTE: As the ObjectApprover is run synchronously in a different actor process,
the ObjectApprover.approved() call must not block!