DC/OS Apache HDFS Security

The DC/OS Apache HDFS service supports HDFS’s native transport encryption, authentication, and authorization mechanisms. The service provides automation and orchestration to simplify the usage of these important features.

A good overview of these features can be found here.

Note: These security features are only available on DC/OS Enterprise 1.10 and above.

Transport Encryption

With transport encryption enabled, DC/OS Apache HDFS will automatically deploy all nodes with the correct configuration to encrypt communication via SSL. The nodes will communicate securely between themselves using SSL.

The service uses the DC/OS CA to generate the SSL artifacts that it uses to secure the service. Any client that trusts the DC/OS CA will consider the service’s certificates valid.

Note: Enabling transport encryption is not required to use Kerberos authentication, but transport encryption can be combined with Kerberos authentication.

Prerequisites

Configure Transport Encryption

Set up the service account

Grant the service account the correct permissions.

  • In DC/OS 1.10, the required permission is dcos:superuser full.
  • In DC/OS 1.11 and later, the required permissions are:
dcos:secrets:default:/<service name>/* full
dcos:secrets:list:default:/<service name> read
dcos:adminrouter:ops:ca:rw full
dcos:adminrouter:ops:ca:ro full

where <service name> is the name of the service to be installed.

Run the following DC/OS Enterprise CLI commands to set permissions for the service account on a strict cluster:

dcos security org users grant ${SERVICE_ACCOUNT} dcos:mesos:master:task:app_id:<service/name> create
dcos security org users grant ${SERVICE_ACCOUNT} dcos:mesos:master:reservation:principal:dev_hdfs create
dcos security org users grant ${SERVICE_ACCOUNT} dcos:mesos:master:volume:principal:dev_hdfs create

Install the service

Install the DC/OS Apache HDFS service including the following options in addition to your own:

{
    "service": {
        "service_account": "<your service account name>",
        "service_account_secret": "<full path of service secret>",
        "security": {
            "transport_encryption": {
                "enabled": true, "allow_plaintext": <true|false default false>

            }
        }
    }
}

Transport Encryption for Clients

With Transport Encryption enabled, service clients will need to be configured to use the DC/OS CA bundle to verify the connections they make to the service. Consult your client’s documentation for trusting a CA and configure your client appropriately.

Authentication

DC/OS Apache HDFS supports Kerberos authentication.

Kerberos Authentication

Kerberos authentication relies on a central authority to verify that HDFS clients are who they say they are. DC/OS Apache HDFS integrates with your existing Kerberos infrastructure to verify the identity of clients.

Prerequisites

  • The hostname and port of a KDC reachable from your DC/OS cluster
  • Sufficient access to the KDC to create Kerberos principals
  • Sufficient access to the KDC to retrieve a keytab for the generated principals
  • The DC/OS Enterprise CLI
  • DC/OS Superuser permissions

Configure Kerberos Authentication

Create principals

The DC/OS Apache HDFS service requires Kerberos principals for each node to be deployed. The overall topology of the HDFS service is:

  • 3 journal nodes
  • 2 name nodes (with ZKFC)
  • A configurable number of data nodes

Note: Apache HDFS requires a principal for both the service primary and HTTP. The latter is used by the HTTP api.

The required Kerberos principals will have the form:

<service primary>/name-0-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/name-0-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
<service primary>/name-0-zkfc.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/name-0-zkfc.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
<service primary>/name-1-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/name-1-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
<service primary>/name-1-zkfc.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/name-1-zkfc.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>

<service primary>/journal-0-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/journal-0-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
<service primary>/journal-1-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/journal-1-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
<service primary>/journal-2-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/journal-2-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>

<service primary>/data-<data-index>-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>
HTTP/data-<data-index>-node.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>

with:

  • service primary = service.security.kerberos.primary
  • data index = 0 up to data_node.count - 1
  • service subdomain = service.name with all/'s removed
  • service realm = service.security.kerberos.realm

For example, if installing with these options:

{
    "service": {
        "name": "a/good/example",
        "security": {
            "kerberos": {
                "primary": "example",
                "realm": "EXAMPLE"
            }
        }
    },
    "data_node": {
        "count": 3
    }
}

then the principals to create would be:

example/name-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/name-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/name-0-zkfc.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/name-0-zkfc.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/name-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/name-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/name-1-zkfc.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/name-1-zkfc.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE

example/journal-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/journal-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/journal-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/journal-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/journal-2-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/journal-2-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE

example/data-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/data-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/data-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/data-1-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/data-2-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
HTTP/data-2-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
Active Directory

Microsoft Active Directory can be used as a Kerberos KDC. Doing so requires creating a mapping between Active Directory users and Kerberos principals.

The utility ktpass can be used to both create a keytab from Active Directory and generate the mapping at the same time.

The mapping can, however, be created manually. For a Kerberos principal like <primary>/<host>@<REALM>, the Active Directory user should have its servicePrincipalName and userPrincipalName attributes set to,

servicePrincipalName = <primary>/<host>
userPrincipalName = <primary>/<host>@<REALM>

For example, with the Kerberos principal example&#x2F;name-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE, then the correct mapping would be,

servicePrincipalName = example&#x2F;name-0-node.agoodexample.autoip.dcos.thisdcos.directory
userPrincipalName = example&#x2F;name-0-node.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE

If either mapping is incorrect or not present, the service will fail to authenticate that Principal. The symptom in the Kerberos debug logs will be an error of the form

KRBError:
sTime is Wed Feb 07 03:22:47 UTC 2018 1517973767000
suSec is 697984
error code is 6
error Message is Client not found in Kerberos database
sname is krbtgt/AD.MESOSPHERE.COM@AD.MESOSPHERE.COM
msgType is 30

when the userPrincipalName is set incorrectly, and an error of the form

KRBError:
sTime is Wed Feb 07 03:44:57 UTC 2018 1517975097000
suSec is 128465
error code is 7
error Message is Server not found in Kerberos database
sname is kafka/kafka-1-broker.confluent-kafka.autoip.dcos.thisdcos.directory@AD.MESOSPHERE.COM
msgType is 30

when the servicePrincipalName is set incorrectly.

Place Service Keytab in DC/OS Secret Store

The DC/OS Apache HDFS service uses a keytab containing all node principals (service keytab). After creating the principals above, generate the service keytab making sure to include all the node principals. This will be stored as a secret in the DC/OS Secret Store.

NOTE: DC/OS 1.10 does not support adding binary secrets directly to the secret store, only text files are supported. Instead, first base64 encode the file, and save it to the secret store as `/desired/path/__dcos_base64__secret_name`. The DC/OS security modules will handle decoding the file when it is used by the service.

The service keytab should be stored at service/path/name/service.keytab. As noted above. for DC/OS 1.10, it would be __dcos_base64__service.keytab), where service/path/name matches the path and name of the service. For example, if installing with the options

{
    "service": {
        "name": "a/good/example"
    }
}

then the service keytab should be stored at a/good/example/service.keytab.

Documentation for adding a file to the secret store can be found here.

NOTE: Secrets access is controlled by [DC/OS Spaces](/latest/security/ent/#spaces-for-secrets), which function like namespaces. Any secret in the same DC/OS Space as the service will be accessible by the service. However, matching the two paths is the most secure option. Additionally the secret name `service.keytab` is a convention and not a requirement.

Install the Service

Install the DC/OS Apache HDFS service with the following options in addition to your own:

{
    "service": {
        "security": {
            "kerberos": {
                "enabled": true,
                "kdc": {
                    "hostname": "<kdc host>",
                    "port": <kdc port>
                },
                "primary": "<service primary default hdfs>",
                "realm": "<realm>",
                "keytab_secret": "<path to keytab secret>",
                "debug": <true|false default false>
            }
        }
    }
}

Authorization

The DC/OS Apache HDFS service supports HDFS’s native authorization, which behaves similarly to UNIX file permissions. If Kerberos is enabled as detailed above, then Kerberos principals are mapped to HDFS users against which permissions can be assigned.

Enable Authorization

Prerequisites

Set Kerberos Principal to User Mapping

A custom mapping must be set to map Kerberos principals to OS user names for the purposes of determining group membership. This is supplied by setting the parameter

{
    "hdfs": {
        "security_auth_to_local": "<custom mapping>"
    }
}

where <custom mapping> is a base64-encoded string.

Note: There is no default mapping. One MUST be set at install time or as a service update.

This article has a good description of how to build a custom mapping, under the section “Kerberos Principals and UNIX User Names”.

NOTE: In DC/OS 1.11 and above, the DC/OS UI will automatically encode and decode the mapping to and from base64. If installing from the CLI or from the UI in a version older than DC/OS 1.11, it is necessary to do the encoding manually.