This topic describes how to configure secure DC/OS service accounts for Spark.
A service like Apache Spark typically performs privileged actions on the cluster, which can require authenticating with the cluster. A service account associated with the service is used to authenticate with the DC/OS cluster. We recommend provisioning a separate service account for each service that performs privileged operations. Service accounts authenticate using a public-private key pair. The public key is used to create the service account in the cluster, while the corresponding private key is stored in the secret store. The service account and the service account secret are passed to the service as install-time options.
|Security mode||Service Account|
If you install a service in permissive mode and do not specify a service account, Metronome and Marathon will act as if requests made by this service are made by an account with the superuser permission.
- The DC/OS CLI is installed and you are logged in as a superuser.
- The Enterprise DC/OS CLI 0.4.14 or later is installed.
In this step, a 2048-bit RSA public-private key pair is created using the Enterprise DC/OS CLI.
Create a public-private key pair and save each value into a separate file within the current directory.
dcos security org service-accounts keypair <private-key>.pem <public-key>.pem
From a terminal prompt, create a new service account (for example, spark) containing the public key (<your-public-key>.pem).
dcos security org service-accounts create -p <your-public-key>.pem -d <description> spark
You can verify your new service account using the following command.
dcos security org service-accounts show spark
Create a secret (spark/<secret-name>) with your service account and private key specified (<private-key>.pem).
dcos security secrets create-sa-secret <private-key>.pem <service-account-id> spark/<secret-name>
You can list the secrets with this command:
dcos security secrets list /
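The three steps above can be chained in a small script. The following is a minimal sketch, not an official tool: the run helper, the provision_spark_account function, the file names derived from the account ID, and the DRY_RUN switch are all hypothetical. With DRY_RUN left at its default of 1, the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: chains the key pair, service account, and secret steps.
# Helper names and the file naming scheme are illustrative assumptions.

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1: service account ID, $2: secret path (for example, spark/spark-secret)
provision_spark_account() {
  account="$1"
  secret_path="$2"
  # Step 1: create the key pair in the current directory.
  run dcos security org service-accounts keypair \
    "${account}-private.pem" "${account}-public.pem" || return 1
  # Step 2: create the service account from the public key.
  run dcos security org service-accounts create \
    -p "${account}-public.pem" -d "Spark service account" "$account" || return 1
  # Step 3: store the private key as the service account secret.
  run dcos security secrets create-sa-secret \
    "${account}-private.pem" "$account" "$secret_path" || return 1
}

provision_spark_account spark spark/spark-secret
```

Set DRY_RUN=0 to execute the commands against a cluster once the printed output looks right.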
Create and assign permissions
Use the following curl commands to rapidly provision the Spark service account with the required permissions. You can also provision the service account through the UI.
Follow these instructions to authenticate in strict mode.
Using the secret store
DC/OS Enterprise allows users to add privileged information in the form of a file to the DC/OS secret store. These files can be referenced in Spark jobs and used for authentication and authorization with various external services (for example, HDFS). For example, you can use this functionality to pass Kerberos keytab files. For details about how to use secrets, see understanding secrets.
Where to place secrets
For a secret to be available to Spark, it must be placed in a path that can be accessed by the Spark service. If only Spark requires access to a secret, you can store the secret in a path that matches the name of the Spark service (for example, spark/secret). See the Secrets Documentation about Spaces for details about how secret paths restrict service access to secrets.
Anyone who has access to the Spark (Dispatcher) service instance has access to all secrets available to it. Do not grant users access to the Spark Dispatcher instance unless they are also permitted to access all secrets available to the Spark Dispatcher instance.
You can store binary files, like a Kerberos keytab, in the DC/OS secrets store. In DC/OS 1.11 and later, you can create secrets from binary files directly. In DC/OS 1.10 or earlier, files must be base64-encoded, as specified in RFC 4648, before being stored as secrets.
DC/OS 1.11 and later
To create a secret called mysecret with the binary contents of kerb5.keytab, run the following command:
dcos security secrets create --file kerb5.keytab mysecret
DC/OS 1.10 or earlier
To create a secret called mysecret with the binary contents of kerb5.keytab, first encode it using the base64 command line utility. The following example uses BSD base64 (default on macOS).
base64 -i kerb5.keytab -o kerb5.keytab.base64-encoded
GNU base64 (the default on Linux) inserts line-feeds in the encoded data by default. Disable line-wrapping with the -w 0 argument.
base64 -w 0 kerb5.keytab > kerb5.keytab.base64-encoded
Now that the file is encoded, it can be stored as a secret.
dcos security secrets create -f kerb5.keytab.base64-encoded some/path/__dcos_base64__mysecret
When the some/path/__dcos_base64__mysecret secret is referenced in your dcos spark run command, its base64-decoded contents are made available as a temporary file in your Spark application.
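Because the BSD and GNU variants of base64 take different flags, a script that must run on both can avoid the encoding flags altogether. The sketch below is one possible approach and is not part of DC/OS: the function names encode_for_secret and verify_encoding are hypothetical. It feeds the file through standard input, strips line feeds with tr, and checks the result by decoding it again. It assumes the installed base64 accepts the long --decode option.

```shell
#!/bin/sh
# Sketch only: portable single-line base64 encoding with a round-trip check.

# $1: binary input file, $2: output file for the encoded data
encode_for_secret() {
  # Reading from stdin avoids the differing -i/-o and -w flags;
  # tr removes any line feeds the encoder inserts.
  base64 < "$1" | tr -d '\n' > "$2"
}

# $1: original binary file, $2: encoded file
# Succeeds only if decoding $2 reproduces $1 byte for byte.
verify_encoding() {
  base64 --decode < "$2" | cmp -s - "$1"
}

# Example usage (file names follow the text above):
#   encode_for_secret kerb5.keytab kerb5.keytab.base64-encoded
#   verify_encoding kerb5.keytab kerb5.keytab.base64-encoded && echo "encoding ok"
```

Running the round-trip check before uploading catches a truncated or line-wrapped file early.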
Using Mesos secrets
Once a secret has been added in the secret store, you can pass it to Spark with the spark.mesos.<task-name>.secret.<filenames|envkeys> configuration parameters, where <task-name> is either driver or executor. Specifying filenames or envkeys identifies the secret as either a file-based secret or an environment variable. These configuration parameters take comma-separated lists that are “zipped” together to make the final secret file or environment variable. Use file-based secrets whenever possible because they are more secure than environment variable secrets.
To use the Mesos containerizer, add this configuration:
--conf spark.mesos.containerizer=mesos
For example, to use a secret named spark/my-secret-file as a file in the driver and the executors, add these configuration parameters:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=spark/my-secret-file
--conf spark.mesos.driver.secret.filenames=target-secret-file
--conf spark.mesos.executor.secret.names=spark/my-secret-file
--conf spark.mesos.executor.secret.filenames=target-secret-file
These settings put the contents of the secret spark/my-secret-file in a secure RAM-FS mounted secret file named target-secret-file in the drivers’ and executors’ sandboxes. If you want to use a secret as an environment variable (for example, AWS credentials), you can change the configurations to be similar to the following:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=/spark/my-aws-secret,/spark/my-aws-key
--conf spark.mesos.driver.secret.envkeys=AWS_SECRET_ACCESS_KEY,AWS_ACCESS_KEY_ID
These example settings illustrate a secret access key stored in a secret named spark/my-aws-secret and a secret key ID in spark/my-aws-key.
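To make the positional pairing explicit, the following sketch prints which secret ends up in which environment variable. It only illustrates how two comma-separated lists line up item by item; the function name zip_lists is hypothetical, and this is not how Spark itself implements the feature.

```shell
#!/bin/sh
# Sketch only: pair the Nth secret name with the Nth sink (file or env key).

# $1: comma-separated secret names, $2: comma-separated sinks
zip_lists() {
  names="$1"
  sinks="$2"
  while [ -n "$names" ] && [ -n "$sinks" ]; do
    # Print the first remaining item of each list as a pair.
    echo "${names%%,*} -> ${sinks%%,*}"
    # Drop the first item from each list, or empty the list if it was the last.
    case "$names" in *,*) names="${names#*,}" ;; *) names="" ;; esac
    case "$sinks" in *,*) sinks="${sinks#*,}" ;; *) sinks="" ;; esac
  done
}

zip_lists "/spark/my-aws-secret,/spark/my-aws-key" \
          "AWS_SECRET_ACCESS_KEY,AWS_ACCESS_KEY_ID"
# prints:
# /spark/my-aws-secret -> AWS_SECRET_ACCESS_KEY
# /spark/my-aws-key -> AWS_ACCESS_KEY_ID
```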
When using a combination of environment and file-based secrets, there must be an equal number of sinks and secret sources (files and environment variables). For example:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=/spark/my-secret-file,/spark/my-secret-envvar
--conf spark.mesos.driver.secret.filenames=target-secret-file,placeholder-file
--conf spark.mesos.driver.secret.envkeys=PLACEHOLDER,SECRET_ENVVAR
This code places the content of spark/my-secret-file into the PLACEHOLDER environment variable and the target-secret-file file, as well as the content of spark/my-secret-envvar into the SECRET_ENVVAR environment variable and the placeholder-file file. In the case of binary secrets, the environment variable is empty because environment variables cannot be assigned binary values.
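A mismatch in list lengths is easy to miss in a long command line, so it can help to check the counts before submitting a job. The sketch below assumes, as in the example above, that the filenames and envkeys lists must each have as many entries as the names list. The function names count_items and check_secret_lists are hypothetical.

```shell
#!/bin/sh
# Sketch only: verify that mixed file/env secret lists have matching lengths.

# Print the number of comma-separated items in $1 (0 for an empty string).
count_items() {
  if [ -z "$1" ]; then
    echo 0
  else
    printf '%s\n' "$1" | awk -F, '{ print NF }'
  fi
}

# $1: secret names, $2: filenames, $3: envkeys (all comma-separated)
# Succeeds only if all three lists have the same number of entries.
check_secret_lists() {
  n="$(count_items "$1")"
  [ "$(count_items "$2")" -eq "$n" ] && [ "$(count_items "$3")" -eq "$n" ]
}

if check_secret_lists \
    "/spark/my-secret-file,/spark/my-secret-envvar" \
    "target-secret-file,placeholder-file" \
    "PLACEHOLDER,SECRET_ENVVAR"; then
  echo "secret lists are balanced"
fi
```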
SSL support in DC/OS Apache Spark encrypts the following channels:
- From the DC/OS admin router to the dispatcher.
- Files served from the drivers to their executors.
To enable SSL, a Java keystore (and, optionally, truststore) must be provided, along with their passwords. The first three settings below are required during job submission. If using a truststore, the last two are also required:
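|--keystore-secret-path||Path to keystore in secret store|
|--keystore-password||The password used to access the keystore|
|--private-key-password||The password for the private key|
|--truststore-secret-path||Path to truststore in secret store|
|--truststore-password||The password used to access the truststore|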
In addition, there are a number of Spark configuration variables relevant to SSL setup. These configuration settings are optional:
|spark.ssl.enabledAlgorithms||Allowed ciphers||JVM defaults|
The keystore and truststore are created using the Java keytool. The keystore must contain one private key and its signed public key. The truststore is optional and might contain a self-signed root-CA certificate that is explicitly trusted by Java.
Add the stores to your secrets in the DC/OS secret store. For example, if your keystores and truststores are server.jks and trust.jks, respectively, then use the following commands to add them to the secret store:
dcos security secrets create /spark/keystore --text-file server.jks
dcos security secrets create /spark/truststore --text-file trust.jks
You must add the following configurations to your dcos spark run command. The ones in parentheses are optional:
dcos spark run --verbose --submit-args="\
--keystore-secret-path=<path/to/keystore, for example, spark/keystore> \
--keystore-password=<password to keystore> \
--private-key-password=<password to private key in keystore> \
(--truststore-secret-path=<path/to/truststore, for example, spark/truststore> \)
(--truststore-password=<password to truststore> \)
(--conf spark.ssl.enabledAlgorithms=<cipher, for example, TLS_RSA_WITH_AES_128_CBC_SHA256> \)
--class <Spark Main class> <Spark Application JAR> [application args]"
DC/OS 1.10 or earlier: Since both stores are binary files, they must be base64 encoded before being placed in the DC/OS secret store. Follow the instructions above on encoding binary secrets to encode the keystore and truststore.
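The split between required and optional SSL flags can be captured in a small helper that assembles the argument string. This is a minimal sketch under stated assumptions: the function name build_ssl_args and its positional parameters are hypothetical, and only the flag names come from the settings described above. The truststore flags are added only when a truststore path is supplied.

```shell
#!/bin/sh
# Sketch only: assemble the SSL part of --submit-args.

# $1: keystore secret path      (required)
# $2: keystore password         (required)
# $3: private key password      (required)
# $4: truststore secret path    (optional)
# $5: truststore password       (used only when $4 is given)
build_ssl_args() {
  args="--keystore-secret-path=$1 --keystore-password=$2 --private-key-password=$3"
  if [ -n "$4" ]; then
    args="$args --truststore-secret-path=$4 --truststore-password=$5"
  fi
  echo "$args"
}

# Example usage (placeholders as in the command above):
#   dcos spark run --verbose --submit-args="\
#   $(build_ssl_args spark/keystore <keystore-password> <key-password>) \
#   --class <Spark Main class> <Spark Application JAR>"
```

Passing passwords as arguments keeps the sketch short; in a real script you would likely read them from a protected source rather than typing them on the command line.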
This section discusses executor authentication and BlockTransferService encryption.
Spark uses the Simple Authentication and Security Layer (SASL) to authenticate executors with the driver and to encrypt messages sent between components. This functionality relies on a secret shared by all components that you expect to communicate with each other. You can generate a secret with the DC/OS Spark CLI:
dcos spark secret <secret_path>
dcos spark secret /spark/sparkAuthSecret
This example generates a random secret and uploads it to the DC/OS secrets store at the designated path. To use this secret for RPC authentication, add the following configurations to your CLI command:
dcos spark run --submit-args="\
...
--executor-auth-secret=/spark/sparkAuthSecret
..."