HDP Upgrade and Transparent Data Encryption support on Isilon OneFS 8.2
The objective of this testing is to demonstrate a Hortonworks HDP upgrade from HDP 2.6.5 to HDP 3.1 during which the Transparent Data Encryption (TDE) KMS keys and configuration are accurately ported from the HDFS service to the OneFS service. This allows Hadoop users to leverage TDE support on OneFS 8.2 straight out of the box after the upgrade, without any changes to the TDE/KMS configuration.
HDFS Transparent Data Encryption
The primary motivation of Transparent Data Encryption on HDFS is to support both end-to-end on-the-wire and at-rest encryption of data without any modification to the user application. The TDE scheme adds an additional layer of data protection by storing the decryption keys for files on a separate key management server. This separation of keys and data guarantees that even if the HDFS service is completely compromised, the files cannot be decrypted without also compromising the key store.
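For orientation, the standard Apache HDFS TDE workflow is sketched below. The commands come from the upstream Hadoop documentation and the key and path names are illustrative only; on OneFS the encryption zone itself is created from the Isilon CLI rather than with hdfs crypto -createZone, as shown later in this document.

# Create an encryption zone key in the KMS (run by a key administrator)
hadoop key create myKey

# On stock HDFS an encryption zone is created over an empty directory;
# on OneFS this step is replaced by "isi hdfs crypto encryption-zones create"
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName myKey -path /secure

# Reads and writes inside the zone are transparently encrypted and decrypted
hdfs dfs -put localfile /secure/
hdfs dfs -cat /secure/localfile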
Concerns and Risks
The primary concern with TDE is mangling or losing Encrypted Data Encryption Keys (EDEKs), which are unique to each file in an Encryption Zone and are necessary to decrypt the data within. If this occurs, the customer's data is lost (DL). A secondary concern is managing Encryption Zone Keys (EKs), which are unique to each Encryption Zone and are associated with the root directory of each zone. Losing or mangling the EK would result in data unavailability (DU) for the customer and would require admin intervention to remedy. Finally, we need to make sure that EDEKs are not reused in any way, as this would weaken the security of TDE. Otherwise, there is little to no risk to existing or otherwise unencrypted data, since TDE only applies to data inside Encryption Zones; data outside an Encryption Zone is unaffected.
Hortonworks HDP 2.6.5 on Isilon OneFS 8.2
Install HDP 2.6.5 on OneFS 8.2 by following the install guide.
Note: The install document is written for OneFS 8.1.2, where the hdfs user is mapped to root in the Isilon settings. That mapping is not required on OneFS 8.2; instead, create a new role that grants the hdfs user backup/restore (RWX) access on the file system.
OneFS 8.2 [New steps to create a new role for the hdfs user in the access zone]
hop-isi-dd-3# isi auth roles create --name=BackUpAdmin --description="Bypass FS permissions" --zone=hdp
hop-isi-dd-3# isi auth roles modify BackupAdmin --add-priv=ISI_PRIV_IFS_RESTORE --zone=hdp
hop-isi-dd-3# isi auth roles modify BackupAdmin --add-priv=ISI_PRIV_IFS_BACKUP --zone=hdp
hop-isi-dd-3# isi auth roles view BackUpAdmin --zone=hdp
       Name: BackUpAdmin
Description: Bypass FS permissions
    Members: -
 Privileges
         ID: ISI_PRIV_IFS_BACKUP
  Read Only: True
         ID: ISI_PRIV_IFS_RESTORE
  Read Only: True
hop-isi-dd-3# isi auth roles modify BackupAdmin --add-user=hdfs --zone=hdp

[Optional: Flush the auth mapping and cache to make the hdfs change take effect immediately]
hop-isi-dd-3# isi auth mapping flush --all
hop-isi-dd-3# isi auth cache flush --all
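Optionally, the hdfs user's access token can be inspected to confirm the new privileges are in effect for the zone; a sketch, assuming the isi auth mapping token command available in OneFS 8.x (exact option names may vary by release):

# Show the hdfs user's access token for the hdp zone; the returned token should
# list ISI_PRIV_IFS_BACKUP and ISI_PRIV_IFS_RESTORE among the privileges
isi auth mapping token --user=hdfs --zone=hdp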
1. After HDP 2.6.5 is installed on OneFS 8.2 (following the install guide and the steps above to add the hdfs user backup/restore role), install the Ranger and Ranger KMS services, then run service checks on all services to make sure the cluster is healthy and functional.
2. On the Isilon, make sure the hdfs access zone and the hdfs user role are set up as required.
Isilon version
hop-isi-dd-3# isi version
Isilon OneFS v8.2.0.0 B_8_2_0_0_007(RELEASE): 0x802005000000007:Thu Apr 4 11:44:04 PDT 2019
root@sea-build11-04:/b/mnt/obj/b/mnt/src/amd64.amd64/sys/IQ.amd64.release
FreeBSD clang version 3.9.1 (tags/RELEASE_391/final 289601) (based on LLVM 3.9.1)
hop-isi-dd-3#
HDFS user role setup
hop-isi-dd-3# isi auth roles view BackupAdmin --zone=hdp
       Name: BackUpAdmin
Description: Bypass FS permissions
    Members: hdfs
 Privileges
         ID: ISI_PRIV_IFS_BACKUP
  Read Only: True
         ID: ISI_PRIV_IFS_RESTORE
  Read Only: True
hop-isi-dd-3#
Isilon HDFS setting
hop-isi-dd-3# isi hdfs settings view --zone=hdp
                 Service: Yes
      Default Block Size: 128M
   Default Checksum Type: none
     Authentication Mode: all
          Root Directory: /ifs/data/zone1/hdp
         WebHDFS Enabled: Yes
           Ambari Server:
         Ambari Namenode: kb-hdp-z1.hop-isi-dd.solarch.lab.emc.com
             ODP Version:
    Data Transfer Cipher: none
Ambari Metrics Collector: pipe-hdp1.solarch.emc.com
hop-isi-dd-3#
The hdfs-to-root mapping has been removed from the access zone settings:
hop-isi-dd-3# isi zone view hdp
                       Name: hdp
                       Path: /ifs/data/zone1/hdp
                   Groupnet: groupnet0
              Map Untrusted:
             Auth Providers: lsa-local-provider:hdp
               NetBIOS Name:
         User Mapping Rules:
       Home Directory Umask: 0077
         Skeleton Directory: /usr/share/skel
         Cache Entry Expiry: 4H
Negative Cache Entry Expiry: 1m
                    Zone ID: 2
hop-isi-dd-3#
3. TDE Functional Testing
Primary Testing Foci
Reads and Writes: Clients with the correct permissions must always be able to reliably decrypt.
Kerberos Integration: Realistically, customers will not deploy TDE without Kerberos. [In this testing, Kerberos is not integrated.]
TDE Configurations
HDFS TDE Setup
a. Create an encryption zone (EZ) key
hadoop key create <keyname>
User “keyadmin” has privileges to create, delete, rollover, set key material, get, get keys, get metadata, generate EEK, and decrypt EEK. These privileges are controlled in the Ranger web UI; log in as keyadmin / <password> and set up these privileges.
[root@pipe-hdp1 ~]# su keyadmin
bash-4.2$ whoami
keyadmin
bash-4.2$ hadoop key create key_a
key_a has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
KMSClientProvider[http://pipe-hdp1.solarch.emc.com:9292/kms/v1/] has been updated.
bash-4.2$ hadoop key create key_b
key_b has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
KMSClientProvider[http://pipe-hdp1.solarch.emc.com:9292/kms/v1/] has been updated.
bash-4.2$ hadoop key list
Listing keys for KeyProvider: KMSClientProvider[http://pipe-hdp1.solarch.emc.com:9292/kms/v1/]
key_data
key_b
key_a
bash-4.2$

Note: New keys can also be created from the Ranger KMS UI.
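The upstream hadoop key CLI also accepts options to control the key material explicitly; a minimal sketch (key_c is a hypothetical key name used only for illustration):

# Create a 256-bit key with an explicit cipher and a description
hadoop key create key_c -size 256 -cipher 'AES/CTR/NoPadding' -description 'example 256-bit key'

# List all keys together with their metadata (cipher, length, versions)
hadoop key list -metadata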
OneFS TDE Setup
a. Configure KMS URL in the Isilon OneFS CLI
isi hdfs crypto settings modify --kms-url=<url-string> --zone=<hdfs-zone-name> -v
isi hdfs crypto settings view --zone=<hdfs-zone-name>
hop-isi-dd-3# isi hdfs crypto settings view --zone=hdp
Kms Url: http://pipe-hdp1.solarch.emc.com:9292
hop-isi-dd-3#
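As a quick cross-check from any Hadoop client node, the client-side key provider should point at the same Ranger KMS that was just configured on OneFS; a sketch using the upstream Hadoop property name (the exact property populated by HDP may differ between releases):

# Key provider URI used by HDFS clients; it should reference the Ranger KMS host and port
hdfs getconf -confKey dfs.encryption.key.provider.uri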
b. From the Isilon OneFS CLI, create a new directory under the Hadoop zone that will become the encryption zone
mkdir /ifs/hdfs/<new-directory-name>
hop-isi-dd-3# mkdir /ifs/data/zone1/hdp/data_a
hop-isi-dd-3# mkdir /ifs/data/zone1/hdp/data_b
c. After the new (empty) directory is created, create the encryption zone by assigning the encryption key to the directory path
isi hdfs crypto encryption-zones create --path=<new-directory-path> --key-name=<key-created-via-hdfs> --zone=<hdfs-zone-name> -v
hop-isi-dd-3# isi hdfs crypto encryption-zones create --path=/ifs/data/zone1/hdp/data_a --key-name=key_a --zone=hdp -v
Create encryption zone named /ifs/data/zone1/hdp/data_a, with key_a
hop-isi-dd-3# isi hdfs crypto encryption-zones create --path=/ifs/data/zone1/hdp/data_b --key-name=key_b --zone=hdp -v
Create encryption zone named /ifs/data/zone1/hdp/data_b, with key_b
NOTE:
- Encryption keys need to be created from the hdfs client.
- A KMS key store is needed to manage the keys, for example Ranger KMS.
- Encryption zones can be created only on Isilon, using the OneFS CLI.
- Creating an encryption zone from the hdfs client fails with an Unknown RPC RemoteException.
TDE Setup Validation
On HDFS Cluster
a. Verify the encryption zones from the hdfs client [the paths are listed relative to the hdfs root directory]
hdfs crypto -listZones
bash-4.2$ hdfs crypto -listZones
/data_a  key_a
/data_b  key_b
On Isilon Cluster
a. List the encryption zones on Isilon [the paths are listed from the Isilon root path]
isi hdfs crypto encryption-zones list
The output lists each encryption zone by its full OneFS path (for example /ifs/data/zone1/hdp/data_a with key_a), as shown in the post-upgrade validation later in this document.
TDE Functional Testing
Authorize users to the EZ and KMS Keys
Ranger KMS UI
a. Log in to the Ranger KMS UI using keyadmin / <password>
b. Create 2 new policies to assign users (yarn, hive) to key_a and (mapred, hive) to key_b with the Get, Get Keys, Get Metadata, Generate EEK and Decrypt EEK permissions.
TDE HDFS Client Testing
a. Create sample files, copy them to the respective EZs, and access them as the respective users.
The /data_a EZ is associated with key_a; only the yarn and hive users have permissions
bash-4.2$ whoami
yarn
bash-4.2$ echo "YARN user test file, can you read this?" > yarn_test_file
bash-4.2$ rm -rf yarn_test_fil
bash-4.2$ hadoop fs -put yarn_test_file /data_a/
bash-4.2$ hadoop fs -cat /data_a/yarn_test_file
YARN user test file, can you read this?
bash-4.2$ whoami
yarn
bash-4.2$ exit
exit
[root@pipe-hdp1 ~]# su mapred
bash-4.2$ hadoop fs -cat /data_a/yarn_test_file
cat: User:mapred not allowed to do 'DECRYPT_EEK' on 'key_a'
bash-4.2$
The /data_b EZ is associated with key_b; only the mapred and hive users have permissions
bash-4.2$ whoami
mapred
bash-4.2$ echo "MAPRED user test file, can you read this?" > mapred_test_file
bash-4.2$ hadoop fs -put mapred_test_file /data_b/
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file
MAPRED user test file, can you read this?
bash-4.2$ exit
exit
[root@pipe-hdp1 ~]# su yarn
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file
cat: User:yarn not allowed to do 'DECRYPT_EEK' on 'key_b'
bash-4.2$
The hive user has permission to decrypt both keys, i.e. it can access both EZs
User with decrypt privilege for both keys [hive]
[root@pipe-hdp1 ~]# su hive
bash-4.2$ pwd
/root
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file
MAPRED user test file, can you read this?
bash-4.2$ hadoop fs -cat /data_a/yarn_test_file
YARN user test file, can you read this?
bash-4.2$
Sample distcp to copy data between EZs. (-skipcrccheck and -update are needed because the data is decrypted from the source zone and re-encrypted with a different EDEK in the target zone, so the stored checksums of source and target files differ.)
bash-4.2$ hadoop distcp -skipcrccheck -update /data_a/yarn_test_file /data_b/
19/05/20 21:20:02 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=true, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[/data_a/yarn_test_file], targetPath=/data_b, targetPathExists=true, filtersFile='null', verboseLog=false}
19/05/20 21:20:03 INFO client.RMProxy: Connecting to ResourceManager at pipe-hdp1.solarch.emc.com/10.246.156.91:8050
19/05/20 21:20:03 INFO client.AHSProxy: Connecting to Application History server at pipe-hdp1.solarch.emc.com/10.246.156.91:10200
"""
"""
19/05/20 21:20:04 INFO mapreduce.Job: Running job: job_1558336274787_0003
19/05/20 21:20:12 INFO mapreduce.Job: Job job_1558336274787_0003 running in uber mode : false
19/05/20 21:20:12 INFO mapreduce.Job: map 0% reduce 0%
19/05/20 21:20:18 INFO mapreduce.Job: map 100% reduce 0%
19/05/20 21:20:18 INFO mapreduce.Job: Job job_1558336274787_0003 completed successfully
19/05/20 21:20:18 INFO mapreduce.Job: Counters: 33
File System Counters
  FILE: Number of bytes read=0
  FILE: Number of bytes written=152563
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=426
  HDFS: Number of bytes written=40
  HDFS: Number of read operations=15
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=4
Job Counters
  Launched map tasks=1
  Other local map tasks=1
  Total time spent by all maps in occupied slots (ms)=4045
  Total time spent by all reduces in occupied slots (ms)=0
  Total time spent by all map tasks (ms)=4045
  Total vcore-milliseconds taken by all map tasks=4045
  Total megabyte-milliseconds taken by all map tasks=4142080
Map-Reduce Framework
  Map input records=1
  Map output records=0
  Input split bytes=114
  Spilled Records=0
  Failed Shuffles=0
  Merged Map outputs=0
  GC time elapsed (ms)=91
  CPU time spent (ms)=2460
  Physical memory (bytes) snapshot=290668544
  Virtual memory (bytes) snapshot=5497425920
  Total committed heap usage (bytes)=196083712
File Input Format Counters
  Bytes Read=272
File Output Format Counters
  Bytes Written=0
org.apache.hadoop.tools.mapred.CopyMapper$Counter
  BYTESCOPIED=40
  BYTESEXPECTED=40
  COPY=1
bash-4.2$
bash-4.2$ hadoop fs -ls /data_b/
Found 2 items
-rwxrwxr-x   3 mapred hadoop         42 2019-05-20 04:24 /data_b/mapred_test_file
-rw-r--r--   3 hive   hadoop         40 2019-05-20 21:20 /data_b/yarn_test_file
bash-4.2$ hadoop fs -cat /data_b/yarn_test_file
YARN user test file, can you read this?
bash-4.2$
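For completeness, upstream HDFS also supports copying the raw encrypted bytes (preserving the EDEKs) by addressing paths under the /.reserved/raw prefix, which is primarily intended for backup-style copies between zones that share the same EZ key. Whether the OneFS HDFS implementation exposes /.reserved/raw was not validated in this testing; the sketch below shows the upstream form only.

# Upstream-HDFS raw copy: the encrypted bytes and the raw.* xattrs carrying the EDEK
# are preserved when both source and target paths are under /.reserved/raw
# (run as a superuser; not validated against OneFS here)
hadoop distcp /.reserved/raw/data_a/yarn_test_file /.reserved/raw/data_b/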
Hadoop user without permission
bash-4.2$ hadoop fs -put test_file /data_a/
put: User:hdfs not allowed to do 'DECRYPT_EEK' on 'key_A'
19/05/20 02:35:10 ERROR hdfs.DFSClient: Failed to close inode 4306114529
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /data_a/test_file._COPYING_ (inode 4306114529)
TDE OneFS CLI Testing
On the Isilon CLI, no user can read files inside the EZ in the clear: encryption and decryption happen on the HDFS client with keys obtained from the KMS, so the data at rest on OneFS is ciphertext, even for root.
hop-isi-dd-3# whoami
root
hop-isi-dd-3# cat data_a/yarn_test_file
▒?Tm@DIc▒▒B▒▒>\Qs▒:[VzC▒▒Rw^<▒▒▒▒▒8H#
hop-isi-dd-3% whoami
yarn
hop-isi-dd-3% cat data_a/yarn_test_file
▒?Tm@DIc▒▒B▒▒>\Qs▒:[VzC▒▒Rw^<▒▒▒▒▒8H%
Upgrade HDP to the latest version by following the upgrade process blog.
After the upgrade, make sure all the services are up and running and pass their service checks.
The HDFS service is replaced with the OneFS service; under the OneFS service configuration, make sure the KMS-related properties are ported successfully.
Log in to the Ranger KMS UI and check that the policies are intact after the upgrade. [Note: after the upgrade, a new “Policy Labels” column is added.]
Validate existing TDE configuration and keys after the HDP 3.1 upgrade
TDE HDFS client testing existing configuration and keys
a. List the KMS provider and keys to check they are intact after the upgrade
[root@pipe-hdp1 ~]# su hdfs
bash-4.2$ hadoop key list
Listing keys for KeyProvider: org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@2f54a33d
key_b
key_a
key_data
bash-4.2$ hdfs crypto -listZones
/data    key_data
/data_a  key_a
/data_b  key_b
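The key metadata can also be checked to confirm the ciphers, lengths, and key versions survived the upgrade; a minimal sketch using the upstream hadoop key CLI:

# List keys with their metadata; each key should still report the options shown at
# creation time (cipher AES/CTR/NoPadding, bitLength 128)
hadoop key list -metadata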
b. Create sample files, copy them to the respective EZs, and access them as the respective users
[root@pipe-hdp1 ~]# su yarn
bash-4.2$ cd
bash-4.2$ pwd
/home/yarn
bash-4.2$ echo "YARN user testfile after upgrade to hdp3.1, can you read this?" > yarn_test_file_2
bash-4.2$ hadoop fs -put yarn_test_file_2 /data_a/
bash-4.2$ hadoop fs -cat /data_a/yarn_test_file_2
YARN user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$
[root@pipe-hdp1 ~]# su mapred
bash-4.2$ cd
bash-4.2$ pwd
/home/mapred
bash-4.2$ echo "MAPRED user testfile after upgrade to hdp3.1, can you read this?" > mapred_test_file_2
bash-4.2$ hadoop fs -put mapred_test_file_2 /data_b/
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file_2
MAPRED user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$
[root@pipe-hdp1 ~]# su yarn
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file_2
cat: User:yarn not allowed to do 'DECRYPT_EEK' on 'key_b'
bash-4.2$
[root@pipe-hdp1 ~]# su hive
bash-4.2$ hadoop fs -cat /data_a/yarn_test_file_2
YARN user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$ hadoop fs -cat /data_b/mapred_test_file_2
MAPRED user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$
bash-4.2$ hadoop distcp -skipcrccheck -update /data_a/yarn_test_file_2 /data_b/
19/05/21 05:23:38 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=true, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[/data_a/yarn_test_file_2], targetPath=/data_b, filtersFile='null', blocksPerChunk=0, copyBufferSize=8192, verboseLog=false}, sourcePaths=[/data_a/yarn_test_file_2], targetPathExists=true, preserveRawXattrsfalse
19/05/21 05:23:38 INFO client.RMProxy: Connecting to ResourceManager at pipe-hdp1.solarch.emc.com/10.246.156.91:8050
19/05/21 05:23:38 INFO client.AHSProxy: Connecting to Application History server at pipe-hdp1.solarch.emc.com/10.246.156.91:10200
"
19/05/21 05:23:54 INFO mapreduce.Job: map 0% reduce 0%
19/05/21 05:24:00 INFO mapreduce.Job: map 100% reduce 0%
19/05/21 05:24:00 INFO mapreduce.Job: Job job_1558427755021_0001 completed successfully
19/05/21 05:24:00 INFO mapreduce.Job: Counters: 36
"
Bytes Copied=63
Bytes Expected=63
Files Copied=1
bash-4.2$ hadoop fs -ls /data_b/
Found 4 items
-rwxrwxr-x   3 mapred hadoop         42 2019-05-20 04:24 /data_b/mapred_test_file
-rw-r--r--   3 mapred hadoop         65 2019-05-21 05:21 /data_b/mapred_test_file_2
-rw-r--r--   3 hive   hadoop         40 2019-05-20 21:20 /data_b/yarn_test_file
-rw-r--r--   3 hive   hadoop         63 2019-05-21 05:23 /data_b/yarn_test_file_2
bash-4.2$ hadoop fs -cat /data_b/yarn_test_file_2
YARN user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$ hadoop fs -cat /data_b/yarn_test_file_2
YARN user testfile after upgrade to hdp3.1, can you read this?
bash-4.2$
TDE OneFS client testing existing configuration and keys
a. List the KMS provider and keys to check they are intact after the upgrade
hop-isi-dd-3# isi hdfs crypto settings view --zone=hdp
Kms Url: http://pipe-hdp1.solarch.emc.com:9292
hop-isi-dd-3# isi hdfs crypto encryption-zones list
Path                        Key Name
------------------------------------
/ifs/data/zone1/hdp/data    key_data
/ifs/data/zone1/hdp/data_a  key_a
/ifs/data/zone1/hdp/data_b  key_b
------------------------------------
Total: 3
hop-isi-dd-3#
b. Permissions on the previously created EZs: data at rest remains encrypted and cannot be read in the clear from the OneFS CLI
hop-isi-dd-3# cat data_b/yarn_test_file_2
3▒ ▒{&▒{<N▒7▒ ,▒▒l▒n.▒▒▒bz▒6▒ ▒G▒_▒l▒Ieñ+ ▒t▒▒N^▒ ▒#
hop-isi-dd-3# whoami
root
hop-isi-dd-3#
Validate new TDE configuration and keys after the HDP 3.1 upgrade
TDE HDFS Client new keys setup
a. Create new keys and list
bash-4.2$ hadoop key create up_key_a
up_key_a has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@11bd0f3b has been updated.
bash-4.2$ hadoop key create up_key_b
up_key_b has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@11bd0f3b has been updated.
bash-4.2$ hadoop key list
Listing keys for KeyProvider: org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@2f54a33d
key_b
key_a
key_data
up_key_b
up_key_a
bash-4.2$
b. After the EZs are created from the OneFS CLI, check that the zones are reflected on the HDFS client
bash-4.2$ hdfs crypto -listZones
/data       key_data
/data_a     key_a
/data_b     key_b
/up_data_a  up_key_a
/up_data_b  up_key_b
TDE OneFS Client new encryption zone setup
a. Create new EZs from the OneFS CLI
hop-isi-dd-3# isi hdfs crypto encryption-zones create --path=/ifs/data/zone1/hdp/up_data_a --key-name=up_key_a --zone=hdp -v
Create encryption zone named /ifs/data/zone1/hdp/up_data_a, with up_key_a
hop-isi-dd-3# isi hdfs crypto encryption-zones create --path=/ifs/data/zone1/hdp/up_data_b --key-name=up_key_b --zone=hdp -v
Create encryption zone named /ifs/data/zone1/hdp/up_data_b, with up_key_b
hop-isi-dd-3# isi hdfs crypto encryption-zones list
Path                           Key Name
---------------------------------------
/ifs/data/zone1/hdp/data       key_data
/ifs/data/zone1/hdp/data_a     key_a
/ifs/data/zone1/hdp/data_b     key_b
/ifs/data/zone1/hdp/up_data_a  up_key_a
/ifs/data/zone1/hdp/up_data_b  up_key_b
---------------------------------------
Total: 5
hop-isi-dd-3#
In the Ranger KMS UI, create 2 new policies to assign users (yarn, hive) to up_key_a and (mapred, hive) to up_key_b with the Get, Get Keys, Get Metadata, Generate EEK and Decrypt EEK permissions.
TDE HDFS Client testing on upgraded HDP 3.1
a. Create sample files, copy them to the respective EZs, and access them as the respective users
The /up_data_a EZ is associated with up_key_a; only the yarn and hive users have permissions
[root@pipe-hdp1 ~]# su yarn
bash-4.2$ echo "After HDP Upgrade to HDP 3.1, YARN user, Creating this file" > up_yarn_test_file
bash-4.2$ hadoop fs -put up_yarn_test_file /up_data_a/
bash-4.2$ hadoop fs -cat /up_data_a/up_yarn_test_file
After HDP Upgrade to HDP 3.1, YARN user, Creating this file
bash-4.2$ hadoop fs -cat /up_data_b/up_mapred_test_file
cat: User:yarn not allowed to do 'DECRYPT_EEK' on 'up_key_b'
bash-4.2$
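On Hadoop 3.x the per-file encryption metadata can also be inspected from the client; a sketch using the upstream hdfs crypto -getFileEncryptionInfo subcommand (whether the OneFS HDFS implementation returns this information was not separately validated here):

# If supported, the output shows the cipher suite, the EZ key name (up_key_a),
# and the key version used for this file's EDEK
hdfs crypto -getFileEncryptionInfo -path /up_data_a/up_yarn_test_file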
The /up_data_b EZ is associated with up_key_b; only the mapred and hive users have permissions
[root@pipe-hdp1 ~]# su mapred
bash-4.2$ cd
bash-4.2$ echo "After HDP Upgrade to HDP 3.1, MAPRED user, Creating this file" > up_mapred_test_file
bash-4.2$ hadoop fs -put up_mapred_test_file /up_data_b/
bash-4.2$ hadoop fs -cat /up_data_b/up_mapred_test_file
After HDP Upgrade to HDP 3.1, MAPRED user, Creating this file
bash-4.2$
The hive user has permission to decrypt both keys, i.e. it can access both EZs
User with decrypt privilege for both keys [hive]
[root@pipe-hdp1 ~]# su hive
bash-4.2$ hadoop fs -cat /up_data_b/up_mapred_test_file
After HDP Upgrade to HDP 3.1, MAPRED user, Creating this file
bash-4.2$ hadoop fs -cat /up_data_a/up_yarn_test_file
After HDP Upgrade to HDP 3.1, YARN user, Creating this file
bash-4.2$
Sample distcp to copy data between EZs.
bash-4.2$ hadoop distcp -skipcrccheck -update /up_data_a/up_yarn_test_file /up_data_b/
19/05/22 04:48:21 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=true, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[/up_data_a/up_yarn_test_file], targetPath=/up_data_b, filtersFile='null', blocksPerChunk=0, copyBufferSize=8192, verboseLog=false}, sourcePaths=[/up_data_a/up_yarn_test_file], targetPathExists=true, preserveRawXattrsfalse
"
19/05/22 04:48:23 INFO mapreduce.Job: The url to track the job: http://pipe-hdp1.solarch.emc.com:8088/proxy/application_1558505736502_0001/
19/05/22 04:48:23 INFO tools.DistCp: DistCp job-id: job_1558505736502_0001
19/05/22 04:48:23 INFO mapreduce.Job: Running job: job_1558505736502_0001
"
Bytes Expected=60
Files Copied=1
bash-4.2$ hadoop fs -ls /up_data_b/
Found 2 items
-rw-r--r--   3 mapred hadoop         62 2019-05-22 04:43 /up_data_b/up_mapred_test_file
-rw-r--r--   3 hive   hadoop         60 2019-05-22 04:48 /up_data_b/up_yarn_test_file
bash-4.2$ hadoop fs -cat /up_data_b/up_yarn_test_file
After HDP Upgrade to HDP 3.1, YARN user, Creating this file
bash-4.2$
Hadoop user without permission
bash-4.2$ hadoop fs -put test_file /data_a/
put: User:hdfs not allowed to do 'DECRYPT_EEK' on 'key_A'
19/05/20 02:35:10 ERROR hdfs.DFSClient: Failed to close inode 4306114529
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /data_a/test_file._COPYING_ (inode 4306114529)
TDE OneFS CLI Testing
Permissions on the Isilon EZ: the data at rest is ciphertext, so no user on the OneFS CLI can read the file in the clear
hop-isi-dd-3# cat up_data_a/up_yarn_test_file
%*݊▒▒ixu▒▒▒=}▒▒▒h~▒7▒=_▒▒▒0▒[.-$▒:/▒Ԋ▒▒▒▒\8vf▒{F▒Sl▒▒#
Conclusion
The above testing and results show that the HDP upgrade does not break the TDE configuration: the KMS keys, encryption zones, and policies are ported to the new OneFS service after a successful upgrade, and TDE continues to work out of the box.