ADVANCED SHELL SCRIPTING FOR ORACLE PROFESSIONALS
JEVGEŅIJS REUTS
INTRO
• WHO?
- Oracle Applications Database Consultant @ Pythian
- Oracle Database Administrator Certified Professional
- Working with Oracle since 2006
- Not a shell scripting guru
• WHY?
- An interesting case to share with the community
BUSINESS CASE
• Migrate users for a particular department from an on-premises Oracle Internet Directory 10g instance to an Oracle Internet Directory 11g instance hosted in Amazon Web Services (AWS)
CUSTOMER REQUIREMENTS
• Migrate users to a new basedn, or “tree”, in OID 11g
• A CSV file with usernames is provided
• Initial downtime requirement: max 4 hours
• Later changed to “No Downtime allowed”
INITIAL REVIEW
• The CSV contains 2.2M usernames
• OID cn: attribute = username
– Example:
USER_ID,SOURCE_USER_ID
317149,SNOWLIS
• Usernames have no pattern that could be used for filtering
• Total number of users in the 10g OID: 7.7M
OID USER RECORD EXAMPLE
[oracle@oid10g ~]$ ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w password -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=SNOWLIS)" "*"
dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com
authpassword;oid: {SASL/MD5-DN}E5GNW+/uc5Q4vaUHTpoV8w==
authpassword;oid: {SASL/MD5-U}em8szBiI6lQe7oSZys9S6w==
authpassword;oid: {SASL/MD5}OIcK6dZZFlu7kZOw8+RxEQ==
authpassword;orclcommonpwd: {MD5}UVSevJPyPkXxUHoK1QMOfw==
authpassword;orclcommonpwd: {X-ORCLLMV}C5A7687D19248DD11D71060D896B7A46
authpassword;orclcommonpwd: {X-ORCLNTV}769F744EC914822D37C66B8EFBFD68F9
authpassword;orclcommonpwd: {X-ORCLIFSMD5}AMLZgqATptPU1TkLgpGh1w==
authpassword;orclcommonpwd: {X-ORCLWEBDAV}Fg/OrZz6AEATMeJMXWm19A==
cn: SNOWLIS
mail: test.test@example.com
objectclass: orcluserv2
objectclass: organizationalPerson
objectclass: top
objectclass: person
objectclass: inetorgperson
orclisenabled: ENABLED
orclpassword: {x-orcldbpwd}1.0:059A0F10E478B5BB
sn: SNOWLIS
uid: SNOWLIS
userpassword: {SHA}1btDzs8cj+zHwHLzsgEaUCJ0nn0=
INITIAL APPROACH
• Create a shell script
• Read usernames from the CSV line by line
• Check with ldapsearch whether each entry exists
• If it exists, dump the full entry content to an LDIF file with a second ldapsearch
• Replace the basedn, or “tree”, with sed
• Import users with the native OID bulkload utility (uses SQL*Loader, downtime required)
INITIAL APPROACH EXAMPLE
cat ${v_base_dir}/usernames.csv | grep -v "USER_ID,SOURCE_USER_ID" | awk 'BEGIN {FS=","}{print $2}' | while read v_username ; do
  # check whether the entry exists (count the returned lines)
  v_ldap_result=$(ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w ${v_oid_pwd} -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=${v_username})" "dn" | wc -l)
  if [ ${v_ldap_result} -gt 0 ] ; then
    # dump the full entry content to the LDIF file
    ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w ${v_oid_pwd} -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=${v_username})" "*" >> ${v_base_dir}/content_generated_from_cvs.ldif
    echo "" >> ${v_base_dir}/content_generated_from_cvs.ldif
  else
    echo ${v_username} >> ${v_base_dir}/users_not_in_oid.log
  fi
done
PROBLEM WITH INITIAL APPROACH
• A single ldapsearch operation takes ~1 s
• For 2.2M users that is 2.2M seconds
• Roughly 611 hours, or about 25 days (quick check below)
• Not an option
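• A quick sanity check of that estimate with Bash integer arithmetic (integer division, so the figures are rounded down):
echo $(( 2200000 / 3600 ))       # ~611 hours at ~1 s per lookup
echo $(( 2200000 / 3600 / 24 ))  # ~25 days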
APPROACH 2
• Full export of the OID basedn, or “tree”:
ldifwrite connect="SID" basedn="cn=users,dc=test,dc=example,dc=com" ldiffile=content_generated_from_cvs.ldif threads=8
• Create a shell script to read the full export file and compare usernames against the CSV file
• If the user exists, dump the full entry content to an LDIF file
• Replace the basedn, or “tree”, with sed
• Import users with the native OID bulkload utility (uses SQL*Loader, downtime required)
PROBLEM WITH APPROACH 2
• Full export file size: 9 GB
• 7.7M users x 21 attributes each
• A huge number of lines
• The CSV file has 2.2M lines
• HOW TO HANDLE THIS EFFICIENTLY?
WAY TO GO
• Use a Bash associative array
• Load the usernames from the CSV into the array
• Read the full user dump file and check each entry against the array
• Loading 2.2M rows from the CSV into the array took 50 minutes
• NOTE: Bash associative arrays are available since Bash version 4.0
BASH ASSOCIATIVE ARRAY
# load csv to array
declare -A myarray1
while read line_data
do
  myarray1[${line_data}]=1
done <<< "$(cat usernames.csv | grep -v "USER_ID,SOURCE_USER_ID" | awk 'BEGIN {FS=","}{print $2}')"

[oracle@oid10g ~]$ echo ${myarray1[SNOWLIS]}
1
[oracle@oid10g ~]$ echo ${myarray1[SNOWLIS1]}
CONSTRUCTING MAIN BLOCK
• Read the full dump file line by line
• On a dn: line, extract the cn: (username) value
• Check whether that username is present in the array
• If it is, set a print flag and dump all lines until the next dn: line
CONSTRUCTING MAIN BLOCK
print_status=0
while read v_user_entry_item ; do
  # a new entry starts with a dn: line
  v_user_entry_res=$(echo ${v_user_entry_item} | grep "^dn:" | wc -l)
  if [ ${v_user_entry_res} -gt 0 ] ; then
    # extract the cn value from "dn: cn=<USER>, cn=users,..."
    v_username=$(echo ${v_user_entry_item} | awk 'BEGIN {FS=","}{print $1}' | awk 'BEGIN {FS="="}{print $2}')
    if [ "1" == "${myarray1[$v_username]}" ]; then
      print_status=1
    else
      print_status=0
    fi
  fi
  if [ ${print_status} = "1" ] ; then
    echo ${v_user_entry_item} >> content_to_load.ldif
  fi
done < ${v_base_dir}/content_generated_from_cvs.ldif
RUNNING THE SCRIPT
• Script started, working as expected
• But still slow: projected time to complete > 24 hours
• Why is the script taking so long?
• strace -c -f -p <pid>
• The culprits: the cat, grep, sed and awk utilities
• When Bash runs an external command it forks a child process (see the sketch after the strace output)
STRACE OUTPUT
strace -c -f -p 17011
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 95.11    0.134360        1600        84        28 wait4
  1.48    0.002091           6       336           fstat
  1.16    0.001635           3       486           dup2
  0.93    0.001316           0      3140           rt_sigprocmask
  0.20    0.000280           1       249           write
  0.19    0.000264           2       112           getegid
  0.18    0.000251           1       420           mmap
  0.17    0.000246           1       375           open
  0.15    0.000213           0       841        57 close
  0.10    0.000141           0      1035           fcntl
  0.08    0.000111           0       280        84 stat
  0.05    0.000072           1       140        28 access
  0.04    0.000058           2        28           munmap
  0.04    0.000056           0      1000           lseek
  0.03    0.000049           1        84           brk
  0.03    0.000041           0       196           mprotect
  0.02    0.000030           0       621           rt_sigaction
  0.02    0.000029           0       112           getuid
  0.02    0.000028           0       557       557 ioctl
  0.00    0.000000           0       669           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.141271                 11232       754 total
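• The trace is dominated by wait4, i.e. waiting on forked child processes. A minimal, hypothetical micro-benchmark sketch contrasting an external grep per line with a pure Bash string test (the line value is taken from the earlier example):
line="dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com"
# forks a pipeline with an external grep on every iteration
time for i in {1..10000} ; do
  echo "${line}" | grep -q "^dn:"
done
# pure Bash built-in string test - no child process is created
time for i in {1..10000} ; do
  [ "${line:0:3}" == "dn:" ]
done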
BASH STRING PROCESSING
${parameter:offset:length}
string="dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com"
echo ${string:0:3}
dn:
${parameter:offset}
string="cn: SNOWLIS"
echo ${string:4}
SNOWLIS
BASH STRING PROCESSING
string=snowlis
echo ${string^^}
SNOWLIS
string=SNOWLIS
echo ${string,,}
snowlis
REWRITTEN SCRIPT VERSION
echo "Processing full export LDIF..."
print_status=0
# Reading the user list
while read v_user_entry_item ; do
if [ "X${v_user_entry_item:0:3}" == "Xdn:" ] ; then
if [ ${print_status} = "1" ] ; then
echo "${TMP}" >> content_to_load.ldif
print_status=0
fi
TMP=""
fi
if [ "X${v_user_entry_item:0:4}" == "Xcn: " ] ; then
v_user_entry_item_cn=${v_user_entry_item:4}
if [ "1" == "${myarray1[${v_user_entry_item_cn^^}]}" ]; then
print_status=1
myarray1[${v_user_entry_item_cn^^}]=2
else
print_status=0
fi
fi
TMP="${TMP}
${v_user_entry_item}"
done < ${v_base_dir}/content_generated_from_cvs.ldif
if [ ${print_status} = "1" ] ; then
echo "${TMP}" >> content_to_load.ldif
fi
RESULTS
• Script execution time decreased to 4 hours
• Still not fast enough
• Redesigned the script to run in 4 parallel sessions (see the sketch below)
• Split the full dump file into 4 parts with the split command
• Merged the four output files
• Script execution time decreased to 1 hour
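• The parallel version is not shown in the deck; a minimal sketch, assuming the single-threaded filter above is wrapped into a script (here called filter.sh, a hypothetical name) that takes input and output file names. Note that a line-based split can cut an LDIF entry in two, so the chunk boundaries would need checking or the boundary entries re-joining:
# split into 4 line-complete chunks: chunk_00 .. chunk_03 (GNU split)
split -n l/4 -d ${v_base_dir}/content_generated_from_cvs.ldif ${v_base_dir}/chunk_
for f in ${v_base_dir}/chunk_0{0..3} ; do
  ./filter.sh "${f}" "${f}.out" &   # one background worker per chunk
done
wait                                # block until all four workers finish
# merge the four outputs into the final load file
cat ${v_base_dir}/chunk_0{0..3}.out > content_to_load.ldif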
POST PROCESSING & IMPORT
• Replace the basedn, or “tree”, with sed
sed -i "s/cn=Users,dc=test,dc=example,dc=com/cn=users,ou=he,dc=test,dc=example2,dc=net/g" content_to_load.ldif
• Remove internal OID attributes with sed (example below)
• Run the import with ldapadd, the native OID tool
ldapadd -h localhost -p 3060 -D "cn=orcladmin" -w <pwd> -f content_to_load.ldif -c
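• The slides do not list which internal attributes were stripped; a hypothetical example assuming typical operational attributes such as orclguid and the create/modify metadata (the actual list depends on the source directory):
sed -i -e '/^orclguid:/d' \
       -e '/^creatorsname:/d' \
       -e '/^modifiersname:/d' \
       -e '/^createtimestamp:/d' \
       -e '/^modifytimestamp:/d' content_to_load.ldif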
CONCLUSION
• The cat, awk, sed and grep utilities are efficient and useful when working with small files
• When working with huge files, use Bash string processing where possible
• Bash associative arrays can help improve the performance of your scripts
BEER TIME !!!!