The myth of the "small script"
1. The myth of the small script
Mario A. Santini - Full Stack Developer
JUG TAA
2. The Myth Of The Small Script
This is a story about how a problem that seemed
simple is actually rather complicated
3. A New Task
• Your boss: “Please, write a small script that
prints out the 3rd and the 5th column of this
file.”
• Your boss: “Keep in mind, I need it: ASAP!”
4. Have A Deep Look At The Task
• It looks like an easy and short task, but you are
actually missing some important information.
• Most requirements and client expectations
arrive like this.
• It is simple for your boss only because it is easy to
explain, as long as you don’t go deep into the
details.
Actually, software is made of details!
5. Remember, We Don’t Have Time
• In these situations it is better to go
back to your boss with something to show,
not with something to ask about.
• So we guess the missing details and do our best!
6. The Script - Requirements
• We know we have a text file divided into columns
to parse
• We guess the files are on a Linux server
• We guess we only have to print out the result
• The fast and easy way is to write a very short
shell script in bash
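The script itself is not part of this transcript. As a rough sketch only, a first version consistent with the guesses above could look like the following. The function name small_script is hypothetical, and the ./data folder and the comma separator are assumptions borrowed from the v2 script shown later.

```shell
#!/bin/bash
# Hypothetical first version (assumption: the original script is not in this transcript).
# It trusts the caller completely: no help text, no argument check.
small_script() {
    local data="./data"             # folder name borrowed from the later v2 script
    cut -f 3,5 -d , "${data}/$1"    # print the 3rd and 5th comma-separated columns
}

# Intended call: small_script <data_file_name>
```

Called without a file name, the path collapses to the data folder itself, so cut is asked to read a directory and fails.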
8. The Script Doesn’t Work!
• Your boss: “Sorry, I launched your script and got
an error:
cut: Error reading ./data/”
… after a complicated interview with your boss…
• You: “You missed the parameter, that’s why the script fails.”
• Your boss: “Sorry man, I don’t know anything about a
parameter.”
9. The First Amendment
• Add a simple help
• Check that a data file is given as input; if not,
print the help
You don’t know how
people will use your code!
10. The First Amendment
$ cat ./my_short_script_v2.sh
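#!/bin/bash

# Print usage instructions; $1 is the name the script was invoked with.
function help {
    echo "Please run the script as in the following example:"
    echo ""
    echo "$ $1 <data_file_name>"
    echo ""
    echo "<data_file_name>: the name of a csv file in the data folder"
    echo ""
}

# Exactly one argument (the data file name) is required.
if [ "$#" -ne 1 ]; then
    help "$0"
    exit 1    # non-zero status: the script was called incorrectly
fi

data="./data"
# Print the 3rd and 5th comma-separated columns of the given file.
cut -f 3,5 -d , "${data}/$1"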
11. The Second Amendment
• Your boss: “It works with a few files, but sometimes I
get an error. Could you have a look, please?”
• After a little investigation you find that:
• your script doesn’t support fields that contain the
field separator, even if they are quoted and the
csv file is valid;
• your script doesn’t support different field
separator characters.
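The first finding is easy to reproduce. The sketch below feeds cut a made-up row (the sample data and the function name naive_columns are assumptions for illustration) whose second field contains a quoted comma.

```shell
#!/bin/bash
# A valid CSV row: the 2nd field contains the separator inside quotes.
row='1,"Rossi,Mario",dev,IT,2019'

# cut knows nothing about quoting: it splits on every comma, so the quoted
# field counts as two columns and everything after it shifts by one.
naive_columns() {
    printf '%s\n' "$1" | cut -f 3,5 -d ,
}

naive_columns "$row"
```

Instead of the expected dev,2019 this prints Mario",IT: the 3rd "column" is the second half of the quoted name.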
12. Now It Works, But…
• Your boss: “In some files, we need to print out
the 2nd and the 4th columns, instead of the 3rd
and the 5th.”
Things change!
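One possible way to absorb this change while staying in bash is to turn the column list and the separator into options. This is only a sketch: the function name print_columns and the option letters -f and -d (borrowed from cut) are assumptions, and it still does not understand quoted fields.

```shell
#!/bin/bash
# Sketch: columns and separator become options instead of constants.
# Defaults keep the original behaviour (columns 3 and 5, comma separator).
print_columns() {
    local usage="usage: print_columns [-f fields] [-d delimiter] <data_file_name>"
    local fields="3,5" delim="," opt
    local OPTIND=1                  # reset so the function can be called repeatedly
    while getopts "f:d:" opt; do
        case "$opt" in
            f) fields="$OPTARG" ;;
            d) delim="$OPTARG" ;;
            *) echo "$usage" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    # Exactly one positional argument (the data file name) must remain.
    if [ "$#" -ne 1 ]; then
        echo "$usage" >&2
        return 1
    fi
    cut -f "$fields" -d "$delim" "./data/$1"
}
```

For example, print_columns -f 2,4 sample.csv selects the 2nd and 4th columns, and print_columns -d ';' sample.csv reads a semicolon-separated file.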
13. A New Improvement
• Your boss: “I would like to work at home, but I
don’t have any Unix-like OS there.”
14. Possible Solutions
• Docker / Vagrant
… it’s a bunch of gigabytes and more headaches for the future…
• Cygwin…
No thanks… too complicated for my boss.
The better solution is to rewrite your script so that it is
easier to manage:
use a general purpose programming language like Java
(Of course you could use your preferred language too, but remember, we are a JUG! ;) )
15. What “Like Java” Means
• a powerful general purpose language
• can run on many different operating systems
without changes to the source code
• has a long list of available libraries that help you
write less code and do things fast and safely
• supported by a large community
• Last but not least: it is Open Source, better if FOSS
16. What We Need
In order to rewrite the script in Java, we can
even take a shortcut by including useful libraries:
• A lib to handle csv files
• A lib to handle command line options
• An installation framework or tool
• A lib for logging
17. Conclusions
In the real world there is no such thing as “a
small script”.
Even if you are a developer like me, and you have
your HD full of small scripts doing everything you
need to get done.
18. Conclusions 2
The scripts you write for yourself are simple
because you don’t have to deal with many choices:
you design them to fit your needs perfectly.
What’s more, you bend the world to fit your scripts.
If your needs change, you know it, and you
change the scripts first or adapt the environment
as convenient.
19. Conclusions 3
Don’t be afraid to use general purpose languages
like Java.
They are so widespread because you can use them
for anything from short programs, like the one we showed you, to
very complicated systems made of millions of lines
of code.
So keep it easy and simple, as KISS wants, but don’t
be trivial.