Practical unix utilities for text processing

Published in: Technology

  1. grep | sed | awk | xargs | etc.
     Practical *nix utilities (for text processing)
  2. whoami
  3. /
     awk, cat, grep, tac, sed, echo, ls, du, test, mv, split, tail, dir, join, wc, head, vim, tr, sort, sum, cut, expr, uniq, paste, kill, tee
  4. /
     Log, Mega App, Files, DB
  5. pwd | ls | find | tee
     GNU Coreutils
     http://www.gnu.org/software/coreutils/
     The takeaway command: man
     > info coreutils
  6. pwd | ls | find | tee
     List of files:
     ls -l
     ls -1
     ls -latr
     find . -name '*.txt'
  7. pwd | ls | find | tee
     Search for a string in a file:
     grep "cat" file.txt
     grep -v "dog" file.txt
     grep -i "PaTtErN" file.txt
     egrep "cat|dog" file.txt
     zgrep "cat" file.txt.gz
  8. for / xargs
     Do something with each file:
     for file in $(find . -name '*tmp')
     do
         rm "$file"
     done
     find . -name '*tmp' | xargs rm
  9. pwd | ls | find | tee
     find + grep
     find . -name '*txt' -exec grep -l aaa {} \;
     find . -name '*txt' | xargs grep -l aaa
  10. pwd | ls | find | tee
      cat, grep, tac, echo, ls, du, test, mv, split, tail, dir, join, wc, head, tr, sort, sum, cut, expr, uniq, paste, kill, tee
  11. paste
  12. join
  13. sort | uniq
  14. wc
  15. cut
  16. csplit
  17. awk
      sed
  18. sed
      s for substitution
      sed 's/cat/dog/'
      # cat -> dog
      sed 's/\(a\)\(b\)/\2\1/'
      # ab -> ba
  19. sed
      p for printing
      sed -n '/dog/p'
      # print lines that match 'dog'
      sed -n '/start/,/end/p'
      # print the range from a line matching 'start' to a line matching 'end'
  20. sed
      d to delete
      sed '/dog/d'
      # delete lines that match 'dog'
      sed '1,/pattern/d'
      # delete the range from line 1 to the first line matching 'pattern'
  21. sed
      | and -e for invocation
      sed 's/a/A/' | sed 's/b/B/'
      sed -e 's/a/A/' -e 's/b/B/'
  22. sed
      { .. } to group the commands
      sed '/pattern/ {
          s/p/P/
          s/e/E/
      }'
      # pattern -> PattErn
  23. sed
      r to read a file
      sed '/include/ r file.txt'
      # insert file.txt after each line matching 'include'
      w to write to a file
      sed '/pattern/ w file.txt'
      # write matched lines to a file
  24. sedtris
  25. awk
      1.txt:
      aaa bbb ccc
      aaa bbb zzz
      awk '/zzz/' 1.txt
      grep zzz 1.txt
      Both print:
      aaa bbb zzz
  26. awk
      awk 'BEGIN { <initializations> }
      <pattern 1> { <actions> }
      <pattern 2> { <actions> }
      ...
      END { <final actions> }'
  27. awk
      awk 'BEGIN { a = 0; b = 0 }
      /aaa/ { a++ }
      /bbb/ { b++ }
      END { printf "%d %d", a, b }'
  28. awk
      awk '{ arr[$2] += $1 }
      END { for (id in arr)
                printf "%s %d ", id, arr[id] }'
  29. exit
      @antonarhipov
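
The find | xargs pattern from slides 8 and 9 splits file names on whitespace, so a name such as "b c.tmp" is passed to rm as two arguments. Below is a minimal sketch of a safer variant. It assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do), and the scratch directory and file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Scratch directory (hypothetical demo data) so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/a.tmp" "$workdir/b c.tmp"
printf 'aaa\n' > "$workdir/keep.txt"

# Quote the pattern so the shell does not expand it before find sees it.
# -print0 / -0 pass names NUL-separated, so "b c.tmp" stays one argument.
find "$workdir" -name '*.tmp' -print0 | xargs -0 rm -f

# find + grep without xargs: "-exec ... {} +" batches the file names itself.
matches=$(find "$workdir" -name '*.txt' -exec grep -l aaa {} +)

# Count what is left; tr strips the padding some wc versions add.
remaining=$(find "$workdir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"
echo "files containing aaa: $matches"
```

The `-exec ... {} +` form is often the simpler choice when the command accepts many file arguments, because it avoids the pipe altogether.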

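The sed back-reference swap and the awk counting and summing slides can be tried end to end on a small file. This is only a sketch: the sample data ("amount id" per line) and the variable names are invented for illustration and are not from the slides. The output of the per-key sum is piped through sort because awk does not define the order of `for (id in arr)`.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: "<amount> <id>" per line.
data=$(mktemp)
printf '%s\n' '3 cat' '5 dog' '4 cat' > "$data"

# Swap two adjacent characters with back-references: ab -> ba.
swapped=$(echo 'ab' | sed 's/\(a\)\(b\)/\2\1/')

# Count matching lines; statements in BEGIN are separated by ';', not ','.
counts=$(awk 'BEGIN { c = 0; d = 0 }
/cat/ { c++ }
/dog/ { d++ }
END { printf "%d %d\n", c, d }' "$data")

# Sum column 1 per key in column 2; the opening brace must share a line with END.
totals=$(awk '{ arr[$2] += $1 }
END { for (id in arr) printf "%s %d\n", id, arr[id] }' "$data" | sort)

echo "$swapped"
echo "$counts"
echo "$totals"
rm -f "$data"
```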