The document provides information about a bioinformatics practicum including:
1. An agenda for the practicum covering regex, arrays/hashes, and programming in Perl.
2. An introduction to regular expressions including what they are, why they are used, and basic regex atoms.
3. Examples of using regular expressions in Perl including pattern matching, memory parentheses, finding all matches, and substitution functions.
This document provides an introduction to using regular expressions (REs) in the Textpad text editor. It begins with an overview of why REs are useful for searching text, then demonstrates how to write basic REs to find single characters, character ranges, word endings, and more. It also explains how to use RE features like groups, replacements, counting characters, and searching files or folders. The goal is to help uninitiated users learn the basics of writing and using REs in Textpad.
Regular expressions (regex) are a language used to parse text and apply logic and constraints to find patterns in strings. They provide a concise way to find matches in text that is supported across many programming languages. This document provides an overview of regex, examples of code using regex in different languages, and descriptions of common regex patterns and metacharacters used to define matching rules. It recommends resources for further reading on mastering regular expressions.
The Power of Regular Expression: use in notepad++Anjesh Tuladhar
This document provides an overview of using regular expressions in Notepad++ to perform find-and-replace operations on text. It demonstrates how to remove numbers from a list, swap numbers and names, and rearrange names to put the last name first followed by the first name separated by a comma. The explanations describe how regular expression syntax like character sets, wildcards, and capturing groups allow matching and rearranging text in the find-and-replace dialog box.
Regular Expressions: every developer's best friend and worst nightmare! Join Andrei Zmievski, PHP developer and author of the PHP Regex (PCRE) extension, on a journey that will take you from your first steps into the world of regular expressions to the mastery of this most useful of tools. A must for everyone who's ever wondered what /(?=\d+)bar/ means.
This document provides examples and explanations of regular expressions. It covers basic regex syntax like | (or), [] (character sets), . (wildcards), ^/$ (start/end anchors), and *+? (quantifiers). Simple examples demonstrate matching numbers, ranges, and strings. More complex examples show grouping, repetition, and matching BGP autonomous system paths. The document concludes with examples of using regex to filter Cisco IOS show command output.
This document provides an introduction and overview of regular expressions (regex). It begins with an introduction to regex and how they are used. It then outlines the table of contents which covers literal characters, character classes, anchors, boundaries, alternation, optional and repetitive items, and grouping. The document discusses topics like literal characters, special characters, character classes, shorthand classes, negated classes, metacharacters inside classes, dot matching, and start/end of string anchors. It provides examples and explanations of how regex engines work to match patterns.
This document provides an introduction to regular expressions (regexes). It explains that regexes describe patterns of text that can be used to search for and replace text. It covers basic regex syntax like literals, wildcards, anchors, quantifiers, character sets, flags, backreferences, and the RegExp object. It also discusses using regexes in JavaScript string methods, text editors, and command line tools.
This document provides an overview of regular expressions and the grep command in Unix/Linux. It defines what regular expressions are, describes common regex patterns like characters, character classes, anchors, repetition, and groups. It also explains the differences between the grep, egrep, and fgrep commands and provides examples of using grep with regular expressions to search files.
This document provides an introduction to using regular expressions (REs) in the Textpad text editor. It begins with an overview of why REs are useful for searching text, then demonstrates how to write basic REs to find single characters, character ranges, word endings, and more. It also explains how to use RE features like groups, replacements, counting characters, and searching files or folders. The goal is to help uninitiated users learn the basics of writing and using REs in Textpad.
Regular expressions (regex) are a language used to parse text and apply logic and constraints to find patterns in strings. They provide a concise way to find matches in text that is supported across many programming languages. This document provides an overview of regex, examples of code using regex in different languages, and descriptions of common regex patterns and metacharacters used to define matching rules. It recommends resources for further reading on mastering regular expressions.
The Power of Regular Expression: use in notepad++Anjesh Tuladhar
This document provides an overview of using regular expressions in Notepad++ to perform find-and-replace operations on text. It demonstrates how to remove numbers from a list, swap numbers and names, and rearrange names to put the last name first followed by the first name separated by a comma. The explanations describe how regular expression syntax like character sets, wildcards, and capturing groups allow matching and rearranging text in the find-and-replace dialog box.
Regular Expressions: every developer's best friend and worst nightmare! Join Andrei Zmievski, PHP developer and author of the PHP Regex (PCRE) extension, on a journey that will take you from your first steps into the world of regular expressions to the mastery of this most useful of tools. A must for everyone who's ever wondered what /(?=\d+)bar/ means.
This document provides examples and explanations of regular expressions. It covers basic regex syntax like | (or), [] (character sets), . (wildcards), ^/$ (start/end anchors), and *+? (quantifiers). Simple examples demonstrate matching numbers, ranges, and strings. More complex examples show grouping, repetition, and matching BGP autonomous system paths. The document concludes with examples of using regex to filter Cisco IOS show command output.
This document provides an introduction and overview of regular expressions (regex). It begins with an introduction to regex and how they are used. It then outlines the table of contents which covers literal characters, character classes, anchors, boundaries, alternation, optional and repetitive items, and grouping. The document discusses topics like literal characters, special characters, character classes, shorthand classes, negated classes, metacharacters inside classes, dot matching, and start/end of string anchors. It provides examples and explanations of how regex engines work to match patterns.
This document provides an introduction to regular expressions (regexes). It explains that regexes describe patterns of text that can be used to search for and replace text. It covers basic regex syntax like literals, wildcards, anchors, quantifiers, character sets, flags, backreferences, and the RegExp object. It also discusses using regexes in JavaScript string methods, text editors, and command line tools.
This document provides an overview of regular expressions and the grep command in Unix/Linux. It defines what regular expressions are, describes common regex patterns like characters, character classes, anchors, repetition, and groups. It also explains the differences between the grep, egrep, and fgrep commands and provides examples of using grep with regular expressions to search files.
Regular expressions (regex) are text strings that describe search patterns to match or find other strings or sets of strings. They are useful for tasks like syntax highlighting, find and replace, text searching, and programming. The document discusses various regex constructs like single/multiple characters, word groups, quantifiers, grouping, greedy/lazy matching, lookahead/behind, and provides examples of using regex in C# and Linux tools. It notes pros of regex like flexibility and expressing complex patterns concisely, but also cons like difficulty of reading and debugging complex patterns.
Regular expressions are the main way many tools matches patterns within strings.
For example, finding pieces of text within a larger doc, or finding a restriction site within a larger sequence. This slide report illustrates what a RegEx is and what you can do to find, match, compare or replace text of documents or code.
This document provides an overview of regular expressions including what they are, their history and usage, common patterns and syntax, and examples of using regular expressions in Java. Regular expressions allow complex searches and text manipulation through special pattern syntax. They are very powerful for tasks like validation, extraction, replacement and more. The document covers topics such as character classes, quantifiers, capturing groups, boundaries, and internationalization considerations.
Regular expressions allow matching and manipulation of textual data. They were first discovered by mathematician Stephen Kleene and their search algorithm was developed by Ken Thompson in 1968 for use in tools like ed, grep, and sed. Regular expressions follow certain grammars and use meta characters to match patterns in text. They are used for tasks like validation, parsing, and data conversion.
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
This document discusses regular expressions and provides examples of using regular expression functions in SQL. It begins with an introduction to regular expressions and meta characters. It then demonstrates the REGEXP_LIKE, REGEXP_REPLACE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_COUNT functions with examples. The document is presented by Danny Bryant, an IT manager at the City of Atlanta.
Introduction to Regular Expressions RootsTech 2013Ben Brumfield
This document provides an introduction to regular expressions. It defines regular expressions as a small language for describing text patterns. Regular expressions allow for powerful search and replace operations in text. The document outlines some basic regular expression syntax including characters, character classes, quantifiers, anchors, grouping, alternation, and capture groups. It also discusses using regular expressions for tasks like validation, reformatting, and search/replace.
PT.BUZOO INDONESIA is No1 Japanese offshore development company in Indonesia.
We are professional of web solution and smartphone apps. We can support Japanese, English and Indonesia.
We are hiring now at http://buzoo.co.id/
Regular expressions (regex) allow defining patterns to match in text strings using special characters like ., *, + to match types of characters or ranges. They have components like literal characters, character classes, shorthand classes, quantifiers and modifiers that define what is matched. PHP supports both PCRE and POSIX regex with functions like preg_match() and ereg_replace() and there are online tools to test and learn regular expressions.
Become a deft manipulator of text data. Regular Expression is the miracle of text extraction. If you got a text patten in mind, you can write your own pattern match in regular expression.
Regular expressions are patterns used to match strings in programs like sed, awk, grep, and others. They descend from finite automata theory and are used for tasks like searching text files, lexical analysis of programming languages, and early web search engines. In Linux systems, regular expressions are used by commands like sed, grep, and vi. Sed can perform editing operations on lines that match regular expression addresses.
The document provides information about a bioinformatics practicum covering topics like installing and using TextPad, regular expressions (regex), arrays/hashes, variables, flow control, loops, input/output, subroutines, and the three basic Perl data types of scalars, arrays, and hashes. It also discusses customizing TextPad, calculating Pi using random numbers, Buffon's needle problem, programming concepts like warnings and strict mode, what a regex is and why you would use one, regex atoms and quantifiers, anchors, grouping, alternation, and variable interpolation. Pattern matching, memory parentheses, finding all matches, greediness, the substitute function, and the translate function are also summarized.
This document provides an overview of regular expressions in Python. It discusses how regular expressions can be used to search, match, replace, and split strings. The syntax for common regular expression patterns is explained, including character classes, grouping, quantifiers, anchors, and more. Examples are given showing how to use the re module to compile patterns, search and match strings, and extract matched groups. Regular expressions in Python provide a powerful tool for manipulating text data.
Regular expressions are patterns used to match character combinations in strings. They allow concise testing of string properties and manipulation of strings through search, match, and replacement. The document outlines basic regular expression syntax like wildcards, character sets, and flags. It provides examples of using regex to validate input format and extract postal codes and phone numbers through capturing groups. Search finds matches, match returns an array of all matches, and replace substitutes matches using a function.
This document provides an overview of regular expressions (RE) for PHP, including what they are, why they are useful, basic syntax, and examples of using RE to match, find, replace, and split text. RE allow complex text matching and are available in most programming languages including PHP. They can be used to validate formats, extract data, and make replacements in text.
This document provides an introduction to regular expressions including syntax, special characters, metacharacters, order of precedence, examples, and references. It begins with an agenda and overview of syntax. It then defines special characters and metacharacters used in regular expressions. Examples are given for common patterns like digits, email addresses, and credit cards. References for further reading on regular expressions are also included. The document concludes with contact information for questions.
Folio3 is a development partner that focuses on designing custom enterprise, mobile, and social media applications. It was founded in 2005 and has over 200 employees across offices in the US, Canada, Bulgaria, and Pakistan. Folio3 provides services for areas like enterprise solutions, mobile apps, and websites. Some of its clients include companies in healthcare, digital media, and supply chain industries. The document then provides an overview of regular expressions including literal characters, special characters, character classes, grouping, backreferences, and lookarounds.
^Regular Expressions is one of those tools that every developer should have in their toolbox. You can do your job without regular expressions, but knowing when and how to use them will make you a much more efficient and marketable developer. You'll learn how regular expressions can be used for validating user input, parsing text, and refactoring code. We'll also cover various tools that can be used to help you write and share expressions.$
The document discusses regular expressions (regex), which are symbols used to match patterns in text. It provides examples of regex for identifiers, passwords, emails, IPv4 addresses, and IPv6 addresses. It also covers the basics of regex syntax, including characters, quantifiers, anchors, character classes, groups and backreferences. The document notes that regex are used for text matching, searching and replacing. It lists several programming languages and tools that support regex and provides areas where regex are commonly applied, such as validation and pattern matching.
This document provides 15 examples of using the Linux grep command to search files for text patterns. Some key examples include searching for a string in single or multiple files, ignoring case, using regular expressions to match patterns, displaying lines before/after/around matches, inverting matches, counting matches, and highlighting matched text. The examples demonstrate many useful grep options for finding text in files.
This document provides an overview and introduction to using regular expressions (regex) in Perl. It discusses the basic building blocks of regex patterns including characters, character classes, quantifiers, anchors, grouping, alternation, and interpolation. It explains how to use the binding operator (=~) to apply a regex to a variable. It also covers retrieving matched substrings using pattern memory ($1, $2 etc.) and finding all matches using the 'g' modifier. The document demonstrates substituting text using the s/// function and provides an example of the tr/// function.
The document discusses topics related to practicing bioinformatics including:
- Installing and working with the TextPad text editor
- Regular expressions (regex), including patterns, quantifiers, anchors, grouping, alternation, and variable interpolation
- Using regex memory variables ($1, $2, etc.) to extract matched substrings
- The s/// substitution operator and tr/// translation operator
- Applying these skills to tasks like finding restriction enzyme cut sites in DNA sequences
Regular expressions (regex) are text strings that describe search patterns to match or find other strings or sets of strings. They are useful for tasks like syntax highlighting, find and replace, text searching, and programming. The document discusses various regex constructs like single/multiple characters, word groups, quantifiers, grouping, greedy/lazy matching, lookahead/behind, and provides examples of using regex in C# and Linux tools. It notes pros of regex like flexibility and expressing complex patterns concisely, but also cons like difficulty of reading and debugging complex patterns.
Regular expressions are the main way many tools matches patterns within strings.
For example, finding pieces of text within a larger doc, or finding a restriction site within a larger sequence. This slide report illustrates what a RegEx is and what you can do to find, match, compare or replace text of documents or code.
This document provides an overview of regular expressions including what they are, their history and usage, common patterns and syntax, and examples of using regular expressions in Java. Regular expressions allow complex searches and text manipulation through special pattern syntax. They are very powerful for tasks like validation, extraction, replacement and more. The document covers topics such as character classes, quantifiers, capturing groups, boundaries, and internationalization considerations.
Regular expressions allow matching and manipulation of textual data. They were first discovered by mathematician Stephen Kleene and their search algorithm was developed by Ken Thompson in 1968 for use in tools like ed, grep, and sed. Regular expressions follow certain grammars and use meta characters to match patterns in text. They are used for tasks like validation, parsing, and data conversion.
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
This document discusses regular expressions and provides examples of using regular expression functions in SQL. It begins with an introduction to regular expressions and meta characters. It then demonstrates the REGEXP_LIKE, REGEXP_REPLACE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_COUNT functions with examples. The document is presented by Danny Bryant, an IT manager at the City of Atlanta.
Introduction to Regular Expressions RootsTech 2013Ben Brumfield
This document provides an introduction to regular expressions. It defines regular expressions as a small language for describing text patterns. Regular expressions allow for powerful search and replace operations in text. The document outlines some basic regular expression syntax including characters, character classes, quantifiers, anchors, grouping, alternation, and capture groups. It also discusses using regular expressions for tasks like validation, reformatting, and search/replace.
PT.BUZOO INDONESIA is No1 Japanese offshore development company in Indonesia.
We are professional of web solution and smartphone apps. We can support Japanese, English and Indonesia.
We are hiring now at http://buzoo.co.id/
Regular expressions (regex) allow defining patterns to match in text strings using special characters like ., *, + to match types of characters or ranges. They have components like literal characters, character classes, shorthand classes, quantifiers and modifiers that define what is matched. PHP supports both PCRE and POSIX regex with functions like preg_match() and ereg_replace() and there are online tools to test and learn regular expressions.
Become a deft manipulator of text data. Regular Expression is the miracle of text extraction. If you got a text patten in mind, you can write your own pattern match in regular expression.
Regular expressions are patterns used to match strings in programs like sed, awk, grep, and others. They descend from finite automata theory and are used for tasks like searching text files, lexical analysis of programming languages, and early web search engines. In Linux systems, regular expressions are used by commands like sed, grep, and vi. Sed can perform editing operations on lines that match regular expression addresses.
The document provides information about a bioinformatics practicum covering topics like installing and using TextPad, regular expressions (regex), arrays/hashes, variables, flow control, loops, input/output, subroutines, and the three basic Perl data types of scalars, arrays, and hashes. It also discusses customizing TextPad, calculating Pi using random numbers, Buffon's needle problem, programming concepts like warnings and strict mode, what a regex is and why you would use one, regex atoms and quantifiers, anchors, grouping, alternation, and variable interpolation. Pattern matching, memory parentheses, finding all matches, greediness, the substitute function, and the translate function are also summarized.
This document provides an overview of regular expressions in Python. It discusses how regular expressions can be used to search, match, replace, and split strings. The syntax for common regular expression patterns is explained, including character classes, grouping, quantifiers, anchors, and more. Examples are given showing how to use the re module to compile patterns, search and match strings, and extract matched groups. Regular expressions in Python provide a powerful tool for manipulating text data.
Regular expressions are patterns used to match character combinations in strings. They allow concise testing of string properties and manipulation of strings through search, match, and replacement. The document outlines basic regular expression syntax like wildcards, character sets, and flags. It provides examples of using regex to validate input format and extract postal codes and phone numbers through capturing groups. Search finds matches, match returns an array of all matches, and replace substitutes matches using a function.
This document provides an overview of regular expressions (RE) for PHP, including what they are, why they are useful, basic syntax, and examples of using RE to match, find, replace, and split text. RE allow complex text matching and are available in most programming languages including PHP. They can be used to validate formats, extract data, and make replacements in text.
This document provides an introduction to regular expressions including syntax, special characters, metacharacters, order of precedence, examples, and references. It begins with an agenda and overview of syntax. It then defines special characters and metacharacters used in regular expressions. Examples are given for common patterns like digits, email addresses, and credit cards. References for further reading on regular expressions are also included. The document concludes with contact information for questions.
Folio3 is a development partner that focuses on designing custom enterprise, mobile, and social media applications. It was founded in 2005 and has over 200 employees across offices in the US, Canada, Bulgaria, and Pakistan. Folio3 provides services for areas like enterprise solutions, mobile apps, and websites. Some of its clients include companies in healthcare, digital media, and supply chain industries. The document then provides an overview of regular expressions including literal characters, special characters, character classes, grouping, backreferences, and lookarounds.
^Regular Expressions is one of those tools that every developer should have in their toolbox. You can do your job without regular expressions, but knowing when and how to use them will make you a much more efficient and marketable developer. You'll learn how regular expressions can be used for validating user input, parsing text, and refactoring code. We'll also cover various tools that can be used to help you write and share expressions.$
The document discusses regular expressions (regex), which are symbols used to match patterns in text. It provides examples of regex for identifiers, passwords, emails, IPv4 addresses, and IPv6 addresses. It also covers the basics of regex syntax, including characters, quantifiers, anchors, character classes, groups and backreferences. The document notes that regex are used for text matching, searching and replacing. It lists several programming languages and tools that support regex and provides areas where regex are commonly applied, such as validation and pattern matching.
This document provides 15 examples of using the Linux grep command to search files for text patterns. Some key examples include searching for a string in single or multiple files, ignoring case, using regular expressions to match patterns, displaying lines before/after/around matches, inverting matches, counting matches, and highlighting matched text. The examples demonstrate many useful grep options for finding text in files.
This document provides an overview and introduction to using regular expressions (regex) in Perl. It discusses the basic building blocks of regex patterns including characters, character classes, quantifiers, anchors, grouping, alternation, and interpolation. It explains how to use the binding operator (=~) to apply a regex to a variable. It also covers retrieving matched substrings using pattern memory ($1, $2 etc.) and finding all matches using the 'g' modifier. The document demonstrates substituting text using the s/// function and provides an example of the tr/// function.
The document discusses topics related to practicing bioinformatics including:
- Installing and working with the TextPad text editor
- Regular expressions (regex), including patterns, quantifiers, anchors, grouping, alternation, and variable interpolation
- Using regex memory variables ($1, $2, etc.) to extract matched substrings
- The s/// substitution operator and tr/// translation operator
- Applying these skills to tasks like finding restriction enzyme cut sites in DNA sequences
Regular expressions are a powerful tool for searching, parsing, and modifying text patterns. They allow complex patterns to be matched with simple syntax. Some key uses of regular expressions include validating formats, extracting data from strings, and finding and replacing substrings. Regular expressions use special characters to match literal, quantified, grouped, and alternative character patterns across the start, end, or anywhere within a string.
This document provides an introduction to regular expressions (regex). It discusses the history and applications of regex, common operators and constructs like quantifiers and character classes, and examples of regex patterns for tasks like validation of emails and IP addresses. It also provides resources for learning more about regex like books, websites and tools for testing regex.
Don't Fear the Regex - CapitalCamp/GovDays 2014Sandy Smith
Have you been scared off by Klingon-looking one-liners in Perl? Do you resort to writing complicated recursive functions just to parse some HTML? Don't!
I'll demystify regular expressions and show you how best to do them in PHP. We'll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I'll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
Given at CapitalCamp & GovDays 2014
This lecture discusses the concept of Regular Expressions along with its usage in different tools such as grep, sed, and awk
Check the other Lectures and courses in
http://Linux4EnbeddedSystems.com
or Follow our Facebook Group at
- Facebook: @LinuxforEmbeddedSystems
Lecturer Profile:
- https://www.linkedin.com/in/ahmedelarabawy
This document provides an introduction to regular expressions (regex) in PHP. It begins with a basic explanation of what regex is for - matching patterns in strings. It then covers the basics of regex syntax, including delimiters, character classes, quantifiers, escaping special characters, and using regex in PHP functions like preg_match() and preg_replace(). It also discusses more advanced topics like character classes, subpatterns, backreferences, modifiers, and when not to use regex. The overall message is that regex is a powerful tool for text manipulation but needs to be used appropriately.
This is the ninth set of slightly updated slides from a Perl programming course that I held some years ago.
I want to share it with everyone looking for intransitive Perl-knowledge.
A table of content for all presentations can be found at i-can.eu.
The source code for the examples and the presentations in ODP format are on https://github.com/kberov/PerlProgrammingCourse
This document introduces regular expressions (regex), which are patterns used to match character combinations in strings. Regex can be used for text search/replace, validation, and other string manipulation tasks. The basics covered include matching characters, character classes, quantifiers, grouping, alternation, anchors, and capturing subgroups for replacement. Examples demonstrate matching names, dates, URLs, and other patterns.
This document provides an introduction to regular expressions (RegEx). It explains that RegEx allows you to find, match, compare or replace text patterns. It then discusses the basic building blocks of RegEx, including characters, character classes, quantifiers, and assertions. It provides several examples of RegEx patterns to match names, words, ports numbers, and other patterns. It concludes with an overview of common RegEx match types like beginning/end of line, word boundaries, grouping, alternatives, and repetition.
Do you have data and lists you keep having to massage to make it useful for your project? Have you heard of regular expressions but been frightened by the Klingon-looking examples? Fear no longer!
I’ll demystify regular expressions and show you how best to do them in PHP. We’ll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I’ll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
Don't Fear the Regex - Northeast PHP 2015Sandy Smith
Have you been scared off by Klingon-looking one-liners in Perl? Do you resort to writing complicated recursive functions just to parse some HTML? Don’t!
I’ll demystify regular expressions and show you how best to do them in PHP. We’ll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I’ll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
The document provides an introduction to Perl programming and regular expressions. It begins with simple Perl programs to print text and take user input. It then covers executing external commands, variables, operators, loops, and file operations. The document also introduces regular expressions, explaining patterns, anchors, character classes, alternation, grouping, and repetition quantifiers. It provides examples and discusses principles for matching strings with regular expressions.
This document provides an overview of string matching and regular expression functions in Perl. It defines functions like split and join for manipulating strings. It explains different ways of specifying regular expressions using delimiters and modifiers. Special variables like $', $& and $` that capture matched strings are also described. Examples are given to illustrate string substitution operators and capturing matched text using parentheses in regular expressions.
Regular expressions are a powerful tool for searching, matching, and parsing text patterns. They allow complex text patterns to be matched with a standardized syntax. All modern programming languages include regular expression libraries. Regular expressions can be used to search strings, replace parts of strings, split strings, and find all occurrences of a pattern in a string. They are useful for tasks like validating formats, parsing text, and finding/replacing text. This document provides examples of common regular expression patterns and methods for using regular expressions in Python.
The document contains information about regular expressions (regexes), including their basic structure, common features like character types and anchors, and more advanced concepts like groups, capturing groups, backreferences, and quantifiers. It provides examples to demonstrate how different regex patterns can be used to match text. The document is intended as a reference for understanding regex syntax and utilizing various regex constructs.
The document provides an introduction to regular expressions (regex). It covers basic regex syntax including character classes, repetition operators, search flags, and grouping. It also discusses matching, replacing, and combining different regex modes. Examples are provided for text matching, validation of file names and user input, and replacing patterns within strings.
This document summarizes different ways to work with strings in PHP, including single quoted, double quoted, and heredoc strings. It also discusses common string functions for length, searching, replacing, formatting, and regular expressions. Regular expressions provide a way to match patterns and subexpressions in strings through the use of special characters, quantifiers, and delimiters. Functions like preg_match() and preg_replace() allow working with regular expression patterns in PHP.
The document provides an overview of regular expressions (regex), including how a regex engine works, common applications and techniques, and components of a regex pattern. It discusses matching, extracting, and replacing text using regex as well as operators, character classes, anchors, repetition, groups, alternation, lookahead/lookbehind, and techniques for optimizing expressions. The document is intended to introduce readers to the basics of text processing with regex.
This document provides an overview of bioinformatics and biological databases. It discusses how bioinformatics draws from fields like biology, computer science, statistics, and machine learning. Biological databases are important resources for bioinformatics that can be searched and analyzed to answer questions, find similar sequences, locate patterns, and make predictions. The document also outlines common uses of biological databases, such as annotation searches, homology searches, pattern searches, and predictive analyses.
The document discusses the Rh blood group system and its clinical significance. It describes the key observations in 1939 that linked adverse reactions in mothers to stillborn fetuses and blood transfusions from fathers, indicating a relationship. This syndrome is now called hemolytic disease of the fetus and newborn. The Rh system was identified in 1940 through experiments immunizing animals with Rhesus macaque monkey red blood cells. The D antigen is the most important RBC antigen in transfusion practice, as those lacking it do not produce anti-D antibody unless exposed to D antigen through transfusion or pregnancy. Testing for D is routinely performed to ensure D-negative patients receive D-negative blood.
The document discusses views and materialized views in data warehousing and decision support systems. It covers three main points:
1) OLAP queries typically involve aggregate queries, so precomputation is essential for fast response times. Materialized views allow precomputing aggregates across multiple dimensions.
2) Warehouses can be thought of as collections of asynchronously replicated tables and periodically maintained views, renewing interest in efficient view maintenance.
3) Materialized views store the results of views in the database for fast access like a cache, but they require maintenance as underlying tables change. Incremental maintenance algorithms are ideal to efficiently update materialized views.
The document discusses various database concepts including normalization, which is used to design optimal relation schemas by removing redundant data. It also covers transaction processing, which involves executing logical database operations as transactions to maintain data integrity. Database systems use techniques like logging and concurrency control to prevent transaction anomalies and ensure failures can be recovered from.
This document contains a list of names, emails, and study programs of students. It includes their official student code, last name, first name, email, and educational program. There are 20 students listed with their details.
This document discusses the Biological Databases project being conducted by a group of students. The project involves using the video game Minecraft to visualize protein structures retrieved from the Protein Data Bank (PDB). Python scripts are used to import PDB data files and place blocks in Minecraft to represent atoms, with different block colors used to distinguish atom types. SPARQL queries are also employed to search the RDF version of the PDB for protein entries. The goal is to build 3D protein models inside Minecraft for educational and visualization purposes.
The document discusses various bioinformatics tools and algorithms for analyzing protein sequences, including Biopython for working with biological sequence data, the Kyte-Doolittle algorithm for predicting transmembrane regions, and the Chou-Fasman algorithm for predicting secondary structure from amino acid preferences for alpha helices, beta sheets, and random coils. It also provides examples of analyzing Swiss-Prot data to find properties of human proteins and applying these tools and libraries to extract insights from protein sequences.
The document discusses various topics related to analyzing protein sequences using Python and Biopython. It provides examples of using Biopython to parse sequence data from UniProt, calculate lengths and translations of sequences. It also discusses analyzing properties of sequences like molecular weight, isoelectric point, transmembrane regions, and comparing sequences to find conserved motifs. Finally, it introduces hydropathy indices and tools for predicting properties like transmembrane helices from primary sequences.
This document discusses Python functions. It explains that there are built-in functions provided as part of Python and user-defined functions. User-defined functions are created using the def keyword and can take parameters and return values. The body of a function is indented and runs when the function is called. Functions allow code to be reused and organized in a modular way. Examples are provided to demonstrate defining and calling functions with different parameters and return values.
The document provides a recap of Python programming concepts like conditions and statements, while loops, for loops, break and continue statements, and working with strings. It also introduces regular expressions as a way to match patterns in strings using a formal language that can be interpreted by a regular expression processor.
[SUMMARY
This document discusses next generation DNA sequencing technologies. It begins by describing some of the limitations of traditional Sanger sequencing, such as read lengths of 500-1000 bases and throughput of 57,000 bases per run. It then introduces some key next generation sequencing technologies, such as 454 sequencing which uses emulsion PCR and pyrosequencing to achieve read lengths of 20-100 bases but higher throughput of 20-100 Mb per run. Illumina/Solexa sequencing is also discussed, which uses sequencing by synthesis with reversible terminators and laser-based detection. Finally, third generation sequencing technologies are mentioned, such as Pacific Biosciences' single molecule real time sequencing and nanopore sequencing. In summary, the document provides a high-level
The document provides an overview of the history and evolution of various programming languages. It discusses early languages like FORTRAN, LISP, PASCAL, C, and Java. It also covers scripting languages and their uses. The document explains what Python is as a programming language - that it is interpreted, object-oriented, and high-level. It was named after Monty Python and was created by Guido van Rossum. The document then gives examples of using Python to program Minecraft by importing protein data from PDB files and using coordinates to place blocks to visualize proteins in the game.
This document provides an introduction to bio-ontologies and the semantic web. It discusses what ontologies are and how they are used in the bio domain through initiatives like the OBO Foundry. It introduces key semantic web technologies like RDF, URIs, Turtle syntax, and SPARQL query language. It provides examples of ontologies like the Gene Ontology and how ontologies can be represented and queried using these semantic web standards.
This document provides an overview of NoSQL databases, including:
- Key-value stores store data as maps or hashmaps and are efficient for data access but limited in query capabilities.
- Column-oriented stores group attributes into column families and store data efficiently but are operationally challenging.
- Document databases store loosely structured data like JSON and allow retrieving documents by keys or contents.
- Graph databases are suited for interaction networks and path finding but are less suited for tabular data.
The document discusses creating a multicore database project. It recommends taking the following steps:
1. Define what the project is about, what it aims to achieve, and who it is for.
2. Identify information resources and develop a basic data model.
3. Design a user interface mockup without technical constraints, thinking creatively.
This document discusses biological databases and PHP. It begins with an overview of biological databases and examples using BIOSQL to load genetic data from GenBank into a MySQL database. It then provides examples of building a basic 3-tier model with Apache, PHP, and a MySQL backend database. The document also includes a brief introduction to PHP, covering its history, why it is commonly used, and basic syntax like conditional statements.
This document discusses biological databases and SQL. It provides an overview of primary and derived data in biological research, as well as different data levels. It then discusses direct querying of selected bioinformatics databases using SQL and provides examples of 3-tier database models. The document proceeds to discuss rationale for learning SQL to query biological databases and provides definitions and explanations of key SQL concepts like tables, records, queries, data types, keys, integrity rules and constraints.
This document discusses biological databases and bioinformatics. It begins with an overview of bioinformatics as an interdisciplinary field combining biology, computer science, and information technology. It then discusses different types of biological databases, including those focused on sequences, pathways, protein structures, and gene expression. The document outlines some common uses of biological databases, including searching for annotations, identifying similar sequences through homology, searching for patterns, and making predictions. It also briefly discusses comparing data across databases. The summary provides a high-level overview of the key topics and uses of biological databases covered in the document.
The document discusses several topics related to protein structure prediction using Python:
1. It introduces the Chou-Fasman algorithm for predicting protein secondary structure from amino acid sequence. The algorithm calculates preference parameters for each amino acid to be in alpha helices, beta sheets, or other structures.
2. It provides an example of calculating helical propensity.
3. It lists the preference parameters output by the Chou-Fasman algorithm for each amino acid.
4. It outlines the steps of applying the Chou-Fasman algorithm to predict secondary structure elements in a protein sequence.
The document provides information on various Python programming concepts including control structures, lists, dictionaries, regular expressions, exceptions, and biological applications using Biopython. It discusses if/else statements, while and for loops, list operations, dictionary usage, regex patterns, exception handling roles, and gives examples analyzing protein sequences and structures using Biopython.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
How to Fix the Import Error in the Odoo 17Celine George
An import error occurs when a program fails to import a module or library, disrupting its execution. In languages like Python, this issue arises when the specified module cannot be found or accessed, hindering the program's functionality. Resolving import errors is crucial for maintaining smooth software operation and uninterrupted development processes.
The simplified electron and muon model, Oscillating Spacetime: The Foundation...RitikBhardwaj56
Discover the Simplified Electron and Muon Model: A New Wave-Based Approach to Understanding Particles delves into a groundbreaking theory that presents electrons and muons as rotating soliton waves within oscillating spacetime. Geared towards students, researchers, and science buffs, this book breaks down complex ideas into simple explanations. It covers topics such as electron waves, temporal dynamics, and the implications of this model on particle physics. With clear illustrations and easy-to-follow explanations, readers will gain a new outlook on the universe's fundamental nature.
Strategies for Effective Upskilling is a presentation by Chinwendu Peace in a Your Skill Boost Masterclass organisation by the Excellence Foundation for South Sudan on 08th and 09th June 2024 from 1 PM to 3 PM on each day.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...PECB
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
-------------------------------------------------------------------------------
Find out more about ISO training and certification services
Training: ISO/IEC 27001 Information Security Management System - EN | PECB
ISO/IEC 42001 Artificial Intelligence Management System - EN | PECB
General Data Protection Regulation (GDPR) - Training Courses - EN | PECB
Webinars: https://pecb.com/webinars
Article: https://pecb.com/article
-------------------------------------------------------------------------------
For more information about PECB:
Website: https://pecb.com/
LinkedIn: https://www.linkedin.com/company/pecb/
Facebook: https://www.facebook.com/PECBInternational/
Slideshare: http://www.slideshare.net/PECBCERTIFICATION
Assessment and Planning in Educational technology.pptxKavitha Krishnan
In an education system, it is understood that assessment is only for the students, but on the other hand, the Assessment of teachers is also an important aspect of the education system that ensures teachers are providing high-quality instruction to students. The assessment process can be used to provide feedback and support for professional development, to inform decisions about teacher retention or promotion, or to evaluate teacher effectiveness for accountability purposes.
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
How to Build a Module in Odoo 17 Using the Scaffold MethodCeline George
Odoo provides an option for creating a module by using a single line command. By using this command the user can make a whole structure of a module. It is very easy for a beginner to make a module. There is no need to make each file manually. This slide will show how to create a module using the scaffold method.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
12. Introduction
Buffon's Needle is one of the oldest problems in the
field of geometrical probability. It was first stated
in 1777. It involves dropping a needle on a lined
sheet of paper and determining the probability of
the needle crossing one of the lines on the page.
The remarkable result is that the probability is
directly related to the value of pi.
http://www.angelfire.com/wa/hurben/buff.html
In Postscript you send it too the printer … PS has no
variables but “stacks”, you can mimick this in Perl
by recursively loading and rewriting a subroutine
14. Two brief diversions (warnings & strict)
• Use warnings;
• Use strict;
• strict – forces you to „declare‟ a variable the
first time you use it.
– usage: use strict; (somewhere near the top of
your script)
• declare variables with „my‟
– usage: my $variable;
– or: my $variable = „value‟;
• my sets the „scope‟ of the variable. Variable
exists only within the current block of code
• use strict and my both help you to debug
errors, and help prevent mistakes.
16. What is a regular expression?
• A regular expression (regex) is simply a
way of describing text.
• Regular expressions are built up of small
units (atoms) which can represent the
type and number of characters in the text
• Regular expressions can be very broad
(describing everything), or very narrow
(describing only one pattern).
17. Why would you use a regex?
• Often you wish to test a string for the
presence of a specific character, word,
or phrase
– Examples
• “Are there any letter characters in my string?”
• “Is this a valid accession number?”
• “Does my sequence contain a start codon
(ATG)?”
18. Regular Expressions
Match to a sequence of characters
The EcoRI restriction enzyme cuts at the consensus
sequence GAATTC.
To find out whether a sequence contains a restriction site
for EcoR1, write;
if ($sequence =~ /GAATTC/) {
...
};
20. Regular Expressions
Match to a character class
• Example
• The BstYI restriction enzyme cuts at the consensus sequence rGATCy,
namely A or G in the first position, then GATC, and then T or C. To find
out whether a sequence contains a restriction site for BstYI, write;
• if ($sequence =~ /[AG]GATC[TC]/) {...}; # This will match all of
AGATCT, GGATCT, AGATCC, GGATCC.
Definition
• When a list of characters is enclosed in square brackets [], one and
only one of these characters must be present at the corresponding
position of the string in order for the pattern to match. You may specify
a range of characters using a hyphen -.
• A caret ^ at the front of the list negates the character class.
Examples
• if ($string =~ /[AGTC]/) {...}; # matches any nucleotide
• if ($string =~ /[a-z]/) {...}; # matches any lowercase letter
• if ($string =~ /chromosome[1-6]/) {...}; # matches chromosome1,
chromosome2 ... chromosome6
• if ($string =~ /[^xyzXYZ]/) {...}; # matches any character except x, X, y,
Y, z, Z
21. Constructing a Regex
• Pattern starts and ends with a / /pattern/
– if you want to match a /, you need to escape it
• / (backslash, forward slash)
– you can change the delimiter to some other
character, but you probably won‟t need to
• m|pattern|
• any „modifiers‟ to the pattern go after the last /
• i : case insensitive /[a-z]/i
• o : compile once
• g : match in list context (global)
• m or s : match over multiple lines
22. Looking for a pattern
• By default, a regular expression is applied to
$_ (the default variable)
– if (/a+/) {die}
• looks for one or more „a‟ in $_
• If you want to look for the pattern in any other
variable, you must use the bind operator
– if ($value =~ /a+/) {die}
• looks for one or more „a‟ in $value
• The bind operator is in no way similar to the
„=„ sign!! = is assignment, =~ is bind.
– if ($value = /[a-z]/) {die}
• Looks for one or more „a‟ in $_, not $value!!!
23. Regular Expression Atoms
• An „atom‟ is the smallest unit of a regular
expression.
• Character atoms
• 0-9, a-Z match themselves
• . (dot) matches everything
• [atgcATGC] : A character class (group)
• [a-z] : another character class, a through z
24. More atoms
• d - All Digits
• D - Any non-Digit
• s - Any Whitespace (s, t, n)
• S - Any non-Whitespace
• w - Any Word character [a-zA-Z_0-9]
• W - Any non-Word character
25. An example
• if your pattern is /ddd-dddd/
– You could match
• 555-1212
• 5512-12222
• 555-5155-55
– But not:
• 55-1212
• 555-121
• 555j-5555
26. Quantifiers
• You can specify the number of times
you want to see an atom. Examples
• d* : Zero or more times
• d+ : One or more times
• d{3} : Exactly three times
• d{4,7} : At least four, and not more than
seven
• d{3,} : Three or more times
• We could rewrite /ddd-dddd/ as:
– /d{3}-d{4}/
27. Anchors
• Anchors force a pattern match to a
certain location
• ^ : start matching at beginning of string
• $ : start matching at end of string
• b : match at word boundary (between w and
W)
• Example:
• /^ddd-dddd$/ : matches only valid phone
numbers
28. Grouping
• You can group atoms together with
parentheses
• /cat+/ matches cat, catt, cattt
• /(cat)+/ matches cat, catcat, catcatcat
• Use as many sets of parentheses as you
need
29. Alternation
• You can specify patterns which match
either one thing or another.
– /cat|dog/ matches either „cat‟ or „dog‟
– /ca(t|d)og/ matches either „catog‟ or „cadog‟
30. Variable interpolation
• You can put variables into your pattern.
– if $string = „cat‟
• /$string/ matches „cat‟
• /$string+/ matches „cat‟, „catcat‟, etc.
• /d{2}$string+/ matches „12cat‟, „24catcat‟, etc.
31. Regular Expression Review
• A regular expression (regex) is a way of
describing text.
• Regular expressions are built up of small
units (atoms) which can represent the type
and number of characters in the text
• You can group or quantify atoms to describe
your pattern
• Always use the bind operator (=~) to apply
your regular expression to a variable
32. Remembering Stuff
• Being able to match patterns is good,
but limited.
• We want to be able to keep portions of
the regular expression for later.
– Example: $string = „phone: 353-7236‟
• We want to keep the phone number only
• Just figuring out that the string contains a
phone number is insufficient, we need to keep
the number as well.
33. Memory Parentheses (pattern memory)
• Since we almost always want to keep
portions of the string we have matched,
there is a mechanism built into perl.
• Anything in parentheses within the
regular expression is kept in memory.
– „phone:353-7236‟ =~ /^phone:(.+)$/;
• Perl knows we want to keep everything that
matches „.+‟ in the above pattern
34. Getting at pattern memory
• Perl stores the matches in a series of default
variables. The first parentheses set goes into
$1, second into $2, etc.
– This is why we can‟t name variables ${digit}
– Memory variables are created only in the amounts
needed. If you have three sets of parentheses, you
have ($1,$2,$3).
– Memory variables are created for each matched set
of parentheses. If you have one set contained
within another set, you get two variables (inner set
gets lowest number)
– Memory variables are only valid in the current scope
35. An example of pattern memory
my $string = shift;
if ($string =~ /^phone:(d{3}-d{4})$/){
$phone_number = $1;
}
else {
print “Enter a phone number!n”
}
36. Finding all instances of a match
• Use the „g‟ modifier to the regular
expression
– @sites = $sequence =~ /(TATTA)/g;
– think g for global
– Returns a list of all the matches (in order),
and stores them in the array
– If you have more than one pair of
parentheses, your array gets values in sets
• ($1,$2,$3,$1,$2,$3...)
37. Perl is Greedy
• In addition to taking all your time, perl regular
expressions also try to match the largest
possible string which fits your pattern
– /ga+t/ matches gat, gaat, gaaat
– „Doh! No doughnuts left!‟ =~ /(d.+t)/
• $1 contains „doughnuts left‟
• If this is not what you wanted to do, use the
„?‟ modifier
– /(d.+?t)/ # match as few „.‟s as you can
and still make the pattern work
38. Substitute function
• s/pattern1/pattern2/;
• Looks kind of like a regular expression
– Patterns constructed the same way
• Inherited from previous languages, so it
can be a bit different.
– Changes the variable it is bound to!
39. Using s
• Substituting one word for another
– $string =~ s/dogs/cats/;
• If $string was “I love dogs”, it is now “I love cats”
• Removing trailing white space
– $string =~ s/s+$//;
• If $string was „ATG „, it is now „ATG‟
• Adding 10 to every number in a string
– $string =~ /(d+)/$1+10/ge;
• If string was “I bought 5 dogs at 2 bucks each”, it is now:
– “I bought 15 dogs at 12 bucks each”
• Note pattern memory!!
• g means global (just like a regex)
• e is special to s, evaluate the expression on the right
41. tr function
• translate or transliterate
• tr/characterlist1/characterlist2/;
• Even less like a regular expression than
s
• substitutes characters in the first list
with characters in the second list
$string =~ tr/a/A/; # changes every „a‟ to an
„A‟
– No need for the g modifier when using tr.
43. Using tr
• Creating complimentary DNA sequence
– $sequence =~ tr/atgc/TACG/;
• Sneaky Perl trick for the day
– tr does two things.
• 1. changes characters in the bound variable
• 2. Counts the number of times it does this
– Super-fast character counter™
• $a_count = $sequence =~ tr/a/a/;
• replaces an „a‟ with an „a‟ (no net change), and
assigns the result (number of substitutions) to
$a_count
44. • Regex-Related Special Variables
• Perl has a host of special variables that get filled after every m// or s///
regex match. $1, $2, $3, etc. hold the backreferences. $+ holds the last
(highest-numbered) backreference. $& (dollar ampersand) holds the
entire regex match.
• @- is an array of match-start indices into the string. $-[0] holds the start of
the entire regex match, $-[1] the start of the first backreference, etc.
Likewise, @+ holds match-end indices (ends, not lengths).
• $' (dollar followed by an apostrophe or single quote) holds the part of the
string after (to the right of) the regex match. $` (dollar backtick) holds the
part of the string before (to the left of) the regex match. Using these
variables is not recommended in scripts when performance matters, as it
causes Perl to slow down all regex matches in your entire script.
• All these variables are read-only, and persist until the next regex match is
attempted. They are dynamically scoped, as if they had an implicit 'local'
at the start of the enclosing scope. Thus if you do a regex match, and call
a sub that does a regex match, when that sub returns, your variables are
still set as they were for the first match.
45. • Finding All Matches In a String
• The "/g" modifier can be used to process all regex
matches in a string. The first m/regex/g will find the
first match, the second m/regex/g the second match,
etc. The location in the string where the next match
attempt will begin is automatically remembered by
Perl, separately for each string. Here is an example:
• while ($string =~ m/regex/g) { print "Found '$&'. Next
attempt at character " . pos($string)+1 . "n"; } The
pos() function retrieves the position where the next
attempt begins. The first character in the string has
position zero. You can modify this position by using
the function as the left side of an assignment, like in
pos($string) = 123;.
47. Oefeningen practicum 2
1. Which of following 4 sequences (seq1/2/3/4)
a) contains a “Galactokinase signature”
http://us.expasy.org/prosite/
b) How many of them?
c) Where (hints:pos and $&) ?
50. Three Basic Data Types
• Scalars - $
• Arrays of scalars - @
• Associative arrays of scalers or
Hashes - %
51. Arrays
Definitions
• A scalar variable contains a scalar value: one number or one
string. A string might contain many words, but Perl regards it as
one unit.
• An array variable contains a list of scalar data: a list of numbers
or a list of strings or a mixed list of numbers and strings. The
order of elements in the list matters.
Syntax
• Array variable names start with an @ sign.
• You may use in the same program a variable named $var and
another variable named @var, and they will mean two different,
unrelated things.
Example
• Assume we have a list of numbers which were obtained as a
result of some measurement. We can store this list in an array
variable as the following:
• @msr = (3, 2, 5, 9, 7, 13, 16);
52. The foreach construct
The foreach construct iterates over a list of scalar
values (e.g. that are contained in an array) and
executes a block of code for each of the values.
• Example:
– foreach $i (@some_array) {
– statement_1;
– statement_2;
– statement_3; }
– Each element in @some_array is aliased to the variable $i in
turn, and the block of code inside the curly brackets {} is
executed once for each element.
• The variable $i (or give it any other name you wish) is
local to the foreach loop and regains its former value
upon exiting of the loop.
• Remark $_
53. Examples for using the foreach construct - cont.
• Calculate sum of all array elements:
#!/usr/local/bin/perl
@msr = (3, 2, 5, 9, 7, 13, 16);
$sum = 0;
foreach $i (@msr) {
$sum += $i; }
print "sum is: $sumn";
54. Accessing individual array elements
Individual array elements may be accessed by
indicating their position in the list (their index).
Example:
@msr = (3, 2, 5, 9, 7, 13, 16);
index value 0 3 1 2 2 5 3 9 4 7 5 13 6 16
First element: $msr[0] (here has the value of 3),
Third element: $msr[2] (here has the value of
5),
and so on.
55. The sort function
The sort function receives a list of variables (or an array) and returns the sorted list.
@array2 = sort (@array1);
#!/usr/local/bin/perl
@countries = ("Israel", "Norway", "France", "Argentina");
@sorted_countries = sort ( @countries);
print "ORIG: @countriesn", "SORTED: @sorted_countriesn";
Output:
ORIG: Israel Norway France Argentina
SORTED: Argentina France Israel Norway
#!/usr/local/bin/perl
@numbers = (1 ,2, 4, 16, 18, 32, 64);
@sorted_num = sort (@numbers);
print "ORIG: @numbers n", "SORTED: @sorted_num n";
Output:
ORIG: 1 2 4 16 18 32 64
SORTED: 1 16 18 2 32 4 64
Note that sorting numbers does not happen numerically, but by the string values of
each number.
56. The push and shift functions
The push function adds a variable or a list of variables to the end of a given
array.
Example:
$a = 5;
$b = 7;
@array = ("David", "John", "Gadi");
push (@array, $a, $b);
# @array is now ("David", "John", "Gadi", 5, 7)
The shift function removes the first element of a given array and returns this
element.
Example:
@array = ("David", "John", "Gadi");
$k = shift (@array);
# @array is now ("John", "Gadi"); # $k is now "David"
Note that after both the push and shift operations the given array @array is
changed!
57. How can I know the length of a given array?
You have three options:
• Assing the array variable into a scalar variable, as in
the previous slide. This is not recommended,
because the code is confusing.
• Use the scalar function. Example:
– $x = scalar (@array); # $x now contains the number of
elements in @array.
• Use the special variable $#array_name to get the
index value of the last element of @array_name.
Example:
– @fruits = ("apple", "orange", "banana", "melon");
– $a = $#fruits;
– # $a is now 3;
– $b = $#fruits + 1;
– # $b is now 4, i.e. # the no. of elements in @fruits.
58. Special array for command line arguments @ARGV
#!/usr/bin/perl -w
# print out user-entered command line
arguments
foreach $arg (@ARGV) {
# print each argument followed by <tab>
print $arg . "t";
}
# print hard return
print "n";
59. Perl Array review
• An array is designated with the ‘@’ sign
• An array is a list of individual elements
• Arrays are ordered
– Your list stays in the same order that you created it, although
you can add or subtract elements to the front or back of the
list
• You access array elements by number, using the
special syntax:
– $array[1] returns the „1th‟ element of the array (remember
perl starts counting at zero)
• You can do anything with an array element that
you can do with a scalar variable (addition,
subtraction, printing … whatever)
61. Text Processing Functions
The split function
• The split function splits a string to a list of substrings according to the
positions of a given delimiter. The delimiter is written as a pattern
enclosed by slashes: /PATTERN/. Examples:
• $string = "programming::course::for::bioinformatics";
• @list = split (/::/, $string);
• # @list is now ("programming", "course", "for", "bioinformatics") # $string
remains unchanged.
• $string = "protein kinase Ct450 Kilodaltonst120 Kilobases";
• @list = split (/t/, $string); #t indicates tab #
• @list is now ("protein kinase C", "450 Kilodaltons", "120 Kilobases")
62. Text Processing Functions
The join function
• The join function does the opposite of split. It receives a delimiter and a
list of strings, and joins the strings into a single string, such that they are
separated by the delimiter.
• Note that the delimiter is written inside quotes.
• Examples:
• @list = ("programming", "course", "for", "bioinformatics");
• $string = join ("::", @list);
• # $string is now "programming::course::for::bioinformatics"
• $name = "protein kinase C"; $mol_weight = "450 Kilodaltons";
$seq_length = "120 Kilobases";
• $string = join ("t", $name, $mol_weight, $seq_length);
• # $string is now: # "protein kinase Ct450 Kilodaltonst120 Kilobases"
63. Three Basic Data Types
• Scalars - $
• Arrays of scalars - @
• Associative arrays of scalers or
Hashes - %
64. When is an array not good enough?
• Sometimes you want to associate a given
value with another value. (name/value pairs)
(Rob => 353-7236, Matt => 353-7122,
Joe_anonymous => 555-1212)
(Acc#1 => sequence1, Acc#2 => sequence2, Acc#n
=> sequence-n)
• You could put this information into an array,
but it would be difficult to keep your names
and values together (what happens when you
sort? Yuck)
65. Problem solved: The associative array
• As the name suggests, an associative array
allows you to link a name with a value
• In perl-speak: associative array = hash
– „hash‟ is the preferred term, for various arcane reasons,
including that it is easier to say.
• Consider an array: The elements (values) are
each associated with a name – the index position.
These index positions are numerical, sequential,
and start at zero.
• A hash is similar to an array, but we get to name
the index positions anything we want
66. The „structure‟ of a Hash
• An array looks something like this:
0 1 2 Index
@array =
'val1' 'val2' 'val3' Value
67. The „structure‟ of a Hash
• An array looks something like this:
0 1 2 Index
@array =
'val1' 'val2' 'val3' Value
• A hash looks something like this:
Rob Matt Joe_A Key (name)
%phone =
353-7236 353-7122 555-1212 Value
68. Hash Rules:
• Names have the same rules as any other
variables (no spaces, etc.)
• A hash is preceded by a „%‟ sign
$value => scalar variable
@array => array variable
%hash => hash variable
• A hash key can be any string
• Hash keys are unique!!
– You may not have two keys in a hash with the
same name.
– You may not have two keys in a hash with the
same name. Ever. Really. I mean it this time.
69. Creating a hash
• There are several methods for creating a hash.
The most simple way – assign a list to a hash.
– %hash = („rob‟, 56, „joe‟, 17, „jeff‟, „green‟);
• Perl is smart enough to know that since you
are assigning a list to a hash, you meant to
alternate keys and values.
– %hash = („rob‟ => 56 , „joe‟ => 17, „jeff‟ => „green‟);
• The arrow („=>‟) notation helps some people,
and clarifies which keys go with which values.
The perl interpreter sees „=>‟ as a comma.
70. Getting at values
• You should expect by now that there is some
way to get at a value, given a key.
• You access a hash key like this:
– $hash{„key‟}
• This should look somewhat familiar
– $array[21] : refer to a value associated with a
specific index position in an array
– $hash{key} : refer to a value associated with a
specific key in a hash
71. Getting at values, continued
• Magic incantation: Given a hash
%somehash, you access the value in a
specific key by this notation:
$somehash{some_key}
• Memorize this incantation!!!
– If it helps, remember that you are getting a single
element out of the hash, hence the $ notation. To
tell perl it is a hash, you use curly braces „{}‟.
72. A phone book program
#!/usr/bin/perl –w
use strict;
my %phonenumbers = („Rob‟ => '353-7236',
„Matt‟ => '353-7122',
„Dave‟ => '353-5284',
„Jeff‟ => 'unlisted - go away');
print "Please enter a name:n";
my $name = <STDIN>;
chomp $name;
print "${name}'s phone number is $phonenumbers{$name}n";
# note ${name} is a way to set off the variable name from any other
text
# $name's may have been interpreted as the variable $name's
73. Count unique things with a hash
• Remember, keys must be unique. So,
while ($thing = <>){
chomp;
$hash{$thing}++
}
– $hash{key} is equal to „‟ the first time you
see an item – add one to it
– $hash{key} is equal to 1 the next time you
see the same thing, add one to it.
– And so on...
74. Printing a Hash
• Of course there is a way to print a hash.
It isn‟t as easy as printing an array:
– print @array; or print “@array”;
– There is no equivalent print %hash;
• We must visit each key and print its
associated value. Sounds like a job for
a loop...
75. Printing a hash (continued)
• First, create a list of keys. Fortunately, there is
a function for that:
– keys %hash (returns a list of keys)
• Next, visit each key and print its associated
value:
foreach (keys %hash){
print “The key $_ has the value $hash{$_}n”;
}
• One complication. Hashes do not maintain any
sort of order. In other words, if you put
key/value pairs into a hash in a particular
order, you will not get them out in that order!!
76. Programming in general and Perl in particular
• There is more than one right way to do it. Unfortunately, there are also
many wrong ways.
– 1. Always check and make sure the output is correct and logical
• Consider what errors might occur, and take steps to ensure that you are
accounting for them.
– 2. Check to make sure you are using every variable you declare.
• Use Strict !
– 3. Always go back to a script once it is working and see if you can
eliminate unnecessary steps.
• Concise code is good code.
• You will learn more if you optimize your code.
• Concise does not mean comment free. Please use as many comments as
you think are necessary.
• Sometimes you want to leave easy to understand code in, rather than
short but difficult to understand tricks. Use your judgment.
• Remember that in the future, you may wish to use or alter the code you
wrote today. If you don‟t understand it today, you won‟t tomorrow.
77. Programming in general and Perl in particular
Develop your program in stages. Once part of it works, save the
working version to another file (or use a source code control
system like RCS) before continuing to improve it.
When running interactively, show the user signs of activity.
There is no need to dump everything to the screen (unless
requested to), but a few words or a number change every few
minutes will show that your program is doing something.
Comment your script. Any information on what it is doing or why
might be useful to you a few months later.
Decide on a coding convention and stick to it. For example,
– for variable names, begin globals with a capital letter and privates
(my) with a lower case letter
– indent new control structures with (say) 2 spaces
– line up closing braces, as in: if (....) { ... ... }
– Add blank lines between sections to improve readibility
81. 3. Palindromes
What is the longest palindroom in palin.fasta ?
Why are restriction sites palindromic ?
How long is the longest palindroom in the genome ?
Hints:
http://www.man.poznan.pl/cmst/papers/5/art_2/vol5a
rt2.html
Palingram.pl