The document contains information about regular expressions (regexes), including their basic structure, common features like character types and anchors, and more advanced concepts like groups, capturing groups, backreferences, and quantifiers. It provides examples to demonstrate how different regex patterns can be used to match text. The document is intended as a reference for understanding regex syntax and utilizing various regex constructs.
grep is a command line utility that searches for patterns in files. It searches the given file(s) for lines containing a match to the given strings or regular expressions and outputs the matching lines. By default, grep prints the matching lines. The name "grep" derives from a similar operation using the Unix/Linux text editor ed. grep uses regular expressions to search for text patterns that can match multiple characters and patterns in text. It has many options to customize searches, such as ignoring case, counting matches, inverting matches, and more.
/Regex makes me want to (weep|give up|(╯°□°)╯︵ ┻━┻)\.?/ibrettflorio
REGEX! Love it or hate it, sometimes you actually need it. And when that time comes, there's no reason to be afraid or to ask help from that one weirdo on your team who actually loves regular expressions. (I'm that weirdo, fwiw.)
This session is geared towards beginning and intermediate regex users, as well as experienced programmers and developers who just don't really grok regex. We'll cover the following topics using practical examples that you might encounter in your own projects. (ie. No matching against "dog" and "cat".)
* What is regex? How's it work? A brief history.
* Syntax, special characters, character classes
* Grouping, capturing, and common gotchas
* Use cases for matching, validating, and replacing
* More advanced topics like backreferences and lookarounds
Tutorial on Regular Expression in Perl (perldoc Perlretut)FrescatiStory
This document provides a tutorial on regular expressions (regexps) in Perl. It begins by explaining that regexps describe patterns that can be used to search and extract parts of strings. The tutorial then covers basic regexp concepts like word matching, character classes that match multiple characters, and anchor characters "^" and "$" to match starts or ends of strings. It provides many examples of using simple regexps to search strings and emulates the Unix grep command as an example Perl program.
This document provides a summary of key Ruby concepts including variables, conditional statements, functions, objects, arrays, hashes, and iteration. It explains that variables are created through assignment and can contain letters, numbers, or underscores. Conditional statements like if/else use equality checks. Functions are defined with def and can take parameters and return values. The core data types are objects, arrays, and hashes. Arrays use indexes to access elements while hashes use keys to map to values. Iteration allows processing each element with blocks.
This document provides a quick reference for the Ruby programming language, covering topics like general syntax rules, data types, variables, operators, control expressions, classes and modules, exceptions, and the standard library. It includes short descriptions and syntax examples for core Ruby concepts like strings, arrays, hashes, regular expressions, methods, and blocks in 3 sentences or less.
Regular expressions (regex) are a language used to parse text and apply logic and constraints to find patterns in strings. They provide a concise way to find matches in text that is supported across many programming languages. This document provides an overview of regex, examples of code using regex in different languages, and descriptions of common regex patterns and metacharacters used to define matching rules. It recommends resources for further reading on mastering regular expressions.
The document provides an introduction to Perl programming and regular expressions. It begins with simple Perl programs to print text and take user input. It then covers executing external commands, variables, operators, loops, and file operations. The document also introduces regular expressions, explaining patterns, anchors, character classes, alternation, grouping, and repetition quantifiers. It provides examples and discusses principles for matching strings with regular expressions.
grep is a command line utility that searches for patterns in files. It searches the given file(s) for lines containing a match to the given strings or regular expressions and outputs the matching lines. By default, grep prints the matching lines. The name "grep" derives from a similar operation using the Unix/Linux text editor ed. grep uses regular expressions to search for text patterns that can match multiple characters and patterns in text. It has many options to customize searches, such as ignoring case, counting matches, inverting matches, and more.
/Regex makes me want to (weep|give up|(╯°□°)╯︵ ┻━┻)\.?/ibrettflorio
REGEX! Love it or hate it, sometimes you actually need it. And when that time comes, there's no reason to be afraid or to ask help from that one weirdo on your team who actually loves regular expressions. (I'm that weirdo, fwiw.)
This session is geared towards beginning and intermediate regex users, as well as experienced programmers and developers who just don't really grok regex. We'll cover the following topics using practical examples that you might encounter in your own projects. (ie. No matching against "dog" and "cat".)
* What is regex? How's it work? A brief history.
* Syntax, special characters, character classes
* Grouping, capturing, and common gotchas
* Use cases for matching, validating, and replacing
* More advanced topics like backreferences and lookarounds
Tutorial on Regular Expression in Perl (perldoc Perlretut)FrescatiStory
This document provides a tutorial on regular expressions (regexps) in Perl. It begins by explaining that regexps describe patterns that can be used to search and extract parts of strings. The tutorial then covers basic regexp concepts like word matching, character classes that match multiple characters, and anchor characters "^" and "$" to match starts or ends of strings. It provides many examples of using simple regexps to search strings and emulates the Unix grep command as an example Perl program.
This document provides a summary of key Ruby concepts including variables, conditional statements, functions, objects, arrays, hashes, and iteration. It explains that variables are created through assignment and can contain letters, numbers, or underscores. Conditional statements like if/else use equality checks. Functions are defined with def and can take parameters and return values. The core data types are objects, arrays, and hashes. Arrays use indexes to access elements while hashes use keys to map to values. Iteration allows processing each element with blocks.
This document provides a quick reference for the Ruby programming language, covering topics like general syntax rules, data types, variables, operators, control expressions, classes and modules, exceptions, and the standard library. It includes short descriptions and syntax examples for core Ruby concepts like strings, arrays, hashes, regular expressions, methods, and blocks in 3 sentences or less.
Regular expressions (regex) are a language used to parse text and apply logic and constraints to find patterns in strings. They provide a concise way to find matches in text that is supported across many programming languages. This document provides an overview of regex, examples of code using regex in different languages, and descriptions of common regex patterns and metacharacters used to define matching rules. It recommends resources for further reading on mastering regular expressions.
The document provides an introduction to Perl programming and regular expressions. It begins with simple Perl programs to print text and take user input. It then covers executing external commands, variables, operators, loops, and file operations. The document also introduces regular expressions, explaining patterns, anchors, character classes, alternation, grouping, and repetition quantifiers. It provides examples and discusses principles for matching strings with regular expressions.
This document provides an overview of string matching and regular expression functions in Perl. It defines functions like split and join for manipulating strings. It explains different ways of specifying regular expressions using delimiters and modifiers. Special variables like $', $& and $` that capture matched strings are also described. Examples are given to illustrate string substitution operators and capturing matched text using parentheses in regular expressions.
This document provides an introduction to regular expressions (regexes). It explains that regexes describe patterns of text that can be used to search for and replace text. It covers basic regex syntax like literals, wildcards, anchors, quantifiers, character sets, flags, backreferences, and the RegExp object. It also discusses using regexes in JavaScript string methods, text editors, and command line tools.
This document provides an introduction to using regular expressions (REs) in the Textpad text editor. It begins with an overview of why REs are useful for searching text, then demonstrates how to write basic REs to find single characters, character ranges, word endings, and more. It also explains how to use RE features like groups, replacements, counting characters, and searching files or folders. The goal is to help uninitiated users learn the basics of writing and using REs in Textpad.
This document discusses regular expressions and pattern matching in Perl. It explains that regular expressions allow pattern matching in Perl using operators like m//. It provides examples of matching patterns using metacharacters like ., quantifiers like *, grouping using (), alternation with |, and anchors like ^ and $. The document also covers character classes, substitution using s///, and more advanced regular expression topics in Perl.
Perl is an interpreted, general-purpose programming language originally developed for text manipulation and now used widely for a variety of tasks including system administration, web development, and more. It has a small number of basic data types (scalars, arrays, hashes) and supports both procedural and object-oriented programming. Key elements of Perl include its C-style syntax, dynamic typing, and emphasis on practical solutions over purity.
The document provides information about a bioinformatics practicum including:
1. An agenda for the practicum covering regex, arrays/hashes, and programming in Perl.
2. An introduction to regular expressions including what they are, why they are used, and basic regex atoms.
3. Examples of using regular expressions in Perl including pattern matching, memory parentheses, finding all matches, and substitution functions.
This document introduces regular expressions (regex), which are patterns used to match character combinations in strings. Regex can be used for text search/replace, validation, and other string manipulation tasks. The basics covered include matching characters, character classes, quantifiers, grouping, alternation, anchors, and capturing subgroups for replacement. Examples demonstrate matching names, dates, URLs, and other patterns.
The document discusses topics related to practicing bioinformatics including:
- Installing and working with the TextPad text editor
- Regular expressions (regex), including patterns, quantifiers, anchors, grouping, alternation, and variable interpolation
- Using regex memory variables ($1, $2, etc.) to extract matched substrings
- The s/// substitution operator and tr/// translation operator
- Applying these skills to tasks like finding restriction enzyme cut sites in DNA sequences
This document provides an overview of the Perl programming language. It covers what Perl is, how to create and run Perl scripts, scalar and array variables, hashes, control structures like if/else and loops, file operations, and common Perl functions like split and join. Advanced Perl concepts like subroutines, regular expressions, and object-oriented programming are also mentioned. Resources for learning more about Perl like documentation, books, and mailing lists are provided at the end.
Regular expressions (regex) are text strings that describe search patterns to match or find other strings or sets of strings. They are useful for tasks like syntax highlighting, find and replace, text searching, and programming. The document discusses various regex constructs like single/multiple characters, word groups, quantifiers, grouping, greedy/lazy matching, lookahead/behind, and provides examples of using regex in C# and Linux tools. It notes pros of regex like flexibility and expressing complex patterns concisely, but also cons like difficulty of reading and debugging complex patterns.
This document summarizes the key topics that will be covered in an introduction to Perl programming course on day 2, including types of variables, references, sorting, and object orientation. The schedule outlines times for lectures, breaks and lunch. Resources provided include slides, slideshare, and an online community.
The document discusses string manipulation and regular expressions. It provides explanations of regular expression syntax including brackets, quantifiers, predefined character ranges, and flags. It also summarizes PHP functions for regular expressions like ereg(), eregi(), ereg_replace(), split(), and sql_regcase(). Practical examples of using these functions are shown.
Regular expressions are patterns used to match strings in programs like sed, awk, grep, and others. They descend from finite automata theory and are used for tasks like searching text files, lexical analysis of programming languages, and early web search engines. In Linux systems, regular expressions are used by commands like sed, grep, and vi. Sed can perform editing operations on lines that match regular expression addresses.
This document provides an overview and introduction to using regular expressions (regex) in Perl. It discusses the basic building blocks of regex patterns including characters, character classes, quantifiers, anchors, grouping, alternation, and interpolation. It explains how to use the binding operator (=~) to apply a regex to a variable. It also covers retrieving matched substrings using pattern memory ($1, $2 etc.) and finding all matches using the 'g' modifier. The document demonstrates substituting text using the s/// function and provides an example of the tr/// function.
This document provides an overview of Ruby for Java developers, covering the history and culture of both languages, their technical backgrounds, key differences in their languages and frameworks, and how Ruby on Rails works. It demonstrates Ruby concepts through examples and concludes with a discussion on performance and common use cases for each language.
Become a deft manipulator of text data. Regular Expression is the miracle of text extraction. If you got a text patten in mind, you can write your own pattern match in regular expression.
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
This document discusses regular expressions and provides examples of using regular expression functions in SQL. It begins with an introduction to regular expressions and meta characters. It then demonstrates the REGEXP_LIKE, REGEXP_REPLACE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_COUNT functions with examples. The document is presented by Danny Bryant, an IT manager at the City of Atlanta.
PT.BUZOO INDONESIA is No1 Japanese offshore development company in Indonesia.
We are professional of web solution and smartphone apps. We can support Japanese, English and Indonesia.
We are hiring now at http://buzoo.co.id/
Perl is a general-purpose programming language created by Larry Wall in 1987. It supports both procedural and object-oriented programming. Perl is useful for tasks like web development, system administration, text processing and more due to its powerful built-in support for text processing and large collection of third-party modules. Basic Perl syntax includes variables starting with $, @, and % for scalars, arrays, and hashes respectively. Conditional and looping constructs like if/else, while, and for are also supported.
The document provides guidelines for Ruby coding conventions including file names, file organization, comments, definitions, statements, white space, naming conventions, and pragmatic programming practices. It recommends lower case class/module names with '.rb' suffixes for files, using require and include at the top of files, and organizing files with a beginning comment, classes/modules, main, and optional testing code. It provides examples of different types of comments and statements.
This document provides an overview of key concepts in probability theory, including:
1) Empirical and theoretical probabilities, which are determined through experimentation/observation and mathematical calculations respectively.
2) Compound events and the addition rule, where the probability of two events occurring is the sum of their individual probabilities minus the probability of both occurring.
3) The multiplication rule for independent events, where the probability of two independent events occurring is the product of their individual probabilities.
4) Dependent events and Bayes' theorem, where the probability of one event is affected by another occurring.
This document provides an overview of string matching and regular expression functions in Perl. It defines functions like split and join for manipulating strings. It explains different ways of specifying regular expressions using delimiters and modifiers. Special variables like $', $& and $` that capture matched strings are also described. Examples are given to illustrate string substitution operators and capturing matched text using parentheses in regular expressions.
This document provides an introduction to regular expressions (regexes). It explains that regexes describe patterns of text that can be used to search for and replace text. It covers basic regex syntax like literals, wildcards, anchors, quantifiers, character sets, flags, backreferences, and the RegExp object. It also discusses using regexes in JavaScript string methods, text editors, and command line tools.
This document provides an introduction to using regular expressions (REs) in the Textpad text editor. It begins with an overview of why REs are useful for searching text, then demonstrates how to write basic REs to find single characters, character ranges, word endings, and more. It also explains how to use RE features like groups, replacements, counting characters, and searching files or folders. The goal is to help uninitiated users learn the basics of writing and using REs in Textpad.
This document discusses regular expressions and pattern matching in Perl. It explains that regular expressions allow pattern matching in Perl using operators like m//. It provides examples of matching patterns using metacharacters like ., quantifiers like *, grouping using (), alternation with |, and anchors like ^ and $. The document also covers character classes, substitution using s///, and more advanced regular expression topics in Perl.
Perl is an interpreted, general-purpose programming language originally developed for text manipulation and now used widely for a variety of tasks including system administration, web development, and more. It has a small number of basic data types (scalars, arrays, hashes) and supports both procedural and object-oriented programming. Key elements of Perl include its C-style syntax, dynamic typing, and emphasis on practical solutions over purity.
The document provides information about a bioinformatics practicum including:
1. An agenda for the practicum covering regex, arrays/hashes, and programming in Perl.
2. An introduction to regular expressions including what they are, why they are used, and basic regex atoms.
3. Examples of using regular expressions in Perl including pattern matching, memory parentheses, finding all matches, and substitution functions.
This document introduces regular expressions (regex), which are patterns used to match character combinations in strings. Regex can be used for text search/replace, validation, and other string manipulation tasks. The basics covered include matching characters, character classes, quantifiers, grouping, alternation, anchors, and capturing subgroups for replacement. Examples demonstrate matching names, dates, URLs, and other patterns.
The document discusses topics related to practicing bioinformatics including:
- Installing and working with the TextPad text editor
- Regular expressions (regex), including patterns, quantifiers, anchors, grouping, alternation, and variable interpolation
- Using regex memory variables ($1, $2, etc.) to extract matched substrings
- The s/// substitution operator and tr/// translation operator
- Applying these skills to tasks like finding restriction enzyme cut sites in DNA sequences
This document provides an overview of the Perl programming language. It covers what Perl is, how to create and run Perl scripts, scalar and array variables, hashes, control structures like if/else and loops, file operations, and common Perl functions like split and join. Advanced Perl concepts like subroutines, regular expressions, and object-oriented programming are also mentioned. Resources for learning more about Perl like documentation, books, and mailing lists are provided at the end.
Regular expressions (regex) are text strings that describe search patterns to match or find other strings or sets of strings. They are useful for tasks like syntax highlighting, find and replace, text searching, and programming. The document discusses various regex constructs like single/multiple characters, word groups, quantifiers, grouping, greedy/lazy matching, lookahead/behind, and provides examples of using regex in C# and Linux tools. It notes pros of regex like flexibility and expressing complex patterns concisely, but also cons like difficulty of reading and debugging complex patterns.
This document summarizes the key topics that will be covered in an introduction to Perl programming course on day 2, including types of variables, references, sorting, and object orientation. The schedule outlines times for lectures, breaks and lunch. Resources provided include slides, slideshare, and an online community.
The document discusses string manipulation and regular expressions. It provides explanations of regular expression syntax including brackets, quantifiers, predefined character ranges, and flags. It also summarizes PHP functions for regular expressions like ereg(), eregi(), ereg_replace(), split(), and sql_regcase(). Practical examples of using these functions are shown.
Regular expressions are patterns used to match strings in programs like sed, awk, grep, and others. They descend from finite automata theory and are used for tasks like searching text files, lexical analysis of programming languages, and early web search engines. In Linux systems, regular expressions are used by commands like sed, grep, and vi. Sed can perform editing operations on lines that match regular expression addresses.
This document provides an overview and introduction to using regular expressions (regex) in Perl. It discusses the basic building blocks of regex patterns including characters, character classes, quantifiers, anchors, grouping, alternation, and interpolation. It explains how to use the binding operator (=~) to apply a regex to a variable. It also covers retrieving matched substrings using pattern memory ($1, $2 etc.) and finding all matches using the 'g' modifier. The document demonstrates substituting text using the s/// function and provides an example of the tr/// function.
This document provides an overview of Ruby for Java developers, covering the history and culture of both languages, their technical backgrounds, key differences in their languages and frameworks, and how Ruby on Rails works. It demonstrates Ruby concepts through examples and concludes with a discussion on performance and common use cases for each language.
Become a deft manipulator of text data. Regular Expression is the miracle of text extraction. If you got a text patten in mind, you can write your own pattern match in regular expression.
Regular Expressions 101 Introduction to Regular ExpressionsDanny Bryant
This document discusses regular expressions and provides examples of using regular expression functions in SQL. It begins with an introduction to regular expressions and meta characters. It then demonstrates the REGEXP_LIKE, REGEXP_REPLACE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_COUNT functions with examples. The document is presented by Danny Bryant, an IT manager at the City of Atlanta.
PT.BUZOO INDONESIA is No1 Japanese offshore development company in Indonesia.
We are professional of web solution and smartphone apps. We can support Japanese, English and Indonesia.
We are hiring now at http://buzoo.co.id/
Perl is a general-purpose programming language created by Larry Wall in 1987. It supports both procedural and object-oriented programming. Perl is useful for tasks like web development, system administration, text processing and more due to its powerful built-in support for text processing and large collection of third-party modules. Basic Perl syntax includes variables starting with $, @, and % for scalars, arrays, and hashes respectively. Conditional and looping constructs like if/else, while, and for are also supported.
The document provides guidelines for Ruby coding conventions including file names, file organization, comments, definitions, statements, white space, naming conventions, and pragmatic programming practices. It recommends lower case class/module names with '.rb' suffixes for files, using require and include at the top of files, and organizing files with a beginning comment, classes/modules, main, and optional testing code. It provides examples of different types of comments and statements.
This document provides an overview of key concepts in probability theory, including:
1) Empirical and theoretical probabilities, which are determined through experimentation/observation and mathematical calculations respectively.
2) Compound events and the addition rule, where the probability of two events occurring is the sum of their individual probabilities minus the probability of both occurring.
3) The multiplication rule for independent events, where the probability of two independent events occurring is the product of their individual probabilities.
4) Dependent events and Bayes' theorem, where the probability of one event is affected by another occurring.
Miles is a software company that provides technological solutions for asset and wealth management. It was founded in 1999 and has over a decade of experience in this domain. It has various product lines for portfolio management, securities lending, custody management and fund management. It caters to clients across Asia, Middle East and Europe with over 8 out of 10 portfolio management systems in India licensed by Miles.
This document provides an overview of data warehousing concepts including definitions of data warehousing, the components of a data warehouse architecture, characteristics of data, and the process of data modeling. It describes what a data warehouse is and some key elements like the data sources, data integration, business intelligence tools, and different types of databases. It also discusses data attributes, metadata, and the three levels of data modeling.
This document discusses elimination reactions, specifically E1 and E2 reactions. It explains that E1 reactions proceed through a carbocation intermediate and involve a two-step mechanism, while E2 reactions are concerted and involve both the alkyl halide and base in a single step. It also describes factors that influence the reactivity and selectivity of elimination reactions, such as substrate structure, the nature of the leaving group and base, and conformational effects.
Miles Software provides wealth management software solutions for financial institutions. It has over 250 customers globally and offers an integrated product suite including solutions for wealth management, portfolio management, custody management, and fund management. Miles Software was founded in 1999 and is headquartered in Mumbai, India with additional offices in the UAE.
MoneyWare FundWare is a comprehensive investment compliance monitoring solution that seamlessly integrates and automates the asset management lifecycle. It supports multiple currencies and asset classes, provides flexible reporting, and comprehensive audit trails. The solution monitors regulatory compliance and risk management for various fund types including conventional, Islamic, private equity, hedge, and UCITS funds.
Trusted Techno Business Partner Delivering Excellence and Value to Financial Services Firms Globally
Miles Software is a leading provider of wealth and asset management software solutions, with proven market leadership in India. They have over 250 customers globally across Asia Pacific and EMEA regions. Miles offers the MoneyWare suite of products that cater to dynamic needs of wealth and portfolio management for global capital markets.
Trusted Techno Business Partner Delivering Excellence and Value to Financial Services Firms Globally provides proven wealth and asset management solutions for global markets from a market leader. It has over 250 customers across Asia Pacific and EMEA regions, including private banks, asset management companies, and brokerage firms. The company focuses on continuous innovation in wealth and asset management solutions to meet the dynamic needs of global capital markets.
Quando o objetivo é vender, uma modificação mínima pode impactar a taxa de conversão final. Para que um e-commerce atinja sua performance máxima, é necessário fazer com que os componentes, a equipe e mínimos detalhes funcionem em perfeita harmonia -- e o front-end é um deles.
Nesta palestra, mostrei os desafios enfrentados pelo time de engenharia de Front-end da Baby.com.br: como trabalhar com uma equipe com vários desenvolvedores, gerando componentes auto-contidos, testáveis e escaláveis, mantendo a melhor performance possível, sem perder o padrão de qualidade.
Fonte das métricas: http://blog.bizelo.com/blog/2012/10/18/infographic-shopping-cart-abandonment-rates/
This document discusses the key principles and processes involved in drug discovery and drug-receptor interactions. It outlines the steps of choosing a disease target, identifying a bioassay to test potential drug candidates, finding and isolating lead compounds, determining a drug's structure and effects, and identifying forces that cause drug-receptor binding such as covalent bonding, hydrogen bonding, and hydrophobic interactions. The goal is to discover and develop safe and effective therapeutic drugs through a scientific process.
This document discusses elimination reactions, specifically E1 and E2 reactions. It explains that E1 reactions proceed through a carbocation intermediate and involve a two-step mechanism, while E2 reactions are concerted and involve both the alkyl halide and base in a single step. It also describes factors that influence the reactivity and selectivity of elimination reactions, such as substrate structure, the nature of the leaving group and base, and conformational effects.
What are the challenges we face while developing the front-end for the largest accommodations reservations website in the world?
Working on an e-commerce interface is already a complex task itself; how do we make it work in 224 countries, for customers all around the world? In this presentation, we'll see how our architecture, performance and UI decisions impact the experience of millions of partners and users who book a room with us.
This document summarizes Parkinson's disease and its treatment options. Parkinson's is caused by low dopamine and high acetylcholine levels in the brain, which leads to symptoms like tremors, rigidity, and slowed movement. The main treatments are drugs that increase dopamine like levodopa and dopamine agonists, or decrease acetylcholine like anticholinergics. Levodopa works by crossing the blood brain barrier and being converted to dopamine by dopamine decarboxylase. It is often given with carbidopa to prevent this conversion in other tissues and increase brain levodopa levels. Side effects of these drugs can include increased heart rate, stimulation of the central nervous system, and gastrointestinal issues.
Regular expressions (regex) allow complex pattern matching in text. The document discusses regex basics like literals, character classes, quantifiers, and flags in Python. It explains how to use the re module to compile patterns into RegexObjects and search/match strings. RegexObjects provide reusable patterns while re module functions provide shortcuts but cache compiled patterns.
This document discusses regular expressions and provides examples. It introduces regular expressions and their uses for validating email addresses. It then outlines the main types of regular expressions, including simple, POSIX basic, POSIX extended, and Perl regular expressions. POSIX basic regular expressions are described as creating a standard for Unix tools.
This document provides an introduction to regular expressions (RegEx). It explains that RegEx allows you to find, match, compare or replace text patterns. It then discusses the basic building blocks of RegEx, including characters, character classes, quantifiers, and assertions. It provides several examples of RegEx patterns to match names, words, ports numbers, and other patterns. It concludes with an overview of common RegEx match types like beginning/end of line, word boundaries, grouping, alternatives, and repetition.
This document provides an introduction and overview of regular expressions (regex). It begins with an introduction to regex and how they are used. It then outlines the table of contents which covers literal characters, character classes, anchors, boundaries, alternation, optional and repetitive items, and grouping. The document discusses topics like literal characters, special characters, character classes, shorthand classes, negated classes, metacharacters inside classes, dot matching, and start/end of string anchors. It provides examples and explanations of how regex engines work to match patterns.
The document provides a quick reference guide and cheat sheet for basic regular expressions (regex). It includes tables that list common regex syntax elements, with examples and explanations. The tables are meant to serve as an accelerated course for regex beginners to have as a reference. They cover basic characters, quantifiers, character classes, anchors and other syntax. The full document encourages printing the tables for easy reference when working with regex.
Do you have data and lists you keep having to massage to make it useful for your project? Have you heard of regular expressions but been frightened by the Klingon-looking examples? Fear no longer!
I’ll demystify regular expressions and show you how best to do them in PHP. We’ll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I’ll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
Regular Expressions: JavaScript And BeyondMax Shirshin
Regular Expressions is a powerful tool for text and data processing. What kind of support do browsers provide for that? What are those little misconceptions that prevent people from using RE effectively?
The talk gives an overview of the regular expression syntax and typical usage examples.
Regular expressions (regexes) allow complex pattern matching in text. They are used to find, extract, or replace substrings. Key regex concepts covered in the document include:
- Common regex syntax like literals, character classes, quantifiers, greedy/lazy matching, anchors, word boundaries and subpatterns.
- How a regex engine works by trying to match a pattern to a subject string and returning the earliest match.
- Uses of regex like searching text, extracting/removing/replacing strings, and validating formats.
- Terminology like regex, subject string, match, and engine.
Can someone please fix my code for a hashtable frequencey counter I.pdfanjandavid
Can someone please fix my code for a hashtable frequencey counter: Implement a Java program
that counts word frequencies in a text file. Use a hashtable to store the data the words are the
keys, and their frequencies are the values. The output of the program is the complete list of all
the words and their frequencies in descending order of frequencies when two words have the
same frequency, output them by alphabetical order. Each line of output consists of a word, a tab,
and the frequency. A sample input is \"The Tragedy of Hamlet, Prince of Denmark\", in the file
Hamlet When processing the text, you should keep just the words, and discard all punctuation
marks, digits, and so on. You also need to turn all upper case letters to lower cases.
Solution
Your code is fine except for apostrophe was also being replacced by space because you were
replaceing everything other than alphabets with space. I put apostrophe in exclusion list and then
processed that seperately. Check the code below
import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;
import java.util.ArrayList;
/**
* Created by abdul on 2/6/2017.
*/
public class WordCounter {
public static String finalWord(String str) {
String processedWord = str.replaceAll(\"[^a-zA-Z\']\", \" \").toLowerCase();
processedWord = processedWord.replaceAll(\"[\']\", \"\"); // replace \' with empty string
return processedWord;
}
private Hashtable hash = new Hashtable();
public void fileInput() throws FileNotFoundException {
File text = new File(\"E:/Shake.txt\");
String word;
int count = 1;
Scanner in = new Scanner(text);
ArrayList list = new ArrayList<>();
while (in.hasNextLine()) {
String line = in.nextLine();
line = finalWord(line);
StringTokenizer st = new StringTokenizer(line);
while (st.hasMoreTokens()) {
word = st.nextToken();
if (hash.containsKey(word)) {
hash.put(word, hash.get(word) + 1);
//int count = (Integer) hash.get(word);
//hash.put(word, count + 1);
count++;
} else {
hash.put(word, 1);
count = 1;
}
}
}
Map map = new TreeMap(hash);
//System.out.println(map);
Set set = map.entrySet();
Iterator i = set.iterator();
while(i.hasNext()){
Map.Entry me = (Map.Entry)i.next();
System.out.print(me.getKey() + \" \");
System.out.println(me.getValue());
}
}
public static void main (String[]args) throws FileNotFoundException {
WordCounter abc = new WordCounter();
abc.fileInput();
}
}.
The document provides an overview of regular expressions (regex), including how a regex engine works, common applications and techniques, and components of a regex pattern. It discusses matching, extracting, and replacing text using regex as well as operators, character classes, anchors, repetition, groups, alternation, lookahead/lookbehind, and techniques for optimizing expressions. The document is intended to introduce readers to the basics of text processing with regex.
Don't Fear the Regex - CapitalCamp/GovDays 2014Sandy Smith
Have you been scared off by Klingon-looking one-liners in Perl? Do you resort to writing complicated recursive functions just to parse some HTML? Don't!
I'll demystify regular expressions and show you how best to do them in PHP. We'll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I'll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
Given at CapitalCamp & GovDays 2014
Don't Fear the Regex - Northeast PHP 2015Sandy Smith
Have you been scared off by Klingon-looking one-liners in Perl? Do you resort to writing complicated recursive functions just to parse some HTML? Don’t!
I’ll demystify regular expressions and show you how best to do them in PHP. We’ll cover the syntax and functions that make PHP a great text-parsing language, and give you the foundation to learn more.
As a bonus, I’ll give you two cases people often use as examples for regexes that PHP gives you better native ways to accomplish.
This document provides an overview of JavaScript including:
- It is a dynamic programming language that is interpreted and has object-oriented capabilities. It was introduced in Netscape 2.0 in 1995.
- It allows for increased interactivity on webpages through features like validating user input without page reloads and creating rich interfaces.
- It has various data types, variables, operators, and control structures like conditionals and loops.
- It uses objects and object-oriented programming. Common built-in objects include numbers, strings, arrays, dates and more.
- Functions are reusable blocks of code that can be defined and called. Events allow JavaScript to interact with HTML. Regular expressions provide pattern matching capabilities.
This document provides an introduction to regular expressions (regex) in PHP. It begins with a basic explanation of what regex is for - matching patterns in strings. It then covers the basics of regex syntax, including delimiters, character classes, quantifiers, escaping special characters, and using regex in PHP functions like preg_match() and preg_replace(). It also discusses more advanced topics like character classes, subpatterns, backreferences, modifiers, and when not to use regex. The overall message is that regex is a powerful tool for text manipulation but needs to be used appropriately.
Regular expressions are a powerful tool for searching, matching, and parsing text patterns. They allow complex text patterns to be matched with a standardized syntax. All modern programming languages include regular expression libraries. Regular expressions can be used to search strings, replace parts of strings, split strings, and find all occurrences of a pattern in a string. They are useful for tasks like validating formats, parsing text, and finding/replacing text. This document provides examples of common regular expression patterns and methods for using regular expressions in Python.
The document discusses regular expressions and text processing in Python. It covers various components of regular expressions like literals, escape sequences, character classes, and metacharacters. It also discusses different regular expression methods in Python like match, search, split, sub, findall, finditer, compile and groupdict. The document provides examples of using these regular expression methods to search, find, replace and extract patterns from text.
PHP strings allow storing and manipulating text data. A string is a series of characters that can contain any number of characters limited only by available memory. Strings can be written using single quotes, double quotes, or heredoc syntax. Special characters in strings must be escaped using a backslash. PHP provides many built-in functions for working with strings like concatenation, comparison, searching, replacing, extracting, splitting, joining, formatting and more. Regular expressions provide powerful pattern matching capabilities for strings and PHP has functions like preg_match() for searching strings using regex patterns.
Regular expressions are very commonly used to process and validate text data. Unfortunately, they suffer from various limitations. I will discuss the limitations and illustrate how using grammars instead can be an improvement. I will make the examples oncrete using parsing libraries in a couple of representative languages, although the ideas are language-independent.
I will emphasize ease of use, since one reason for the overuse of regular expressions is that they are so easy to pull out of one's toolbox.
Unit 1-strings,patterns and regular expressionssana mateen
The document provides an introduction to regular expressions (regex). It discusses that regex allow for defining patterns to match strings. It then covers simple regex patterns and operators like character classes, quantifiers, alternations, grouping and anchors. The document also discusses more advanced regex topics such as back references, match operators, substitution operators, and the split operator.
Discover the benefits of outsourcing SEO to Indiadavidjhones387
"Discover the benefits of outsourcing SEO to India! From cost-effective services and expert professionals to round-the-clock work advantages, learn how your business can achieve digital success with Indian SEO solutions.
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfFlorence Consulting
Quattordicesimo Meetup di Milano, tenutosi a Milano il 23 Maggio 2024 dalle ore 17:00 alle ore 18:30 in presenza e da remoto.
Abbiamo parlato di come Axpo Italia S.p.A. ha ridotto il technical debt migrando le proprie APIs da Mule 3.9 a Mule 4.4 passando anche da on-premises a CloudHub 1.0.
Ready to Unlock the Power of Blockchain!Toptal Tech
Imagine a world where data flows freely, yet remains secure. A world where trust is built into the fabric of every transaction. This is the promise of blockchain, a revolutionary technology poised to reshape our digital landscape.
Toptal Tech is at the forefront of this innovation, connecting you with the brightest minds in blockchain development. Together, we can unlock the potential of this transformative technology, building a future of transparency, security, and endless possibilities.
Instagram has become one of the most popular social media platforms, allowing people to share photos, videos, and stories with their followers. Sometimes, though, you might want to view someone's story without them knowing.
Gen Z and the marketplaces - let's translate their needsLaura Szabó
The product workshop focused on exploring the requirements of Generation Z in relation to marketplace dynamics. We delved into their specific needs, examined the specifics in their shopping preferences, and analyzed their preferred methods for accessing information and making purchases within a marketplace. Through the study of real-life cases , we tried to gain valuable insights into enhancing the marketplace experience for Generation Z.
The workshop was held on the DMA Conference in Vienna June 2024.
Gen Z and the marketplaces - let's translate their needs
Regular expressions
1. DZone, Inc. | www.dzone.com
By Mike Keith
#127
Getting Started with JPA 2.0 Get More Refcardz! Visit refcardz.com
CONTENTS INCLUDE:
n Whats New in JPA 2.0
n JDBC Properties
n Access Mode
n Mappings
n Shared Cache
n Additional API and more...
Getting Started with
JPA 2.0
2. #196
DZone, Inc. | www.dzone.com Get More Refcardz! Visit refcardz.com
Regular Expressions
By: Callum Macrae
About Regular Expressions
A regular expression, also known as a regex or regexp, is a way of defining
a search pattern. Think of regexes as wildcards on steroids. Using
wildcards, *.txt matches anything followed by the string “.txt”. But regular
expressions let you e.g.: match specific characters, refer back to previous
matches in the expression, include conditionals within the expression, and
much more.
This Refcard assumes basic familiarity with program structures, but no
prior knowledge of regular expressions. It would be wise to play with as
many of the examples as possible, since mastering regex requires a good
deal of practical experience.
There is no single standard that defines what regular expressions must
and must not do, so this Refcard does not cover every regex version or
‘flavor’. Parts of this Refcard will apply to some flavors (for example, PHP
and Perl), while the same syntax will throw parse errors with other flavors
(for example, JavaScript, which has a surprisingly poor regex engine).
Regular expressions are supported in most programming and scripting
languages, including Java, .NET, Perl, Python, PHP, JavaScript, MySQL,
Ruby, and XML. Languages that do not natively support regular
expressions nearly always have libraries available to add support. If
you prefer one regex flavor to the flavor the language natively supports,
libraries are often available which will add support for your preferred flavor
(albeit often using a slightly different syntax to the native expressions).
The Structure of a Regular Expression
All regular expressions follow the same basic structure: expression plus
flag. In the following example, "regexp?" is the expression, and "mg" is the
flag used.
/regexp?/mg
Certain programming languages may also allow you to specify your
expression as a string. This allows you to manipulate the expression
programmatically – something you are very likely to do, once you get the
hang of writing regexes. For example, the following code will specify the
same regular expression as above in JavaScript:
new RegExp(“regexp?”, “mg”); // returns /regexp?/mg
Basic Features
Normal Characters
A regular expression containing no special characters ($()*+.?[^|) will match
exactly what is contained within the expression. The following expression
will match the "world" in "Hello world!":
“Hello world!”.match(/world/); // “world”
These characters are treated as special characters:
$()*+.?[^|
and need to be escaped using the backslash (“”) character:
“Price: $20”.match(/$20/); // “$20”
(For more on handling special characters, see page 2 below.)
Character Types
Regex includes character types that can be used to match a range of
commonly used characters:
Type Description
d Decimal digit: [0-9]
D Not a decimal digit: [^0-9]
s Whitespace character: [ nrtf]
S Not a whitespace character: [^ nrtf]
w Word character: [a-zA-Z0-9_]
W Not a word character: [^a-zA-Z0-9_]
Comments
Comments in regex are inserted like this:
“Hello world!”.match(/wor(?#comment)ld/); // “world”
Many flavors of regex (including Java, .Net, Perl, Python, Ruby, and XPath)
include a free-spacing mode (enabled using the x flag). This lets you use
just a hash character to start a comment:
/
(0?d|1[0-2]) # Month
/ # Separator
(20dd) # Year between 2000 and 2099
/x
CONTENTS INCLUDE:
❱ About Regular Expressions
❱ Advanced Features
❱ The Dot Operator
❱ Quantifiers
❱ Flags
❱ Character Classes... and more!
Regular Expressions
3. 2 Regular Expressions
To provide immediate feedback on regex match results, this Refcard will
include matches in comments, like this:
DZone, Inc. | www.dzone.com
“Regex is awesome”.match(/regexp?/i); // “Regex”
The Dot Operator
The dot character (".") can be used to match any character except for new
line characters (n and r). Be careful with the dot operator: it may match
more than you would intuitively guess.
For example, the following regular expression will match the letter c,
followed by any character that isn't a new line, followed by the letter t:
“cat”.match(/c.t/); // “cat”
The dot operator will only match one character (“a”).
Contrast this simple usage of the dot operator with the following,
dangerous usage:
/<p class=”(.+)”>/
That expression initially looks like it would match an opening HTML
paragraph tag, make sure that the only attribute is the class attribute,
and store the class attribute in a group. The “+” is used to quantify the
dot operator: the “+” says ‘match one or more’. Most of the expression is
treated as literal text, and the (.+) looks for a string of any character of any
length and stores it in a group. However, it doesn’t work. Here’s what it
matches instead:
“<p class=’test’ onclick=’evil()’>”.match(/<p class=’(.+)’>/);
// [“<p class=’test’ onclick=’evil()’>”,
// “test’ onclick=’evil()”]
Instead of matching just the class, it has matched the on-click attribute.
This is because the dot operator, matched with the “+” quantifier, picked
up everything before the closing greater-than. The first lesson: the dot
operator can be a clumsy weapon. Always use the most targeted feature
of regex you can. The second lesson: quantifiers are ex-tremely powerful,
especially with the dot operator.
The Pipe Character
The pipe character causes a regular expression to match either the left side
or the right side. For example, the following expression will match either
"abc" or "def":
/abc|def/
Anchors
Regex anchors match specific, well-defined points, like the start of the
expression or the end of a word. Here are the most important:
Char Description Example
^ The caret character matches
the beginning of a string or the
beginning of a line in multi-line
mode.
/^./
Matches "a" in "abcndef",
or both "a" and "d" in mul-ti-
line mode.
$ The dollar character matches the
end of a string, or the end of a line
in multi-line mode.
/.$/
Matches "f" in "abcndef",
or both "c" and "f" in mul-ti-
line mode.
A Always matches the start of a
string, irrespective of multi-line
mode.
/A./
Matches "a" in "abcndef".
Z Always matches the end of a
string, irrespective of multi-line
mode. If the last character is a line
break, it will match the character
before.
/.Z/
Matches "f" in "abc
ndefn". Matches "f" in
"abcndef"
Char Description Example
z Always matches the end of a
string, irrespective of multi-line
mode. Will always match the last
character.
/.z/
Matches "f" in "abcndef".
Matches "n" in "abc
ndefn"
b Matches a word boundary: the
position between a word character
(w) and a non-word character
or the start or end of the string (
W|^|$).
/.b/
Matches "o", " " and "d" in
"Hello world".
B Matches a non-word boundary:
the position between two word
characters or two non-word
characters.
/B.B/
Matches every letter but
"H" and "o" in "Hello".
< Matches the start of a word -
similar to b, but only at the start
of the word.
/<./
Matches "H" and "w" in
"Hello world".
> Matches the end of a word -
similar to b, but only at the end of
the word.
/.>/
Matches "o" and "d" in
"Hello world".
Groups
Regex groups lump just about anything together. Groups are commonly
used to match part of a regular expression and refer back to it later, or to
repeat a group multiple times.
In this Refcard, groups will be represented like this:
“Regex is awesome”.match(/Regex is (awesome)/);
// [“Regex is awesome”, “awesome”]
Capturing Groups
The most basic group, the capturing group, is marked simply by putting
parentheses around whatever you want to group:
“Hello world!”.match(/Hello (w+)/); // [“Hello world”, “world”]
(Note the w character type, introduced on page 1.) Here the capturing
group is saved and can be referred to later.
Non-capturing Groups
There are also non-capturing groups, which will not save the matching text
and so cannot be referred back to later. We can get these by putting “?:”
after the opening parenthesis of a group:
“Hello world!”.match(/Hello (?:w+)/); // [“Hello world”]
Using Capturing vs. Non-capturing Groups
Use capturing groups only when you will need to refer to the grouped text
later. Non-capturing groups save memory and simplify debugging by not
filling up the returned array with data that you don't need.
Named Capture Groups
When a regex includes a named capture group, the func-tion will return an
object such that you can refer to the matched text using the name of the
group as the property:
“Hello world!”.match(/Hello (?<item>w+)/);
// {“0”: “Hello world”, “1”: “world”, “item”: “world”}
The name of the group is surrounded by angle brackets.
Backreferences
You can refer back to a previous capturing group by using a backreference.
After matching a group, refer back to the text matched in that group by
inserting a backward slash followed by the number of the capturing group.
For example, the following will match the same character four times in a row:
4. 3 Regular Expressions
DZone, Inc. | www.dzone.com
“aabbbb”.match(/(w)1{3}/); // bbbb
In this example, the group matches the first “b”; then 1 refers to that “b” and
matches the letters after it three times (the “3” inside the curly brackets).
Capturing groups are numbered in numerical order starting from 1.
A more complicated (but quite common) use-case involves matching
repeated words. To do this, we first want to match a word. We’ll match
a word by grabbing a word boundary, followed by a group capturing the
first word, followed by a space, followed by a backreference to match
the second word, and then a word boundary to make sure that we’re not
matching word-parts plus whole words:
“hello hello world!”.match(/b(w+) 1b/); “hello hello”
It is possible to have a backreference refer to a named capture group, too,
by using k<groupname>. The following regular expression will do the same
as above:
/b(?<word>w+) k<word>b/
Groups and Operator Precedence
Groups can be used to control effective order of operations. For example,
the pipe character has the lowest operator precedence, so pipes are often
used within groups to con-trol the insertion point of the logical ‘or’. The
following expression will match either "abcdef" or "abcghi":
/abc(def|ghi)/
Character Classes
A character class allows you to match a range or set of characters. The
following uses a character class to match any of the contained letters (in
this case, any vowel):
/[aeiou]/
The following regex will match “c”, followed by a vowel, followed by “t”; it
will match “cat” and “cot”, but not “ctt” or “cyt”:
/c[aeiou]t/
You can also match a range of characters using a character class. For
example, /[a-i]/ will match any of the letters between a and i (inclusive).
Character classes work with numbers too. This is particularly useful for
matching e.g. date-ranges. The following regular expression will match a
date between 1900 and 2013:
/19[0-9][0-9]|200[0-9]|201[0-3]/
Ranges and specific matches can be combined: /[a-mxyz]/ will match the
letters a through m, plus the letters x, y, and z.
Negated Character Classes
We can also use character classes to specify characters we don’t want
to match. These are called negated character classes, and are created by
putting a caret (“^”) at the be-ginning of the class. This indicates that we
want to match a character that doesn’t match any of the characters in a
character class.
The following negated character class will match a “c”, followed by a non-vowel,
followed by a “t”. This means that it will match “cyt”, “c0t” and “c.t”,
but will not match “cat”, “cot”, “ct” or “cyyt”:
/c[^aeiou]t/
Hot
Tip
Be careful to distinguish between negated character classes
(“match not-x”) and simply not matching a character (“do
not match x”).
Quantifiers
Quantifiers are used to repeat something multiple times. The basic quanti-fiers
are: + * ?
The most basic quantifier is "+", which means that something should
be repeated one or more times. For example, the following expression
matches one or more vowels:
“queueing”.match(/[aeiou]+/); // “ueuei”
Note that it doesn’t match the previous match (so not the first “u” multiple
times), but rather the previous expression, i.e. another vowel. In order to
match the previous match, use a group and a backreference, along with the
“*” operator.
The “*” operator is similar to the “+” operator, except that “*” matches zero
or more instead of one or more of an expression.
“ct”.match(/ca*t/); // “ct”
“cat”.match(/ca*t/); // “cat”
“caat”.match(/ca*t/); // “caat”
To match the same character from a match one or more times, use a
group, a backreference, and the “*” operator:
“saae”.match(/([aeiou])1*/); // “aa”
In the string “queuing,” this expression would match only “u”. In “aaeaa”, it
would match only the first “aa”.
The “?” character is the final basic quantifier. It makes the previous
expression optional:
“ct”.match(/ca?t/); // “ct”
“cat”.match(/ca?t/); // “cat”
“caat”.match(/ca?t/); // no match
We can make more powerful quantifiers using curly braces. The following
expressions will both match exactly five oc-currences of the letter “a”.
/^a{5}$/
/^aaaaa$/
The first is a lot more readable, especially with higher numbers or longer
groups, where it would be impractical to type them all out.
We can use curly braces to repeat something between a range of times:
/^a{3,5}$/
That will match the letter “a” repeated 3, 4, or 5 times.
If you want to match something repeated up to a certain number of times,
you can use 0 as the first number. If you want to match something more
than a certain number with no maximum, you can just leave the second
number blank:
/^a{3,}$/
This will match the letter “a” repeated 3 or more times with no maximum
number of repetitions.
Greedy vs. Lazy Matches
There are two types of matches: greedy or lazy. Most flavors of regular
expressions are greedy by default.
5. 4 Regular Expressions
DZone, Inc. | www.dzone.com
To make a quantifier lazy, append a question mark to it.
The difference is best illustrated using an example:
‘He said “Hello” and then “world!”’.match(/”.+”/);
// “Hello” and then “world!”
‘He said “Hello” and then “world!”’.match(/”.+?”/);
// “Hello”
The first match is greedy. It has matched as many characters as it can
(between the first and last quotes). The second match is lazy. It has matched
as few characters as possible (between the first and second quotes).
Note that the question mark does not make the quantifier optional (which is
what the question mark is usually used for in regular expressions).
More examples of greedy vs. lazy quantification:
‘He said “Hello” and then “world!”’.match(/”.+?”/);
// “Hello”
‘He said “Hello” and then “world!”’.match(/”.*?”/);
// “Hello”
“aaa”.match(/aa?a/); // “aaa”
“aaa”.match(/aa??a/); // “aa”
‘He said “Hello” and then “world!”’. match(/”.{2,}?”/);
// “Hello”
Note the difference the question mark makes.
Hot
Tip
Be aware of differences between greedy and lazy matching
when writing regular expressions. A greedy expression
can be rewritten into a shorter lazy expression. For regex
beginners, the difference often accounts for unexpected,
counterintuitive results.
Flags
Flags (also known as "modifiers") can be used to modify how the regular
expression engine will treat the regular expression. In most flavors of
regular expression, flags are specified after the final forward slash, like this:
/regexp?/mg
In this case, the flags are m and g.
Flag support does vary among regex flavors, so check the documentation
for the regex engine in your language or library.
The most common flags are: i, g, m, s, and x.
The i flag is the case insensitive flag. This tells the regex engine to ignore
the case. For example, the character class [a-z] will match [a-zA-Z] when
the i flag is used and the literal character “c” will also match C.
“CAT”.match(/c[aeiou]t/); // no match
“CAT”.match(/c[aeiou]t/i); // “CAT”
The g flag is the global flag. This tells the regex engine to match every
instance of the match, not just the first (which is the default). For example,
when replacing a match with a string in JavaScript, using the global flag
means that every match will be replaced, not just the first. The following
code (in JavaScript) demonstrates this:
“cat cat”.replace(/cat/, “dog”); // “dog cat”
“cat cat”.replace(/cat/g, “dog”); // “dog dog”
The m flag is the multiline flag. This tells the regex engine to match “^” and
“$” to the beginning and end of a line respectively, instead of the beginning
and end of the string (which is the default). Note that A and Z will still only
match the beginning and end of the string, regardless of what multi-line
mode is set to.
In the following code example, we see a lazy match from ^ to $ (we’re using
[sS] because the dot operator doesn’t match new lines by default). When
^ and $ match the beginning of the line (when multi-line mode is enabled), it
returns only one line of text. Otherwise, it returns everything.
“1st linen2nd line”.match(/^[sS]+?$/); // “1st linen2nd line”
“1st linen2nd line”.match(/^[sS]+?$/m); // “1st line”
The s flag is the singleline flag (also called the dotall flag in some flavors). By
default, the dot operator will match every character except for new lines. With
the s flag enabled, the dot operator will match everything, including new lines.
“This isna test”.match(/^.+$/); // no match
“This isna test”.match(/^.+$/s); // “This isna test”
Hot
Tip
This behavior can be simulated in languages where the s
flag isn’t supported by using [sS] instead of the s flag.
The x flag is the extended mode flag, which activates extended mode.
Extended mode allows you to make your regular expressions more
readable by adding whitespace and comments. Consider the following
two expressions, which both do exactly the same thing, but are drastically
different in terms of readability.
/([01]?d{1,3}|200d|201[0-3])([-/.])(0?[1-9]|1[012])2(0?[1-9]|[12]d|3[01])/
/
([01]?d{1,3}|200d|201[0-3]) # year
([-/.]) # separator
(0?[1-9]|1[012]) # month
2 # same separator
(0?[1-9]|[12]d|3[01]) # date
/x
Modifying Flags Mid-Expression
In some flavors of regular expressions, it is possible to modify a flag mid-expression
using a syntax similar (but not identical) to groups.
To enable a flag mid-expression, use “(?i)” where i is the flag you want to
activate:
“foo bar”.match(/foo bar/); // “foo bar”
“foo BAR”.match(/foo bar/); // no match
“foo bar”.match(/foo (?i)bar/); // “foo bar”
“foo BAR”.match(/foo (?i)bar/); // “foo BAR”
To disable a flag, use “(?-i)”, where i is the flag you want to disable:
“foo bar”.match(/foo bar/i); // “foo bar”
“foo BAR”.match(/foo bar/i); // “foo BAR”
“foo bar”.match(/foo (?-i)bar/i); // “foo bar”
“foo BAR”.match(/foo (?-i)bar/i); // no match
You can activate a flag and then deactivate it:
/foo (?i)bar(?-i) world/
And you can activate or deactivate multiple flags at the same time by using
“(?ix-gm)” where i and x are the flags we want enabling and g and m the
ones we want disabling.
Handling Special Characters
To handle special characters in a regular expression, either (a) use their
ASCII or Unicode numbers (in a variety of forms), or (b) just escape them.
Any character that isn’t $()*+.?[]^{|} will match itself, even if it is a special
character. For example, the following will match an umlaut:
/ß/
6. 5 Regular Expressions
To match any of the characters mentioned above ()$()*+.?[]^{|}() you can
prepend them with a backward slash to mark them as literal characters:
DZone, Inc. | www.dzone.com
/$/
It is impractical to put some special characters, such as emoji (which will
not display in most operating systems) and invisible characters, right into a
regular expression. To solve this problem, when writing regular expressions
just use their ASCII or Unicode character codes.
The first ASCII syntax allows you to specify the ASCII character code as
octal, which must begin with a 0 or it will be parsed as a backreference. To
match a space, we can use this:
/040/
You can match ASCII characters using hexadecimal, too. The following will
also match a space:
/x20/
You can also match characters using their Unicode character codes, but
only as hexadecimal. The following will match a space:
/u0020/
A few characters can also be matched using shorter escapes. We saw a
few of them previously (such as t and n for tab and new line characters,
respectively – pretty standard), and the following table defines all of them
properly:
Escape Description Character code
a The alarm character (bell). u0007
b Backspace character, but only when
in a character class—word boundary
otherwise.
u0008
e Escape character. u001B
f Form feed character. u000C
n New line character. u000A
r Carriage return character. u000D
t Tab character. u0009
v Vertical tab character. u000B
Finally, it is possible to specify control characters by pre-pending the letter
of the character with c. The following will match the character created by
ctrl+c:
/cC/
Advanced Features
Advanced features vary more widely by flavor, but several are especially
useful and worth covering here.
POSIX Character Classes
POSIX character classes are a special type of character class which
allow you to match a common range of characters—such as printing
characters—without having to type out an ordinary character class every
time.
The table below lists all the POSIX character classes available, plus their
normal character class equivalents.
Class Description Equivalent
[:alnum:] Letters and digits [a-zA-Z0-9]
Class Description Equivalent
[:alpha:] Letters [a-zA-Z]
[:ascii:] Character codes 0 - x7F [x00-x7F]
[:blank:] Space or tab character [ t]
[:cntrl:] Control characters [x00-x1Fx7F]
[:digit:] Decimal digits [0-9]
[:graph:] Printing characters (minus spaces) [x21-x7E]
[:lower:] Lower case letters [a-z]
[:print:] Printing characters (including
spaces)
[x20-x7E]
[:punct:] Punctuation: [:graph:] minus
[:alnum:]
[:space:] White space [ trnvf]
[:upper:] Upper case letters [A-Z]
[:word:] Digits, letters, and underscores [a-zA-Z0-9_]
[:xdigit:] Hexadecimal digits [a-fA-F0-9]
aSERTIONS
Lookahead Assertions
Assertions match characters but don’t return them. Consider the
expression “>”, which will match only if there is an end of word (i.e.,
the engine will look for (W|$)), but won’t actually return the space or
punctuation mark following the word.
Assertions can accomplish the same thing, but explicitly and therefore with
more control.
For a positive lookahead assertion, use “(?=x)”.
The following assertion will behave exactly like >:
/.(?=W|$)/
This will match the “o” and “d” in the string “Hello world” but it will not
match the space or the end.
Consider the following, more string-centered example. This expression
matches “bc” in the string “abcde”:
abcde”.match(/[a-z]{2}(?=d)/); // “bc”
Or consider the following example, which uses a normal capturing group:
“abcde”.match(/[a-z]{2}(d)/); // [“bcd”, “d”]
While the expression with the capturing group matches the “d” and then
returns it, the assertion matches the character and does not return it. This
can be useful for backrefer-ences or to avoid cluttering up the return value.
The assertion is the “(?=d)” bit. Pretty much anything you would put in a
group in a regular expression can be put into an assertion.
Negative Lookahead Assertions
Use a negative lookahead assertion (“(?!x)”) when you want to match
something that is not followed by something else. For example, the
following will match any letter that is not followed by a vowel:
This expression returns e, l, and o in hello. Here we notice a difference
between positive assertions and negative assertions. The following positive
lookahead does not behave the same as the previous negative lookahead:
/w(?=[^aeiou])/