This talk centers on two papers due to appear in the coming months:
Neerja Mhaskar and Michael Soltys, Non-repetitive strings over alphabet lists
to appear in WALCOM, February 2015.
Neerja Mhaskar and Michael Soltys, String Shuffle: Circuits and Graphs
to appear in the Journal of Discrete Algorithms, January 2015.
Visit http://soltys.cs.csuci.edu for more details (these two papers are numbers 3 and 19 on the page), as well as Python programs that can be used to illustrate the ideas in the papers. We are going to introduce some basic concepts related to computations on strings, present some recent results, and propose two open problems.
Thue showed that there exist arbitrarily long square-free strings over an alphabet of three symbols (not true for two symbols). An open problem was posed, which is a generalization of Thue’s original result: given an alphabet list L = L1, . . . , Ln, where |Li| = 3, is it always possible to find a square-free string, w = w1w2 . . . wn, where wi ∈ Li? In this paper we show that squares can be forced on square-free strings over alphabet lists iff a suffix of the square-free string conforms to a pattern which we term an offending suffix. We also prove properties of offending suffixes. However, the problem remains tantalizingly open.
1. Algorithms on Strings
Michael Soltys
CSU Channel Islands
Computer Science
February 4, 2015
Strings - Soltys Math/CS Seminar Title - 1/27
2. String problems are at the heart of Computer Science:
Rewriting systems are Turing complete
In practice analysis of strings is central to:
Algorithmic biology
Text processing
Language theory
Coding theory
3. Basics (COMP 454)
An alphabet is a finite, non-empty set of distinct symbols, denoted
usually by Σ.
e.g., Σ = {0, 1} (binary alphabet)
Σ = {a, b, c, . . . , z} (lower-case letters alphabet)
A string, also called word, is a finite ordered sequence of symbols
chosen from some alphabet.
e.g., 010011101011
|w| denotes the length of the string w.
e.g., |010011101011| = 12
The empty string, ε, |ε| = 0, is in any Σ∗ by default.
4. Σ^k is the set of strings over Σ of length exactly k.
e.g., if Σ = {0, 1}, then
Σ^0 = {ε}
Σ^1 = Σ
Σ^2 = {00, 01, 10, 11}, etc. |Σ^k| = ?
Kleene’s star Σ∗ is the set of all strings over Σ:
Σ∗ = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ . . . , where Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ . . . = Σ+
Concatenation: if x, y are strings, with x = a1a2 . . . am and y = b1b2 . . . bn, then
x · y = xy (juxtaposition) = a1a2 . . . amb1b2 . . . bn
(cf. the UNIX cat command)
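The definitions of Σ^k and concatenation can be tried out directly. The following is a minimal Python sketch; the helper name sigma_k is an assumption made for illustration and does not come from the talk.

```python
from itertools import product

def sigma_k(sigma, k):
    """All strings over the alphabet `sigma` of length exactly k (hypothetical helper)."""
    return ["".join(t) for t in product(sorted(sigma), repeat=k)]

sigma = {"0", "1"}
print(sigma_k(sigma, 0))        # only the empty string
print(sigma_k(sigma, 2))        # the four strings of length 2
print(len(sigma_k(sigma, 5)))   # experiment with the size of Σ^k

# Concatenation is juxtaposition; Python's + on str does exactly this.
x, y = "0100", "111"
print(x + y)
```

Σ∗ itself is infinite, so a program can only enumerate it level by level, one Σ^k at a time.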
5. A language L is a collection of strings over some alphabet Σ, i.e.,
L ⊆ Σ∗. E.g.,
L = {ε, 01, 0011, 000111, . . .} = {0^n 1^n | n ≥ 0} (1)
Note:
wε = εw = w.
{ε} ≠ ∅; one is the language consisting of the single string ε,
and the other is the empty language.
6. Consider L = {w| w is of the form x01y ∈ Σ∗ } where Σ = {0, 1}.
We want to specify a DFA A = (Q, Σ, δ, q0, F) that accepts all and
only the strings in L.
Σ = {0, 1}, Q = {q0, q1, q2}, and F = {q1}.
Transition diagram
[Figure: q0 loops on 1 and moves to q2 on 0; q2 loops on 0 and moves to q1 on 1; q1 loops on 0, 1]
Transition table
      0    1
q0    q2   q0
q1    q1   q1
q2    q2   q1
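The transition table can be run as a program. The sketch below encodes δ as a Python dictionary and simulates the DFA; the names DELTA, START, ACCEPTING and accepts are illustrative choices, not notation from the talk.

```python
from itertools import product

# Transition function δ of the DFA A, copied from the transition table.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START, ACCEPTING = "q0", {"q1"}

def accepts(w):
    """Run A on the binary string w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING

# Sanity check: on all binary strings of length < 9, A accepts exactly
# the strings of the form x01y, i.e. those containing 01 as a substring.
for n in range(9):
    for t in product("01", repeat=n):
        w = "".join(t)
        assert accepts(w) == ("01" in w)
print(accepts("1101"), accepts("1110"))
```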
7. A context-free grammar (CFG) is G = (V , T, P, S) — Variables,
Terminals, Productions, Start variable
Ex. P −→ ε|0|1|0P0|1P1.
Ex. G = ({E, I}, T, P, E) where T = {+, ∗, (, ), a, b, 0, 1} and P is
the following set of productions:
E −→ I|E + E|E ∗ E|(E)
I −→ a|b|Ia|Ib|I0|I1
If αAβ ∈ (V ∪ T)∗, A ∈ V , and A −→ γ is a production, then
αAβ ⇒ αγβ. We use ⇒∗ to denote 0 or more steps.
L(G) = {w ∈ T∗ | S ⇒∗ w}
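For the first example grammar, P −→ ε|0|1|0P0|1P1, membership in L(G) can be decided by undoing productions from the outside in. This is a small sketch for that one grammar only, not a general CFG parser; the function name derives is an assumption.

```python
def derives(w):
    """True iff P =>* w for the grammar P -> ε | 0 | 1 | 0P0 | 1P1."""
    if w in ("", "0", "1"):              # productions P -> ε | 0 | 1
        return True
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "01":
        return derives(w[1:-1])          # productions P -> 0P0 | 1P1
    return False

print(derives("0110"), derives("01"))
```

The language generated is the set of binary palindromes, which the check below in the usage compares against string reversal: `derives(w) == (w == w[::-1])` for binary w.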
8. Context-sensitive grammars (CSG) have rules of the form:
α → β
where α, β ∈ (T ∪ V )∗ and |α| ≤ |β|. A language is context
sensitive if it has a CSG.
Fact: It turns out that CSL = NSPACE(n)
A rewriting system (also called a Semi-Thue system) is a grammar
where there are no restrictions; α → β for arbitrary
α, β ∈ (V ∪ T)∗.
Fact: It turns out that a rewriting system corresponds to the most
general model of computation; i.e., a language has a rewriting
system iff it is “computable.”
9. A second course in Automata
Chomsky–Schützenberger Theorem: If L is a CFL, then there
exist a regular language R, an n, and a homomorphism h, such
that L = h(PAREN_n ∩ R).
Parikh’s Theorem: If Σ = {a1, a2, . . . , an}, the signature of a
string x ∈ Σ∗ is (#a1(x), #a2(x), . . . , #an(x)), i.e., the number of
occurrences of each symbol, in a fixed order. The signature of a
language is defined by extension; regular languages and CFLs have
the same signatures.
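A signature is easy to compute, and a tiny experiment illustrates Parikh's Theorem on one pair of languages. In the sketch below the function name signature and the two finite samples are illustrative assumptions: the context-free language {0^n 1^n} and the regular language (01)∗ have the same signatures.

```python
def signature(x, alphabet):
    """Occurrence counts of each symbol of `alphabet` in x, in the given fixed order."""
    return tuple(x.count(a) for a in alphabet)

alphabet = "01"
context_free = {"0" * n + "1" * n for n in range(6)}   # finite sample of {0^n 1^n}
regular = {"01" * n for n in range(6)}                 # finite sample of (01)*

same = ({signature(x, alphabet) for x in context_free}
        == {signature(x, alphabet) for x in regular})
print(signature("000111", alphabet), same)
```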
10. This presentation is about algorithms on strings.
Based on two papers that are coming out in the next months:
Neerja Mhaskar and Michael Soltys
Non-repetitive strings over alphabet lists
to appear in WALCOM, February 2015.
Neerja Mhaskar and Michael Soltys
String Shuffle: Circuits and Graphs
accepted in the Journal of Discrete Algorithms, 2015
Both at http://soltys.cs.csuci.edu (papers 3 & 19)
11. Non-repetitive strings
A word is non-repetitive if it does not contain a subword of the
form vv.
Word with repetition 010101110
Word without repetition 101
Easy observation: what is the smallest n so that any word over
Σ = {0, 1} of length ≥ n has at least one repetition?
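The easy observation can be checked by brute force. The sketch below tests every factor of a word for the form vv and then searches for the smallest length at which every word over a given alphabet contains a repetition. The function names are assumptions; the search only terminates for alphabets of one or two symbols, since by Thue's result arbitrarily long non-repetitive words exist over three symbols.

```python
from itertools import product

def has_square(w):
    """True iff w contains a subword of the form vv with v non-empty."""
    n = len(w)
    for i in range(n):
        for k in range(1, (n - i) // 2 + 1):
            if w[i:i + k] == w[i + k:i + 2 * k]:
                return True
    return False

def smallest_forcing_length(sigma):
    """Smallest n such that every word of length n over sigma has a repetition.

    Only call this with |sigma| <= 2; for three or more symbols it loops forever.
    """
    n = 1
    while any(not has_square("".join(t)) for t in product(sigma, repeat=n)):
        n += 1
    return n

print(has_square("010101110"), has_square("101"))
print(smallest_forcing_length("01"))
```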
12. Original Thue problem
For Σ3 = {1, 2, 3}, the morphism S, due to A. Thue:
S: 1 → 12312, 2 → 131232, 3 → 1323132
Given a string w ∈ Σ3∗, we let S(w) denote w with every symbol
replaced by its corresponding substitution:
S(w) = S(w1w2 . . . wn) = S(w1)S(w2) . . . S(wn)
Lemma: If w is non-repetitive then so is S(w).
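Iterating S from the one-symbol word 1 gives ever longer non-repetitive strings, and the lemma can be spot-checked mechanically. This is a small experiment rather than a proof; the names apply_morphism and has_square are illustrative.

```python
# Thue's morphism S on Σ3 = {1, 2, 3}, as given on the slide.
S = {"1": "12312", "2": "131232", "3": "1323132"}

def apply_morphism(w):
    """S(w): replace every symbol of w by its image under S."""
    return "".join(S[c] for c in w)

def has_square(w):
    """True iff w contains a subword of the form vv with v non-empty."""
    n = len(w)
    return any(w[i:i + k] == w[i + k:i + 2 * k]
               for i in range(n)
               for k in range(1, (n - i) // 2 + 1))

w = "1"
for _ in range(3):
    w = apply_morphism(w)
    assert not has_square(w)   # consistent with the lemma
print(len(w), w[:29])
```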
13. Problem extended to alphabet lists
List of alphabets L = L1, L2, . . . , Ln
Can we generate non-repetitive words
w = w1w2 . . . wn, such that the symbol wi ∈ Li ?
Studied by: [GKM10], [Sha09], and it is a natural extension of the
original problem posed and solved by A. Thue.
E.g., L1 = {a, b, c}, L2 = {b, c, d}, L3 = {a, d, 2}, in this case
w = ac2 is over L1, L2, L3 and non-repetitive.
Is that true for any list where |Li| = 3 for all i?
14. [GKM10] shows that this can be done when |Li| = 4 for all i, with this
algorithm:
pick any w1 ∈ L1
at step i + 1 (w = w1w2 . . . wi is non-repetitive), pick a ∈ Li+1
if wa is non-repetitive, then let wi+1 = a
if wa has a square vv, then
vv must be a suffix of wa
delete the right copy of v from wa, and restart from the shortened word.
Using a sophisticated Lovász Local Lemma argument and Catalan
numbers, one can show that the above algorithm succeeds with
non-zero probability.
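The steps above can be sketched in Python. This is a reading of the slide, not the code of [GKM10]: it assumes the symbol a is picked uniformly at random, and that "restart" means continuing from the shortened word. The function names, the max_steps safety bound and the example lists are all assumptions.

```python
import random

def square_suffix_len(w):
    """Return |v| if the sequence w ends in a square vv (v non-empty), else 0."""
    n = len(w)
    for k in range(1, n // 2 + 1):
        if w[n - 2 * k:n - k] == w[n - k:]:
            return k
    return 0

def nonrepetitive_from_lists(lists, rng=None, max_steps=100_000):
    """Try to build w = w1...wn with wi in lists[i-1] and no square.

    Invariant: w is always non-repetitive, so a new square can only be a suffix.
    """
    rng = rng or random.Random()
    w = []
    for _ in range(max_steps):
        if len(w) == len(lists):
            return w
        w.append(rng.choice(sorted(lists[len(w)])))   # pick a in L_{i+1}
        k = square_suffix_len(w)
        if k:
            del w[-k:]                                # delete the right copy of v
    raise RuntimeError("gave up after max_steps random choices")

lists = ["abcd", "bcde", "acde", "abde"] * 10         # hypothetical lists, |Li| = 4
print("".join(nonrepetitive_from_lists(lists, random.Random(0))))
```

Deleting the right copy of v always leaves a prefix of the previous non-repetitive word, which is why the invariant holds after every step.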
15. Particular “yes” cases for L1, L2, . . . , Ln
Has a system of distinct representatives (SDR)
Has the union property
Can be mapped consistently to Σ3 = {1, 2, 3}
It is a partition
16. Open Problem 1
Given any list L1, L2, . . . , Ln, where |Li| = 3, can we always find a
non-repetitive string w over such a list?
17. Shuffle
w is the shuffle of u, v: w = u ⊙ v
w = 0110110011101000
u = 01101110
v = 10101000
w is a shuffle of u and v provided:
u = x1x2 · · · xk
v = y1y2 · · · yk
where the xi, yi are strings (possibly empty), and w is obtained by “interleaving”: w = x1y1x2y2 · · · xkyk.
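Whether w is a shuffle of u and v can be decided with a standard dynamic program over prefixes. The sketch below is that textbook method, not the circuit construction of the paper, and the name is_shuffle is an assumption.

```python
def is_shuffle(u, v, w):
    """True iff w can be obtained by interleaving u and v."""
    if len(u) + len(v) != len(w):
        return False
    # ok[i][j] is True iff w[:i+j] is a shuffle of u[:i] and v[:j].
    ok = [[False] * (len(v) + 1) for _ in range(len(u) + 1)]
    ok[0][0] = True
    for i in range(len(u) + 1):
        for j in range(len(v) + 1):
            if i > 0 and ok[i - 1][j] and u[i - 1] == w[i + j - 1]:
                ok[i][j] = True    # last symbol of w[:i+j] taken from u
            if j > 0 and ok[i][j - 1] and v[j - 1] == w[i + j - 1]:
                ok[i][j] = True    # last symbol of w[:i+j] taken from v
    return ok[len(u)][len(v)]

# The example from the slide.
print(is_shuffle("01101110", "10101000", "0110110011101000"))
```

The table has (|u| + 1)(|v| + 1) entries, so the running time is quadratic in the input length.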
19. Square Shuffle
w is a square provided it is equal to a shuffle of a string u with itself, i.e.,
∃u s.t. w = u ⊙ u
The string w = 0110110011101000 is a square, with
u = 01101100
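For short strings a square root u can be found by exhaustive search. The sketch below assigns each symbol of w either to the first copy of u or to the second copy, which must replay the first copy in order. It is a brute-force illustration with exponential worst-case running time, not an algorithm from the paper; the function name is an assumption.

```python
from functools import lru_cache

def square_shuffle_root(w):
    """Return some u such that w is a shuffle of u with itself, or None."""
    half, odd = divmod(len(w), 2)
    if odd:
        return None

    @lru_cache(maxsize=None)
    def search(u, matched):
        # u: first copy built so far; matched: symbols of u replayed by the second copy.
        pos = len(u) + matched
        if pos == len(w):
            return u if matched == len(u) else None
        c = w[pos]
        if matched < len(u) and u[matched] == c:      # give c to the second copy
            found = search(u, matched + 1)
            if found is not None:
                return found
        if len(u) < half:                             # give c to the first copy
            return search(u + c, matched)
        return None

    return search("", 0)

print(square_shuffle_root("0110110011101000"))
print(square_shuffle_root("0011"), square_shuffle_root("01"))
```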
20. Result from 2013
Given an alphabet Σ, |Σ| ≥ 7,
Square = {w : ∃u (w = u ⊙ u)}
is NP-complete.
What we leave open:
What about |Σ| = 2? (For |Σ| = 1, Square is just the set of
even-length strings.)
What about |Σ| = ∞, where each symbol occurs at most, say,
6 times? (If each symbol occurs at most 4 times, Square can be
reduced to 2-Sat; see P. Austrin's Stack Exchange post,
http://bit.ly/WATco3.)
22. Open Problem 2
Is Square NP-complete for alphabets of size {2, 3, 4, 5, 6} ?
23. Upper and lower bounds
Shuffle(x, y, w) holds if and only if w is a shuffle of x, y.
Shuffle ∉ AC^0, but Shuffle ∈ AC^1.
25. Lower bound
Parity(x) = ⋁ Shuffle(0^{|x|−i}, 1^i, x), where the disjunction ranges over all odd i with 0 ≤ i ≤ |x|.
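The reduction from Parity to Shuffle can be checked on small inputs. The key fact is that Shuffle(0^{|x|−i}, 1^i, x) holds exactly when x contains i ones. The sketch below reuses a textbook dynamic program for Shuffle; the function names are illustrative and the code is a sanity check, not the circuit from the paper.

```python
from itertools import product

def is_shuffle(u, v, w):
    """True iff w can be obtained by interleaving u and v (prefix dynamic program)."""
    if len(u) + len(v) != len(w):
        return False
    ok = [[False] * (len(v) + 1) for _ in range(len(u) + 1)]
    ok[0][0] = True
    for i in range(len(u) + 1):
        for j in range(len(v) + 1):
            if i > 0 and ok[i - 1][j] and u[i - 1] == w[i + j - 1]:
                ok[i][j] = True
            if j > 0 and ok[i][j - 1] and v[j - 1] == w[i + j - 1]:
                ok[i][j] = True
    return ok[len(u)][len(v)]

def parity_via_shuffle(x):
    """Parity(x) as an OR over odd i of Shuffle(0^(|x|-i), 1^i, x)."""
    n = len(x)
    return any(is_shuffle("0" * (n - i), "1" * i, x) for i in range(1, n + 1, 2))

# Compare with the direct definition on all binary strings of length < 7.
for n in range(7):
    for t in product("01", repeat=n):
        x = "".join(t)
        assert parity_via_shuffle(x) == (x.count("1") % 2 == 1)
print(parity_via_shuffle("0111"), parity_via_shuffle("0110"))
```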
26. [Figure: Parity(x) computed as an OR of Shuffle gates for i = 1, 3, 5, . . . , n, each with inputs 0^{n−i}, 1^i, and x]
27. Open Problem 3
Is Shuffle in NC^1?
28. Announcement of two upcoming seminars
1. February 16, 2015, 6:00-7:00pm
Bell Tower 1471
Ryszard Janicki
On Pairwise Comparisons Based Rankings
2. February 16, 2015, 7:00-8:00pm
Bell Tower 1471
Neerja Mhaskar
Repetition in Strings and String Shuffles
Computer Science Seminars:
http://compsci.csuci.edu/degrees/seminars.htm
29. References
[GKM10] Jaroslaw Grytczuk, Jakub Kozik, and Piotr Micek.
A new approach to nonrepetitive sequences.
arXiv:1103.3809, December 2010.
[Sha09] Jeffrey Shallit.
A second course in formal languages and automata theory.
Cambridge University Press, 2009.