The slides open with a deliberately convoluted word problem about a man travelling by car and train, then turn to the real problem: loading a 6 GB structured XML file in Perl. DOM parsing runs out of memory and SAX is possible but painful, so the slides present XML::Twig as a hybrid of the two and assess its advantages and disadvantages.
Solving Complex XML Parsing Problem with XML::Twig
1. The Problem
If a man, A, who weighs 11 stone leaves from his home at 8:30 in the morning in a car whose consumption is 16.25 mpg at an average speed of 40 m.p.h. to his office which is 12 miles away and he stops for a coffee on the way for 15 minutes and also puts air in one of his tyres which has a slow puncture letting out air at a rate of 2 lbs per square inch per mile travelled when the car is moving at 32 m.p.h. and he picks up a hitch-hiker B who weighs 14 stone plus suitcase

But hitch-hiker B who is a political activist distributes leaflets from his suitcase each of which weigh an ounce at the scale of 2 leaflets per person at every bus stop and every vehicle on either side of them at every red traffic light during the journey which includes 20 bus stops with an average of 6 people per stop, 5 lorries each with a passenger one of which exchanged a Yorkie Bar weighing an ounce for 12 of the leaflets, and 2 coaches each containing 51 people, 7 of which from one coach returned the leaflets and 16 people from the other coach who asked for a further leaflet each for a member of one of their families

Assuming that man A then had to travel a further 2.86 miles out of his way to drop off hitch-hiker B how late would man A be in arriving at the office by 9:30 a.m.?

If he still had 6 miles to travel and his watch was running 23 minutes slow but the clock at the office was running 2 minutes faster than his was in fact 17 minutes and 3 secs ahead of the correct time which was 2:30 in the morning in Caracas

If when 5 miles from the office he telephoned his boss to apologize for being late but was told by his boss C to pick up a package 2.63 miles away from his present location and deliver it to client D in Bristol by train, by 4:30 that afternoon and at the same time man D was mistakenly told to come to London to receive same package from man A

Now man A's train, train 1, left 30 mins. late but man D's train, train 2, left 5 mins early so when the trains passed each other train 1 was travelling at 75 m.p.h. to make up for lost time and train 2 was travelling at 52 m.p.h.

Would man A reach Bristol earlier or later according to his watch which was now running 5 mins. slower than man D's would have been had he not got off the train and checked the correct time at a station between Bristol and London and stopped to phone A's boss, man C to double check A would be there to meet him and discover his mistake catch next train, train 3, back to Bristol which unlike A's train 1 which stopped at 4 stations on the way for 6 mins each stop was an express train

D's train caught up with A's train 1, 4 miles from Bristol

As the trains drew alongside each other A's train was travelling at 12 m.p.h. and D's train was travelling at 13.6 m.p.h. and man A was sat in the front...

How long would it take to fill the bath?
Thursday, 27 September, 12
2. The Problem
• Loading from a 6 GB structured XML file
• Methods:
– DOM - not enough memory
– Each element, text, attribute, allocated separately
– “Perl is a profligate wastrel when it comes to memory use. There is a saying that to estimate memory usage of Perl, assume a reasonable algorithm for memory allocation, multiply that estimate by 10, and while you still may miss the mark, at least you won't be quite so astonished” (perldoc perldebguts)
3. The Problem
• Methods:
– SAX - possible but painful
– Memory is not an issue
– Event-based, handlers for each element, text
– Developer needs to build all data structures
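To make the "developer needs to build all data structures" point concrete, here is a minimal sketch of event-based parsing. It uses XML::Parser's expat-style callbacks as a stand-in for a SAX API; that choice is an assumption, since the slides do not say which module was tried. The sample XML and the bookkeeping variables (@path, $in_primary, $buffer) are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input; the real file in the slides is about 6 GB.
my $xml = <<'XML';
<root>
  <entry><gene><name type="primary">ABC1</name></gene></entry>
  <entry><gene><name type="primary">XYZ2</name></gene></entry>
</root>
XML

my @genes;           # results have to be assembled by hand
my @path;            # stack of currently open element names
my $in_primary = 0;  # true while inside <gene><name type="primary">
my $buffer     = ''; # character data may arrive in several chunks

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element, %attr) = @_;
            push @path, $element;
            if (   $element eq 'name'
                && ($attr{type} // '') eq 'primary'
                && @path >= 2
                && $path[-2] eq 'gene')
            {
                $in_primary = 1;
                $buffer     = '';
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            $buffer .= $text if $in_primary;
        },
        End => sub {
            my ($expat, $element) = @_;
            if ($element eq 'name' && $in_primary) {
                push @genes, $buffer;
                $in_primary = 0;
            }
            pop @path;
        },
    },
);
$parser->parse($xml);

print "$_\n" for @genes;
```

Memory stays flat because nothing is kept beyond @genes, but every piece of context (where we are in the tree, which text belongs to which element) has to be tracked by the caller.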
7. XML::Twig - a hybrid of DOM and SAX
sub main {
    ...
    my $twig = XML::Twig->new(
        # Elements to process; a handler will be called for each
        twig_handlers => { entry => \&entry },
        # Elements to ignore; these are not loaded
        ignore_elts => { reference   => 'discard',
                         dbReference => 'discard' },
        do_not_chain_handlers => 1
    );
    $twig->parsefile($data_file);
    # Clean up memory when done
    $twig->purge;
}
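The setup above can be tried on a small in-memory document. The following is a minimal sketch rather than the original program: the sample XML and the $seen and $saw_ignored counters are hypothetical, and parse() on a string is used in place of parsefile() so the script is self-contained.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample; element names follow the slides.
my $xml = <<'XML';
<root>
  <entry><reference>skipped</reference><gene><name type="primary">ABC1</name></gene></entry>
  <entry><dbReference id="x"/><gene><name type="primary">XYZ2</name></gene></entry>
</root>
XML

my $seen        = 0;  # number of <entry> elements handled
my $saw_ignored = 0;  # stays 0 if ignore_elts did its job

sub entry {
    my ($twig, $entry) = @_;
    $seen++;
    # Ignored elements should never appear in the entry's subtree.
    $saw_ignored++
        if $entry->first_descendant('reference')
        || $entry->first_descendant('dbReference');
    $twig->purge;  # free everything parsed so far
    return 0;
}

my $twig = XML::Twig->new(
    twig_handlers => { entry => \&entry },
    ignore_elts   => { reference   => 'discard',
                       dbReference => 'discard' },
);
$twig->parse($xml);  # parsefile($path) would be used for a file on disk
$twig->purge;

print "entries handled: $seen, ignored elements seen: $saw_ignored\n";
```

Note the backslash in \&entry: the handler must be passed as a code reference, otherwise entry() would be called once while the hash is being built.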
11. sub entry {
    my ($twig, $entry) = @_;

    # We can use XPath to find stuff
    my ($organism_name) =
        $entry->get_xpath('organism/name[@type = "scientific"]');
    return unless ($organism_name &&
        $organism_name->trimmed_text() eq 'homo sapiens');

    my ($gene) = $entry->get_xpath('gene');
    return unless (defined($gene));

    my ($gene_name) = $gene->get_xpath('name[@type = "primary"]');
    return unless (defined($gene_name));

    # Methods to get element data
    $gene_name = $gene_name->trimmed_text();
    my @synonyms = map {
        $_->trimmed_text()
    } $gene->get_xpath('name[@type = "synonym"]');
    ...

    # Clean up memory when done
    $entry->purge;
    return 0;
}
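A self-contained variant of this handler, run against a tiny document, might look like the sketch below. The sample XML and the %synonyms_for hash are hypothetical; the slide elides what the original code does with the extracted values, so storing them in a hash is an assumption made only for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample with one matching and one non-matching entry.
my $xml = <<'XML';
<root>
  <entry>
    <organism><name type="scientific">homo sapiens</name></organism>
    <gene>
      <name type="primary"> ABC1 </name>
      <name type="synonym">ABC</name>
      <name type="synonym">AB1</name>
    </gene>
  </entry>
  <entry>
    <organism><name type="scientific">mus musculus</name></organism>
    <gene><name type="primary">XYZ2</name></gene>
  </entry>
</root>
XML

my %synonyms_for;  # primary gene name => array ref of synonyms

my $twig = XML::Twig->new(
    twig_handlers => {
        entry => sub {
            my ($t, $entry) = @_;
            my ($organism) =
                $entry->get_xpath('organism/name[@type = "scientific"]');
            if ($organism && $organism->trimmed_text eq 'homo sapiens') {
                my ($primary) =
                    $entry->get_xpath('gene/name[@type = "primary"]');
                if ($primary) {
                    # trimmed_text strips the surrounding whitespace
                    $synonyms_for{ $primary->trimmed_text } = [
                        map { $_->trimmed_text }
                            $entry->get_xpath('gene/name[@type = "synonym"]')
                    ];
                }
            }
            $t->purge;  # release the entry whether or not it matched
        },
    },
);
$twig->parse($xml);

for my $gene (sort keys %synonyms_for) {
    print "$gene: @{ $synonyms_for{$gene} }\n";
}
```

Purging on every call, rather than only after a match, keeps skipped entries from accumulating in memory between matches.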
12. Assessment
• Advantages
– Only elements of interest are loaded when needed
– Subset of XPath for navigation, among other methods
– Can both read and rewrite XML
– Code simpler than SAX, without DOM memory death
• Disadvantages
– XML::Twig is... slooooooooow-ish
• Compared to SAX and DOM
• Both of which primarily use C
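As an illustration of the "read and rewrite" point, here is a minimal sketch that upper-cases the text of every name element and serialises the result. The sample XML and the choice of transformation are hypothetical and not taken from the slides.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document.
my $xml = '<root><entry><name>abc1</name></entry>'
        . '<entry><name>xyz2</name></entry></root>';

my $twig = XML::Twig->new(
    twig_handlers => {
        # Inside a handler, $_ is set to the matched element.
        name => sub { $_->set_text( uc $_->text ) },
    },
);
$twig->parse($xml);

# Serialise the modified tree back to a string.
my $rewritten = $twig->sprint;
print "$rewritten\n";
```

For a file too large to hold in memory, calling $twig->flush inside the handler would presumably be the better fit, since it prints what has been parsed so far and then frees it; sprint is used here only because the sample is tiny.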