A personal journey towards
more reproducible
networking research
Olivier Bonaventure
UCLouvain
https://inl.info.ucl.ac.be
Agenda
• The reproducibility crisis
• Dagstuhl’s guide to Reproducibility for Experimental Networking
Research
• Artefacts Evaluations
The reproducibility crisis in medicine
Source: Begley, C., Ellis, L. Raise standards for preclinical cancer research. Nature 483, 531–533 (2012).
https://doi.org/10.1038/483531a
The reproducibility crisis
Reproducibility in Computer Science
• An interesting study
Software artefacts
• Methodology
• Select papers from top conferences and journals
• Check if they use software that does not require special
hardware
• Try to obtain software
• Directly with URL in paper
• By contacting corresponding authors by email
• avoid contacting the same author several times
• Try to recompile software
• With modifications that do not require more than 30 minutes
Survey results
Proposal : sharing contract in papers
Reproducibility in networking research
• A short study conducted with Nicolas Houtain on software artefacts
presented during the SIGCOMM’15 Community session
• Methodology and dataset
• Conference programmes
• SIGCOMM, 2013 and 2014
• CoNext, 2013 and 2014
• Hotnets 2013 and 2014
• Read each paper to identify
• Whether software was used to produce results
• If yes, is this software publicly available ?
• Directly mentioned in the paper
• Available after contacting the authors by email
How accessible is software written for scientific papers ?
[Pie chart, SIGCOMM/CoNEXT/Hotnets papers: URL in paper, no software required, no reply from authors, authors replied]
Software from SIGCOMM papers
[Pie chart, SIGCOMM papers: URL in paper, no software required, no reply from authors, authors replied]
Software from Hotnets papers
[Pie chart, Hotnets papers: URL in paper, no software required, no reply from authors, authors replied]
Software from CoNEXT papers
[Pie chart, CoNEXT papers: URL in paper, no software required, no reply from authors, authors replied]
Authors' responses
[Pie chart, SIGCOMM/CoNEXT/Hotnets: complete, not yet ready, partial, not available]
2013 conferences
[Pie chart, SIGCOMM/CoNEXT/Hotnets 2013: complete, not yet ready, partial, not available]
2014 conferences
[Pie chart, SIGCOMM/CoNEXT/Hotnets 2014: complete, not yet ready, partial, not available]
Recommendations at SIGCOMM 2015
• Encourage authors to release the software used for accepted papers
• Source code preferably (open-source, closed source with "research" license),
if not binaries or VMs
• Datasets used for experiments/analysis
• There are sometimes privacy issues, but not always and some part of the paper should
be evaluated with publicly accessible datasets
• Encourage authors to post additional data in the ACM digital library
next to each paper
Recommendations at SIGCOMM 2015
• Encourage TPC members to consider the reproducibility of the papers
while reviewing
• Could become a standard question in review forms and may be used for
decisions
• Encourage fully reproducible papers through awards or some form of
recognition
• Encourage steering committees to think about ways to improve the
reproducibility of our papers
SIGCOMM 2017 Reproducibility workshop
Stanford CS244 – Graduate networking class
Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
Reproduced TCP opt-ack attack
[Figures: original result from the paper; Alexander and Trey’s reproduced result (blog post)]
R. Sherwood, B. Bhattacharjee, and R. Braud. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005.
Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
What kinds of reproductions?
• Congestion control
• Topologies
• Security attacks
• Applications
Publication                  # student reproductions
TCP opt-ack attack           8
Increasing TCP init cwnd     7
TCP Fast Open                7
MPTCP                        6
DCTCP                        5
Hedera                       4
pFabric                      3
Sprout                       3
(24 other papers)            30
40+ papers
Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
5 years of student projects
40+ papers 200+ students
Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
Emulators/simulators used by students
Python
Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
Repeat, Replicate, Reproduce, Reuse
• Four different words that are not completely synonyms
• Repeat
• A team of researchers can re-execute the same experiments/analysis using their artefacts
and obtain the same results
• Replicate
• Another team of researchers can perform the same experiments as those of the original
paper with the paper’s artefacts and obtain the same results
• Reproduce
• Another team of researchers can collect similar data or develop similar software and
obtain the same results as the original paper without using the paper’s artefacts
• Reuse
• Another team of researchers extends the paper’s artefacts to improve them or solve
other research problems
Difficulty
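The four terms differ on who runs the experiment and whose artefacts are used. A minimal sketch of that distinction, assuming three yes/no questions are enough to separate them; the names `Level` and `classify` are illustrative and do not come from the slides:

```python
from enum import Enum


class Level(Enum):
    REPEAT = "repeat"
    REPLICATE = "replicate"
    REPRODUCE = "reproduce"
    REUSE = "reuse"


def classify(same_team: bool, uses_original_artefacts: bool,
             extends_artefacts: bool = False) -> Level:
    """Map the definitions above onto three yes/no questions (hypothetical helper)."""
    if same_team:
        # The original authors re-execute their own experiments.
        return Level.REPEAT
    if not uses_original_artefacts:
        # Another team collects similar data or writes similar software.
        return Level.REPRODUCE
    if extends_artefacts:
        # Another team builds on the paper's artefacts for new problems.
        return Level.REUSE
    # Another team reruns the same experiments with the paper's artefacts.
    return Level.REPLICATE
```

For example, `classify(same_team=False, uses_original_artefacts=True)` returns `Level.REPLICATE`.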
Repeatability
• Should be mandatory for all Ph.D. students
• This is a basic element of the scientific process: take notes and ensure that
you can repeat your experiments
• As computer scientists, your experiments are much more deterministic than in natural
sciences
• Don’t underestimate the importance of repeatability
• A few years from now, questions from your jury members might force you to rerun some
experiments
Agenda
• The reproducibility crisis
• Dagstuhl’s guide to Reproducibility for Experimental Networking
Research
• Artefacts Evaluations
Dagstuhl’s guide to reproducible research
Lots of very useful advice for researchers
• Problem formulation and design
• Hypothesize
• Think before, experiment after
• Try to predict the results of an experiment before running it
• Plan and solicit early feedback
• Plan your experiments and discuss with colleagues before executing them
• Have you considered all the key parameters ?
• What is the lesson that you want to learn from each experiment ?
• Iterate
• You will rerun your experiments several times, automate them
• Put everything in scripts that can execute autonomously at night
• Rerun each experiment several times to check if results are stable
• Check for external interference (e.g. a nightly backup running on the server you use for
experiments, which consumes CPU,...)
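The advice to automate and rerun experiments can be sketched as a small driver script. This is only one possible approach: the measured quantity (wall-clock duration of a command) and the 10% coefficient-of-variation threshold are assumptions chosen for illustration, not values from the guide.

```python
import statistics
import subprocess
import time


def run_once(cmd: list[str]) -> float:
    """Run one experiment command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start


def repeat(cmd: list[str], runs: int = 5) -> list[float]:
    """Rerun the same command several times and collect one sample per run."""
    return [run_once(cmd) for _ in range(runs)]


def is_stable(samples: list[float], max_cv: float = 0.10) -> bool:
    """Return True when the coefficient of variation stays below max_cv."""
    mean = statistics.mean(samples)
    if mean == 0:
        return False
    return statistics.stdev(samples) / mean <= max_cv
```

A nightly job could call `repeat(["./run_experiment.sh"])` (a hypothetical script name) and flag the run for investigation whenever `is_stable` returns `False`.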
Documentation
• Use a version control system
• For all the code you use and the measurement scripts
• Provide detailed logs of your experiments
• The measured data is important, but also collect
• The system state (network configuration, OS, libraries, version of measurement
software)
• The raw data produced by the measurement tool
• should include the version of the measurement script you used so that you precisely know
which script was used to collect a given measurement
• Metadata is as important and sometimes more important than data
• Keep regular backups
• Of everything that is important, your disk will fail, you do not know when
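One way to keep metadata attached to each measurement is a sidecar file written next to the raw data. The sketch below is an assumption about how this could be done: the field names and the `.meta.json` suffix are illustrative, and a content hash stands in for the script version (a version control commit identifier would serve the same purpose).

```python
import hashlib
import json
import platform
import sys
import time
from pathlib import Path


def describe_run(script: Path, raw_data: Path) -> dict:
    """Collect metadata that identifies how a measurement file was produced."""
    return {
        "timestamp_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "hostname": platform.node(),
        "os": platform.platform(),
        "python": sys.version.split()[0],
        # Hash of the measurement script: identifies the exact version used.
        "script": script.name,
        "script_sha256": hashlib.sha256(script.read_bytes()).hexdigest(),
        # Hash of the raw data: lets you detect later corruption or edits.
        "raw_data": raw_data.name,
        "raw_data_sha256": hashlib.sha256(raw_data.read_bytes()).hexdigest(),
    }


def write_sidecar(script: Path, raw_data: Path) -> Path:
    """Store the metadata as JSON next to the raw data file."""
    sidecar = raw_data.with_name(raw_data.name + ".meta.json")
    sidecar.write_text(json.dumps(describe_run(script, raw_data), indent=2))
    return sidecar
```

Calling `write_sidecar(Path("measure.py"), Path("run1.csv"))` (hypothetical file names) would create `run1.csv.meta.json` in the same directory.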
Experimentation
• Always start with a basic validation
• simple experiment whose result is unsurprising to check that there are no
configuration errors. If it fails, script stops and you investigate
• Do not reinvent the wheel
• Check for well-maintained tools that you could use for your experiments
• tool developers have already coped with many details that affect measurements and you
could ignore at first attempt
• this is reassuring for reviewers, because they know what the tool measures
• If you develop your own tool, start from a test suite
• Write a technical report to convince other researchers of the quality of your tool
• Monitor your experiment
• Check for problems that could affect your measurements (disk full, competing
processes, network reconfiguration, ...)
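Some of these checks can run automatically before and during each experiment. A minimal sketch, assuming that free disk space and the 1-minute load average are the conditions of interest; the thresholds are arbitrary illustration values and `preflight` is a hypothetical helper name.

```python
import os
import shutil


def preflight(path: str = ".", min_free_bytes: int = 1 << 30,
              max_load_per_cpu: float = 0.5) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on {path}")
    # os.getloadavg() exists only on Unix-like systems and may be unavailable.
    try:
        load_1min = os.getloadavg()[0]
    except (AttributeError, OSError):
        load_1min = None
    if load_1min is not None:
        cpus = os.cpu_count() or 1
        if load_1min / cpus > max_load_per_cpu:
            problems.append(f"1-minute load {load_1min:.2f} on {cpus} CPUs")
    return problems
```

An experiment script could abort when `preflight()` returns a non-empty list, and call it again after the run to detect interference that appeared in the meantime.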
Multipath TCP
MPTCP over WiFi/3G
8Mbps, 20ms
2Mbps, 150ms
TCP over WiFi/3G
C. Raiciu, et al. “How hard can it be? designing and implementing a deployable multipath TCP,” NSDI'12:
Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, 2012.
MPTCP over WiFi/3G
Multipath TCP increases throughput
What happened here?
Understanding the performance issue
[Figure: a fast path (8Mbps, 20ms) and a slow path (2Mbps, 150ms) sharing one window that holds segments A, B, C and D]
• Window full !
• No new data can be sent on WiFi path
• Reinject segment on fast path
• Halve congestion window on slow subflow
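A back-of-the-envelope calculation shows why the shared window fills up. The sketch below is a simplification under stated assumptions: the 150ms figure is treated as the round-trip time of the slow path, and queueing and retransmissions are ignored.

```python
def bytes_in_flight(rate_bps: float, seconds: float) -> float:
    """Bytes a path can carry in the given time at the given bit rate."""
    return rate_bps * seconds / 8


FAST_RATE_BPS = 8_000_000  # 8 Mbps path from the slide
SLOW_RTT_S = 0.150         # 150 ms of the slow path, assumed here to be its RTT

# Data the fast path could send while one segment is still outstanding on the
# slow path. A shared window smaller than this leaves the fast path idle.
needed = bytes_in_flight(FAST_RATE_BPS, SLOW_RTT_S)
print(f"{needed:.0f} bytes")
```

With these numbers the fast path could carry about 150 kilobytes during one slow-path round trip, so a smaller shared window blocks it until the slow segment is acknowledged or reinjected.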
MPTCP over WiFi/3G
Example: how fast can MPTCP go ?
How fast can Multipath TCP go ?
Reproducible measurements
• Software used for the measurements
For more details, http://multipath-tcp.org/pmwiki.php?n=Main.50Gbps
Reproducible measurements (2)
Reproducible measurements (3)
Handling data
• Privacy, anonymization and ethics
• Control plane measurements are usually fine
• Data plane measurements can be more difficult
• You probably need authorization to collect packets/Netflow from real users
• If you probe external servers, you probably need their authorization
• Data integrity
• Define checks that allow you to verify that the data has been correctly
captured
• Can you miss some data ? Can your measurements be polluted by caches ?
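Integrity checks can often be expressed as a few assertions on the captured data. The sketch below assumes a capture made of samples taken at a fixed interval, which is only one possible kind of dataset; the function names and the 50% tolerance are illustrative.

```python
def find_gaps(timestamps: list[float], interval: float,
              tolerance: float = 0.5) -> list[tuple[float, float]]:
    """Return (previous, next) timestamp pairs whose spacing suggests missing samples."""
    limit = interval * (1 + tolerance)
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]


def check_capture(timestamps: list[float], interval: float) -> list[str]:
    """Run simple integrity checks on a periodic capture and list the errors found."""
    if not timestamps:
        return ["capture is empty"]
    errors = []
    if any(b < a for a, b in zip(timestamps, timestamps[1:])):
        errors.append("timestamps are not monotonically increasing")
    for a, b in find_gaps(timestamps, interval):
        errors.append(f"gap of {b - a:.1f}s after t={a}")
    return errors
```

Running such checks right after each collection makes missing data visible while the experiment can still be repeated.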
GEANT Traffic
matrices
• The project
Repeatability/Replicability/Reproducibility
• Repeatability
• Dataset collected by GEANT (Netflow) and post-processed
• We could not release the Netflow data, only the post processed information
• We reused the dataset in several PhD theses
• Replicability
• The dataset has been reused by other researchers, but not the tool used to
process the data
• Reproducibility
• Reusability
• Regularly used dataset
When a dataset brings new ideas
http://svnet.u-strasbg.fr/mrinfo/
The mrinfo project
Project results
The code that you develop
• Licensing and credits
• What are the rules imposed by your institution for releasing code ?
• Is this allowed ?
• Are there special declarations or IPR that you need to take into account ?
• Check this before thinking about submitting paper artefacts
• If you decide for open-source, understand the different types of licenses
• Public domain, BSD/MIT, GPLv2, GPLv3, Apache, ...
• Do you rely on existing code ?
• In-house
• Are you allowed to release the in-house code that you reused ?
• External
• What are the rules imposed by the code that you reuse ?
Experience with open-source licenses
• GPLv2
• Used by our shim6, MPTCP and SRv6 implementations in the Linux kernel
• Does not restrict the ability to create startups
• If you implement kernel modules, try to merge your code upstream
• MIT License
• Used by quic-go and inherited by Multipath QUIC
• Used by picoquic and inherited by PQUIC
• Creative commons
• Used for the Computer Networking: Principles, Protocols and Practice ebook
• https://www.computer-networking.info
Simulations
• Your own special simulator
• Plan to spend more time to validate your simulator than to execute your own
measurements
• An existing (and well-maintained) simulator
• Check papers that already used this simulator
• Check how the different models need to be configured to obtain accurate
results
C-BGP
• The project
Repeatability/Replicability/Reproducibility
• Repeatability
• C-BGP was created and maintained by Bruno Quoitin
• https://sourceforge.net/projects/c-bgp/
• Replicability
• Various researchers were able to use the simulator for their own research,
they were able to replicate simple examples
• Reproducibility
• Nobody tried to re-implement a similar simulator. Since it was open-source
and well documented, there was no incentive to reproduce such work
• Reusability
Faster routing
convergence
• The project
Repeatability/Replicability/Reproducibility/Reusability
• Repeatability
• Custom simulator written only for this paper
• Private dataset (ISP topologies) provided by co-authors
• Replicability
• Unfortunately, the custom simulator was never released
• Reproducibility
• Although the paper has been widely cited, I’m not aware of any attempt to
reproduce the paper results
• Reusability
• Lots of citations,
• but no real reuse
Check list with existing simulators
• Is the simulator well-maintained ?
• Can you get support and bug fixes
• Is the simulator open-source ?
• Can you develop your own models and fix problems in its internals ?
• Is the simulator used in your research community ?
• If not, develop your own tests and plan some space in your papers to
convince the reviewers of the quality of the simulator
• Does it contain good models for your research problem ?
• You should develop your model for the core of your research, but reuse
existing models for the protocols that you do not change
Replacing simulation by emulation
Mininet...
Mininet...
• The good
• You can easily model a small network on your laptop and execute your own
code on routers and servers
• This is real code and not a simplified model, all details are considered
• The bad
• Everything (routers, servers, applications) competes for the CPU and memory
on your laptop
• If CPU load is too high, you will get random results
• Don’t run other applications on your laptop while mininet is running
• The ugly
• Different runs of the same experiment can produce very different results
• repetition can help, but in some cases, you will need to scale down
Our experience with Mininet
• Advantages
• Works well for rapid prototyping and to test simple scenarios
• Allows experimenting with patched Linux kernels with limited effort
• Nice teaching tool, check ipmininet
• Drawbacks
• Works well for low bandwidth flows that are not CPU bound
• Only works for very small networks on our computers
• Be careful with workloads that are CPU bound
Experimental design with Mininet
The experiments
The results
ns3-DCE : best of both worlds ?
ns3-DCE...
• The good
• ns-3 is a simulator and thus your application can simulate very large networks if you
have enough RAM/CPU
• Time is simulated and it usually runs much slower than real-time
• Simulation is fully reproducible
• if you find a bug in your code, you can fix it and then rerun exactly this experiment
• The bad
• unless you specify it, your application executes in zero time
• this ignores CPU consumption effects which could be important in some scenarios
• The ugly
• In some cases porting applications (or worse kernel code) to ns3-DCE can be a tough
software engineering challenge
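The value of a fully reproducible simulation can be illustrated without ns-3. The toy model below is plain Python and does not use any ns-3 or DCE API; the loss model, parameter names and seed values are assumptions made for the example.

```python
import random


def simulate_transfer(seed: int, packets: int = 1000,
                      loss_rate: float = 0.02) -> list[int]:
    """Toy model: return the indices of the packets lost in one simulated run."""
    rng = random.Random(seed)  # private generator, no hidden global state
    return [i for i in range(packets) if rng.random() < loss_rate]


# The same seed replays exactly the same loss pattern, so a scenario that
# exposed a bug can be rerun unchanged after the fix.
first = simulate_transfer(seed=42)
again = simulate_transfer(seed=42)
print(first == again)
```

Storing the seed of every failing scenario turns it into a regression test that can be replayed after each code change.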
Our experience with ns3-DCE
• Advantages
• Works well with QUIC implementations in userspace
• Used it extensively as a test suite to avoid regressions in PQUIC
• Since ns3-DCE is deterministic, once you’ve found a scenario with problems in
retransmissions or congestion control, you can check that your fixes solve them and do
not create new ones
• Drawbacks
• Requires very good systems knowledge to port existing applications
• plan to spend some time to get up to speed
• Not widely used by the networking community
Real experiments
• The good
• You execute your entire code on real machines
• Your code needs to handle most corner cases and works well
• The bad
• Getting the required hardware resources can be costly or difficult
• If you use shared machines, be careful of interference
• planetlab, grid5000, ...
• The ugly
• Beware of overoptimizing your solution to your hardware setup
• It might be much smaller or very different from deployments
Multipath TCP
First experimental results
Benefits of trying to replicate MPTCP results
• What about WAN measurements ?
• Colleagues in Finland installed MPTCP kernel to replicate our results
• Worked well in their lab
• MPTCP connections could be established between Finland and Belgium, but
no data was ever transferred
The Internet architecture
that we explain to our students
[Figure: protocol stacks. One stack implements Physical, Datalink, Network, Transport and Application; the others implement only Physical, Physical and Datalink, or Physical, Datalink and Network]
O. Bonaventure, Computer networking : Principles, Protocols and Practice, open ebook, http://inl.info.ucl.ac.be/cnp3
In reality
• almost as many middleboxes as routers
• various types of middleboxes are deployed
Sherry, Justine, et al. "Making middleboxes someone else's problem: Network processing as a cloud service."
Proceedings of the ACM SIGCOMM 2012 conference. ACM, 2012.
Internet devices according to Cisco
http://www.cisco.com/web/about/ac50/ac47/2.html
[Figure, device icons: Web Security Appliance, NAC Appliance, ACE XML Gateway, Streamer, VPN Concentrator, SSL Terminator, Cisco IOS Firewall, IP Telephony Router, PIX Firewall, Voice Gateway, Content Engine, NAT]
Middleboxes in the architecture
• In the official architecture, they do not exist
• In reality...
[Figure: protocol stacks along a path. Three stacks implement Physical, Datalink, Network, Transport and Application; one implements Physical, Datalink, Network and TCP]
TCP segments processed by a router
[Figure, shown twice on the slide: IP header (Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address) and TCP header (Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer), with Options and Payload]
TCP segments processed by a NAT
[Figure, shown twice on the slide: IP header (Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address) and TCP header (Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer), with Options and Payload]
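The difference between a router and a NAT can be made concrete on the IPv4 header. This is a minimal sketch with assumptions: it handles only a 20-byte header without options, the helper names are illustrative, and the TCP-level changes made by a NAT (source port and TCP checksum) are mentioned in comments but not modelled.

```python
import struct

IPV4_FMT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options


def checksum(data: bytes) -> int:
    """16-bit one's complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: bytes, dst: bytes, ttl: int, payload_len: int = 20) -> bytes:
    """Build an IPv4 header carrying TCP (protocol 6) with a valid checksum."""
    fields = [0x45, 0, 20 + payload_len, 0, 0, ttl, 6, 0, src, dst]
    fields[7] = checksum(struct.pack(IPV4_FMT, *fields))
    return struct.pack(IPV4_FMT, *fields)


def forward_as_router(header: bytes) -> bytes:
    """A router decrements the TTL and updates the IP header checksum."""
    f = struct.unpack(IPV4_FMT, header)
    return ipv4_header(f[8], f[9], f[5] - 1, f[2] - 20)


def forward_as_nat(header: bytes, public_src: bytes) -> bytes:
    """On top of routing, a NAT rewrites the source address. A real NAT also
    rewrites the TCP source port and the TCP checksum (not modelled here)."""
    f = struct.unpack(IPV4_FMT, forward_as_router(header))
    return ipv4_header(public_src, f[9], f[5], f[2] - 20)
```

Comparing `forward_as_router(h)` and `forward_as_nat(h, public_src)` byte by byte for the same input header `h` shows that only the NAT touches the address fields.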
© O. Bonaventure, 2011
How transparent is the Internet ?
• 25th September 2010 to
30th April 2011
• 142 access networks
• 24 countries
• Sent specific TCP
segments from client to a
server in Japan
Honda, Michio, et al. "Is it still possible to extend TCP?" Proceedings of the 2011 ACM
SIGCOMM conference on Internet measurement conference. ACM, 2011.
End-to-end transparency today
[Figure, shown twice on the slide: IP header (Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address) and TCP header (Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer), with Options and Payload]
Middleboxes don't change the Protocol field, but some discard packets with a
Protocol field other than TCP or UDP
How to create an MPTCP connection ?
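Client to server: SYN, seq=1234 +MP_CAPABLE[Token=5678]
Server to client: SYN+ACK+MP_CAPABLE[Token=6543] seq=5678,ack=1235
Client to server: ACK seq=1235, ack=5679
Client to server, second subflow: SYN, seq=3456 +MP_JOIN[Token=6543]
Client state: MyToken=5678, YourToken=6543
Server state: MyToken=6543, YourToken=5678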
SYN, seq=6789
+MP_JOIN[Token=6543]
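The role of the tokens can be sketched with a toy server that maps tokens to connections. This follows the simplified exchange shown on the slide, not the actual protocol: in RFC 6824 the hosts exchange keys in MP_CAPABLE, derive the tokens from those keys and authenticate MP_JOIN with an HMAC. The class and method names are illustrative.

```python
class Server:
    """Toy server that tracks Multipath TCP connections by token."""

    def __init__(self, next_token: int):
        self.next_token = next_token
        self.connections = {}  # local token -> connection state

    def on_mp_capable(self, client_token: int) -> int:
        """Handle a SYN with MP_CAPABLE: create a connection, return our token."""
        token = self.next_token
        self.next_token += 1
        self.connections[token] = {"peer_token": client_token, "subflows": 1}
        return token

    def on_mp_join(self, token: int) -> bool:
        """Handle a SYN with MP_JOIN: attach the subflow if the token is known."""
        conn = self.connections.get(token)
        if conn is None:
            return False  # unknown token: refuse the additional subflow
        conn["subflows"] += 1
        return True


server = Server(next_token=6543)
server_token = server.on_mp_capable(client_token=5678)  # initial subflow
joined = server.on_mp_join(server_token)                # additional subflow
rejected = server.on_mp_join(1111)                      # token never issued
```

The token is what lets the server associate a SYN arriving from a different address with an existing connection, since the new subflow has a different four-tuple.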
Which middleboxes change
TCP sequence numbers ?
• Some firewalls change TCP sequence numbers in SYN segments to
ensure randomness
• fix for old windows95 bug
• Transparent proxies terminate TCP connections
• Widely used in cellular networks and in some enterprises
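The effect of a sequence-number-rewriting firewall can be modelled as a constant offset applied in one direction and removed in the other. This is a sketch under assumptions: a single flow, a fixed offset chosen for illustration, and no handling of SACK blocks or other options.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


class SeqRandomizer:
    """Toy firewall that shifts the sequence numbers of one direction of a flow."""

    def __init__(self, offset: int):
        self.offset = offset % MOD

    def outbound(self, seq: int) -> int:
        """Rewrite the sequence number of a segment leaving the protected host."""
        return (seq + self.offset) % MOD

    def inbound_ack(self, ack: int) -> int:
        """Translate a returning acknowledgment back into the host's numbering."""
        return (ack - self.offset) % MOD


fw = SeqRandomizer(offset=1_000_000)
seen_by_server = fw.outbound(1234)                   # server sees 1001234
seen_by_client = fw.inbound_ack(seen_by_server + 1)  # client sees 1235
```

Regular TCP endpoints never notice the translation, but a sequence number copied into a new TCP option or into the payload would typically not be translated, so both ends would disagree about its value.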
Consequences of this replicability study
• Multipath TCP was redesigned to cope with all these middleboxes
• Multipath TCP works through most middleboxes and falls back to TCP when
there is severe interference to preserve connectivity
• Michio Honda et al. published the measurement study Is it still possible to
extend TCP?
• We developed tracebox, a traceroute-like tool that detects middlebox
interference
Next steps with Multipath TCP
• Repeatability
• Linux kernel code, moved to http://www.multipath-tcp.org
• Mailing list to discuss with other researchers
• Replicability
• Multipath networks pushed us to optimize reaction to link failures
• Convinced two PhD students to start a company focused on Multipath TCP
• Created a community with other MPTCP researchers
• Actively encouraged them to submit patches and reviewed them for inclusion
• New releases of the kernel patch at every IETF meeting
Reproducibility/Reusability
• Reproducibility
• Nigel Williams and colleagues started to implement MPTCP in FreeBSD
• Anumita Biswas informed us that Apple
was interested in MPTCP
• In September 2013 we were still surprised
to see MPTCP officially launched by Apple
• Reusability
• Getting MPTCP in the official Linux kernel
proved to be more difficult, but
• MPTCP received the 2019 SIGCOMM Networking Systems Award
Getting MPTCP in the mainline Linux kernel
• In 2013, we thought that everything was ready to bring MPTCP in the
mainline Linux kernel and get wider deployment
• MPTCP was published in RFC6824
• Apple was using it on all iPhones for Siri
• Samsung and LG started to use MPTCP in Android smartphones in Korea
• Unfortunately, MPTCP made a lot of changes to TCP and TCP is a key
part of the Linux kernel code
• Linux developers did not want to risk any regression in performance in the
Linux kernel and did not consider the smartphones to be a strong enough use
case at that time
MPTCP in the mainline Linux kernel (2)
• A slowly progressing engineering project
• Netdev 2015
MPTCP in the mainline Linux kernel (3)
Some lessons learned
• Repeatability is key
• Start at the beginning of your PhD work, don’t pick up bad habits
• Releasing artefacts enables other researchers to reuse or expand your
work results
• should contribute to your citations, which is unfortunately a key metric
• Releasing artefacts takes time
• Packaging software, documentation, providing support
• Software and artefacts are not always well valued by the community
• main research output metric remains citations and many authors forget to
correctly reference the artefacts they use in their papers
Agenda
• The reproducibility crisis
• Dagstuhl’s guide to Reproducibility for Experimental Networking
Research
• Artefacts Evaluations
Earlier attempts at evaluating artefacts
ACM Artefacts Evaluation
• Initiated by different ACM SIGs
• Adopted by ACM with
artefacts badges
Artefacts evaluations in 2018
• Experiment with different conferences
• Conext’18
• SIGCOMM CCR
• all papers published in 2018
• SIGCOMM conferences
Available artefacts at Conext
[Bar chart: Available artefacts at Conext'18, Conext'19 and Conext'20]
Functional artefacts at Conext
[Bar chart: Functional artefacts at Conext'18, Conext'19 and Conext'20]
Reusable artefacts at Conext
[Bar chart: Reusable artefacts at Conext'18, Conext'19 and Conext'20]
Available artefacts at SIGCOMM
[Bar chart: Available artefacts at SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
Functional artefacts at SIGCOMM
[Bar chart: Functional artefacts at SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
Reusable/reproduced artefacts at SIGCOMM
[Bar chart: Reusable and Reproduced artefacts at SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
Artefacts evaluation committees
• Objective
• Verify that the artefacts provided by the authors of accepted papers support the
claims of the paper
• a posteriori evaluation, assigns badges
• Artefacts review
• Starts shortly after paper acceptance
• Why not during the paper review ?
• Reviewers are volunteers who agree to spend time
• You can (should) volunteer
• Allows you to interact with researchers who are working on related topics
• Allows you to understand how to provide good artefacts
• Decision
• Reviewers suggest badges for each paper based on their experience with the artefacts
Recent artefacts evaluation committees
• Conext 2021
• Chairs: Anna Brunstrom, Ken Calvert, Gianni Antichi
• https://conferences2.sigcomm.org/co-next/2021/#!/artifactcommittee
• SIGCOMM 2021
• Chairs: Aaron Gember Jacobson, Arpit Gupta
• https://conferences.sigcomm.org/sigcomm/2021/cf-artifacts.html
• Internet measurements (IMC, PAM, TMA)
• Not yet
Recent artefacts evaluation committees
• Mobicom
• Not yet, but
• Mobisys
• To honor the system nature of ACM MobiSys, this year we integrated an artifact evaluation
program in the review process. Submissions could opt in for such a program by expressly
indicating that at the time of submitting the paper. By doing so, if the paper was eventually
accepted, the authors were committed to providing implementations, models, test suites,
benchmarks, and data used to derive the results presented in the paper to the artifact
evaluation committee. This was formed by a subset of PC members who volunteered for this
job. A total of 12 papers have ultimately applied for the program. The artifact evaluation
committee examined the provided artifacts and validated authors’ requests of one or more
“Artifact Evaluated” badges available as part of the corresponding ACM initiative.
• ICNP, INFOCOM
• Not yet
Artefacts evaluators
• You will work with the artefacts of recently
accepted papers
• Nice way to see new research results
• Nice way to interact with authors
• Allows you to learn how to prepare the artefacts
for your future papers
• You will encourage authors to release
artefacts
• artefacts will be available if you want to compare
your solution with previous papers

More Related Content

What's hot

Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Rafael Ferreira da Silva
 
EuroPython 2020 - Speak python with devices
EuroPython 2020 - Speak python with devicesEuroPython 2020 - Speak python with devices
EuroPython 2020 - Speak python with devices
Hua Chu
 
Getting involved in world class software engineering tips and tricks to join ...
Getting involved in world class software engineering tips and tricks to join ...Getting involved in world class software engineering tips and tricks to join ...
Getting involved in world class software engineering tips and tricks to join ...
Evans Ye
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioning
Evans Ye
 
Monitoring MySQL at scale
Monitoring MySQL at scaleMonitoring MySQL at scale
Monitoring MySQL at scale
Ovais Tariq
 
Securing Databases with Dynamic Credentials and HashiCorp Vault
Securing Databases with Dynamic Credentials and HashiCorp VaultSecuring Databases with Dynamic Credentials and HashiCorp Vault
Securing Databases with Dynamic Credentials and HashiCorp Vault
Mitchell Pronschinske
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
Travis Oliphant
 
Bids talk 9.18
Bids talk 9.18Bids talk 9.18
Bids talk 9.18
Travis Oliphant
 
Grid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and PotentialGrid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and Potential
Paul Brebner
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
Dell World
 
Deduplication in Open Spurce Cloud
Deduplication in Open Spurce CloudDeduplication in Open Spurce Cloud
Deduplication in Open Spurce Cloud
Mangali Praveen Kumar
 
Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...
Paul Brebner
 
Data as a Service
Data as a Service Data as a Service
Data as a Service
Kyle Hailey
 
Introduction to HPC
Introduction to HPCIntroduction to HPC
Introduction to HPC
Chris Dwan
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Marco Parenzan
 
Smart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecSmart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVec
Josh Patterson
 
Scientific
Scientific Scientific
Scientific
marpierc
 
Reactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and RxReactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and Rx
Sumant Tambe
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
Travis Oliphant
 
Building Reactive Applications with DDS
Building Reactive Applications with DDSBuilding Reactive Applications with DDS
Building Reactive Applications with DDS
Angelo Corsaro
 

What's hot (20)

Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
 
A personal journey towards more reproducible networking research

  • 1. A personal journey towards more reproducible networking research Olivier Bonaventure UCLouvain https://inl.info.ucl.ac.be
  • 2. Agenda • The reproducibility crisis • Dagstuhl’s guide to Reproducibility for Experimental Networking Research • Artefacts Evaluations
  • 3. The reproducibility crisis in medicine Source: Begley, C., Ellis, L. Raise standards for preclinical cancer research. Nature 483, 531–533 (2012). https://doi.org/10.1038/483531a
  • 5. Reproducibility in Computer Science • An interesting study
  • 6. Software artefacts • Methodology • Select papers from top conferences and journals • Check if they use software that does not require special hardware • Try to obtain software • Directly with URL in paper • By contacting corresponding authors by email • avoid contacting the same author several times • Try to recompile software • With modifications that do not require more than 30 minutes
  • 8. Proposal : sharing contract in papers
  • 9. Reproducibility in networking research • A short study conducted with Nicolas Houtain on software artefacts presented during the SIGCOMM’15 Community session • Methodology and dataset • Conference programmes • SIGCOMM, 2013 and 2014 • CoNEXT, 2013 and 2014 • Hotnets, 2013 and 2014 • Read each paper to identify • Whether software was used to produce results • If yes, is this software publicly available? • Directly mentioned in the paper • Available after contacting the authors by email
  • 10. How accessible is software written for scientific papers ? SIGCOMM/CoNEXT/Hotnets URL in paper No software required No reply from authors Authors replied
  • 11. Software from SIGCOMM papers SIGCOMM URL in paper No software required No reply from authors Authors replied
  • 12. Software from Hotnets papers Hotnets URL in paper No software required No reply from authors Authors replied
  • 13. Software from CoNEXT papers CoNEXT URL in paper No software required No reply from authors Authors replied
  • 17. Recommendations at SIGCOMM 2015 • Encourage authors to release the software used for accepted papers • Source code preferably (open-source, closed source with "research" license), if not binaries or VMs • Datasets used for experiments/analysis • There are sometimes privacy issues, but not always and some part of the paper should be evaluated with publicly accessible datasets • Encourage authors to post additional data in the ACM digital library next to each paper
  • 18. Recommendations at SIGCOMM 2015 • Encourage TPC members to consider the reproducibility of the papers while reviewing • Could become a standard question of review forms and maybe used for decisions • Encourage fully reproducible papers through awards or some form of recognition • Encourage steering committees to think about ways to improve the reproducibility of our papers
  • 20. Stanford CS244 – Graduate networking class Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
  • 21. Reproduced TCP opt-ack attack • Original result from paper • Alexander and Trey’s reproduced result (blog post) • R. Sherwood, B. Bhattacharjee, and R. Braud. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005. Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
  • 22. What kinds of reproductions? • Congestion control • Topologies • Security attacks • Applications • Student reproductions per publication: TCP opt-ack attack (8), Increasing TCP init cwnd (7), TCP Fast Open (7), MPTCP (6), DCTCP (5), Hedera (4), pFabric (3), Sprout (3), 24 other papers (30) • 40+ papers Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
  • 23. 5 years of student projects • Student reproductions per publication: TCP opt-ack attack (8), Increasing TCP init cwnd (7), TCP Fast Open (7), MPTCP (6), DCTCP (5), Hedera (4), pFabric (3), Sprout (3), 24 other papers (30) • 40+ papers • 200+ students Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
  • 24. Emulators/simulators used by students • Python Source: Lisa Yan and Nick McKeown, Learning Networking by Reproducing Network Results, SIGCOMM’17 workshop
  • 25. Repeat, Replicate, Reproduce, Reuse • Four different words that are not exact synonyms • Repeat • A team of researchers can re-execute the same experiments/analysis using their own artefacts and obtain the same results • Replicate • Another team of researchers can perform the same experiments as those of the original paper with the paper’s artefacts and obtain the same results • Reproduce • Another team of researchers can collect similar data or develop similar software and obtain the same results as the original paper without using the paper’s artefacts • Reuse • Another team of researchers extends the paper’s artefacts to improve them or solve other research problems Difficulty
  • 26. Repeatability • Should be mandatory for all Ph.D. students • This is a basic element of the scientific process: take notes and ensure that you can repeat your experiments • As computer scientists, your experiments are much more deterministic than in the natural sciences • Don’t underestimate the importance of repeatability • A few years from now, questions from your jury members might force you to rerun some experiments
  • 27. Agenda • The reproducibility crisis • Dagstuhl’s guide to Reproducibility for Experimental Networking Research • Artefacts Evaluations
  • 28. Dagstuhl’s guide to reproducible research
  • 29. Lots of very useful advice for researchers • Problem formulation and design • Hypothesize • Think before, experiment after • Try to predict the results of an experiment before running it • Plan and solicit early feedback • Plan your experiments and discuss with colleagues before executing them • Have you considered all the key parameters? • What is the lesson that you want to learn from each experiment? • Iterate • You will rerun your experiments several times, so automate them • Put everything in scripts that can execute autonomously at night • Rerun each experiment several times to check if results are stable • Check for external interference (e.g. a nightly backup running on the server you use for experiments and consuming CPU, ...)
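The advice to rerun each experiment several times and check whether the results are stable can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the names `run_repeated` and `is_stable` are hypothetical, an experiment is modelled as a function returning one number, and stability is judged with the coefficient of variation against an arbitrary 5% threshold.

```python
import statistics
from typing import Callable, List


def run_repeated(experiment: Callable[[], float], runs: int = 5) -> List[float]:
    """Run the same experiment several times and keep one measurement per run."""
    return [experiment() for _ in range(runs)]


def is_stable(samples: List[float], max_cv: float = 0.05) -> bool:
    """Return True when the coefficient of variation is at most max_cv.

    The 5% default is an illustrative assumption, not a recommended value.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs to judge stability")
    mean = statistics.mean(samples)
    if mean == 0:
        return False  # relative spread is undefined around a zero mean
    cv = statistics.stdev(samples) / abs(mean)
    return cv <= max_cv
```

A nightly script could call `run_repeated` for every configuration and flag those for which `is_stable` returns False, so that they are inspected for external interference before being used.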
  • 30. Documentation • Use a version control system • For all the code you use and the measurement scripts • Provide detailed logs of your experiments • The measured data is important, but also collect • The system state (network configuration, OS, libraries, version of measurement software) • The raw data produced by the measurement tool • It should include the version of the measurement script, so that you know precisely which script was used to collect a given measurement • Metadata is as important as, and sometimes more important than, the data • Keep regular backups • Of everything that is important: your disk will fail, you just do not know when
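One possible way to follow this advice is to write a small metadata file next to every raw measurement. The sketch below is an assumption about what such a record could contain; the field names are hypothetical, and a SHA-256 hash of the script stands in for the script version (a commit identifier from your version control system would serve the same purpose).

```python
import hashlib
import json
import platform
import sys
import time


def describe_run(script_path, raw_data_path):
    """Collect metadata that identifies how a measurement was produced."""
    with open(script_path, "rb") as f:
        script_hash = hashlib.sha256(f.read()).hexdigest()
    return {
        "timestamp_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "hostname": platform.node(),
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "script": script_path,
        "script_sha256": script_hash,  # identifies the exact script contents
        "raw_data": raw_data_path,
    }


def write_metadata(metadata, path):
    """Store the metadata as JSON next to the raw data."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2, sort_keys=True)
```

A measurement script could call `describe_run(__file__, "run-001.pcap")` at startup and save the result as `run-001.json` (both file names are placeholders).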
  • 31. Experimentation • Always start with a basic validation • A simple experiment whose result is unsurprising, to check that there are no configuration errors. If it fails, the script stops and you investigate • Do not reinvent the wheel • Check for well-maintained tools that you could use for your experiments • Tool developers have already coped with many details that affect measurements and that you could overlook at a first attempt • This is reassuring for reviewers, because they know what the tool measures • If you develop your own tool, start from a test suite • Write a technical report to convince other researchers of the quality of your tool • Monitor your experiment • Check for problems that could affect your measurements (disk full, competing processes, network reconfiguration, ...)
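The monitoring advice can be sketched as a pre-flight check that runs before, and between, experiments. The thresholds below (1 GiB of free disk, a load of 0.5 per CPU) are arbitrary assumptions for illustration, and the load check only applies on systems where Python exposes `os.getloadavg`.

```python
import os
import shutil


def preflight(path=".", min_free_bytes=1 << 30, max_load_per_cpu=0.5):
    """Return a list of problems that could pollute a measurement run."""
    problems = []

    # A full disk silently truncates traces and logs.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on {path}")

    # Competing processes show up in the 1-minute load average (Unix only).
    if hasattr(os, "getloadavg"):
        try:
            load_1min = os.getloadavg()[0]
        except OSError:
            load_1min = None
        cpus = os.cpu_count() or 1
        if load_1min is not None and load_1min / cpus > max_load_per_cpu:
            problems.append(f"load {load_1min:.2f} on {cpus} CPUs is too high")

    return problems
```

A measurement script would typically stop as soon as `preflight()` returns a non-empty list, in the same spirit as the basic validation experiment.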
  • 33. MPTCP over WiFi/3G 8Mbps, 20ms 2Mbps, 150ms
  • 34. TCP over WiFi/3G C. Raiciu, et al. “How hard can it be? designing and implementing a deployable multipath TCP,” NSDI'12: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, 2012.
  • 35. MPTCP over WiFi/3G C. Raiciu, et al. “How hard can it be? designing and implementing a deployable multipath TCP,” NSDI'12: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, 2012.
  • 36. MPTCP over WiFi/3G Multipath TCP increases throughput
  • 37. MPTCP over WiFi/3G What happened here?
  • 38. Understanding the performance issue • [Figure: two paths, 8Mbps, 20ms and 2Mbps, 150ms; a window holding segments A, B, C and D] • Window full ! No new data can be sent on the WiFi path • Reinject segment on the fast path • Halve the congestion window on the slow subflow
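A short calculation helps to see why the window fills. The sketch below assumes that the 20ms and 150ms figures on the slide are round-trip times, which the slide does not state, and uses the usual bandwidth-delay product. A window sized for one path alone is much smaller than what is needed to keep both paths busy while a segment is still in flight on the slow path; MPTCP design documents discuss buffers on the order of the sum of the path bandwidths times the largest RTT, sometimes doubled to leave room for retransmissions.

```python
def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    return bandwidth_bps * rtt_ms // 1000 // 8


def aggregate_window_bytes(paths):
    """Window needed to keep all paths busy while waiting for the slowest one.

    paths maps a name to (bandwidth in bit/s, RTT in ms).
    """
    total_bandwidth = sum(bw for bw, _ in paths.values())
    max_rtt = max(rtt for _, rtt in paths.values())
    return bdp_bytes(total_bandwidth, max_rtt)


# Values from the slide; treating the delays as RTTs is an assumption.
paths = {"fast": (8_000_000, 20), "slow": (2_000_000, 150)}

for name, (bw, rtt) in paths.items():
    print(name, bdp_bytes(bw, rtt), "bytes")
print("aggregate", aggregate_window_bytes(paths), "bytes")
```

Under these assumptions the fast path alone needs 20000 bytes and the slow path 37500 bytes, while keeping both busy needs 187500 bytes, which is why a modest window stalls the fast path.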
  • 40. Example: how fast can MPTCP go ?
  • 41. How fast can Multipath TCP go ?
  • 42. Reproducible measurements • Software used for the measurements For more details, http://multipath-tcp.org/pmwiki.php?n=Main.50Gbps
  • 45. Handling data • Privacy, anonymization and ethics • Control plane measurements are usually fine • Data plane measurements can be more difficult • You probably need authorization to collect packets/Netflow from real users • If you probe external servers, you probably need their authorization • Data integrity • Define checks that allow you to verify that the data has been correctly captured • Can you miss some data ? Can your measurements be polluted by caches ?
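One example of a data integrity check is to scan a capture for missing or out-of-order records. The sketch below assumes that each record carries a sequence number and a timestamp; this record format is hypothetical and would need to be adapted to the measurement tool that you use.

```python
def find_capture_issues(records, max_interval):
    """Scan (sequence_number, timestamp) records in capture order.

    Returns a list of tuples describing missing sequence numbers,
    out-of-order records, clock jumps and suspiciously long silences.
    """
    issues = []
    for (prev_seq, prev_ts), (seq, ts) in zip(records, records[1:]):
        if seq > prev_seq + 1:
            issues.append(("missing", prev_seq + 1, seq - 1))
        elif seq <= prev_seq:
            issues.append(("out_of_order", seq, prev_seq))
        if ts < prev_ts:
            issues.append(("clock_backwards", seq, ts))
        elif ts - prev_ts > max_interval:
            issues.append(("long_gap", seq, ts - prev_ts))
    return issues
```

Running such a check right after each capture makes it possible to discard or repeat a faulty run while the testbed is still available.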
  • 47. Repeatability/Replicability/Reproducibility • Repeatability • Dataset collected by GEANT (Netflow) and post-processed • We could not release the Netflow data, only the post-processed information • We reused the dataset in several PhD theses • Replicability • The dataset has been reused by other researchers, but not the tool used to process the data • Reproducibility • Reusability • Regularly used dataset
  • 48. When a dataset brings new ideas http://svnet.u-strasbg.fr/mrinfo/
  • 51. The code that you develop • Licensing and credits • What are the rules imposed by your institution for releasing code ? • Is this allowed ? • Are there special declarations or IPR that you need to take into account ? • Check this before thinking about submitting paper artefacts • If you opt for open source, understand the different types of licenses • Public domain, BSD/MIT, GPLv2, GPLv3, Apache, ... • Do you rely on existing code ? • In-house • Are you allowed to release the in-house code that you reused ? • External • What are the rules imposed by the code that you reuse ?
  • 52. Experience with open-source licenses • GPLv2 • Used by our shim6, MPTCP and SRv6 implementations in the Linux kernel • Does not restrict the ability to create startups • If you implement kernel modules, try to merge your code upstream • MIT License • Used by quic-go and inherited by Multipath QUIC • Used by picoquic and inherited by PQUIC • Creative Commons • Used for the Computer Networking: Principles, Protocols and Practice ebook • https://www.computer-networking.info
  • 53. Simulations • Your own special simulator • Plan to spend more time to validate your simulator than to execute your own measurements • An existing (and well-maintained) simulator • Check papers that already used this simulator • Check how the different models need to be configured to obtain accurate results
  • 55. Repeatability/Replicability/Reproducibility • Repeatability • C-BGP was created and maintained by Bruno Quoitin • https://sourceforge.net/projects/c-bgp/ • Replicability • Various researchers were able to use the simulator for their own research, they were able to replicate simple examples • Reproducibility • Nobody tried to re-implement a similar simulator. Since it was open-source and well documented, there was no incentive to reproduce such work • Reusability
  • 57. Repeatability/Replicability/Reproducibility/Reusability • Repeatability • Custom simulator written only for this paper • Private dataset (ISP topologies) provided by co-authors • Replicability • Unfortunately, the custom simulator was never released • Reproducibility • Although the paper has been widely cited, I’m not aware of any attempt to reproduce the paper results • Reusability • Lots of citations, • but no real reuse
  • 58. Checklist for existing simulators • Is the simulator well-maintained ? • Can you get support and bug fixes ? • Is the simulator open-source ? • Can you develop your own models and fix problems in its internals ? • Is the simulator used in your research community ? • If not, develop your own tests and plan some space in your papers to convince the reviewers of the quality of the simulator • Does it contain good models for your research problem ? • You should develop your own model for the core of your research, but reuse existing models for the protocols that you do not change
  • 61. Mininet... • The good • You can easily model a small network on your laptop and execute your own code on routers and servers • This is real code and not a simplified model, all details are considered • The bad • Everything (routers, servers, applications) competes for the CPU and memory of your laptop • If the CPU load is too high, you will get random results • Don’t run other applications on your laptop while Mininet is running • The ugly • Different runs of the same experiment can produce very different results • Repetition can help, but in some cases you will need to scale down
  • 62. Our experience with Mininet • Advantages • Works well for rapid prototyping and to test simple scenarios • Makes it possible to experiment with patched Linux kernels with limited effort • Nice teaching tool, check ipmininet • Drawbacks • Only works well for low-bandwidth flows that are not CPU bound • Only works for very small networks on our computers • Be careful with workloads that are CPU bound
  • 66. ns3-DCE : best of both worlds ?
  • 67. ns3-DCE... • The good • ns-3 is a simulator and thus your application can simulate very large networks if you have enough RAM/CPU • Time is simulated and it usually runs much slower than real-time • Simulation is fully reproducible • if you find a bug in your code, you can fix it and then rerun exactly this experiment • The bad • unless you specify it, your application executes in zero time • this ignores CPU consumption effects which could be important in some scenarios • The ugly • In some cases porting applications (or worse kernel code) to ns3-DCE can be a tough software engineering challenge
  • 68. Our experience with ns3-DCE • Advantages • Works well with QUIC implementations in userspace • Used it extensively as a test suite to avoid regressions in PQUIC • Since ns3-DCE is deterministic, once you’ve found a scenario with problems in retransmissions or congestion control, you can check that your fixes solve them and do not create new ones • Drawbacks • Requires very good systems knowledge to port existing applications • Plan to spend some time to get up to speed • Not widely used by the networking community
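The value of a deterministic simulator for regression testing can be illustrated with a toy example. The sketch below is plain Python and does not use ns-3 or ns3-DCE; it is a hypothetical lossy-link model in which all randomness comes from one seeded generator, so that a problematic scenario can be replayed exactly after a fix.

```python
import random


def simulate_transfer(num_segments, loss_rate, seed):
    """Toy model: send segments over a lossy link and retransmit on loss.

    All randomness comes from a generator seeded with `seed`, so the
    same arguments always produce the same event trace.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    rng = random.Random(seed)
    events = []
    for segment in range(num_segments):
        attempts = 1
        while rng.random() < loss_rate:
            events.append(("lost", segment, attempts))
            attempts += 1
        events.append(("delivered", segment, attempts))
    return events


if __name__ == "__main__":
    first = simulate_transfer(num_segments=20, loss_rate=0.3, seed=42)
    replay = simulate_transfer(num_segments=20, loss_rate=0.3, seed=42)
    print("identical traces:", first == replay)
```

Storing the seed together with the other scenario parameters is what turns a failing run into a reusable regression test.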
  • 69. Real experiments • The good • You execute your entire code on real machines • Your code needs to handle most corner cases and work well • The bad • Getting the required hardware resources can be costly or difficult • If you use shared machines, be careful about interference • planetlab, grid5000, ... • The ugly • Beware of overoptimizing your solution for your hardware setup • It might be much smaller than or very different from real deployments
  • 72. Benefits of trying to replicate MPTCP results • What about WAN measurements ? • Colleagues in Finland installed the MPTCP kernel to replicate our results • Worked well in their lab • MPTCP connections could be established between Finland and Belgium, but no data was ever transferred
  • 73. The Internet architecture that we explain to our students • [Figure: layered protocol stacks. End hosts implement Physical, Datalink, Network, Transport and Application; intermediate devices implement only the lower layers] • O. Bonaventure, Computer networking : Principles, Protocols and Practice, open ebook, http://inl.info.ucl.ac.be/cnp3
  • 74. In reality • almost as many middleboxes as routers • various types of middleboxes are deployed Sherry, Justine, et al. "Making middleboxes someone else's problem: Network processing as a cloud service." Proceedings of the ACM SIGCOMM 2012 conference. ACM, 2012.
  • 75. Internet devices according to Cisco http://www.cisco.com/web/about/ac50/ac47/2.html • [Figure: Cisco device icons: Web Security Appliance, NAC Appliance, ACE XML Gateway, Streamer, VPN Concentrator, SSL Terminator, Cisco IOS Firewall, IP Telephony Router, PIX Firewall Right and Left, Voice Gateway, Content Engine, NAT]
  • 76. Middleboxes in the architecture • In the official architecture, they do not exist • In reality... • [Figure: protocol stacks in which devices inside the network also implement the Transport (TCP) and Application layers on top of Physical, Datalink and Network]
  • 77. TCP segments processed by a router • [Figure: the IP and TCP headers of a segment before and after the router. IP fields: Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address. TCP fields: Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer. Followed by Options and Payload]
  • 78. TCP segments processed by a NAT • [Figure: the IP and TCP headers of a segment before and after the NAT. IP fields: Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address. TCP fields: Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer. Followed by Options and Payload]
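The highlighting of the original figure is lost in this transcript, but the general behaviour of a NAT is well known: it rewrites the source address (and usually the source port), and therefore must also update the checksums. The sketch below illustrates this for the IPv4 header only, with the standard Internet checksum; the function names are hypothetical and the addresses are taken from documentation ranges. A real NAT must also update the TCP checksum, because it covers a pseudo-header that includes the IP addresses.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options) carrying TCP, with a valid checksum."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,            # version 4, header length 5 words
        0,                       # ToS
        20 + payload_len,        # total length
        0,                       # identification
        0,                       # flags and fragment offset
        ttl,
        6,                       # protocol: TCP
        0,                       # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def nat_rewrite_source(header: bytes, new_src: str) -> bytes:
    """Replace the source address and recompute the IPv4 header checksum."""
    rewritten = header[:10] + b"\x00\x00" + socket.inet_aton(new_src) + header[16:]
    checksum = internet_checksum(rewritten)
    return rewritten[:10] + struct.pack("!H", checksum) + rewritten[12:]


if __name__ == "__main__":
    original = build_ipv4_header("192.168.1.10", "203.0.113.5", payload_len=20)
    translated = nat_rewrite_source(original, "198.51.100.7")
    print("checksum before:", original[10:12].hex())
    print("checksum after: ", translated[10:12].hex())
```

Computing the checksum over a header that already contains a correct checksum yields zero, which is how a receiver validates the header.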
  • 79. © O. Bonaventure, 2011 How transparent is the Internet ? • 25th September 2010 to 30th April 2011 • 142 access networks • 24 countries • Sent specific TCP segments from client to a server in Japan Honda, Michio, et al. "Is it still possible to extend TCP?" Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference. ACM, 2011.
  • 80. End-to-end transparency today • [Figure: the IP and TCP headers of a segment as sent and as received. IP fields: Ver, IHL, ToS, Total length, Identification, Flags, Frag. Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address. TCP fields: Source port, Destination port, Sequence number, Acknowledgment number, THL, Reserved, Flags, Window, Checksum, Urgent pointer. Followed by Options and Payload] • Middleboxes don't change the Protocol field, but some discard packets with a Protocol field other than TCP or UDP
  • 81. How to create an MPTCP connection ? SYN, seq=1234 +MP_CAPABLE[Token=5678] SYN+ACK+MP_CAPABLE[Token=6543] seq=5678,ack=1235 ACK seq=1235, ack=5679 SYN, seq=3456 +MP_JOIN[Token=6543] MyToken=5678 YourToken=6543 MyToken=6543 YourToken=5678
  • 82. How to create an MPTCP connection ? SYN, seq=1234 +MP_CAPABLE[Token=5678] SYN+ACK+MP_CAPABLE[Token=6543] seq=5678,ack=1235 ACK seq=1235, ack=5679 SYN, seq=3456 +MP_JOIN[Token=6543] MyToken=5678 YourToken=6543 MyToken=6543 YourToken=5678 SYN, seq=6789 +MP_JOIN[Token=6543]
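The token exchange shown on these two slides can be sketched as follows. This is a simplified model of the handshake as drawn on the slides, where each host announces a token in MP_CAPABLE and a later MP_JOIN carries the peer's token to identify the connection. It is not the RFC 6824 mechanism, in which tokens are derived from keys exchanged during the handshake and joins are authenticated; the class and method names are hypothetical.

```python
import secrets


class Host:
    """Keeps the MPTCP-like connections of one host, indexed by local token."""

    def __init__(self):
        self.connections = {}

    def _new_token(self):
        # Pick a random 32-bit token that is not already in use locally.
        while True:
            token = secrets.randbits(32)
            if token not in self.connections:
                return token

    def open_connection(self, peer_token=None):
        """Handle the MP_CAPABLE exchange: allocate a token for this connection."""
        token = self._new_token()
        self.connections[token] = {"peer_token": peer_token, "subflows": 1}
        return token

    def handle_join(self, token):
        """Handle a SYN carrying MP_JOIN[Token=token] on a new path."""
        connection = self.connections.get(token)
        if connection is None:
            return False  # unknown token: refuse the additional subflow
        connection["subflows"] += 1
        return True


if __name__ == "__main__":
    client, server = Host(), Host()
    client_token = client.open_connection()                          # SYN + MP_CAPABLE
    server_token = server.open_connection(peer_token=client_token)   # SYN+ACK + MP_CAPABLE
    client.connections[client_token]["peer_token"] = server_token
    print("join accepted:", server.handle_join(server_token))        # SYN + MP_JOIN
```

Because the join is matched on the token rather than on addresses or sequence numbers, the new subflow can arrive from a different address, which is the point of the second SYN on the slide.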
  • 83. Which middleboxes change TCP sequence numbers ? • Some firewalls change TCP sequence numbers in SYN segments to ensure randomness • Fix for an old Windows 95 bug • Transparent proxies terminate TCP connections • Widely used in cellular networks and in some enterprises
  • 84. Consequences of this replicability study • Multipath TCP was redesigned to cope with all these middleboxes • Multipath TCP works through most middleboxes and falls back to TCP when there is severe interference, to preserve connectivity • Michio Honda et al. published the measurement study Is it still possible to extend TCP? • We developed tracebox, a traceroute-like tool that detects middlebox interference
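The idea behind a traceroute-like middlebox detector can be sketched without sending any packet: compare the header fields of the probe that was sent with the copy of the packet quoted in the error message returned from each hop, and record the first hop where a field differs. The code below only illustrates this comparison step; it is not tracebox's implementation, and the field names and values are hypothetical.

```python
def diff_fields(sent, quoted):
    """Return the fields whose quoted value differs from the value sent."""
    return {
        name: (sent[name], quoted[name])
        for name in sent
        if name in quoted and quoted[name] != sent[name]
    }


def locate_changes(sent, quoted_per_hop):
    """Map each modified field to the first hop at which the change is visible.

    quoted_per_hop[0] is the packet quoted by hop 1, and so on.
    """
    first_seen = {}
    for hop, quoted in enumerate(quoted_per_hop, start=1):
        for name in diff_fields(sent, quoted):
            first_seen.setdefault(name, hop)
    return first_seen


if __name__ == "__main__":
    sent = {"src_port": 40000, "seq": 1234, "mss": 1460}
    replies = [
        {"src_port": 40000, "seq": 1234, "mss": 1460},  # hop 1: untouched
        {"src_port": 40000, "seq": 9999, "mss": 1460},  # hop 2: sequence number rewritten
        {"src_port": 40000, "seq": 9999, "mss": 1400},  # hop 3: MSS option rewritten
    ]
    print(locate_changes(sent, replies))
```

A change first seen at hop n means that the modifying device sits at or before that hop, after the previous hop that still quoted the original value.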
  • 85. Next steps with Multipath TCP • Repeatability • Linux kernel code, moved to http://www.multipath-tcp.org • Mailing list to discuss with other researchers • Replicability • Multipath networks pushed us to optimize the reaction to link failures • Convinced two PhD students to start a company focused on Multipath TCP • Created a community with other MPTCP researchers • Actively encouraged them to submit patches and reviewed them for inclusion • New releases of the kernel patch at every IETF meeting
  • 86. Reproducibility/Reusability • Reproducibility • Nigel Williams and colleagues started to implement MPTCP in FreeBSD • Anumita Biswas informed us that Apple was interested in MPTCP • In September 2013 we were still surprised to see MPTCP officially launched by Apple • Reusability • Getting MPTCP in the official Linux kernel proved to be more difficult, but • MPTCP received the 2019 SIGCOMM Networking Systems Award
  • 87. Getting MPTCP in the mainline Linux kernel • In 2013, we thought that everything was ready to bring MPTCP into the mainline Linux kernel and get wider deployment • MPTCP was published in RFC 6824 • Apple was using it on all iPhones for Siri • Samsung and LG started to use MPTCP in Android smartphones in Korea • Unfortunately, MPTCP made a lot of changes to TCP, and TCP is a key part of the Linux kernel code • Linux developers did not want to risk any performance regression in the Linux kernel and did not consider smartphones to be a strong enough use case at that time
  • 88. MPTCP in the mainline Linux kernel (2) • A slowly progressing engineering project • Netdev 2015
  • 89. MPTCP in the mainline Linux kernel (3)
  • 90. Some lessons learned • Repeatability is key • Start at the beginning of your PhD work, don’t pick up bad habits • Releasing artefacts enables other researchers to reuse or expand your results • This should contribute to your citations, which are unfortunately a key metric • Releasing artefacts takes time • Packaging software, documentation, providing support • Software and artefacts are not always well valued by the community • The main research output metric remains citations, and many authors forget to correctly reference the artefacts they use in their papers
  • 91. Agenda • The reproducibility crisis • Dagstuhl’s guide to Reproducibility for Experimental Networking Research • Artefacts Evaluations
  • 92. Earlier attempts at evaluating artefacts
  • 93. ACM Artefacts Evaluation • Initiated by different ACM SIGs • Adopted by ACM with artefacts badges
  • 94. Artefacts evaluations in 2018 • Experiment with different conferences • Conext’18 • SIGCOMM CCR • all papers published in 2018 • SIGCOMM conferences
  • 95. Available artefacts at Conext • [Bar chart: Available artefacts for Conext'18, Conext'19 and Conext'20]
  • 96. Functional artefacts at Conext • [Bar chart: Functional artefacts for Conext'18, Conext'19 and Conext'20]
  • 97. Reusable artefacts at Conext • [Bar chart: Reusable artefacts for Conext'18, Conext'19 and Conext'20]
  • 98. Available artefacts at SIGCOMM • [Bar chart: Available artefacts for SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
  • 99. Functional artefacts at SIGCOMM • [Bar chart: Functional artefacts for SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
  • 100. Reusable/reproduced artefacts at SIGCOMM • [Bar chart: Reusable and Reproduced artefacts for SIGCOMM'18, SIGCOMM'20 and SIGCOMM'21]
  • 101. Artefacts evaluation committees • Objective • Verify that the artefacts provided by the authors of accepted papers support the claims of the paper • a posteriori evaluation, assigns badges • Artefacts review • Starts shortly after paper acceptance • Why not during the paper review ? • Reviewers are volunteers who agree to spend time • You can (should) volunteer • Allows you to interact with researchers who are working on related topics • Allows you to understand how to provide good artefacts • Decision • Reviewers suggest badges for each paper based on their experience with the artefacts
  • 102. Recent artefacts evaluation committees • Conext 2021 • Chairs: Anna Brunstrom, Ken Calvert, Gianni Antichi • https://conferences2.sigcomm.org/co-next/2021/#!/artifactcommittee • SIGCOMM 2021 • Chairs: Aaron Gember Jacobson, Arpit Gupta • https://conferences.sigcomm.org/sigcomm/2021/cf-artifacts.html • Internet measurements (IMC, PAM, TMA) • Not yet
  • 103. Recent artefacts evaluation committees • Mobicom • Not yet, but • Mobisys • To honor the system nature of ACM MobiSys, this year we integrated an artifact evaluation program in the review process. Submissions could opt in for such a program by expressly indicating that at the time of submitting the paper. By doing so, if the paper was eventually accepted, the authors were committed to providing implementations, models, test suites, benchmarks, and data used to derive the results presented in the paper to the artifact evaluation committee. This was formed by a subset of PC members who volunteered for this job. A total of 12 papers have ultimately applied for the program. The artifact evaluation committee examined the provided artifacts and validated authors’ requests of one or more “Artifact Evaluated” badges available as part of the corresponding ACM initiative. • ICNP, INFOCOM • Not yet
  • 104. Artefacts evaluators • You will work with the artefacts of recently accepted papers • Nice way to see new research results • Nice way to interact with authors • Allows you to learn how to prepare the artefacts for your future papers • You will encourage authors to release artefacts • Artefacts will be available if you want to compare your solution with previous papers

Editor's Notes

  1. I’ll give you an example of a successful reproduction. Two students in 2016 tried recreating a TCP opt-ack attack, where a misbehaving receiver sends optimistic acknowledgements to multiple victim senders, who in turn generate an overwhelming amount of traffic. On the left is a graph from the original paper – the researchers show that for up to 512 victims, the amount of traffic increases exponentially over time. The student reproduced result is pretty close – the curves match closely for up to 16 victims, whereas the curves for 32 and 64 struggle a bit. While the original researchers simulated their results in ns2, the students emulated the traffic pattern in Mininet, and therefore they hit the performance limit sooner. However, the overall results match quite closely, and we count this as a successful reproduction.
  2. Since we started out using Mininet as our emulator of choice, we didn’t expect students to be able to reproduce research on wireless, or localization, or mass user experiments. Instead, we encouraged them to look at papers covering …, datacenter topologies, …, like video streaming, and so on. Here is a list of the top papers that our students have looked at, to give you an idea. As you can see, the top reproduction is the TCP opt-ack attack I just discussed. And we have a long tail of papers that were attempted once or twice.
  3. We’ve offered this reproducibility project for the past five years. Over 40 papers have been reproduced by students, some more often than others. Since each project is done in pairs, we have had over 200 students go through this exercise. The large majority of projects have been successful, seen here in orange, and the blue bars show the projects that have not. So what makes an unsuccessful reproduction?