SVV.lu: software verification & validation
Model-Based Simulation of Legal
Requirements: Experience from Tax
Policy Simulation

Ghanem Soltana, Mehrdad Sabetzadeh, and Lionel Briand 

SnT Centre for Security, Reliability and Trust 
University of Luxembourg, Luxembourg
Context and motivation
How did this work come about?
•  Collaboration with the Government of Luxembourg
§  CTIE: Government’s IT Centre
§  ACD: Tax Administration Department
•  New tax system under development: Operationalization of the
administrative procedures envisaged by the law
•  The development of such a system involves both IT and legal
experts
Objectives
[Diagram labels: Law; Models of legal requirements; Actual software system; Simulation data / Test cases. Edge labels: Traces to (twice), Generates, Simulates.]
1. Legal compliance
2. Change impact
3. Model validation
[Figure: Households and contribution to revenue by bracket of gross annual income (in Euros). Series: percentage of households, percentage of contribution before change, percentage of contribution after change.]
[Figure: Households by annual income taxes due (in Euros), before and after change.]
Focus of the talk
[Diagram labels: Models of legal requirements; Simulation data / Test cases. Edge labels: Generates, Simulates. Objectives shown: 2. Change impact, 3. Model validation.]
•  Can we bring together and model all the information
necessary for performing a real-world simulation scenario?
•  To what extent are the simulation results credible?
[Figure: Households and contribution to revenue by bracket of gross annual income (in Euros). Series: percentage of households, percentage of contribution before change, percentage of contribution after change.]
[Figure: Households by annual income taxes due (in Euros), before and after change.]
State of the art
Legal policy simulation in practice
Some existing simulation tools focused on taxation and social security: 

•  ASSERT: Assessing the effects of reforms in taxation
•  SYSIFF: A micro-simulation model for the French tax system
•  EUROMOD: European benefit-tax model and social integration
EUROMOD example
[Screenshot callouts: Dependent age range; Dependent count]
Limitations of current simulation frameworks
•  Legal policies are hard to validate
•  Single-purpose models
•  Unusable when simulation data is not available
Approach
Our model-based simulation framework
[Diagram components:]
•  Policy Models: interpretations of the legal text (original + modified)
•  Domain Model: main concepts and relationships of the simulated domain
•  Statistical Guidance about the Real Simulation Population: Historical Data (if available), Census / Survey Data, Expert Estimates
•  Simulator Code Generator and Executable Simulator
•  Data Generator and Artificial Simulation Data
•  Simulation Results
[Soltana et al., SoSyM 2016]
Legal policy models
•  A legal policy model captures the procedure envisaged by law for
performing a certain activity
•  Notation: Extended Activity Diagrams (ADs)
•  Support for automated analysis and communication between legal and IT
experts
[Diagram: ADs are expressive, visual, precise, and executable]
[Soltana et al., MODELS 2014]
Policy model example (1)
[Figure: policy model diagram]
Policy model example (2)
[Figure callouts: Possible outputs (here, tax classes); Traceability to the legal provisions]
Policy model example (3)
Inputs from: (a) the simulation data or (b) the underlying legal texts

context Tax_Payer
-- False when the taxpayer has no spouse for the tax year; otherwise true
-- only if both the taxpayer and the spouse are non-resident taxpayers
def: are_both_spouses_non_resident() : Boolean =
  if self.getSpouse(Constants.TAX_YEAR).oclIsUndefined() then
    false
  else
    self.oclIsTypeOf(Non_Resident_Tax_Payer)
    and
    self.getSpouse(Constants.TAX_YEAR).oclIsTypeOf(Non_Resident_Tax_Payer)
  endif
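For readers less familiar with OCL, the input above can be read roughly as the following Python sketch. The class names mirror the OCL expression, but the attribute spouse_by_year, the helper marry, and the value of TAX_YEAR are assumptions made for illustration only; they are not part of the presented models.

```python
from typing import Dict, Optional

TAX_YEAR = 2014  # hypothetical stand-in for Constants.TAX_YEAR


class TaxPayer:
    """Illustrative stand-in for the Tax_Payer class of the domain model."""

    def __init__(self) -> None:
        # Assumed storage: tax year -> spouse in that year.
        self.spouse_by_year: Dict[int, "TaxPayer"] = {}

    def get_spouse(self, year: int) -> Optional["TaxPayer"]:
        """Counterpart of getSpouse(year); None plays the role of 'undefined'."""
        return self.spouse_by_year.get(year)

    def are_both_spouses_non_resident(self) -> bool:
        spouse = self.get_spouse(TAX_YEAR)
        if spouse is None:
            return False
        # `type(x) is C` mirrors oclIsTypeOf: exact type, subtypes excluded.
        return (type(self) is NonResidentTaxPayer
                and type(spouse) is NonResidentTaxPayer)


class NonResidentTaxPayer(TaxPayer):
    """Illustrative stand-in for Non_Resident_Tax_Payer."""


def marry(a: TaxPayer, b: TaxPayer, year: int) -> None:
    """Hypothetical helper that links two taxpayers as spouses for a year."""
    a.spouse_by_year[year] = b
    b.spouse_by_year[year] = a
```

A taxpayer without a spouse yields False, as does a couple in which only one spouse is a non-resident.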
Policy model example (4)
Glossary for the inputs
Input name | Description
is_taxed_jointly | Yes if a given taxpayer is taxed jointly with another taxpayer; otherwise, no. The value of this input is determined via the application of another policy model: joint taxation.
is_married | Yes if a given taxpayer is married; otherwise, no.
are_both_spouses_non_resident | Yes if a given couple are both non-residents; otherwise, no.
is_living_separately | Yes if a given taxpayer is “de facto” separated (séparation de fait in French); otherwise, no.
is_divorced | Yes if a given taxpayer is divorced by mutual agreement; otherwise, no.
is_divorced_by_court_order | Yes if a given taxpayer is divorced by court order; otherwise, no.
has_separation_transition_state | Yes if a given taxpayer is in a transition state after separation; otherwise, no. The transition state is granted when a taxpayer’s date of separation is within the past three years.
is_widower | Yes if a given taxpayer is a widower; otherwise, no.
... | ...
Histograms
[Figure: histogram. * Source: STATEC, Luxembourg]
TaxPayer
- «from histogram» birthYear: Integer [1]
[Soltana et al., SoSyM 2016]
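As a rough illustration of how a «from histogram» annotation could guide data generation, the sketch below draws birthYear values from a binned distribution. The bins and frequencies are invented for the example (they are not STATEC figures), and the sampling scheme (pick a bin by weight, then a year uniformly inside it) is an assumption, not necessarily what the presented data generator does.

```python
import random
from typing import List, Tuple

# (first_year, last_year, relative_frequency): illustrative values only.
BIRTH_YEAR_HISTOGRAM: List[Tuple[int, int, float]] = [
    (1920, 1949, 0.15),
    (1950, 1969, 0.35),
    (1970, 1989, 0.40),
    (1990, 1996, 0.10),
]


def sample_birth_year(histogram: List[Tuple[int, int, float]],
                      rng: random.Random) -> int:
    """Pick a bin according to its frequency, then a year uniformly within it."""
    bins = [(first, last) for first, last, _ in histogram]
    weights = [freq for _, _, freq in histogram]
    first, last = rng.choices(bins, weights=weights, k=1)[0]
    return rng.randint(first, last)  # both bounds inclusive


rng = random.Random(42)  # fixed seed so the sketch is reproducible
birth_years = [sample_birth_year(BIRTH_YEAR_HISTOGRAM, rng)
               for _ in range(10_000)]
```

With enough samples, the share of generated values falling in each bin approaches the frequency given for that bin.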
Conditional probabilities
[Figure: bar chart. * Source: STATEC, Luxembourg]
Association: TaxPayer (abstract) [1, role taxpayer] to Income [1..*, role incomes]
«type dependency»
{relativeTo: Income;
condition: self.getAge() >= 60;
source: «from barchart»}
[Soltana et al., SoSyM 2016]
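The «type dependency» annotation above makes the distribution of income types depend on a condition over the taxpayer (here, age >= 60). A minimal sketch of that idea follows; the income type names and all probabilities are hypothetical placeholders, not values from the bar chart.

```python
import random
from typing import Dict

# Hypothetical distributions over income types; the numbers are made up.
INCOME_TYPES_DEFAULT: Dict[str, float] = {
    "Employment": 0.80, "Trade": 0.15, "Pension": 0.05,
}
INCOME_TYPES_AGE_60_PLUS: Dict[str, float] = {
    "Employment": 0.20, "Trade": 0.10, "Pension": 0.70,
}


def sample_income_type(age: int, rng: random.Random) -> str:
    """Choose the distribution by the condition age >= 60, then sample from it."""
    dist = INCOME_TYPES_AGE_60_PLUS if age >= 60 else INCOME_TYPES_DEFAULT
    types = list(dist)
    return rng.choices(types, weights=[dist[t] for t in types], k=1)[0]


rng = random.Random(7)  # fixed seed for reproducibility
seniors = [sample_income_type(65, rng) for _ in range(5000)]
younger = [sample_income_type(35, rng) for _ in range(5000)]
```

The two generated groups differ in their income mix only because the condition selects a different distribution for each.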
Case study
Overview 
•  Selected case study: Simulation of the impact of the potential
abolishment of joint taxation (income-splitting)
•  Provide decision support for the Government’s actual tax reforms
•  Typical analysis: differential reasoning (different goals)

[Figure: Percentage of taxpayers in Tax class 1, Tax class 1.a, and Tax class 2, before and after change.]
[Figure: Households by annual decrease / increase in taxes due (in Euros), ranging from "less taxes to pay" to "more taxes to pay".]
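Differential reasoning of this kind compares, household by household, the taxes due under the original and the modified law. The sketch below shows one way such a comparison could be summarized; the function name, the 3,000-Euro bracket width, and the input values in the usage note are assumptions for illustration, not simulation results.

```python
from collections import Counter
from typing import Dict, Sequence


def tax_difference_shares(before: Sequence[int], after: Sequence[int],
                          bracket_width: int = 3000) -> Dict[str, float]:
    """Share of households per bracket of tax decrease / increase (whole Euros).

    before[i] and after[i] are the taxes due by household i under the
    original and the modified law, respectively.
    """
    if len(before) != len(after):
        raise ValueError("both runs must cover the same households")
    counts: Counter = Counter()
    for old, new in zip(before, after):
        delta = new - old
        if delta == 0:
            label = "no change"
        else:
            direction = "more" if delta > 0 else "less"
            # Brackets are 1-3000, 3001-6000, ... for the default width.
            lower = (abs(delta) - 1) // bracket_width * bracket_width + 1
            label = f"{direction} taxes: {lower}-{lower + bracket_width - 1}"
        counts[label] += 1
    total = len(before)
    return {label: n / total for label, n in counts.items()}
```

For instance, calling tax_difference_shares([1000, 5000], [1000, 2500]) reports half of the households as unchanged and half in the "less taxes: 1-3000" bracket.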
Goals of the case study
•  RQ1: Can we bring together and model all the information
necessary for performing a real-world simulation scenario? 
a.  Is the required modeling effort practical? 
b.  Is it possible to provide the input (statistical information)
necessary for the generation of simulation data?
•  RQ2: Are the simulation results credible?
Policy and domain model construction

•  Source: Luxembourg’s personal income tax law
•  Iterative process that terminates when the models are deemed valid
by legal experts
•  Three policy models are directly impacted by the reform
1.  Joint Taxation (JT),
2.  Tax Class Categorization (TCC), and
3.  Extra-Professional Deduction (EPD)

Domain model | Classes | Enum. | Associations | Generalizations | Attributes
#elements | 16 | 4 | 6 | 11 | 24

Policy model | JT | TCC | EPD
#elements | 111 | 93 | 73
Goals of the case study
RQ1: Can we bring together and model all the information necessary
for performing a real-world simulation scenario? 
a.  Is the modeling effort practical? 
b.  Is it possible to provide the input (statistical information)
necessary for the generation of simulation data?
•  RQ2: Are the simulation results credible? 
Activity (person hours) | JT | TCC | EPD | Domain Model
Model construction | 7 | 5 | 3 | 7
Model validation | 1.5 | 1 | 0.5 | 2
Probabilistic information
15 distributions were used to specify the characteristics of Luxembourg’s population
Sources: STATEC; 2014’s tax report
Kinds: histograms that apply to the whole sample; histograms that apply to a subset of the sample
Statistics:
•  Residence status
•  Age
•  Household size
•  Types of civil union
•  Income types
•  Income amounts
•  Workers per household
•  Divorce rate
•  Divorce types
•  Widower rate
•  Income for pensioners
•  Income for traders
•  Age of pensioners
•  Foreign income types
•  Residence status based on spouse’s residence status
Goals of the case study

RQ1: Can we bring together and model all the information necessary
for performing a real-world simulation scenario? 
b.  Is it possible to provide the input (statistical information)
necessary for the generation of simulation data?
•  In total, we attached 57 annotations (stereotypes) to our domain
model for guiding the data generation process
•  About 70% of the annotations came from public sources
•  The remaining 30% were based on feedback from experts and
common sense
How to check the quality of the generated data

For each annotated quantity, we perform a sanity check:
•  Metric used: normalized Euclidean distance
•  Euclidean distance for the example: 0.14
•  Acceptance threshold: 0.1
•  All sanity checks must succeed for the data sample to be used
for simulation
[Figure: Age distribution over the brackets 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, and 75-100, for the real population and for the generated data.]
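The sanity check can be sketched as follows. The slides do not spell out the normalization, so this sketch assumes that both histograms are first turned into relative frequencies (each summing to 1) and that the plain Euclidean distance is then taken between them; the function names are hypothetical, while the 0.1 default threshold is the acceptance threshold stated above.

```python
import math
from typing import List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn raw bin counts into relative frequencies that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [c / total for c in counts]


def euclidean_distance(real_counts: Sequence[float],
                       generated_counts: Sequence[float]) -> float:
    """Euclidean distance between two normalized histograms over the same bins."""
    if len(real_counts) != len(generated_counts):
        raise ValueError("both histograms must use the same bins")
    p = normalize(real_counts)
    q = normalize(generated_counts)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def passes_sanity_check(real_counts: Sequence[float],
                        generated_counts: Sequence[float],
                        threshold: float = 0.1) -> bool:
    """A sample passes when the distance is strictly below the threshold."""
    return euclidean_distance(real_counts, generated_counts) < threshold
```

A generated sample would be accepted for simulation only if passes_sanity_check holds for every annotated quantity.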
Sanity checks of the generated data
[Figure: Euclidean distance (axis from 0 to 0.1) for Residence status, Age, Types of civil union, Income types, and Income amounts.]
•  10 generated artificial data samples
•  10,000 tax cases per sample
•  Average generation time was about 30 minutes
•  All generated samples were aligned with the real population
(Euclidean distance < 0.1)
Credibility of the simulation results
•  Compare the results of simulating the current tax law against actual
tax statistics
•  Available tax statistics from 2014:
[Figure: Real percentage of contribution to revenue in 2014, by bracket of gross annual income (in euro), from 0-10K to >1M.]
Credibility of the simulation results
•  We ran the simulator over the 10 generated data samples
•  Average simulation time was about 80 minutes
•  The simulation results are closely in line with the available tax statistics
[Figure: Contribution to revenue by bracket of gross annual income (in euro), from 0-10K to >1M. Series: real % and the results of samples S1 to S10.]
Goals of the case study
RQ2: Are the simulation results credible? 
The close alignment observed (indirectly) provides confidence that:
• We modeled the legal policies at the right level of abstraction for
our analysis
• Our data generator, while not a substitute for real data, can still
produce meaningful data for simulation
Lessons learned 
Think procedures rather than formulae

•  A single policy model might have several semantically equivalent
model representations
•  A policy model should be aligned with the workflow envisaged by the law
•  Deviations from the preconceived workflow hinder communication and
the validation process
[Diagram, representation 1: decision is_divorced, followed by decision age >= 64; outcomes: Taxpayer belongs to class 1.a, Taxpayer belongs to class 1]
[Diagram, representation 2: single decision is_divorced or age >= 64; outcomes: Taxpayer belongs to class 1.a, Taxpayer belongs to class 1]
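The two representations on this slide can be mimicked in code to show that they compute the same result while reading quite differently. The sketch assumes that a "yes" on either condition leads to class 1.a and a "no" on both leads to class 1; the extracted slide text does not make the branch targets explicit, and the function names are hypothetical.

```python
from itertools import product


def tax_class_stepwise(is_divorced: bool, age: int) -> str:
    """Procedure-like form: one decision per step, in a fixed order."""
    if is_divorced:
        return "1.a"
    if age >= 64:
        return "1.a"
    return "1"


def tax_class_merged(is_divorced: bool, age: int) -> str:
    """Formula-like form: the two conditions folded into one expression."""
    return "1.a" if (is_divorced or age >= 64) else "1"


# Exhaustive comparison over a plausible input range.
equivalent = all(
    tax_class_stepwise(divorced, age) == tax_class_merged(divorced, age)
    for divorced, age in product([False, True], range(0, 121))
)
```

Both functions agree on every input checked; the lesson of the slide is that the form matching the workflow envisaged by the law is the one experts can follow and validate more easily.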
Maintain traceability to legal text (and beyond)

•  Traceability to the law is a communication asset, as experts
frequently need to consult the original legal provisions
•  Each amendment (change) should be traced, because
stakeholders find it difficult to follow how an amendment
impacts all the policy models at hand
Keep the modeling notation simple and lean

•  Models should support different representations of the same
information (according to users’ backgrounds)
•  The right representation for legal requirements should be
carefully considered and tried out with legal experts
Summary 

•  Case study of our model-based simulation framework to analyze
a real legal reform
•  Applying our approach in real settings is feasible with
reasonable effort 
•  The produced simulation results are deemed credible,
considering we had no access to real data
•  Lessons learned:
a.  Think procedures rather than formulae
b.  Maintain traceability to legal texts (and beyond)
c.  Keep the modeling notation as simple and lean as possible
•  Tool available at http://people.svv.lu/tools/polisim/
Beyond tax law
What is beneficial to other (prescriptive) laws:
•  Lessons learned
•  Experience in addressing the communication gap between IT
and legal experts
•  Systematic approach for enabling automated analysis 
•  Tooling (automated data-generator, simulator, etc.)