Faktum is a tool that automatically generates small test databases for data-intensive systems based on their variability models. It represents a database's fields and values as features in a feature model. It then uses Alloy to synthesize configurations that cover all pairwise interactions between features to populate the test databases. The databases are generated by transforming valid configuration tuples into SQL inserts. Faktum was evaluated on a case study of the Norwegian Customs system, generating 935 compact test cases covering all pairwise interactions among 37 field values. The tool uses divide-and-combine strategies to improve scalability for large interaction spaces.
Credit card fraud is a growing problem that affects card holders around the world. Fraud detection has been an interesting topic in machine learning. Nevertheless, current state-of-the-art credit card fraud detection algorithms fail to include the real costs of credit card fraud as a measure for evaluating algorithms. In this paper, a new comparison measure that realistically represents the monetary gains and losses due to fraud detection is proposed. Moreover, using the proposed cost measure, a cost-sensitive method based on Bayes minimum risk is presented. This method is compared with state-of-the-art algorithms and shows improvements of up to 23% as measured by cost. The results of this paper are based on real-life transactional data provided by a large European card processing company.
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club (Data Con LA)
Abstract:
Data engineering at Dollar Shave Club has grown significantly over the last year. In that time, it has expanded in scope from conventional web-analytics and business intelligence to include real-time, big data and machine learning applications. We have bootstrapped a dedicated data engineering team in parallel with developing a new category of capabilities. And the business value that we delivered early on has allowed us to forge new roles for our data products and services in developing and carrying out business strategy. This progress was made possible, in large part, by adopting Apache Spark as an application framework. This talk describes what we have been able to accomplish using Spark at Dollar Shave Club.
Bio:
Brett Bevers, Ph.D. Brett is a backend engineer and leads the data engineering team at Dollar Shave Club. More importantly, he is an ex-academic who is driven to understand and tackle hard problems. His latest challenge has been to develop tools powerful enough to support data-driven decision making in high value projects.
Representing verifiable statistical index computations as linked data (Jose Emilio Labra Gayo)
Presentation slides at SemStats 2014 Workshop.
In this paper we describe the development of the Web Index linked data portal that represents statistical index data and computations. The Web Index is a multi-dimensional measure of the World Wide Web’s contribution to development and human rights globally. It covers 81 countries and incorporates indicators that assess several areas like universal access; freedom and openness; relevant content; and empowerment.
To support the Web Index's transparency, one internal requirement was that every published data item could be externally verified. The verification could be that it was raw data obtained from an external source, in which case the system must provide a link to the data source, or that the value was internally computed, in which case the system provides links to the values it was computed from. The resulting portal contains data that can be tracked to its sources, so an external agent can validate the whole index computation process.
We describe the different aspects of the development of the WebIndex data portal, which also offers new linked data visualization tools. Although in this paper we concentrate on the Web Index development, we consider that this approach can be generalized to other projects that involve the publication of externally verifiable statistical computations.
Dynamic Actions can be used in SAP to trigger automatic processing either in the foreground or in the background. This functionality can be quite powerful, resulting in better data integrity as you use it to create controls and parameters for data entry by your power users. The functionality as delivered in the IMG, however, does appear to be limited. How can you make these dynamic actions more accessible to manage things like hundreds of different tax authorities? How can you use a dynamic action to update fields rather than simply creating new data? This presentation will show you how to maximize the use of dynamic actions in your SAP HR functionality using ABAP code and creativity.
PART IUse the OLTP logical schema below to build data warehou.docx (karlhennesey)
PART I:
Use the OLTP logical schema below to build a data warehouse consisting of two data marts. You will need to import the final exam data code for the OLTP logical schema and develop the ETL process in SSIS (70 points). Your tables should meet these requirements:
· DimDate table has DateKey values ranging from 19960704 to 19980603 (YYYYMMDD integer data type). If you use Flat File or Excel Data Source to import datekey data, include the datasource file in the submission also (5 points).
· DimDate table should have EnglishMonthName (varchar data type), CalendarYear (INT data type), Quarter (INT data type), LeapYear (BOOLEAN data type), and IsMartinLutherKingHoliday (BOOLEAN data type) attributes (15 points).
· DimProduct should contain product name, supplier name, category name, unit price, and discontinued (10 points).
· AccumulatingOrderFact has 2 foreign keys – OrderDateKey and ShipDateKey, and one degenerate dimension attribute - OrderID (10 points).
· PeriodicSnapShotMonthlyOrder has 2 foreign keys – MonthKey and ProductKey (10 points).
· Correct ETL process to create 3 fact measures for AccumulatingOrderFact (10 points)
· Correct ETL process to create 3 fact measures for PeriodicSnapShotMonthlyOrder (10 points)
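For illustration, the DimDate attributes required above can be derived from a calendar date as in the sketch below. This is a hedged example, not part of the assignment's starter code; in particular, the MLK-holiday rule is assumed here to be the third Monday of January.

```python
import calendar
from datetime import date

def dim_date_row(d):
    """Build one DimDate row with the attributes the assignment asks for."""
    # Assumed rule: Martin Luther King Jr. Day is the third Monday of January.
    is_mlk = d.month == 1 and d.weekday() == 0 and 15 <= d.day <= 21
    return {
        "DateKey": d.year * 10000 + d.month * 100 + d.day,  # YYYYMMDD integer
        "EnglishMonthName": d.strftime("%B"),
        "CalendarYear": d.year,
        "Quarter": (d.month - 1) // 3 + 1,
        "LeapYear": calendar.isleap(d.year),
        "IsMartinLutherKingHoliday": is_mlk,
    }

row = dim_date_row(date(1996, 7, 4))
print(row["DateKey"], row["EnglishMonthName"], row["Quarter"])  # 19960704 July 3
```

Iterating this function from 1996-07-04 to 1998-06-03 produces the full date range the assignment requires; the rows can then be loaded through a Flat File or OLE DB destination in SSIS.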
The following table and diagram show the data descriptions of the fact measures and the ERD of the data warehouse:

TotalProductOrderReceivedOnOrderDate: Total number of product orders that the data warehouse received on the order date. Note that one order could have multiple products ordered.
TotalPriceOrderReceivedOnOrderDate: Total order price amount that the data warehouse received on the order date.
TotalProductOrderShippedOnShipDate: Total number of product orders that the data warehouse shipped on the ship date. We assume that products of one order are always shipped together.
TotalMonthlyProductOrdered: Total number of products ordered in one month.
TotalMonthlyAmountOfProductDiscount: Total amount of product discount in one month.
MonthlyPopularItem: Displays 'True' if the product sold the most in that month; otherwise 'No'.
AccumulatingOrderFact:
  OrderDateKey int (FK)
  ShipDateKey int (FK)
  OrderID int (DD)
  TotalProductOrderReceivedOnOrderDate int
  TotalPriceOrderReceivedOnOrderDate money
  TotalProductOrderShippedOnShipDate int

DimProduct:
  ProductID int (PK)
  UnitPrice money
  ProductName varchar(25)
  SupplierName varchar(25)
  CategoryName varchar(25)
  Discontinued bit

DimDate:
  DateKey int (PK)
  EnglishMonthName varchar(10)
  CalendarYear int
  Quarter int
  LeapYear boolean
  IsMartinLutherKingHoliday boolean

PeriodicSnapShotMonthlyOrder:
  MonthKey int (PK, FK)
  ProductKey int (PK, FK)
  TotalMonthlyProductOrdered int
  TotalMonthlyAmountOfProductDiscount money
  MonthlyPopularItem bit
Indian Export Data, Indian Export Ports, India Export Custom Duty (Seair Exim Solution)
A source of Indian export data compiled from export Bills of Entry filed at Indian customs, with daily lists for Indian ports like JNPT, Delhi, Chennai, and Mumbai. Detailed data at seair.co.in.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Epistemic Interaction: tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
"Impact of front-end architecture on development cost", Viktor Turskyi (Fwdays)
I have heard many times that architecture is not important for the front end. I have also often seen developers implement front-end features just by following the standard rules of a framework, thinking this is enough to successfully launch the project, and then the project fails. How do you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I asked myself, as an "infrastructure container Kubernetes guy", how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss which cloud/on-premise strategy we may need to apply it to our own infrastructure and get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could be beneficial for, or limiting to, your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards more flexible and future-proof PHP development.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Sagar sen caise2013final
1. Testing a Data-intensive System with Generated Data Interactions
The Norwegian Customs and Excise Case Study
Sagar Sen and Arnaud Gotlieb
Certus V&V Center, Simula Research Laboratory
3. The Heart of Norway’s E-governance: TVINN
• 30,000 declarations/day, potentially adhering to about 220,000 customs rules
• Customs rules typically accept/return declarations based on information in the declaration
• Norway was the first country in the world to use the UN’s EDIFACT brokerage standard
• 20% of Norway’s economy (200 billion NOK/year, or 25 billion Euros/year)
5. Daily Challenges for TVINN
• Accurate computation of taxes
• Preventing criminal activities such as mafia smuggling
Gross-weight > 2 x Net-weight?
6. Daily Challenges for TVINN
• Accurate computation of taxes
• Preventing criminal activities such as mafia smuggling
• Protecting the people of Norway from imports of hazardous substances
7. Daily Challenges Translate to Testing
• Accurate computation of taxes
• Preventing criminal activities such as mafia smuggling
• Protecting the people of Norway from imports of hazardous substances
Testing TVINN: Are the customs rules complete? Can they correctly detect problems in declarations? Are there missing rules?
8. Behind the Scenes: Testing at Toll
A small team of test managers (Atle, Katrine, Astrid, Odd) tests TVINN against large amounts of live data (up to 30,000 customs declarations/day).
9. Behind the Scenes: Testing at Toll
Is live data complete for testing all customs rules?
How long will it take to test with all live data?
This is a lot of data; can I select the declarations relevant to detecting bugs?
10. Can we automatically synthesize small but effective test databases instead of using live data?
12. A database is defined by a schema: the database is specified by its database schema (e.g., the Norwegian Customs schema).
13. Modelling the Test Database Configuration Space with a Feature Model
The feature model is structured as: Database > Tables > Fields > Field Values.
Invariants:
CountryCode.CN requires Currency.CNY
CountryCode.CN requires CountryGroup.RCN
Together these define the database configuration space.
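As an illustration (not Faktum's actual implementation), a configuration can be modelled as the set of selected field-value features, with the slide's "requires" invariants checked directly:

```python
# Minimal sketch: a test-database configuration as a set of selected
# field-value features, with "requires" invariants from the slide.
REQUIRES = [
    ("CountryCode.CN", "Currency.CNY"),
    ("CountryCode.CN", "CountryGroup.RCN"),
]

def is_valid(config):
    """A configuration is valid if every 'requires' invariant holds:
    whenever the left feature is selected, the right one must be too."""
    return all(a not in config or b in config for a, b in REQUIRES)

config_ok = {"CountryCode.CN", "Currency.CNY", "CountryGroup.RCN"}
config_bad = {"CountryCode.CN", "Currency.USD"}
print(is_valid(config_ok))   # True
print(is_valid(config_bad))  # False
```

Faktum instead hands such invariants to the Alloy Analyzer as facts, but the validity condition is the same.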
14. Configuration to Test Database Population
The feature model yields a configuration of field values, which is transformed to SQL that populates the database:

INSERT INTO Declarations(Category, Direction, CountryCode, CurrencyCode)
VALUES ('FO', 'I', 'US', 'CNY');
INSERT INTO Items(OriginCountry)
VALUES ('US');
...
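The configuration-to-SQL transformation can be sketched as simple string templating; the `to_insert` helper below is hypothetical, not Faktum's code:

```python
# Sketch: turn a configuration of field values into an INSERT statement
# like the one shown on the slide.
def to_insert(table, fields):
    """fields is an ordered mapping of column name -> value."""
    cols = ", ".join(fields)
    vals = ", ".join(f"'{v}'" for v in fields.values())
    return f"INSERT INTO {table}({cols}) VALUES ({vals});"

config = {"Category": "FO", "Direction": "I",
          "CountryCode": "US", "CurrencyCode": "CNY"}
sql = to_insert("Declarations", config)
print(sql)
```

In practice one statement is emitted per table touched by the configuration (Declarations, Items, Taxes, and so on).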
15. Challenge
How do we generate small test databases that satisfy test coverage criteria such as combinatorial interaction coverage?
17. Faktum
• A tool to synthesize test databases that cover all T-wise combinatorial interactions between a set of field values
• Input: a feature model of the database variability, the schema, and T (e.g., T=2 is pairwise)
• Uses the Alloy Analyzer API and is implemented in Java
18. Faktum
Step 1: Feature Model to Constraint Satisfaction Problem in Alloy

one sig ProductConfigurations{ configurations : set Configuration }
sig Configuration{ f1: lone Category_FO, f2: lone Category_EN, ... }
fact Invariant_Category_XOR{ all c:Configuration |
  #c.f9+#c.f10+#c.f11+#c.f12+#c.f13+#c.f14=1 }

In the base Alloy model:
“Database field values are features in a configuration.”
“Invariants become facts.”
“A configuration is a test case.”
“The set of configurations is the set of tests.”
19. Faktum
Step 2: Generating T-wise Data Interactions
For T=2, the pairwise interactions (tuples) over two field values, FU and USD (P = present in database, N/P = not present in database):

FU    USD
N/P   N/P
N/P   P
P     N/P
P     P

The case study has 2582 pairwise tuples (interactions).
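Enumerating the presence tuples for T=2 is mechanical: every pair of features, times the four presence combinations. The sketch below uses a toy three-feature set for illustration; the case study's 2582 tuples come from the full set of field values after invalid tuples are discarded.

```python
from itertools import combinations, product

features = ["FU", "USD", "CN"]  # toy feature set for illustration

# Each pairwise tuple fixes two features to Present ("P") or
# Not Present ("N/P") in the test database.
pairwise_tuples = [((a, pa), (b, pb))
                   for a, b in combinations(features, 2)
                   for pa, pb in product(["P", "N/P"], repeat=2)]

print(len(pairwise_tuples))  # C(3, 2) * 4 = 12
```

For n features this yields 4 * n * (n - 1) / 2 candidate tuples before validity checking.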
20. Faktum
Step 3: Tuples of Interactions to Alloy Predicates
Each presence tuple is transformed into a tuple predicate:

pred tuple1 {all c:Configuration|#c.f1=0 and #c.f2=0}
pred tuple2 {all c:Configuration|#c.f1=0 and #c.f2=1}
pred tuple3 {all c:Configuration|#c.f1=1 and #c.f2=0}
pred tuple4 {all c:Configuration|#c.f1=1 and #c.f2=1}
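Generating these predicates is essentially string templating over the presence tuples. A minimal sketch (the `tuple_predicate` helper is hypothetical and fixed to fields f1 and f2 for brevity):

```python
def tuple_predicate(name, f1_present, f2_present):
    """Emit Alloy predicate text fixing the multiplicity of two
    feature fields: 1 = present in the database, 0 = not present."""
    return (f"pred {name}\n"
            f"{{all c:Configuration|#c.f1={int(f1_present)}"
            f" and #c.f2={int(f2_present)}}}")

print(tuple_predicate("tuple4", True, True))
```

Running this for all four presence combinations reproduces the four predicates shown on the slide.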
21. Faktum
Step 4: Checking Tuple Validity
Each tuple predicate (tuple1 ... tuple4 from Step 3) is combined with the base Alloy model and solved. If a solution exists, the tuple is valid; if not, it is invalid.
“A fully parallelizable process”
22. Faktum
Step 5: Divide and Combine Strategy
A large number of interaction tuples gives a large number of predicates, and solving all predicates in one Alloy constraint model is not tractable. The valid tuple predicates are therefore divided into subsets; each subset is combined with the base Alloy model and solved separately, and the resulting configuration subsets are combined into the final configuration set.
“One can explore different divide strategies”
Perrouin, Sen, Baudry, Le Traon, Automated T-wise Test Generation for SPLs, ICST 2010
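The divide-and-combine loop can be sketched as below, with `solve` abstracting the call to Alloy on a subset of tuple predicates; the chunk size and the toy solver are made up for illustration.

```python
def divide_and_combine(tuples_, solve, chunk_size=50):
    """Solve tuple subsets independently and combine the resulting
    configuration subsets into one configuration set."""
    configurations = []
    for i in range(0, len(tuples_), chunk_size):
        configurations.extend(solve(tuples_[i:i + chunk_size]))
    return configurations

# Toy solver: one configuration per tuple. A real solver packs many
# tuples into each configuration, so the combined set is much smaller.
fake_solve = lambda chunk: [f"config-for-{t}" for t in chunk]
result = divide_and_combine(list(range(120)), fake_solve, chunk_size=50)
print(len(result))  # 120
```

Different divide strategies correspond to different ways of partitioning `tuples_` into chunks, which is the design space the slide alludes to.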
23. Faktum
Step 6: Configuration to SQL
Each configuration of field values in the configuration set is transformed to SQL that populates the database:

INSERT INTO Declarations(Category, Direction, CountryCode, CurrencyCode)
VALUES ('FO', 'I', 'US', 'CNY');
INSERT INTO Items ...;
INSERT INTO Taxes ...;
24. Faktum
Step 7: Generating updates for other fields
The database is completed with a basic strategy: unique values for keys and random generation for the other fields. For example:

UPDATE Declarations
SET CustomerID = '2002542616', Date = '1965-3-29',
    Sequence = '1', Version = '1',
    Amount = '2982.490245', FeeAmount = '1343.471627',
    TransportCost = '79.0749637', ExchangeRate = '112.7416998'
WHERE ...;
30. Scalability of Configuration Generation
1. Generating the 935 configurations required 6803 calls to the Alloy SAT solver.
2. With divide-and-combine, each call to the solver takes about 400 ms on average, i.e., roughly 45 minutes of solving in total.
31. Conclusion and Future Work
1. We represent variability in test databases as a feature model.
2. Faktum is a tool to synthesize test databases covering T-wise combinatorial interactions between database field values.
3. We apply Faktum to generate compact test databases for the Norwegian Customs and Excise Dept.
4. Faktum is scalable due to a divide-and-combine strategy.
Future Work:
1. A better representation of database variability.
2. A multi-processor, parallelized implementation of Faktum for fast (not just scalable) generation.
TVINN is a sophisticated software system that processes about 30,000 customs declarations a day, making sure they adhere to more than 100,000 customs rules. Customs rules basically accept or reject declarations using the information in them. Incidentally, Norway is a pioneer in e-governance, being the first country in the world to use EDIFACT brokerage, a UN standard for the exchange of business messages that is in widespread use today. TVINN plays a key role in inching towards a corruption-free society.
There are three principal challenges for Toll, and in particular for TVINN. The first is the accurate computation of taxes: errors in computing taxes could hurt the public image.
The second is detecting and preventing criminal activities such as mafia smuggling. For instance, if the gross weight exceeds two times the net weight, then something is fishy: could there be something else in the packaging? Of course, a lot of prevention is done manually, but TVINN is certainly capable of raising alerts when numerical values look suspicious.
The third challenge is to protect the people of Norway. People would seldom import a gun or a bomb into Norway, but they may import the key elements for making one. Information about declarations in TVINN can help detect such patterns.
These daily challenges largely translate to testing of TVINN. Testing TVINN entails asking questions such as: are the customs rules complete? Can they correctly detect problems in declarations? Are there missing rules?
So who does it? There is a team of very efficient and experienced test managers at Toll (Norwegian Customs); we are pleased to have Katrine Langset with us in the audience. A small team behind the scenes ensures that new rules are well tested and robust for the flurry of 30,000 incoming declarations the next day.
So at Certus we asked ourselves: how can we automate steps for our test managers to improve the quality of TVINN, save them time, and increase their understanding of the system?