This presentation by CADE (the Brazilian competition authority) was given at the workshop “Cartel screening in the digital era”, held by the OECD in Paris on 30 January 2018. More papers and presentations on the topic can be found at oe.cd/wcsde.
Cartel screening in the digital era – CADE Brazil – January 2018 OECD Workshop
1. Screening and data mining tools to detect cartels:
Brazilian experience
Workshop on cartel screening in the digital era
OECD, Paris – 30 January 2018
Diogo Thomson de Andrade
Deputy Superintendent of CADE
Felipe Leitão Valadares Roquete
Head of the Intelligence Unit of CADE
2. Disclaimer
The views expressed here are solely those of the authors and do not in any way represent the views of
CADE or any other entity of the Brazilian Government.
3. Summary
• How (and why) we created a Screening Unit
• Projeto Cérebro (aka “The Brain Project”): state of the art
• Projeto Cérebro: findings and (or?) challenges
• Simulations: a few actual examples
6. How (and why) we created a Screening Unit
Origins and reason for primary focus on bid-rigging:
• Authority priority in fighting bid-rigging since 2007;
• Large amount of public data;
• Large number of sources of data;
• High difficulty of data mining and multiplicity of databases;
• Context of “big data” initiatives;
• Economic theory and literature of screening;
• Behaviour and pattern analysis of suspicion of collusion in public
procurement (“red flags”).
7. Phase 1: Benchmarking
Phase 2: Data collection
- formal agreements and webscraping
Phase 3: External consultants
- IT (data mining techniques)
- Econometrics (screening methods)
During all phases: internal expert “consultants” (senior investigators and
case handlers) to “guide” external consultants based on the Brazilian
authority's actual cartel detection experience
How (and why) we created a Screening Unit
8. Projeto Cérebro: state of the art
Data sources (format, source: size):
• Oracle dump, Comprasnet: 200 GB
• CSV, ANP Georef: 2 GB
• TXT, Petronect: 90 GB
• CSV, Tribunal SP: 170 GB
• Web scraping, DF: 1 GB
• Federal Revenue: 217 GB
• XLS, CEMED ANVISA: 0.2 GB
• BAK, Tribunal PE: 1.5 GB
All the data: >2 TB in ~40 databases
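The sources listed above arrive in different formats, so some normalisation step is needed before they can be queried together. The following is only a minimal sketch of that idea, not a description of CADE's pipeline: the two sample extracts, their column names, and the `bids` table are all hypothetical, and SQLite stands in for whatever warehouse is actually used.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two sources with different layouts (illustrative only).
SOURCE_A = "tender_id;bidder;bid\nT1;Alpha;100,50\nT1;Beta;101,00\n"
SOURCE_B = "id,company,value\nT2,Alpha,200.00\nT2,Gamma,198.75\n"


def parse_source_a(text):
    """Semicolon-separated file that uses a decimal comma."""
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        amount = float(row["bid"].replace(",", "."))
        yield row["tender_id"], row["bidder"], amount, "A"


def parse_source_b(text):
    """Comma-separated file that uses a decimal point."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["id"], row["company"], float(row["value"]), "B"


def build_warehouse(conn):
    """Load both sources into one table with a shared (assumed) schema."""
    conn.execute(
        "CREATE TABLE bids (tender_id TEXT, bidder TEXT, amount REAL, source TEXT)"
    )
    rows = list(parse_source_a(SOURCE_A)) + list(parse_source_b(SOURCE_B))
    conn.executemany("INSERT INTO bids VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)

# One query now spans both original sources.
alpha_tenders = [
    row[0]
    for row in conn.execute(
        "SELECT tender_id FROM bids WHERE bidder = ? ORDER BY tender_id", ("Alpha",)
    )
]
```

Once every source is mapped to the same columns, a firm's activity can be followed across databases with a single query, which is the practical meaning of one searchable “language”.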
9. Projeto Cérebro: state of the art
Data Warehouse
• Set of public databases, or private databases of public data (via cooperation agreements);
• One searchable IT “language”.
Data Mining
• Patterns and similarities in competitors' behaviour;
• Suspicious/rare facts;
• Signs of simulated competition;
• Plots and trails of bid-rigging analysis (CADE/others);
• Automation of the analysis made by case handlers.
Statistical Tests
• Screening literature: hypothesis testing;
• Generalizations based on past concrete cases;
• Microeconomic theory.
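As an illustration of the kind of statistical test the screening literature discusses, the sketch below computes the coefficient of variation of the bids in each tender and flags tenders where bids are unusually close together. This is an assumed example, not a test the slides attribute to CADE; the function names, the sample data, and the 3% threshold are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(bids):
    """Standard deviation of the bids divided by their mean."""
    return pstdev(bids) / mean(bids)


def flag_low_dispersion(tenders, threshold=0.03, min_bids=3):
    """Return ids of tenders whose bids show very little dispersion.

    The threshold is a placeholder; in practice it would be calibrated
    against tenders believed to be competitive.
    """
    return sorted(
        tender_id
        for tender_id, bids in tenders.items()
        if len(bids) >= min_bids and coefficient_of_variation(bids) < threshold
    )


# Hypothetical bids per tender.
tenders = {
    "T1": [100.0, 101.0, 102.0],  # tightly clustered
    "T2": [100.0, 120.0, 150.0],  # widely spread
    "T3": [50.0, 50.5],           # too few bids to assess
}
flagged = flag_low_dispersion(tenders)
```

A flag of this kind is only a reason to look closer: low dispersion can also arise in competitive markets with homogeneous costs.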
11. Projeto Cérebro: state of the art
Applications:
• Ex officio investigations (enough for a dawn raid at least);
• Support and enhance ongoing investigations;
• General data support for all units of CADE (a side effect).
12. Architecture
Software: open source (R, Python and Neo4j) and proprietary (SQL Server and QlikView)
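The stack above includes a graph database, which suggests that relationships between firms are part of the analysis. The sketch below assumes, without confirmation from the slides, that one such relationship is joint participation in tenders. It builds a co-bidding edge list in plain Python; the data and the cut-off of three joint tenders are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_bidding_edges(tenders):
    """Count, for every pair of firms, the tenders in which both submitted bids."""
    edges = Counter()
    for bidders in tenders.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(bidders)), 2):
            edges[pair] += 1
    return edges


# Hypothetical participation lists per tender.
tenders = {
    "T1": ["Alpha", "Beta"],
    "T2": ["Beta", "Alpha", "Gamma"],
    "T3": ["Alpha", "Beta"],
}
edges = co_bidding_edges(tenders)

# Pairs that meet repeatedly are candidates for closer inspection.
frequent = [pair for pair, count in edges.items() if count >= 3]
```

The resulting pairs and counts could be loaded into a graph database as weighted edges between firm nodes.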
Projeto Cérebro: findings and (/or?) challenges
13. Challenge #0
Institutional differences: impact on public procurement design and private market
exchanges
- public procurement regulations and bid rigging strategies
- differences in the scope of provisions governing access to public and private data
- are data mining techniques and screening methods replicable?
Projeto Cérebro: findings and (/or?) challenges
14. Challenge #0
Example: Rebid (Kawai & Nakabayashi)
Detecting Large-Scale Collusion in Procurement Auctions
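The rebid approach relies on auctions that go to a second round. As commonly summarised, the idea is that when the two lowest first-round bids are nearly tied, competitive bidders should swap places in the rebid fairly often, so a first-round leader who almost never loses the lead is a warning sign. The sketch below is a simplified illustration of that logic under stated assumptions, not the authors' estimator; the data layout, names, and 1% margin are hypothetical.

```python
def lowest_two(bids):
    """Return the (bidder, bid) pairs with the lowest and second-lowest bids."""
    ranked = sorted(bids.items(), key=lambda item: item[1])
    return ranked[0], ranked[1]


def persistence_rate(auctions, max_margin=0.01):
    """Share of close first rounds in which the leader also leads the rebid.

    Each auction is a (round1, round2) pair of {bidder: bid} dicts.
    Only auctions whose two lowest first-round bids differ by at most
    `max_margin` (relative to the lowest bid) are considered.
    """
    close = kept = 0
    for round1, round2 in auctions:
        (first, low), (second, next_low) = lowest_two(round1)
        if (next_low - low) / low > max_margin:
            continue  # not a near tie
        if first not in round2 or second not in round2:
            continue  # one of the two did not rebid
        close += 1
        if round2[first] < round2[second]:
            kept += 1
    return kept / close if close else None


# Hypothetical two-round auctions.
auctions = [
    ({"Alpha": 100.0, "Beta": 100.5, "Gamma": 110.0},
     {"Alpha": 98.0, "Beta": 99.0, "Gamma": 105.0}),                     # near tie, order kept
    ({"Alpha": 200.0, "Beta": 201.0}, {"Alpha": 195.0, "Beta": 196.0}),  # near tie, order kept
    ({"Beta": 300.0, "Gamma": 302.0}, {"Beta": 298.0, "Gamma": 297.0}),  # near tie, order flipped
    ({"Alpha": 400.0, "Gamma": 450.0}, {"Alpha": 390.0, "Gamma": 440.0}),  # wide margin, ignored
]
rate = persistence_rate(auctions)
```

Whether this carries over to Brazil depends on whether local procurement rules generate comparable rebids, which is the replicability question this challenge raises.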
Projeto Cérebro: findings and (/or?) challenges
15. Projeto Cérebro: findings and (/or?) challenges
Challenge #1
Data: access, comprehensiveness, quality and cost
- how many resources?
- public procurement and private markets: availability
- completeness and quality
16. Projeto Cérebro: findings and (/or?) challenges
Challenge #1
Example: Swiss Competition Authority
Bid Rigging Flagging
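The slide gives only the title of the Swiss example. Published work on bid-rigging screens describes simple per-tender statistics, one of which is the relative distance: the gap between the two lowest bids divided by the standard deviation of the losing bids. The sketch below assumes that definition; it is an illustration from the literature rather than a confirmed account of what was shown, and the sample bids are hypothetical.

```python
from statistics import stdev


def relative_distance(bids):
    """Gap between the two lowest bids over the spread of the losing bids.

    Requires at least three bids, so that the losing bids have a
    sample standard deviation.
    """
    ranked = sorted(bids)
    if len(ranked) < 3:
        raise ValueError("need at least three bids")
    losing = ranked[1:]
    spread = stdev(losing)
    if spread == 0:
        raise ValueError("losing bids are identical; ratio is undefined")
    return (ranked[1] - ranked[0]) / spread


# Hypothetical tenders: a clear winner with clustered losers, and an evenly spaced field.
clustered_losers = relative_distance([100.0, 110.0, 111.0, 112.0])
even_field = relative_distance([100.0, 102.0, 104.0, 106.0])
```

A large value means the winner stands well apart while the losers sit close together, a pattern consistent with cover bidding but not proof of it.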
17. Challenge #2
Screening in practice: intelligence or prosecution?
- prioritization tool?
- to enhance formal cases?
- cost benefit analysis?
Projeto Cérebro: findings and (/or?) challenges
18. Projeto Cérebro: findings and (/or?) challenges
Challenge #2
Example: CADE’s Fuel Retail Market Screen
19. Challenge #3
Translation: IT and econometrics technicalities and non-experts
- Judiciary, lawyers, public servants, etc.
- risk to weaken the case?
- strategy: incremental or stationary?
Projeto Cérebro: findings and (/or?) challenges
20. Simulations: a few actual examples
Brazil: overview of public procurement data
• Federal government e-procurement system: Comprasnet
• On average 60,000 public tenders per year.
• Patterns being looked for:
• Bid suppression
• Cover bidding
• Bid rotation
• Superfluous losing bidders
• Stable market share
• Pricing patterns
• Text similarities
• Submitted files metadata
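Of the patterns listed above, text similarity is the easiest to illustrate in a few lines. The sketch below compares proposal texts pairwise with Python's standard `difflib.SequenceMatcher` and reports pairs above a similarity threshold. The sample texts, firm names, and the 0.9 threshold are hypothetical, and the slides do not say which similarity measure CADE uses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(proposals, threshold=0.9):
    """Return (firm_a, firm_b, ratio) for proposal texts that nearly match."""
    hits = []
    for (firm_a, text_a), (firm_b, text_b) in combinations(sorted(proposals.items()), 2):
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            hits.append((firm_a, firm_b, ratio))
    return hits


# Hypothetical proposal texts; two share the same misspelling ("suply").
proposals = {
    "Alpha": "We hereby submit our proposal for the suply of office furniture.",
    "Beta": "We hereby submit our proposal for the suply of office furniture!",
    "Gamma": "Price list attached; delivery within thirty days of the order.",
}
hits = similar_pairs(proposals)
flagged_firms = [(firm_a, firm_b) for firm_a, firm_b, _ in hits]
```

Shared idiosyncrasies such as an identical typo are more telling than overall similarity, because tender documents often reuse official boilerplate.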