The document discusses using graph analytics and a graph database to enhance market surveillance and detect market abuse. It provides examples of how trading activity, orders, securities, brokers, etc. can be modeled as a graph network. Queries can then be used to find patterns and connections in the network that may indicate insider trading or other issues. For example, one query searches the graph to find other brokers who took similar positions to a broker of interest, and another looks for historical matching patterns involving the same brokers and securities. Modeling the market as a graph in this way allows for more efficient analysis of relationships compared to traditional relational databases.
3. Surveillance Insight for Financial Services
Capabilities:
• Fuse insights to detect suspicious activities
• View linkages with instant drill-down & playbacks
• Build employee profiles with personality & behavior traits
• See alerts & investigations from continuously updated risk models
Surveillance data sources (grouped on the slide as Transactions, Communications and External): Order, Trade, Execution, Reference, News, Social, Email, Chat, Voice
Benefits:
• Reduced cost of employee non-compliance & misconduct
• Faster detection of sophisticated scenarios
• Risk based prioritization of alerts and reduced false positives
4. In addition to extending the ‘base-level’ controls, clients are also seeking to improve the accuracy and effectiveness of their surveillance controls through a holistic and cognitive approach.
5. Surveillance Solution - Value proposition summary
• Reasoning Engine which goes beyond rules based alert system
• Fusion of information from multiple data sources (ex: Trade, Electronic Communication, Voice, Activity etc.)
• Learning models which are continuously updated
• Providing effective and easy to use visualization tools
• Identify new predictive patterns by linking various information with Trader profiles, i.e. "know your trader"
• Improve accuracy of alerts - Reduce false positives and negatives
• Improve efficiency of investigations
• Real-time supervision and surveillance
IBM’s Advanced Analytics (NLP, Sentiment analysis, Semantic Analysis, Behavior analysis, Emotion Analysis etc.)
Real-Time & Scalable Analysis Capabilities
6. Solution High level view
IBM Industry Analytics Solutions: WHAT WE ARE DELIVERING - End-to-End Pre-built Capabilities
Data Ingestion
• Market data as CSV in FIX format
• REST based adaptor to accept incoming message payload
Pre-built Analytics
• Pump/Dump detection engine
• Spoofing Engine
• Bayesian Inference Engine for reasoning
• UIMA based analytic pipeline
Insight Delivery (to Business Users)
• Dashboards
• REST based Alert Service
Input data
• Trading Data: Trade, Quote, Order, Execution
• Market / Customer data
• Communication data: Chat, Voice, Email
8. Integrated Trade Surveillance System – Logical View of the Foundation
Diagram components:
• Data inputs: Order, Position, Market, Social, Reference, Trade, Chat, Email, Voice
• People Data (Brokers, Dealers, Traders, Customers)
• Real-time Analysis Platform: Normalize, Analyze, Alert Generation
• In-memory Database
• Unstructured Data Ingestion
• Analysis components: Semantic Analysis, Behavior Analysis, Social Analysis, Emotion Analysis
• Reasoning Engine: Bayesian Network, Markov Processes
• Data Store: Metadata, Results, Raw Data
• Graph Database
• Compliance Workbench
• Data flows labelled: Raw Data, Anomaly Info., Anomaly Scores
Salient Features
- Single solution handling Trade, eComms, & Voice
- Real-Time, Near Real-Time & Batch Analysis
- Cognitive Analysis and Reasoning Engine
- Natural Language Processing
- Advanced visualization
9. Integrated Trade Surveillance System - System Capability Mapping
(Diagram: the system components from slide 8, with numbered callouts for the capabilities listed below.)
Real-time Analysis Platform
• Normalize structured trade data
• Analyze trade data to detect anomalous signals
• Load data into data storage
• Speech to Text Technology
• Low-Level: Codecs/Channel/Handset Compensation; SNR Estimation
• Mid-Level: Emotion and Speech Recognition, Speaker Diarization, Age/Gender Estimation
• High-Level: Sentiment Analysis, Topic Detection, Speaker Profile
Data Storage
• Stores raw data, metadata, temporary results/signals and the Case Alerts
• SQL Interface
Reasoning Engine
• Based on Bayesian Network
• Models learn over time
Unstructured Data (eComms) Load
• Load email, chat, Twitter, etc. data
• Load Message Board data
• Load other activity data
Compliance Workbench
• Integrated UI for both structured and unstructured data
• Executive Dashboard
• Details of individual Alerts
• Explore and analyze individual indicators/signals
Analysis Pipeline
Social Analysis
• Social communication graph analysis
Semantic Analysis
Behavior Analysis
• Early Fusion of email characteristics
• Feature based outlier detection
Emotion Analysis
• Risk Emotion Detection – analyze user written text
APIs
• Integrate data channels (e-Comms)
• Integrate with Case Management
• Integrate with external Visualization Tools
12. Graph Database: storing the network structure in a graph database allows for deep quantitative analytics
• Native Graph Store for High Performance Traversal, Analytics & Visualization
• Generic Graph Query Language for Flexibility & Extensibility
A native graph DB stores nodes and relationships directly, which makes retrieval efficient. Retrieving multi-step relationships is a 'graph traversal' problem.
In a relational DB, relationships are distributed and stored as tables.
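To make the 'graph traversal' idea concrete, here is a minimal Python sketch, not the product's implementation. It assumes a toy adjacency-list graph in which an instrument points to its orders and each order points to its broker; the node names and the `HAS_ORDER` / `PLACED_BY` labels are hypothetical. A multi-step relationship is then a matter of following stored links hop by hop rather than joining tables.

```python
from collections import defaultdict

# Hypothetical toy data: (source, relationship, target) triples.
EDGES = [
    ("FFIV", "HAS_ORDER", "order-1"),
    ("order-1", "PLACED_BY", "A123"),
    ("FFIV", "HAS_ORDER", "order-2"),
    ("order-2", "PLACED_BY", "B324"),
]


def build_adjacency(edges):
    """Store each node's outgoing neighbours directly, as a native graph store would."""
    adjacency = defaultdict(list)
    for source, _label, target in edges:
        adjacency[source].append(target)
    return adjacency


def nodes_at_distance(adjacency, start, hops):
    """Return the nodes reached after exactly `hops` outgoing steps from `start`."""
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for node in frontier for nxt in adjacency.get(node, [])]
    return frontier


adjacency = build_adjacency(EDGES)
# Two hops from the instrument: instrument -> order -> broker.
brokers = nodes_at_distance(adjacency, "FFIV", 2)
print(brokers)
```

Each hop is a lookup in the adjacency structure, so the cost depends on the neighbourhood visited rather than on the size of a whole relationship table.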
13. Insider Trading Graph Analysis Example: page 1 of 5
Analyzing complex layered activities distributed across multiple exchanges and asset classes can be extremely convoluted. Using a native graph structure can dramatically increase the efficiency of mining comparable activities. Designing workflows to automatically mine similarities within historical broker activity can synthesize scenarios for business agent review.
Single Native Daily Market Graph (≈ 40B Nodes)
FFIV [ is a ] US Equity [ in ] NASDAQ [ with ] Derivative Contracts [ in ] AMEX, BATS, BOX, C2, CBOE, PHLX, ISE
Graph Legend: node types are Equity, Order, Derivative and Broker; edges are BUY or SELL.
Relationship of Orders
This example is meant to demonstrate how fundamental market activity can be stored in an innate graph structure that intuitively replicates how the market activity occurs.
Here we find equity FFIV, which is listed on the NASDAQ but has derivatives contracts sold through multiple exchanges. We can structure the market activity as a vast network of orders. Encoding information on edges as well as in order properties improves efficiency.
The relationship between securities, orders, and brokers is paramount. Entities modeled daily:
• Equity: ≈ 10,000
• Order: ≈ 40 Billion
• Broker: ≈ 650,000
Additional degrees of separation, such as derivatives/securities, can be modeled (diagram nodes: Equity, Order, Broker, Derivative, Derivative).
Directionally, enormous amounts of information can be encoded into the graph. For example:
• Broker A123 - BUY - FFIV Order (exchange: BATS, volume: 10,000)
• Broker B234 - SELL - FFIV Order (exchange: BATS, volume: 10,000)
• The two FFIV orders are joined by a counterparty linkage
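The BUY/SELL example above can be sketched as a small property graph in Python. This is an illustration under assumptions, not the deck's data model: the `Edge` class, the order identifiers, the edge directions and the assignment of BUY to A123 and SELL to B234 are hypothetical readings of the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A directed, labelled edge that carries its own properties."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


# Hypothetical encoding of the slide's example; order ids are invented.
edges = [
    Edge("Broker A123", "FFIV Order #1", "BUY", {"exchange": "BATS", "volume": 10_000}),
    Edge("Broker B234", "FFIV Order #2", "SELL", {"exchange": "BATS", "volume": 10_000}),
    Edge("FFIV Order #1", "FFIV Order #2", "counterparty linkage"),
]


def edges_with(all_edges, **wanted):
    """Return the edges whose properties match every requested key/value pair."""
    return [
        edge for edge in all_edges
        if all(edge.properties.get(key) == value for key, value in wanted.items())
    ]


bats_edges = edges_with(edges, exchange="BATS")
bats_volume = sum(edge.properties["volume"] for edge in bats_edges)
print(len(bats_edges), bats_volume)
```

Because exchange and volume live on the edge itself, a filter such as "all activity on BATS" needs no separate lookup table.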
14. Insider Trading Graph Analysis Example: page 2 of 5
By modeling the market as a network of connected trades, previous analytic computations can be replaced with queries. Attributes of the network are now considered degrees of connectedness, whereas before they would have been stored in separate tables and looked up / aggregated independently.
FFIV – A123 Connection: April 2, 2016
Here we can examine the degree of separation (and source / sink paths) to learn features from the network.
(Diagram: Broker A123 connected over time to FFIV through four BUY/SELL orders, including Call and Put orders. Labels shown: strike price properties $103 and $105; expiration properties Apr 15 and Jun 15; volumes 10,000, 5,000 and 1,000; exchange: NASDAQ.)
Order Summary:
1) Enter large position in stock
2) Enter large position in Call
3) Enter large position in Call (second order)
4) Wrote large position of Put
Example Queries:
Parking: Do round-trip paths exist between two brokers and a single security?
Front-Running: Do paths exist with better price execution on house accounts rather than client accounts?
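As a rough illustration of the "Parking" question, the sketch below looks for round-trip paths: a security moving from one broker to another and later back again. The `Trade` record, the day index and the sample trades are all hypothetical, and a real surveillance rule would add thresholds for size, price and time window.

```python
from typing import NamedTuple


class Trade(NamedTuple):
    seller: str
    buyer: str
    symbol: str
    day: int  # hypothetical trading-day index


def round_trips(trades, symbol):
    """Find (first, second) trade pairs where `symbol` goes X -> Y and later Y -> X."""
    relevant = sorted((t for t in trades if t.symbol == symbol), key=lambda t: t.day)
    hits = []
    for i, first in enumerate(relevant):
        for second in relevant[i + 1:]:
            returns_to_origin = (
                second.seller == first.buyer and second.buyer == first.seller
            )
            if second.day > first.day and returns_to_origin:
                hits.append((first, second))
    return hits


trades = [
    Trade("A123", "B234", "FFIV", 1),
    Trade("A123", "C543", "FFIV", 2),
    Trade("B234", "A123", "FFIV", 3),
]
parking_candidates = round_trips(trades, "FFIV")
print(parking_candidates)
```

In a graph store the same question becomes a cycle search of length two between broker nodes through a single security node.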
15. Insider Trading Graph Analysis Example: page 3 of 5
Major Public Announcement: April 3, 2016
Broker annotations (FFIV Put positions):
Broker  Max Gain     Position Value  Size
A       $5,000,000   $20,000         500
B       $10,000,000  $40,000         1000
C       $5,000,000   $20,000         500
Kick-off Historical Graph Query Batch Process
Query 1
Given the traversal object g of a graph that reflects a specific date, and the symbol of the targeted instrument 'FFIV', return a list of all broker names who have positions with a Max Profit property greater than $500,000:
g.V().has('symbol','FFIV').repeat(out()).times(2).where(values('max_profit').max().is(gt(500000))).out().values('name')
=> A123, B324, C543
Candidate List: A123, B324, C543
(Diagram: Brokers A123, B324 and C543, each linked to an Order carrying a Max Profit Property of 5M, 10M and 5M respectively.)
* Max Profit is a static property given at trade execution. It is the maximum profit from the Put trade. Other properties or analytics are eligible in this scenario as well.
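The logic of Query 1 can be mirrored in plain Python to show what each traversal step does. This is a sketch over a hypothetical in-memory graph, not the Gremlin engine: the node ids, the `OUT` and `MAX_PROFIT` dictionaries and the below-threshold order for broker `Z999` are invented, while the 5M / 10M / 5M values come from the annotations above.

```python
# Outgoing edges: instrument -> derivative -> order -> broker (hypothetical ids).
OUT = {
    "FFIV": ["FFIV-put"],
    "FFIV-put": ["order-a", "order-b", "order-c", "order-z"],
    "order-a": ["A123"],
    "order-b": ["B324"],
    "order-c": ["C543"],
    "order-z": ["Z999"],  # invented broker with a small position
}

# Static Max Profit property recorded on each order at execution.
MAX_PROFIT = {
    "order-a": 5_000_000,
    "order-b": 10_000_000,
    "order-c": 5_000_000,
    "order-z": 20_000,
}


def step_out(nodes):
    """One out() step: follow every outgoing edge from every current node."""
    return [nxt for node in nodes for nxt in OUT.get(node, [])]


def candidate_brokers(symbol, threshold=500_000):
    """Instrument -> two out() steps -> keep high Max Profit orders -> out() to brokers."""
    orders = step_out(step_out([symbol]))
    flagged = [order for order in orders if MAX_PROFIT.get(order, 0) > threshold]
    return step_out(flagged)


candidates = candidate_brokers("FFIV")
print(candidates)
```

The threshold is written as the single integer 500_000; written with a thousands comma it would be read as two separate arguments in most query and programming languages.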
16. Insider Trading Graph Analysis Example: page 4 of 5
Using the candidate list from before, a graph traversal across historical snapshots can be done to find similar instances.
Query 2
Given the name of a broker 'A123' in the result returned for query 1, and given the traversal object g of a graph that reflects a specific date, return a list of broker names who have positions for the same instrument as the targeted broker, where the Max Profit values for the orders of these brokers and of the targeted broker on the instrument are all over $500,000:
g.V().match(
__.as('broker1').has('name','A123'),
__.as('broker1').in().as('order1'),
__.as('order1').values('max_profit').max().is(gt(500000)),
__.as('order1').repeat(in()).times(2).as('instrument'),
__.as('instrument').repeat(out()).times(2).as('order2'),
__.as('order2').values('max_profit').max().is(gt(500000)),
__.as('order2').out().as('broker2')).
select('broker2').dedup().by('name')
=> B324
Iteratively, queries such as this can return matches for a compliance officer to review.
Result:
Equity  Exchange  Expiration  Strike  Broker  Order Size  Max Profit
IBM     BOX       15-Jan      $150    A123    300         $4.5M
IBM     BATS      15-Jan      $155    B324    400         $6.2M
Historic Snapshot: January 1, 2016. Traversal Result: IBM Put Position. Start: Broker A123 (Put Order, strike $150, max profit 4.5M ✓); Match: Broker B324 (Put Order, strike $155, max profit 6.2M ✓).
Historic Snapshot: February 14, 2016. Traversal Result: FB Put Position. Start: Broker A123 (Put Order, strike $80, max profit 1.6M ✓); Match: Broker B324 (Put Order, strike $70, max profit 2.1M ✓).
(Diagrams: numbered callouts 1-7 trace the traversal steps from the Start broker through its order to the instrument, and on to the matching order and broker.)
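The pattern match in Query 2 can be paraphrased in Python as a two-stage filter over one snapshot. This is a hedged sketch: the flat `ORDERS` records and the low-value order for broker `Z999` are hypothetical, the IBM figures are taken from the result table above, and the explicit exclusion of the targeted broker is an assumption made so that the output lists only other brokers.

```python
# One historical snapshot, flattened to order records (hypothetical layout).
ORDERS = [
    {"broker": "A123", "instrument": "IBM", "max_profit": 4_500_000},
    {"broker": "B324", "instrument": "IBM", "max_profit": 6_200_000},
    {"broker": "Z999", "instrument": "IBM", "max_profit": 12_000},  # invented
    {"broker": "B324", "instrument": "XYZ", "max_profit": 900_000},  # invented
]


def similar_brokers(orders, target, threshold=500_000):
    """Brokers with a high Max Profit order on an instrument where `target` has one too."""
    # Stage 1: instruments on which the targeted broker holds a flagged order.
    instruments = {
        order["instrument"]
        for order in orders
        if order["broker"] == target and order["max_profit"] > threshold
    }
    # Stage 2: other brokers with flagged orders on those same instruments.
    return sorted({
        order["broker"]
        for order in orders
        if order["instrument"] in instruments
        and order["max_profit"] > threshold
        and order["broker"] != target
    })


matches = similar_brokers(ORDERS, "A123")
print(matches)
```

Running the same function over each historical snapshot, for each broker on the candidate list, would give the iterative batch process described above.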
17. Insider Trading Graph Analysis Example: page 5 of 5
Looking at the results, it appears Broker B324 is the common link across all matches.
(Diagram: Brokers A123, B324 and C543, with an "Initiate Investigation" action.)
Sub-match Investigation:
Results of the investigation discovered that Broker B324 was the source of non-public information in the ring and received information from a colleague who covers Technology for a major Investment Bank in New York City.
Sample Report
Date       Equity  XCh   Exp     Ins  Strike  Broker  Size  Max Gain
A123, B324, C543
3-Apr-16   FFIV    BOX   15-Apr  P    $100    A123    500   $5M
3-Apr-16   FFIV    C2    6-May   P    $100    B324    1000  $10M
3-Apr-16   FFIV    CBOE  6-May   P    $100    C543    500   $5M
A123, B324
1-Jan-16   IBM     BOX   15-Jan  P    $150    A123    300   $4.5M
1-Jan-16   IBM     BATS  15-Jan  P    $155    B324    400   $6.2M
14-Feb-15  FB      BOX   15-Mar  P    $80     A123    200   $1.6M
14-Feb-15  FB      BATS  7-Apr   P    $70     B324    300   $2.1M
B324, C543
5-Dec-15   GRPN    BOX   30-Dec  P    $4      B324    1600  $640K
5-Dec-15   GRPN    BATS  30-Dec  P    $5      C543    2000  $1M
14-Oct-15  TWTR    ISE   15-Oct  P    $25     B324    400   $1M
14-Oct-15  TWTR    AMEX  15-Oct  P    $30     C543    500   $1.5M
Historic Snapshot: December 5, 2015. Traversal Result: GRPN Put Position. Start: Broker B324 (strike $4, max profit 640K ✓); Match: Broker C543 (strike $5, max profit 1M ✓).
Historic Snapshot: October 14, 2015. Traversal Result: TWTR Put Position. Start: Broker B324 (strike $25, max profit 1M ✓); Match: Broker C543 (strike $30, max profit 1.5M ✓).
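The "common link" observation can be checked mechanically. The sketch below takes the broker sets from the sample report, keyed by the date and equity of each match, and intersects them; the `MATCHES` dictionary layout and the function name are illustrative assumptions.

```python
from collections import Counter

# Brokers involved in each match, as listed in the sample report.
MATCHES = {
    ("3-Apr-16", "FFIV"): {"A123", "B324", "C543"},
    ("1-Jan-16", "IBM"): {"A123", "B324"},
    ("14-Feb-15", "FB"): {"A123", "B324"},
    ("5-Dec-15", "GRPN"): {"B324", "C543"},
    ("14-Oct-15", "TWTR"): {"B324", "C543"},
}


def common_brokers(matches):
    """Return the brokers that appear in every match."""
    return set.intersection(*matches.values())


# How often each broker appears across all matches.
appearances = Counter(broker for brokers in MATCHES.values() for broker in brokers)

common = common_brokers(MATCHES)
print(common, appearances.most_common())
```

A broker present in every match is only a lead for a compliance officer to review, not proof of wrongdoing; the appearance counts help rank the remaining brokers.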