Sharing cyber threat intelligence (CTI) and preparing countermeasures in advance is becoming more important as cyber attacks grow more sophisticated. However, the IP addresses and domains that CTI carries as indicators (detection indices) are disposed of (changed or abandoned) by attackers in short cycles. Defenders are responding by sharing CTI faster in order to raise the attackers' costs, and as a result they receive large amounts of CTI every day. The CTI itself therefore also ends up being treated as disposable within a short cycle. In this talk, we describe an indicator learning method built on CTI accumulated day by day and an indicator learning engine that learns how attackers use indicators, and we report the learning results. We also report on the possibility of restructuring the learned indicators, combining them with other data sources, and applying them to mid- to long-term proactive defense.
Detection index learning based on cyber threat intelligence and its application by Tsuyoshi Taniguchi
1. Copyright 2017 FUJITSU SYSTEM INTEGRATION LABORATORIES LIMITED
Indicator learning based on cyber threat intelligence and its application: Overview
〜 Searching treasures from a vast amount of threat information 〜
CODE BLUE Day0 - Special Track
Counter Cyber Crime Track
(November 8, 2017)
FUJITSU SYSTEM INTEGRATION LABORATORIES LTD.
Tsuyoshi TANIGUCHI
2. Treasures buried in a vast amount of threat information
3. Cyber Threat Intelligence
Cyber Threat Intelligence: CTI
A report created to share knowledge about a particular threat
4. The traditional CTI: Shared by text
For a cyberattack called ○○, the involvement of an attacker named △△ is strongly suspected. As the method of attack, malware called □□ connecting to a C&C server with IP xx.xx.xx.xx has been observed.
5. Next CTI: Readable by machines
<tag threat-name> ○○ </threat-name>
<tag attacker> △△ </attacker>
<tag attack-method> □□ </attack-method>
<tag ip> xx.xx.xx.xx </ip>
6. STIX (Structured Threat Information eXpression) Format
One of the CTI standards
Consists of 8 information groups:
• Intent of cyber attack activities
• Indicators to detect attacks
• Events observed in attacks
• Behaviors and methods of cyber attackers
• Incidents
• People/organizations involved in cyber attacks
• Vulnerabilities of targeted software, systems, and configurations
• Countermeasures against threats
IPA's outline of STIX https://www.ipa.go.jp/security/vuln/STIX.html
7. Issues to work on
Analysts have too many CTIs to analyze
CTI sharing is being encouraged through AIS (Automated Indicator Sharing)
A vast amount of CTI could turn into garbage
8. Motivation
To help analysts find special CTIs (treasures) that describe attackers in a vast amount of CTIs (garbage)
9. Image of searching treasures from CTIs
[Figure labels: Real-time type CTI sources; Analysis type CTI sources; Others; CTI platform; Treasures (special CTIs)]
10. Indicators
Indicators: elements of CTIs used to detect attacks
Types of indicators
IP address ←Target
Domain ←Target
Host
E-mail
URL
Hash: MD5, SHA1, SHA256, PEHASH, IMPHASH
…
[Figure: three IP indicators in a CTI (IP xxx.xxx.xxx.xxx, IP yyy.yyy.yyy.yyy, IP zzz.zzz.zzz.zzz) with the labels Unidentified (new), Continued use, and Reuse]
11. Most indicators (attack infrastructure) are used just once
[Pie chart: over 80% of indicators are used just once; a callout on the chart reads "My research focuses on this part"]
12. Hypothesis of my research
Indicators in CTIs show the attackers' footprints
Classify the indicators into the following 3 categories
Disposable (used just once)
Long life
Reuse
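The three categories can be sketched as a simple rule over the dates on which an indicator appears in CTIs. This is only an illustration under stated assumptions, not the engine described in the talk: the function name `classify_indicator` and both thresholds (28 days) are hypothetical, and an indicator seen only briefly is treated the same as one seen once.

```python
from datetime import date, timedelta


def classify_indicator(sightings, long_life_days=28, reuse_gap_days=28):
    """Label one indicator from the dates on which it appeared in CTIs.

    Both thresholds are hypothetical and would need tuning per CTI source.
    """
    days = sorted(set(sightings))
    if not days:
        raise ValueError("an indicator needs at least one sighting")
    # A silence longer than reuse_gap_days followed by a new sighting is
    # treated as the indicator being brought back into use.
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    if any(gap > reuse_gap_days for gap in gaps):
        return "reuse"
    # No long silence: the indicator was observed continuously.
    if (days[-1] - days[0]).days >= long_life_days:
        return "long life"
    # Seen once, or only for a short while.
    return "disposable"


# Made-up sighting histories for three indicators.
once = [date(2015, 3, 16)]
weekly = [date(2015, 3, 28) + timedelta(weeks=w) for w in range(12)]
comeback = [date(2014, 5, 1), date(2015, 2, 1)]

for name, history in [("once", once), ("weekly", weekly), ("comeback", comeback)]:
    print(name, "->", classify_indicator(history))
```

Checking for a long gap first means an indicator that disappears and comes back is labeled as reuse even if its overall span is long.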
13. Image of how to use the result of indicator learning
[Figure labels]
Real-time type CTI sources
Analysis type CTI sources
CTI platform
Indicator DB
A vast amount of (unidentified) real-time indicators
Black list (detection list)
Most of them vanish soon, but they still need to be dealt with
Special IPs and domains
Extra handling, more analysis
14. Prior notice for indicator learning based on CTI
This is not a talk about deep learning or clustering
15. 1. Treasures buried in a vast amount of threat information
CTI
STIX
Garbage
Treasures
16. Contents of the treasures
17. Real example 1 (1/2): Spam mails
Hi xxxxxx,
Congratulations!
You have access to your free
trading cash!
The money is sitting and waiting
in your account now.
Access Here Now
Thanks again
Dennis Mcclain
http://sectorservices[.]com[.]br/components/com_tz_portfolio/views/gallery/tmpl/
187.17.111[.]105
DNS
18. Real example 1 (2/2): Usage of indicator learning
[Figure labels: Indicator DB; 187.17.111[.]105]
19. Real example 2 (1/2): Kelihos botnet
[Figure: Life-span of botnet indicators (IP addresses) of the Kelihos botnet in 2015, plotted by month from Jan to Dec]
97.5% vanished within 4 weeks
11 (of 39,937) lived for more than 46 weeks
xx.xx.xx.41: 4/13 - 4/14
xx.xx.xx.42: 3/16
xx.xx.xx.46: 3/28 - 6/19
xx.xx.xx.47: 3/8 - 3/13
xx.xx.xx.48: 5/21 - 5/22
xx.xx.xx.51: 5/1 - 6/14
20. Real example 2 (2/2): Kelihos botnet
Treasures are buried
21. Real example 3: Estimation of attack trends
Long life type → Downloader
Disposable type → Botnet, DGA, etc.
22. Real example 4 (1/2): Monitoring IP addresses that could potentially be used for malicious activities
[Figure: timeline (2014, 2015, 2016, at present) showing when an IP address appeared in CTIs for GameOverZeus, Sality, CryptoWall, Tinba, and DGA]
23. Real example 4 (2/2): Verification using passive DNS services
PassiveTotal by RiskIQ
[Figure labels: Learning period based on CTI; Locky spam, June 2016]
4 (3rd) → 19 (4th) → 209 (5th)
398 (20th) → 573 (21st) → 584 (22nd)
24. 2. Contents of the treasures
Long-life indicators
Attack trends
Proactive defenses
25. The way of searching treasures
26. CTI (indicators in CTIs) is a collection of biased data
The trouble with learning CTI indicators: a mass of bias
Machine learning assumes that the statistics of the training data can be applied to future data...
Unbalanced number of CTIs depending on the specific malware (campaign)
Ex. WannaCry, Petya, Bad Rabbit
Bias in the quality of indicators
Most indicators are new (unidentified) or relate to only a part of the vast amount of CTIs
Bias (difference) in the nature of attacks
Botnet (mass-distribution, indiscriminate type) or APT (targeted)
27. Indicator learning
Simply applying standard algorithms is not enough
Majority: used just once
Booms: botnets etc. use and then dispose of many indicators
Classification/identification: most indicators can identify malware
Searching for treasures: this comes down to the problem of revealing rare patterns (treasures)
Treasures cannot be found by blindly searching all the CTIs
28. Structure of indicator learning
[Figure: CTI data sources 1, 2, 3 → Preprocessing → Subgroup 1, Subgroup 2, ⋯, Subgroup i → Indicator learning → Indicator DB]
29. Preprocessing
Basically assumes the STIX format and uses an XML parser
<stix:STIX_Package …>
<stix:STIX_Header>
…
</stix:STIX_Header>
<stix:Observables…>
…
<cybox:Title> IP addresses </cybox:Title>
…
<AddressObj:Address_Value> xxx.xxx.xxx.xxx </AddressObj:Address_Value>
…
<cybox:Title>Cerber IP addresses </cybox:Title>
…
<AddressObj:Address_Value> yyy.yyy.yyy.yyy </AddressObj:Address_Value>
…
</stix:Observables>
<stix:STIX_TTPs>
…
<ttp:Title> … </ttp:Title>
…
</stix:STIX_TTPs>
<stix:Campaigns>
…
<campaign:Title> Campaign1 </campaign:Title>
…
</stix:Campaigns>
…
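The preprocessing step can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal sketch under assumptions, not the parser used in the talk: the namespace URIs are the ones commonly declared by STIX 1.x / CybOX 2.x packages and should be checked against your own files, the sample package is heavily simplified, the addresses come from documentation ranges, and `extract_addresses` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs commonly declared by STIX 1.x / CybOX 2.x packages
# (an assumption; verify against the declarations in your own CTI files).
NS = {
    "stix": "http://stix.mitre.org/stix-1",
    "cybox": "http://cybox.mitre.org/cybox-2",
    "AddressObj": "http://cybox.mitre.org/objects#AddressObject-2",
}

# Simplified stand-in for a STIX package; real packages carry more nesting.
SAMPLE = """\
<stix:STIX_Package xmlns:stix="http://stix.mitre.org/stix-1"
                   xmlns:cybox="http://cybox.mitre.org/cybox-2"
                   xmlns:AddressObj="http://cybox.mitre.org/objects#AddressObject-2">
  <stix:Observables>
    <cybox:Observable>
      <cybox:Title>IP addresses</cybox:Title>
      <cybox:Object><cybox:Properties>
        <AddressObj:Address_Value> 198.51.100.7 </AddressObj:Address_Value>
      </cybox:Properties></cybox:Object>
    </cybox:Observable>
    <cybox:Observable>
      <cybox:Title>Cerber IP addresses</cybox:Title>
      <cybox:Object><cybox:Properties>
        <AddressObj:Address_Value> 192.0.2.10 </AddressObj:Address_Value>
      </cybox:Properties></cybox:Object>
    </cybox:Observable>
  </stix:Observables>
</stix:STIX_Package>
"""


def extract_addresses(xml_text):
    """Return (observable title, address) pairs found in a STIX package."""
    root = ET.fromstring(xml_text)
    address_tag = "{%s}Address_Value" % NS["AddressObj"]
    pairs = []
    for observable in root.iterfind(".//cybox:Observable", NS):
        title = observable.findtext("cybox:Title", default="", namespaces=NS).strip()
        # Search the whole observable so the exact nesting depth does not matter.
        for node in observable.iter(address_tag):
            if node.text and node.text.strip():
                pairs.append((title, node.text.strip()))
    return pairs


for title, address in extract_addresses(SAMPLE):
    print(title, "|", address)
```

Keeping the observable title next to each address matters because titles such as "Cerber IP addresses" may be the only hint about which malware an indicator belongs to.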
30. Sub-grouping CTIs
[Figure: CTIs from CTI data sources 1, 2, 3 go through preprocessing and are divided into subgroups (Subgroup1 - GOZ, Subgroup2 - Upatre, Subgroup3 - Kelihos, Subgroup4 - Pony). Within each subgroup, the CTIs are placed along a timeline, each with its indicators (IP 1-1, IP 1-2, Domain 1-1, ⋯; IP 2-1, IP 2-2, Domain 2-1, ⋯; ⋯; IP i-1, IP i-2, Domain i-1, ⋯)]
GameOverZeus, Upatre, Kelihos, Pony, Locky, Domain Generation Algorithm, Dridex, DyreTrojan,
Cryptowall, Sality, Tinba, Torrent, KOL, Madness, APT28, APT10, Fallout, Lazarus, WannaCry, Petya
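Sub-grouping along a timeline can be sketched as follows. The record layout (subgroup name, publication date, set of indicators) is an assumption for illustration, as are the sample values; how a CTI gets its subgroup label in the actual engine is not described on the slide.

```python
from collections import defaultdict
from datetime import date


def group_by_subgroup(records):
    """Map subgroup name -> list of (date, indicators), ordered by date.

    Each record is a hypothetical (subgroup, published, indicators) tuple.
    """
    timelines = defaultdict(list)
    for subgroup, published, indicators in records:
        timelines[subgroup].append((published, sorted(indicators)))
    for entries in timelines.values():
        entries.sort(key=lambda entry: entry[0])
    return dict(timelines)


# Made-up CTIs; addresses are from documentation ranges.
records = [
    ("Kelihos", date(2015, 3, 16), {"192.0.2.42"}),
    ("GameOverZeus", date(2014, 6, 2), {"192.0.2.1", "example.net"}),
    ("Kelihos", date(2015, 3, 8), {"192.0.2.47"}),
]

for subgroup, timeline in group_by_subgroup(records).items():
    print(subgroup)
    for published, indicators in timeline:
        print("  ", published.isoformat(), indicators)
```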
31. Learning life-span of indicators
How long should an indicator from CTIs be kept?
CTIs related to a specific malware:
CTI at 2/1: IP 1, IP 2
CTI at 2/8: IP 1, IP 3
CTI at 2/15: IP 1, IP 4
CTI at 2/22: IP 1
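A life-span can be learned by recording when each indicator is first and last seen. The sketch below reads the slide's figure as four weekly CTIs (2/1, 2/8, 2/15, 2/22) in which IP 1 appears every week and IP 2, IP 3, IP 4 appear once each; that reading, the year, and the name `learn_life_spans` are assumptions.

```python
from datetime import date

# Four weekly CTIs for one malware; the year is an assumption.
CTIS = [
    (date(2017, 2, 1), ["IP 1", "IP 2"]),
    (date(2017, 2, 8), ["IP 1", "IP 3"]),
    (date(2017, 2, 15), ["IP 1", "IP 4"]),
    (date(2017, 2, 22), ["IP 1"]),
]


def learn_life_spans(ctis):
    """Return indicator -> (first seen, last seen, life-span in days)."""
    seen = {}
    for published, indicators in ctis:
        for indicator in indicators:
            first, last = seen.get(indicator, (published, published))
            seen[indicator] = (min(first, published), max(last, published))
    return {
        indicator: (first, last, (last - first).days)
        for indicator, (first, last) in seen.items()
    }


for indicator, (first, last, days) in sorted(learn_life_spans(CTIS).items()):
    print(indicator, first.isoformat(), last.isoformat(), days)
```

Here IP 1 gets a life-span of 21 days, while IP 2, IP 3, and IP 4 get 0 days because each is seen in a single CTI.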
32. Real example 2 (1/2): Kelihos botnet
[Figure: Life-span of botnet indicators (IP addresses) of the Kelihos botnet in 2015, plotted by month from Jan to Dec]
97.5% vanished within 4 weeks
11 (of 39,937) lived for more than 46 weeks
xx.xx.xx.41: 4/13 - 4/14
xx.xx.xx.42: 3/16
xx.xx.xx.46: 3/28 - 6/19
xx.xx.xx.47: 3/8 - 3/13
xx.xx.xx.48: 5/21 - 5/22
xx.xx.xx.51: 5/1 - 6/14
33. Weighting indicators
Compare IP addresses and domains across multiple subgroups
Contrast Set Mining [Bay et al. 2001]
Emerging Patterns [Dong and Li 1999]
[Figure: itemset A (IP, domain) is compared between DB 1 and DB 2 (malware, campaign); it appears in one DB and not in the other, so itemset A makes it possible to identify that DB]
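One way to weight an indicator by comparing subgroups is the growth rate used in emerging patterns: the ratio of the indicator's support in a target subgroup to its support in a background subgroup. The sketch below is a single-item simplification under assumptions, not the engine from the talk; the function names and sample subgroups are hypothetical.

```python
import math


def support(indicator, ctis):
    """Fraction of CTIs (each a set of indicators) containing the indicator."""
    if not ctis:
        return 0.0
    return sum(indicator in cti for cti in ctis) / len(ctis)


def growth_rate(indicator, background, target):
    """Growth rate of an indicator from the background to the target subgroup.

    0 if absent from the target, infinity if present only in the target,
    otherwise the ratio of the two supports.
    """
    s_background = support(indicator, background)
    s_target = support(indicator, target)
    if s_target == 0:
        return 0.0
    if s_background == 0:
        return math.inf
    return s_target / s_background


# Made-up subgroups; each CTI is a set of indicators.
db1 = [{"192.0.2.1", "192.0.2.2"}, {"192.0.2.1"}]
db2 = [{"192.0.2.2"}, {"192.0.2.3"}]

for indicator in ["192.0.2.1", "192.0.2.2", "192.0.2.3"]:
    print(indicator, growth_rate(indicator, background=db2, target=db1))
```

An infinite growth rate corresponds to the figure's case of an itemset that appears in one DB and never in the other, so seeing it is enough to identify that subgroup.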
34. IP addresses shared by multiple malware
More than 99%: Single subgroup
Less than 1%: Multiple subgroups (456 / 58,048: 0.79%)
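Finding the indicators that occur in more than one subgroup is a small set operation. The sketch below uses made-up subgroups and a hypothetical function name, and also rechecks the slide's ratio of 456 out of 58,048 IP addresses.

```python
from collections import defaultdict


def shared_indicators(subgroups):
    """Return indicator -> set of subgroup names, for indicators in 2+ subgroups."""
    owners = defaultdict(set)
    for name, indicators in subgroups.items():
        for indicator in indicators:
            owners[indicator].add(name)
    return {ind: names for ind, names in owners.items() if len(names) > 1}


# Made-up subgroups; addresses are from documentation ranges.
subgroups = {
    "GameOverZeus": {"192.0.2.1", "192.0.2.9"},
    "Sality": {"192.0.2.9"},
    "Tinba": {"192.0.2.9", "192.0.2.5"},
}

shared = shared_indicators(subgroups)
total = len(set().union(*subgroups.values()))
print(len(shared), "of", total, "indicators appear in multiple subgroups")

# The ratio reported on the slide, as a percentage.
slide_share = round(100 * 456 / 58048, 2)
print(slide_share)
```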
35. Real example 4 (1/2): Monitoring IP addresses that could potentially be used for malicious activities
[Figure: timeline (2014, 2015, 2016, at present) showing when an IP address appeared in CTIs for GameOverZeus, Sality, CryptoWall, Tinba, and DGA]
36. Conclusion
1. Treasure is buried in CTIs
2. Talented guides are needed to search for the treasures
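This slide introduces work still under consideration on proactive defense using special indicators that reveal attackers' tendencies.
This example shows IP addresses that appeared in the CTIs of multiple malware families from 2014 to 2015.
Each square is a specific IP address (the values are withheld out of consideration for the vendor whose data we are using).
The colored areas inside a square show that the IP appeared in CTI.
An IP like this one, which has been observed in the activities of multiple malware families in the past, may well be observed in other activities too, so we monitor it.
Incidentally, while this material was being prepared, an analysis by a FireEye analyst revealed that this IP is presumed to have already been a sinkhole around 2014. It does not seem to have been used directly as attack infrastructure, but it appears to have become an IP that responds to various malicious activities.
We used PassiveTotal, a passive DNS service, to verify the number of domains associated with the IP address described on the previous slide.
The left plot shows the number of registered domains on the last day of each quarter.
The right plot shows the number of registered domains per day in June 2016.
If the IP identified through learning on data up to 2015 had been monitored in 2016, the initial activity of the Locky spam in June 2016 could have been caught.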