The document discusses the need for multi-disciplinary intelligence production teams to help address challenges posed by increasing data volumes and proposes integrating experts from different fields like IT, software, statistics and intelligence to work together on tackling complex problems. It provides examples of how such integrated teams could support mission requirements by developing new processes, data products, tools and visualizations to gain actionable insights from large and diverse datasets. The document also outlines some accomplishments of integrated data analysis teams in supporting organizations like DEA and DoD with detecting illicit activity and identifying unknown threats.
1. Future Concepts in Intelligence: Multi-Discipline Intelligence Production Teams
Bruce Goldfeder, CSSLP
September 26, 2012
http://www.data-tactics.com
Data Tactics Corporation Proprietary Material
2. Paradigm Shift in the Intel Ecosystem
• Data Deluge
– Sensors and Sensor Data are increasing at exponential rates
– Move beyond traditional sources of data
– “The storm of data is 1500% heavier than it was just five years ago while
our ability to process, exploit and disseminate has increased about 30%”
Gen Robert Kehler, USSTRATCOM, 2011
3. Big Data Opportunities
Bruce Weed, IBM Corporation
4. Big data—a growing torrent
• $600 to buy a disk drive that can store all of the world’s music
• 5 billion mobile phones in use in 2010
• 30 billion pieces of content shared on Facebook every month
• 40% projected growth in global data generated per year vs. 5% growth in global IT spending
• 235 terabytes data collected by the US Library of Congress by April 2011
• 15 out of 17 sectors in the United States have more data stored per company than the US
Library of Congress
McKinsey, Big Data Report, 2011
5. Work Smartly with Data
“There’s a method to solving data problems that avoids the big, heavyweight
solution, and instead, concentrates on building something quickly and
iterating. Smart data scientists don’t just solve big, hard problems; they
also have an instinct for making big problems small.
We call this Data Jujitsu: the art of using multiple data elements in clever
ways to solve iterative problems that, when combined, solve a data
problem that might otherwise be intractable.”
DJ Patil, Data Jujitsu: The Art of Turning Data into Product, 2012
8. Integrated Data Team
• Intelligence teams tackling the hard problems
– Senior members
– Mixture of IT, Software, Statistics, and Intelligence
SMEs
– Serve as the Vanguard for creating new
• Processes
• Actionable Data Products
• IT Tools
• Visualizations
9. Left and Right Brain
• Disciplined methods of traditional data mining
accelerated with iterative and rapid “what ifs”
• Requirement for unreasonable input – challenge
existing truths to find new patterns
• Intimate knowledge of the mission problems that
analytics or predictive analysis are addressing
• Ability to communicate findings using the
customer's language
• Original visualizations required to convey abstract
and complicated results
11. DARPA Example
• Integrated Team Supporting Theater Commander
– Retired Special Operator
– Social Scientist
– Quantitative Mathematicians
– Software Developer
– Data Scientist
– IT, Database, and UI personnel
12. Threat Finance Analytics
Diagram labels: State Sponsorship, Drug Economy, Foreign Aid, Corruption, Threat Finance, Violence, Capital Flight
Who Is Interested?
• CJ-2/CJIOC-A
– Direction from BG Fogarty to support ATFC and Shafafiyat (Jan 2011)
• Afghan Threat Finance Cell
– DEA-led fusion center, Treasury and DoD are deputy leads
– Active feedback loop with DEA Office of Financial Operations
• CJIATF-SHAFAFIYAT (BG McMaster)
• NSA FTM Analysts
State of the Art
• Highly manual analysis
• No single agency has full picture
• Technologies are limited
Automated tools for rapid analysis with massive multi-int data
13. ATFC Data
80,000+ spreadsheets
Millions of records with variable structure
Conduit | Description | Accts | Interval | Records | Refine Stage
Shaheen Exchange (aka Central Accounts) | Physical storefronts of the exchange. Branches in and out of Afghanistan. | 95 | 2001-2010 | ~1.2 million | Two stages of data cleaning complete; third stage necessary
Shaheen Exchange Daily Balances | | | | ~1.8 million | First stage of data clean in progress
Hawala Accounts (aka “B” Computer) | Dubai-based hawala accounts, centrally maintained. | 390 | 1998-2010 | 434,401 | Two stages of data cleaning complete
Western Union | Shaheen Exchange is a Western Union sub-agent. | UNK | 2001-2010 | 106,176 | Two stages of data cleaning complete
“T” Accounts | T accounts are debits, loans and payables to the Shaheen Exchange in Dubai | 555 | 2000-2010 | ~421,575 | Initial specs and setup – high priority
“L” Accounts | L accounts are bank accounts associated with the Sherkhan group of companies | 86 | 2000-2010 | ~337,260 | Initial specs and setup – high priority
AFRAT | Records and stores transactions of international exchange branch locations | UNK | 2004-2010 | Unknown | Not started
14. Accomplishments
• Toolset for faster DEA Shaheen Exchange data analysis
• Geolocated 9 additional branches in AfPak region that DEA did
not know existed; and 45 overall worldwide
• DEA work results from using our data
– Identified transactions with several banks in violation of OFAC
sanction designation
– Known cash courier Mr. X (name classified) under
investigation as a result
15. Accomplishments (cont.)
• Fast query of stacked large data sets with a user-friendly search and visualization tool
• Tools immediately put to use by DEA/ATFC in support of active and historical criminal cases
• Saved one 24/7 man-year of work that would have been spent simply scanning records
Data Resolution
• Cleaned 14,538 spreadsheets
– 20% of the data
– Sheets prioritized by user interest
– Orders of magnitude faster processing for threat finance analysis
• Resolved ~88k entities
– 12% improvement
Country: 947 -> 490
Technique | Rows Modified
Neighbor | 55K – 52%
Ngram | 10K – 9%
Metaphone | 97K – 92%
Western Union: Original number of unique entries in the “country” field was 4.5 times the actual number of countries in the world!
Proving capabilities to partners in theater and in CONUS has enabled trust and data acquisition
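The Data Resolution slide names "Ngram" as one of the techniques used to collapse variant spellings in the "country" field, but it does not describe how the matching worked. The sketch below is one plausible reading, not the team's actual method: it compares character bigrams with Jaccard similarity and maps each raw entry to the closest canonical name. The function names, the 0.5 threshold, and the sample country list are all assumptions made for illustration.

```python
from __future__ import annotations


def char_ngrams(text: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a case- and space-normalized string."""
    cleaned = " ".join(text.lower().split())
    padded = f" {cleaned} "  # pad so word boundaries contribute n-grams too
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def resolve_country(raw: str, canonical: list[str], threshold: float = 0.5) -> str | None:
    """Map a raw entry to the most similar canonical name, or None if nothing is close enough.

    The threshold is a hypothetical tuning value, not one taken from the slides.
    """
    raw_grams = char_ngrams(raw)
    best_name, best_score = None, 0.0
    for name in canonical:
        score = jaccard(raw_grams, char_ngrams(name))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical canonical list and raw spreadsheet values, for illustration only.
CANONICAL = ["Afghanistan", "Pakistan", "United Arab Emirates"]

for entry in ["AFGHANISTAN ", "Afganistan", "Pakistn", "U.A.E."]:
    print(repr(entry), "->", resolve_country(entry, CANONICAL))
```

Misspellings and case or whitespace variants resolve to a canonical name, while an abbreviation such as "U.A.E." shares almost no bigrams with its expansion and is left unresolved. That gap is one reason a pipeline might pair n-gram matching with other techniques, such as the phonetic (Metaphone) and neighbor-based passes the slide lists.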
Editor's Notes
The Kabul Bank/Shaheen Exchange information that DARPA assisted in compiling is being used in support of active and historic criminal cases to:
- Identify the persons, means, extent and nature of the over 900 Million USD theft of Kabul Bank funds during 2004-2010.
- Identify terrorism linked persons and entities that conducted financial transactions through Kabul Bank/Shaheen Exchange.
- Identify the informal money transfer systems that financially support transnational criminal organizations and the crimes they commit.
- Identify illegal money service businesses located within the United States and seek the prosecution of the persons operating them.
- Identify the financial transactions and persons who conducted them linked to criminal activity for OFAC sanction designation.
- And other terrorism and criminal activity, i.e. bribery, official corruption,