May 9, 2016
Harnessing Big Data
to Simplify Debugging
Asi Lifshitz, CTO
www.thevtool.com
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
RTL Debugging
• Verification is one of the major bottlenecks towards tape-out
• Debugging failing tests is complex and time-consuming
Source: Wilson Research & Mentor Graphics, 2014
Debugging Today
• Debugging means iterating between the waveforms and the simulation log file
• Simulation log files can reach several GB
Debugging Tomorrow
• Big Data tools can quickly and efficiently extract data from huge log files
• Extracting and manipulating data gets simpler
• Data can be presented graphically
• Shortening the debug time shortens the project schedule and increases the engineer’s productivity
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
Big Data
• Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate
• The term often refers simply to the use of advanced methods for extracting value from data, and seldom to a particular size of data set
Big Data – Cont.
• For some organizations, facing a few gigabytes of data for the first time may trigger a need to reconsider data management options
• For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration
Database
• A database is an organized collection of data
• The data is typically organized in a way that supports processes that require information
• A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data
Database for Log Files
• A database can be used to query a specific record, i.e., a specific message
• However, if some computation is required, a database search engine should be used
– A concrete example that goes beyond the capabilities of a database is when the DV engineer would like to see all messages from time point tp1 to time point tp2
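The tp1-to-tp2 request above amounts to a range query over time points. The following is only a minimal Python sketch of that idea, not a search-engine implementation: it assumes the messages have already been parsed into (time point, text) pairs sorted by time, and the sample data and the name `messages_between` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (time point, message text) pairs, sorted by time as in a log file.
messages = [
    (1000, "reset released"),
    (2500, "APB write to sflash_reg.enable"),
    (4498000, "Sent data packet mismatch"),
    (5000000, "test finished"),
]

def messages_between(msgs, tp1, tp2):
    """Return all messages with tp1 <= time point <= tp2."""
    times = [t for t, _ in msgs]
    lo = bisect_left(times, tp1)    # first index with time >= tp1
    hi = bisect_right(times, tp2)   # first index with time > tp2
    return msgs[lo:hi]

selected = messages_between(messages, 2000, 4500000)
for time_point, text in selected:
    print(time_point, text)
```

Binary search keeps the lookup cheap even when the list of time points is very long, which is the property a range query needs on a multi-gigabyte log.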
Database Search Engine
• A search engine allows the user to search for information using simple keywords
Lucene
• A free and open-source database search engine, originally written in Java
• Has been ported to Delphi, Perl, C#, C++, Python, Ruby, and PHP
• Suitable for any application that requires full-text indexing and searching capability
• The core of its logical architecture is the idea of a document containing fields of text
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
Lucene for Verification
• A simulation log file is a structured text file, and as such it can be indexed
• Once it is indexed, the Lucene API can be used to search for all the "interesting" events that are needed for debugging a failing test
UVM
• The Universal Verification Methodology (UVM) is a standardized methodology for verifying integrated circuit designs
• More than 70% of the industry has adopted UVM, and the numbers will only grow with time
Source: Wilson Research & Mentor Graphics, 2014
UVM Messages
• A UVM-based simulation log contains UVM messages that usually have the following format:
– Verbosity
– Filename(line)
– Time point
– Emitter
– Message
UVM Message Example
• UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3
• UVM_ERROR is the verbosity (or severity)
• /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) is the filename(line)
• @ 4498000 is the time point
• uvm_test_top.env.sb is the emitter of the message
• [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3 is the message
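The breakdown of the example message can be sketched with a regular expression. This is a minimal Python sketch under assumptions: the pattern `UVM_MSG` and the function `parse_uvm_message` are hypothetical names, the pattern is written only against the single example above (real simulators may format messages differently), and filename(line) is split into two fields for convenience.

```python
import re

# Assumed pattern for the layout shown in the example message; it is not an
# official UVM grammar and may need adjusting for a given simulator.
UVM_MSG = re.compile(
    r"(?P<verbosity>UVM_\w+)\s+"                # e.g. UVM_ERROR
    r"(?P<filename>\S+)\((?P<line>\d+)\)\s+"    # path/to/file.sv(1863)
    r"@\s*(?P<time>\d+):\s+"                    # @ 4498000:
    r"(?P<emitter>\S+)\s+"                      # uvm_test_top.env.sb
    r"(?P<message>.*)"                          # [ID] free text
)

def parse_uvm_message(text):
    """Split one UVM message into its fields, or return None if it does not match."""
    match = UVM_MSG.match(text)
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    record["time"] = int(record["time"])
    return record

example = (
    "UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/"
    "sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb "
    "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, "
    "but expected 0x532e4cb3"
)
record = parse_uvm_message(example)
for field, value in record.items():
    print(f"{field}: {value}")
```

Storing the time point and line number as integers, rather than text, is what later makes numeric range queries possible.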
Using Lucene for UVM Messages
• Parse the log file so that every message is broken into the aforementioned 5 elements and stored as a record in a Lucene database
• The user can now use the efficient Lucene API to extract information
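To illustrate why indexing makes extraction fast, here is a toy inverted index in Python. It is a stand-in for the concept only and does not use or reproduce the Lucene API; the sample records and the names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records, as they might look after parsing a log file.
records = [
    {"verbosity": "UVM_INFO", "time": 1200,
     "emitter": "uvm_test_top.env.apb", "message": "write 0x1 to sflash_reg.enable"},
    {"verbosity": "UVM_ERROR", "time": 4498000,
     "emitter": "uvm_test_top.env.sb", "message": "Sent data packet contains 0x532e4000"},
    {"verbosity": "UVM_INFO", "time": 4500000,
     "emitter": "uvm_test_top.env.sb", "message": "data packet check done"},
]

def build_index(recs, fields=("verbosity", "emitter", "message")):
    """Map (field, lowercase term) to the set of record ids containing that term."""
    index = defaultdict(set)
    for rec_id, rec in enumerate(recs):
        for field in fields:
            for term in str(rec[field]).lower().split():
                index[(field, term)].add(rec_id)
    return index

def search(index, field, term):
    """Return the sorted ids of records whose field contains the term."""
    return sorted(index.get((field, term.lower()), set()))

index = build_index(records)
error_ids = search(index, "verbosity", "UVM_ERROR")
packet_ids = search(index, "message", "packet")
print(error_ids, packet_ids)
```

The index is built once per log file; after that, each keyword lookup is a dictionary access instead of a scan over the whole file.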
Extracting UVM Records
• Designed to handle huge numbers of records, Lucene returns query results in negligible time
– All messages of a specific verbosity, or of a specific verbosity within some time range
– Messages containing a specific string
– All messages emitted from the APB UVC writing 0x1 to register sflash_reg.enable
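The three kinds of extraction listed above can be expressed as filters over parsed records. The sketch below uses a plain Python scan rather than Lucene, so it shows the shape of the queries and not their performance; the records, the emitter name `apb_uvc`, and the function `query` are hypothetical.

```python
# Hypothetical parsed records; only the third one comes from the example message.
records = [
    {"verbosity": "UVM_INFO", "time": 1200,
     "emitter": "uvm_test_top.env.apb_uvc.driver",
     "message": "Writing 0x1 to register sflash_reg.enable"},
    {"verbosity": "UVM_WARNING", "time": 3000,
     "emitter": "uvm_test_top.env.sb",
     "message": "Queue is almost full"},
    {"verbosity": "UVM_ERROR", "time": 4498000,
     "emitter": "uvm_test_top.env.sb",
     "message": "Sent data packet contains 0x532e4000, but expected 0x532e4cb3"},
    {"verbosity": "UVM_ERROR", "time": 6000000,
     "emitter": "uvm_test_top.env.sb",
     "message": "Timeout waiting for response"},
]

def query(recs, verbosity=None, emitter=None, text=None, t_min=None, t_max=None):
    """Return the records matching every criterion that is not None."""
    hits = []
    for rec in recs:
        if verbosity is not None and rec["verbosity"] != verbosity:
            continue
        if emitter is not None and emitter.lower() not in rec["emitter"].lower():
            continue
        if text is not None and text.lower() not in rec["message"].lower():
            continue
        if t_min is not None and rec["time"] < t_min:
            continue
        if t_max is not None and rec["time"] > t_max:
            continue
        hits.append(rec)
    return hits

# Specific verbosity, then the same verbosity within a time range.
all_errors = query(records, verbosity="UVM_ERROR")
errors_in_window = query(records, verbosity="UVM_ERROR", t_min=4000000, t_max=5000000)
# Messages containing a specific string.
with_string = query(records, text="expected")
# Messages from the APB UVC writing 0x1 to sflash_reg.enable (case-insensitive).
apb_enable_writes = query(records, emitter="apb_uvc",
                          text="0X1 to register sflash_reg.enable")
print(len(all_errors), len(errors_in_window), len(with_string), len(apb_enable_writes))
```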
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
Why Graphical Representation?
• It is extremely hard to navigate through the log file in search of the necessary information without being overwhelmed or missing something important
• A graphical representation of data is more natural and much easier to analyze
Graphical Representation of a Log File
Graphical Debugging
• The transition from debugging a textual file to a graphical representation is intuitive
• Problems are traced much faster. The engineer can quickly see what is wrong, when the pattern changes, or when some unexpected event has occurred
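As one illustration of how a pattern change becomes visible, the sketch below bins messages by time and draws a crude text timeline. The slides do not specify the kind of plot used, so this layout is an assumption; the events and the names `bin_counts` and `render` are hypothetical.

```python
from collections import Counter

# Hypothetical (time point, verbosity) pairs extracted from a log file.
events = [
    (100, "UVM_INFO"), (250, "UVM_INFO"), (1100, "UVM_INFO"),
    (2050, "UVM_ERROR"), (2300, "UVM_ERROR"), (2900, "UVM_INFO"),
]

def bin_counts(evts, bin_width):
    """Count messages per (time bin, verbosity)."""
    counts = Counter()
    for time_point, verbosity in evts:
        counts[(time_point // bin_width, verbosity)] += 1
    return counts

def render(evts, bin_width):
    """Return one text row per time bin: '.' per info message, 'E' per error."""
    counts = bin_counts(evts, bin_width)
    last_bin = max(t for t, _ in evts) // bin_width
    rows = []
    for b in range(last_bin + 1):
        info = "." * counts[(b, "UVM_INFO")]
        errors = "E" * counts[(b, "UVM_ERROR")]
        rows.append(f"{b * bin_width:>6} | {info}{errors}")
    return rows

timeline = render(events, 1000)
print("\n".join(timeline))
```

Even in this crude form, the bin where errors start stands out at a glance, which is the effect a real graphical view aims for.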
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
Summary
• The complexity and size of designs these days require new techniques, as the traditional ones impose very long debugging times
• Harnessing tools that are used for processing Big Data can simplify and shorten the debugging of failing tests
• We hope that this work will encourage more research into bringing these strong capabilities to existing and new EDA tools
Thank You

Asi Lifshitz, VP R&D, Vtool
