FLOGS: Flexible logging for SSIS
Petr Podrouzek
Project goals
Design custom logging for ETLs (MS SSIS) that can be
used for the following:
● G1 Error detection – detect warnings and errors during execution
and help with recovery after a failure.
● G2 Performance tuning – statistically analyse data over a longer
period to determine which SSIS tasks use the most resources;
once this is known, developers can try to redesign the SSIS
package to be more resource-conscious.
● G3 Forecasting – by applying trend analysis to the data, one can
predict future performance and resource utilisation.
Target audience (1/2)
● U1 SSIS developers – instead of developing their own
logging, they can just use FLOGS. This basic concept of
code reuse is important for developing maintainable
applications.
● U2 Support teams – the main goal of any logging system is
to make errors during execution easy to find (G1).
Support teams can quickly find out which SSIS jobs
failed, alert the appropriate teams and provide a
detailed error description.
Target audience (2/2)
● U3 Infrastructure teams – capacity planning (G3) is
important to make sure that the infrastructure is ready
for future demand from users. FLOGS will produce
enough data to perform trend analysis and predict
future demand for resources.
● U4 Business users – although most of the data coming from
FLOGS will be analysed by IT experts, some information
can be consumed directly by business users. They
can easily find out what data has been loaded and
when.
Features of the system
● F1 Small footprint – FLOGS needs to be easy to add to
new or existing SSIS packages.
● F2 Minimal impact on performance – any type of monitoring
has an impact on the performance of the monitored process.
The objective is to minimize this performance hit.
● F3 Flexibility – while the system will offer a set of
standardized measures to be logged, it also needs to be
flexible enough to allow custom measures.
● F4 Data ready for analysis – the data should not require
further reprocessing before it can be analysed.
Architecture (1/2)
● A Hashtable is used to store log data in memory.
● The data is then flushed to the database periodically
rather than on every event.
● This decreases the performance hit.
Diagram (flow):
● C# script task: get the time of OnPreExecute and keep it
in the Hashtable.
● C# script task: get the time of OnPostExecute and store
the data to the database.
● Database: create a new log row in the db table;
no update required.
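The flow above can be sketched roughly as follows. This is an illustration in Python with SQLite, not the FLOGS code (which uses C# script tasks in SSIS); the table `event_log`, its columns and both function names are assumptions made for the example. It shows why no update is needed: the start time stays in memory, so the database only ever receives one complete row per task.

```python
import sqlite3
import time
import uuid

# Hypothetical in-memory store (stands in for the Hashtable): task name -> start time.
start_times = {}

def on_pre_execute(task_name):
    """Record the start time in memory only; the database is not touched."""
    start_times[task_name] = time.time()

def on_post_execute(conn, execution_guid, task_name):
    """Take the end time and write one complete row (insert only, no update)."""
    started = start_times.pop(task_name)
    finished = time.time()
    conn.execute(
        "INSERT INTO event_log (execution_guid, task_name, started, finished) "
        "VALUES (?, ?, ?, ?)",
        (execution_guid, task_name, started, finished),
    )
    conn.commit()

# Usage with an in-memory database and a hypothetical task name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (execution_guid TEXT, task_name TEXT, "
    "started REAL, finished REAL)"
)
guid = str(uuid.uuid4())
on_pre_execute("Load DimProduct")
on_post_execute(conn, guid, "Load DimProduct")
rows = conn.execute("SELECT task_name, started, finished FROM event_log").fetchall()
```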
Architecture (2/2)
Failover to text file (added later) – if the database is
not accessible, log data is written to a text file
instead.
N steps kept in memory – keep N steps in the hashtable
before flushing to the database; this allows even more
events to be logged without an impact on
performance.
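The failover and "N steps kept in memory" ideas from Architecture (2/2) might look roughly like this. It is a minimal sketch in Python with SQLite, not the FLOGS implementation; the class `BufferedLogger`, the `event_log` table, the JSON line format and the file name are all assumptions made for the example.

```python
import json
import os
import sqlite3
import tempfile

class BufferedLogger:
    """Keeps up to n_steps entries in memory, then flushes them in one batch."""

    def __init__(self, connect, fallback_path, n_steps=10):
        self.connect = connect              # callable returning a DB connection
        self.fallback_path = fallback_path  # text file used when the DB is down
        self.n_steps = n_steps
        self.buffer = []

    def log(self, task_name, duration_sec):
        self.buffer.append((task_name, duration_sec))
        if len(self.buffer) >= self.n_steps:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        try:
            conn = self.connect()
            with conn:  # commits on success, rolls back on error
                conn.executemany(
                    "INSERT INTO event_log (task_name, duration_sec) VALUES (?, ?)",
                    self.buffer,
                )
        except sqlite3.Error:
            # Database not accessible: append the entries to a text file instead.
            with open(self.fallback_path, "a", encoding="utf-8") as f:
                for entry in self.buffer:
                    f.write(json.dumps(entry) + "\n")
        self.buffer.clear()

def broken_connect():
    """Simulates an unreachable database."""
    raise sqlite3.OperationalError("database unavailable")

fallback = os.path.join(tempfile.mkdtemp(), "flogs_fallback.txt")
logger = BufferedLogger(broken_connect, fallback, n_steps=3)
for i in range(3):
    logger.log(f"step {i}", 0.1 * i)
with open(fallback, encoding="utf-8") as f:
    fallback_lines = f.read().splitlines()
```

A larger N means fewer database round trips, at the cost of losing more buffered entries if the process dies before a flush.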
Implementation: ER diagram
● There are two tables – Event and EventDetail.
● They are in a one-to-many relationship, linked through the
ExecutionInstanceGUID field.
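The Event / EventDetail relationship can be illustrated with a small schema sketch. Only the two table names and the ExecutionInstanceGUID field come from the slides; every other column, the use of ExecutionInstanceGUID as the key of Event, and the sample values are assumptions made for the example (here in Python with SQLite rather than SQL Server).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Event (
    ExecutionInstanceGUID TEXT PRIMARY KEY,
    PackageName TEXT                      -- hypothetical column
);
CREATE TABLE EventDetail (
    EventDetailId INTEGER PRIMARY KEY,    -- hypothetical column
    ExecutionInstanceGUID TEXT NOT NULL
        REFERENCES Event (ExecutionInstanceGUID),
    MeasureName TEXT,                     -- hypothetical column
    MeasureValue TEXT                     -- hypothetical column
);
""")

# One Event row with many EventDetail rows (made-up sample values).
conn.execute("INSERT INTO Event VALUES (?, ?)", ("guid-1", "LoadDimProduct"))
conn.executemany(
    "INSERT INTO EventDetail (ExecutionInstanceGUID, MeasureName, MeasureValue) "
    "VALUES (?, ?, ?)",
    [("guid-1", "RowsLoaded", "100"), ("guid-1", "DurationSec", "0.14")],
)
details = conn.execute(
    "SELECT e.PackageName, d.MeasureName, d.MeasureValue "
    "FROM Event e JOIN EventDetail d USING (ExecutionInstanceGUID) "
    "ORDER BY d.EventDetailId"
).fetchall()
```

Storing measures as name/value rows in the detail table is one way to allow custom measures (F3) without changing the schema.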
Implementation: sample code
Brief overview of how the log data is stored in a hashtable; please
download the full code from GitHub.
Performance testing
Impact was measured on rapidly
executing ETLs (many small data
feeds loaded in sequence).
100x loading DimProduct from
AdventureWorks database.
Minimal impact of 4% detected
(14.07 sec vs 13.57 sec).
Chart: execution time in seconds – Logging 14.07, No logging 13.57.
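The 4% figure follows from the two measured times; a short check of the arithmetic (variable names are chosen for the example):

```python
logging_sec = 14.07      # measured with FLOGS logging enabled
no_logging_sec = 13.57   # measured without logging

overhead_sec = logging_sec - no_logging_sec
overhead_pct = overhead_sec / no_logging_sec * 100

print(f"{overhead_sec:.2f} s, {overhead_pct:.1f}%")  # prints "0.50 s, 3.7%"
```

The slide rounds the 3.7% relative overhead up to 4%.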
Final comments
● Please note that this presentation is just a very brief
overview of the FLOGS project.
● Articles about the project are accessible on my
LinkedIn profile.
● The solution can be downloaded from GitHub.
