Key Achievements
During my work at Mimecast
19/09/2013 – 29/09/2014
DBA and performance tuning project
One of the biggest problems we had at Mimecast was system performance. As our Data
Warehouse grew rapidly and became more and more complex, it was difficult to
monitor and control performance. Some procedures took a long time to execute,
which increased the total time needed to refresh the data. We had errors caused by lack of
memory and by hard disk I/O contention (latches). Users reported that dashboards took a very
long time to load. The underlying problem was that we had almost no insight into the issues we were facing
in production. As database administration was part of my duties, I initiated a
performance tuning project to develop a set of procedures and Extended Events sessions that
collect various system counters, including CPU, memory and disk usage, store this
information in a database, and analyse it later with SSRS reports and dashboards in
Tableau to get a complete performance picture of the SQL Server environment.
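For illustration, a minimal sketch of this kind of collection is shown below. The session name, target file path and logging table (dbo.PerfCounterLog) are hypothetical examples, not the project's actual objects.

-- Hypothetical Extended Events session capturing statements that run
-- longer than 5 seconds (the duration predicate is in microseconds).
CREATE EVENT SESSION [PerfMonitoring] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (ACTION (sqlserver.database_name, sqlserver.sql_text)
     WHERE duration > 5000000)
ADD TARGET package0.event_file
    (SET filename = N'D:\XEvents\PerfMonitoring.xel')
WITH (STARTUP_STATE = ON);
GO
ALTER EVENT SESSION [PerfMonitoring] ON SERVER STATE = START;
GO
-- A scheduled procedure can snapshot system counters into a logging
-- table for later analysis in SSRS reports and Tableau dashboards.
INSERT INTO dbo.PerfCounterLog (CaptureTime, ObjectName, CounterName, CounterValue)
SELECT SYSDATETIME(), RTRIM(object_name), RTRIM(counter_name), cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Page life expectancy', N'Batch Requests/sec');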
We found that the performance issues fell into three main groups:
1. Tableau performance
2. SQL Server performance
3. Network performance
1. Tableau performance
Tableau Data Extract project
Most of our dashboards are very complex and highly interactive, and all of them were live-connected to our SQL Server. Since
Tableau is only as fast as its data source, we needed to decide whether a live connection was the best access method for our
dashboards and, where it was not, replace live connections with Tableau Data Extracts (TDEs).
I developed a C# script that generates TDE files using the Tableau Data Extract API. The files are then published to the Tableau
Server, and all data sources using Tableau extracts are refreshed. This resulted in a significant improvement in dashboard load
times.
Tableau Availability project
Our Tableau data is replicated across different Tableau servers, but due to the high licence cost of Tableau's failover solution we
needed an in-house, cost-effective custom failover solution, without compromising quality, that allows us to switch
between our Tableau servers automatically in case one of them fails. This solution keeps our Tableau
data safe and our BI dashboards available even in the event of a major failure. We also manage Tableau backups,
making it easy for us to restore when needed, including point-in-time recovery.
The solution I developed includes the following components:
 A nightly backup and maintenance job on the primary server
 A job restoring the primary's backup on the secondary server
 A script that starts the secondary Tableau server when the primary server goes offline or becomes unreachable
I suggested the following future developments for this subproject:
 Create a landing page on the corporate portal/website from which all Tableau users are redirected to the active server
 Consider a replication solution for the PostgreSQL database underlying Tableau
 Introduce a VM-level availability check by pinging the remote Tableau server and running remote PowerShell
commands if the VM is not available
2. SQL Server performance
Suggestions I proposed for the three main bottleneck areas:
Execution plan optimisation
 Review existing index usage and introduce new indexes
 Introduce natively compiled procedures (CLR)
Memory
 Use SSIS where possible, as it uses its own memory more efficiently
 Improve tempdb performance by storing it on flash storage and by using the new (2014) ‘sort in tempdb’ feature
 Introduce in-memory OLTP tables
IO (latches)
 Introduce partitioned views, with the physical fact tables stored in different filegroups on different disks (a minimal sketch follows)
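As an illustration, a partitioned view of this kind might look like the sketch below; the table, column and filegroup names are hypothetical.

-- Yearly fact tables placed on different filegroups (and disks); the
-- CHECK constraints let the optimiser skip tables that cannot match a
-- query's date predicate.
CREATE TABLE dbo.FactSales2013 (
    SaleDate date  NOT NULL CHECK (SaleDate >= '20130101' AND SaleDate < '20140101'),
    Amount   money NOT NULL
) ON FG_2013;

CREATE TABLE dbo.FactSales2014 (
    SaleDate date  NOT NULL CHECK (SaleDate >= '20140101' AND SaleDate < '20150101'),
    Amount   money NOT NULL
) ON FG_2014;
GO
CREATE VIEW dbo.FactSales AS
SELECT SaleDate, Amount FROM dbo.FactSales2013
UNION ALL
SELECT SaleDate, Amount FROM dbo.FactSales2014;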
As a part of this project, I completed a subproject called SSIS Lineage. The result was a new version of SSIS event handlers, which allows us to capture detailed
audit information about the execution of a package.
I also developed Business Intelligence documentation, which includes:
• DWH documentation (stage, DWH tables, views)
• Processing logic (stored procedures)
• SSAS documentation (cubes, measure groups, dimensions)
Integration solution
When I joined Mimecast, they needed to improve the integration stage of their BI system. Massive data requirements put
significant strain on our ability to load and process data quickly and created operational bottlenecks associated with data
loading and processing. As these operational challenges had a direct impact downstream on the Data Warehouse, we decided
to replace the existing integration approach with a new one based on vertical data partitioning schemes.
The solution I developed provides real-time integration with 305 tables of our source system, NetSuite. Apart from
partitioning, it uses MERGE statements, the Slowly Changing Dimension transformation and a SQL Agent schedule that
executes a job in an indefinite loop.
By implementing the integration solution I developed, we were able to make significant productivity enhancements in
data loading and maintenance, which allowed us to shorten the refresh time from 25 to 10 minutes. The scalability and
reliability of the integration solution enabled us to meet our data management challenges. The solution also keeps the
staging data available: stage tables are no longer locked for reading, because loading data with the partition-switching
technique is a metadata operation, which completes practically instantly. A minimal sketch of this pattern follows.
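The load pattern can be sketched roughly as follows, assuming hypothetical, identically structured stage tables (the actual solution loads 305 NetSuite tables this way).

-- Load into a side table, then switch it in; both SWITCH statements are
-- metadata-only, so readers are blocked for an instant rather than for
-- the duration of the load. A SWITCH target must be empty.
TRUNCATE TABLE dbo.StageCustomers_Load;

INSERT INTO dbo.StageCustomers_Load (CustomerId, Name, ModifiedDate)
SELECT CustomerId, Name, ModifiedDate
FROM NetsuiteSource.dbo.Customers;          -- extracted source data

BEGIN TRAN;
    ALTER TABLE dbo.StageCustomers      SWITCH TO dbo.StageCustomers_Old;
    ALTER TABLE dbo.StageCustomers_Load SWITCH TO dbo.StageCustomers;
COMMIT;

TRUNCATE TABLE dbo.StageCustomers_Old;      -- keep the target empty for the next run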
Measures I suggested for future development in order to improve staging integration:
 Set the SQL Server memory limit to unlimited (let SQL Server use all available memory)
 Replace Lookup tasks with MERGE statements based on the Kimball methodology (see the sketch after this list)
 Re-develop existing SSIS packages to use a streaming approach instead of a memory-based one, replacing data flow tasks
with ones that process the data flow row by row rather than loading the whole data set into the buffer
 Replace the SQL Server Destination step in Data Flow tasks with the OLE DB Destination
 Rewrite the stored procedure loading fact tables into the DWH, removing the CurrentRecord column from a clustered index
and including this column in a non-clustered index instead, to eliminate the ‘Halloween effect’ which significantly decreased
the performance of the Data Warehouse
 Use alternative methods to download data into the staging phase (RESTlets)
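The Lookup-to-MERGE replacement mentioned above can be sketched as follows; the dimension and stage tables are hypothetical examples of the Kimball Type 1 (overwrite) pattern.

-- Set-based upsert replacing a row-by-row SSIS Lookup task.
MERGE dbo.DimCustomer AS tgt
USING dbo.StageCustomers AS src
    ON tgt.CustomerId = src.CustomerId
WHEN MATCHED AND tgt.Name <> src.Name THEN
    UPDATE SET tgt.Name        = src.Name,
               tgt.UpdatedDate = SYSDATETIME()
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerId, Name, UpdatedDate)
    VALUES (src.CustomerId, src.Name, SYSDATETIME());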
Data Warehouse Solutions - Service Delivery Dashboard
I was involved in several BI projects in which I developed Data Warehouse solutions to expose data (data marts) in Tableau
dashboards. Developing a solution for a dashboard involves preparing specific datasets that demonstrate specific
concepts. A Tableau developer then uses such a dataset (data mart) to create example proof-of-concept dashboards for
clients and management. The general requirement for the data marts I had to develop was to ensure that the data is easy to use yet
functionally rich. To meet this requirement, a Business Intelligence developer needs to understand the complex
underlying data: although we had all the data from our source system in our Data Warehouse, turning this data into
meaningful business information presented challenges.
One of the projects I was involved in was the Service Delivery Dashboard. One challenge I faced during this project was
calculating the First Time Response metric: the number of business hours between the ‘Case opened’ time and the ‘Response sent from
Mimecast’ time. The metric should take into account the customer’s local time, including daylight saving, weekends and
public holidays, whereas the timestamps captured in our source system, NetSuite, reflect the GMT time zone. What I had to
develop was a solution that converts GMT to the customer’s local time and calculates the number of business hours
between request and response.
The solution I developed involves extracting data by web scraping various public websites that provide time zone,
daylight saving and public holiday information. I utilised the versatility of the Ruby language with the Nokogiri library and developed a
script that performs all the data collection tasks. I then import the data into the relational database with an SSIS C# script component and
perform various cleansing and conversion transformations.
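Once the scraped calendars are in the database, the metric reduces to counting calendar rows between the two timestamps. A minimal sketch, assuming a hypothetical per-customer table dbo.BusinessHours with one row per local business hour (built from the scraped time zone, DST and holiday data, with hour boundaries stored in GMT/UTC):

-- Count whole business hours between case creation and first response.
SELECT c.CaseId,
       COUNT(*) AS FirstResponseBusinessHours
FROM dbo.Cases AS c
JOIN dbo.BusinessHours AS bh
  ON  bh.CustomerId = c.CustomerId
  AND bh.HourStartUtc >= c.CaseOpenedUtc
  AND bh.HourStartUtc <  c.FirstResponseUtc
GROUP BY c.CaseId;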
Challenges I faced developing this solution:
• Some holidays are not nationwide and are observed only in selected states/territories.
• Customer location data could, for the most part, only be cleaned manually.
• Daylight saving time information is missing for some countries.
• I had to apply advanced business logic, such as using the secondary address when the registration address is not the actual office
address, for companies registered in areas with special tax regimes (BVI, Jersey, Cayman Islands, etc.).
Data Warehouse Solutions - Sales Enablement KPIs & Lead Indicators Dashboard
I selected and developed a unified view with all the Lead Indicators used in the Sales Dashboard.
The entities I suggested using as Lead Indicators:
 Demos
 Webexes
 Pipeline movements
 Targets
 Meetings
 Activities
 CX Activities
 Deal Registration
 Incentives
 Opportunities
 Quota
 RAMPACT
 Sales Forecast
 Sensitive Connects
 Survey Pre Sales
The solution I developed included the following components (a sketch of the partitioning infrastructure follows this list):
 stage and DWH tables
 switching-partition infrastructure (partition function and scheme)
 SSIS packages
 stored procedures which implement the business logic and load the transformed data into the data mart
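The switching-partition infrastructure can be sketched as follows; the function, scheme and table names, boundary values and filegroup mapping are hypothetical.

CREATE PARTITION FUNCTION pfLoadDate (date)
    AS RANGE RIGHT FOR VALUES ('20140101', '20140201', '20140301');

CREATE PARTITION SCHEME psLoadDate
    AS PARTITION pfLoadDate ALL TO ([PRIMARY]);

-- Stage and DWH tables created on the same scheme can then exchange
-- partitions with ALTER TABLE ... SWITCH, a metadata-only operation.
CREATE TABLE dbo.StageLeadIndicators (
    LoadDate date           NOT NULL,
    Entity   nvarchar(50)   NOT NULL,
    Value    decimal(18, 2) NOT NULL
) ON psLoadDate (LoadDate);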
Data Warehouse Solutions - GTM Dashboard (Go-To-Market Marketing Dashboard)
The goal of this dashboard is to provide multi-level visibility into the efficacy of Mimecast’s
field and group marketing efforts, and to attribute pipeline creation and won business to the
correct source. The GTM Dashboard enabled users to make informed decisions around marketing
campaigns and focus, as well as providing KPIs to track the efficiency and effectiveness of
Mimecast’s marketing against agreed targets.
My role in this project was to develop a Data Warehouse solution, which consists of two
sections:
 Campaign Analysis
 Pipeline Creation Waterfall
One of the challenges I faced during this project was implementing the AMT (Attributable
Marketing Touch) model: an attribution model that allows a 90-day window after a
marketing campaign during which any pipeline creation is attributed to that campaign. There
is no weighting of deals between multiple campaigns, so a single opportunity can be
attributed to multiple campaign touches if they all fall within the 90-day window. In addition,
marketing touches up to 7 days after opportunity creation are given credit, due to current
process/timing issues with marketing data imports.
I also needed to make sure that marketing history data is de-duplicated by campaign
and date, so that multiple contacts from the same lead attending the same event on the same
day are counted as a single marketing touch. A sketch of this logic follows.
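Both rules can be sketched in one query; the table and column names are hypothetical.

-- De-duplicate touches by lead, campaign and day, then give full
-- (unweighted) credit to every touch in the window from 90 days before
-- to 7 days after opportunity creation.
WITH Touches AS (
    SELECT LeadId, CampaignId, CAST(TouchDate AS date) AS TouchDay
    FROM dbo.MarketingHistory
    GROUP BY LeadId, CampaignId, CAST(TouchDate AS date)
)
SELECT o.OpportunityId, t.CampaignId, t.TouchDay
FROM dbo.Opportunities AS o
JOIN Touches AS t
  ON  t.LeadId = o.LeadId
  AND t.TouchDay BETWEEN DATEADD(DAY, -90, CAST(o.CreatedDate AS date))
                     AND DATEADD(DAY,   7, CAST(o.CreatedDate AS date));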
Data Warehouse Solutions - Opportunities Dashboard project
I developed a Data Warehouse solution which meets the business requirements for calculating
conversion rates (the sales funnel): for new business pipeline created in a user-selected time period,
conversion from Lead to Opportunity Creation, from Opportunity Creation to Qualification, and from
Qualified Opportunity to Won Business is calculated for each strand, and overall.
One of the biggest challenges I encountered during this project was developing a custom
integration solution based on NetSuite’s saved search capabilities, using RESTlets, a web service and
XML processing in C# script components. In order to meet the user requirements for this
dashboard, I needed to bring in data joined by our source system, NetSuite. After profiling the data, I
realised that the NetSuite tables in our Data Warehouse could not be used, as they lack the necessary
keys. I developed a custom integration solution using the so-called ‘saved searches’, which can be
called from a C# script component in SSIS; the result is then processed as XML and loaded into our
Data Warehouse.
Another challenge associated with this project was developing a stored procedure
implementing data processing based on very complex business logic.
For example, the Highest Active Status metric for a Closed Lost Opportunity needs
to be the stage the opportunity was at prior to being lost, as sketched below.
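A minimal sketch of that rule, assuming a hypothetical stage-history table:

-- Highest Active Status = the last stage recorded before 'Closed Lost'.
SELECT OpportunityId, Stage AS HighestActiveStatus
FROM (
    SELECT OpportunityId, Stage,
           ROW_NUMBER() OVER (PARTITION BY OpportunityId
                              ORDER BY StatusDate DESC) AS rn
    FROM dbo.OpportunityStageHistory
    WHERE Stage <> 'Closed Lost'
) AS h
WHERE rn = 1;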
Within this project, I also developed an Opportunities OLAP cube, which was used to create
dashboards in Tableau.
SAM integration and Data Warehouse solutions
The aim of the project was to make the best use of the company's existing operational data, held
in silos within SAM, a large internal operational PostgreSQL database containing the detailed
network and operations records on delivered services (pre-Big Data). The idea of using operational
data for sales intelligence was not new: Mimecast had wrestled with this problem for almost ten years
and recognised the need to develop a BI solution to bring this data into the Data Warehouse.
I developed an integration and Data Warehouse solution bringing datasets from SAM into the Data
Warehouse and processing them for use as data sources in various Tableau dashboards. One of the
requirements was to ensure that the solution was scalable enough to incorporate further information
from the source system.
The challenges I faced during this project:
 converting datetimes parsed from varchar into the datetime format, with the conversion depending on the
server, as our development and production servers span different continents (US and UK date formats); a sketch
of this follows the list
 working around buffer limits in SSIS when downloading large amounts of data, by running multiple queries
selected by id or timestamp
 using open source PostgreSQL drivers for C# script components
 using a listener/switch in a C# script component during the SAM migration, to download data
from the new schema when it is ready
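The date-format pitfall in the first challenge can be illustrated with a small sketch (the value is an example): the same varchar parses to different dates on US (mdy) and UK (dmy) servers, so the conversion style must be made explicit rather than left to the server's locale.

DECLARE @raw varchar(10) = '03/09/2014';

SELECT TRY_CONVERT(datetime, @raw, 101) AS UsInterpretation,  -- mm/dd/yyyy -> 2014-03-09
       TRY_CONVERT(datetime, @raw, 103) AS UkInterpretation;  -- dd/mm/yyyy -> 2014-09-03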
Integration solution for Corporate Goals Dashboard
I developed an integration and Data Warehouse solution that processes SAM data and
prepares it to be exposed to Tableau.
The challenges I faced during this project:
 a complex integration solution to download nearly ‘Big Data’ volumes (PostgreSQL) in the most
efficient way
 complex processing logic: flattening, ranking and pivoting (sketched below)
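A rough sketch of the ranking and pivoting step, with hypothetical table, column and service names:

-- Rank each customer's services by usage, keep the top three, and
-- pivot selected service types into columns for Tableau.
WITH Ranked AS (
    SELECT CustomerId, ServiceType, UsageCount,
           RANK() OVER (PARTITION BY CustomerId
                        ORDER BY UsageCount DESC) AS UsageRank
    FROM dbo.SamServiceUsage
)
SELECT CustomerId, [Email] AS EmailUsage, [Archive] AS ArchiveUsage
FROM (SELECT CustomerId, ServiceType, UsageCount
      FROM Ranked
      WHERE UsageRank <= 3) AS src
PIVOT (MAX(UsageCount) FOR ServiceType IN ([Email], [Archive])) AS p;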
Excel integration solution
I developed an integration solution which provides seamless integration of Excel files
containing various manual adjustments into the Data Warehouse.
The solution consists of the following steps:
Uploaded Excel files are checked for integrity.
A user is then emailed the status of the Excel file, indicating whether processing was successful. If
the integrity test failed, the email includes details about the errors: the column in case of a data type
error, the name of the worksheet in case it is missing, etc.
The challenges I had during this project:
 Unable to query Excel with T-SQL via a linked server using the 64-bit driver (a sketch of the ad hoc alternative follows this list).
 If a File System Task is placed after a Data Flow using an Excel connection in the control flow, we get the
error "The file is used by another process". To tackle this issue, I separated the SSIS packages into
separate SQL Agent job steps.
 An additional data validation layer was introduced in the Excel template to eliminate possible
errors at the data entry level. The validation is based on conditional formatting using advanced
Excel formulas.
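For reference, the ad hoc alternative to the linked server can be sketched as follows; this assumes the 64-bit Microsoft.ACE.OLEDB.12.0 provider is installed and ‘Ad Hoc Distributed Queries’ is enabled, and the file path and sheet name are examples.

SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                'Excel 12.0;Database=D:\Uploads\Adjustments.xlsx;HDR=YES',
                'SELECT * FROM [Sheet1$]');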
Merging (deduplicating) project
I developed an integration and Data Warehouse solution as part of the merging project,
which aimed to eliminate duplicate NetSuite records.
Challenge: developing a complex reconciliation stored procedure to detect any deleted,
non-active or invalid customer and contact records (sketched below).
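The core of such a reconciliation can be sketched as follows; the schemas, tables and the inactivity flag are hypothetical.

-- Flag DWH customers that disappeared from, or were inactivated in,
-- the latest NetSuite extract.
SELECT d.CustomerId, 'Deleted in source' AS Issue
FROM dwh.Customers AS d
WHERE NOT EXISTS (SELECT 1
                  FROM stage.Customers AS s
                  WHERE s.CustomerId = d.CustomerId)
UNION ALL
SELECT s.CustomerId, 'Inactive in source'
FROM stage.Customers AS s
WHERE s.IsInactive = 1;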
Price report in SSRS
Challenge: the data needed flattening, and the report was based on multiple nested groups.