A Smart Grid company I worked at was having difficulty with the reliability of the “Data Collectors”, which would collect metering data from Smart Meters in their area and transmit that data up to the utility head office on a daily basis. The promise made to the utility was that their system would read and report 98+% of all installed Smart Meters every day. They were not meeting that goal.
Each Data Collector had a debug port, accessible via telnet, that revealed what the unit was doing. Standard practice was to use ExtraPuTTY to establish the telnet session and capture all collected data to a log file for later examination. It was a small manual chore to set up and examine data from one Data Collector; it was a big manual chore to do that with multiple Data Collectors. Additionally, if the Data Collector rebooted or locked up its telnet, no further data would be collected.
My solution was to create a configurable tool that would log into multiple Data Collectors, log their data with timestamps, and examine the incoming data on-the-fly. Here is the main panel of that tool:
Each horizontal line above represents one Data Collector. Notice the green bars top and bottom: Clicking these scrolls the list of collectors up or down by ten rows. This tool will monitor up to 50 Data Collectors at a time, each being monitored in its own thread. Clicking the “Main Control” from OFF to ON starts the process and the telnets to every Data Collector that has its “Activate” box checked.
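The thread-per-collector arrangement described above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the names `monitor_collector` and `start_monitoring` are hypothetical, and each collector's debug stream is stood in for by any iterable of text lines rather than a real telnet session.

```python
import threading
from datetime import datetime


def monitor_collector(collector_id, line_source, log):
    """Read debug lines from one collector and store each with a timestamp."""
    for line in line_source:
        stamp = datetime.now().isoformat(timespec="seconds")
        log.append(f"{stamp} {collector_id} {line.rstrip()}")


def start_monitoring(collectors):
    """Start one thread per activated collector.

    `collectors` maps a collector ID to (active, line_source), where `active`
    plays the role of the "Activate" checkbox (an assumption of this sketch).
    Returns {collector_id: (thread, log)}.
    """
    running = {}
    for collector_id, (active, line_source) in collectors.items():
        if not active:
            continue  # unchecked rows are skipped
        log = []
        thread = threading.Thread(
            target=monitor_collector,
            args=(collector_id, line_source, log),
            daemon=True,
        )
        thread.start()
        running[collector_id] = (thread, log)
    return running
```

Giving each collector its own thread means a unit that stalls or reboots only blocks its own reader, not the others.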
The “Status” indicates if a connection is active, and the “Collector ID” is just the serial number as indicated in the configuration.
Clicking the “Configure” button on any row brings up a panel like this:
The Device Type indicates the connection type, the Device ID is the serial number of the device, and the IP addresses are given. If the connection is to be an RS-232 port (supported) then the COM port number is used.
Since I was handling a large number of data collectors scattered all over the building, and not all of them belonged to me, I had to keep track of where they were and what I was permitted to use them for by their owners. I had my program track that for me, unit-by-unit.
The “Telnets Successful” is the count of successful telnets made to the device. More than one means there was an interruption and the telnet was re-established, indicating a possible reboot.
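A reconnect loop that keeps such a count might look like the sketch below. It is a guess at the general shape, not the original implementation: `keep_connected`, `connect`, and `handle_line` are hypothetical names, and `connect` is assumed to return an iterable of lines or raise `OSError` when the device cannot be reached.

```python
import time


def keep_connected(connect, handle_line, max_attempts, retry_delay=0.0):
    """Reconnect after every drop; return the number of successful connections."""
    successful = 0
    for _ in range(max_attempts):
        try:
            lines = connect()
        except OSError:
            time.sleep(retry_delay)  # device unreachable, possibly rebooting
            continue
        successful += 1
        try:
            for line in lines:
                handle_line(line)
        except OSError:
            pass  # connection dropped mid-stream; loop around and reconnect
        time.sleep(retry_delay)
    return successful
```

A returned count above one corresponds to the "more than one" case in the text: the session was lost at least once and had to be re-established.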
Clicking the “Graph Selection Panel” button brings up this panel:
During the monitoring there are several items tracked over time. Since the program is not running at the time of this writing, all sample sizes are zero. An interesting graph is the “Activity Graph”, which graphs the lines of debug information minute-by-minute, showing when the collector was more stressed. I tracked line-counts of up to 40,000 lines of debug information per minute on some collectors.
The “Buffer Graph” shows the number of free buffers reported over time, making it easy to detect a buffer leak that might not cause a collector reset for weeks or months.
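The two graphs boil down to simple bookkeeping, which the following sketch illustrates. The function names are hypothetical, and the leak check is a deliberately naive heuristic chosen for this example (free buffers never recover between samples); the original tool may well have relied on visual inspection of the graph alone.

```python
from collections import Counter
from datetime import datetime


def lines_per_minute(timestamps):
    """Count debug lines per minute, the data behind an activity graph."""
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    return dict(sorted(counts.items()))


def looks_like_buffer_leak(free_buffer_samples, min_drop=1):
    """Return True if the free-buffer count only ever falls or stays flat.

    Naive heuristic: every sample is <= the previous one, and the total
    drop from first to last sample is at least `min_drop`.
    """
    if len(free_buffer_samples) < 2:
        return False
    pairs = zip(free_buffer_samples, free_buffer_samples[1:])
    never_recovers = all(later <= earlier for earlier, later in pairs)
    total_drop = free_buffer_samples[0] - free_buffer_samples[-1]
    return never_recovers and total_drop >= min_drop
```

A healthy collector's free-buffer count should dip under load and then recover, so a series that only ever steps downward is the pattern worth flagging long before the unit actually resets.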
The “Show Status” button on any row brings up a panel like this:
This is a summary of statistics and noted problems seen to date. The contents of this window are also preserved in a summary file for this device once per minute, keeping it constantly updated. This information, of course, can be viewed without interrupting the monitoring.
The “Show Trace” button brings up this panel:
When this panel is up on any device, new lines arriving from the device are shown here. It lets you see what is happening in the monitoring without interrupting the monitoring.
The “Notes” button simply brings up the contents of a text file that I can make notes to. It is stored by serial number and will survive a device being removed and restored.
Every time the Debug Monitor is started it makes a new time-stamped logfile folder. When the monitoring is started a folder for the log files is made for each device:
Inside the folder for a device are the log files:
Logfile.log is the global logfile; all debug lines go there. But some lines of debug relate to individual functions we want to track, so those are also stored in a special logfile for that function. That way you can quickly view what happened regarding that function without having its lines “polluted” by lines for other functions.
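The folder layout and the per-function sorting can be sketched as below. Only the file name Logfile.log comes from the description above; the keyword table `FUNCTION_LOGS`, the file names in it, the folder-name format, and the function names are all assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical routing table: keyword found in a debug line -> extra logfile.
FUNCTION_LOGS = {"BUFFER": "Buffers.log", "REBOOT": "Reboots.log"}


def make_device_folder(root, device_id, started):
    """Create <root>/<start timestamp>/<device_id> and return its path."""
    folder = Path(root) / started.strftime("%Y-%m-%d_%H-%M-%S") / device_id
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def log_line(folder, line, now):
    """Append a timestamped line to Logfile.log and any matching function logs."""
    stamped = f"{now.isoformat(timespec='seconds')} {line.rstrip()}\n"
    targets = ["Logfile.log"]
    targets += [name for key, name in FUNCTION_LOGS.items() if key in line]
    for name in targets:
        with open(folder / name, "a", encoding="utf-8") as handle:
            handle.write(stamped)
    return targets
```

Because every line still lands in the global log, the function-specific files are a convenience view and nothing is lost if a keyword is missing from the table.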
Additionally, there are other things going on outside of the debug port telnet, like another telnet made once per minute to query the device for status on some functions. Those queries go to their own log files, like TstBenchRead.txt.
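A once-per-minute status query running alongside the debug capture could be structured like this minimal sketch. The names `poll_status`, `query`, and `record` are hypothetical; `query` stands in for whatever command the second telnet session sends, and `record` for the write to a file such as TstBenchRead.txt.

```python
import threading


def poll_status(query, record, stop, interval_seconds=60.0):
    """Call query() and pass the reply to record() until `stop` is set."""
    while not stop.is_set():
        record(query())
        # wait() returns early if stop is set, so shutdown is not delayed
        stop.wait(interval_seconds)
```

Running this in its own thread with a shared `threading.Event` lets the main control switch stop the poller promptly instead of waiting out the rest of the minute.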
I never got to where I was monitoring more than about twenty collectors at one time; I just did not have access to enough units. However, I found a lot of value in monitoring those twenty, since the failures I was hunting were so intermittent. The ease of restarting the monitoring was wonderful: One click and then watch it happen on the LEDs, as opposed to the long manual process of setting up all those telnets and starting data logging. The reliability of the program allowed me to comfortably go home for a weekend and know that I would have solid data when I arrived back at work on Monday morning.
My extensive pre-sorted logfiles also enabled me to easily analyze problems and file very detailed bug reports, which made it easier for the developers to understand and fix the problems. Bug turn-around was pretty good.
The result of all this tooling was the greatly accelerated testing, diagnosis, and repair of causes of spurious collector reboots and other issues. With the improved reliability of the collectors, the number of smart meters being read on a daily basis shot up to 99.5%, the utility could no longer withhold payment for an installation for lack of performance, and this company experienced its first profitable quarter.
When the company’s management saw the tool I was working on and how it helped their product quality, they agreed that continuing this effort was my mandate and they made allowances to enable me to pursue it. I would say those allowances paid off.

Debug Console Monitor
