Taming the shrew, Optimizing Power BI Options

Taming of the Shrew
Tricks to Optimizing Power BI
Kellyn Pot’Vin-Gorman
TSP, Power BI and AI in Education

Kellyn Pot’Vin-Gorman
Technical Solution Professional at Microsoft, Data Platform in Power BI
and AI
• Former Technical Intelligence Manager, Delphix
• Multi-platform DBA, (Oracle, MSSQL, MySQL, Sybase, PostgreSQL,
Informix…)
• Oracle ACE Director, (Alumni)
• OakTable Network Member
• Idera ACE Alumni 2018
• STEM education with Raspberry Pi and Python, including DevOxx4Kids,
Oracle Education Foundation and TechGirls
• Former President, Rocky Mtn Oracle User Group
• Current President, Denver SQL Server User Group
• Linux and DevOps author, instructor and presenter.
• Blogger (http://dbakevlar.com), Twitter: @DBAKevlar

Gaining just 10% more access to data
can result in over $65 million in
revenue

Our User Story
• User Needs Updated Information from Power BI Report
• User Chooses to Refresh Report
• User Gets in Car to Get Cup of Coffee in Next Town While Waiting for Refresh

POWER BI LANDSCAPE
Finding All the Fish in the Ocean
[Diagram of the components below]
• Relational Data: Oracle, SQL Server, Teradata, Salesforce
• Cloud Data: Azure, AWS, Google
• Other Data: Excel, Access, SharePoint, etc.
• Big Data: Data Lake, Hadoop, Hortonworks
• Data Factory, SQL Server Integration Services
• MODEL & SERVE: Azure Analysis Services, Azure SQL Data Warehouse, Power BI

Power BI is Guilty Until
Proven Innocent

POWER BI LANDSCAPE
Finding All the External Latency
[Diagram of the components below]
• Relational Data: Oracle, SQL Server, Teradata, Salesforce
• Cloud Data: Azure, AWS, Google
• Other Data: Excel, Access, SharePoint, etc.
• Big Data: HDInsight, Data Lake, Hortonworks
• Data Factory, SQL Server Integration Services
• MODEL & SERVE: Azure Analysis Services, Azure SQL Data Warehouse, Power BI

OPTIMIZATION EXERCISE PROCESS
• Power BI Layer. Power BI review steps: inspect the data model, data sets, resources, concurrency, visuals and dashboards. Data modeler to address.
• Data Sources. Identify by type and bring in expertise for each. Bring wait times to the data specialist.
• Network Layer. Bring data to the network specialist.
• Once a layer is verified as a non-issue, move on to the next.
• Repeat and verify resolved.

“TUNE FOR TIME OR YOU’RE
WASTING TIME.”

Why Time Should Be Your Main Focus for Optimization
• A scientific approach to optimization.
• Optimizing on cost or assumptions does not guarantee results.
• Removes finger pointing and the “Blame Game”.
• Simplifies the process of identifying real latency.
• When time is addressed, long-term resolution often follows.
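One way to put "tune for time" into practice is to record elapsed time per layer and rank the layers by their share of the total. The sketch below is a minimal illustration under stated assumptions: the layer names, the `timed` helper, and the sample numbers are all hypothetical and do not come from the presentation.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

samples = []  # list of (layer, seconds) measurements collected by timed()


@contextmanager
def timed(layer):
    """Record the wall-clock seconds spent inside the with-block for one layer."""
    start = time.perf_counter()
    try:
        yield
    finally:
        samples.append((layer, time.perf_counter() - start))


def rank_by_time(measurements):
    """Sum seconds per layer; return (layer, seconds, share) tuples, slowest first."""
    totals = defaultdict(float)
    for layer, seconds in measurements:
        totals[layer] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(layer, seconds, seconds / grand_total) for layer, seconds in ranked]


# Illustrative numbers only, not measurements from the presentation.
example = [("data source", 9.0), ("network", 3.0), ("power bi", 2.0), ("data source", 2.0)]
for layer, seconds, share in rank_by_time(example):
    print(f"{layer:12s} {seconds:6.1f}s {share:6.1%}")
```

Wrapping each step of a refresh in `with timed("network"):` (or similar) fills `samples` with real measurements, so the ranking is driven by data and not by assumptions about which layer is slow.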

Steps for Optimizing Data Sources
• Data sources can be relational databases, big data, CSV/Excel, or structured/unstructured data files.
• If there are onsite or remote specialists available, partner with them to gather distinct data to identify waits and patterns.
• Along with execution plans, tracing can help identify deeper, multi-tier issues that aren’t revealed by traditional performance tools.
• Infrastructure tools, cloud monitoring tools and tracing can also provide more information than traditional tools.

RELATIONAL DATA SOURCES
• Filter early, filter often, before the data is pulled into Power BI.
• Understand the optimizer and plans for queries, and the performance “gotchas” for different database platforms.
• Push calculated columns and measures to the source where possible, shifting the resource usage for the object to the source.
• Add indices, partitioning, etc. to support commonly queried tables.
BIG DATA
• Use HDInsight and/or Azure Data Factory to help manage the sheer quantity of data.
• Manage partitions and prune unnecessary data regularly.
• Make it a goal to migrate from unstructured data to a “pristine” data model.
• Make yourself part of the development process to be aware of changes to what data is being consumed.
• Have a clear and concise list of what data is important to the business vs. what is collected.
ACCESS AND EXCEL/CSV
• Keep Excel sheets and Access tables that are brought into Power BI narrow. Wider tables perform worse.
• Purge or archive unused data from Access, which can slow down refreshes.
• Convert derived values from formulas to static values whenever possible. This removes one conversion step when importing/refreshing to Power BI.
• Avoid multiple volatile functions and array formulas in Excel. This is not the place for these.
• Avoid linked tables in Access with a split database architecture.
• Consider the size of the data with regard to refreshes and how it will impact Power BI performance.
The Network – The Final Bottleneck
[Diagram labels: On-premises data sources, Cloud data sources, SQL DB Managed Instance, SQL Server, Microsoft SQL Server Integration Services, VNET, Power BI, Data User]
Firewall is our best friend and worst enemy

NETWORK
• Networks are still largely limited by “Shannon’s Law”.
• Filter data to avoid creating bottlenecks on the network.
• Become friends with the network admin to isolate issues with firewalls and network bottlenecks.
• Consider how often refreshes are performed, and where the data is being sent from and to.
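“Shannon’s Law” here presumably refers to the Shannon-Hartley theorem, which caps the error-free data rate of a channel at C = B · log2(1 + S/N), where B is bandwidth in hertz and S/N is the linear signal-to-noise ratio. The sketch below turns that into a lower bound on transfer time; the function names and the example figures are illustrative assumptions, and real refreshes will be slower because of protocol overhead, latency, and firewalls.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Shannon-Hartley upper bound on channel capacity, in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def min_transfer_seconds(payload_bytes, capacity_bps):
    """Best-case seconds to move a payload at the given capacity (no overhead)."""
    return payload_bytes * 8 / capacity_bps


# Illustrative figures only: a 1 MHz channel with a linear S/N of 3.
capacity = shannon_capacity_bps(1_000_000, 3)
print(capacity, min_transfer_seconds(250_000, capacity))
```

Because the ceiling is fixed by the channel, the only lever a report author controls is the payload size, which is why filtering before the data crosses the network matters.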

Power BI’s columnar data store makes it forgiving of large data sets.
But… Power BI is dependent upon the data that it sources from, along with multiple other features.
Performance can be hindered by numerous items Power BI is dependent upon:
• Data Model
• Data Size
• Resources Allocated for Processing
• Data Types
POWER BI QUERY EDITOR
• Avoid complex queries in Query Editor; combinations of filter with context transition are some of the worst.
• Don’t use relative date filtering in the Query Editor.
• Keep measures simple initially, adding complexity incrementally.
• Avoid relationships on calculated columns and unique identifier columns.
• Try setting “Assume Referential Integrity” on relationships; this may improve query performance.
• Ensure relationships are set up properly, and use the new many-to-many relationships sparingly.
As You Design Your Reports
• Simplify Data Demands Whenever Possible
• Remove Unused Columns
• Avoid Distinct Counts on Fields with High Cardinality
• Limit Complexity on High Cardinality
• Consider How Often Data Refresh is Required
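To find the high-cardinality fields mentioned above, one option is to profile a sample of the data before it is modeled. The sketch below computes the ratio of distinct values to rows for each column; the sample rows and the 0.9 threshold are hypothetical choices for illustration, not guidance from the presentation.

```python
def column_cardinality(rows):
    """Return {column: distinct values / row count} for a list of dict rows."""
    if not rows:
        return {}
    distinct = {}
    for row in rows:
        for column, value in row.items():
            distinct.setdefault(column, set()).add(value)
    return {column: len(values) / len(rows) for column, values in distinct.items()}


def high_cardinality_columns(rows, threshold=0.9):
    """List columns whose distinct-value ratio meets or exceeds the threshold."""
    ratios = column_cardinality(rows)
    return sorted(column for column, ratio in ratios.items() if ratio >= threshold)


# Hypothetical sample rows for illustration.
sample = [
    {"order_id": 1, "region": "West"},
    {"order_id": 2, "region": "East"},
    {"order_id": 3, "region": "West"},
    {"order_id": 4, "region": "East"},
]
print(column_cardinality(sample))
print(high_cardinality_columns(sample))
```

Columns flagged this way are candidates for removal if unused, or for avoiding distinct counts and complex calculations if they must stay.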

VISUALS
• Filter early and filter carefully.
• You may want to switch off interaction between visuals; it reduces the query load as users cross-highlight.
• Always test the performance impact of the row-level security roles that your users will use.
• To ensure long-running queries won’t monopolize the system, there is a 225 second timeout on visuals. Design visuals with as much simplicity as possible to avoid this threshold.

Tips for Dashboards
• Eight visuals maximum in a dashboard or report.
• Set filters in the filter pane of reports.
• Understand where performance hits are coming from.
• Test and track refreshes over time for reports and dashboards – don’t assume.
• Don’t build complicated measures or aggregates at the data model layer.

Power BI Tips
• Narrow tables are faster.
• Integers over strings (text).
• Slicers use multiple steps (queries) to process.
• Use powerful DAX functions that can eliminate complex or poorly performing expressions.
• Certain filters can hinder performance if they examine each row. Identify when this occurs.
• Simplify queries whenever possible.
• Follow best practices for relationships for your data model.
• Add indexes and foreign keys whenever possible.

Resource Constraints Can Hinder Performance, Too!
• Consider increasing memory allocated for data loads.
• Increase the data cache for large processing.
• Monitor and alert on thresholds for demands for enterprise reporting.

Gotchas With Published Reports
Power BI uses premium memory when:
• Loading datasets*
• Refreshing a dataset (scheduled and on-demand)*
• Running report queries
Poor performance can result if a dataset is evicted by the least recently used (LRU) policy while it is still in demand.
*Remember that datasets in memory may be larger than when stored on disk, and do not confuse premium memory with Power BI Premium.
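The LRU behaviour mentioned above can be pictured with a toy model: when a newly loaded dataset does not fit, the least recently used datasets are evicted first and must be reloaded the next time they are queried. This sketch is only a conceptual illustration; the class, the dataset names, and the sizes are hypothetical, and it is not a description of how the Power BI service actually manages memory.

```python
from collections import OrderedDict


class LruMemory:
    """Toy model of a fixed memory budget with least-recently-used eviction."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.loaded = OrderedDict()  # dataset name -> size in MB, oldest first

    def touch(self, name, size_mb):
        """Use a dataset, loading it if needed; return the names evicted to make room."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return []
        evicted = []
        while self.loaded and sum(self.loaded.values()) + size_mb > self.capacity_mb:
            victim, _ = self.loaded.popitem(last=False)  # drop the oldest entry
            evicted.append(victim)
        self.loaded[name] = size_mb
        return evicted


# Hypothetical datasets and sizes for illustration.
memory = LruMemory(capacity_mb=100)
memory.touch("sales", 60)
memory.touch("hr", 30)
memory.touch("sales", 60)            # "sales" becomes most recently used
print(memory.touch("finance", 40))   # evicts the least recently used dataset
print(list(memory.loaded))
```

In this model, a report that queries an evicted dataset pays the full load cost again, which is one way the refresh and query patterns of other reports can slow yours down.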

let
    // Load the log file as five columns; "<logfile>" is a placeholder for the file path.
    Source = Csv.Document(File.Contents("<logfile>"),5,"",null,1252),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column4"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column3", "PID"}, {"Column1", "Process Type"}}),
    // Column5 holds the payload: strip "{Start:" and split off the start time at the first ",Action:".
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns","{Start:","",Replacer.ReplaceText,{"Column5"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value", "Column5", Splitter.SplitTextByEachDelimiter({",Action:"}, QuoteStyle.Csv, false), {"Column5.1", "Column5.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column5.1", type datetime}, {"Column5.2", type text}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type1",{{"Column5.1", "Start"}}),
    // Strip the closing brace, then split the message from the duration at the last ",Duration:".
    #"Replaced Value1" = Table.ReplaceValue(#"Renamed Columns1","}","",Replacer.ReplaceText,{"Column5.2"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Replaced Value1", "Column5.2", Splitter.SplitTextByEachDelimiter({",Duration:"}, QuoteStyle.Csv, true), {"Column5.2.1", "Column5.2.2"}),
    // Drop the "00:00:" prefix so Duration converts to a number of seconds (assumes durations under one minute).
    #"Replaced Value2" = Table.ReplaceValue(#"Split Column by Delimiter1","00:00:","",Replacer.ReplaceText,{"Column5.2.2"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Replaced Value2",{{"Column5.2.2", "Duration"}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Duration", type number}}),
    #"Renamed Columns3" = Table.RenameColumns(#"Changed Type2",{{"Column5.2.1", "Message"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns3",{"Process Type"})
in
    #"Removed Columns1"
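For readers who want to check the parsing logic outside Power BI, the payload steps of the query above can be mirrored in a few lines of Python. This is a rough sketch under assumptions: the payload layout `{Start:...,Action:...,Duration:hh:mm:ss}` is inferred from the delimiters the query uses, the sample string is invented, and the start time is kept as text instead of being converted to a datetime. Unlike the query, it converts the full `hh:mm:ss` value, so durations of a minute or more also parse.

```python
def parse_payload(payload):
    """Split one log payload into Start, Message and Duration (seconds)."""
    body = payload.strip()
    if body.startswith("{Start:"):
        body = body[len("{Start:"):]
    if body.endswith("}"):
        body = body[:-1]
    # Start ends at the first ",Action:"; Duration begins at the last ",Duration:".
    start, _, rest = body.partition(",Action:")
    message, _, duration = rest.rpartition(",Duration:")
    hours, minutes, seconds = duration.split(":")
    total_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return {"Start": start, "Message": message, "Duration": total_seconds}


# Invented sample payload, shaped after the delimiters used in the query.
sample_payload = "{Start:2019-03-01T10:15:00,Action:Refresh model,Duration:00:00:02.5}"
print(parse_payload(sample_payload))
```

A payload without a `,Duration:` segment raises a `ValueError` in this sketch, which is a reasonable signal that the line does not match the assumed layout.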

Term              Function                             Log Source
SimpleDocument    Local Object                         Multiple logs
RemoteDocument    Remote Excel or CSV file             Multiple logs
PackageStorage    Disk waits: database, often Access   Power BI logs
PBIDashboard      Dashboard waits                      PBI logs, inspect message
PBIVisualConsent  Row level permissions                PBI logs, inspect message
PBIData.get       Get Data waits                       PBI logs, inspect message
PBITrustedVisual  Open visual view                     PBI logs
PBIModuleLoad     Load of dashboard                    PBI logs
FirewallDocument  Cloud or remote document             MSMdsrv logs
https://blogs.msdn.microsoft.com/samlester/2015/12/12/connecting-sql-server-profiler-to-power-bi-desktop/

SUMMARY
• Remember to stay with the
process.
• Use time as the reason to
optimize.
• Use data, not assumptions.
• Use Power BI to analyze logs and
traces, just as you would other
data.
• Collaborate with the user to
identify what’s important to them,
too.

Thanks to
• Chris Webb for sharing test data and ideas.
• Brent Ozar for creating the sp_blitz data model
that offered the opportunity to optimize.
• The EDU group at Microsoft for offering a full
environment for me to build for testing, including
the cloud to work with on this presentation.

Questions?
dbakevlar@gmail.com
https://dbakevlar.com
Twitter: @dbakevlar
