Line charts are just about the most difficult chart to format for clarity. It doesn't take too many lines before the whole thing looks like a pile of undercooked spaghetti.
1. Fire data isn’t ugly
Presenting fire data effectively series
Episode: When lines cross
July 2015
2. A makeover of fire department
data to transform it from
unclear and underperforming
to powerfully informative.
3. This redo is going to be a bit different. Here we’re going to
use some real data from the Kansas Fire Incident
Reporting System on school fires during the school year.
First, let’s create a line chart using all the defaults in
Microsoft Excel. Line charts are a bit big so we’ll be
working with one chart, instead of the usual side-by-side.
6. [Chart: Fires at schools during the school year. Default Excel line chart with a legend; values 0 to 80; years 2010 to 2014; categories: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
Line charts visualize continuous information. There is no
halfway point between “Vehicle fires” and “Yearly Total”
yet this is how Excel might default your chart.
Instead, make sure that time is on the x-axis by using the
Switch Row/Column option.
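Switch Row/Column is a transpose of the data behind the chart. Here is a minimal Python sketch of the idea, not of how Excel does it internally. The function name `switch_row_column` is hypothetical, and the counts are the 2010 and 2014 values from the slope graph slides, assuming the first number in each label is the 2010 value.

```python
# Categories as they appear in the chart.
categories = [
    "Building/Inside Fire",
    "Outside Trash/Dumpster",
    "Vegetation Fires",
    "Vehicle Fires",
    "Yearly total",
]

# Layout a default chart might use: one line per year, categories on the x-axis.
# Values assume the first number in each slope graph label is 2010 (an assumption).
series_by_year = {
    2010: [22, 19, 8, 6, 68],
    2014: [31, 15, 10, 5, 72],
}


def switch_row_column(series, x_labels):
    """Transpose {series_name: [values]} so the old x labels become the series."""
    names = list(series)
    switched = {
        label: [series[name][i] for name in names]
        for i, label in enumerate(x_labels)
    }
    # Return the new series plus the new x-axis labels (the old series names).
    return switched, names


series_by_category, x_axis = switch_row_column(series_by_year, categories)

print("x-axis:", x_axis)
for name, values in series_by_category.items():
    print(name, values)
```

After the switch, time runs along the x-axis and each category gets its own line.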
7. Line charts
Time series data is suited perfectly
to a line chart since it’s
continuous data. Few other data
series fit well on line charts.
8. We can label the chart better using a formatting option
for the Labels. Did you know that Excel can list the Value,
Series Name, and/or Category Name? Just right click the
labels and choose Format Data Labels.
9. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
Directly labeling the lines takes away the need to look at
a line, look for the color, and then assign a name from the
legend.
Working memory can only keep about 3 items going at
once.
10. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
You’ll notice I also changed the colors of the labels to
match the lines themselves. This helps your eye out even
more.
We’ll come back to color here in a bit.
11. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
Lighten the gridlines to help the lines move forward. This
ultra-light grey might not print well so check it with a test
run. Drop the bolding on the title, too.
I’ve shifted the title to the left, where our eyes naturally
look for the beginning of text.
12. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
Think about what you are trying to say with the chart. If
you are trying to emphasize the fires inside schools, then
you can push the emphasis right there.
Once you know the point you’re trying to illustrate, a bit
of color does all the work for you.
13. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
You can even slightly increase and decrease the line
weight of each category. Don’t go overboard with this
though.
Here, the line for Building/Inside Fires is just a tad thicker
than the rest.
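If you build charts in code instead of Excel, the same emphasis can be written as one small rule: the series you care about gets the color and a slightly heavier line, and everything else goes grey. This is only a sketch. The function name `line_styles`, the hex colors, and the widths are all hypothetical choices of mine, not values from the slides.

```python
SERIES = [
    "Building/Inside Fire",
    "Outside Trash/Dumpster",
    "Vegetation Fires",
    "Vehicle Fires",
    "Yearly total",
]


def line_styles(series, focus, focus_color="#C0392B", muted_color="#9E9E9E",
                base_width=1.5, extra_width=0.75):
    """Return {series_name: {"color": ..., "width": ...}} with one emphasized series.

    Colors and widths are illustrative defaults, not values from the slides.
    """
    styles = {}
    for name in series:
        is_focus = name == focus
        styles[name] = {
            # Only the focus series gets a real color.
            "color": focus_color if is_focus else muted_color,
            # Keep the extra weight small so it does not overpower the chart.
            "width": base_width + (extra_width if is_focus else 0.0),
        }
    return styles


styles = line_styles(SERIES, focus="Building/Inside Fire")
for name, style in styles.items():
    print(name, style)
```

The resulting dictionary can be handed to whatever plotting tool you use.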
14. [Chart: Fires at schools during the school year, shown in shades of grey. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
Worried about printing in black and white?
Even without color, your line chart still passes all the
information on, between direct labeling and shades of
grey.
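A test print is still the real check, but you can get a rough preview of how colors will separate in black and white by converting each one to a grey level. The sketch below uses the common Rec. 601 luma weights (0.299, 0.587, 0.114). Printers and copiers may convert differently, and the two example colors are hypothetical, picked only to show that two very different hues can land on the same grey.

```python
def grey_level(hex_color):
    """Approximate grey value (0 = black, 255 = white) using Rec. 601 luma weights."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def smallest_gap(colors):
    """Smallest difference in grey level between any two colors in the palette."""
    levels = sorted(grey_level(c) for c in colors.values())
    return min(b - a for a, b in zip(levels, levels[1:]))


# Hypothetical palette: a pure red and a mid green look very different in color.
palette = {"red line": "#FF0000", "green line": "#008200"}

for name, color in palette.items():
    print(name, color, grey_level(color))
print("smallest gap:", smallest_gap(palette))
```

A small gap means the lines will be hard to tell apart in print, which is where direct labels and deliberately spaced shades of grey earn their keep.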
15. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
If you ask your data the right questions you can get the
right chart. If you’re interested in each year compared to
the others this chart is perfect.
Let’s ask our data another question. Are there more or
fewer fires today than 5 years ago?
16. [Chart: Fires at schools during the school year. Values 0 to 80; years 2010 to 2014; lines labeled directly: Building/Inside Fire, Outside Trash/Dumpster, Vegetation Fires, Vehicle Fires, Yearly total]
If we only need to compare two years, we’re not
interested in the rise and fall between each year.
Actually, the other years get in the way. Can you tell with
this view if fires are up or down?
19. [Slope graph: Fires in schools during the school year, 2010 and 2014. Values in label order: Building/Inside Fires 22, 31; Outside Trash/Dumpster Fires 19, 15; Vegetation Fires 8, 10; Vehicle Fires 6, 5; Yearly Total 68, 72]
Slope graphs are just two data points: the start date and
the end date. You can quickly see what’s up and what’s
down.
You can even emphasize increasing/decreasing fires
using color: red for up, blue for down.
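The red-for-up, blue-for-down rule is simple enough to write down. This is a sketch under one assumption you should check against your own data: that the first number in each slope graph label is the 2010 value and the second is 2014. The function name `slope_color` and the grey fallback for no change are mine, not from the slides.

```python
# (series, 2010 value, 2014 value); the year order is an assumption
# based on the order of the numbers in the slope graph labels.
SLOPES = [
    ("Building/Inside Fires", 22, 31),
    ("Outside Trash/Dumpster Fires", 19, 15),
    ("Vegetation Fires", 8, 10),
    ("Vehicle Fires", 6, 5),
    ("Yearly Total", 68, 72),
]


def slope_color(start, end, up="red", down="blue", flat="grey"):
    """Pick a line color from the direction of change between two points."""
    if end > start:
        return up
    if end < start:
        return down
    return flat  # no change; grey is an illustrative choice


for name, start, end in SLOPES:
    change = end - start
    print(f"{name}: {start} -> {end} ({change:+d}), {slope_color(start, end)}")
```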
20. [Slope graph: Fires in schools during the school year, 2010 and 2014. Values in label order: Building/Inside Fires 22, 31; Outside Trash/Dumpster Fires 19, 15; Vegetation Fires 8, 10; Vehicle Fires 6, 5; Yearly Total 68, 72]
Be careful in sizing your chart. Slope graphs are usually
tall and narrow. If you make your chart wider, the
increases and decreases will appear smaller. If you make
your chart taller and narrower, any changes will look
more dramatic.
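To see why sizing matters, work out the angle a line is actually drawn at. The sketch below is plain geometry, not anything Excel exposes. The plot sizes are hypothetical, the axis range of 0 to 80 is borrowed from the earlier line charts, and the values are the two Building/Inside Fires numbers from the slope graph.

```python
import math


def apparent_angle(start, end, axis_min, axis_max, width, height):
    """Angle in degrees (above horizontal) of a two-point line as drawn.

    width and height are the plot area dimensions in the same unit (e.g. inches).
    """
    # Vertical distance on the page covered by the change in value.
    rise = (end - start) / (axis_max - axis_min) * height
    # The two points sit at the left and right edges, so the run is the width.
    return math.degrees(math.atan2(rise, width))


# Same data (22 and 31 on an axis from 0 to 80), two hypothetical plot sizes.
tall_narrow = apparent_angle(22, 31, 0, 80, width=3, height=6)
short_wide = apparent_angle(22, 31, 0, 80, width=6, height=3)

print(f"tall and narrow: {tall_narrow:.1f} degrees")
print(f"short and wide:  {short_wide:.1f} degrees")
```

The numbers behind the line never change, but the tall, narrow plot draws it at a noticeably steeper angle than the short, wide one.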
[Slope graph: the same chart and values, shown a second time on this slide]
21. [Before-and-after comparison of the chart Fires at schools during the school year, 2010 to 2014: "Default" (Excel defaults with a legend) and "After" (lines labeled directly)]
22. Lines and Slope Graphs
Don’t be afraid of line charts.
Just clean them up a bit and they
can be quite presentable.
23. Hello! I’m Sara Wood and I love converting fire service members into
NFIRS operatives. I’m the State NFIRS program manager for Kansas and
enjoy providing classes to help bring fire departments into the era of data
driven decisions. If you need help creating a presentation or analyzing
your data, I’d love to hear from you!