Accessibility Considerations for Very Large Datasets
Puneet Kishor
University of Wisconsin-Madison and Creative Commons
Monday, October 25, 2010
Acknowledgments to CODATA for inviting me, Creative Commons for funding my trip, University of Wisconsin-Madison for paying my salary, and most importantly, the US Federal Government for making all the data available to anyone, anywhere without any pre-conditions
Research context: ecosystem process modeling of very large terrestrial ecosystems
Information by numbers
7 daily variables: tmax, tmin, tave, prcp, srad, vpd, dayl
1 km² cell (1 km × 1 km): tmax, tmin, tave, prcp, srad, vpd, dayl
13.25 million cells (4587 × 2889)
8401 days
8400
111.32 billion septets (tmax, tmin, tave, prcp, srad, vpd, dayl)
725.78 raw gigabytes
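The numbers on the preceding slides can be cross-checked with a little arithmetic. The sketch below multiplies the grid dimensions by the number of days to get the septet count, then by seven values per septet to get a raw size. The one-byte-per-value storage size is an assumption, not something the slides state; it is used only because it reproduces the 725.78 figure when gigabytes are counted in units of 2^30 bytes. The septet count comes out at 111.33 billion when rounded, so the slide's 111.32 appears to be truncated rather than rounded.

```python
# Grid dimensions, day count, and variable names as given on the slides.
GRID_DIMS = (4587, 2889)
DAYS = 8401
VARIABLES = ("tmax", "tmin", "tave", "prcp", "srad", "vpd", "dayl")

# Assumption: one byte per stored value (not stated on the slides).
BYTES_PER_VALUE = 1

cells = GRID_DIMS[0] * GRID_DIMS[1]          # number of 1 km x 1 km cells
septets = cells * DAYS                       # one septet per cell per day
raw_bytes = septets * len(VARIABLES) * BYTES_PER_VALUE
raw_gib = raw_bytes / 2**30                  # gigabytes in units of 2^30 bytes

print(f"{cells / 1e6:.2f} million cells")
print(f"{septets / 1e9:.2f} billion septets")
print(f"{raw_gib:.2f} raw gigabytes")
```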
10 times as much in a database
84 GB in NetCDF format, in gzipped tar archives
[Figure: chunks numbered 1 to 14]
4° square chunks
“½” incomplete documentation
0 ways to query the data
1. Acquire NetCDF file of lat/lon values for each cell from the weather data 1 km² estimates
2. Dump lat/lon values to CSV with Panoply
3. Import into ArcMap as XY data
4. Export as shapefile
5. Assign WGS84 datum to shapefile in ArcCatalog
6. Reproject to Lambert Spherical (“US National Atlas Equal Area”)
7. Separate by 2x2 degree tile using the "tile_num" attribute (so the grid will match the NetCDF met files), using a definition query in ArcMap and exporting to individual shapefiles (256 tiles) as "mask"
8. Open the Lambert points in QGIS and make a 1 km grid (shapefile) for each 2x2 tile
9. Assign projection to output (EPSG:2163)
10. Add each new grid shapefile (one at a time) to ArcMap with the 2x2 grid as a separate layer
11. Select by location (select from grid x the features that intersect mask x)
12. Export selected features of grid x (these will now be numbered sequentially by record in a way that matches the met NetCDF “ncells”)
13. Clean up: delete extra fields from QGIS (ID, MAXX, MINX, MAXY, MINY); add ncell_id (FID + 1), block_id, block_name
“10” times the work to unpack the data
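To give a feel for what steps 6, 9, and 13 involve, here is a minimal sketch in plain Python. It is not the workflow used in the talk, which relied on ArcMap and QGIS. It applies the spherical Lambert azimuthal equal-area formulas with the EPSG:2163 parameters (centre at 45°N, 100°W, sphere radius 6,370,997 m), and it ignores any difference between the WGS84 ellipsoid and that sphere. The function names `to_epsg2163` and `ncell_id` are illustrative only.

```python
import math

# EPSG:2163 ("US National Atlas Equal Area") parameters:
# spherical Lambert azimuthal equal-area projection.
SPHERE_RADIUS_M = 6370997.0
LAT0 = math.radians(45.0)
LON0 = math.radians(-100.0)


def to_epsg2163(lat_deg, lon_deg):
    """Project a lat/lon pair (degrees) to x, y in metres.

    Treats the input coordinates as if they were on the sphere, which is
    a simplification; a GIS package handles the datum properly.
    """
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg) - LON0
    # Cosine of the angular distance from the projection centre.
    cos_c = (math.sin(LAT0) * math.sin(lat)
             + math.cos(LAT0) * math.cos(lat) * math.cos(dlon))
    k = math.sqrt(2.0 / (1.0 + cos_c))
    x = SPHERE_RADIUS_M * k * math.cos(lat) * math.sin(dlon)
    y = SPHERE_RADIUS_M * k * (math.cos(LAT0) * math.sin(lat)
                               - math.sin(LAT0) * math.cos(lat) * math.cos(dlon))
    return x, y


def ncell_id(fid):
    """Step 13: ncell_id is the zero-based feature ID plus one."""
    return fid + 1


x, y = to_epsg2163(43.07, -89.40)  # roughly Madison, Wisconsin
print(f"x = {x:.0f} m, y = {y:.0f} m, first cell id = {ncell_id(0)}")
```

Even with the projection scripted, the tiling, gridding, and intersection steps remain, which is the point of the slide: the data cannot be used until this unpacking has been done.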
Many kinds of queries
f<variable> <location> <point in time>
avg(srad) at x,y on Dec 2, 2001
tmin for area on May 19, 1992
tmax at x,y on May 19, 1992
f<variable> <point location> <duration of time>
tave at x,y during the first quarter of 1983
sum(vpd) at x,y during the last week of Mar, 2003
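The two query shapes above, a function of a variable over a location and either a point in time or a duration, could be served by a single entry point. The sketch below is hypothetical: the `query` function, its parameters, and the nested-dictionary store are assumptions made for illustration, and the sample values are invented. A real store for this dataset would not fit in memory.

```python
from datetime import date, timedelta


def query(store, variable, cells, start, end=None, agg=None):
    """Collect `variable` for the given cells between start and end (inclusive).

    store: {(x, y): {date: {variable_name: value}}}  (hypothetical layout)
    cells: iterable of (x, y) keys; one key for a point, several for an area
    end:   defaults to start, giving a point-in-time query
    agg:   optional function such as sum or statistics.mean applied to the values
    """
    end = end or start
    values = []
    for cell in cells:
        for day, record in store.get(cell, {}).items():
            if start <= day <= end:
                values.append(record[variable])
    return values if agg is None else agg(values)


# Invented sample data: one cell, the last week of March 2003.
first_day = date(2003, 3, 25)
demo_store = {
    (10, 20): {
        first_day + timedelta(days=i): {"vpd": 100.0 + i, "tave": 5.0}
        for i in range(7)
    }
}

# sum(vpd) at x,y during the last week of Mar, 2003
weekly_vpd = query(demo_store, "vpd", [(10, 20)], first_day, date(2003, 3, 31), agg=sum)

# tave at x,y on a single day
tave_on_day = query(demo_store, "tave", [(10, 20)], first_day)

print(weekly_vpd, tave_on_day)
```

Passing several cell keys covers the "for area" case, and leaving out `end` covers the "point in time" case, so one signature handles every example on the slide.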
accessible ¦aksesəbəl¦
adjective
1 (of a place) able to be reached or entered : the town is
accessible by bus | the building has been made accessible to
disabled people.
• (of an object, service, or facility) able to be easily obtained or
used : making learning opportunities more accessible to adults.
• easily understood : his Latin grammar is lucid and accessible.
• able to be reached or entered by people in wheelchairs : it
provides specialized features such as nonslip floors and accessible
entrances.
2 (of a person, typically one in a position of authority or
importance) friendly and easy to talk to; approachable : he is more
accessible than most tycoons.
Accessible information is easy to: find, determine what one can do with it, acquire, and use
Factors that affect accessibility: law; technology; culture; semantics; and economics
Law makes sharing permissible; technology makes it possible; culture makes it acceptable; semantics make it understandable; and economics makes it affordable
It is permissible, acceptable, and affordable to access public sector information, but not necessarily possible or understandable
Goals of the new storage: make the information technologically and semantically accessible
Allow access by providing a user interface, an application programming interface, and documentation
