PyCon APAC 2016 Keynote

Wes McKinney
Wes McKinneyDirector of Ursa Labs, Open Source Developer at Ursa Labs
Saturday	Morning		
Keynote	
Wes	McKinney		
@wesmckinn	
PyCon	APAC	2016	(Seoul)
Me	
DataPad	
Apache	
Arrow	
Feather	
ibis
In	process:	
Python	for	Data	Analysis:	2nd	Edi:on	
Coming	2017	
	
(in	English	J)
Q:	What	brings	
you	here?
Our	shared	
values
Pride	in	soMware	
craMsmanship
My	story	
•  Accidental	soMware	developer	
•  2007:	My	first	job	(financial	research	analyst)	
•  I	started	wriPng	Python	libraries	to	do	my	own	
work	beQer	
•  Soon	I	was	helping	my	colleagues	work	beQer,	
too
Tools
Tools
Empathy	
the	feeling	that	you	understand	
and	share	another	person's	
experiences	and	emoPons	:	the	
ability	to	share	someone	else's	
feelings	
Source:	Merriam-Webster's	Learner's	DicPonary
Open	source	is	
wonderful…
Open	source	is	
wonderful…but	it	can	
also	be	frustraPng
Sustainable	open	source	
•  How	to	keep	contributors	from	drowning	/	
burning	out?	
•  How	to	fund	the	work?	
•  How	to	protect	and	serve	the	community?
The	Grind
“The	grind	is	an	endless	
stream	of	bug	reports,	
requests,	demands,	
quesPons,	and	
occasional	inquisiPons.”	
	 DHH,	Creator	of	Ruby	on	Rails
pandas,	the	open	source	project	
•  Parts	of	code	date	back	to	April	2008	
•  Over	600	unique	contributors	on	GitHub	
	
•  AcPve	project	maintainers	range	from	4-7	
people	
•  >	6900	Closed	Issues	
•  >	5100	Pull	Requests
pandas	at	end	of	2012
April	7,	2014
"Some	might	argue	that	
[Heartbleed]	is	the	worst	
vulnerability	found	(at	least	in	
terms	of	its	potenPal	impact)	
since	commercial	traffic	began	to	
flow	on	the	Internet."	
Joseph	Steinberg,	Forbes	cybersecurity	columnist
“	There	should	be	at	least…[6]	full	Pme	
OpenSSL	team	members,	not	just	one,	able	
to	concentrate	…	without	having	to	hustle	
commercial	work.	If	you’re	a	…	in	a	posiPon	
to	do	something	about	it,	give	it	some	
thought.	Please.	I’m	gemng	old	and	weary	
and	I’d	like	to	rePre	someday.”	
	
Steve	Marquess,	OpenSSL	team
By	Nadia	Eghbal,	supported	by	
the	Ford	FoundaPon	
For	more	on	this
“The	Cathedral		
and	the	Bazaar”
Python’s	normalizaPon	in	industry	
•  Python	has	become	a	leading	language	
instead	of	something	“experimental”	or	
“risky”	
•  Many	businesses	founded	on	the	growth	of	
the	Python	user	base	
•  See	Paul	Graham’s	2004	essay	“The	Python	
Paradox”	—	how	things	have	changed!
Governance	
“the	processes	of	interacPon	and	
decision-making	among	the	actors	
involved	in	a	collecPve	problem…”	
M.	HuMy	(via	Wikipedia)
Openness	and	
Transparency
Consensus
Some	example	governance	documents	
•  NumPy	(see	the	docs)	
•  IPython	/	Jupyter	governance	
– github.com/jupyter/governance	
•  pandas	
– github.com/pydata/pandas-governance	
– Modeled	aMer	Jupyter	governance
hQp://numfocus.org	
hQp://apache.org
PyCon APAC 2016 Keynote
conda-forge	
•  	Community-curated	conda	package	channel	
(hosted	on	anaconda.org)	
•  	Reproducible	build	infrastructure	(Docker	+	
Circle	CI	+	Travis	CI	+	Appveyor)	
•  	Automated	GitHub	helper	tools	
conda config --add channels conda-forge
What	is	next	for	pandas?	
•  pandas	1.0	
– A	stable,	maintenance-only	release	
•  Beginning	“pandas	2.0”	
– Planning	significant	refactoring	on	the	internals	of	
Series,	DataFrame
Why	pandas	2.0?	
•  Some	changes	difficult/impossible	to	do	in	an	
incremental	way	
•  pandas’s	relaPonship	with	the	ecosystem	has	
evolved	over	the	last	5	years	
	
•  Make	pandas	
– Faster	and	use	less	memory	
– Fix	long-standing	limitaPons	/	inconsistencies	
– Easier	interoperability	/	extensibility
Apache	
Arrow	
hQp://arrow.apache.org
High	Performance	Sharing	&	Interchange	
Today With Arrow
•  Each system has its own
internal memory format
•  70-80% CPU wasted on
serialization and
deserialization
•  Similar functionality
implemented in multiple
projects
•  All systems utilize the same
memory format
•  No overhead for cross-
system communication
•  Projects can share
functionality (eg, Parquet-
to-Arrow reader)
Feather	File	Format	for	Python	and	R	
• Problem:	fast,	language-
agnosPc	binary	data	
frame	file	format	
• By	Wes	McKinney	
(Python)	and	Hadley	
Wickham	(R)	
• Read	speeds	close	to	
disk	IO	performance	
• Leverages	Apache	Arrow
Thank	you	
	
@wesmckinn	
hQp://wesmckinney.com	
	
pandas	sprint	on	Monday!
1 of 36

Recommended

Numpy governance by
Numpy governanceNumpy governance
Numpy governanceJung Rae Cho
149 views7 slides
Numpy governance 요약 & 번역 by
Numpy governance 요약 & 번역Numpy governance 요약 & 번역
Numpy governance 요약 & 번역Denzel Hyeonsoo Shin
358 views10 slides
Governance of pandas by
Governance of pandasGovernance of pandas
Governance of pandashyeon woo han
142 views6 slides
Numpy governance by
Numpy governanceNumpy governance
Numpy governancehyeon woo han
193 views5 slides
SlideShare 101 by
SlideShare 101SlideShare 101
SlideShare 101Amit Ranjan
29.7M views27 slides
Open Source in Libraries: Freedom and Community by
Open Source in Libraries: Freedom and CommunityOpen Source in Libraries: Freedom and Community
Open Source in Libraries: Freedom and CommunityNicole C. Engard
2.7K views45 slides

More Related Content

Similar to PyCon APAC 2016 Keynote

How machines learn to talk. Machine Learning for Conversational AI by
How machines learn to talk. Machine Learning for Conversational AIHow machines learn to talk. Machine Learning for Conversational AI
How machines learn to talk. Machine Learning for Conversational AIVerena Rieser
622 views58 slides
Live Usability Lab: See One, Do One & Take One Home by
Live Usability Lab: See One, Do One & Take One HomeLive Usability Lab: See One, Do One & Take One Home
Live Usability Lab: See One, Do One & Take One HomeStephanie Brown
917 views57 slides
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019) by
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)ALATechSource
325 views59 slides
Open Source for Libraries by
Open Source for LibrariesOpen Source for Libraries
Open Source for LibrariesNicole C. Engard
1.1K views43 slides
Practical Open Source Software for Libraries by
Practical Open Source Software for LibrariesPractical Open Source Software for Libraries
Practical Open Source Software for LibrariesNicole C. Engard
7K views83 slides
Software art and design: computational thinking through programming practice ... by
Software art and design: computational thinking through programming practice ...Software art and design: computational thinking through programming practice ...
Software art and design: computational thinking through programming practice ...Aarhus University
218 views21 slides

Similar to PyCon APAC 2016 Keynote(20)

How machines learn to talk. Machine Learning for Conversational AI by Verena Rieser
How machines learn to talk. Machine Learning for Conversational AIHow machines learn to talk. Machine Learning for Conversational AI
How machines learn to talk. Machine Learning for Conversational AI
Verena Rieser622 views
Live Usability Lab: See One, Do One & Take One Home by Stephanie Brown
Live Usability Lab: See One, Do One & Take One HomeLive Usability Lab: See One, Do One & Take One Home
Live Usability Lab: See One, Do One & Take One Home
Stephanie Brown917 views
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019) by ALATechSource
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)
Creating a Digital Media Space for Today's Teens: Part 2 (Jan. 2019)
ALATechSource325 views
Practical Open Source Software for Libraries by Nicole C. Engard
Practical Open Source Software for LibrariesPractical Open Source Software for Libraries
Practical Open Source Software for Libraries
Nicole C. Engard7K views
Software art and design: computational thinking through programming practice ... by Aarhus University
Software art and design: computational thinking through programming practice ...Software art and design: computational thinking through programming practice ...
Software art and design: computational thinking through programming practice ...
Aarhus University218 views
Open Source: Freedom and Community by Nicole C. Engard
Open Source: Freedom and CommunityOpen Source: Freedom and Community
Open Source: Freedom and Community
Nicole C. Engard3.2K views
PennImmersive: Why We Need Librarians by Kimberly Eke
PennImmersive: Why We Need LibrariansPennImmersive: Why We Need Librarians
PennImmersive: Why We Need Librarians
Kimberly Eke167 views
OPAC 2.0 and Beyond by daveyp
OPAC 2.0 and BeyondOPAC 2.0 and Beyond
OPAC 2.0 and Beyond
daveyp3.2K views
Libraries and SXSW (for LACONi) by Carson Block
Libraries and SXSW (for LACONi)Libraries and SXSW (for LACONi)
Libraries and SXSW (for LACONi)
Carson Block672 views
Digital Fluencies: A Story of Trials & Triumph by Kimberly Eke
Digital Fluencies: A Story of Trials & TriumphDigital Fluencies: A Story of Trials & Triumph
Digital Fluencies: A Story of Trials & Triumph
Kimberly Eke818 views
Rethinking Scala Presented in San Francisco May 7, 2014 by Bruce Eckel
Rethinking Scala Presented in San Francisco May 7, 2014Rethinking Scala Presented in San Francisco May 7, 2014
Rethinking Scala Presented in San Francisco May 7, 2014
Bruce Eckel4.9K views
How to Contribute to Open Source by Aaron Careaga
How to Contribute to Open SourceHow to Contribute to Open Source
How to Contribute to Open Source
Aaron Careaga334 views
The Art and Science of Computer Conversation: Talkabot 2016 by PullString
The Art and Science of Computer Conversation: Talkabot 2016The Art and Science of Computer Conversation: Talkabot 2016
The Art and Science of Computer Conversation: Talkabot 2016
PullString675 views

More from Wes McKinney

Solving Enterprise Data Challenges with Apache Arrow by
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowWes McKinney
1.1K views31 slides
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity by
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityWes McKinney
1.1K views26 slides
Apache Arrow: High Performance Columnar Data Framework by
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkWes McKinney
1.4K views53 slides
New Directions for Apache Arrow by
New Directions for Apache ArrowNew Directions for Apache Arrow
New Directions for Apache ArrowWes McKinney
1.9K views27 slides
Apache Arrow Flight: A New Gold Standard for Data Transport by
Apache Arrow Flight: A New Gold Standard for Data TransportApache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data TransportWes McKinney
2.2K views31 slides
ACM TechTalks : Apache Arrow and the Future of Data Frames by
ACM TechTalks : Apache Arrow and the Future of Data FramesACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data FramesWes McKinney
2K views47 slides

More from Wes McKinney(20)

Solving Enterprise Data Challenges with Apache Arrow by Wes McKinney
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache Arrow
Wes McKinney1.1K views
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity by Wes McKinney
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Wes McKinney1.1K views
Apache Arrow: High Performance Columnar Data Framework by Wes McKinney
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
Wes McKinney1.4K views
New Directions for Apache Arrow by Wes McKinney
New Directions for Apache ArrowNew Directions for Apache Arrow
New Directions for Apache Arrow
Wes McKinney1.9K views
Apache Arrow Flight: A New Gold Standard for Data Transport by Wes McKinney
Apache Arrow Flight: A New Gold Standard for Data TransportApache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data Transport
Wes McKinney2.2K views
ACM TechTalks : Apache Arrow and the Future of Data Frames by Wes McKinney
ACM TechTalks : Apache Arrow and the Future of Data FramesACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data Frames
Wes McKinney2K views
Apache Arrow: Present and Future @ ScaledML 2020 by Wes McKinney
Apache Arrow: Present and Future @ ScaledML 2020Apache Arrow: Present and Future @ ScaledML 2020
Apache Arrow: Present and Future @ ScaledML 2020
Wes McKinney970 views
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future by Wes McKinney
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
Wes McKinney2.1K views
Apache Arrow: Leveling Up the Analytics Stack by Wes McKinney
Apache Arrow: Leveling Up the Analytics StackApache Arrow: Leveling Up the Analytics Stack
Apache Arrow: Leveling Up the Analytics Stack
Wes McKinney1.4K views
Apache Arrow Workshop at VLDB 2019 / BOSS Session by Wes McKinney
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Wes McKinney2.5K views
Apache Arrow: Leveling Up the Data Science Stack by Wes McKinney
Apache Arrow: Leveling Up the Data Science StackApache Arrow: Leveling Up the Data Science Stack
Apache Arrow: Leveling Up the Data Science Stack
Wes McKinney3.5K views
Ursa Labs and Apache Arrow in 2019 by Wes McKinney
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019
Wes McKinney4.2K views
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward" by Wes McKinney
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
Wes McKinney1.1K views
Apache Arrow at DataEngConf Barcelona 2018 by Wes McKinney
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018
Wes McKinney2K views
Apache Arrow: Cross-language Development Platform for In-memory Data by Wes McKinney
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory Data
Wes McKinney6.6K views
Apache Arrow -- Cross-language development platform for in-memory data by Wes McKinney
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
Wes McKinney2.9K views
Shared Infrastructure for Data Science by Wes McKinney
Shared Infrastructure for Data ScienceShared Infrastructure for Data Science
Shared Infrastructure for Data Science
Wes McKinney8.5K views
Data Science Without Borders (JupyterCon 2017) by Wes McKinney
Data Science Without Borders (JupyterCon 2017)Data Science Without Borders (JupyterCon 2017)
Data Science Without Borders (JupyterCon 2017)
Wes McKinney6.2K views
Memory Interoperability in Analytics and Machine Learning by Wes McKinney
Memory Interoperability in Analytics and Machine LearningMemory Interoperability in Analytics and Machine Learning
Memory Interoperability in Analytics and Machine Learning
Wes McKinney5.6K views
Raising the Tides: Open Source Analytics for Data Science by Wes McKinney
Raising the Tides: Open Source Analytics for Data ScienceRaising the Tides: Open Source Analytics for Data Science
Raising the Tides: Open Source Analytics for Data Science
Wes McKinney3.2K views

Recently uploaded

[2023] Putting the R! in R&D.pdf by
[2023] Putting the R! in R&D.pdf[2023] Putting the R! in R&D.pdf
[2023] Putting the R! in R&D.pdfEleanor McHugh
38 views127 slides
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze by
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng TszeDigital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng TszeNUS-ISS
19 views47 slides
Tunable Laser (1).pptx by
Tunable Laser (1).pptxTunable Laser (1).pptx
Tunable Laser (1).pptxHajira Mahmood
23 views37 slides
Voice Logger - Telephony Integration Solution at Aegis by
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at AegisNirmal Sharma
17 views1 slide
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen... by
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...NUS-ISS
28 views70 slides

Recently uploaded(20)

[2023] Putting the R! in R&D.pdf by Eleanor McHugh
[2023] Putting the R! in R&D.pdf[2023] Putting the R! in R&D.pdf
[2023] Putting the R! in R&D.pdf
Eleanor McHugh38 views
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze by NUS-ISS
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng TszeDigital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
Digital Product-Centric Enterprise and Enterprise Architecture - Tan Eng Tsze
NUS-ISS19 views
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma17 views
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen... by NUS-ISS
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...
Upskilling the Evolving Workforce with Digital Fluency for Tomorrow's Challen...
NUS-ISS28 views
PharoJS - Zürich Smalltalk Group Meetup November 2023 by Noury Bouraqadi
PharoJS - Zürich Smalltalk Group Meetup November 2023PharoJS - Zürich Smalltalk Group Meetup November 2023
PharoJS - Zürich Smalltalk Group Meetup November 2023
Noury Bouraqadi120 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software225 views
The details of description: Techniques, tips, and tangents on alternative tex... by BookNet Canada
The details of description: Techniques, tips, and tangents on alternative tex...The details of description: Techniques, tips, and tangents on alternative tex...
The details of description: Techniques, tips, and tangents on alternative tex...
BookNet Canada121 views
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum... by NUS-ISS
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
NUS-ISS34 views
Understanding GenAI/LLM and What is Google Offering - Felix Goh by NUS-ISS
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
NUS-ISS41 views
Business Analyst Series 2023 - Week 3 Session 5 by DianaGray10
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5
DianaGray10209 views
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman27 views
Empathic Computing: Delivering the Potential of the Metaverse by Mark Billinghurst
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the Metaverse
Mark Billinghurst470 views
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor... by Vadym Kazulkin
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...
How to reduce cold starts for Java Serverless applications in AWS at JCON Wor...
Vadym Kazulkin75 views
RADIUS-Omnichannel Interaction System by RADIUS
RADIUS-Omnichannel Interaction SystemRADIUS-Omnichannel Interaction System
RADIUS-Omnichannel Interaction System
RADIUS15 views
Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2216 views

PyCon APAC 2016 Keynote