How to create/improve OSS products and its community
SATOSHI TAGOMORI
This document discusses how to create and improve open source software (OSS) products and their communities. It recommends determining the purpose of the OSS product, choosing an appropriate programming language, using versioning to indicate stability, communicating in English, creating a pluggable architecture to encourage contributions, and continuously improving the software and engaging with users. The key is to be open, maintain the software over time, and grow the community through communication and contributions.
RubyKaigi 2022 - Fast data processing with Ruby and Apache Arrow
Kouhei Sutou
I introduced Ruby and Apache Arrow integration including the "super fast large data interchange and processing" Apache Arrow feature at RubyKaigi Takeout 2021.
This talk introduces how we can use the "super fast large data interchange and processing" Apache Arrow feature in Ruby. Here are some use cases:
* Fast data retrieval (fast pluck) from databases such as MySQL and PostgreSQL for batch processes in a Ruby on Rails application
* Fast data interchange with JavaScript for dynamic visualization in a Ruby on Rails application
* Fast OLAP with an in-process database such as DuckDB or Apache Arrow DataFusion in a Ruby on Rails application or an irb session
RubyKaigi Takeout 2021 - Red Arrow - Ruby and Apache Arrow
Kouhei Sutou
For Ruby to be widely used for data processing, Apache Arrow support is important. We can do the following with Apache Arrow:
* Super fast large data interchange and processing
* Reading/writing data in several popular formats such as CSV and Apache Parquet
* Reading/writing partitioned large data on cloud storage such as Amazon S3
This talk describes the following:
* What is Apache Arrow
* How to use Apache Arrow with Ruby
* How to integrate with Ruby 3.0 features such as MemoryView and Ractor
Apache Arrow 1.0 - A cross-language development platform for in-memory data
Kouhei Sutou
Apache Arrow is a cross-language development platform for in-memory data. You can use Apache Arrow to process large data effectively in Python and other languages such as R. Apache Arrow is the future of data processing. Apache Arrow 1.0, the first major version, was released on 2020-07-24. It's a good time to learn about Apache Arrow and start using it.
Apache Arrow - A cross-language development platform for in-memory data
Kouhei Sutou
Apache Arrow is the future for data processing systems. This talk describes how to solve data sharing overhead in data processing systems such as Spark and PySpark. This talk also describes how to accelerate computation on your large data with Apache Arrow.
csv, one of the standard libraries, in Ruby 2.6 has many improvements:
* Now a default gem
* Faster CSV parsing
* Faster CSV writing
* Clean new CSV parser implementation for further improvements
* Reconstructed test suites for further improvements
* Benchmark suites for further performance improvements
These improvements were made without breaking backward compatibility.
This talk, by a new csv maintainer, describes the details of these improvements.
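For readers unfamiliar with the library, here is a minimal sketch of the parsing and writing paths the abstract refers to, using only the long-standing `CSV.parse` and `CSV.generate` entry points. The sample data and variable names are hypothetical and chosen for illustration; nothing here is taken from the talk itself.

```ruby
require "csv"

# Hypothetical sample data, used only for illustration.
text = <<~DATA
  name,score
  alice,90
  bob,85
DATA

# Parse with the first row treated as headers; numeric fields are converted.
table = CSV.parse(text, headers: true, converters: :numeric)
scores = table.map { |row| row["score"] }

# Write rows back out as a CSV string, adding one to each score.
output = CSV.generate do |csv|
  csv << ["name", "score"]
  table.each { |row| csv << [row["name"], row["score"] + 1] }
end
```

Because the improvements were made without breaking backward compatibility, code written this way should behave the same before and after the Ruby 2.6 changes.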
PGroonga 2 – Make PostgreSQL rich full text search system backend!
Kouhei Sutou
PGroonga 2.0 has been released after two years of development since PGroonga 1.0.0. PGroonga 1.0.0 only provided fast full text search with support for all languages. That matters because PostgreSQL lacks this feature. PGroonga 2.0 provides more useful features for implementing a rich full text search system with PostgreSQL. This session shows how to implement a rich full text search system with PostgreSQL!
This talk describes PGroonga, which resolves these problems.