In the first half, we give an introduction to modern serialization systems: Protocol Buffers, Apache Thrift, and Apache Avro. Which one meets your needs?
In the second half, we show an example of data ingestion system architecture using Apache Avro.
This document compares query builders and Eloquent ORM in Laravel for database operations.
Query builders allow building queries with PHP methods chained together and protect against SQL injections. They require understanding SQL and can result in unintended database actions if relations are not correctly understood.
Eloquent ORM maps models to database tables, treating records as objects. It allows defining relations and working with databases and records intuitively. Although it hides the SQL, it reduces the time beginners spend learning SQL and lowers the risk of security issues, so it is recommended over query builders for initial learning.
This document summarizes a presentation about Oracle's GraalVM and how it can be used to run PHP code. It discusses GraalVM's just-in-time compilation capabilities and how Truffle allows PHP to run on GraalVM. Benchmark results show that a Truffle-based PHP on GraalVM has significantly faster performance than PHP's default interpreter or just-in-time compiler.
The document discusses the results of a study on the effects of a new drug on memory and cognitive function in older adults. The double-blind study gave either the new drug or a placebo to 100 volunteers aged 65-80 over a 6-month period. Testing showed that those receiving the drug experienced statistically significant improvements in short-term memory retention and processing speed compared to the placebo group.
This document discusses exactly once semantics in Apache Kafka 0.11. It provides an overview of how Kafka achieved exactly once delivery between producers and consumers. Key points include:
- Kafka 0.11 introduced exactly once semantics with changes to support transactions and deduplication.
- Producers can write in a transactional fashion and receive acknowledgments of committed writes from brokers.
- Brokers store commit markers to track the progress of transactions and ensure no data loss during failures.
- Consumers can read from brokers in a transactional mode and receive data only from committed transactions, guaranteeing no duplication of records.
- This allows reliable message delivery semantics between producers and consumers with Kafka acting as
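The deduplication mentioned in the key points above can be pictured with a toy model. This is a minimal sketch, not Kafka's API: `ToyPartitionLog`, its fields, and the record values are hypothetical names chosen for illustration. It assumes each producer tags every write with a producer id and an increasing sequence number, so a retried write can be recognized and dropped.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartitionLog:
    """Toy stand-in for a broker-side partition log (hypothetical, not Kafka's API)."""
    records: list = field(default_factory=list)
    # Highest sequence number appended so far, per producer id.
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id: str, seq: int, value: str) -> bool:
        """Append a record unless this producer already wrote this sequence number.

        Returns True if the record was appended, False if it was a duplicate.
        """
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # a retry of a write that already succeeded
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


log = ToyPartitionLog()
log.append("producer-1", 0, "order-created")
log.append("producer-1", 1, "order-paid")
# The producer missed the acknowledgment and retries sequence number 1.
retried = log.append("producer-1", 1, "order-paid")
```

In this sketch the retry is rejected, so the log keeps one copy of each record. Transactions and commit markers are deliberately left out.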
Journey of Migrating Millions of Queries on The Cloud (takezoe)
This document discusses the challenges of upgrading a query engine and summarizes strategies for efficiently simulating queries to test compatibility and performance. It proposes grouping queries by signature and narrowing data scans to reduce the number of queries tested. It also recommends automating result verification by generating human-readable reports and excluding uncheckable queries. Assistance tools are proposed to aid investigation of differences; these helped discover real bugs in the target version.
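Grouping queries by signature could look roughly like the sketch below. The summary does not say how a signature is computed, so this assumes one plausible definition: the query text with string and numeric literals replaced by placeholders and with whitespace and case normalized. The function names and sample queries are hypothetical.

```python
import re
from collections import defaultdict


def query_signature(sql: str) -> str:
    """Replace literals with '?' so queries that differ only in values compare equal."""
    normalized = re.sub(r"'[^']*'", "?", sql)                  # string literals
    normalized = re.sub(r"\b\d+(\.\d+)?\b", "?", normalized)   # numeric literals
    return re.sub(r"\s+", " ", normalized).strip().lower()


def group_by_signature(queries):
    """Map each signature to the list of queries that share it."""
    groups = defaultdict(list)
    for q in queries:
        groups[query_signature(q)].append(q)
    return dict(groups)


queries = [
    "SELECT * FROM events WHERE user_id = 42",
    "select * from events where user_id = 7",
    "SELECT count(*) FROM events WHERE day = '2020-01-01'",
]
groups = group_by_signature(queries)
# One representative per group is enough for a simulation run.
representatives = [qs[0] for qs in groups.values()]
```

Here the three queries collapse into two groups, so only two would need to be replayed. A regex-based signature is crude; a real tool would more likely normalize a parsed query.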
GitBucket: Open source self-hosting Git server built by Scala (takezoe)
This document provides information about GitBucket, an open source self-hosting Git server created by Naoki Takezoe using Scala. Some key points:
- GitBucket is an open source self-hosting Git server built using Scala and Java technologies. It allows hosting both public and private repositories.
- It has over 8,000 stars on GitHub and supports features like issues, pull requests, wiki pages, and plugins.
- The developer chose Scala for its interoperability with Java libraries and broad ecosystem. This helped minimize development costs, which is important for sustainability of personal open source projects.
Testing Distributed Query Engine as a Service (takezoe)
Naoki Takezoe from Treasure Data discussed testing their distributed query engine Presto as a service. They developed a tool called presto-query-simulator to test using production data and queries in a safe manner. The tool reduces testing time by grouping similar queries and narrowing data scans. It also helps analyze results and find problematic queries. Future work includes running tests more frequently and improving coverage.
This document discusses different approaches to dependency injection in Scala, including Google Guice, implicit parameters, the cake pattern, Reader monad, MacWire, and Airframe. It compares runtime DI approaches like Guice and Airframe to compile-time approaches. The best approach depends on whether auto-wiring is needed, whether compile-time checking or dynamic binding is preferred, and whether life-cycle management is required.
How to keep maintainability of long life Scala applications (takezoe)
Naoki Takezoe discusses maintaining long-term Scala applications. He outlines two main difficulties: programming style differences that impact understandability and upgrades that require coordinating framework, Scala, and Java version changes. Case studies show upgrades can be blocked until dependent libraries support new versions. Solutions include reducing dependencies, using popular libraries, custom libraries for core components, and considering Java alternatives. Regular maintenance and preparing for breaking changes are key to sustainable Scala applications.
GitBucket: Git Centric Software Development Platform by Scala (takezoe)
GitBucket is an open source Git server platform written in Scala that provides easy installation and setup. It allows for public and private repositories along with features like issue tracking, pull requests, wikis, and notifications. While based on traditional Java technologies like Jetty and JGit, it uses the Scalatra framework to integrate these components and provide a unified web interface for managing Git repositories and collaboration.
Xtend is a Java-compatible language developed by Eclipse. It has a simpler syntax than Java and compiles to Java bytecode. Xtend supports features like lambda expressions, extension methods, and switch expressions. It has strong tooling support in Eclipse and IntelliJ IDEs. Xtend aims to be a pragmatic alternative to Java for development while maintaining full interoperability.
Zipkin is a distributed tracing system created by Twitter that allows services to record and query traces of requests across microservices. It uses HTTP headers to propagate trace data between services and stores trace data in storage backends like Cassandra, MySQL, or Elasticsearch. The Brave library can be used to instrument Java applications to send trace data to a Zipkin server.
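Header-based propagation can be sketched as follows, assuming the B3 header names that Zipkin instrumentation commonly uses (`X-B3-TraceId`, `X-B3-SpanId`, `X-B3-ParentSpanId`, `X-B3-Sampled`). The helper functions and the sample ids are hypothetical, and plain dictionaries stand in for HTTP headers, so lookups here are case-sensitive, unlike real HTTP headers.

```python
import secrets


def new_id() -> str:
    """Return a random 64-bit identifier as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def outgoing_headers(incoming: dict) -> dict:
    """Build trace headers for a downstream call made while handling a request."""
    # Reuse the caller's trace id, or start a new trace if none was sent.
    trace_id = incoming.get("X-B3-TraceId") or new_id()
    headers = {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": new_id(),  # new span for the downstream call
        "X-B3-Sampled": incoming.get("X-B3-Sampled", "1"),
    }
    parent = incoming.get("X-B3-SpanId")
    if parent:
        # The span that handled the incoming request becomes the parent.
        headers["X-B3-ParentSpanId"] = parent
    return headers


incoming = {
    "X-B3-TraceId": "463ac35c9f6413ad",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "X-B3-Sampled": "1",
}
downstream = outgoing_headers(incoming)
```

The trace id stays constant across hops while each hop gets its own span id, which is what lets the server reassemble a request's path. In practice a library such as Brave handles this rather than hand-written code.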
This document discusses using Scala and Scala.js for type-safe front-end web development. It introduces Scala.js, which compiles Scala code to JavaScript, enabling the use of Scala on the front-end. It also discusses related libraries like ScalaTags for type-safe HTML generation and ScalaCSS for type-safe CSS. While Scala.js enables fully type-safe front-ends, the document acknowledges challenges like large JavaScript file sizes and lack of type mappings for existing JavaScript libraries. It proposes an approach where Scala programmers provide Scala.js interfaces for front-end code, while JavaScript programmers implement the user interface using frameworks like React.
This document compares various Scala frameworks for building web applications and interacting with databases. It discusses alternatives to the standard Play and Slick frameworks, including Finagle, Akka HTTP, Skinny Micro, Quill, doobie, and ScalikeJDBC. For web frameworks, all the alternatives look promising but Play is also still viable. For databases, there is no clear standard yet, but ScalikeJDBC appears suitable for most users.
This document discusses macros in Scala. Macros allow code to be generated at compile time by expanding directives or generating abstract syntax trees (ASTs). Examples show how macros can be used for validation, type generation, domain-specific languages (DSLs), and optimization without runtime overhead. The document explains macro types in Scala, how to construct ASTs using AST models, reify, and quasiquotes, and future plans to replace scala.reflect with the safer scala.meta metaprogramming toolkit.
The document discusses Reactive Slick, a new version of the Slick database access library for Scala that provides reactive capabilities. It allows parallel database execution and streaming of large query results using Reactive Streams. Reactive Slick is suitable for composite database tasks, combining async tasks, and processing large datasets through reactive streams.
markedj: The best of markdown processor on JVM (takezoe)
This document discusses selecting a markdown parser for a Scala-based GitHub clone called GitBucket. It evaluates several Java-based markdown parsers but finds them lacking support for features like GitHub Flavored Markdown tables and fences. It describes initially trying to port the JavaScript markdown parser marked.js to Scala but facing issues with its use of regular expressions and mutability. The document then explains the decision to port marked.js to Java instead, resulting in the new markdown parser markedj, which supports GFM and has a simple API. GitBucket plans to switch to using markedj starting in its next version.
GitBucket: The perfect Github clone by Scala (takezoe)
GitBucket is an open-source GitHub clone written in Scala that provides features like public and private repositories, wiki, issues, pull requests, and more. It uses Scala web frameworks like Scalatra and Twirl for the backend and JVM technologies like JGit, H2 database, and Apache MINA for key functions. GitBucket aims to be easy to install, run purely on the JVM, and provide an alternative for those unable to access GitHub due to political restrictions.