This thesis presents research on improving the performance of wavelet transforms computed with the lifting scheme. It introduces the lifting core, a processing unit that continuously consumes input and produces output while visiting each sample only once, in a cache-friendly manner. It discusses how lifting cores handle signal borders, how they can be configured for different processing orders, and how the underlying lifting scheme can be reorganized for better parallelization and vectorization. The thesis aims to address shortcomings of prior methods and evaluates lifting cores experimentally on CPUs, GPUs, and FPGAs, for 2D and 3D transforms as well as for JPEG 2000 compression.
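To make the idea of a unit that consumes input and emits output in a single pass more concrete, the following is a minimal one-dimensional sketch and not the implementation from the thesis. It assumes the CDF 5/3 wavelet, unnormalized floating-point coefficients, an even-length signal, and symmetric border extension; none of these choices is stated in the summary above. The names `lifting_reference` and `lifting_core` are hypothetical. The first function applies the predict and update steps as two separate passes, and the second produces the same coefficients while touching each input pair once, with a lag of one pair.

```python
def lifting_reference(x):
    """Two-pass CDF 5/3 lifting (assumed wavelet) on an even-length sequence."""
    s, d = list(x[0::2]), list(x[1::2])  # even and odd samples
    n = len(s)
    # Predict pass: subtract the mean of the two even neighbours.
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]  # mirrored right border
        d[i] = d[i] - 0.5 * (s[i] + right)
    # Update pass: add a quarter of the two neighbouring details.
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]  # mirrored left border
        s[i] = s[i] + 0.25 * (left + d[i])
    return s, d


def lifting_core(pairs):
    """Single-pass sketch: consumes (even, odd) pairs, yields (low, high)."""
    prev = None    # pair still waiting for its right even neighbour
    d_left = None  # detail coefficient to the left of the pending pair
    for even, odd in pairs:
        if prev is not None:
            d = prev[1] - 0.5 * (prev[0] + even)
            left = d if d_left is None else d_left  # left border
            yield prev[0] + 0.25 * (left + d), d
            d_left = d
        prev = (even, odd)
    if prev is not None:
        # Flush the last pair using the mirrored right border.
        d = prev[1] - 0.5 * (prev[0] + prev[0])
        left = d if d_left is None else d_left
        yield prev[0] + 0.25 * (left + d), d


x = [1, 2, 3, 4, 5, 6, 7, 8]
ref_low, ref_high = lifting_reference(x)
streamed = list(lifting_core(zip(x[0::2], x[1::2])))
print(ref_low, ref_high)
print(streamed)
```

In this sketch the single-pass version keeps only one pending pair and one detail coefficient as state, which is what allows it to run over a stream of samples; the border handling is folded into the first and last emitted pairs.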