Recruit Technologies uses Hadoop for a variety of purposes including optimizing matching logic for job recommendations, personalizing content like articles for customers, and optimizing search logic. Since first implementing Hadoop in 2011, Recruit has expanded its use of Hadoop across more business domains and data types. Hadoop allows Recruit to store and analyze large amounts of both structured and unstructured data to power applications and machine learning models.
Top 5 mistakes when writing Spark applications
hadooparchbook
This presentation covers common mistakes in writing Spark applications and recommends ways to address them. The mistakes fall into four areas: executor configuration, application failures caused by shuffle blocks that exceed the size limit, slow jobs caused by data skew, and DAGs that trigger excessive shuffles and stages. The recommendations include using smaller executors, increasing the number of partitions, countering skew with techniques such as salting, and preferring reduceByKey over groupByKey and treeReduce over reduce to improve performance and resource usage.
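The salting technique mentioned above can be illustrated with a minimal sketch. This is plain Python that simulates the idea and is not Spark code or anything taken from the slides; the function name `salted_sum`, the `num_salts` parameter, and the sample records are all hypothetical. A random salt is appended to each key so that a single hot key is spread across several groups, the groups are aggregated separately, and the partial results are then combined per original key.

```python
import random
from collections import defaultdict


def salted_sum(records, num_salts=4, seed=0):
    """Sum values per key in two stages: by (key, salt), then by key.

    Hypothetical helper for illustration only; in Spark the two stages
    would roughly correspond to two successive per-key aggregations.
    """
    rng = random.Random(seed)

    # Stage 1: attach a random salt so a hot key is split into
    # up to num_salts smaller groups, and sum within each group.
    partial = defaultdict(int)
    for key, value in records:
        salt = rng.randrange(num_salts)
        partial[(key, salt)] += value

    # Stage 2: drop the salt and combine the partial sums per original key.
    totals = defaultdict(int)
    for (key, _salt), subtotal in partial.items():
        totals[key] += subtotal

    return dict(totals), len(partial)


# A skewed data set: one key dominates the records.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 10
totals, groups = salted_sum(records)
```

The final totals are the same as an unsalted sum; what changes is that the work for the hot key is divided among several intermediate groups, which in a distributed job could land on different tasks. The cost is a second aggregation pass, so salting is usually only worthwhile when skew is actually present.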