Novel techniques are needed for high-performance applications to exploit the massive local concurrency of many-core systems. Making applications run faster on machines with more cores requires substantial restructuring of the embedded software stack, including applications, middleware, and the operating system (OS). Contemporary software stacks are not designed to exploit hundreds or thousands of cores. New OS and middleware mechanisms must be developed to handle scheduling, resource sharing, and communication in many-core systems. The solution must also provide a high-level API to simplify the development of concurrent software. In this session, we describe new scheduling and communication mechanisms for many-core embedded platforms.