Apache Beam, IOs ?
JB Onofré
<jbonofre@apache.org> <jbonofre@talend.com>
Who am I ?
JB Onofré <jbonofre@apache.org> <jbonofre@talend.com>
@jbonofre
● Fellow at Talend
● Member of the Apache Software Foundation
● PMC/Committer on ~ 20 Apache projects from container and integration (Karaf, Camel,
ActiveMQ, Aries, …) to big data (Beam, CarbonData, Livy, Gearpump, …)
● Mentor during Apache Beam incubation
● PMC member for Apache Beam
Agenda
● What’s Beam ?
● Beam parts
● Beam Programming Model
● SDKs & DSLs
● IOs & Filesystems
● Runners
What is Beam ?
● Apache TLP since December 2016 (in incubation since February 2016)
● Coming from the Google Cloud Dataflow SDK
● Data processing APIs:
○ Unified (batch & streaming, same code)
○ Portable (several execution engines, same code)
○ Extensible (custom extensions)
Beam parts
● Apache Beam:
○ A unified programming model
○ SDKs & DSLs to implement the programming model
○ Convenient extensions (connectors, functions, …)
○ Runners to “translate” the user code to an execution engine (Beam doesn’t provide the engine)
[Diagram: User Pipeline → SDKs & DSLs → Programming Model → Extensions → Runner → Execution Engine]
PTransform
1. PTransforms are operations that transform data
2. They receive one or multiple PCollections and produce one or multiple PCollections
3. They must be Serializable
4. They should be thread-compatible (if you create your own threads, you must synchronize them)
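For illustration (not on the original slide), a minimal PTransform matching the points above might look like the following sketch; the UpperCase class is hypothetical.
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical PTransform: receives a PCollection<String> and produces a
// PCollection<String> with every element upper-cased.
public class UpperCase extends PTransform<PCollection<String>, PCollection<String>> {
  @Override
  public PCollection<String> expand(PCollection<String> input) {
    return input.apply(ParDo.of(new DoFn<String, String>() {
      @ProcessElement
      public void processElement(ProcessContext context) {
        context.output(context.element().toUpperCase());
      }
    }));
  }
}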
IO & Filesystem
● Connectors and extensions as Read & Write PTransforms
● Support bounded and/or unbounded PCollections
● Not using the execution engines’ connectors: features & portability !
● From simple to advanced features (watermark, timestamp, dedup, splitting, …)
IO Write: a DoFn !
1. Write a PTransform<PCollection<?>, PDone> wrapping a DoFn (Sink is deprecated)
2. Leverage the DoFn annotations
3. Supports both bounded and unbounded PCollections (processed element by element)
4. Can support batching tied to the bundles (runner)
5. Available on multiple workers thanks to ParDo
IO Write: Elasticsearch simple example
public abstract static class Write extends PTransform<PCollection<String>, PDone> {

  public PDone expand(PCollection<String> input) {
    input.apply(ParDo.of(new WriteFn()));
    return PDone.in(input.getPipeline());
  }

  static class WriteFn extends DoFn<String, PDone> {

    private RestClient restClient;

    @Setup
    public void setup() throws Exception {
      restClient = RestClient.builder(new HttpHost("localhost", 9200)).build();
    }

    @ProcessElement
    public void processElement(ProcessContext context) throws Exception {
      String document = context.element();
      HttpEntity request = new NStringEntity(document, ContentType.APPLICATION_JSON);
      restClient.performRequest("POST", "/my_index/beam_type",
          Collections.singletonMap("refresh", "true"), request);
    }

    @Teardown
    public void closeClient() throws Exception {
      if (restClient != null) {
        restClient.close();
      }
    }
  }
}
IO Write: Elasticsearch adding batching
public abstract static class Write extends PTransform<PCollection<String>, PDone> {

  public PDone expand(PCollection<String> input) {
    input.apply(ParDo.of(new WriteFn()));
    return PDone.in(input.getPipeline());
  }

  static class WriteFn extends DoFn<String, PDone> {

    private static final long BATCH_SIZE = 1024;

    private RestClient restClient;
    private ArrayList<String> batch;
    private long currentBatchSizeBytes;

    @Setup
    public void setup() throws Exception {
      restClient = RestClient.builder(new HttpHost("localhost", 9200)).build();
    }

    @StartBundle
    public void startBundle(StartBundleContext context) throws Exception {
      batch = new ArrayList<>();
      currentBatchSizeBytes = 0;
    }

    @ProcessElement
    public void processElement(ProcessContext context) throws Exception {
      String document = context.element();
      batch.add(String.format("{ \"index\" : {} }%n%s%n", document));
      currentBatchSizeBytes += document.getBytes(StandardCharsets.UTF_8).length;
      if (batch.size() >= BATCH_SIZE || currentBatchSizeBytes >= BATCH_SIZE) {
        flushBatch();
      }
    }

    @FinishBundle
    public void finishBundle(FinishBundleContext context) throws Exception {
      flushBatch();
    }

    private void flushBatch() throws IOException {
      if (batch.isEmpty()) {
        return;
      }
      StringBuilder bulkRequest = new StringBuilder();
      for (String json : batch) {
        bulkRequest.append(json);
      }
      batch.clear();
      currentBatchSizeBytes = 0;
      HttpEntity requestBody = new NStringEntity(bulkRequest.toString(), ContentType.APPLICATION_JSON);
      restClient.performRequest("POST", "/my_index/beam_type/_bulk",
          Collections.<String, String>emptyMap(), requestBody);
    }

    @Teardown
    public void closeClient() throws Exception {
      if (restClient != null) {
        restClient.close();
      }
    }
  }
}
IO Simplest Read: a DoFn !
1. Write a PTransform<PBegin, PCollection<?>> wrapping a DoFn
2. Leverage the DoFn annotations
3. Executed on a single worker
4. No splitting or estimated size
5. Only produces bounded PCollections
IO Read: JDBC simple example
public static class Read extends PTransform<PBegin, PCollection<String>> {

  DataSource dataSource;

  private Read(DataSource dataSource) {
    this.dataSource = dataSource;
  }

  public static Read withDataSource(DataSource dataSource) {
    return new Read(dataSource);
  }

  public PCollection<String> expand(PBegin begin) {
    return begin.apply(Create.of((Void) null))
        .apply(ParDo.of(new ReadFn(this)));
  }

  private static class ReadFn extends DoFn<Void, String> {

    private Read spec;
    private Connection connection;

    public ReadFn(Read spec) {
      this.spec = spec;
    }

    @Setup
    public void setup() throws Exception {
      this.connection = spec.dataSource.getConnection();
    }

    @ProcessElement
    public void processElement(ProcessContext processContext) throws Exception {
      try (PreparedStatement statement = connection.prepareStatement("select foo from bar")) {
        try (ResultSet resultSet = statement.executeQuery()) {
          while (resultSet.next()) {
            processContext.output(resultSet.getString("foo"));
          }
        }
      }
    }

    @Teardown
    public void teardown() throws Exception {
      connection.close();
    }
  }
}
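Not shown on the slide: a minimal usage sketch of this Read transform, assuming a Serializable javax.sql.DataSource instance named dataSource is available.
// Hypothetical usage of the Read transform above. The DataSource is kept as a
// field of the transform, so it needs to be Serializable (e.g. a serializable pooled data source).
Pipeline pipeline = Pipeline.create(options);
PCollection<String> rows = pipeline.apply("Read from JDBC", Read.withDataSource(dataSource));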
IO Read: a Bounded Source
1. Write a PTransform<PBegin, PCollection<?>> wrapping a bounded source
2. Support advanced features like:
a. Splitting (chunk the read with several sources)
b. Estimated size (used by the runner for scaling)
3. Sources create readers. A reader works on a specific split, moving forward over the records
IO Read: bounded source skeleton
public static class Read extends PTransform<PBegin, PCollection<?>> {
  @Override
  public PCollection<?> expand(PBegin input) {
    return input.apply(org.apache.beam.sdk.io.Read.from(new CustomSource()));
  }
}

public static class CustomSource extends BoundedSource<?> {

  private String splitPredicate;

  @Override
  public List<CustomSource> split(long desiredBundleSizeBytes, PipelineOptions options) throws Exception {
    // here we create a list of sources, each source will be on a worker reading a chunk of data,
    // that's why we have a split predicate.
    // NB: a runner can move a source from one worker to another, that's why a source has to be serializable.
    return Collections.singletonList(this);
  }

  @Override
  public long getEstimatedSizeBytes(PipelineOptions options) throws Exception {
    // here we compute the size of the data to read. The runner can use this value to:
    // - bootstrap the required resources & workers (execution engines like Dataflow)
    // - define the size of the data bundles
    return 0;
  }

  @Override
  public CustomReader createReader(PipelineOptions options) throws IOException {
    // create the reader for this source
    return new CustomReader(this);
  }
}

// A reader is created by a source on a worker. It's "linked" to the source to read only the expected
// chunk of data. A reader is local to a worker and never changes worker, so it doesn't have to be serializable.
public static class CustomReader extends BoundedSource.BoundedReader<?> {

  private CustomSource source;
  private ? current;

  public CustomReader(CustomSource source) {
    this.source = source;
  }

  @Override
  public boolean start() throws IOException {
    // this is where the reader inits the resources (client, ...) and calls the advance()
    // method to read the first record
    return advance();
  }

  @Override
  public boolean advance() throws IOException {
    // here we actually read the records and update the current record.
    if (something to read) {
      this.current = ...;
      return true;
    } else {
      return false;
    }
  }

  @Override
  public ? getCurrent() throws NoSuchElementException {
    if (current == null) {
      throw new NoSuchElementException();
    }
    return current;
  }

  @Override
  public void close() throws IOException {
    // close the resources created by the reader.
  }

  @Override
  public BoundedSource<?> getCurrentSource() {
    return this.source;
  }
}
IO Read: an Unbounded Source
1. Write a PTransform<PBegin, PCollection<?>> wrapping an unbounded source
2. Support advanced features like:
a. Splitting (multiple sources, all active, receiving messages for instance)
b. Timestamp & watermark (event time and event processing time)
c. Checkpoint mark (to deal with read failures and avoid a re-read)
d. Dedup (can deduplicate using a record ID)
3. Sources create readers. A reader works on a specific split, moving forward over the events, always active.
IO Read: unbounded source skeleton
public static class Read extends PTransform<PBegin, PCollection<?>> {
  public PCollection<?> expand(PBegin begin) {
    return begin.apply(org.apache.beam.sdk.io.Read.from(new CustomSource()));
  }
}

public static class CustomSource extends UnboundedSource<?, CustomCheckpointMark> {

  @Override
  public List<? extends UnboundedSource<?, CustomCheckpointMark>> split(
      int desiredNumSplits, PipelineOptions options) throws Exception {
    // like for bounded, we can split the read over multiple workers
    return Collections.singletonList(this);
  }

  @Override
  public UnboundedReader<?> createReader(PipelineOptions options,
      @Nullable CustomCheckpointMark checkpointMark) throws IOException {
    return new CustomReader(this);
  }

  @Override
  public Coder<CustomCheckpointMark> getCheckpointMarkCoder() {
    // as checkpoint marks are shared by all sources, they have to be serializable (machine to machine)
    // using a coder
  }
}

public static class CustomCheckpointMark implements UnboundedSource.CheckpointMark {
  // here we maintain a map of "pending" events, not yet fully read

  @Override
  public void finalizeCheckpoint() throws IOException {
    // this callback method is called by the runner when the events have been fully read
    // and can be acknowledged
  }
}

public static class CustomReader extends UnboundedSource.UnboundedReader<?> {

  private CustomSource source;
  private ? current;

  public CustomReader(CustomSource source) {
    this.source = source;
  }

  @Override
  public boolean start() throws IOException {
    // like for bounded, init the resources (client, ...) and call advance()
    return advance();
  }

  @Override
  public boolean advance() throws IOException {
    // read or receive an event to update current, timestamp and watermark
    // return true if an event has been received, false otherwise
  }

  @Override
  public ? getCurrent() throws NoSuchElementException {
    if (current == null) {
      throw new NoSuchElementException();
    }
    return current;
  }

  @Override
  public Instant getCurrentTimestamp() throws NoSuchElementException {
    // return the current timestamp using the event content or the backend system
  }

  @Override
  public void close() throws IOException {
    // close the reader and release resources
  }

  @Override
  public Instant getWatermark() {
    // return the watermark: a timestamp at or before the timestamps of all future elements read by this reader
    // (the timestamp of the oldest pending record). It can be estimated or based on the ACKs & checkpoint.
  }

  @Override
  public UnboundedSource.CheckpointMark getCheckpointMark() {
    // the current checkpoint mark for this reader
  }

  @Override
  public UnboundedSource<?, ?> getCurrentSource() {
    return source;
  }
}
IO Read: the SplittableDoFn
1. Write a PTransform<PBegin, PCollection<?>> wrapping a SplittableDoFn
2. Limits the boilerplate, limits the errors, easier to write.
3. Does not distinguish between bounded and unbounded like sources do.
4. Basically: it’s a DoFn supporting splitting. It’s a regular DoFn, just the processElement method takes a tracker.
5. Tracker and restriction to split and read chunks
IO Read: SplittableDoFn example
class CountFn<T> extends DoFn<KV<T, Long>, KV<T, Long>> {

  @ProcessElement
  public void process(ProcessContext c, OffsetRangeTracker tracker) {
    for (long i = tracker.currentRestriction().getFrom(); tracker.tryClaim(i); ++i) {
      c.output(KV.of(c.element().getKey(), i));
    }
  }

  @GetInitialRestriction
  public OffsetRange getInitialRange(KV<T, Long> element) {
    return new OffsetRange(0L, element.getValue());
  }
}

PCollection<KV<String, Long>> input = …;
PCollection<KV<String, Long>> output = input.apply(
    ParDo.of(new CountFn<String>()));
IO Read: current status of SplittableDoFn
1. The API is already part of the Beam core.
2. Supported by the runners
IOs & Filesystems
Filesystems
HDFS
Google Storage
S3 (WIP)
ADLS (WIP)
IOs
AMQP
Cassandra
Elasticsearch
Google PubSub
Google BigTable
Google BigQuery
HBase
HCatalog
JDBC
JMS
Kafka
Kinesis
MongoDB
MQTT
RabbitMQ (WIP)
Redis
Solr
Tika
XML
Using IOs: IoT Use Case
● Abstract: cars send location via MQTT. We want to check the cars in a given location in
real time.
● Streaming pipeline using Unbounded IO
○ Reads an unbounded collection of data, lives forever
○ Splitting (several readers)
○ Watermark (distinguish event time and event processing time, allowing downstream parts of the pipeline to know up to what point in time the data is complete)
○ Checkpointing (to avoid re-reading the same data in case of failure)
○ Deduplication
IoT Use Case
Let’s start with a simple Maven project:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>strata</groupId>
  <artifactId>strata</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <!-- Beam SDK -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
    </dependency>
    <!-- IOs -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-mqtt</artifactId>
    </dependency>
    <!-- HDFS -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
    </dependency>
  </dependencies>
</project>
Java SDK
MQTT IO - will be used for source
HDFS filesystem (and dependency) - will be used for sink
IoT Use Case
An “option” interface describing the pipeline options
private interface Options extends PipelineOptions {

  @Description("Fixed window duration, in seconds")
  @Default.Integer(WINDOW_SIZE)
  Integer getWindowSize();
  void setWindowSize(Integer value);

  @Description("Maximum coordinate value (axis X)")
  @Default.Integer(COORD_X)
  Integer getCoordX();
  void setCoordX(Integer value);

  @Description("Maximum coordinate value (axis Y)")
  @Default.Integer(COORD_Y)
  Integer getCoordY();
  void setCoordY(Integer value);

  @Description("Output Path")
  @Default.String(OUTPUT_PATH)
  String getOutput();
  void setOutput(String value);
}
Annotations describing the option and defining the default value
Simple getter/setter for the option
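The @Default annotations reference constants (WINDOW_SIZE, COORD_X, COORD_Y, OUTPUT_PATH) that are defined elsewhere in the class and not shown on the slides. A plausible sketch with hypothetical values (the output path is guessed from the /tmp/beam/cars_report files appearing in the logs later):
// Hypothetical constants backing the @Default annotations above.
private static final int WINDOW_SIZE = 10;                           // window duration, in seconds
private static final int COORD_X = 100;                              // maximum X coordinate
private static final int COORD_Y = 100;                              // maximum Y coordinate
private static final String OUTPUT_PATH = "/tmp/beam/cars_report";   // output file prefix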
IoT Use Case
Implementing the filter as a SerializableFunction
private static class FilterObjectsByCoordinates implements SerializableFunction<String, Boolean> {

  private Integer maxCoordX;
  private Integer maxCoordY;

  public FilterObjectsByCoordinates(Integer maxCoordX, Integer maxCoordY) {
    this.maxCoordX = maxCoordX;
    this.maxCoordY = maxCoordY;
  }

  @Override
  public Boolean apply(String input) {
    String[] split = input.split(",");
    if (split.length < 3) {
      // malformed record: filter it out rather than returning null
      return false;
    }
    Integer coordX = Integer.valueOf(split[1]);
    Integer coordY = Integer.valueOf(split[2]);
    return (coordX >= 0 && coordX < this.maxCoordX
        && coordY >= 0 && coordY < this.maxCoordY);
  }
}
A function that computes an output value of type Boolean from an input value of type String and is Serializable (in order to be executed in parallel on different workers)
Returns the result of invoking this function on the given input
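The slides don’t show how this function is wired into the pipeline; a minimal sketch, assuming it is used with Beam’s Filter.by on the PCollection<String> produced by the MQTT conversion step (named cars here for illustration):
// Hypothetical wiring of the filter; "cars" stands for the PCollection<String> built from MQTT.
PCollection<String> filtered = cars.apply("Filter by coordinates",
    Filter.by(new FilterObjectsByCoordinates(options.getCoordX(), options.getCoordY())));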
IoT Use Case
Create the pipeline
public final static void main(String[] args) throws Exception {
final Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
Pipeline pipeline = Pipeline.create(options);
Wrapped as a main to be directly executable
Load the options using the corresponding factory
Create the pipeline using the options
IoT Use Case
Reading messages from MQTT and converting to PCollection<String>
pipeline
.apply("MQTT Source", MqttIO.read()
.withConnectionConfiguration(MqttIO.ConnectionConfiguration.create("tcp://localhost:1883", "CAR")))
.apply("Byte To String Converter", ParDo.of(new DoFn<byte[], String>() {
@ProcessElement
public void processElement(ProcessContext processContext) {
byte[] element = processContext.element();
processContext.output(new String(element));
}
}))
Connect and receive messages from the MQTT broker
As the MQTT IO provides a PCollection<byte[]>, we use a ParDo/DoFn to convert it to a PCollection<String>
IoT Use Case
Windowing, Pane and trigger
.apply("Data Window", Window.<String>into(FixedWindows.of(Duration.standardSeconds(options.getWindowSize())))
.triggering(AfterWatermark.pastEndOfWindow())
.withAllowedLateness(Duration.ZERO)
.discardingFiredPanes()
)
WindowFn that windows values into fixed-size timestamp-based windows.
Trigger that fires when the watermark passes the end of the window.
Deal with late data arrival. Any elements that are later than this will be dropped. This value also determines how long state will be kept around for old windows. Once no element can be added to a window (because this duration has passed), any state associated with the window will be dropped.
Discards elements in a pane after they are triggered.
IoT Use Case
● In streaming mode, to get results, we have to window elements.
● Elements have a timestamp (event time) and the source maintains a watermark, which is the timestamp of the oldest work not yet completed (processing time). The source sets both the timestamps and the watermark.
● The trigger decides when we fire the result of a window, based on the watermark.
● Accumulation allows refining the results.
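The tail of the pipeline (counting per window, writing the report, running the pipeline) is not shown on the slides; a hedged sketch, inferred from the options above and from the Combine.perKey(Count) and WriteFiles steps visible in the runner logs below:
// Hypothetical end of the pipeline: count the cars per window and write one report file per window.
// The {"N"} format matches the cat output shown later; the exact transforms used in the talk may differ.
.apply("Count", Combine.globally(Count.<String>combineFn()).withoutDefaults())
.apply("Format", ParDo.of(new DoFn<Long, String>() {
  @ProcessElement
  public void processElement(ProcessContext context) {
    context.output("{\"" + context.element() + "\"}");
  }
}))
.apply("Write Report", TextIO.write()
    .to(options.getOutput())
    .withWindowedWrites()
    .withNumShards(1));

pipeline.run().waitUntilFinish();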
IoT Use Case
First execution: local using the Direct Runner
Building
Executing
<!-- Direct runner -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-direct-java</artifactId>
</dependency>
We add the direct runner in our Maven dependencies
$ mvn clean install
...
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
$ java -cp ….. strata.Main
IoT Use Case
We see the pipeline connected to ActiveMQ (MQTT)
The results are generated at the trigger.
IoT Use Case
Dec 02, 2017 7:24:51 AM org.apache.beam.sdk.io.FileBasedSink$Writer open
INFO: Opening temporary file /tmp/beam/.temp-beam-2017-12-02_06-23-46-0/a6b214e6-2931-42f4-a2df-91b2e736a9fc with MIME type text/plain to write destination null
shard 0 window [2017-12-02T06:23:50.000Z..2017-12-02T06:24:00.000Z) pane PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0}
Dec 02, 2017 7:24:51 AM org.apache.beam.sdk.io.FileBasedSink$Writer close
INFO: Successfully wrote temporary file /tmp/beam/.temp-beam-2017-12-02_06-23-46-0/a6b214e6-2931-42f4-a2df-91b2e736a9fc
Dec 02, 2017 7:24:51 AM org.apache.beam.sdk.io.WriteFiles$FinalizeWindowedFn finishBundle
INFO: Will finalize 1 files
Dec 02, 2017 7:24:51 AM org.apache.beam.sdk.io.FileBasedSink$WriteOperation copyToOutputFiles
INFO: Will copy temporary file /tmp/beam/.temp-beam-2017-12-02_06-23-46-0/a6b214e6-2931-42f4-a2df-91b2e736a9fc to final location /tmp/beam/cars_report2017-12-
02T06:23:50.000Z-2017-12-02T06:24:00.000Z-pane-0-last-00000-of-00001
Dec 02, 2017 7:24:51 AM org.apache.beam.sdk.io.FileBasedSink$WriteOperation removeTemporaryFiles
INFO: Will remove known temporary file /tmp/beam/.temp-beam-2017-12-02_06-23-46-0/a6b214e6-2931-42f4-a2df-91b2e736a9fc
$ cat cars_report2017-12-02T06:23:50.000Z-2017-12-02T06:24:00.000Z-pane-0-last-00000-of-00001
{“1”}
IoT Use Case
Now, let’s run on Spark. First, we add the Spark runner in the project:
<!-- Spark runner -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-spark</artifactId>
</dependency>
We add the Spark runner
IoT Use Case
For convenience, we build a shaded jar (embedding the dependencies):
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.1.0</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
IoT Use Case
Executing with “regular” spark-submit:
$ bin/spark-submit --class strata.Main --master spark://localhost:7077 /home/jbonofre/strata-1.0-SNAPSHOT.jar --runner=SparkRunner
…
2017-12-03 06:33:05,272 | INFO | her-event-loop-5 | BlockManagerInfo | Added broadcast_0_piece0 in memory on 127.0.0.1:39202 (size: 4.0 KB, free: 511.1
MB)
2017-12-03 06:33:05,273 | INFO | duler-event-loop | SparkContext | Created broadcast 0 from broadcast at DAGScheduler.scala:1006
2017-12-03 06:33:05,275 | INFO | duler-event-loop | DAGScheduler | Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[9] at map at
SparkUnboundedSource.java:110)
2017-12-03 06:33:05,276 | INFO | duler-event-loop | TaskSchedulerImpl | Adding task set 2.0 with 1 tasks
2017-12-03 06:33:05,795 | INFO | her-event-loop-4 | SparkDeploySchedulerBackend | Registered executor NettyRpcEndpointRef(null) (localhost.localdomain:48780)
with ID 0
2017-12-03 06:33:05,817 | INFO | her-event-loop-4 | TaskSetManager | Starting task 0.0 in stage 2.0 (TID 0, localhost.localdomain, partition
0,PROCESS_LOCAL, 3677 bytes)
2017-12-03 06:33:05,872 | INFO | her-event-loop-0 | BlockManagerMasterEndpoint | Registering block manager localhost.localdomain:33358 with 511.1 MB RAM,
BlockManagerId(0, localhost.localdomain, 33358)
2017-12-03 06:33:06,368 | INFO | her-event-loop-6 | BlockManagerInfo | Added broadcast_0_piece0 in memory on localhost.localdomain:33358 (size: 4.0 KB,
free: 511.1 MB)
2017-12-03 06:33:06,851 | INFO | her-event-loop-4 | MapOutputTrackerMasterEndpoint | Asked to send map output locations for shuffle 1 to
localhost.localdomain:48780
2017-12-03 06:33:06,853 | INFO | her-event-loop-4 | MapOutputTrackerMaster | Size of output statuses for shuffle 1 is 82 bytes
2017-12-03 06:33:06,874 | INFO | her-event-loop-0 | MapOutputTrackerMasterEndpoint | Asked to send map output locations for shuffle 0 to
localhost.localdomain:48780
2017-12-03 06:33:06,875 | INFO | her-event-loop-0 | MapOutputTrackerMaster | Size of output statuses for shuffle 0 is 82 bytes
IoT Use Case
We can see the application on master
IoT Use Case
We can see the job corresponding to the pipeline
IoT Use Case
We can see the DAG corresponding to the pipeline
IoT Use Case
Running on Flink. Like for Spark, let’s add the Flink runner:
<!-- Flink runner -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-flink_2.10</artifactId>
</dependency>
We add the Flink runner
IoT Use Case
Upload the pipeline jar to Flink
IoT Use Case
We can see the plan (DAG) in the dashboard
IoT Use Case
We can run the pipeline, either with the web UI
IoT Use Case
IoT Use Case
… or the command line
$ bin/flink run -c strata.Main -p 1 /home/jbonofre/strata-1.0-SNAPSHOT.jar --runner=FlinkRunner
Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Submitting job with JobID: 518a04a576222e0fba7f317718d5d4e5. Waiting for job completion.
Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-927633151] with leader session id 00000000-0000-0000-0000-000000000000.
12/03/2017 07:51:17 Job execution switched to status RUNNING.
12/03/2017 07:51:17 Source: Read(UnboundedMqttSource) -> Flat Map -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> Window/Window.Assign.out ->
ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ToKeyedWorkItem(1/1) switched to SCHEDULED
12/03/2017 07:51:17 Combine.perKey(Count) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(ApplyShardingKey) -> ToKeyedWorkItem(1/1)
switched to SCHEDULED
12/03/2017 07:51:17 GroupByKey -> ParMultiDo(WriteShardedBundles) -> ParMultiDo(Anonymous) -> Writing
Output/WriteFiles/Reshuffle/Window.Into()/Window.Assign.out -> ParMultiDo(Anonymous) -> ToKeyedWorkItem(1/1) switched to SCHEDULED
12/03/2017 07:51:17 GroupByKey -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) ->
ParMultiDo(FinalizeWindowed)(1/1) switched to SCHEDULED
12/03/2017 07:51:17 Source: Read(UnboundedMqttSource) -> Flat Map -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> Window/Window.Assign.out ->
ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ToKeyedWorkItem(1/1) switched to DEPLOYING
12/03/2017 07:51:17 Combine.perKey(Count) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(ApplyShardingKey) -> ToKeyedWorkItem(1/1)
switched to DEPLOYING
12/03/2017 07:51:17 GroupByKey -> ParMultiDo(WriteShardedBundles) -> ParMultiDo(Anonymous) -> Writing
Output/WriteFiles/Reshuffle/Window.Into()/Window.Assign.out -> ParMultiDo(Anonymous) -> ToKeyedWorkItem(1/1) switched to DEPLOYING
12/03/2017 07:51:17 GroupByKey -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) ->
ParMultiDo(FinalizeWindowed)(1/1) switched to DEPLOYING
12/03/2017 07:51:17 GroupByKey -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) -> ParMultiDo(Anonymous) ->
ParMultiDo(FinalizeWindowed)(1/1) switched to RUNNING
...
Summary
1. Unified model for both batch and streaming, supporting features like watermark, triggering, accumulation, …
2. Large set of IOs, filesystems and extensions
3. Agnostic of the execution engine (you don’t change your code) or the platform (on premise or cloud)
4. Extensible (IOs, runners, DSLs)
Apache Beam can be the glue in your ecosystem, flexible enough to match most of your use cases and optimize enterprise workloads.
http://beam.apache.org
@ApacheBeam
@jbonofre <jbonofre@apache.org>
Q&A
https://www.eventbrite.com/e/beam-summit-europe-2019-tickets-57933472576