10. Load Productions
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/titles.csv" as row
CALL apoc.merge.node(["Production",apoc.text.capitalize(toLower(row.type))], {productionId: row.id}, {title: row.title, …}, {}) YIELD node as p
WITH row, p
CALL {
…
MERGE (g:Genre {name: apoc.text.capitalize(genre)})
MERGE (p)-[r:CATEGORIZED_BY]->(g)
}
WITH row, p
CALL {
…
MERGE (c:Country {iso2Code: country})
MERGE (p)-[r2:PRODUCED_IN]->(c)
}
RETURN count(row);
https://github.com/JMHReif/graph-demo-datasets/blob/main/kaggle-netflix/load-data.cypher
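The two inner `CALL { … }` subqueries on this slide are elided. As a hedged sketch only (the column names `row.genres` and the list format are assumptions, not taken from the slide), the genre subquery might look something like:

```cypher
// Hypothetical sketch of the elided genre subquery.
// Assumes row.genres is a JSON-style list string such as "['drama', 'comedy']".
WITH row, p
CALL {
  WITH row, p
  UNWIND apoc.convert.fromJsonList(replace(row.genres, "'", '"')) AS genre
  MERGE (g:Genre {name: apoc.text.capitalize(genre)})
  MERGE (p)-[:CATEGORIZED_BY]->(g)
}
```

The country subquery would follow the same pattern, unwinding a list column and merging `Country` nodes with `PRODUCED_IN` relationships.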
11. Load Production People
:auto LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/credits.csv" as row
WITH row
CALL {
WITH row
MERGE (p:Person {personId: row.person_id})
SET p.name = row.name
WITH row, p
CALL apoc.create.addLabels(p,[apoc.text.capitalize(toLower(row.role))]) YIELD node as person
RETURN person
} IN TRANSACTIONS OF 20000 ROWS
RETURN count(row);
https://github.com/JMHReif/graph-demo-datasets/blob/main/kaggle-netflix/load-data.cypher
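`apoc.create.addLabels` turns each row's role value into an extra node label at load time. A quick way to sanity-check the result after the load (assuming role values such as ACTOR and DIRECTOR became `Actor`/`Director` labels, which is an assumption here) might be:

```cypher
// Hedged sketch: inspect which label combinations the dynamic-label load produced.
MATCH (p:Person)
RETURN labels(p) AS labelCombo, count(*) AS people
ORDER BY people DESC;
```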
12. Top to Bottom
• 2 CSV files
• Add country name with Wikipedia
• Use Cypher + APOC magic
• Start in small pieces
• :auto IN TRANSACTIONS to batch
• Multiple statements to conserve memory
• Can be scripted
• Also can schedule (using APOC)
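The scheduling bullet above can be sketched with APOC's background scheduler. This is a hedged illustration, not the deck's actual script: the job name, interval, and the statement body are placeholders.

```cypher
// Hypothetical sketch: re-run a load statement on a schedule with APOC.
// apoc.periodic.repeat(name, statement, rateInSeconds) runs the statement
// every rateInSeconds until cancelled with apoc.periodic.cancel(name).
CALL apoc.periodic.repeat(
  'nightly-title-refresh',                     // job name (placeholder)
  'LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/titles.csv" AS row
   MERGE (p:Production {productionId: row.id})
   SET p.title = row.title',
  86400                                        // once per day, in seconds
);
```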