IBM's journey in open source graphs and how its Open by Design approach has benefited open communities and IBM offerings. Presented at JanusGraph NYC Meetup, March 1, 2017. https://www.meetup.com/graphs/events/237100744/
Enabling Multimodel Graphs with Apache TinkerPop - Jason Plurad
Graphs are everywhere, but in a modern data stack, they are not the only tool in the toolbox. With Apache TinkerPop, adding graph capability on top of your existing data platform is not as daunting as it sounds. We will do a deep dive on writing Traversal Strategies to optimize performance of the underlying graph database. We will investigate how various TinkerPop systems offer unique possibilities in a multimodel approach to graph processing. We will discuss how using Gremlin frees you from vendor lock-in and enables you to swap out your graph database as your requirements evolve. Presented at Graph Day Texas, January 14, 2017. http://graphday.com/graph-day-at-data-day-texas/#plurad
Community-Driven Graphs with JanusGraph - Jason Plurad
Graphs are well-suited for many use cases to express and process complex relationships among entities in enterprise and social contexts. Fueled by the growing interest in graphs, there are various graph databases and processing systems that dot the graph landscape. JanusGraph is a community-driven project that continues the legacy of Titan, a pioneer of open source graph databases. JanusGraph is a scalable graph database optimized for large scale transactional and analytical graph processing. In the session, we will introduce JanusGraph, which features full integration with the Apache TinkerPop graph stack. We will discuss JanusGraph's optimized storage model that relies on HBase for fast graph traversal and processing. Presented with Jing Chen (Jerry) He at HBaseCon West 2017, June 12, 2017.
Graph Processing with Apache TinkerPop and Gremlin - Jason Plurad
Presented at the NVIDIA GPU-Accelerated Graph Ecosystem Roundtable. "Come share and learn more about how NVIDIA is accelerating the graph ecosystem and collaborating with the community on joint development opportunities. Join us to get the latest update on nvGraph, cuSTINGER, Gunrock, and query languages. Don't miss out on a great opportunity to provide feedback and take an active part in shaping the future of GPU-accelerated graph analytics." GPU Technology Conference, May 8, 2017, San Jose, California.
Graph Processing with Titan and Scylla - Jason Plurad
Graphs are growing in popularity, but the landscape is becoming a hairball. Learn how to unravel it with the Apache TinkerPop graph computing framework and Gremlin, a functional, data flow language for traversing graphs. This session helps you distinguish between OLTP and OLAP graph processing as well as how to bridge the gap between graph databases and graph engines. We will talk specifically about how Titan, an open source graph database, can combine with Scylla to handle both types of workloads. http://www.scylladb.com/summit/
Graph Computing with JanusGraph. Presented at Cleveland Big Data Mega Meetup on September 11, 2017. https://www.meetup.com/Cleveland-Hadoop/events/241553826/
JanusGraph: What's Next, Project Status Update. Presented at Open Source Graph Technologies NYC Meetup on August 24, 2017. https://www.meetup.com/graphs/events/241136321/
Community-Driven Graphs with JanusGraph - Jason Plurad
Presented at Open Camps (Database Camp, Search Camp) in New York City on November 19, 2017. http://www.searchcamp.io/2017/presentations/community-driven-graphs-with-janusgraph
Start Flying with Python & Apache TinkerPop - Jason Plurad
Gremlin, the graph traversal language from Apache TinkerPop, continues to evolve in support of the growing graph ecosystem. In this session, we'll take a deep dive into Gremlin Language Variants (GLV) to see how TinkerPop enables modern programming languages to leverage Gremlin natively. By converting Gremlin into bytecode, the same instructions can be transmitted and interpreted by graph systems from different vendors. We'll uncover the benefits of this approach by demonstrating a Python-based graph architecture built to empower your application developers and data scientists. By using popular open source Python packages, like the Flask microframework and Jupyter notebooks, we'll see how you can easily transition your app development from your machine to the IBM Cloud. Presented at Graph Day SF on June 17, 2017.
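The bytecode idea in the abstract above can be illustrated with a toy sketch. This is not the gremlinpython API or TinkerPop's actual bytecode format; the `Traversal` class and its list-of-steps encoding are hypothetical and only show the general pattern of recording a fluent traversal as data that could be serialized and sent to a server for interpretation.

```python
import json


class Traversal:
    """Toy traversal builder (hypothetical): records steps instead of executing them."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def __getattr__(self, name):
        # Any attribute not found on the object is treated as a step name (V, has, out, ...).
        def step(*args):
            # Return a new Traversal so earlier traversals are never mutated.
            return Traversal(self.steps + [[name, *args]])
        return step

    def bytecode(self):
        # The recorded steps, as plain lists that are easy to serialize.
        return self.steps


g = Traversal()
recorded = g.V().has("name", "marko").out("knows").values("name").bytecode()
wire_format = json.dumps(recorded)  # language-neutral text that any server could parse
```

Because the traversal is plain data rather than a query string, the same recorded steps could in principle be handed to any system that understands the step names, which is the portability argument the abstract makes.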
The JanusGraph project started at the Linux Foundation earlier this year, but it is not the new kid on the block. We'll start with a look at the origins and evolution of this open source graph database through the lens of a few IBM graph use cases. We'll discuss the new features in the latest release of JanusGraph, and then take a look at future directions to explore together with the open community. Presented on October 18, 2017 at the Graph Technologies Meetup in Santa Clara, CA. https://www.meetup.com/_CAIDI/events/243122187/
Graph Processing with Apache TinkerPop - Jason Plurad
Graphs are growing in popularity, but the landscape is becoming a hairball. Learn how to unravel it with the Apache TinkerPop graph computing framework and Gremlin, a functional, data flow language for traversing graphs. This session helps you distinguish between OLTP and OLAP graph processing as well as how to bridge the gap between graph databases and graph engines. We will offer TinkerPop alternatives for effective graph processing that go beyond Spark GraphX. We will also cover how to spin up a graph development environment quickly with Apache Ambari. Presented May 11, 2016 at Apache: Big Data 2016 conference. http://sched.co/6M2y
One of the first problems a developer encounters when evaluating a graph database is how to construct a graph efficiently. Recognizing this need in 2014, TinkerPop's Stephen Mallette penned a series of blog posts titled "Powers of Ten" which addressed several bulkload techniques for Titan. Since then Titan has gone away, and the open source graph database landscape has evolved significantly. Do the same approaches stand the test of time? In this session, we will take a deep dive into strategies for loading data of various sizes into modern Apache TinkerPop graph systems. We will discuss bulkloading with JanusGraph, the scalable graph database forked from Titan, to better understand how its architecture can be optimized for ingestion. Presented at Data Day Texas on January 27, 2018.
JanusGraph: Looking Backward and Reaching Forward - Demai Ni
JanusGraph: Looking Backward and Reaching Forward - by Jason Plurad (@pluradj):
The JanusGraph project started at the Linux Foundation earlier this year, but it is not the new kid on the block. We'll start with a look at the origins and evolution of this open source graph database through the lens of a few IBM graph use cases. We'll discuss the new features in the latest release of JanusGraph, and then take a look at future directions to explore together with the open community.
Presented at the Linked Data Benchmark Council (LDBC) Technical User Group (TUG) Meeting on June 8, 2018. http://www.ldbcouncil.org/blog/11th-tuc-meeting-university-texas-austin
Presented at Open Camps (Database Camp) in New York City on November 19, 2017. http://www.db.camp/2017/presentations/graph-computing-with-apache-tinkerpop
Exploring Graph Use Cases with JanusGraph - Jason Plurad
Graph databases are relative newcomers in the NoSQL database landscape. What are some graph model and design considerations when choosing a graph database in your architecture? Let's take a tour of a couple graph use cases that we've collaborated on recently with our clients to help you better understand how and why a graph database can be integrated to help solve problems found with connected data. Presented at DataWorks Summit San Jose - IBM Meetup on June 18, 2018.
https://www.meetup.com/BigDataDevelopers/events/251307524/
Airline Reservations and Routing: A Graph Use Case - Jason Plurad
We've all been there before... you hear the announcement that your flight is canceled. Fellow passengers race to the gate agent to rebook on the next available flight. How do they quickly determine the best route from Berlin to San Francisco? Ultimately, finding routes through the flight network is best treated as a graph problem. We will discuss our lessons learned from working with a major airline to solve this problem using the JanusGraph database. JanusGraph is an open source graph database designed for massive scale. It is compatible with several pieces of the open source big data stack: Apache TinkerPop (graph computing framework), HBase, Cassandra, and Solr. We will go into depth about our approach to benchmarking graph performance and discuss the utilities we developed. We will share our comparison results for evaluating which storage backend to use with JanusGraph. Whether you are productizing a new database or you are a frustrated traveler, a fast resolution is needed to satisfy everybody involved. Presented at DataWorks Summit Berlin on April 18, 2018.
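To make the "flight routing is a graph problem" point concrete, here is a minimal sketch in plain Python. It is not the approach or code from the talk, and it does not use JanusGraph; the `ROUTES` network and the `fewest_hops` helper are made up for illustration. Airports are vertices, direct flights are edges, and a breadth-first search finds a route with the fewest flights.

```python
from collections import deque

# Hypothetical route network: airport code -> airports reachable by one direct flight.
ROUTES = {
    "BER": ["FRA", "LHR"],
    "FRA": ["SFO", "JFK"],
    "LHR": ["JFK"],
    "JFK": ["SFO"],
    "SFO": [],
}


def fewest_hops(routes, origin, destination):
    """Return one route with the fewest flights, or None if none exists."""
    queue = deque([[origin]])   # each queue entry is a path from the origin
    visited = {origin}
    while queue:
        path = queue.popleft()
        airport = path[-1]
        if airport == destination:
            return path
        for nxt in routes.get(airport, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


best = fewest_hops(ROUTES, "BER", "SFO")  # ['BER', 'FRA', 'SFO']
```

A real rebooking system would also weigh departure times, seat availability, and connection windows; the sketch only shows why a graph traversal is a natural fit for the question.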
HBaseCon2017 Community-Driven Graphs with JanusGraph - HBaseCon
Graphs are well-suited for many use cases to express and process complex relationships among entities in enterprise and social contexts. Fueled by the growing interest in graphs, there are various graph databases and processing systems that dot the graph landscape. JanusGraph is a community-driven project that continues the legacy of Titan, a pioneer of open source graph databases. JanusGraph is a scalable graph database optimized for large scale transactional and analytical graph processing. In the session, we will introduce JanusGraph, which features full integration with the Apache TinkerPop graph stack. We will discuss JanusGraph's optimized storage model that relies on HBase for fast graph traversal and processing.
by Jason Plurad and Jing Chen He of IBM
In the slide deck, we describe how graph databases are used at Netflix. Graph databases can be faster than relational databases for deeply-connected data - a strength of the underlying model. We have used JanusGraph on top of Cassandra. Both technologies are Open Source.
Running Fast, Interactive Queries on Petabyte Datasets using Presto - AWS Jul... - Amazon Web Services
Learn how to deploy a managed Presto environment to interactively query log data on AWS
Organizations often need to quickly analyze large amounts of data, such as logs, generated from a wide variety of sources and formats. However, traditional approaches require a lot of time and effort to design complex data transformation and loading processes and to configure data warehouses. Using AWS, you can start querying your datasets within minutes.
In this webinar you will learn how you can deploy a managed Presto environment in minutes to interactively query log data using plain ANSI SQL. Presto is a popular open source SQL engine for running interactive analytic queries against data sources of all sizes. We will talk about common use cases and best practices for running Presto on Amazon EMR.
Learning Objectives:
• Learn how to deploy a managed Presto environment running on Amazon EMR
• Understand best practices for running Presto on Amazon EMR, including use of Amazon EC2 Spot instances
• Learn how other customers are using Presto to analyze large data sets
Texts and songs projected at Ten Bos on the third Sunday of Lent 2016 (Sint Amanduskerk Erembodegem). The texts of our services can be found on the website: http://www.kerkembodegem.be/tenbos/liturgie/vieringen.html
Dr Muhammad Athar Khan MBBS,DPH,DCPS-HCSM(MPH),MBA MCPS,PGD-Statistics,DCPS-... - Dr Athar Khan
Dr Muhammad Athar Khan
MBBS,DPH,DCPS-HCSM(MPH),MBA MCPS,PGD-Statistics,DCPS-HPE
Associate Professor
Department of Community Medicine
Liaquat College of Medicine & Dentistry
Pneumoconiosis and prevention Dr Muhammad Athar Khan MBBS,DPH,DCPS-HCSM(MP... - Dr Athar Khan
Dr Muhammad Athar Khan
MBBS,DPH,DCPS-HCSM(MPH),MBA MCPS,PGD-Statistics,DCPS-HPE
Associate Professor
Department of Community Medicine
Liaquat College of Medicine & Dentistry
Karachi, Pakistan
Article ti&m special: Metropolis digitalizes public transport - Peter Affolter
Cities are using systematic digitalization to make public transport more reliable and more efficient. The aim is to reduce cancellations and traffic jams, and to ensure that passengers have more enjoyable journeys by keeping them better informed and providing real-time journey planning.
Last Friday, March 3, the Early Childhood Education pupils celebrated carnival. Carnival is a special celebration for the youngest children. All the boys and girls love dressing up. Wearing a costume transforms them into other characters and brings them closer to their inner world of magic and imagination.
Last week, the pupils of class 5ºA held the second session of the activity "Imaginemos nuestro barrio" ("Let's Imagine Our Neighborhood"). In this session, the pupils learned about the districts into which the city of Seville is divided, along with their most emblematic places. They also learned the real names of the various neighborhoods that make up the Polígono Sur, such as the Martínez Montañés neighborhood, known as "las vegas", or the Murillo neighborhood, known as "los amarillos".
They were given maps of the Polígono Sur. On these maps, the pupils circled in green the areas that are in good condition and in red those that need improvement.
Het Levende Water (The Living Water; third Sunday of Lent 2017, A) - Ten Bos
Texts and songs projected at Ten Bos during the third Sunday of Lent (A) 2017 (Sint Amanduskerk Erembodegem). The texts of our services can be found on the website: http://www.kerkembodegem.be/tenbos/liturgie/vieringen.html
This introductory level talk is about Apache Flink: a multi-purpose Big Data analytics framework leading a movement towards the unification of batch and stream processing in the open source.
With its many technical innovations and its distinct vision and philosophy, it is described as the 4G (4th Generation) of Big Data analytics frameworks: a hybrid (real-time streaming + batch) open source distributed data processing engine that supports many use cases, including batch, streaming, relational queries, machine learning, and graph processing.
In this talk, you will learn about:
1. What is the Apache Flink stack, and how does it fit into the Big Data ecosystem?
2. How does Apache Flink integrate with Hadoop and other open source tools for data input and output as well as deployment?
3. Why is Apache Flink an alternative to Apache Hadoop MapReduce, Apache Storm, and Apache Spark?
4. Who is using Apache Flink?
5. Where to learn more about Apache Flink?
ApacheCon 2021 Apache Deep Learning 302 - Timothy Spann
ApacheCon 2021 Apache Deep Learning 302
Tuesday 18:00 UTC
Apache Deep Learning 302
Timothy Spann
This talk will discuss and show examples of using Apache Hadoop, Apache Kudu, Apache Flink, Apache Hive, Apache MXNet, Apache OpenNLP, Apache NiFi and Apache Spark for deep learning applications. This is the follow up to previous talks on Apache Deep Learning 101 and 201 and 301 at ApacheCon, Dataworks Summit, Strata and other events. As part of this talk, the presenter will walk through using Apache MXNet Pre-Built Models, integrating new open source Deep Learning libraries with Python and Java, as well as running real-time AI streams from edge devices to servers utilizing Apache NiFi and Apache NiFi - MiNiFi. This talk is geared towards Data Engineers interested in the basics of architecting Deep Learning pipelines with open source Apache tools in a Big Data environment. The presenter will also walk through source code examples available in github and run the code live on Apache NiFi and Apache Flink clusters.
Tim Spann is a Developer Advocate @ StreamNative where he works with Apache NiFi, Apache Pulsar, Apache Flink, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a Principal Field Engineer at Cloudera, a senior solutions architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, Data Works Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science.
* https://github.com/tspannhw/ApacheDeepLearning302/
* https://github.com/tspannhw/nifi-djl-processor
* https://github.com/tspannhw/nifi-djlsentimentanalysis-processor
* https://github.com/tspannhw/nifi-djlqa-processor
* https://www.linkedin.com/pulse/2021-schedule-tim-spann/
.NET developer for Jupyter Notebook and Apache Spark and vice versa - Marco Parenzan
Jupyter Notebooks and Apache Spark are first-class citizens of the data science space, a true requirement for the "modern" data scientist. But there was a prerequisite: being a Python developer. Now Microsoft is investing in C# as another first-class citizen in this space. Let's look at what .NET can do for notebooks and Spark, and at what notebooks and Spark are.
GPU Computing with Python and Anaconda: The Next Frontier - NVIDIA
Learn how Python is becoming the glue that binds data science, how rapid integration empowers data scientists to combine new technologies, and the two primary goals in store for Anaconda.
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines - Timothy Spann
https://www.aicamp.ai/event/eventdetails/W2024022214
apache nifi
llm
generative ai
gen ai
ml
dl
machine learning
apache kafka
apache flink
postgresql
python
AI Meetup (NYC): GenAI, LLMs, ML and Data
Feb 22, 05:30 PM EST
Welcome to the monthly in-person AI meetup in New York City, in collaboration with Microsoft. Join us for deep dive tech talks on AI, GenAI, LLMs and machine learning, food/drink, networking with speakers and fellow developers
Agenda:
* 5:30pm~6:00pm: Checkin, Food/drink and networking
* 6:00pm~6:10pm: Welcome/community update
* 6:10pm~8:30pm: Tech talks
* 8:30pm: Q&A, Open discussion
Tech Talk: Searching and Reasoning Over Multimedia Data with Vector Databases and LMMs
Speaker: Zain Hasan (Weaviate LinkedIn)
Abstract: In this talk, Zain Hasan will discuss how we can use open-source multimodal embedding models in conjunction with large generative multimodal models that can see, hear, read, and feel data(!), to perform cross-modal search (searching audio with images, videos with text, etc.) and multimodal retrieval augmented generation (MM-RAG) at the billion-object scale with the help of open source vector databases. I will also demonstrate, with live code demos, how being able to perform this cross-modal retrieval in real time can enable users to use LLMs that can reason over their enterprise multimodal data. This talk will revolve around how we can scale the usage of multimodal embedding and generative models in production.
Tech Talk: Codeless Generative AI Pipelines
Speaker: Timothy Spann (Cloudera LinkedIn)
Abstract: Join us for an insightful talk on leveraging the power of real-time streaming tools, specifically Apache NiFi, to revolutionize GenAI data engineering. In this session, we’ll explore how the integration of Apache NiFi can automate the entire process of prompt building, making it a seamless and efficient task.
Speakers/Topics:
Stay tuned as we are updating speakers and schedules. If you have a keen interest in speaking to our community, we invite you to submit topics for consideration: Submit Topics
Sponsors:
We are actively seeking sponsors to support our community. Whether it is by offering venue spaces, providing food/drink, or cash sponsorship. Sponsors will have the chance to speak at the meetups, receive prominent recognition, and gain exposure to our extensive membership base of 20,000+ local or 300K+ developers worldwide.
Venue:
Microsoft NYC - Times Square, 11 Times Square, New York, NY 10036
Room Name: Central Park West 6501
Community on Slack/Discord
- Event chat: chat and connect with speakers and attendees
- Sharing blogs, events, job openings, projects collaborations
Join Slack (search and join the #newyork channel) | Join Discord
AFCEA C4I Symposium: The 4th C in C4I Stands for Cloud:Factors Driving Adopti... - Patrick Chanezon
Computer systems architectures evolve in cycles every 15-20 years, oscillating between centralization and decentralization: centralized mainframes of the 60s, decentralized PCs of the 80s, centralized web apps of the 90s. Since 2010, we have seen a new architecture shift back to the 80s client-server model, with 3 trends: powerful mobile devices (Android, iPhone), the browser becoming a rich client platform with HTML5, and cloud platforms commoditizing distributed computing on the server. This talk is about the server side of the current architecture shift.
As with most technology architecture changes, cloud computing adoption is driven by factors from multiple dimensions, not only technical ones:
- technology: Big Data & fast networks, shift from vertical to horizontal scalability, commoditization of distributed computing (Virtualization, Sharding, Storage, NoSQL databases, Paxos, Map/Reduce, Go language), centralization of security
- economy: broadband and wireless ubiquity, shift from product to services, economies of scale, Moore's law, cost of electricity becoming the main driver of computing cost, pay-as-you-go models
- culture: consumerization of enterprise technology, technology achieves ubiquity by disappearing
20 years ago, when I was involved with Command and Control Systems for the French DoD, they were called C3I. Since then, it seems they added a C for Computers: C4I. Maybe for the next 20 years the 4th C of C4I should stand for Cloud.
Data minutes #2 Apache Pulsar with MQTT for Edge Computing Lightning - 2022Timothy Spann
21-Jan-2022. Friday 9:45 AM — 10 min. DataMinutes. Apache Pulsar with MQTT for Edge Computing. https://datagrillen.com/dataminutes/
Scylla Summit 2022: Learning Rust the Hard Way for a Production Kafka+ScyllaD...ScyllaDB
Numberly operates business-critical data pipelines and applications where failure and latency mean "lost money" in the best-case scenario. Most of those data pipelines and applications are deployed on Kubernetes and rely on Kafka and ScyllaDB, where Kafka acts as the message bus and ScyllaDB as the source of some data enrichment. The availability and latency of both systems are thus very important because they mix and match data in the early stage of their pipelines to be consumed by their platforms.
Most of their applications are developed using Python. But they always felt that they could benefit from a lower-level programming language to squeeze the performance of their hardware even further for some of the most demanding applications. So, when an important part of their data pipeline was to be adjusted to reflect some important changes in their platforms, they thought it was a great opportunity to rewrite it in Rust!
Moving to Rust was hard, not only because of the language itself, but because being at a lower level allowed them to see, test, and demonstrate things that they could not pinpoint or explain that well using Python. They spent a lot of time analyzing the latency impacts of code patterns and client driver settings and ended up contributing to Apache Avro as they went down the rabbit hole.
This session will share their experience transitioning from Python to Rust while meeting the expectations of a business-critical application mixing data from Confluent Kafka and ScyllaDB. There will be code snippets, graphs, numbers, tears, pull requests, grins, latency results, smiles, rants of frustration, and a lot of fun!
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
Deep Learning Frameworks 2019 | Which Deep Learning Framework To Use | Deep L...Simplilearn
This presentation covers the essential Deep Learning frameworks that are necessary to build AI models. You will learn about the development of essential frameworks such as TensorFlow, Keras, PyTorch, Theano, etc. You will also understand the programming languages used to build the frameworks, the different companies that use these frameworks, the characteristics of these Deep Learning frameworks, and the types of models that were built using these frameworks. Now, let us get started with understanding the different popular Deep Learning frameworks being used in industries.
Below are the different Deep Learning frameworks we'll be discussing in this presentation:
1. TensorFlow
2. Keras
3. PyTorch
4. Theano
5. Deep Learning 4 Java
6. Caffe
7. Chainer
8. Microsoft CNTK
Why Deep Learning?
It is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
Advancements in deep learning are being seen in smartphone applications, creating efficiencies in the power grid, driving advancements in healthcare, improving agricultural yields, and helping us find solutions to climate change. With this TensorFlow course, you’ll build expertise in deep learning models, learn to operate TensorFlow to manage neural networks and interpret the results.
And according to payscale.com, the median salary for engineers with deep learning skills tops $120,000 per year.
You can gain in-depth knowledge of Deep Learning by taking our Deep Learning certification training course. With Simplilearn’s Deep Learning course, you will prepare for a career as a Deep Learning engineer as you master concepts and techniques including supervised and unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms. Those who complete the course will be able to:
1. Understand the concepts of TensorFlow, its main functions, operations, and the execution pipeline
2. Implement deep learning algorithms, understand neural networks and traverse the layers of data abstraction which will empower you to understand data like never before
3. Master and comprehend advanced topics such as convolutional neural networks, recurrent neural networks, training deep networks and high-level interfaces
4. Build deep learning models in TensorFlow and interpret the results
5. Understand the language and fundamental concepts of artificial neural networks
6. Troubleshoot and improve deep learning models
7. Build your own deep learning project
8. Differentiate between machine learning, deep learning, and artificial intelligence
Learn more at https://www.simplilearn.com/deep-learning-course-with-tensorflow-training
28March2024-Codeless-Generative-AI-Pipelines
https://www.meetup.com/futureofdata-princeton/events/299440871/
https://www.meetup.com/real-time-analytics-meetup-ny/events/299290822/
******Note*****
The event is seat-limited, therefore please complete your registration here. Only people completing the form will be able to attend.
-----------------------
We're excited to invite you to join us in-person, for a Real-Time Analytics exploration!
Join us for an evening of insights and networking as we delve into the OSS technologies shaping the field!
Agenda:
05:30-06:00: Pizza and friends
06:00-06:40: Codeless GenAI Pipelines with Flink, Kafka, NiFi
06:40-07:20: Real-Time Analytics in the Corporate World: How Apache Pinot® Powers Industry Leaders
07:20-07:30: Q&A
Codeless GenAI Pipelines with Flink, Kafka, NiFi | Tim Spann, Cloudera
Explore the power of real-time streaming with GenAI using Apache NiFi. Learn how NiFi simplifies data engineering workflows, allowing you to focus on creativity over technical complexities. I'll guide you through practical examples, showcasing NiFi's automation impact from ingestion to delivery. Whether you're a seasoned data engineer or new to GenAI, this talk offers valuable insights into optimizing workflows. Join us to unlock the potential of real-time streaming and witness how NiFi makes data engineering a breeze for GenAI applications!
Real-Time Analytics in the Corporate World: How Apache Pinot® Powers Industry Leaders | Viktor Gamov, StarTree
Explore how industry leaders like LinkedIn, Uber Eats, and Stripe are mastering real-time data with Viktor as your guide. Discover how Apache Pinot transforms data into actionable insights instantly. Viktor will showcase Pinot's features, including the Star-Tree Index, and explain why it's a game-changer in data strategy. This session is for everyone, from data geeks to business gurus, eager to uncover the future of tech. Join us and be wowed by the power of real-time analytics with Apache Pinot!
-------
Tim Spann is a Principal Developer Advocate in Data In Motion for Cloudera.
He works with Apache NiFi, Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, DeltaLake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit and many more.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data migration tools like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?XfilesPro
Worried about document security while sharing them in Salesforce? Fret no more! Here are the top-notch security standards XfilesPro upholds to ensure strong security for your Salesforce documents while sharing with internal or external people.
To learn more, read the blog: https://www.xfilespro.com/how-does-xfilespro-make-document-sharing-secure-and-seamless-in-salesforce/
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach made it possible to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...Hivelance Technology
Cryptocurrency trading bots are computer programs designed to automate buying, selling, and managing cryptocurrency transactions. These bots utilize advanced algorithms and machine learning techniques to analyze market data, identify trading opportunities, and execute trades on behalf of their users. By automating the decision-making process, crypto trading bots can react to market changes faster than human traders.
Hivelance, a leading provider of cryptocurrency trading bot development services, stands out as the premier choice for crypto traders and developers. Hivelance boasts a team of seasoned cryptocurrency experts and software engineers who deeply understand the crypto market and the latest trends in automated trading. Hivelance leverages the latest technologies and tools in the industry, including advanced AI and machine learning algorithms, to create highly efficient and adaptable crypto trading bots.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on demand, capable of applying many data reduction and data analysis operations to the large ESGF data archives and transferring only the resultant analysis (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
1. Jason Plurad • pluradj@us.ibm.com • @pluradj
IBM Open Technology • Apache TinkerPop • JanusGraph
March 1, 2017 • JanusGraph NYC Meetup
IBM Open by Design: Graph Technology
3. Apache TinkerPop
§ Open source, vendor-agnostic graph computing framework
§ Gremlin graph traversal language
Apache TinkerPop™
Maintainer: Apache Software Foundation
License: Apache
Latest Release: 3.2.4, February 2017
https://tinkerpop.apache.org
@pluradj #JanusGraph
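As an illustration of what a graph traversal language expresses, here is a toy in-memory analogue written in plain Python. It is not the TinkerPop API: the helper functions `has`, `out`, and `values` are hypothetical stand-ins named after the Gremlin steps they imitate, and the vertices and edge labels are made up for the example. The Gremlin traversal being imitated would read roughly `g.V().has('name', 'JanusGraph').out('forkOf').values('name')`.

```python
# Toy property graph: vertices carry properties, edges carry a label.
# All data here is illustrative only.
VERTICES = {
    1: {"name": "Titan"},
    2: {"name": "JanusGraph"},
    3: {"name": "Apache TinkerPop"},
}
EDGES = [
    (2, "forkOf", 1),        # (source vertex, edge label, target vertex)
    (2, "integratesWith", 3),
]

def has(vertex_ids, key, value):
    """Keep only vertices whose property `key` equals `value`."""
    return [v for v in vertex_ids if VERTICES[v].get(key) == value]

def out(vertex_ids, label):
    """Follow outgoing edges with the given label to adjacent vertices."""
    return [
        dst
        for v in vertex_ids
        for (src, lbl, dst) in EDGES
        if src == v and lbl == label
    ]

def values(vertex_ids, key):
    """Read one property from each vertex."""
    return [VERTICES[v][key] for v in vertex_ids]

# Start at all vertices, filter, hop along an edge, then read a property.
start = has(list(VERTICES), "name", "JanusGraph")
forked_from = values(out(start, "forkOf"), "name")
print(forked_from)
```

Each helper takes a list of vertices and returns a new list, which mirrors how Gremlin steps are chained. In a real TinkerPop-enabled system the same chain is written once in Gremlin and executed by whichever graph database sits underneath.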
4. JanusGraph
§ Fork of TitanDB code base
§ Scalable graph database distributed on multi-machine clusters with pluggable storage and indexing
§ Vendor-neutral, open community with open governance
§ Contributors from Amazon, Expero, Google, GRAKN.AI, Hortonworks, IBM, Netflix, Orchestral Developments
JanusGraph™
Maintainer: Linux Foundation
License: Apache
First Release: Planned 1Q 2017
https://janusgraph.org
6. IBM Graph
§ Fully-managed, Apache TinkerPop compatible OLTP graph database
§ Focus on your data, not on install and operations
§ #sleepMore
IBM Graph
Maintainer: IBM
License: Commercial
Latest Release: GA, July 2016
https://ibm.biz/IBMGraph
7. JanusGraph + ScyllaDB
§ Scylla is a drop-in replacement for Apache Cassandra 2.1
– Higher throughput, lower latency
– C++ implementation, I/O scheduler
§ Scylla on IBM Compose (beta)
– https://www.compose.com/scylladb
§ Thrift compatibility starting with Scylla 1.3
ScyllaDB™
Maintainer: ScyllaDB
License: AGPL
Latest Release: 1.5, December 2016
https://scylladb.com
8. Powered by Graph
§ IBM BigInsights / IBM Open Platform with Apache Hadoop
§ IBM IT Service Management (NetCool, Application Performance Management)
§ IBM Personal Social Dashboard
§ IBM Research: Cognitive Eldercare
§ IBM Watson Cognitive Security