Are Open LLMs useful for production applications, or are they low quality toys useful only for experiments? We share our experiences using open LLMs vs proprietary LLMs.
Use Case Patterns for LLM Applications (1).pdf - M Waleed Kadous
What are the "use case patterns" for deploying LLMs into production? Understanding these will allow you to spot "LLM-shaped" problems in your own industry.
The need for intelligent, personalized experiences powered by AI is ever-growing. Our devices are producing more and more data that could help improve our AI experiences. How do we learn and efficiently process all this data from edge devices while maintaining privacy? On-device learning rather than cloud training can address these challenges. In this presentation, we’ll discuss:
- Why on-device learning is crucial for providing intelligent, personalized experiences without sacrificing privacy
- Our latest research in on-device learning, including few-shot learning, continuous learning, and federated learning
- How we are solving system and feasibility challenges to move from research to commercialization
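The federated learning direction above can be made concrete with a toy sketch: each device trains on its own data, and only model weights (never raw data) leave the device to be averaged centrally. This is a minimal FedAvg loop in numpy; the linear model and all names are illustrative, not any particular framework's API:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: plain gradient descent on a
    linear least-squares model (a stand-in for the on-device model)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(global_w, clients):
    """FedAvg round: every client trains locally, then the server
    averages the returned weights, weighted by sample counts."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(4):                    # four devices, each with private data
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(30):                   # communication rounds
    w = federated_average(w, clients)
```

The privacy-relevant point is structural: `federated_average` only ever sees weight vectors, so the raw `(X, y)` pairs stay on each client.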
5G is going mainstream across the globe, and this is an exciting time to harness the low latency and high capacity of 5G to enable the metaverse. A distributed-compute architecture across device and cloud can enable rich extended reality (XR) user experiences. Virtual reality (VR) and mixed reality (MR) are ready for deployment in private networks, while augmented reality (AR) for wide area networks can be enabled in the near term with Wi-Fi powered AR glasses paired with a 5G-enabled phone. Device APIs that enable application adaptation are critical for a good user experience. 5G standards are evolving to support the deployment of AR glasses at a large scale and setting the stage for the 6G era with the merging of the physical, digital, and virtual worlds. Techniques like perception-enhanced wireless offer significant potential to improve user experience. Qualcomm Technologies is enabling the XR industry with platforms, developer SDKs, and reference designs.
Check out this webinar to learn:
• How 5G and distributed-compute architectures enable the metaverse
• The latest results from our boundless XR 5G/6G testbed, including device APIs and perception-enhanced wireless
• 5G standards evolution for enhancing XR applications and the road to 6G
• How Qualcomm Technologies is enabling the industry with platforms, SDKs, and reference designs
LAMP is a web development platform consisting of Linux, Apache, MySQL, and PHP. Linux provides the operating system, Apache is the web server, MySQL stores data in a database, and PHP is the scripting language that brings it all together. Together, these open source components provide a low-cost, robust solution for building dynamic web applications and websites.
The document discusses different methods for customizing large language models (LLMs) with proprietary or private data, including training a custom model, fine-tuning a general model, and prompting with expanded inputs. Fine-tuning techniques like low-rank adaptation and supervised fine-tuning allow emphasizing custom knowledge without full retraining. Prompt expansion using techniques like retrieval augmented generation can provide additional context beyond the character limit.
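The low-rank adaptation technique mentioned above freezes the pretrained weight matrix W and trains only a small low-rank update BA on top of it. A minimal numpy sketch of that idea (dimensions, init, and the alpha scaling are illustrative, not any specific library's defaults):

```python
import numpy as np

rng = np.random.default_rng(42)

d_out, d_in, rank = 64, 64, 4               # full layer vs. low-rank update

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01    # trainable, small random init
B = np.zeros((d_out, rank))                 # trainable, zero init

def lora_forward(x, alpha=8):
    """Adapted layer: frozen W plus the scaled low-rank update B @ A.
    Because B starts at zero, the adapter is an exact no-op at init."""
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # no-op at initialization

# Only A and B are trained: far fewer parameters than the full matrix.
full_params = W.size                        # 64 * 64 = 4096
lora_params = A.size + B.size               # 4*64 + 64*4 = 512
```

This is why LoRA "emphasizes custom knowledge without full retraining": gradient updates touch only the 512 adapter parameters while the 4096-parameter base weight stays frozen.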
- LLaMA 2 is a family of large language models developed by Meta in partnership with Microsoft and others. It has been pretrained on 2 trillion tokens and has three model sizes up to 70 billion parameters.
- LLaMA 2 was trained using an auto-regressive transformer and reinforcement learning from human feedback to improve safety and alignment. It can generate text, translate languages, and answer questions.
- The models were pretrained on Meta's research supercomputers then fine-tuned for dialog using supervised learning and reinforcement learning from human feedback to further optimize safety and usefulness.
Optii is an AI-powered supply chain optimization solution that automates supply chain planning. It uses advanced modeling and machine intelligence to identify the optimal configuration of a supply chain network to maximize service levels while minimizing costs. Optii targets businesses that currently use manual planning systems by automating the configuration without requiring replacement of existing infrastructure. The company has validated its solution through proofs-of-concept and pilots with industry-leading customers. It is seeking seed funding to further develop its capabilities and expand its sales and marketing efforts.
This presentation outlines the synergistic nature of 5G and AI -- two disruptive areas of innovation that can change the world. It illustrates the benefits of adopting AI for the advancement of 5G, as well as showcases the latest progress made by Qualcomm Technologies, Inc.
Llama 2 Open Foundation and Fine-Tuned Chat Models.pdf - Dr. Yasir Butt
This document summarizes the development and release of Llama 2, a collection of pretrained and fine-tuned large language models ranging from 7 billion to 70 billion parameters. Key points:
1) Llama 2 models were pretrained on a larger corpus of publicly available data and with increased context length and attention mechanisms compared to prior models.
2) Llama 2-Chat models were fine-tuned via supervised learning and reinforcement learning with human feedback to optimize dialogue capabilities and outperform existing open-source models based on evaluations.
3) Safety techniques like data annotation, red-teaming, and iterative evaluations were used in pretraining and fine-tuning to improve the safety of Llama 2 models.
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face" - Fwdays
In this talk I'll start by introducing the recent breakthroughs in NLP that resulted from the combination of Transfer Learning schemes and Transformer architectures. The second part of the talk will be dedicated to an introduction of the open-source tools released by Hugging Face, in particular our transformers, tokenizers, and NLP libraries as well as our distilled and pruned models.
The content was modified from Google Content Group
Eric ShangKuan(ericsk@google.com)
---
TensorFlow Lite guide (for mobile & IoT)
TensorFlow Lite is a set of tools to help developers run TensorFlow models on mobile, embedded, and IoT devices. It enables on-device machine learning inference with low latency and small binary size.
TensorFlow Lite consists of two main components:
The TensorFlow Lite interpreter:
- which runs specially optimized models on many different hardware types, like mobile phones, embedded Linux devices, and microcontrollers.
The TensorFlow Lite converter:
- which converts TensorFlow models into an efficient form for use by the interpreter, and can introduce optimizations to improve binary size and performance.
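Assuming a TensorFlow environment, the converter and interpreter can be exercised end to end roughly like this; the tiny Keras model is a stand-in for a real trained model:

```python
import numpy as np
import tensorflow as tf

# Toy model standing in for whatever you actually trained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2, activation="relu"),
])

# Converter: emit the compact FlatBuffer form; optional optimizations
# (e.g. dynamic-range quantization) shrink size and speed up inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

# Interpreter: this is the part that runs on-device.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.zeros((1, 4), dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
```

In a real deployment only `tflite_bytes` ships to the device; the full TensorFlow install is needed just at conversion time.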
---
Event: PyLadies TensorFlow All-Around
Date: Sep 25, 2019
Event link: https://www.meetup.com/PyLadies-Berlin/events/264205538/
Linkedin: http://linkedin.com/in/mia-chang/
Class lecture by Prof. Raj Jain on Introduction to OpenFlow. The talk covers Planes of Networking, Data vs. Control Logic, OpenFlow: Key Ideas, History of OpenFlow, Separation of Control and Data Plane, OpenFlow V1.0, Matching, Counters, Actions, Hardware OpenFlow Switches, Software OpenFlow Switches, Open vSwitch, Open vSwitch Features, OVSDB, OpenFlow V1.1, OpenFlow Hardware Implementation, OpenFlow V1.2, OpenFlow V1.3, OpenFlow V1.4, Implementation Issues, Current Limitations of OpenFlow, and OpenFlow Current Activities. Video recording available on YouTube.
The document discusses various components of IoT including control units, communication modules, and wireless technologies. Control units include sensors and actuators that convert physical phenomena into electrical signals. Common sensors detect humidity, temperature, motion etc. Communication modules allow connection and data transfer between IoT devices using short-range wireless technologies like Bluetooth, Zigbee and WiFi. Bluetooth supports audio/video transfer while Bluetooth Low Energy focuses on low power. Zigbee is optimized for large sensor networks with low data rates and power consumption.
The document discusses the boot process of the Raspberry Pi. It begins with a first stage bootloader in the ROM that mounts the FAT32 partition on the SD card. It then loads a second stage bootloader (bootcode.bin) which initializes RAM and PLLs. This bootloader parses config.txt and loads a third stage (the RTOS binary start.elf). The RTOS is launched, it splits RAM between the GPU and ARM, loads config.txt and cmdline.txt, and hands over to the operating system.
Plantee Innovations is raising $1.4 million to launch their flagship smart indoor gardening product called Plantee and expand into new markets. Plantee is an all-in-one smart greenhouse device that monitors and controls lighting, watering, fertilization and other conditions to provide optimal care for houseplants. The company has already validated demand through a successful Kickstarter campaign and plans to achieve $1.7 million in revenue by 2025 by selling Plantee units and accessories to their target customer of serial plant killers in developed markets.
This document provides an overview of a hands-on workshop on the Constrained Application Protocol (CoAP). It outlines the agenda which includes introductions to CoAP, the Californium CoAP framework, and hands-on projects. Attendees will work through example CoAP client and server code using the Californium libraries and test their implementations. Advanced CoAP topics like security, proxies, and resource directories are also discussed.
- Zigbee is a wireless mesh networking standard used for low-power wireless personal area networks. It operates on the IEEE 802.15.4 standard and defines the higher layers for reliable transmission of data between devices.
- 6LoWPAN is an adaptation layer that allows IPv6 packets to be sent over IEEE 802.15.4 low-power wireless networks. It provides compression mechanisms to encapsulate IPv6 datagrams into frames compatible with the IEEE 802.15.4 standard.
- Both Zigbee and 6LoWPAN are commonly used in wireless sensor networks and Internet of Things applications where many devices need to communicate wirelessly over short distances with low power consumption.
[DSC Europe 23] Spela Poklukar & Tea Brasanac - Retrieval Augmented Generation - DataScienceConferenc1
Retrieval Augmented Generation (RAG) combines the concepts of semantic search and LLM-based text generation. When a person makes a query in natural language, the query is compared to the entries in the knowledge base and the most relevant results are returned to the LLM, which uses this extra information to generate a more accurate and reliable response. RAG can therefore limit hallucination and provide accurate responses from reliable sources. In this talk, we will present the concept of RAG and the underlying concept of semantic search, and present available libraries and vector databases.
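The retrieve-then-generate flow can be sketched without any particular library. Here the embedding is a deliberately crude bag-of-words stand-in for a real sentence-embedding model and vector database, and the assembled prompt is what would be sent to the LLM:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words vector. A real system would use
    a sentence-embedding model plus a vector database instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

knowledge_base = [
    "Llama 2 was pretrained on 2 trillion tokens of public data.",
    "CoAP is a protocol for constrained IoT devices.",
    "LoRA fine-tunes models by training low-rank weight updates.",
]

def retrieve(query, k=1):
    """Semantic-search step: rank knowledge-base entries by similarity."""
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda d: cosine(q, embed(d)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Augmentation step: prepend retrieved context to the LLM call."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("How many tokens was Llama 2 pretrained on?")
```

Because the answer is grounded in the retrieved context rather than the model's parametric memory, this is the mechanism by which RAG limits hallucination.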
John Chiappetta's presentation on 5G, edge computing, and IoT, and how interdependent they are.
Held Nov 19, 2019 at Milton Education Village Innovation Centre, Milton, Ontario, Canada.
Learn more here: https://siliconhalton.com/event/meetup-119-what-is-5g-and-edge-computing/
1. The document describes Netmaker, a next-generation networking platform that simplifies networking by securely connecting devices everywhere using the WireGuard protocol.
2. Netmaker has over 1400 production platforms, 50% monthly growth for the last 6 months, and over 1000 community members. It is actively used on over 10,000 devices.
3. The document outlines Netmaker's roadmap, including releasing an enterprise version in beta in Q1 2022, and reaching $150,000 monthly recurring revenue from SMBs and enterprises by the end of 2023.
This document provides an introduction to eBPF (Extended Berkeley Packet Filter), which allows running user-space code in the Linux kernel without needing to compile a kernel module. It describes how eBPF avoids unnecessary copying of packets between kernel and user-space for improved performance. Examples are given of using eBPF for networking tasks like SDN configuration, DDoS mitigation, intrusion detection, and load balancing. The document concludes by noting eBPF provides alternatives to iptables that are better suited for microservices architectures.
Pitch Deck Teardown - Doola's $1m Series A extension deck - HajeJanKamps
Doola is a "Business-in-a-Box" platform that provides end-to-end solutions for entity formation, compliance, banking, payments processing, and ongoing support for non-US entrepreneurs to start businesses in America. It aims to remove the difficulties non-US founders face when cobbling together these solutions individually. Doola has raised over $8 million in funding to date and sees an opportunity to build out additional financial services and a full fintech platform to tap into the large market of businesses started by non-US residents in America each year.
How to build high performance 5G networks with vRAN and O-RAN - Qualcomm Research
5G networks are poised to deliver an unprecedented amount of data from a richer set of use cases than we have ever seen. This makes efficient networking in terms of scalability, cost, and power critical for the sustainable growth of 5G. Cloud technologies such as virtualization, containerization and orchestration are now powering a surge of innovation in virtualized radio access network (vRAN) infrastructure with modular hardware and software components, and standardized interfaces. While commercial off-the-shelf (COTS) hardware platforms provide the compute capacity for running vRAN software, hardware accelerators will also play a major role in offloading real-time and complex signal processing functions. Together, COTS platforms and hardware accelerators provide the foundation for building the intelligent 5G network and facilitate innovative new use cases with the intelligent wireless edge.
Pitch Deck Teardown: Scalestack's $1M AI sales tech Seed deck - HajeJanKamps
This document introduces Scalestack, an AI-powered sales operations platform that aims to improve sales productivity and pipeline generation. It summarizes Scalestack's solution as bringing together all GTM data, suggesting prioritized actions, and helping reps execute sales plays with context. It provides an example customer story of how Scalestack helped MongoDB improve data quality, efficiency, and close over $20M in additional revenue. Scalestack differentiates itself from CRMs, sales tools, and data providers by focusing on automating the early sales funnel through personalized insights and actionable next steps for reps.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help boost feelings of calmness, happiness and focus.
Cloud Robotics: It’s time to offload their brain on Cloud, for better Robotic... - Sai Natkar
Cloud robotics is an emerging field of robotics ingrained in cloud computing. It allows robots to benefit from the powerful computational, storage, and communications resources of modern data centers. When computational or storage demands exceed the on-board capacity of a robot, the work is offloaded to the cloud, where the massive resources of a data center can supplement the robot's limited local resources.
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf - GDG Bujumbura
Struggling with limited GPU resources but want to leverage large language models (LLMs)? This session provides a deep dive into cutting-edge LLM compression methods like quantization, pruning, and knowledge distillation. Learn how to efficiently run LLMs without sacrificing performance. Ideal for data scientists, machine learning engineers, and AI enthusiasts keen on cost-effective solutions. Includes a 5-minute Q&A.
Tensors Are All You Need: Faster Inference with Hummingbird - Databricks
The ever-increasing interest around deep learning and neural networks has led to a vast increase in processing frameworks like TensorFlow and PyTorch. These libraries are built around the idea of a computational graph that models the dataflow of individual units. Because tensors are their basic computational unit, these frameworks can run efficiently on hardware accelerators (e.g. GPUs). Traditional machine learning (ML) such as linear regressions and decision trees in scikit-learn cannot currently be run on GPUs, missing out on the potential accelerations that deep learning and neural networks enjoy.
In this talk, we’ll show how you can use Hummingbird to achieve 1000x speedup in inferencing on GPUs by converting your traditional ML models to tensor-based models (PyTorch and TVM). https://github.com/microsoft/hummingbird
This talk is for intermediate audiences that use traditional machine learning and want to speedup the time it takes to perform inference with these models. After watching the talk, the audience should be able to use ~5 lines of code to convert their traditional models to tensor-based models to be able to try them out on GPUs.
Outline:
Introduction of what ML inference is (and why it’s different than training)
Motivation: Tensor-based DNN frameworks allow inference on GPU, but “traditional” ML frameworks do not
Why “traditional” ML methods are important
Introduction of what Hummingbird does and main benefits
Deep dive on how traditional ML models are built
Brief intro on how the Hummingbird converter works
Example of how Hummingbird can convert a tree model into a tensor-based model
Other models
Demo
Status
Q&A
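The tree-to-tensor conversion the talk demonstrates can be illustrated by hand for a toy tree: encode the node tests and the root-to-leaf paths as matrices, then evaluate a whole batch with matrix multiplies. Hummingbird automates this construction for real scikit-learn models; this hand-built version follows the spirit of its GEMM strategy:

```python
import numpy as np

# A tiny hand-built decision tree:
#   n0: x0 < 0.5 ? -> n1    : leaf2
#   n1: x1 < 0.5 ? -> leaf0 : leaf1

A = np.array([[1.0, 0.0],      # feature-to-node: which feature each
              [0.0, 1.0]])     # internal node (n0, n1) tests
B = np.array([0.5, 0.5])       # per-node thresholds

C = np.array([[ 1.0,  1.0, -1.0],   # node-to-leaf paths: +1 = "went
              [ 1.0, -1.0,  0.0]])  # left", -1 = "went right", 0 = n/a
E = np.array([2.0, 1.0, 0.0])       # left-edges on the path to each leaf
leaf_values = np.array([10.0, 20.0, 30.0])

def tree_gemm(X):
    D = (X @ A < B).astype(np.float32)        # evaluate all node tests
    onehot = (D @ C == E).astype(np.float32)  # exactly one leaf matches
    return onehot @ leaf_values

X = np.array([[0.0, 0.0],    # x0<0.5, x1<0.5  -> leaf0 -> 10
              [0.0, 1.0],    # x0<0.5, x1>=0.5 -> leaf1 -> 20
              [1.0, 0.0]])   # x0>=0.5         -> leaf2 -> 30
preds = tree_gemm(X)         # -> [10., 20., 30.]
```

Every step in `tree_gemm` is dense linear algebra, so the same evaluation runs unchanged on a GPU via PyTorch or TVM tensors, which is the source of the batched-inference speedups the talk reports.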
The ever-increasing interest around deep learning and neural networks has led to a vast increase in processing frameworks like TensorFlow and PyTorch. These libraries are built around the idea of a computational graph that models the dataflow of individual units. Because tensors are their basic computational unit, these frameworks can run efficiently on hardware accelerators (e.g. GPUs).Traditional machine learning (ML) such as linear regressions and decision trees in scikit-learn cannot currently be run on GPUs, missing out on the potential accelerations that deep learning and neural networks enjoy.
In this talk, we’ll show how you can use Hummingbird to achieve 1000x speedup in inferencing on GPUs by converting your traditional ML models to tensor-based models (PyTorch andTVM). https://github.com/microsoft/hummingbird
This talk is for intermediate audiences that use traditional machine learning and want to speedup the time it takes to perform inference with these models. After watching the talk, the audience should be able to use ~5 lines of code to convert their traditional models to tensor-based models to be able to try them out on GPUs.
Outline:
Introduction of what ML inference is (and why it’s different than training)
Motivation: Tensor-based DNN frameworks allow inference on GPU, but “traditional” ML frameworks do not
Why “traditional” ML methods are important
Introduction of what Hummingbirddoes and main benefits
Deep dive on how traditional ML models are built
Brief intro onhow Hummingbird converter works
Example of how Hummingbird can convert a tree model into a tensor-based model
Other models
Demo
Status
Q&A
Recommendations for Building Machine Learning SoftwareJustin Basilico
This document provides recommendations for building machine learning software from the perspective of Netflix's experience.
The first recommendation is to be flexible about where and when computation happens by distributing components across offline, nearline, and online systems. The second is to think about distribution starting from the outermost levels of the problem by parallelizing across subsets of data, hyperparameters, and machines. The third recommendation is to design application software for experimentation by sharing components between experiment and production code. The fourth recommendation is to make algorithms and models extensible and modular by providing reusable building blocks. The fifth recommendation is to describe input and output transformations with models. The sixth recommendation is to not rely solely on metrics for testing and instead implement unit testing of code.
What are the most promising projects and how good are they?
In recent years different groups have used the transformer architecture (a deep learning model) to train neural networks using large quantities of text. With the increase in compute power these models have grown to billions or even hundred of billions of parameters. As the model size grew, noteworthy abilities emerged. Such as the ability to generate text showing surprising reasoning skills to the point that the leading models can now successfully take college-level exams.
Currently some of the best and most famous models are proprietary and released to the public as a service. However a large Open Source community has emerged that tries to train and fine tune free models that can be used self-hosted. This is a challenging task due to problems with potential copyright issues with the training text, the large computational cost of the training itself and the supervised fine tuning step to adapt the model to its final use case.
In this talk I will give an overview on what the most promising projects in this space are and how they compare to the proprietary state-of-the-art models of the large players.
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...MLconf
Recommendations for Building Machine Learning Software: Building a real system that uses machine learning can be a difficult both in terms of the algorithmic and engineering challenges involved. In this talk, I will focus on the engineering side and discuss some of the practical lessons we’ve learned from years of developing the machine learning systems that power Netflix. I will go over what it takes to get machine learning working in a real-life feedback loop with our users and how that imposes different requirements and a different focus than doing machine learning only within a lab environment. This involves lessons around challenges such as where to place algorithmic components, how to handle distribution and parallelism, what kinds of modularity are useful, how to support both production experimentation, and how to test machine learning systems.
LLMOps for Your Data: Best Practices to Ensure Safety, Quality, and CostAggregage
Join Shreya Rajpal, CEO of Guardrails AI, and Travis Addair, CTO of Predibase, in this exclusive webinar to learn all about leveraging the part of AI that constitutes your IP – your data – to build a defensible AI strategy for the future!
This document discusses object-oriented design principles including encapsulation, abstraction, inheritance, polymorphism, and decoupling. It then introduces the SOLID principles of object-oriented design: single responsibility principle, open/closed principle, Liskov substitution principle, interface segregation principle, and dependency inversion principle. Code examples are provided to demonstrate how to apply these principles and improve code maintainability, reusability, and testability.
What are some of the performance implications of using lambdas and what strategies can be used to address these. When might be want an alternative to using a lambda and how can we design our APIs to be flexible in this regard. What are the principles of writing low latency code in Java? How do we tune and optimize our code for low latency? When don’t we optimize our code? Where does the JVM help and where does it get in our way? How does this apply to lambdas? How can we design our APIs to use lambdas and minimize garbage?
Microservices for performance - GOTO Chicago 2016Peter Lawrey
How do Microservices and Trading Systems overlap?
How can one area learn from the other?
How can we test components of microservices?
Is there a library which helps us implement and test these services?
The document discusses OttoBot, an AI assistant created by Lukas Biewald to be helpful, harmless, and honest. It can perform tasks like getting the weather by calling functions like weather(location="Boston"). The document explores improving OttoBot's accuracy at this task through techniques like prompt engineering, adding relevant examples, using different LLMs, and fine-tuning. It finds the best results come from combining prompt improvements with fine-tuning an LLM.
The document discusses emerging trends in software and services including:
1) Software as a Service and cloud computing which allows software to be delivered and consumed "as a service" with service level agreements.
2) The growth of massive data centers which are becoming large physical assets requiring significant capital expenditures.
3) The rise of "Dev-signers" or designer-developers who are combining development and design skills.
4) The integration of software and services will be key as local software interacts with internet services to provide combined capabilities.
Automatic License Plate Recognition using OpenCVEditor IJCATR
Automatic License Plate Recognition system is a real time embedded system which automatically recognizes the license plate of vehicles. There are many applications ranging from complex security systems to common areas and from parking admission to urban traffic control. Automatic license plate recognition (ALPR) has complex characteristics due to diverse effects such as of light and speed. Most of the ALPR systems are built using proprietary tools like Matlab. This paper presents an alternative method of implementing ALPR systems using Free Software including Python and the Open Computer Vision Library.
Automatic License Plate Recognition using OpenCV Editor IJCATR
Automatic License Plate Recognition system is a real time embedded system which automatically recognizes the license plate of vehicles. There are many applications ranging from complex security systems to common areas and from parking admission to urban traffic control. Automatic license plate recognition (ALPR) has complex characteristics due to diverse effects such as of light and speed. Most of the ALPR systems are built using proprietary tools like Matlab. This paper presents an alternative method of implementing ALPR systems using Free Software including Python and the Open Computer Vision Library.
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
Talk from Software Engineering for Machine Learning Workshop (SW4ML) at the Neural Information Processing Systems (NIPS) 2014 conference in Montreal, Canada on 2014-12-13.
Abstract:
Building a real system that incorporates machine learning as a part can be a difficult effort, both in terms of the algorithmic and engineering challenges involved. In this talk I will focus on the engineering side and discuss some of the practical issues we’ve encountered in developing real machine learning systems at Netflix and some of the lessons we’ve learned over time. I will describe our approach for building machine learning systems and how it comes from a desire to balance many different, and sometimes conflicting, requirements such as handling large volumes of data, choosing and adapting good algorithms, keeping recommendations fresh and accurate, remaining responsive to user actions, and also being flexible to accommodate research and experimentation. I will focus on what it takes to put machine learning into a real system that works in a feedback loop with our users and how that imposes different requirements and a different focus than doing machine learning only within a lab environment. I will address the particular software engineering challenges that we’ve faced in running our algorithms at scale in the cloud. I will also mention some simple design patterns that we’ve fond to be useful across a wide variety of machine-learned systems.
How to estimate the cost of a Maximo migration project with a high level of c...Mariano Zelaya Feijoo
This document discusses how to estimate the cost of a high customization Maximo migration project from an older version to 7.5+. It involves:
1) Analyzing areas like architecture, new functionality, data model changes, reporting and integrations.
2) Assessing customization level using IBM's Customization Detection Tool or comparisons to out-of-box versions.
3) Categorizing customizations into components like UI, reports, APIs.
4) Using tools like SLOC counters to measure custom code lines to estimate effort.
5) Collecting metrics on customizations to input into estimation models like SLIM QSM for time/cost figures.
1) Learn about Myplanet's Headless CMS solution using Gatsby Preview and Contentful’s UI Extensions (https://www.contentful.com/resources/serverless/)
2) their Serverless project with IBM - using Apache OpenWhisk (https://www.ibm.com/cloud/functions)
3) how Myplanet got involved with AWS DeepRacer - a fun way to get started with Reinforcement Learning (RL), and their racing experience at re:Invent DeepRacer League (https://reinvent.awsevents.com/learn/deepracer/)
4) their Machine Learning (ML) research related to finding DeepRacer’s ideal line (https://medium.com/myplanet-musings/the-best-path-a-deepracer-can-learn-2a468a3f6d64).
BONUS: Two TED Talks referenced in the intro
5) When ideas have sex | Matt Ridley | Jul 14, 2010 https://www.ted.com/talks/matt_ridley_when_ideas_have_sex
6) Why The Best Leaders Make Love The Top Priority | Matt Tenney | Dec 5, 2019 https://www.youtube.com/watch?v=qCVoohdyI6I
VIDEO: https://youtu.be/ZH1xxmBNx5k
The document provides tips for improving the performance of MATLAB code. It discusses using the profiler to identify bottlenecks, preallocating arrays to avoid dynamic resizing overhead, and how the Just-In-Time accelerator can speed up loops and functions by avoiding interpretation. Preallocating arrays is shown to improve the speed of examples by over 3 times, and is beneficial for cases where the final array size may vary. The JIT accelerator most effectively accelerates code using supported data types, array shapes, and language elements within loops and conditionals.
Similar to Open LLMs: Viable for Production or Low-Quality Toy? (20)
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Open LLMs: Viable for Production or Low-Quality Toy?
1. Open Source LLMs:
Viable for Production or
a Low-Quality Toy?
M Waleed Kadous
Chief Scientist, Anyscale
2. What we’ll cover
- Proprietary vs Open LLMs
- Examples of people using Open LLMs in production
- Why people use Open LLMs (with supporting experiments)
- Cost
- Deployment Flexibility
- Fine-tuning options
- Where Open LLMs are lagging
- Quality
- Instruction following
- Missing features
- Function Templates
- Big context windows
3. Summary
Open Models are viable in production – people are using them already
It is often possible to get close to commercial LLM quality
Small fine-tuned models outperform giant general models (sometimes)
It is often radically cheaper (e.g. 30x)
Usually takes a bit of extra work e.g. prompt tuning, post-processing
OS Models still missing key features (but being worked on)
4. Being used already!
endpoints.anyscale.com – right now, use an open LLM in 2 minutes
4 models:
- Llama 2 7B, 13B, 70B
- Code Llama 34B Instruct
$0.15 per million tokens to $1 per million tokens
Some quotes from our customers
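Anyscale Endpoints exposes an OpenAI-compatible API, which is what makes the "2 minutes" claim realistic: an app switches between proprietary and open models by changing the base URL and model id. A minimal sketch of the request body (the URL and model id below are assumptions; check the Endpoints docs for current values):

```python
# Sketch of an OpenAI-style chat-completions request for an open model.
# The base URL and model id are illustrative assumptions, not verified
# values; consult the Anyscale Endpoints documentation.
BASE_URL = "https://api.endpoints.anyscale.com/v1"  # assumed endpoint

def build_chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body for an OpenAI-style chat-completions call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
    }

payload = build_chat_request("meta-llama/Llama-2-70b-chat-hf",
                             "Summarize RAG in one sentence.")
```

Because the payload shape matches the proprietary APIs, the same application code can A/B test open and closed models behind a config flag.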
5. Some quotes from our customers
Merlin
“We use Anyscale Endpoints to power
consumer-facing services that have
reach to millions of users … Anyscale
Endpoints gives us 5x-8x cost
advantages over alternatives, making
it easy for us to make Merlin even more
powerful while staying affordable for
millions of users.”
Realchar.ai
“Realchar.ai is about delivering
immersive, realistic experiences for our
users, not fighting infrastructure or
upgrading open source models.
Endpoints made it possible for us to
introduce new services in hours, instead
of weeks, and for a fraction of the cost of
proprietary services. It also enables us
to seamlessly personalize user
experiences at scale.”
7. Endless possibilities for AI innovation.
Anyscale AI Platform: AI app serving & routing, model training & continuous tuning, Python-native Workspaces, GPU/CPU optimizations, multi-cloud auto-scaling
Anyscale Endpoints: LLMs served via API, LLMs fine-tuned via API
Anyscale Private Endpoints: serve your LLMs from your Cloud, fine-tune & customize in your Cloud
Ray Open Source: Ray AI Libraries, Ray Core
8. Your options for LLMs
Proprietary
OpenAI, Anthropic, Cohere
Managed Open Source
Anyscale Endpoints, Hugging Face, etc
Self Hosted
Run and maintain your own Open Source models
- Won’t dive into today, more details: walee.dk/selfhost
- TL;DR: Doable but harder than it looks (and maybe more expensive)
- Aviary: easy serving of LLMs using Ray Serve.
9. The Most Popular “Open” Models
Llama 2 (99% open)
Released in July
3 sizes: 7B, 13B, 70B
Permissive licence
- Can be used commercially
- Can’t be used to train other models
Code Llama (99% open)
Released in August
Specifically for generating code
3 sizes: 7B, 13B, 34B
3 “tunes”: Base, Python and Instruct
Falcon (90% open)
In June, released 7B, 40B
In September, released 180B model
Need a license for managed hosting
Very Dynamic Space
No LLM has been “most popular”
> 2 months
Keep an eye on this!
11. Comparing quality: Factuality eval
Summary ranking task established in the literature:
"insiders say the row brought simmering tensions
between the starkly contrasting pair -- both
rivals for miliband's ear -- to a head."
A: insiders say the row brought tensions between the
contrasting pair.
B: insiders say the row brought simmering tensions
between miliband's ear.
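The eval above reduces to a pairwise judgment prompt sent to a judge model. A sketch of how such a prompt might be assembled (the wording is illustrative, not the exact prompt used in the talk):

```python
def build_factuality_prompt(source: str, summary_a: str, summary_b: str) -> str:
    """Assemble a pairwise factual-consistency question for an LLM judge."""
    return (
        "Which summary is more factually consistent with the sentence?\n"
        f"Sentence: {source}\n"
        f"A: {summary_a}\n"
        f"B: {summary_b}\n"
        "Answer with a single letter: A or B."
    )

prompt = build_factuality_prompt(
    "insiders say the row brought simmering tensions between the starkly "
    "contrasting pair -- both rivals for miliband's ear -- to a head.",
    "insiders say the row brought tensions between the contrasting pair.",
    "insiders say the row brought simmering tensions between miliband's ear.",
)
```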
13. GPT-4 is Expensive – 30x Llama 2 70b for similar performance
Comparing Cost: Summarization
30x!
14. Can mean the difference between a product being viable or not
Ray Assistant numbers (approx):
2000 tokens in, 500 tokens out, 1,000 questions/day
GPT-4: 10c per question, ~$35,000/year (VP approval?)
Llama 2 70b: 0.25c per question, ~$900/year (Credit card?)
30x is radically cheaper
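The arithmetic behind these figures can be reproduced directly. The per-token prices below are assumptions (GPT-4 list prices at the time of the talk, and $1 per million tokens for Llama 2 70b, as on the Endpoints pricing slide):

```python
# Back-of-envelope cost comparison with the slide's workload:
# 2,000 input tokens, 500 output tokens, 1,000 questions/day.
# Assumed prices: GPT-4 at $0.03/1K input and $0.06/1K output tokens;
# Llama 2 70b at $1 per million tokens.
TOKENS_IN, TOKENS_OUT, QUESTIONS_PER_DAY = 2000, 500, 1000

def gpt4_cost_per_question() -> float:
    """Roughly 9-10 cents per question under the assumed prices."""
    return TOKENS_IN / 1000 * 0.03 + TOKENS_OUT / 1000 * 0.06

def llama70b_cost_per_question() -> float:
    """0.25 cents per question at $1/M tokens."""
    return (TOKENS_IN + TOKENS_OUT) / 1_000_000 * 1.00

def yearly_cost(per_question: float) -> float:
    return per_question * QUESTIONS_PER_DAY * 365
```

At these rates the yearly bill works out to roughly $33k for GPT-4 versus about $900 for Llama 2 70b, which is where the ~30x figure comes from.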
15. A small fine-tuned open source model
can outperform the best available general model
in some cases
The Power of Fine-tuning
16. Natural Language to SQL
Accuracy:
              Llama-2-7B   GPT-4 (~1.4T?)
general            3%           78%
fine-tuned        86%            -
A fine-tuned Llama-2-7B (86%) beats general GPT-4 (78%); the general 7B model manages only 3%.
19. Retrieval Augmented Generation
Vector DB does a lot of the heavy lifting
LLM mostly just has to synthesize the context
A much easier problem
OS LLMs like Llama 2 70b work well – we don't see as big a difference vs GPT-4
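The division of labor described above can be sketched end to end. The retriever here is a toy word-overlap ranker standing in for a real vector DB, and the LLM only sees the final synthesized prompt:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query.
    A real pipeline would use embeddings in a vector DB instead."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """The LLM only has to synthesize an answer from retrieved context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"
```

Because the hard part (finding the right chunks) happens before the model is called, a smaller open model is often enough for the final synthesis step.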
21. Open model challenges
- Quality
- Instruction following
- Function Templates
- Large Context Windows
No. The right tool for the right job.
22. High End Proprietary APIs (esp GPT-4 and Claude 2)
are the best quality:
- Better logical & analogical reasoning
- Better “general knowledge”
- More refined answers
Open LLMs are “good enough” for (blog post forthcoming)
- Summarization
- Generation stage of RAG
Quality
23. Hybrids make a lot of sense
For evaluations, we still use GPT-4:
“Is answer A better or answer B better?”
We still send ~5% of queries to GPT-4 for Ray Assistant
(costs 150% more: $900 → $2250)
We still use proprietary LLMs
24. Proprietary LLMs are better at following instructions. OS LLMs
tend not to be as refined.
Hypothesis: OS LLMs don’t have the same level of human
review and RLHF.
Instruction following
25. What we asked for: Please give an A or a B.
What we got from GPT-4:
A
What we got from Llama 2 70b:
‘The correct answer is A: those who receive centrelink payments made up half of radio rental's
income last year. Explanation: Summary A accurately summarizes the article sentence by
mentioning that those who receive centrelink payments made up half of radio rental's income
last year. It maintains the same meaning and information as the original sentence. On the other
hand, Summary B is inconsistent with the article sentence. It suggests that the ABC's report
only mentioned that those who receive centrelink payments made up radio rental's income last
year, which is not entirely accurate. The article sentence explicitly states that the ABC reported
that those who receive centrelink payments made up half of radio rental's income last year.
Therefore, Summary A is the better choice’.
Example of instruction following
26. Easily fixed by another LLM
System prompt
You are a helpful assistant that carefully follows instruction. You
provide only answers, no explanations.
User prompt
Determine if the following text says whether the answer is A, B or other.
Only output a single word, either: A B or other
Text: {query}
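The cleanup step above can also start with a cheap heuristic before spending a second LLM call. The regex below is an illustration of that idea, not the talk's implementation:

```python
import re

def extract_choice(text: str) -> str:
    """Extract 'A', 'B', or 'other' from a verbose model response.
    Cheap first pass; a second LLM call (as in the slide's prompt)
    can handle responses this heuristic misses."""
    match = re.search(r"\banswer is ([AB])\b", text, re.IGNORECASE)
    if match:
        return match.group(1).upper()
    stripped = text.strip()
    if stripped in ("A", "B"):
        return stripped
    return "other"
```

On the Llama 2 70b response shown above, the heuristic already recovers "A" without a second model call; the LLM fallback is only needed for the long tail.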
27. Function Templates
Convert the text below into one that calls a Python function.
The function is find_flights(departure_city, arrival_city,time, date,
class)
Convert to the appropriate city code using another function
city_code(str) that returns the city code for a given city.
“Hi. I'd like to book a flight to SF from Boston on Wednesday 20
September in the evening. Business class.”
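The target of this exercise is structured output: the model fills in the arguments of the declared function, resolving cities via city_code. A sketch of the expected result (the lookup table is an illustrative stand-in, and the field values are one plausible reading of the request):

```python
# What a successful conversion of the slide's booking request should
# produce: a structured call to find_flights with resolved city codes.
def city_code(city: str) -> str:
    """Stand-in for the slide's city_code helper; a real version
    would query an airport database."""
    return {"San Francisco": "SFO", "Boston": "BOS"}.get(city, "UNK")

expected_call = {
    "function": "find_flights",
    "arguments": {
        "departure_city": city_code("Boston"),
        "arrival_city": city_code("San Francisco"),
        "time": "evening",
        "date": "20 September",
        "class": "business",
    },
}
```

Getting an open model to emit exactly this shape, rather than prose about it, is what the function-template tooling mentioned later (Guidance, JSONFormer, LMQL) aims to guarantee.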
31. Large context windows
Bigger context windows are useful for retrieval augmented generation
From Ray Assistant Blog:
Increasing our number of chunks improves our retrieval and quality scores. We
had to stop testing at 7 chunks since Llama-2-70b's maximum context length is
4096 tokens. This is a compelling reason to invest in extending context size.
32. Current status
Anthropic: 100K context window
GPT-4: 32K context window (8K by default)
Llama 2: 4K context window
CodeLlama: 16K context window
OSS
- Actively being worked on (eg RoPE)
- Larger context windows also need more GPU resources
- GPT-4 charges 2x for 32K context (vs 8K)
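A quick budget calculation shows why window size caps retrieval. The chunk and reservation sizes below are illustrative assumptions, chosen to match the 7-chunk limit the Ray Assistant tests hit with Llama 2's 4K window:

```python
def max_chunks(context_tokens: int, chunk_tokens: int, reserved: int) -> int:
    """How many retrieved chunks fit in the window after reserving room
    for the question and the generated answer."""
    return (context_tokens - reserved) // chunk_tokens

# Assumed sizes: 500-token chunks, 500 tokens reserved for the question
# and answer. Llama 2's 4K window then fits 7 chunks; GPT-4's 32K
# window fits 64.
llama2_chunks = max_chunks(4096, 500, 500)
gpt4_32k_chunks = max_chunks(32768, 500, 500)
```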
33. Status of Open LLM Weaknesses
Quality
- Larger and larger open models (180B now largest)
- Will likely be a moving target (eg Google’s Gemini)
Instruction following
- RLHF is pretty expensive and hard to do – may have to live with this
Expanded context window is actively being developed
- RoPE, YaRN, Hyena
Function templates being actively worked on
- Guidance, JSONFormer, LMQL
34. Best place to run Open LLMs?
endpoints.anyscale.com – right now, use an Open LLM in 2 minutes
4 models:
- Llama 2 7B, Llama 2 13B, Llama 2 70B
- Code Llama 34B Instruct
$0.15 per million tokens to $1 per million tokens
Fine tuning in Preview – super easy
35. One more thing …
$50 credit for Anyscale Endpoints if you sign up today
36. Summary
Open Models are viable in production – people are using them already
It is often possible to get close to proprietary LLM quality
Small fine-tuned models outperform giant general models (sometimes)
Use RAG for factual information
Open models are often radically cheaper (e.g. 30x)
Usually takes a bit of extra work e.g. prompt tuning, post-processing
Open Models still missing key features (but being worked on)