local launch small language model of AI.

•Process assessment, improvement, and planning for system construction; development framework design; implementation management; presales; project start-up; and so on
•Blog: http://blog.processtune.com
•Profile: Tetsuro Takao on Facebook, Twitter or http://mvp.microsoft.com
•Community: operating staff of .NET Lab (.NETラボ) https://dotnetlab.connpass.com/
•Microsoft MVP: Developer Technologies [July 2010 – June 2024]

Phi-2：2.7billion
Foundation model
Microsoft Prometheus（GPT-4）：1trillion
for Bing AI ?
Copilot：billions
Pretrained model
(Bing Chat Copilot：1.7billion)
Microsoft & NVIDIA Megatron-Turing NLG：530billion
Microsoft Turing NLG：17billion
Foundation model
Microsoft Research Data&AI
“Generate solutions from a wide range of options”
≠
“Fast inference calculations” and “Computations with low power consumption”
Little power is needed for computation, and inference is fast: a task-specific AI that generates solutions from a limited range of attributes.
AGI (Artificial General Intelligence)
To get closer to human thinking using AI orchestration.
The purpose is to exceed the limits of AI (as of 2023, it is said to be “Baby AGI”)
MAI (Multimodal Artificial Intelligence)
Services in the area of advertising
creative production：13billion
Cyber Agent Japanese LLM：6.8billion
Llama 2：7/13/70 billion
Foundation model
GPT-4：1.76trillion
Foundation model
GPT-5：17trillion? (GPT3x100)
Foundation model
Google PaLM2：340 billion
Exact numbers are unknown; the figure comes from a leaked internal document
Technical report says
PaLM 2-L（Unicorn）：340billion
PaLM 2-M（Bison）： 147billion
PaLM 2-S（Gecko）：30billion
Google FLAN-UL5：50billion FLAN-UL2：20billion
Google Pathways：540billion
Claude 2：52billion
Foundation model
BloombergGPT：50billion
Finance
AWS Titan Foundation Model：100billion
Amazon Olympus：2trillion
for Alexa ?
Amazon Alexa model：20billion
AWS Titan Foundation Model
IBM Japanese LLM：8billion
IBM Granite：13billion
Foundation model
IBM watsonx Code Assistant for Z：20billion
Supports 115 languages
e.g., gradual conversion of COBOL to Java
Oracle Text Embeddings：355million
Oracle Text Generation：52billion
Oracle Text Summarization： 52billion
Large scale Specialized
Gemini Ultra：540billion Gemini Pro：60billion Gemini Nano-1：1.8billion
Gemini Nano-2：3.25billion
Orca-2：13billion 7billion
Small model trainer model
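
The parameter counts above become more tangible when converted into the memory needed just to hold the weights. The following is a rough sketch, assuming weights dominate the footprint and ignoring activations and caches; the helper name `weight_memory_gb` is hypothetical, and the parameter counts are the ones quoted on this slide.

```python
def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return params * bits_per_param / 8 / 1e9


# Parameter counts as quoted on the slide (not independently verified here).
models = {
    "Phi-2": 2.7e9,
    "Llama 2 (7B)": 7e9,
    "Megatron-Turing NLG": 530e9,
}

for name, params in models.items():
    fp16 = weight_memory_gb(params, 16)  # 16-bit weights
    int4 = weight_memory_gb(params, 4)   # 4-bit quantized weights
    print(f"{name}: {fp16:.1f} GB at 16-bit, {int4:.2f} GB at 4-bit")
```

Under these assumptions a model the size of Phi-2 fits in the memory of an ordinary PC, while a model with hundreds of billions of parameters does not, which is the practical reason for launching small models locally.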

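Apple is investing in AI, for example by acquiring the AI video-compression startup WaveOne and hiring John Giannandrea, Google's former head of search.
Likewise, at Google I/O 2023 Google announced that Magic Eraser, which started in Google Photos, has been upgraded to Magic Editor, and the Pixel 8 already ships with the Tensor G3 chip.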
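Real-time translation of conversations
Email summary generation
Meeting minutes generation
Photo editing and shooting assistance
Identity verification is required
An internet connection is required
Depends on cloud computing capacity
Latency occurs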
https://blog.google/products/photos/google-photos-magic-editor-pixel-io-2023/
Magic Editor in Google Photos
Apple unveils M3, M3 Pro, and M3 Max
https://www.apple.com/newsroom/2023/10/apple-unveils-m3-m3-pro-and-m3-max-the-most-advanced-chips-for-a-personal-computer/
Google Tensor G3
https://blog.google/products/pixel/google-tensor-g3-pixel-8/

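Source: "Training library that gives generative AI 'vision' released by an autonomous-driving EV venture, along with pretrained models of up to 70 billion parameters"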
https://www.itmedia.co.jp/news/articles/2309/07/news175.html
Prompt
Llama 2-chat
（Meta）
ELYZA-Llama 2
（ELYZA）
Japanese StableLM
（Stability AI Japan）
OpenLenda
（Turing）
RAG?
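Video
Traffic-signal recognition model
Task-specific model
Describes road conditions in language
Operates devices
Commands given in language
Autonomous driving
Key point: for an enterprise, useful multimodal AI means turning AI that is unique to the organization's business into intellectual property through the collaboration of small, task-specific models.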

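AI quickly solves things that are not really worth implementing with AI (knowledge that can be found by searching the internet, automation of work anyone can do), so it is used for headcount reduction. In other words, in the end it only increases the number of people who are weeded out, and whether that efficiency gain is meaningful is under debate.
Generate answers from materials that match the company's "ism", compliance rules, and the like, together with current paradigms and buzzwords (RAG: Retrieval-Augmented Generation).
A supporting role in drawing up strategy for organizations that need to lead their era, such as NASA, the Japan Meteorological Agency, the Ivy League, and TOYOTA.
Systematize the knowledge a company has accumulated over many years and create schemes that derive information and answers from data, so that the same (or greater) value can be delivered even when the people using the data and schemes change.
Conversational robots, image generation, and similar tools offer things one would not have thought of alone, so productivity gains from a broader perspective can be expected (≒ warning: possibilities beyond human understanding).
Use models that AI providers have pretrained with enormous amounts of electricity (combining them as needed) to manage the organization's own knowledge (support for corporate strategy).
Large scale / Specialized
Confidential
Internal and external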
OpenAI CEO Sam Altman, US House Speaker Mike Johnson discuss AI's risk
https://www.reuters.com/technology/openai-ceo-sam-altman-us-house-speaker-mike-johnson-discuss-ais-risk-2024-01-11/
Hippocratic AI raises $50 million seed funding to build models for healthcare
https://www.reuters.com/business/healthcare-pharmaceuticals/hippocratic-health-raises-50-mln-seed-funding-build-ai-model-2023-05-16/
Introducing BloombergGPT, Bloomberg’s 50-billion parameter large language model, purpose-built from scratch for finance
(It can assist the work of financial analysts and the writing of financial news.)
https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/
Small But Mighty: Small Language Models Breakthroughs in the Era of Dominant Large Language Models
https://www.unite.ai/small-but-mighty-small-language-models-breakthroughs-in-the-era-of-dominant-large-language-models/
The needs of most companies are on this side.

Source: [Oracle Cloud Webinar] Oracle AI Infrastructure: delivering outstanding cost performance for generative AI such as LLMs (large language models)
https://speakerdeck.com/oracle4engineer/ocwc_20231004_generativeai?slide=7

Figure source: https://learn.microsoft.com/ja-jp/training/modules/introduction-end-analytics-use-microsoft-fabric/2-explore-analytics-fabric
Governance and compliance in Microsoft Fabric
https://learn.microsoft.com/ja-jp/fabric/governance/governance-compliance-overview
Delta Parquet format
AI model creation and management
Custom Copilot
Purview Data Loss Prevention
Purview Information Protection
Sensitivity labels
DLP policies
Attach
Policy integration
Copilot for Microsoft 365
Copilot for Fabric


https://learn.microsoft.com/ja-jp/microsoft-365/syntex/automate-document-generation

Information source is Microsoft 365 data
YES Microsoft 365 account access
Customize for using company’s data
YES No code / Low code
YES Azure AI Studio
NO Semantic Kernel programming
NO Copilot Studio, Microsoft Syntex
NO Identity Federation is complete
YES with Entra ID
Entra ID controls access
Azure storages
Azure AI Studio
Semantic Kernel programming
with non-Entra ID
No code / Low code
YES Any tools of identity provider (if possible)
NO programming (LangChain, Semantic Kernel)
Connector is existing
Azure AI Studio
NO Several AI schemas of data source in individual access permission
Multimodal AI orchestration programming
Legend: trade-off against effort / recommended
Step 1: Will you use Microsoft 365 data?
Step 2: Is customization required?
What is the development method?

https://learn.microsoft.com/ja-jp/microsoft-365/community/microsoft365-maturity-model--governance-and-compliance

AI service
Orchestration
Models
(Vector Embeddings,
NLP※1)
Vector Memory
Storage
Persistent Layer
Microsoft Copilot（AI orchestration）
（Microsoft 365 Copilot, …※2）
Copilot Studio
Copilot
(ex. GitHub X is Codex + GPT-4)
Copilot
Microsoft Azure tenant storage
（SharePoint, GitHub, OneLake）
Azure OpenAI Service
Azure AI Studio
OpenAI
(GPT-3.5, GPT-4)
Azure AI Search
JSONL file / Azure BLOB
Programming area
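When introducing AI, design the data layer, the AI services, the vector embedding capability, and the accounts that can access these resources.
※1: NLP (Natural Language Processing) quantifies characters and text through vector embeddings and performs sentiment analysis, machine translation, text classification, and so on. Through training, it becomes capable of common sense, language understanding, and logical reasoning.
※2: Windows Copilot, GitHub Copilot, Security Copilot, Bing Chat Copilot, Power Platform Copilot, Dynamics 365 Copilot, Microsoft Syntex, Copilot for Azure, Fabric Copilot (Copilot for Data Science and Data Engineering, Copilot for Data Factory, Copilot for Power BI)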
Custom Web UI
Semantic Kernel
Phi-2
（& SLM container）
Cosmos DB
Ollama (described later)
MongoDB

[Diagram: Model; Vector Search (Milvus vector DB, Docker on Windows); Gremlin vector schema (MongoDB, Docker on Windows); additional data; pipeline; Ollama API call; RAG; embeddings area]
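
The RAG path in this diagram (embed the additional data, search it by vector similarity, then pass the hits to the model) can be sketched in a few lines. This is a minimal illustration, not the deck's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for Milvus or MongoDB, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query (the vector search step)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the question with retrieved context before calling the model."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical in-house documents standing in for the "additional data".
docs = [
    "Expense reports are due on the fifth business day of each month.",
    "The VPN client must be updated before connecting from home.",
]
print(build_prompt("When are expense reports due?", docs))
```

In a real pipeline the vectors would come from an embedding model and be stored in a vector database; only the order of the steps is the point here.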

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
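
Once the container is running, the port published above (11434) serves Ollama's REST API. The sketch below assumes the server is reachable at `localhost` and that the `phi` model has already been pulled; the helper names `build_request` and `generate` are hypothetical, and only the Python standard library is used.

```python
import json
import urllib.request

# Port published by the docker run command above; the host is an assumption.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "phi") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "phi") -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Explain RAG in one sentence.")` returns the model's reply as a string; it fails with a connection error if the container is not running.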

[Diagram: Model (Phi-2); localllm (Cloud Workstations, Google Cloud project); Ollama (Docker, Ubuntu, WSL2, Windows); Vector Search (Milvus vector DB, Docker on Windows); Gremlin vector schema (MongoDB, Docker on Windows); additional data; web app (OllamaSharp, Graph API, Semantic Kernel, Kestrel on Windows); pipeline; Ollama API call; RAG; embeddings area]

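docker exec -it ollama ollama run phi
On a PC with an NVIDIA GPU, GPU access is granted when the container is created (this requires the NVIDIA Container Toolkit):
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama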

docker exec -it ollama ollama pull llama2
docker exec -it ollama ollama run llama2

https://github.com/open-webui/open-webui

[Diagram: Model (Phi-2); localllm (Cloud Workstations, Google Cloud project); Ollama (Docker, Ubuntu, WSL2, Windows); Vector Search (Milvus vector DB, Docker on Windows); Gremlin vector schema (MongoDB, Docker on Windows); additional data; web app (OllamaSharp, Graph API, Semantic Kernel, Kestrel on Windows); pipeline; Ollama API call; Office apps, Microsoft 365, Entra ID, Copilot for XX, other resources; RAG; embeddings area]


https://github.com/awaescher/OllamaSharp


docker pull milvusdb/milvus
Ollama features
GPU Acceleration
Effortless Model Management
Automatic Memory Management
Support for a Wide Range of Models
Effortless Setup and Seamless Switching
Accessible Web User Interface (WebUI) Options

Model
Base website
Skills.Web BingConnector -> e.g., the organization's technical blog
Copilot for Microsoft 365
Plugins (formerly skills)
Custom database

Next time, I will explain the development part.

OpenAI Brand guidelines
https://openai.com/brand
Google Brand Resource Center: Logos list
https://about.google/brand-resource-center/logos-list/
PaLM 2 Technical Report
https://ai.google/static/documents/palm2techreport.pdf
MongoDB Brand Resources
https://www.mongodb.com/brand-resources
Gemini Cheat Sheet: Google’s State-of-the-Art Multimodal Assistant Explained
https://gradientflow.com/gemini-cheat-sheet-googles-state-of-the-art-multimodal-assistant-explained/
Microsoft Research Data&AI
https://www.microsoft.com/en-us/research/group/dataai/
Microsoft Copilot Studio
https://www.microsoft.com/en-us/microsoft-copilot/microsoft-copilot-studio
Azure AI Studio
https://azure.microsoft.com/ja-jp/products/ai-studio
Quickstart: Chat with Azure OpenAI models using your own data
https://learn.microsoft.com/ja-jp/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython&pivots=programming-language-studio
THE BALANCING ACT OF TRAINING GENERATIVE AI
https://www.nextplatform.com/2023/07/17/the-balancing-act-of-training-generative-ai/
[Oracle Cloud Webinar] Oracle AI Infrastructure: delivering outstanding cost performance for generative AI such as LLMs (large language models)
https://speakerdeck.com/oracle4engineer/ocwc_20231004_generativeai?slide=7
Pretrained Foundational Models in Generative AI
https://docs.oracle.com/en-us/iaas/Content/generative-ai/pretrained-models.htm
VizSeek: AI-based visual search platform deployment on Oracle Cloud
https://docs.oracle.com/en/solutions/vizseek-on-oci/index.html#GUID-8F7CCB28-AAC9-4317-AD90-39246E19D29A

Oracle’s generative AI strategy
https://blogs.oracle.com/ai-and-datascience/post/generative-ai-strategy
Azure AI Search
https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search
Chroma
https://docs.trychroma.com/
Pinecone (C#)
Postgres (C#)
Qdrant (C#)
Redis (C#)
SQLite (C#)
Weaviate (C#) and for Python
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
https://arxiv.org/abs/2201.11903
Orca-2: Teaching Small Language Models How to Reason
https://www.microsoft.com/en-us/research/publication/orca-2-teaching-small-language-models-how-to-reason/
TensorFlow Hub
https://www.tensorflow.org/hub?hl=en
MODEL ZOO
https://pytorch.org/serve/model_zoo.html

How to Deploy Computer Vision Models Offline
https://blog.roboflow.com/deploy-computer-vision-models-offline/
Use metadata to find content in document libraries in Microsoft Syntex
https://learn.microsoft.com/en-us/microsoft-365/syntex/metadata-search
Key concepts - Use Power Automate connectors in Microsoft Copilot Studio (Preview)
https://learn.microsoft.com/en-us/microsoft-copilot-studio/advanced-connectors
Manage your multi-cloud identity infrastructure with Microsoft Entra
https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/manage-your-multi-cloud-identity-infrastructure-with-microsoft/ba-p/3709677
Customize a model with fine-tuning
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython&pivots=programming-language-studio
Microsoft Copilot for Microsoft 365 overview
https://learn.microsoft.com/en-us/microsoft-365-copilot/microsoft-365-copilot-overview
Introducing BloombergGPT, Bloomberg’s 50-billion parameter large language model, purpose-built from scratch for finance
https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/
GPTQ: ACCURATE POST-TRAINING QUANTIZATION FOR GENERATIVE PRE-TRAINED TRANSFORMERS
https://arxiv.org/pdf/2210.17323.pdf
Hugging Face
https://huggingface.co/
Introducing Atlas Vector Search: Build Intelligent Applications with Semantic Search and AI Over Any Type of Data
https://www.mongodb.com/blog/post/introducing-atlas-vector-search-build-intelligent-applications-semantic-search-ai
Microsoft.SemanticKernel.Connectors.MongoDB
https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/Connectors/Connectors.Memory.MongoDB

No GPU required: develop generative AI apps on a local CPU using localllm
https://cloud.google.com/blog/ja/products/application-development/new-localllm-lets-you-develop-gen-ai-apps-locally-without-gpus?hl=ja
Docker hub: ollama/quantize
https://hub.docker.com/r/ollama/quantize
GitHub: Ollama WebUI
Open WebUI (Formerly Ollama WebUI)
Introducing Gemini: our largest and most capable AI model
https://blog.google/technology/ai/google-gemini-ai/#sundar-note
