Deep Learning Text NLP and Spark Collaboration (한글 딥러닝 Text NLP & Spark), by hoondong kim
These slides explain deep learning text NLP for the Korean language, and also discuss scaling the deep learning approach to big-data-scale data using Spark.
Vectorized Processing in a Nutshell. (in Korean)
Presented by Hyoungjun Kim, Gruter CTO and Apache Tajo committer, at DeView 2014, Sep. 30, Seoul, Korea.
[223] HBase consistent secondary indexing (NAVER D2)
The document discusses HBase and consistent secondary indexing. It provides an example of an HBase table with movie star data indexed by rowkey. It notes that without secondary indexes, full scans would be needed to query the data by fields other than the rowkey. It also mentions that HIM can be used to implement a secondary indexing system for HBase.
[231] The simplicity of cluster apps with Circuit (NAVER D2)
This document discusses Circuit, a lightweight cluster operating system. It provides a real-time API to view and control hosts, processes, and containers. The API allows traversal and manipulation of the cluster as a unified namespace. The document outlines the API, including command line usage and a Go client package. It then describes how to build a job scheduler service using the Circuit API, including designing the state, handling events, and running jobs on hosts. The vision is for Circuit to enable easy sharing of systems and for any program to take on different roles by executing as a recursive process tree on the cluster.
S2Graph is a large-scale graph database built on HBase that provides storage and graph APIs to enable real-time breadth-first search on large, constantly changing social graphs. It models social network data, such as users, messages, posts, and the relationships between them, as vertices and edges in a graph. This allows querying the graph through steps that traverse edges between vertices to retrieve related data in real time with low latency. Examples demonstrated include messaging, newsfeed, recommendation, and search applications.
This document provides information about Olivier Duchenne and his experience and qualifications. It summarizes his educational background which includes a Ph.D in Computer Science from ENS Paris/INRIA and a postdoctoral fellowship at Carnegie Mellon University. It also lists his professional experience which includes positions at NEC Labs, Intel, and as a co-founder of Solidware. The document then provides guidelines for machine learning and discusses challenges such as having enough and changing data. It explores the history and reasons for increased use of machine learning in computer vision.
The document discusses Apache NiFi, an open source software project that provides a dataflow solution for managing enterprise data movement and integration. It describes challenges with traditional messaging systems for enterprise dataflow and introduces Apache NiFi as an alternative. NiFi is based on Flow-Based Programming and allows users to visually create dataflows that can transform, route, and process data in real-time. The document includes a demonstration of NiFi and discusses its architecture, features, and future proposals.
This describes indoor place recognition using WiFi. Indoor position can be estimated with a fingerprinting technique based on WiFi signal strength. On Android, a WiFi scan collects AP information and signal strengths, and a similarity-measurement algorithm matches them against the most similar stored fingerprint to recognize the place. However, in actual deployment environments…
[233] Level 2 network programming using PacketNgin RTOS (NAVER D2)
The document discusses level 2 network programming using PacketNgin RTOS. It begins with introductions and definitions of basic network concepts like local area networks, switches, routers, Ethernet, and the Address Resolution Protocol. It then covers wide area networks, IP routing, and the Internet Control Message Protocol. Transmission Control Protocol and congestion control are also explained. Level 2 network applications that can be built with PacketNgin like load balancing, IPsec, protocol conversion, and IoT gateways are presented. The document concludes with a summary of host versus network node programming and examples of level 2 network applications.
This document summarizes a presentation about Netflix's big data platform and Spark. The key points are:
1. Netflix uses Apache Spark on YARN and Mesos clusters to process batch and streaming data from sources like Cassandra and Kafka.
2. Netflix has contributed improvements to Spark's dynamic resource allocation, predicate pushdown, and support for S3 filesystems.
3. A use case showed Spark outperforming Pig for an iterative job that duplicated and aggregated data in multiple steps.
[212] Large scale backend service development (NAVER D2)
The document discusses developing large scale backend services for a new game using Node.js, Docker, and AWS. It describes problems with scalability and performance, and solutions using Node.js clustering, reverse proxying, and CPU profiling to optimize services. The goals are to build services that are scalable, have high performance, and allow for fast iterations.
Presto generates Java bytecode at runtime to optimize query execution. Key query operations like filtering, projections, joins and aggregations are compiled into efficient Java methods using libraries like ASM and Fastutil. This bytecode generation improves performance by 30% through techniques like compiling row hashing for join lookups directly into machine instructions.
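Presto's actual ASM-based JVM bytecode generation is beyond a short sketch, but the underlying idea, specializing a generic operation into compiled code at runtime and reusing it across rows, can be illustrated with a toy Python analogue (all names here are hypothetical):

    # Toy analogue of runtime code generation: compile a filter predicate
    # into a reusable function instead of re-interpreting it per row.
    def compile_filter(predicate_src):
        # compile() turns the source string into a code object once;
        # eval() then materializes it as a callable.
        return eval(compile(f"lambda row: {predicate_src}", "<generated>", "eval"))

    rows = [{"price": 5}, {"price": 42}, {"price": 17}]
    keep = compile_filter("row['price'] > 10")   # specialized at runtime
    print([r for r in rows if keep(r)])          # [{'price': 42}, {'price': 17}]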
The document discusses the Ethereum platform and its capabilities. It describes Ethereum as a planetary-scale virtual machine that allows anyone to program and trust computations. Contracts written in Solidity can be deployed and executed on the Ethereum Virtual Machine in a trustless manner. It also discusses how the Ethereum network uses gas paid in ether to prevent denial of service attacks and power computations.
[251] Implementing deep learning using cuDNN (NAVER D2)
This document provides an overview of deep learning and implementation on GPU using cuDNN. It begins with a brief history of neural networks and an introduction to common deep learning models like convolutional neural networks. It then discusses implementing deep learning models using cuDNN, including initialization, forward and backward passes for layers like convolution, pooling and fully connected. It covers optimization issues like initialization and speeding up training. Finally, it introduces VUNO-Net, the company's deep learning framework, and discusses its performance, applications and visualization.
This document summarizes lessons learned from developing the Realm Android library. It discusses challenges such as setting up an Android library project, API design, testing, distribution methods, and issues like annotation processing, bytecode weaving, and native code support. Key points covered are how to start a library project, the importance of testing libraries extensively, and distribution options like Bintray.
The document discusses various machine learning clustering algorithms like K-means clustering, DBSCAN, and EM clustering. It also discusses neural network architectures like LSTM, bi-LSTM, and convolutional neural networks. Finally, it presents results from evaluating different chatbot models on various metrics like validation score.
The document discusses challenges with using reinforcement learning for robotics. While simulations allow fast training of agents, there is often a "reality gap" when transferring learning to real robots. Other approaches like imitation learning and self-supervised learning can be safer alternatives that don't require trial-and-error. To better apply reinforcement learning, robots may need model-based approaches that learn forward models of the world, as well as techniques like active localization that allow robots to gather targeted information through interactive perception. Closing the reality gap will require finding ways to better match simulations to reality or allow robots to learn from real-world experiences.
[243] Deep Learning to help student’s Deep Learning (NAVER D2)
This document describes research on using deep learning to predict student performance in massive open online courses (MOOCs). It introduces GritNet, a model that takes raw student activity data as input and predicts outcomes like course graduation without feature engineering. GritNet outperforms baselines by more than 5% in predicting graduation. The document also describes how GritNet can be adapted in an unsupervised way to new courses using pseudo-labels, improving predictions in the first few weeks. Overall, GritNet is presented as the state-of-the-art for student prediction and can be transferred across courses without labels.
[234] Fast & Accurate Data Annotation Pipeline for AI applications (NAVER D2)
This document provides a summary of new datasets and papers related to computer vision tasks including object detection, image matting, person pose estimation, pedestrian detection, and person instance segmentation. A total of 8 papers and their associated datasets are listed with brief descriptions of the core contributions or techniques developed in each.
[226] NAVER ads deep click prediction: from modeling to serving (NAVER D2)
This document presents a formula for calculating the loss function J(θ) in machine learning models. The formula averages the negative log likelihood of the predicted probabilities being correct over all samples S, and includes a regularization term λ that penalizes predicted embeddings being dissimilar from actual embeddings. It also defines the cosine similarity term used in the regularization.
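Reading that description back into symbols gives something like the following (a hedged reconstruction; the exact form and notation in the talk may differ):

    J(\theta) = \frac{1}{|S|} \sum_{s \in S} \Bigl[ -\log p_\theta(y_s \mid x_s)
              + \lambda \bigl(1 - \cos(\hat{e}_s, e_s)\bigr) \Bigr],
    \qquad
    \cos(\hat{e}_s, e_s) = \frac{\hat{e}_s \cdot e_s}{\lVert \hat{e}_s \rVert \, \lVert e_s \rVert}

where \hat{e}_s is the predicted embedding and e_s the actual embedding for sample s.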
[214] AI Serving Platform: the struggle to handle hundreds of millions of inferences a day (NAVER D2)
The document discusses running a TensorFlow Serving (TFS) container using Docker. It shows commands to:
1. Pull the TFS Docker image from a repository
2. Define a script to configure and run the TFS container, specifying the model path, name, and port mapping
3. Run the script to start the TFS container exposing port 13377
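A minimal Python sketch of such a script, assuming the public tensorflow/serving image and its standard conventions (REST API on container port 8501); the model path and model name are illustrative, and only the 13377 host port comes from the document:

    import subprocess

    MODEL_PATH = "/models/my_model"   # hypothetical host path to the SavedModel
    MODEL_NAME = "my_model"           # hypothetical model name

    # 1. Pull the TFS image.
    subprocess.run(["docker", "pull", "tensorflow/serving"], check=True)

    # 2-3. Configure and run the container, exposing the REST port on 13377.
    subprocess.run([
        "docker", "run", "-d",
        "-p", "13377:8501",                          # host:container port mapping
        "-v", f"{MODEL_PATH}:/models/{MODEL_NAME}",  # mount the model directory
        "-e", f"MODEL_NAME={MODEL_NAME}",            # tell TFS which model to serve
        "tensorflow/serving",
    ], check=True)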
The document discusses linear algebra concepts including:
- Representing a system of linear equations as a matrix equation Ax = b where A is a coefficient matrix, x is a vector of unknowns, and b is a vector of constants.
- Solving for the vector x that satisfies the matrix equation using linear algebra techniques such as row reduction.
- Examples of matrix equations and their component vectors are shown.
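For instance, a small illustrative system (not taken from the slides):

    \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix}
    \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
    =
    \begin{pmatrix} 5 \\ 13 \end{pmatrix}

Row reduction (R_2 \leftarrow R_2 - 3R_1) yields -x_2 = -2, so x_2 = 2 and, back-substituting, x_1 = 5 - 2 \cdot 2 = 1.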
This document describes the steps to convert a TensorFlow model to a TensorRT engine for inference. It includes steps to parse the model, optimize it, generate a runtime engine, serialize and deserialize the engine, as well as perform inference using the engine. It also provides code snippets for a PReLU plugin implementation in C++.
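A rough Python sketch of that pipeline, assuming TensorRT's legacy UFF workflow (TensorRT 5/6-era APIs); the file names and tensor shapes are illustrative, and the PReLU plugin from the document is omitted:

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Parse the UFF-converted TensorFlow model and build an optimized engine.
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.UffParser() as parser:
        parser.register_input("input", (3, 224, 224))   # illustrative input shape
        parser.register_output("output")                # illustrative output node
        parser.parse("model.uff", network)
        builder.max_batch_size = 1
        engine = builder.build_cuda_engine(network)

    # Serialize the engine so later runs can skip the expensive build step.
    with open("model.engine", "wb") as f:
        f.write(engine.serialize())

    # Deserialize and create an execution context for inference.
    with trt.Runtime(TRT_LOGGER) as runtime, open("model.engine", "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()
        # context.execute(batch_size, bindings) would then run inference
        # given device buffers bound to the input/output tensors.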
The document discusses machine reading comprehension (MRC) techniques for question answering (QA) systems, comparing search-based and natural language processing (NLP)-based approaches. It covers key milestones in the development of extractive QA models using NLP, from early sentence-level models to current state-of-the-art techniques like cross-attention, self-attention, and transfer learning. It notes the speed and scalability benefits of combining search and reading methods for QA.
6. Supercomputing Today
- Clusters of commodity (Intel/AMD x86) HW & IBM Blue Gene
- High-speed, low-latency interconnect (Infiniband)
- Coprocessors (GPU, Xeon Phi)
- Storage area networks (SANs) & local disks for temporary files
Hardware ecosystem
7. High-end Computing & Industry
HPC
ENIAC (1946); first business computer (1951)
CERN communication system (1989)
FTP, Gopher, NNTP (1969)
NCSA Mosaic (1993)
World Wide Web
dot-com bubble (1997~)
8. High-end Computing & Industry
HPC
Active Data Repository (1998)
BigData/Hadoop (2005)
Grid Computing (1998)
Amazon EC2 (2006)
14. Hadoop’s Uncomfortable Fit in HPC
Why?
1. Hadoop is an invader!
2. Running Java applications (Hadoop) on a supercomputer looks funny
3. Hadoop reinvents HPC technologies poorly
4. HDFS is very slow and very obtuse
Source: G. K. Lockwood, HPCwire
15. HPC Ecosystem
- Application level: applications and community codes; FORTRAN, C, C++, and IDEs
- Middleware: domain-specific libraries; MPI/OpenMP/OpenCL/CUDA programming model; batch scheduler (SLURM, SGE)
- System software: Lustre (parallel file system); Linux OS
- Cluster hardware: InfiniBand/Ethernet; SAN/local storage; x86 servers + coprocessors
16. BigData Ecosystem
- Application level: Mahout, R, etc.; Hive, Pig, etc.
- Middleware: MapReduce programming model; HBase (BigTable); ZooKeeper; Spark
- System software: Hadoop File System; Linux OS; virtual machines and cloud
- Cluster hardware: Ethernet; local storage; x86 servers
17. ADR vs MapReduce
1998: Active Data Repository (ADR)
- Data-intensive computing framework
- Processing of scientific datasets
Example: processing remotely sensed data from NOAA TIROS-N with the AVHRR sensor.
AVHRR Level 1 data:
- As the TIROS-N satellite orbits, the Advanced Very High Resolution Radiometer (AVHRR) sensor scans perpendicular to the satellite’s track.
- At regular intervals along a scan line, measurements are gathered to form an instantaneous field of view (IFOV); one scan line is 409 IFOVs.
- Scan lines are aggregated into Level 1 data sets.
A single file of Global Area Coverage (GAC) data represents ~one full Earth orbit, ~110 minutes, ~40 megabytes, and ~15,000 scan lines.
Source: Chaos project, U. Maryland, College Park
18. DataCutter vs Dryad, Spark
2000: DataCutter
- Component framework with workflow support
- Pipelined components (filters)
- Stream-based communication

// A DataCutter filter implements an init/process/finalize lifecycle:
class MyFilter : public Filter_Base {
public:
    int init(int argc, char *argv[]) { ... }    // set up the filter
    int process(stream_t st[]) { ... }          // consume and produce stream data
    int finalize(void) { ... }                  // release resources
};
Source: Chaos project, U. Maryland, College Park
19. HPC & BigData
Common challenges shared by HPC and BigData
1. High-performance interconnect technologies
2. Energy efficient circuit, power, and cooling technologies
3. Power and Failure-aware resilient scalable system software
4. Advanced memory technologies
5. Data management- volume, velocity, and diversity of data
6. Programming models
7. Scale-up? Scale-out?
20. HPC for BigData
InfiniBand/RDMA for Hadoop
- Wide adoption of RDMA in HPC: Message Passing Interface (MPI), parallel file systems (Lustre, GPFS)
- Delivers excellent performance (latency, bandwidth, and CPU utilization)
- Delivers excellent scalability
21. HPC for BigData
InfiniBand/RDMA for Hadoop
- Hadoop uses socket-based Java RPC
- HPC uses high-speed, lossless DMA communication
(Figure: in the socket path, data is copied through a chain of buffers down the Application / Socket / Transport / NIC Driver / NIC stack on both hosts; RDMA bypasses these intermediate copies.)
Source: D. K. Panda, ISCA ’15 tutorial
22. HPC for BigData
Lustre File System for Hadoop
- Intel HPC distribution for Hadoop
- Intermediate map-task output is stored on the Lustre FS
Source: Intel
23. HPC for BigData
RDMA-based In-Memory Merge
(Figure: sorted outputs from Map 1 and Map 2 are shuffled to the reducer in small pieces and merged in memory.)
1. Sorted map output is divided into small pieces based on the shuffle packet size and the size of the key-value pairs.
2. Because only a small portion of data is shuffled at a time, the merge can take place in memory.
Source: D. K. Panda, ISCA ’15 tutorial
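The RDMA transport itself is beyond a short sketch, but the merge idea, combining many already-sorted streams incrementally in memory as pieces arrive, can be illustrated in Python (a hypothetical sketch; the data is illustrative):

    import heapq

    # Each map task emits a sorted stream of (key, value) pairs; in the slides
    # these arrive at the reducer over RDMA in shuffle-packet-sized pieces.
    map1_output = [(1, "a"), (4, "b"), (9, "c")]
    map2_output = [(2, "d"), (4, "e"), (7, "f")]

    # Since every input is sorted, heapq.merge yields globally sorted pairs
    # lazily, so the reducer never needs the full outputs materialized on disk.
    for key, value in heapq.merge(map1_output, map2_output):
        print(key, value)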
24. HPC for BigData
Scale out vs Scale up
(Charts: performance curves under scale-out vs. scale-up configurations.)
Scale-up is more cost-effective (Microsoft, SOCC 2013)
48. Locality-aware Fair Scheduler
In-Memory Hash Key Boundary Management
- Partition hash-key boundaries using the recent data-access distribution
(Figure: hash keys 0-15 arranged on a DHT ring.)
49. Locality-aware Fair Scheduler
In-Memory Hash Key Boundary Management
(Figure: CDF of recent accesses over the hash-key range [0, 140), from which the server boundaries below are derived.)
- server 1: [0, 35)
- server 2: [35, 47)
- server 3: [47, 91)
- server 4: [91, 102)
- server 5: [102, 140)
- new task T1 (HK=43) falls in [35, 47), so it goes to server 2
- new task T2 (HK=69) falls in [47, 91), so it goes to server 3
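The slides don't spell out the scheduler's code; the following hypothetical Python sketch merely illustrates the two steps shown above: derive server boundaries as equal-probability quantiles of the recent access distribution, then route each new task by its hash key (all data here is illustrative):

    import bisect

    # Recent hash-key accesses; their empirical CDF drives the partitioning.
    recent = sorted([3, 12, 20, 33, 40, 44, 52, 60, 75, 88, 95, 100, 110, 120, 135])

    def boundaries(accesses, num_servers):
        """Split the key space so each server sees ~equal access probability."""
        step = len(accesses) / num_servers
        return [accesses[int(i * step)] for i in range(1, num_servers)]

    bounds = boundaries(recent, 5)          # four interior boundaries, five servers

    def server_for(hash_key):
        """Route a task to the server whose key range covers its hash key."""
        return bisect.bisect_right(bounds, hash_key) + 1   # servers numbered 1..5

    print(server_for(43), server_for(69))   # -> 2 3, matching tasks T1 and T2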
50. MapReduce on EclipseMR
Job Scheduler
(Figure: the EclipseMR job scheduler placing the map and reduce tasks of MR application 1 onto servers A-F of the DHT ring according to their hash keys.)
Server | Hash Key Range
A      | [57, 5)  (wraps around)
B      | [5, 11)
C      | [11, 18)
D      | [18, 39)
E      | [39, 48)
F      | [48, 57)
- MR application 1 accesses a file (hash key=38): data blocks 5 & 56
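Server A's range [57, 5) wraps around the end of the key space, so a lookup has to treat wrapped and ordinary ranges differently. A hypothetical Python sketch of such a wrap-aware lookup:

    # Hash-key ranges from the table above; A's range wraps past the maximum key.
    ranges = {"A": (57, 5), "B": (5, 11), "C": (11, 18),
              "D": (18, 39), "E": (39, 48), "F": (48, 57)}

    def owner(hash_key):
        """Return the server whose [lo, hi) range covers hash_key."""
        for server, (lo, hi) in ranges.items():
            if lo < hi:                                  # ordinary range
                if lo <= hash_key < hi:
                    return server
            elif hash_key >= lo or hash_key < hi:        # wrapped range, e.g. [57, 5)
                return server

    print(owner(38))   # -> "D", matching the file access with hash key 38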
53. Ongoing Works
Machine Learning on EclipseMR
- Integration of GPU ML packages with EclipseMR
- "Using GPUs on Hadoop is just too hard."
(Image: "World's largest artificial neural networks with GPU", Source: Stanford AI Lab)
54. Ongoing Works
Cassandra/NoSQL
- Integration of Cassandra with EclipseMR
- NewSQL on EclipseMR
- Developing a high-speed NewSQL engine using DHT in-memory caching and RDMA
(Figure: a 16-node DHT ring with keys 0-15 partitioned as {13,14,15}, {0,1}, {2,3,4}, {5,6,7}, {8,9,10,11,12}.)