The Future of Medicine: Digital Healthcare

Professor, SAHIST, Sungkyunkwan University

Director, Digital Healthcare Institute 

Yoon Sup Choi, Ph.D.
“It's in Apple's DNA that technology alone is not enough. 

It's technology married with liberal arts.”
The Convergence of IT, BT and Medicine
[Book cover: Medical Artificial Intelligence (의료 인공지능), written by Yoon Sup Choi; cover design by Choi Seung-hyeop (최승협); author profile text truncated]
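Medical AI is driving innovation that will reshape the conservative healthcare system. Its rapid progress and
broad impact are hard to grasp for today's medical professionals, whose fields have advanced through specialization
and subdivision, and it is unclear even where to begin studying. In this situation, this book, which explains the
concepts and applications of medical AI and its relationship with physicians in accessible terms, will be a good guide.
It is an especially useful introduction for the medical students and young clinicians who will lead the future.
━ Joon Beom Seo (서준범), Professor of Radiology, Asan Medical Center; Director, Medical Imaging AI Project Group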
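Almost no one disagrees that AI will greatly change the paradigm of medicine. But the medical problems AI
must handle are many, and the ways to solve them vary enormously. The cure-all medical AI that people often
imagine does not exist. This book offers a balanced analysis of the development, use, and potential of a
wide range of medical AI. I recommend it both to clinicians who want to adopt AI and to AI researchers
taking on the unfamiliar field of medicine.
━ Jihoon Jeong (정지훈), Senior Teaching Professor, Department of Media Communication, Kyung Hee Cyber University; physician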
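As the professor responsible for basic medical science education at Seoul National University College of Medicine,
I keenly feel that today's medical education, unchanged since industrialization, cannot prepare medical students
for the upheaval of the AI era. This book carries the expert analysis and forward-looking perspective of Director
Yoon Sup Choi, who is pioneering AI education for medical schools with me. I recommend it to medical students and
professors preparing for a future with AI, and to students and parents considering medical school.
━ Hyung Jin Choi (최형진), Professor, Department of Anatomy, Seoul National University College of Medicine; internal medicine specialist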
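Extreme views and attitudes toward the adoption of medical AI currently coexist. Through diverse cases and deep
insight, this book offers a balanced view of the present and future of medical AI and opens a forum for the
discussion needed before AI is adopted in medicine in earnest. When we look back ten years from now, with medical
AI part of everyday practice, I hope we will find that this book served as a guide leading the way to that era.
━ Kyu-Hwan Jung (정규환), CTO, VUNO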
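Medical AI requires a more fundamental understanding than AI in other fields, because it goes beyond simply
taking over human work and shifts the paradigm of medicine toward a data-driven one. We therefore need a
balanced understanding of AI and careful thought about how it can help physicians and patients. That is why
this book, which brings together the results of such efforts worldwide, is so welcome.
━ Seungwook Paek (백승욱), CEO, Lunit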
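This book covers not only the latest trends in medical AI but also its significance, limitations, and outlook,
and it gives the reader much to think about. On the many contested issues, the author presents his own view
persuasively and on clear evidence. Personally, I plan to use this book as a textbook for a graduate course.
━ Soo-Yong Shin (신수용), Professor, Department of Digital Health, Sungkyunkwan University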
Medical Artificial Intelligence (의료 인공지능), by Yoon Sup Choi
Presented by Dr. Yoon Sup Choi, scholar of the future of medicine:
The present and future of medical AI
Where medical deep learning and IBM Watson stand today
Will AI replace doctors?
Price: 20,000 KRW
ISBN 979-11-86269-99-2
Inevitable Tsunami of Change
Korean Society of Radiology Spring Conference, June 2017
“Technology will replace 80% of doctors”
https://www.youtube.com/watch?time_continue=70&v=2HMPRXstSvQ
“We should stop training radiologists right now.
It is obvious that deep learning will outperform radiologists within five years.”
Hinton on Radiology
https://rockhealth.com/reports/2018-year-end-funding-report-is-digital-health-in-a-bubble/
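•$8.1B was invested in 2018, setting yet another record (up roughly 42% from the previous year)

•368 deals in total (a slight increase from 359 the previous year): individual deals got larger

•Half of all deals were seed or Series A investments

•Early-stage companies are receiving the largest investments ever, and more often than ever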
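The claim that individual deals grew can be checked with simple arithmetic. The sketch below uses only the figures quoted on this slide ($8.1B, 368 deals, and 359 deals the year before); the variable names are illustrative.

```python
# Figures quoted on the slide (Rock Health 2018 year-end funding report).
total_funding_2018_usd = 8.1e9
deal_count_2018 = 368
deal_count_2017 = 359


def average_deal_size(total_usd: float, deals: int) -> float:
    """Mean funding per deal, in US dollars."""
    return total_usd / deals


avg_2018 = average_deal_size(total_funding_2018_usd, deal_count_2018)
deal_growth = deal_count_2018 / deal_count_2017 - 1

print(f"Average 2018 deal size: ${avg_2018 / 1e6:.1f}M")
print(f"Deal count growth vs. 2017: {deal_growth:.1%}")
```

Total funding rose far faster than the deal count, which is what "individual deals got larger" means here.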
FUNDING SNAPSHOT: YEAR OVER YEAR
[Chart: quarterly funding (Q1-Q4) and annual deal count, 2010-2018. Deal counts as extracted, 2010 to 2018: 153, 283, 476, 647, 608, 568, 684, 851, 765. Total funding in 2018: $14.6B, including $3.0B in Q4.]
Funding surpassed 2017 numbers by almost $3B, making 2018 the fourth consecutive increase in capital investment and
largest since we began tracking digital health funding in 2010. Deal volume decreased from Q3 to Q4, but deal sizes spiked,
with $3B invested in Q4 alone. Average deal size in 2018 was $21M, a $6M increase from 2017.
Source: StartUp Health Insights | startuphealth.com/insights Note: Report based on public data through 12/31/18 on seed (incl. accelerator), venture, corporate venture, and private equity funding only. © 2019 StartUp Health LLC
•Global investment trends also show a record high in 2018: $14.6B

•Rising for four consecutive years since 2015
https://hq.startuphealth.com/posts/startup-healths-2018-insights-funding-report-a-record-year-for-digital-health
Investment of Google Ventures in 2014-2015
2014: Life Science & Health 36%, Mobile 27%, Enterprise & Data 24%, Consumer 8%, Commerce 5%
2015: Life Science & Health 31%, Consumer 24%, Enterprise 23%, Data & AI 13%, Others 9%
startuphealth.com/reports
THE TOP INVESTORS OF 2017 YTD
[Table: firm, 2017 YTD deals, and stage (early, mid, late); firm logos not recoverable. Deal counts by rank: 7, 7, 6, 6, 5, 5, 5, 5.]
We are seeing huge strides in new investors pouring money into the digital health market, however all the top 10 investors of
2017 year to date are either maintaining or increasing their investment activity.
Source: StartUp Health Insights | startuphealth.com/insights Note: Report based on public data on seed, venture, corporate venture and private equity funding only. © 2017 StartUp Health LLC
•Google Ventures and Khosla Ventures tied for first place with 7 deals each

•GE Ventures and Accel Partners tied for second with 6 deals each

•Companies GV invested in:

•ClassPass, a New York company building a virtual fitness membership network

•Science 37, a remote clinical trial company

•ZappRx, a digital specialty prescribing platform, among others

•Companies Khosla Ventures invested in:

•TwoPoreGuys, which makes single-molecule testing devices

•Catalia Health, which makes Mabu, an AI-powered patient engagement robot
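•Over the last three years, digital healthcare investment by pharmaceutical companies such as Merck, J&J, and GSK has surged

•22 deals in total in 2015-2016 (equal to the number of deals over the five years 2010-2014)

•Merck is the most active: 24 investments ($5-7M) through its Global Health Innovation Fund since 2009

•GSK has made 6 since 2014 (via its VC arm, SR One), including Propeller Health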
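Map of Healthcare-Related Fields (ver 0.6)
Healthcare
Health management in the broad sense that involves neither digital technology nor specialized medical care
e.g., exercise, nutrition, sleep
Digital healthcare
Health management that uses digital technology
e.g., Internet of Things, artificial intelligence, 3D printers, VR/AR
Mobile healthcare
The part of digital healthcare that uses mobile technology
e.g., smartphones, Internet of Things, SNS
Personal genome analysis
Cancer genomes, disease risk, carrier status, drug sensitivity
e.g., wellness, ancestry analysis
Medicine
Specialized medical care such as disease prevention, treatment, prescription, and management
Telemedicine
Remote patient monitoring
Remote consultation: phone, video, remote reading (interpretation)
Digital therapeutics
Diabetes prevention apps, addiction treatment apps, ADHD treatment games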
EDITORIAL OPEN
Digital medicine, on its way to being just plain medicine
npj Digital Medicine (2018) 1:20175; doi:10.1038/s41746-017-0005-1
There are already nearly 30,000 peer-reviewed English-language
scientific journals, producing an estimated 2.5 million articles a year [1].
So why another, and why one focused specifically on digital
medicine?
To answer that question, we need to begin by defining what
“digital medicine” means: using digital tools to upgrade the
practice of medicine to one that is high-definition and far more
individualized. It encompasses our ability to digitize human beings
using biosensors that track our complex physiologic systems, but
also the means to process the vast data generated via algorithms,
cloud computing, and artificial intelligence. It has the potential to
democratize medicine, with smartphones as the hub, enabling
each individual to generate their own real world data and being
far more engaged with their health. Add to this new imaging
tools, mobile device laboratory capabilities, end-to-end digital
clinical trials, telemedicine, and one can see there is a remarkable
array of transformative technology which lays the groundwork for
a new form of healthcare.
As is obvious by its definition, the far-reaching scope of digital
medicine straddles many and widely varied expertise. Computer
scientists, healthcare providers, engineers, behavioral scientists,
ethicists, clinical researchers, and epidemiologists are just some of
the backgrounds necessary to move the field forward. But to truly
accelerate the development of digital medicine solutions in health
requires the collaborative and thoughtful interaction between
individuals from several, if not most of these specialties. That is the
primary goal of npj Digital Medicine: to serve as a cross-cutting
resource for everyone interested in this area, fostering collaborations
and accelerating its advancement.
Current systems of healthcare face multiple insurmountable
challenges. Patients are not receiving the kind of care they want
and need, caregivers are dissatisfied with their role, and in most
countries, especially the United States, the cost of care is
unsustainable. We are confident that the development of new
systems of care that take full advantage of the many capabilities
that digital innovations bring can address all of these major issues.
Researchers too, can take advantage of these leading-edge
technologies as they enable clinical research to break free of the
confines of the academic medical center and be brought into the
real world of participants’ lives. The continuous capture of multiple
interconnected streams of data will allow for a much deeper
refinement of our understanding and definition of most phenotypes,
with the discovery of novel signals in these enormous data
sets made possible only through the use of machine learning.
Our enthusiasm for the future of digital medicine is tempered by
the recognition that presently too much of the publicized work in
this field is characterized by irrational exuberance and excessive
hype. Many technologies have yet to be formally studied in a
clinical setting, and for those that have, too many began and
ended with an under-powered pilot program. In addition, there are
more than a few examples of digital “snake oil” with substantial
uptake prior to their eventual discrediting [2]. Both of these practices
are barriers to advancing the field of digital medicine.
Our vision for npj Digital Medicine is to provide a reliable,
evidence-based forum for all clinicians, researchers, and even
patients, curious about how digital technologies can transform
every aspect of health management and care. Being open source,
as all medical research should be, allows for the broadest possible
dissemination, which we will strongly encourage, including
through advocating for the publication of preprints.
And finally, quite paradoxically, we hope that npj Digital
Medicine is so successful that in the coming years there will no
longer be a need for this journal, or any journal specifically
focused on digital medicine. Because if we are able to meet our
primary goal of accelerating the advancement of digital medicine,
then soon, we will just be calling it medicine. And there are
already several excellent journals for that.
ACKNOWLEDGEMENTS
Supported by the National Institutes of Health (NIH)/National Center for Advancing
Translational Sciences grant UL1TR001114 and a grant from the Qualcomm Foundation.
ADDITIONAL INFORMATION
Competing interests:The authors declare no competing financial interests.
Publisher's note:Springer Nature remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
Change history:The original version of this Article had an incorrect Article number
of 5 and an incorrect Publication year of 2017. These errors have now been corrected
in the PDF and HTML versions of the Article.
Steven R. Steinhubl (1) and Eric J. Topol (1)
(1) Scripps Translational Science Institute, 3344 North Torrey Pines
Court, Suite 300, La Jolla, CA 92037, USA
Correspondence: Steven R. Steinhubl (steinhub@scripps.edu) or
Eric J. Topol (etopol@scripps.edu)
REFERENCES
1. Ware, M. & Mabe, M. The STM report: an overview of scientific and scholarly journal
publishing 2015 [updated March]. http://digitalcommons.unl.edu/scholcom/92017
(2015).
2. Plante, T. B., Urrea, B. & MacFarlane, Z. T. et al. Validation of the instant blood
pressure smartphone App. JAMA Intern. Med. 176, 700–702 (2016).
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2018
Received: 19 October 2017 Accepted: 25 October 2017
www.nature.com/npjdigitalmed
Published in partnership with the Scripps Translational Science Institute
What is the future of digital medicine?

To become just ordinary medicine
What is the most important factor in digital medicine?
“Data! Data! Data!” he cried. “I can’t
make bricks without clay!”
- Sherlock Holmes, “The Adventure of the Copper Beeches”
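New data are measured, stored, integrated, and analyzed in new ways by new actors.
New data: the types of data; the qualitative and quantitative aspects of data
New ways: wearable devices, smartphones, genetic analysis, artificial intelligence, SNS
New actors: users/patients, the general public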
The Three Steps of Digital Healthcare
•Step 1. Data measurement

•Step 2. Data integration

•Step 3. Data analysis
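The three steps above can be pictured as a small data pipeline. The following is a minimal sketch, not a real system: the `Reading` type, the device labels, and all values are made up for illustration, and the "analysis" is just a mean per metric.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    source: str   # hypothetical device label, e.g. "wearable"
    metric: str   # e.g. "heart_rate"
    value: float


def measure() -> list[Reading]:
    """Step 1: collect raw readings (hard-coded, made-up values)."""
    return [
        Reading("wearable", "heart_rate", 62.0),
        Reading("wearable", "heart_rate", 70.0),
        Reading("smartphone", "steps", 4200.0),
        Reading("smartphone", "steps", 3800.0),
    ]


def integrate(readings: list[Reading]) -> dict[str, list[float]]:
    """Step 2: merge readings from all sources into one record per metric."""
    merged: dict[str, list[float]] = {}
    for reading in readings:
        merged.setdefault(reading.metric, []).append(reading.value)
    return merged


def analyze(record: dict[str, list[float]]) -> dict[str, float]:
    """Step 3: reduce each metric to a summary (here, simply the mean)."""
    return {metric: mean(values) for metric, values in record.items()}


summary = analyze(integrate(measure()))
print(summary)
```

Each stage only consumes the output of the previous one, which mirrors how the deck treats measurement, integration, and analysis as separate layers of the industry.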
Digital Healthcare Industry Landscape (ver. 3)
Stages: Data Measurement, Data Integration, Data Interpretation, Treatment
Categories: Smartphone Gadget/Apps, DNA, Wearables / IoT, Device, EMR/EHR, Data Platform, Artificial Intelligence, 2nd Opinion, Telemedicine, Counseling, On Demand (O2O), 3D Printer, VR, Accelerator/early-VC
Digital Healthcare Institute
Director, Yoon Sup Choi, Ph.D.
yoonsup.choi@gmail.com
Step 1. Data Measurement
Smartphone: the origin of healthcare innovation
2013?
The election of Pope Benedict
The Election of Pope Francis
Sci Transl Med 2015
Otoscope, dermatoscope, eye disease, skin cancer
Parasites, respiratory function, ECG, sleep
Diet, physical activity, fever, menstruation/pregnancy
CellScope’s iPhone-enabled otoscope
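Illegal in Korea
“In the video of the left ear, fluid is visible behind the eardrum. The eardrum is not
particularly swollen or abnormally shaped, so there does not appear to be severe
inflammation.
Given that you had trouble equalizing pressure while scuba diving, it would also be a
good idea to be examined in person by a doctor who can test the movement of the
eardrum. ...”
Illegal in Korea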
First Derm
Illegal in Korea
SpiroSmart: spirometer using iPhone
AliveCor Heart Monitor (Kardia)
“Your heartbeat is stable, so you do not need to go to the hospital right away.
Still, if anything seems wrong, please see a specialist.”
Illegal in Korea
2015 / 2017
Ordinary snoring lasting about 30 minutes to 1 hour

How can this be trusted?
It records the audio.

Has analytical validity against PSG (polysomnography) been demonstrated?
PGHD
Patient-Generated Health Data
Wearable Devices
http://www.rolls-royce.com/about/our-technology/enabling-technologies/engine-health-management.aspx#sense
250 sensors to monitor the “health” of the GE turbines
Fig 1. What can consumer wearables do? Heart rate can be measured with an oximeter built into a ring [3], muscle activity with an electromyographic
sensor embedded into clothing [4], stress with an electrodermal sensor incorporated into a wristband [5], and physical activity or sleep patterns via an
accelerometer in a watch [6,7]. In addition, a female’s most fertile period can be identified with detailed body temperature tracking [8], while levels of me…
attention can be monitored with a small number of non-gelled electroencephalogram (EEG) electrodes [9]. Levels of social interaction (also known to a…
PLOS Medicine 2016
Hype or Hope?
Source: Gartner
Fitbit
Apple Watch
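Apple Watch Series 4: ECG, arrhythmia, and fall detection
FDA medical device authorization
•Authorized as a De Novo medical device (a new type of medical device)

•Announced in September, but the arrhythmia-related features were activated in December

•Available only on Apple Watches sold in the US, not in Korea (a watch bought in the US works with a Korean App Store ID)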
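•American College of Cardiology’s 68th Annual Scientific Session

•Only 0.5% of all study participants received an irregular pulse notification

•Using the Apple Watch and an ECG patch at the same time gave a positive predictive value of 71%

•84% of those who received an irregular pulse notification were in atrial fibrillation at that moment

•Among those who wore an ECG patch for the following week as follow-up, atrial fibrillation was found in 34%

•57% of those who received an irregular pulse notification actually sought medical care (0.3% of the whole study population)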
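For readers unfamiliar with the term, positive predictive value (PPV) is the share of positive alerts that turn out to be correct. The sketch below shows the definition only; the counts are hypothetical, chosen so the ratio matches the 71% quoted above, and are not the study's actual numbers.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): the share of positive alerts that were correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive results, so PPV is undefined")
    return true_positives / flagged


# Hypothetical counts, chosen only so the ratio matches the 71% on the slide.
ppv = positive_predictive_value(true_positives=71, false_positives=29)
print(f"PPV: {ppv:.0%}")
```

PPV depends on how common the condition is among the people tested, so the same device can show a different PPV in a different population.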
It’s already here.
Google’s Smart Contact Lens
Ingestible Sensor, Proteus Digital Health
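The Healthcare Wearable Dilemma
Two axes: sustained use (“Do people keep using it?”) and user utility (“What do I gain by using it?”)
Accuracy
•Accuracy alone does not keep people using a device
•Medical use requires accuracy
•Insurers need accuracy before they can rely on the data
Does the device need to change behavior? (YES / NO)
Does the utility greatly outweigh the hassle? (YES / NO)
Utility can be expected only once the device is actually used
The diabetes paradox: “I know it is good for me, but I still don’t use it”
Types of utility
•Financial utility: “It pays me”
•Medical utility: “It cures my illness”
•Entertainment utility: “It is fun”
•Aesthetic utility: “It is pretty”
•Social utility: “I make friends”
•Convenience utility: “Payment is easy”
Yoon Sup Choi Digital Healthcare Institute, www.yoonsupchoi.com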
Nat Biotech 2015
Personal Genome Analysis
Gattaca (1997)
Year | Genome | Time | Cost
2003 | Human Genome Project | 13 years (676 weeks) | $2,700,000,000
2007 | Dr. Craig Venter’s genome | 4 years (208 weeks) | $100,000,000
2008 | Dr. James Watson’s genome | 4 months (16 weeks) | $1,000,000
2009 | (Nature Biotechnology) | 4 weeks | $48,000
2013 | | 1-2 weeks | ~$5,000
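To make the scale of the price drop concrete, the fold reduction between consecutive entries can be computed directly from the costs listed above. This is only arithmetic on the slide's figures; the 2013 cost is approximate in the source.

```python
# (year, cost in USD) pairs as listed above; the 2013 figure is approximate.
costs = [
    (2003, 2_700_000_000),
    (2007, 100_000_000),
    (2008, 1_000_000),
    (2009, 48_000),
    (2013, 5_000),
]


def fold_reduction(earlier_cost: float, later_cost: float) -> float:
    """How many times cheaper the later cost is than the earlier one."""
    return earlier_cost / later_cost


for (year_a, cost_a), (year_b, cost_b) in zip(costs, costs[1:]):
    print(f"{year_a} -> {year_b}: {fold_reduction(cost_a, cost_b):,.0f}x cheaper")

overall = fold_reduction(costs[0][1], costs[-1][1])
print(f"2003 -> 2013 overall: {overall:,.0f}x cheaper")
```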
The $1000 Genome is Already Here!
• NovaSeq 5000 and 6000 announced in January 2017

• Publicly pledged to deliver WES for $100 within a few years

• WES for 60 people in 2 days (less than one hour per person)
A little spit is all it takes!
Results within 6-8 weeks
DTC (Direct-To-Consumer) Genetic Testing
Health Risks
Drug Response
Traits
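Facial flushing after drinking alcohol
Ability to taste bitterness
Earwax type
Eye color
Curly hair
Lactose digestion
Malaria resistance
Likelihood of baldness
Muscle performance
Blood type
Norovirus resistance
HIV resistance
Likelihood of nicotine addiction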
Ancestry Composition
Customer growth of 23andMe
[Chart: cumulative customers from 2007-11 to 2019-03. Labeled milestones: 1,000,000; 2,000,000; 3,000,000; 5,000,000; 10,000,000 (2019-03).]
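23andMe Chronicle
[Timeline, 2006-2019, split into a Business track and a Regulation track. Items are listed per track; years are given only where the source ties an item to a year.]
Business
•23andMe founded (2006)
•Google Ventures invests $3.6 million
•Price cut to $399
•Price cut to $99
•TV advertising begins
•Customer milestones: 300,000; 500,000; 1 million; 1.2 million; 2 million; 5 million; 10 million (2019)
•DTC service launched in the UK
•DTC service launched in Canada
•Genentech and Pfizer purchase 23andMe data
•Plans for in-house drug development announced
•Data-collection partnership with Apple ResearchKit
•$115M funding round (unicorn status)
•$300M investment from GSK
Regulation
•510(k) submitted to the FDA
•FDA 510(k) withdrawn
•FDA orders sales halted
•FDA asked to approve the Bloom syndrome test
•FDA authorizes the Bloom syndrome DTC service
•DTC service resumes, including carrier status ($199)
•FDA authorizes disease-risk DTC tests and establishes a process for exempting them from related regulation (2017)
•FDA Commissioner Gottlieb proposes Pre-Cert for disease-risk genetic DTC services
•BRCA 1/2 DTC test permitted
•FDA Pre-Cert for disease-risk genetic DTC services takes effect (2018)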
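•Pre-Cert began to apply to disease-risk genetic analysis DTC services (June 5, 2018)

•If a company demonstrates analytical validity of 99% or higher just once, at the outset,

•it is recognized as a company capable of building accurate genetic analysis services,

•and its later services are exempt from premarket review

•However, four potentially sensitive types of analysis are excluded from this relaxation:

•Prenatal testing

•Cancer predisposition testing (leading to preventive screening or treatment decisions)

•Pharmacogenomic testing

•Testing for genetic factors of dominantly inherited diseases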
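The rule described above reads like a simple decision procedure, and it can be sketched as one. This is an illustration of the slide's summary only, not of the FDA's actual regulatory text; the category names, the function name, and the representation of "demonstrated validity" as a single number are all assumptions.

```python
from enum import Enum
from typing import Optional


class DtcCategory(Enum):
    DISEASE_RISK = "disease risk"
    PRENATAL = "prenatal testing"
    CANCER_PREDISPOSITION = "cancer predisposition"
    PHARMACOGENOMICS = "pharmacogenomics"
    DOMINANT_DISEASE = "dominantly inherited disease"


# The four categories the slide lists as excluded from the exemption.
EXCLUDED = {
    DtcCategory.PRENATAL,
    DtcCategory.CANCER_PREDISPOSITION,
    DtcCategory.PHARMACOGENOMICS,
    DtcCategory.DOMINANT_DISEASE,
}

ANALYTICAL_VALIDITY_THRESHOLD = 0.99


def needs_premarket_review(
    category: DtcCategory, demonstrated_validity: Optional[float]
) -> bool:
    """Return True when a new test would still need review under the slide's rule.

    demonstrated_validity is the company's one-time analytical validity result,
    or None if the company has not gone through that step yet.
    """
    if category in EXCLUDED:
        return True
    if demonstrated_validity is None:
        return True
    return demonstrated_validity < ANALYTICAL_VALIDITY_THRESHOLD


print(needs_premarket_review(DtcCategory.DISEASE_RISK, 0.995))
print(needs_premarket_review(DtcCategory.PRENATAL, 0.995))
```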
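Limited Approval of DTC Genetic Testing in Korea

(June 30, 2016)
• Notice permitting non-medical institutions to perform genetic testing directly, enacted and in force from June 30

• A follow-up to the December 2015 amendment of the Bioethics and Safety Act (amended Dec. 29, 2015; in force June 30, 2016)
and to the regulatory improvements announced at the 9th Trade and Investment Promotion Meeting (February 2016)

• Private genetic testing companies may directly test 46 genes related to 12 test items,
including blood glucose, blood pressure, skin aging, and body mass index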
http://www.mohw.go.kr/m/noticeView.jsp?MENU_ID=0403&cont_seq=333112&page=1
No. | Test item (number of genes) | Genes
1 | Body mass index (3) | FTO, MC4R, BDNF
2 | Triglyceride level (8) | GCKR, DOCK7, ANGPTL3, BAZ1B, TBL2, MLXIPL, LOC105375745, TRIB1
3 | Cholesterol (8) | CELSR2, SORT1, HMGCR, ABO, ABCA1, MYL2, LIPG, CETP
4 | Blood glucose (8) | CDKN2A/B, G6PC2, GCK, GCKR, GLIS3, MTNR1B, DGKB-TMEM195, SLC30A8
5 | Blood pressure (8) | NPR3, ATP2B1, NT5C2, CSK, HECTD4, GUCY1A3, CYP17A1, FGF5
6 | Pigmentation (2) | OCA2, MC1R
7 | Hair loss (3) | chr20p11 (rs1160312, rs2180439), IL2RA, HLA-DQB1
8 | Hair thickness (1) | EDAR
9 | Skin aging (1) | AGER
10 | Skin elasticity (1) | MMP1
11 | Vitamin C level (1) | SLC23A1 (SVCT1)
12 | Caffeine metabolism (2) | AHR, CYP1A1-CYP1A2
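The table can be cross-checked against the "12 items, 46 genes" figure in the notice. The sketch below encodes the table as a dictionary (English item names are translations, and each locus is reduced to one label, e.g. `chr20p11`) and counts the entries.

```python
from collections import Counter

# Test items and genes as listed in the table above.
permitted_tests = {
    "Body mass index": ["FTO", "MC4R", "BDNF"],
    "Triglyceride level": ["GCKR", "DOCK7", "ANGPTL3", "BAZ1B", "TBL2",
                           "MLXIPL", "LOC105375745", "TRIB1"],
    "Cholesterol": ["CELSR2", "SORT1", "HMGCR", "ABO", "ABCA1", "MYL2",
                    "LIPG", "CETP"],
    "Blood glucose": ["CDKN2A/B", "G6PC2", "GCK", "GCKR", "GLIS3", "MTNR1B",
                      "DGKB-TMEM195", "SLC30A8"],
    "Blood pressure": ["NPR3", "ATP2B1", "NT5C2", "CSK", "HECTD4", "GUCY1A3",
                       "CYP17A1", "FGF5"],
    "Pigmentation": ["OCA2", "MC1R"],
    "Hair loss": ["chr20p11", "IL2RA", "HLA-DQB1"],
    "Hair thickness": ["EDAR"],
    "Skin aging": ["AGER"],
    "Skin elasticity": ["MMP1"],
    "Vitamin C level": ["SLC23A1"],
    "Caffeine metabolism": ["AHR", "CYP1A1-CYP1A2"],
}

gene_counts = Counter(g for genes in permitted_tests.values() for g in genes)
total_entries = sum(gene_counts.values())
repeated = sorted(g for g, n in gene_counts.items() if n > 1)

print(f"Test items: {len(permitted_tests)}")
print(f"Gene entries across all items: {total_entries}")
print(f"Genes listed under more than one item: {repeated}")
```

The per-item counts add up to the 46 in the notice; note that this total counts a gene once for each item it appears under.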
https://www.23andme.com/slideshow/research/
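Genetics research driven by customers' voluntary participation
When you clasp your hands, which thumb ends up on top?
Morning person or night owl?
Do you sneeze when exposed to bright light?
Muscle performance
Ability to taste bitterness
Does your face flush after drinking alcohol?
Lactase deficiency?
81% of customers voluntarily answer 10 or more questions

1 million data points accumulate every week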

The More Data, The Higher Accuracy!
January 6, 2015 / January 13, 2015
Data Business
Step 1. Data Measurement
•Smartphones

•Wearable devices

•Personal genome analysis
Patient-Generated Health Data (PGHD)
Step 2. Data Integration
Sci Transl Med 2015
[Diagram: data integration platforms (Google Fit, Samsung SAMI, Apple HealthKit) linking patients/users, devices (Dexcom CGM, Withings, Apple Watch), and apps to hospital systems (Epic MyChart, Epic EHR); interoperability among Hospital A, Hospital B, and Hospital C]
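•At launch in January 2018, connected to 12 hospitals, including Johns Hopkins and UC San Diego

•(As of February 2019) connected to more than 200 hospitals within one year

•Announced that it will also connect to the VA (with 9 million veterans)

•Google Health, launched in 2008, connected to only 12 hospitals in three years
Step 3. Data Analysis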
Data Overload
How to Analyze and Interpret the Big Data?
and/or
Two ways to get insights from the big data
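Telemedicine
• Korea appears to be the only place where it is “explicitly” and “comprehensively” “banned”

• Overseas, a large share of new services include telemedicine features

• 39 of the world's top 100 healthcare services include telemedicine

• New models keep emerging as it is combined with other models

• Combined with smartphones, wearables, IoT, AI, chatbots, and more

• Where will Korean healthcare be in 10 years?
Terminology
Telemedicine (원격 의료)
Remote consultation (원격 진료)
Remote patient monitoring
Video consultation
Phone consultation
Second opinion
Data interpretation
Remote surgery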
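There are many kinds of telemedicine.
•Remote consultation: video consultation

•Remote consultation: second opinion

•Remote consultation: applications

•Remote patient monitoring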
Telemedicine
Average Time to Appointment (Family Medicine), in days
City | 2009 | 2014 | 2017
Boston | 63 | 66 | 109
LA | 59 | 20 | 42
Portland | 8 | 13 | 39
Miami | 7 | 12 | 28
Atlanta | 9 | 24 | 27
Denver | 14 | 16 | 27
Detroit | 14 | 16 | 27
New York | 24 | 26 | 26
Seattle | 8 | 23 | 26
Houston | 17 | 19 | 21
Philadelphia | 9 | 21 | 17
Washington DC | 30 | 14 | 17
San Diego | 24 | 7 | 13
Dallas | 8 | 5 | 12
Minneapolis | 10 | 10 | 8
Total | 20.3 | 19.5 | 29.3
Growth of Teladoc
Year | Revenue ($m) | Visits (k) | Members (m)
2013 | 20 | 127 | 6.2
2014 | 44 | 299 | 8.1
2015 | 77.4 | 575 | 11.5
2016 | 123 | 952 | 17.5
2017 | 233.3 | 1,461 | 19.6
2018 | 417.9 | 2,036 | 22.8
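One way to summarize the Teladoc chart is the compound annual growth rate (CAGR). The sketch below assumes the chart's endpoints are revenue of $20m in 2013 and $417.9m in 2018, and membership of 6.2m in 2013 and 22.8m in 2018, which is how the extracted chart values appear to line up with the year axis.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints read from the chart; the year-to-value mapping is an assumption.
revenue_growth = cagr(20.0, 417.9, 2018 - 2013)
members_growth = cagr(6.2, 22.8, 2018 - 2013)

print(f"Revenue CAGR 2013-2018: {revenue_growth:.1%}")
print(f"Members CAGR 2013-2018: {members_growth:.1%}")
```

Revenue compounding much faster than membership suggests that usage or revenue per member grew over the period, not just the member base.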
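There are many kinds of telemedicine.
•Remote consultation: video consultation

•Remote consultation: second opinion

•Remote consultation: applications

•Remote patient monitoring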
[Figure: patients/users, devices (Dexcom CGM, Withings, Apple Watch), and apps feed HealthKit, which connects to Epic MyChart and the hospital's Epic EHR]
transfer from Share2 to HealthKit as mandated by Dexcom receiver
Food and Drug Administration device classification. Once the glucose
values reach HealthKit, they are passively shared with the Epic
MyChart app (https://www.epic.com/software-phr.php). The MyChart
patient portal is a component of the Epic EHR and uses the same database,
and the CGM values populate a standard glucose flowsheet in
the patient’s chart. This connection is initially established when a provider
places an order in a patient’s electronic chart, resulting in a request
to the patient within the MyChart app. Once the patient or
patient proxy (parent) accepts this connection request on the mobile
device, a communication bridge is established between HealthKit and
MyChart enabling population of CGM data as frequently as every 5

Participation required confirmation of Bluetooth pairing of the CGM receiver
to a mobile device, updating the mobile device with the most recent
version of the operating system, Dexcom Share2 app, Epic MyChart app,
and confirming or establishing a username and password for all accounts,
including a parent’s/adolescent’s Epic MyChart account. Setup time averaged
45–60 minutes in addition to the scheduled clinic visit. During this
time, there was specific verbal and written notification to the patients/parents
that the diabetes healthcare team would not be actively monitoring
or have real-time access to CGM data, which was out of scope for this pilot.
The patients/parents were advised that they should continue to contact
the diabetes care team by established means for any urgent questions/
concerns. Additionally, patients/parents were advised to maintain updates
Figure 1: Overview of the CGM data communication bridge architecture.
Kumar R B, et al. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv206, Brief Communication
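•Continuous glucose monitoring data collected through Apple HealthKit and a Dexcom CGM device were integrated with the EHR

•The study reports that this improved blood glucose management in patients with diabetes

•Conducted by Stanford Children’s Health and Stanford School of Medicine with 10 pediatric patients with type 1 diabetes (288 readings/day)

•EHR-based data analysis and visualization improved data review and communication with patients

•Compared with the conventional model of in-person visits, patients can respond to glucose changes in real time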
JAMIA 2016
Remote Patient Monitoring
via Dexcom-HealthKit-Epic-Stanford
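The "288 readings/day" figure follows from the sampling interval: the article excerpt describes CGM data arriving as often as every 5 minutes. The sketch below works through that arithmetic; treating every patient as producing a reading at every interval is an assumption.

```python
MINUTES_PER_DAY = 24 * 60
sampling_interval_minutes = 5   # "as frequently as every 5" minutes
pilot_patients = 10             # cohort size quoted on the slide

readings_per_day = MINUTES_PER_DAY // sampling_interval_minutes
cohort_readings_per_day = readings_per_day * pilot_patients

print(f"Readings per patient per day: {readings_per_day}")
print(f"Upper bound for the whole cohort per day: {cohort_readings_per_day}")
```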
Some in the medical community are even calling for remote patient monitoring to be legalized
No choice but to bring AI into medicine
Martin Duggan, “IBM Watson Health - Integrated Care & the Evolution to Cognitive Computing”
Copyright 2016 American Medical Association. All rights reserved.
Development and Validation of a Deep Learning Algorithm
for Detection of Diabetic Retinopathy
in Retinal Fundus Photographs
Varun Gulshan, PhD; Lily Peng, MD, PhD; Marc Coram, PhD; Martin C. Stumpe, PhD; Derek Wu, BS; Arunachalam Narayanaswamy, PhD;
Subhashini Venugopalan, MS; Kasumi Widner, MS; Tom Madams, MEng; Jorge Cuadros, OD, PhD; Ramasamy Kim, OD, DNB;
Rajiv Raman, MS, DNB; Philip C. Nelson, BS; Jessica L. Mega, MD, MPH; Dale R. Webster, PhD
IMPORTANCE Deep learning is a family of computational methods that allow an algorithm to
program itself by learning from a large set of examples that demonstrate the desired
behavior, removing the need to specify rules explicitly. Application of these methods to
medical imaging requires further assessment and validation.
OBJECTIVE To apply deep learning to create an algorithm for automated detection of diabetic
retinopathy and diabetic macular edema in retinal fundus photographs.
DESIGN AND SETTING A specific type of neural network optimized for image classification
called a deep convolutional neural network was trained using a retrospective development
data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy,
diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists
and ophthalmology senior residents between May and December 2015. The resultant
algorithm was validated in January and February 2016 using 2 separate data sets, both
graded by at least 7 US board-certified ophthalmologists with high intragrader consistency.
EXPOSURE Deep learning–trained algorithm.
MAIN OUTCOMES AND MEASURES The sensitivity and specificity of the algorithm for detecting
referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy,
referable diabetic macular edema, or both, were generated based on the reference standard
of the majority decision of the ophthalmologist panel. The algorithm was evaluated at 2
operating points selected from the development set, one selected for high specificity and
another for high sensitivity.
RESULTS The EyePACS-1 data set consisted of 9963 images from 4997 patients (mean age, 54.4
years; 62.2% women; prevalence of RDR, 683/8878 fully gradable images [7.8%]); the
Messidor-2 data set had 1748 images from 874 patients (mean age, 57.6 years; 42.6% women;
prevalence of RDR, 254/1745 fully gradable images [14.6%]). For detecting RDR, the algorithm
had an area under the receiver operating curve of 0.991 (95% CI, 0.988-0.993) for EyePACS-1 and
0.990 (95% CI, 0.986-0.995) for Messidor-2. Using the first operating cut point with high
specificity, for EyePACS-1, the sensitivity was 90.3% (95% CI, 87.5%-92.7%) and the specificity
was 98.1% (95% CI, 97.8%-98.5%). For Messidor-2, the sensitivity was 87.0% (95% CI, 81.1%-
91.0%) and the specificity was 98.5% (95% CI, 97.7%-99.1%). Using a second operating point
with high sensitivity in the development set, for EyePACS-1 the sensitivity was 97.5% and
specificity was 93.4% and for Messidor-2 the sensitivity was 96.1% and specificity was 93.9%.
CONCLUSIONS AND RELEVANCE In this evaluation of retinal fundus photographs from adults
with diabetes, an algorithm based on deep machine learning had high sensitivity and
specificity for detecting referable diabetic retinopathy. Further research is necessary to
determine the feasibility of applying this algorithm in the clinical setting and to determine
whether use of the algorithm could lead to improved care and outcomes compared with
current ophthalmologic assessment.
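The abstract evaluates one algorithm at two operating points, one tuned for high specificity and one for high sensitivity. The sketch below illustrates that idea on made-up scores and labels; the data, thresholds, and function name are hypothetical and have nothing to do with the study's data sets.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive and compare with the labels."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up model scores and reference labels (1 = referable disease).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# A stricter threshold favors specificity; a looser one favors sensitivity.
high_specificity = sensitivity_specificity(scores, labels, threshold=0.58)
high_sensitivity = sensitivity_specificity(scores, labels, threshold=0.38)

print("High-specificity point (sensitivity, specificity):", high_specificity)
print("High-sensitivity point (sensitivity, specificity):", high_sensitivity)
```

Moving the threshold trades one metric against the other, which is why the paper reports both operating points instead of a single accuracy figure.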
JAMA. doi:10.1001/jama.2016.17216
Published online November 29, 2016.
Author Affiliations: Google Inc,
Mountain View, California (Gulshan,
Peng, Coram, Stumpe, Wu,
Narayanaswamy, Venugopalan,
Widner, Madams, Nelson, Webster);
Department of Computer Science,
University of Texas, Austin
(Venugopalan); EyePACS LLC,
San Jose, California (Cuadros); School
of Optometry, Vision Science
Graduate Group, University of
California, Berkeley (Cuadros);
Aravind Medical Research
Foundation, Aravind Eye Care
System, Madurai, India (Kim); Shri
Bhagwan Mahavir Vitreoretinal
Services, Sankara Nethralaya,
Chennai, Tamil Nadu, India (Raman);
Verily Life Sciences, Mountain View,
California (Mega); Cardiovascular
Division, Department of Medicine,
Brigham and Women’s Hospital and
Harvard Medical School, Boston,
Massachusetts (Mega).
Corresponding Author: Lily Peng,
MD, PhD, Google Research, 1600
Amphitheatre Way, Mountain View,
CA 94043 (lhpeng@google.com).
Research
JAMA | Original Investigation | INNOVATIONS IN HEALTH CARE DELIVERY
Ophthalmology
LETTERS
https://doi.org/10.1038/s41591-018-0335-9
1 Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China.
2 Institute for Genomic Medicine, Institute of Engineering in Medicine, and Shiley Eye Institute, University of California, San Diego, La Jolla, CA, USA.
3 Hangzhou YITU Healthcare Technology Co. Ltd, Hangzhou, China.
4 Department of Thoracic Surgery/Oncology, First Affiliated Hospital of Guangzhou Medical University, China State Key Laboratory and National Clinical Research Center for Respiratory Disease, Guangzhou, China.
5 Guangzhou Kangrui Co. Ltd, Guangzhou, China.
6 Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, China.
7 Veterans Administration Healthcare System, San Diego, CA, USA.
8 These authors contributed equally: Huiying Liang, Brian Tsui, Hao Ni, Carolina C. S. Valentim, Sally L. Baxter, Guangjian Liu.
*e-mail: kang.zhang@gmail.com; xiahumin@hotmail.com
Artificial intelligence (AI)-based methods have emerged as
powerful tools to transform medical care. Although machine
learning classifiers (MLCs) have already demonstrated strong
performance in image-based diagnoses, analysis of diverse
and massive electronic health record (EHR) data remains challenging.
Here, we show that MLCs can query EHRs in a manner
similar to the hypothetico-deductive reasoning used by physicians
and unearth associations that previous statistical methods
have not found. Our model applies an automated natural
language processing system using deep learning techniques
to extract clinically relevant information from EHRs. In total,
101.6 million data points from 1,362,559 pediatric patient
visits presenting to a major referral center were analyzed to
train and validate the framework. Our model demonstrates
high diagnostic accuracy across multiple organ systems and is
comparable to experienced pediatricians in diagnosing common
childhood diseases. Our study provides a proof of concept
for implementing an AI-based system as a means to aid
physicians in tackling large amounts of data, augmenting diagnostic
evaluations, and to provide clinical decision support in
cases of diagnostic uncertainty or complexity. Although this
impact may be most evident in areas where healthcare providers
are in relative shortage, the benefits of such an AI system
are likely to be universal.
Medical information has become increasingly complex over time. The range of disease entities, diagnostic testing and biomarkers, and treatment modalities has increased exponentially in recent years. Subsequently, clinical decision-making has also become more complex and demands the synthesis of decisions from assessment of large volumes of data representing clinical information. In the current digital age, the electronic health record (EHR) represents a massive repository of electronic data points representing a diverse array of clinical information1–3. Artificial intelligence (AI) methods have emerged as potentially powerful tools to mine EHR data to aid in disease diagnosis and management, mimicking and perhaps even augmenting the clinical decision-making of human physicians1.
To formulate a diagnosis for any given patient, physicians frequently use hypothetico-deductive reasoning. Starting with the chief complaint, the physician then asks appropriately targeted questions relating to that complaint. From this initial small feature set, the physician forms a differential diagnosis and decides what features (historical questions, physical exam findings, laboratory testing, and/or imaging studies) to obtain next in order to rule in or rule out the diagnoses in the differential diagnosis set. The most useful features are identified, such that when the probability of one of the diagnoses reaches a predetermined level of acceptability, the process is stopped, and the diagnosis is accepted. It may be possible to achieve an acceptable level of certainty of the diagnosis with only a few features without having to process the entire feature set. Therefore, the physician can be considered a classifier of sorts.
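The stopping rule described above can be illustrated with a minimal sketch. This is not the model from the paper: it assumes a naive Bayes update in which each finding is conditionally independent given the diagnosis, and the diagnosis names, priors, likelihoods, and the 0.9 threshold are all hypothetical values chosen for illustration.

```python
from typing import Dict, List, Tuple


def update_posterior(prior: Dict[str, float],
                     likelihoods: Dict[str, float]) -> Dict[str, float]:
    """One Bayes update: posterior is proportional to prior * P(finding | diagnosis)."""
    unnormalized = {dx: prior[dx] * likelihoods[dx] for dx in prior}
    total = sum(unnormalized.values())
    return {dx: p / total for dx, p in unnormalized.items()}


def diagnose(prior: Dict[str, float],
             findings: List[Dict[str, float]],
             threshold: float = 0.9) -> Tuple[str, float, int]:
    """Consume findings one at a time and stop as soon as one diagnosis
    reaches the acceptability threshold.

    Returns (leading diagnosis, its probability, number of findings used).
    """
    posterior = dict(prior)
    used = 0
    for likelihoods in findings:
        posterior = update_posterior(posterior, likelihoods)
        used += 1
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            break  # acceptable certainty reached; remaining findings are skipped
    best = max(posterior, key=posterior.get)
    return best, posterior[best], used


# Hypothetical differential diagnosis and findings (illustrative numbers only).
PRIOR = {"influenza": 0.5, "strep_pharyngitis": 0.3, "mononucleosis": 0.2}
FINDINGS = [
    {"influenza": 0.9, "strep_pharyngitis": 0.3, "mononucleosis": 0.4},  # e.g. abrupt fever
    {"influenza": 0.8, "strep_pharyngitis": 0.1, "mononucleosis": 0.2},  # e.g. dry cough
    {"influenza": 0.5, "strep_pharyngitis": 0.5, "mononucleosis": 0.5},  # uninformative test
]

RESULT = diagnose(PRIOR, FINDINGS, threshold=0.9)
```

With these made-up numbers the loop stops after two of the three findings, which mirrors the point that an acceptable level of certainty may be reached without processing the entire feature set.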
In this study, we designed an AI-based system using machine learning to extract clinically relevant features from EHR notes to mimic the clinical reasoning of human physicians. In medicine, machine learning methods have already demonstrated strong performance in image-based diagnoses, notably in radiology2, dermatology4, and ophthalmology5–8, but analysis of EHR data presents a number of difficult challenges. These challenges include the vast quantity of data, high dimensionality, data sparsity, and deviations
Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence
Huiying Liang1,8, Brian Y. Tsui2,8, Hao Ni3,8, Carolina C. S. Valentim4,8, Sally L. Baxter2,8, Guangjian Liu1,8, Wenjia Cai2, Daniel S. Kermany1,2, Xin Sun1, Jiancong Chen2, Liya He1, Jie Zhu1, Pin Tian2, Hua Shao2, Lianghong Zheng5,6, Rui Hou5,6, Sierra Hewett1,2, Gen Li1,2, Ping Liang3, Xuan Zang3, Zhiqi Zhang3, Liyan Pan1, Huimin Cai5,6, Rujuan Ling1, Shuhua Li1, Yongwang Cui1, Shusheng Tang1, Hong Ye1, Xiaoyan Huang1, Waner He1, Wenqing Liang1, Qing Zhang1, Jianmin Jiang1, Wei Yu1, Jianqun Gao1, Wanxing Ou1, Yingmin Deng1, Qiaozhen Hou1, Bei Wang1, Cuichan Yao1, Yan Liang1, Shu Zhang1, Yaou Duan2, Runze Zhang2, Sarah Gibson2, Charlotte L. Zhang2, Oulan Li2, Edward D. Zhang2, Gabriel Karin2, Nathan Nguyen2, Xiaokang Wu1,2, Cindy Wen2, Jie Xu2, Wenqin Xu2, Bochu Wang2, Winston Wang2, Jing Li1,2, Bianca Pizzato2, Caroline Bao2, Daoman Xiang1, Wanting He1,2, Suiqin He2, Yugui Zhou1,2, Weldon Haw2,7, Michael Goldbaum2, Adriana Tremoulet2, Chun-Nan Hsu2, Hannah Carter2, Long Zhu3, Kang Zhang1,2,7* and Huimin Xia1*
NATURE MEDICINE | www.nature.com/naturemedicine
Pediatrics
ARTICLES
https://doi.org/10.1038/s41591-018-0177-5
1 Applied Bioinformatics Laboratories, New York University School of Medicine, New York, NY, USA.
2 Skirball Institute, Department of Cell Biology, New York University School of Medicine, New York, NY, USA.
3 Department of Pathology, New York University School of Medicine, New York, NY, USA.
4 School of Mechanical Engineering, National Technical University of Athens, Zografou, Greece.
5 Institute for Systems Genetics, New York University School of Medicine, New York, NY, USA.
6 Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY, USA.
7 Center for Biospecimen Research and Development, New York University, New York, NY, USA.
8 Department of Population Health and the Center for Healthcare Innovation and Delivery Science, New York University School of Medicine, New York, NY, USA.
9 These authors contributed equally to this work: Nicolas Coudray, Paolo Santiago Ocampo.
*e-mail: narges.razavian@nyumc.org; aristotelis.tsirigos@nyumc.org
According to the American Cancer Society and the Cancer Statistics Center (see URLs), over 150,000 patients with lung cancer succumb to the disease each year (154,050 expected for 2018), while another 200,000 new cases are diagnosed on a yearly basis (234,030 expected for 2018). It is one of the most widely spread cancers in the world because of not only smoking, but also exposure to toxic chemicals like radon, asbestos and arsenic. LUAD and LUSC are the two most prevalent types of non–small cell lung cancer1, and each is associated with discrete treatment guidelines. In the absence of definitive histologic features, this important distinction can be challenging and time-consuming, and requires confirmatory immunohistochemical stains.
Classification of lung cancer type is a key diagnostic process because the available treatment options, including conventional chemotherapy and, more recently, targeted therapies, differ for LUAD and LUSC2. Also, a LUAD diagnosis will prompt the search for molecular biomarkers and sensitizing mutations and thus has a great impact on treatment options3,4. For example, epidermal growth factor receptor (EGFR) mutations, present in about 20% of LUAD, and anaplastic lymphoma receptor tyrosine kinase (ALK) rearrangements, present in <5% of LUAD5, currently have targeted therapies approved by the Food and Drug Administration (FDA)6,7. Mutations in other genes, such as KRAS and tumor protein P53 (TP53) are very common (about 25% and 50%, respectively) but have proven to be particularly challenging drug targets so far5,8. Lung biopsies are typically used to diagnose lung cancer type and stage. Virtual microscopy of stained images of tissues is typically acquired at magnifications of 20× to 40×, generating very large two-dimensional images (10,000 to >100,000 pixels in each dimension) that are oftentimes challenging to visually inspect in an exhaustive manner. Furthermore, accurate interpretation can be difficult, and the distinction between LUAD and LUSC is not always clear, particularly in poorly differentiated tumors; in this case, ancillary studies are recommended for accurate classification9,10. To assist experts, automatic analysis of lung cancer whole-slide images has been recently studied to predict survival outcomes11 and classification12. For the latter, Yu et al.12 combined conventional thresholding and image processing techniques with machine-learning methods, such as random forest classifiers, support vector machines (SVM) or Naive Bayes classifiers, achieving an AUC of ~0.85 in distinguishing normal from tumor slides, and ~0.75 in distinguishing LUAD from LUSC slides. More recently, deep learning was used for the classification of breast, bladder and lung tumors, achieving an AUC of 0.83 in classification of lung tumor types on tumor slides from The Cancer Genome Atlas (TCGA)13. Analysis of plasma DNA values was also shown to be a good predictor of the presence of non–small cell cancer, with an AUC of ~0.94 (ref. 14) in distinguishing LUAD from LUSC, whereas the use of immunochemical markers yields an AUC of ~0.941 (ref. 15).
Here, we demonstrate how the field can further benefit from deep
learning by presenting a strategy based on convolutional neural
networks (CNNs) that not only outperforms methods in previously
Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning
Nicolas Coudray1,2,9, Paolo Santiago Ocampo3,9, Theodore Sakellaropoulos4, Navneet Narula3, Matija Snuderl3, David Fenyö5,6, Andre L. Moreira3,7, Narges Razavian8* and Aristotelis Tsirigos1,3*
Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and subtype of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep convolutional neural network (inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most commonly mutated genes in LUAD. We found that six of them (STK11, EGFR, FAT1, SETBP1, KRAS and TP53) can be predicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH.
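Since the abstract summarizes performance as an area under the curve (AUC), a short sketch of what that number measures may help. The function below is an illustration only, not code from the DeepPATH repository: it computes the binary AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (the Mann-Whitney formulation), with ties counted as half. The labels and scores are made up; a three-class result such as LUAD, LUSC and normal would presumably be averaged over one-vs-rest comparisons, which is an assumption here.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUC via pairwise comparison of positive and negative scores.

    labels: 1 for positive cases, 0 for negative cases.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical slide-level scores (illustrative values only).
LABELS = [1, 1, 1, 0, 0, 0]
SCORES = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
EXAMPLE_AUC = auc(LABELS, SCORES)
```

In this toy example eight of the nine positive/negative pairs are ranked correctly. The quadratic loop is fine for small samples; a sort-based rank computation would be the usual choice for large datasets.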
NATURE MEDICINE | www.nature.com/naturemedicine
Pathology
ARTICLES
https://doi.org/10.1038/s41551-018-0301-3
1 Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, China.
2 Shanghai Wision AI Co., Ltd, Shanghai, China.
3 Beth Israel Deaconess Medical Center and Harvard Medical School, Center for Advanced Endoscopy, Boston, MA, USA.
*e-mail: gary.samsph@gmail.com
Colonoscopy is the gold-standard screening test for colorectal cancer1–3, one of the leading causes of cancer death in both the United States4,5 and China6. Colonoscopy can reduce the risk of death from colorectal cancer through the detection of tumours at an earlier, more treatable stage as well as through the removal of precancerous adenomas3,7. Conversely, failure to detect adenomas may lead to the development of interval cancer. Evidence has shown that each 1.0% increase in adenoma detection rate (ADR) leads to a 3.0% decrease in the risk of interval colorectal cancer8.
Although more than 14 million colonoscopies are performed in the United States annually2, the adenoma miss rate (AMR) is estimated to be 6–27% (ref. 9). Certain polyps may be missed more frequently, including smaller polyps10,11, flat polyps12 and polyps in the left colon13. There are two independent reasons why a polyp may be missed during colonoscopy: (i) it was never in the visual field or (ii) it was in the visual field but not recognized. Several hardware innovations have sought to address the first problem by improving visualization of the colonic lumen, for instance by providing a larger, panoramic camera view, or by flattening colonic folds using a distal-cap attachment. The problem of unrecognized polyps within the visual field has been more difficult to address14. Several studies have shown that observation of the video monitor by either nurses or gastroenterology trainees may increase polyp detection by up to 30% (refs. 15–17). Ideally, a real-time automatic polyp-detection system could serve as a similarly effective second observer that could draw the endoscopist’s eye, in real time, to concerning lesions, effectively creating an ‘extra set of eyes’ on all aspects of the video data with fidelity. Although automatic polyp detection in colonoscopy videos has been an active research topic for the past 20 years, performance levels close to that of the expert endoscopist18–20 have not been achieved. Early work in automatic polyp detection has focused on applying deep-learning techniques to polyp detection, but most published works are small in scale, with small development and/or training validation sets19,20.
Here, we report the development and validation of a deep-learning algorithm, integrated with a multi-threaded processing system, for the automatic detection of polyps during colonoscopy. We validated the system in two image studies and two video studies. Each study contained two independent validation datasets.
Results
We developed a deep-learning algorithm using 5,545 colonoscopy images from colonoscopy reports of 1,290 patients that underwent a colonoscopy examination in the Endoscopy Center of Sichuan Provincial People’s Hospital between January 2007 and December 2015. Out of the 5,545 images used, 3,634 images contained polyps (65.54%) and 1,911 images did not contain polyps (34.46%). For algorithm training, experienced endoscopists annotated the presence of each polyp in all of the images in the development dataset. We validated the algorithm on four independent datasets. Datasets A and B were used for image analysis, and datasets C and D were used for video analysis.
Dataset A contained 27,113 colonoscopy images from colonoscopy reports of 1,138 consecutive patients who underwent a colonoscopy examination in the Endoscopy Center of Sichuan Provincial People’s Hospital between January and December 2016 and who were found to have at least one polyp. Out of the 27,113 images, 5,541 images contained polyps (20.44%) and 21,572 images did not contain polyps (79.56%). All polyps were confirmed histologically after biopsy. Dataset B is a public database (CVC-ClinicDB;
Development and validation of a deep-learning algorithm for the detection of polyps during colonoscopy
Pu Wang1, Xiao Xiao2, Jeremy R. Glissen Brown3, Tyler M. Berzin3, Mengtian Tu1, Fei Xiong1, Xiao Hu1, Peixi Liu1, Yan Song1, Di Zhang1, Xue Yang1, Liangping Li1, Jiong He2, Xin Yi2, Jingjia Liu2 and Xiaogang Liu1*
The detection and removal of precancerous polyps via colonoscopy is the gold standard for the prevention of colon cancer. However, the detection rate of adenomatous polyps can vary significantly among endoscopists. Here, we show that a machine-learning algorithm can detect polyps in clinical colonoscopies, in real time and with high sensitivity and specificity. We developed the deep-learning algorithm by using data from 1,290 patients, and validated it on newly collected 27,113 colonoscopy images from 1,138 patients with at least one detected polyp (per-image-sensitivity, 94.38%; per-image-specificity, 95.92%; area under the receiver operating characteristic curve, 0.984), on a public database of 612 polyp-containing images (per-image-sensitivity, 88.24%), on 138 colonoscopy videos with histologically confirmed polyps (per-image-sensitivity of 91.64%; per-polyp-sensitivity, 100%), and on 54 unaltered full-range colonoscopy videos without polyps (per-image-specificity, 95.40%). By using a multi-threaded processing system, the algorithm can process at least 25 frames per second with a latency of 76.80 ± 5.60 ms in real-time video analysis. The software may aid endoscopists while performing colonoscopies, and help assess differences in polyp and adenoma detection performance among endoscopists.
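The throughput and latency figures in the abstract can be related with a small back-of-the-envelope sketch. It is an interpretation under stated assumptions, not a description of the authors' system: it assumes frames arrive at a fixed rate and that the quoted latency is the time from a frame's arrival to its result. Under those assumptions, a latency longer than the frame interval means more than one frame must be in processing at once, which is consistent with the multi-threaded design the abstract mentions.

```python
import math


def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def frames_in_flight(latency_ms: float, fps: float) -> int:
    """Smallest whole number of frames that must be processed concurrently
    so that a new frame can be accepted every frame interval
    (assumes a steady frame rate and a fixed per-frame latency)."""
    return math.ceil(latency_ms / frame_interval_ms(fps))


FPS = 25.0            # "at least 25 frames per second" from the abstract
MEAN_LATENCY = 76.80  # ms, mean latency from the abstract
LATENCY_SD = 5.60     # ms, reported spread from the abstract

INTERVAL = frame_interval_ms(FPS)
TYPICAL = frames_in_flight(MEAN_LATENCY, FPS)
SLOWER = frames_in_flight(MEAN_LATENCY + LATENCY_SD, FPS)
```

At 25 frames per second a new frame arrives every 40 ms, so a 76.80 ms latency spans roughly two frame intervals; a strictly sequential loop with that latency could not keep up with the video feed.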
NATURE BIOMEDICAL ENGINEERING | VOL 2 | OCTOBER 2018 | 741–748 | www.nature.com/natbiomedeng
Gastroenterology
Wang P, et al. Gut 2019;0:1–7. doi:10.1136/gutjnl-2018-317500
Endoscopy
ORIGINAL ARTICLE
Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study
Pu Wang,1 Tyler M Berzin,2 Jeremy Romek Glissen Brown,2 Shishira Bharadwaj,2 Aymeric Becq,2 Xun Xiao,1 Peixi Liu,1 Liangping Li,1 Yan Song,1 Di Zhang,1 Yi Li,1 Guangre Xu,1 Mengtian Tu,1 Xiaogang Liu1
To cite: Wang P, Berzin TM, Glissen Brown JR, et al. Gut Epub ahead of print: [please include Day Month Year]. doi:10.1136/gutjnl-2018-317500
► Additional material is published online only. To view please visit the journal online (http://dx.doi.org/10.1136/gutjnl-2018-317500).
1 Department of Gastroenterology, Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, China
2 Center for Advanced Endoscopy, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
Correspondence to Xiaogang Liu, Department of Gastroenterology, Sichuan Academy of Medical Sciences and Sichuan Provincial People’s Hospital, Chengdu, China; Gary.samsph@gmail.com
Received 30 August 2018
Revised 4 February 2019
Accepted 13 February 2019
© Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
ABSTRACT
Objective The effect of colonoscopy on colorectal cancer mortality is limited by several factors, among them a certain miss rate, leading to limited adenoma detection rates (ADRs). We investigated the effect of an automatic polyp detection system based on deep learning on polyp detection rate and ADR.
Design In an open, non-blinded trial, consecutive patients were prospectively randomised to undergo diagnostic colonoscopy with or without assistance of a real-time automatic polyp detection system providing a simultaneous visual notice and sound alarm on polyp detection. The primary outcome was ADR.
Results Of 1058 patients included, 536 were randomised to standard colonoscopy, and 522 were randomised to colonoscopy with computer-aided diagnosis. The artificial intelligence (AI) system significantly increased ADR (29.1% vs 20.3%, p<0.001) and the mean number of adenomas per patient (0.53 vs 0.31, p<0.001). This was due to a higher number of diminutive adenomas found (185 vs 102; p<0.001), while there was no statistical difference in larger adenomas (77 vs 58, p=0.075). In addition, the number of hyperplastic polyps was also significantly increased (114 vs 52, p<0.001).
Conclusions In a low prevalent ADR population, an automatic polyp detection system during colonoscopy resulted in a significant increase in the number of diminutive adenomas detected, as well as an increase in the rate of hyperplastic polyps. The cost–benefit ratio of such effects has to be determined further.
Trial registration number ChiCTR-DDD-17012221; Results.
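As a rough plausibility check on the ADR comparison in the Results, the sketch below runs a two-proportion z-test. Several things here are assumptions rather than facts from the paper: the patient counts with at least one adenoma (152 of 522 and 109 of 536) are back-calculated by rounding the reported percentages, and the z-test itself is a stand-in, since the abstract does not say which statistical test the authors used.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test.

    Returns (z statistic, two-sided p-value under the normal approximation).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Group sizes are from the abstract; adenoma-positive counts are ASSUMED,
# obtained by rounding 29.1% of 522 and 20.3% of 536.
N_AI, N_STANDARD = 522, 536
X_AI = round(0.291 * N_AI)              # assumed count, AI-assisted arm
X_STANDARD = round(0.203 * N_STANDARD)  # assumed count, standard arm

Z, P_VALUE = two_proportion_z_test(X_AI, N_AI, X_STANDARD, N_STANDARD)

# Relative increase in ADR implied by the same assumed counts.
RELATIVE_INCREASE = (X_AI / N_AI) / (X_STANDARD / N_STANDARD) - 1.0
```

Under these assumptions the test gives a p-value on the order of 0.001, in line with the reported significance, and the relative increase in ADR comes out a little above 40%.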
INTRODUCTION
Colorectal cancer (CRC) is the second and third-leading causes of cancer-related deaths in men and women respectively.1 Colonoscopy is the gold standard for screening CRC.2 3 Screening colonoscopy has allowed for a reduction in the incidence and mortality of CRC via the detection and removal of adenomatous polyps.4–8 Additionally, there is evidence that with each 1.0% increase in adenoma detection rate (ADR), there is an associated 3.0% decrease in the risk of interval CRC.9 10 However, polyps can be missed, with reported miss rates of up to 27% due to both polyp and operator characteristics.11 12
Unrecognised polyps within the visual field is an important problem to address.11 Several studies have shown that assistance by a second observer increases the polyp detection rate (PDR), but such a strategy remains controversial in terms of increasing the ADR.13–15
Ideally, a real-time automatic polyp detection system, with performance close to that of expert endoscopists, could assist the endoscopist in detecting lesions that might correspond to adenomas in a more consistent and reliable way
Significance of this study
What is already known on this subject?
► Colorectal adenoma detection rate (ADR)
is regarded as a main quality indicator of
(screening) colonoscopy and has been shown
to correlate with interval cancers. Reducing
adenoma miss rates by increasing ADR has
been a goal of many studies focused on
imaging techniques and mechanical methods.
► Artificial intelligence has been recently
introduced for polyp and adenoma detection
as well as differentiation and has shown
promising results in preliminary studies.
What are the new findings?
► This represents the first prospective randomised
controlled trial examining an automatic polyp
detection during colonoscopy and shows an
increase of ADR by 50%, from 20% to 30%.
► This effect was mainly due to a higher rate of
small adenomas found.
► The detection rate of hyperplastic polyps was
also significantly increased.
How might it impact on clinical practice in the
foreseeable future?
► Automatic polyp and adenoma detection could
be the future of diagnostic colonoscopy in order
to achieve stable high adenoma detection rates.
► However, the effect on ultimate outcome is
still unclear, and further improvements such as
polyp differentiation have to be implemented.
Gut: first published as 10.1136/gutjnl-2018-317500 on 27 February 2019.
Gastroenterology
Impact of Deep Learning Assistance on the
Histopathologic Review of Lymph Nodes for Metastatic
Breast Cancer
David F. Steiner, MD, PhD,* Robert MacDonald, PhD,* Yun Liu, PhD,* Peter Truszkowski, MD,*
Jason D. Hipp, MD, PhD, FCAP,* Christopher Gammage, MS,* Florence Thng, MS,†
Lily Peng, MD, PhD,* and Martin C. Stumpe, PhD*
Abstract: Advances in the quality of whole-slide images have set the stage for the clinical use of digital images in anatomic pathology. Along with advances in computer image analysis, this raises the possibility for computer-assisted diagnostics in pathology to improve histopathologic interpretation and clinical care. To evaluate the potential impact of digital assistance on interpretation of digitized slides, we conducted a multireader multicase study utilizing our deep learning algorithm for the detection of breast cancer metastasis in lymph nodes. Six pathologists reviewed 70 digitized slides from lymph node sections in 2 reader modes, unassisted and assisted, with a wash-out period between sessions. In the assisted mode, the deep learning algorithm was used to identify and outline regions with high likelihood of containing tumor. Algorithm-assisted pathologists demonstrated higher accuracy than either the algorithm or the pathologist alone. In particular, algorithm assistance significantly increased the sensitivity of detection for micrometastases (91% vs. 83%, P=0.02). In addition, average review time per image was significantly shorter with assistance than without assistance for both micrometastases (61 vs. 116 s, P=0.002) and negative images (111 vs. 137 s, P=0.018). Lastly, pathologists were asked to provide a numeric score regarding the difficulty of each image classification. On the basis of this score, pathologists considered the image review of micrometastases to be significantly easier when interpreted with assistance (P=0.0005). Utilizing a proof of concept assistant tool, this study demonstrates the potential of a deep learning algorithm to improve pathologist accuracy and efficiency in a digital pathology workflow.
Key Words: artificial intelligence, machine learning, digital pathology,
breast cancer, computer aided detection
(Am J Surg Pathol 2018;00:000–000)
The regulatory approval and gradual implementation of whole-slide scanners has enabled the digitization of glass slides for remote consults and archival purposes.1 Digitization alone, however, does not necessarily improve the consistency or efficiency of a pathologist’s primary workflow. In fact, image review on a digital medium can be slightly slower than on glass, especially for pathologists with limited digital pathology experience.2 However, digital pathology and image analysis tools have already demonstrated potential benefits, including the potential to reduce inter-reader variability in the evaluation of breast cancer HER2 status.3,4 Digitization also opens the door for assistive tools based on Artificial Intelligence (AI) to improve efficiency and consistency, decrease fatigue, and increase accuracy.5
Among AI technologies, deep learning has demonstrated strong performance in many automated image-recognition applications.6–8 Recently, several deep learning–based algorithms have been developed for the detection of breast cancer metastases in lymph nodes as well as for other applications in pathology.9,10 Initial findings suggest that some algorithms can even exceed a pathologist’s sensitivity for detecting individual cancer foci in digital images. However, this sensitivity gain comes at the cost of increased false positives, potentially limiting the utility of such algorithms for automated clinical use.11 In addition, deep learning algorithms are inherently limited to the task for which they have been specifically trained. While we have begun to understand the strengths of these algorithms (such as exhaustive search) and their weaknesses (sensitivity to poor optical focus, tumor mimics; manuscript under review), the potential clinical utility of such algorithms has not been thoroughly examined. While an accurate algorithm alone will not necessarily aid pathologists or improve clinical interpretation, these benefits may be achieved through thoughtful and appropriate integration of algorithm predictions into the clinical workflow.8
From the *Google AI Healthcare; and †Verily Life Sciences, Mountain
View, CA.
D.F.S., R.M., and Y.L. are co-first authors (equal contribution).
Work done as part of the Google Brain Healthcare Technology Fellowship
(D.F.S. and P.T.).
Conflicts of Interest and Source of Funding: D.F.S., R.M., Y.L., P.T.,
J.D.H., C.G., F.T., L.P., M.C.S. are employees of Alphabet and have
Alphabet stock.
Correspondence: David F. Steiner, MD, PhD, Google AI Healthcare,
1600 Amphitheatre Way, Mountain View, CA 94043
(e-mail: davesteiner@google.com).
Supplemental Digital Content is available for this article. Direct URL citations
appear in the printed text and are provided in the HTML and PDF
versions of this article on the journal’s website, www.ajsp.com.
Copyright © 2018 The Author(s). Published by Wolters Kluwer Health,
Inc. This is an open-access article distributed under the terms of the
Creative Commons Attribution-Non Commercial-No Derivatives
License 4.0 (CCBY-NC-ND), where it is permissible to download and
share the work provided it is properly cited. The work cannot be
changed in any way or used commercially without permission from
the journal.
ORIGINAL ARTICLE
Am J Surg Pathol, Volume 00, Number 00, 2018 | www.ajsp.com
Pathology
SEPSIS
A targeted real-time early warning score (TREWScore) for septic shock
Katharine E. Henry1, David N. Hager2, Peter J. Pronovost3,4,5, Suchi Saria1,3,5,6*
Sepsis is a leading cause of death in the United States, with mortality highest among patients who develop septic shock. Early aggressive treatment decreases morbidity and mortality. Although automated screening tools can detect patients currently experiencing severe sepsis and septic shock, none predict those at greatest risk of developing shock. We analyzed routinely available physiological and laboratory data from intensive care unit patients and developed “TREWScore,” a targeted real-time early warning score that predicts which patients will develop septic shock. TREWScore identified patients before the onset of septic shock with an area under the ROC (receiver operating characteristic) curve (AUC) of 0.83 [95% confidence interval (CI), 0.81 to 0.85]. At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 and identified patients a median of 28.2 [interquartile range (IQR), 10.6 to 94.2] hours before onset. Of those identified, two-thirds were identified before any sepsis-related organ dysfunction. In comparison, the Modified Early Warning Score, which has been used clinically for septic shock prediction, achieved a lower AUC of 0.73 (95% CI, 0.71 to 0.76). A routine screening protocol based on the presence of two of the systemic inflammatory response syndrome criteria, suspicion of infection, and either hypotension or hyperlactatemia achieved a lower sensitivity of 0.74 at a comparable specificity of 0.64. Continuous sampling of data from the electronic health records and calculation of TREWScore may allow clinicians to identify patients at risk for septic shock and provide earlier interventions that would prevent or mitigate the associated morbidity and mortality.
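The abstract reports sensitivity at a chosen specificity, which corresponds to picking one operating point on the ROC curve. The sketch below shows one simple way such an operating point could be selected from risk scores. It is not the TREWScore implementation: the function name, the rule "alert when score >= threshold", and the example scores and labels are all hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_at_specificity(labels: Sequence[int],
                               scores: Sequence[float],
                               min_specificity: float) -> Tuple[float, float, float]:
    """Find the lowest alert threshold whose specificity meets the target.

    A patient is flagged when score >= threshold. Raising the threshold can
    only raise specificity and lower sensitivity, so the first threshold that
    meets the target also gives the highest achievable sensitivity.

    Returns (threshold, sensitivity, specificity).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    for threshold in sorted(set(scores)):
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        if specificity >= min_specificity:
            sensitivity = sum(s >= threshold for s in positives) / len(positives)
            return threshold, sensitivity, specificity

    # No observed score works as a threshold: flag nobody.
    return float("inf"), 0.0, 1.0


# Hypothetical risk scores: 4 patients who later developed shock, 6 who did not.
LABELS = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
SCORES = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.1, 0.05]

THRESHOLD, SENSITIVITY, SPECIFICITY = sensitivity_at_specificity(LABELS, SCORES, 0.67)
```

Fixing the specificity first, as done here, makes two scores comparable at a similar false-alarm burden, which is how the abstract contrasts TREWScore with the routine screening protocol.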
INTRODUCTION
Seven hundred fifty thousand patients develop severe sepsis and septic shock in the United States each year. More than half of them are admitted to an intensive care unit (ICU), accounting for 10% of all ICU admissions, 20 to 30% of hospital deaths, and $15.4 billion in annual health care costs (1–3). Several studies have demonstrated that morbidity, mortality, and length of stay are decreased when severe sepsis and septic shock are identified and treated early (4–8). In particular, one study showed that mortality from septic shock increased by 7.6% with every hour that treatment was delayed after the onset of hypotension (9).
More recent studies comparing protocolized care, usual care, and early goal-directed therapy (EGDT) for patients with septic shock suggest that usual care is as effective as EGDT (10–12). Some have interpreted this to mean that usual care has improved over time and reflects important aspects of EGDT, such as early antibiotics and early aggressive fluid resuscitation (13). It is likely that continued early identification and treatment will further improve outcomes. However, the best approach to managing patients at high risk of developing septic shock before the onset of severe sepsis or shock has not been studied. Methods that can identify ahead of time which patients will later experience septic shock are needed to further understand, study, and improve outcomes in this population.
General-purpose illness severity scoring systems such as the Acute Physiology and Chronic Health Evaluation (APACHE II), Simplified Acute Physiology Score (SAPS II), Sequential Organ Failure Assessment (SOFA) scores, Modified Early Warning Score (MEWS), and Simple Clinical Score (SCS) have been validated to assess illness severity and risk of death among septic patients (14–17). Although these scores are useful for predicting general deterioration or mortality, they typically cannot distinguish with high sensitivity and specificity which patients are at highest risk of developing a specific acute condition.
The increased use of electronic health records (EHRs), which can be queried in real time, has generated interest in automating tools that identify patients at risk for septic shock (18–20). A number of “early warning systems,” “track and trigger” initiatives, “listening applications,” and “sniffers” have been implemented to improve detection and timeliness of therapy for patients with severe sepsis and septic shock (18, 20–23). Although these tools have been successful at detecting patients currently experiencing severe sepsis or septic shock, none predict which patients are at highest risk of developing septic shock.
The adoption of the Affordable Care Act has added to the growing excitement around predictive models derived from electronic health data in a variety of applications (24), including discharge planning (25), risk stratification (26, 27), and identification of acute adverse events (28, 29). For septic shock in particular, promising work includes that of predicting septic shock using high-fidelity physiological signals collected directly from bedside monitors (30, 31), inferring relationships between predictors of septic shock using Bayesian networks (32), and using routine measurements for septic shock prediction (33–35). No current prediction models that use only data routinely stored in the EHR predict septic shock with high sensitivity and specificity many hours before onset. Moreover, when learning predictive risk scores, current methods (34, 36, 37) often have not accounted for the censoring effects of clinical interventions on patient outcomes (38). For instance, a patient with severe sepsis who received fluids and never developed septic shock would be treated as a negative case, despite the possibility that he or she might have developed septic shock in the absence of such treatment and therefore could be considered a positive case up until the
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
2 Division of Pulmonary and Critical Care Medicine, Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.
3 Armstrong Institute for Patient Safety and Quality, Johns Hopkins University, Baltimore, MD 21202, USA.
4 Department of Anesthesiology and Critical Care Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD 21202, USA.
5 Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA.
6 Department of Applied Math and Statistics, Johns Hopkins University, Baltimore, MD 21218, USA.
*Corresponding author. E-mail: ssaria@cs.jhu.edu
RESEARCH ARTICLE
www.ScienceTranslationalMedicine.org 5 August 2015 Vol 7 Issue 299 299ra122 1
An Algorithm Based on Deep Learning for Predicting In-Hospital
Cardiac Arrest
Joon-myoung Kwon, MD;* Youngnam Lee, MS;* Yeha Lee, PhD; Seungwoo Lee, BS; Jinsik Park, MD, PhD
Background: In-hospital cardiac arrest is a major burden to public health, which affects patient safety. Although traditional track-and-trigger
systems are used to predict cardiac arrest early, they have limitations, with low sensitivity and high false-alarm rates.
We propose a deep learning–based early warning system that shows higher performance than the existing track-and-trigger
systems.
Methods and Results: This retrospective cohort study reviewed patients who were admitted to 2 hospitals from June 2010 to July
2017. A total of 52 131 patients were included. Specifically, a recurrent neural network was trained using data from June 2010 to
January 2017. The result was tested using the data from February to July 2017. The primary outcome was cardiac arrest, and the
secondary outcome was death without attempted resuscitation. As comparative measures, we used the area under the receiver
operating characteristic curve (AUROC), the area under the precision–recall curve (AUPRC), and the net reclassification index.
Furthermore, we evaluated sensitivity while varying the number of alarms. The deep learning–based early warning system (AUROC:
0.850; AUPRC: 0.044) significantly outperformed a modified early warning score (AUROC: 0.603; AUPRC: 0.003), a random forest
algorithm (AUROC: 0.780; AUPRC: 0.014), and logistic regression (AUROC: 0.613; AUPRC: 0.007). Furthermore, the deep learning–
based early warning system reduced the number of alarms by 82.2%, 13.5%, and 42.1% compared with the modified early warning
system, random forest, and logistic regression, respectively, at the same sensitivity.
Conclusions: An algorithm based on deep learning had high sensitivity and a low false-alarm rate for detection of patients with
cardiac arrest in the multicenter study. (J Am Heart Assoc. 2018;7:e008678. DOI: 10.1161/JAHA.118.008678.)
Key Words: artificial intelligence • cardiac arrest • deep learning • machine learning • rapid response system • resuscitation
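The abstract reports both AUROC and AUPRC, and the two can diverge sharply when the outcome is rare. The sketch below is an illustration only: it uses a made-up ten-patient example (not data from the study), computes AUROC by pairwise comparison, and uses average precision as a common estimate of AUPRC. The function names `auroc` and `average_precision` are hypothetical helpers written for this example.

```python
from typing import List


def auroc(labels: List[int], scores: List[float]) -> float:
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Mean of the precision values measured at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    true_positives = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precisions.append(true_positives / rank)
    return sum(precisions) / len(precisions)


# Made-up example: 2 events among 10 patients (illustrative values only).
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.7, 0.1, 0.05, 0.4, 0.15]

print("AUROC:", auroc(labels, scores))
print("Average precision:", average_precision(labels, scores))
print("Prevalence (AUPRC of a random ranking is close to this):", sum(labels) / len(labels))
```

Because a random ranking scores about 0.5 on AUROC but only about the event prevalence on AUPRC, a small absolute AUPRC can still represent a large gain over chance when events are rare.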
In-hospital cardiac arrest is a major burden to public health, which affects patient safety [1–3]. More than a half of cardiac arrests result from respiratory failure or hypovolemic shock, and 80% of patients with cardiac arrest show signs of deterioration in the 8 hours before cardiac arrest [4–9]. However, 209 000 in-hospital cardiac arrests occur in the United States each year, and the survival discharge rate for patients with cardiac arrest is 20% worldwide [10,11]. Rapid response systems (RRSs) have been introduced in many hospitals to detect cardiac arrest using the track-and-trigger system (TTS) [12,13].
Two types of TTS are used in RRSs. For the single-parameter TTS (SPTTS), cardiac arrest is predicted if any single vital sign (eg, heart rate [HR], blood pressure) is out of the normal range [14]. The aggregated weighted TTS calculates a weighted score for each vital sign and then finds patients with cardiac arrest based on the sum of these scores [15]. The modified early warning score (MEWS) is one of the most widely used approaches among all aggregated weighted TTSs (Table 1) [16]; however, traditional TTSs including MEWS have limitations, with low sensitivity or high false-alarm rates [14,15,17]. Sensitivity and false-alarm rate interact: Increased sensitivity creates higher false-alarm rates and vice versa.
Current RRSs suffer from low sensitivity or a high false-alarm rate. An RRS was used for only 30% of patients before unplanned intensive care unit admission and was not used for 22.8% of patients, even if they met the criteria [18,19].
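The two track-and-trigger styles described above can be sketched roughly as follows. This is a minimal illustration, not the MEWS table from the paper: the normal ranges, score bands, and alarm threshold are hypothetical placeholders and must not be read as clinical values.

```python
from typing import Dict, List, Tuple

# Hypothetical normal ranges (illustration only, not clinical thresholds).
NORMAL_RANGES: Dict[str, Tuple[float, float]] = {
    "heart_rate": (50.0, 110.0),
    "systolic_bp": (100.0, 180.0),
    "resp_rate": (9.0, 20.0),
}

# Hypothetical score bands as (low, high, points); the first matching band wins.
SCORE_BANDS: Dict[str, List[Tuple[float, float, int]]] = {
    "heart_rate": [(50.0, 110.0, 0), (40.0, 130.0, 1), (0.0, 1000.0, 2)],
    "systolic_bp": [(100.0, 180.0, 0), (80.0, 200.0, 1), (0.0, 1000.0, 2)],
    "resp_rate": [(9.0, 20.0, 0), (7.0, 25.0, 1), (0.0, 1000.0, 2)],
}


def single_parameter_trigger(vitals: Dict[str, float]) -> bool:
    """Single-parameter TTS: alarm if any one vital sign leaves its normal range."""
    return any(
        not (low <= vitals[name] <= high)
        for name, (low, high) in NORMAL_RANGES.items()
    )


def aggregated_weighted_score(vitals: Dict[str, float]) -> int:
    """Aggregated weighted TTS: score each vital sign, then sum the scores."""
    total = 0
    for name, bands in SCORE_BANDS.items():
        for low, high, points in bands:
            if low <= vitals[name] <= high:
                total += points
                break
    return total


def aggregated_trigger(vitals: Dict[str, float], threshold: int = 3) -> bool:
    """Alarm when the summed score reaches a (hypothetical) threshold."""
    return aggregated_weighted_score(vitals) >= threshold


patient = {"heart_rate": 115.0, "systolic_bp": 95.0, "resp_rate": 22.0}
print(single_parameter_trigger(patient), aggregated_weighted_score(patient))
```

Lowering `threshold` makes the aggregated system more sensitive but raises more alarms, which is the sensitivity versus false-alarm trade-off the text describes.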
From the Departments of Emergency Medicine (J.-m.K.) and Cardiology (J.P.), Mediplex Sejong Hospital, Incheon, Korea; VUNO, Seoul, Korea (Youngnam L., Yeha L.,
S.L.).
*Dr Kwon and Mr Youngnam Lee contributed equally to this study.
Correspondence to: Joon-myoung Kwon, MD, Department of Emergency medicine, Mediplex Sejong Hospital, 20, Gyeyangmunhwa-ro, Gyeyang-gu, Incheon 21080,
Korea. E-mail: kwonjm@sejongh.co.kr
Received January 18, 2018; accepted May 31, 2018.
© 2018 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley. This is an open access article under the terms of the Creative Commons
Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for
commercial purposes.
DOI: 10.1161/JAHA.118.008678 Journal of the American Heart Association 1
ORIGINAL RESEARCH
Infectious Diseases / Cardiology
BRIEF COMMUNICATION OPEN
Digital biomarkers of cognitive function
Paul Dagum1
To identify digital biomarkers associated with cognitive function, we analyzed human–computer interaction from 7 days of
smartphone use in 27 subjects (ages 18–34) who received a gold standard neuropsychological assessment. For several
neuropsychological constructs (working memory, memory, executive function, language, and intelligence), we found a family of
digital biomarkers that predicted test scores with high correlations (p < 10^-4). These preliminary results suggest that passive
measures from smartphone use could be a continuous ecological surrogate for laboratory-based neuropsychological assessment.
npj Digital Medicine (2018)1:10 ; doi:10.1038/s41746-018-0018-4
INTRODUCTION
By comparison to the functional metrics available in other
disciplines, conventional measures of neuropsychiatric disorders
have several challenges. First, they are obtrusive, requiring a
subject to break from their normal routine, dedicating time and
often travel. Second, they are not ecological and require subjects
to perform a task outside of the context of everyday behavior.
Third, they are episodic and provide sparse snapshots of a patient
only at the time of the assessment. Lastly, they are poorly scalable,
taxing limited resources including space and trained staff.
In seeking objective and ecological measures of cognition, we
attempted to develop a method to measure memory and
executive function not in the laboratory but in the moment,
day-to-day. We used human–computer interaction on smart-
phones to identify digital biomarkers that were correlated with
neuropsychological performance.
RESULTS
In 2014, 27 participants (ages 27.1 ± 4.4 years, education
14.1 ± 2.3 years, M:F 8:19) volunteered for neuropsychological
assessment and a test of the smartphone app. Smartphone
human–computer interaction data from the 7 days following
the neuropsychological assessment showed a range of correla-
tions with the cognitive scores. Table 1 shows the correlation
between each neurocognitive test and the cross-validated
predictions of the supervised kernel PCA constructed from
the biomarkers for that test. Figure 1 shows each participant
test score and the digital biomarker prediction for (a) digits
backward, (b) symbol digit modality, (c) animal fluency,
(d) Wechsler Memory Scale-3rd Edition (WMS-III) logical
memory (delayed free recall), (e) brief visuospatial memory test
(delayed free recall), and (f) Wechsler Adult Intelligence Scale-
4th Edition (WAIS-IV) block design. Construct validity of the
predictions was determined using pattern matching that
computed a correlation of 0.87 with p < 10^-59 between the
covariance matrix of the predictions and the covariance matrix
of the tests.
Table 1. Fourteen neurocognitive assessments covering five cognitive domains and dexterity were performed by a neuropsychologist. Shown are the group mean and standard deviation, range of score, and the correlation between each test and the cross-validated prediction constructed from the digital biomarkers for that test
Cognitive predictions
Test | Mean (SD) | Range | R (predicted) | p-value
Working memory
Digits forward | 10.9 (2.7) | 7–15 | 0.71 ± 0.10 | 10^-4
Digits backward | 8.3 (2.7) | 4–14 | 0.75 ± 0.08 | 10^-5
Executive function
Trail A | 23.0 (7.6) | 12–39 | 0.70 ± 0.10 | 10^-4
Trail B | 53.3 (13.1) | 37–88 | 0.82 ± 0.06 | 10^-6
Symbol digit modality | 55.8 (7.7) | 43–67 | 0.70 ± 0.10 | 10^-4
Language
Animal fluency | 22.5 (3.8) | 15–30 | 0.67 ± 0.11 | 10^-4
FAS phonemic fluency | 42 (7.1) | 27–52 | 0.63 ± 0.12 | 10^-3
Dexterity
Grooved pegboard test (dominant hand) | 62.7 (6.7) | 51–75 | 0.73 ± 0.09 | 10^-4
Memory
California verbal learning test (delayed free recall) | 14.1 (1.9) | 9–16 | 0.62 ± 0.12 | 10^-3
WMS-III logical memory (delayed free recall) | 29.4 (6.2) | 18–42 | 0.81 ± 0.07 | 10^-6
Brief visuospatial memory test (delayed free recall) | 10.2 (1.8) | 5–12 | 0.77 ± 0.08 | 10^-5
Intelligence scale
WAIS-IV block design | 46.1 (12.8) | 12–61 | 0.83 ± 0.06 | 10^-6
WAIS-IV matrix reasoning | 22.1 (3.3) | 12–26 | 0.80 ± 0.07 | 10^-6
WAIS-IV vocabulary | 40.6 (4.0) | 31–50 | 0.67 ± 0.11 | 10^-4
Received: 5 October 2017 Revised: 3 February 2018 Accepted: 7 February 2018
1
Mindstrong Health, 248 Homer Street, Palo Alto, CA 94301, USA
Correspondence: Paul Dagum (paul@mindstronghealth.com)
www.nature.com/npjdigitalmed
Psychiatry
PRECISION MEDICINE
Identification of type 2 diabetes subgroups through topological analysis of patient similarity
Li Li,1 Wei-Yi Cheng,1 Benjamin S. Glicksberg,1 Omri Gottesman,2 Ronald Tamler,3 Rong Chen,1 Erwin P. Bottinger,2 Joel T. Dudley1,4*
Type 2 diabetes (T2D) is a heterogeneous complex disease affecting more than 29 million Americans alone with a
rising prevalence trending toward steady increases in the coming decades. Thus, there is a pressing clinical need to
improve early prevention and clinical management of T2D and its complications. Clinicians have understood that
patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related compli-
cations. We used a precision medicine approach to characterize the complexity of T2D patient populations based
on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We successfully
identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was character-
ized by T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer ma-
lignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases,
neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent
T2D subtypes to identify subtype-specific genetic markers and identified 1279, 1227, and 1338 single-nucleotide
polymorphisms (SNPs) that mapped to 425, 322, and 437 unique genes specific to subtypes 1, 2, and 3, respec-
tively. By assessing the human disease–SNP association for each subtype, the enriched phenotypes and
biological functions at the gene level for each subtype matched with the disease comorbidities and clinical dif-
ferences that we identified through EMRs. Our approach demonstrates the utility of applying the precision
medicine paradigm in T2D and the promise of extending the approach to the study of other complex, multi-
factorial diseases.
INTRODUCTION
Type 2 diabetes (T2D) is a complex, multifactorial disease that has
emerged as an increasingly prevalent worldwide health concern associated
with high economic and physiological burdens. A total of
29.1 million Americans (9.3% of the population) were estimated to
have some form of diabetes in 2012—up 13% from 2010—with T2D
representing up to 95% of all diagnosed cases (1, 2). Risk factors for
T2D include obesity, family history of diabetes, physical inactivity, eth-
nicity, and advanced age (1, 2). Diabetes and its complications now
rank among the leading causes of death in the United States (2). In fact,
diabetes is the leading cause of nontraumatic foot amputation, adult
blindness, and need for kidney dialysis, and multiplies risk for myo-
cardial infarction, peripheral artery disease, and cerebrovascular disease
(3–6). The total estimated direct medical cost attributable to diabetes in
the United States in 2012 was $176 billion, with an estimated $76 billion
attributable to hospital inpatient care alone. There is a great need to im-
prove understanding of T2D and its complex factors to facilitate pre-
vention, early detection, and improvements in clinical management.
A more precise characterization of T2D patient populations can en-
hance our understanding of T2D pathophysiology (7, 8). Current
clinical definitions classify diabetes into three major subtypes: type 1 dia-
betes (T1D), T2D, and maturity-onset diabetes of the young. Other sub-
types based on phenotype bridge the gap between T1D and T2D, for
example, latent autoimmune diabetes in adults (LADA) (7) and ketosis-
prone T2D. The current categories indicate that the traditional definition of
diabetes, especially T2D, might comprise additional subtypes with dis-
tinct clinical characteristics. A recent analysis of the longitudinal Whitehall
II cohort study demonstrated improved assessment of cardiovascular
risks when subgrouping T2D patients according to glucose concentration
criteria (9). Genetic association studies reveal that the genetic architec-
ture of T2D is profoundly complex (10–12). Identified T2D-associated
risk variants exhibit allelic heterogeneity and directional differentiation
among populations (13, 14). The apparent clinical and genetic com-
plexity and heterogeneity of T2D patient populations suggest that there
are opportunities to refine the current, predominantly symptom-based,
definition of T2D into additional subtypes (7).
Because etiological and pathophysiological differences exist among
T2D patients, we hypothesize that a data-driven analysis of a clinical
population could identify new T2D subtypes and factors. Here, we de-
velop a data-driven, topology-based approach to (i) map the complexity
of patient populations using clinical data from electronic medical re-
cords (EMRs) and (ii) identify new, emergent T2D patient subgroups
with subtype-specific clinical and genetic characteristics. We apply this
approach to a data set comprising matched EMRs and genotype data from
more than 11,000 individuals. Topological analysis of these data revealed
three distinct T2D subtypes that exhibited distinct patterns of clinical
characteristics and disease comorbidities. Further, we identified genetic
markers associated with each T2D subtype and performed gene- and
pathway-level analysis of subtype genetic associations. Biological and
phenotypic features enriched in the genetic analysis corroborated clinical
disparities observed among subgroups. Our findings suggest that data-driven,
topological analysis of patient co
Endocrinology
LETTER
Dermatologist-level classification of skin cancer with deep neural networks
Dermatology
FOCUS | LETTERS
Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network
Cardiology
Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization
Obstetrics and Gynecology
ORIGINAL ARTICLE
Watson for Oncology and breast cancer treatment recommendations: agreement with an expert multidisciplinary tumor board
Medical Oncology
[Title page of an accepted manuscript on out-of-hospital cardiac arrest (OHCA); the title, author list, and abstract are not recoverable from the extracted text.]
Emergency Medicine
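Three types of medical artificial intelligence
•Analyzing complex medical data and deriving insights
•Analyzing and reading medical imaging and pathology data
•Monitoring continuous data for prevention and prediction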
Jeopardy!
In 2011, IBM Watson competed against two human champions on the quiz show and won decisively
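IBM Watson Health Chronicle
[Timeline figure spanning 2011 to 2018 with two tracks, academia/healthcare and industry. The year and track of each event are not recoverable from the extracted text; the events shown are listed below. WFO = Watson for Oncology.]
•Jeopardy! win
•Memorial Sloan Kettering Cancer Center (New York) collaboration (lung cancer)
•MD Anderson collaboration (leukemia)
•MD Anderson pilot results presented at ASCO
•Mayo Clinic collaboration (clinical trial matching)
•New York Genome Center collaboration (glioblastoma analysis)
•Cleveland Clinic collaboration (cancer genome analysis)
•Watson Fund invests in Welltok
•GeneMD wins the Watson Mobile Developer Challenge
•Watson Fund invests in Modernizing Medicine
•Watson Fund invests in Pathway Genomics
•Pathway Genomics OME closed alpha service begins
•IBM Korea establishes a Watson business unit
•Watson Health launched
•Acquisition of Phytel and Explorys
•Collaboration with J&J, Apple, and Medtronic
•Partnership with Epic Systems and Mayo Clinic (EHR analysis)
•Blood glucose management app demonstrated with Medtronic
•Sleep study via Apple ResearchKit begins
•Acquisition of Merge Healthcare
•Acquisition of Truven Health
•Partnership with Under Armour
•Partnership with pharmaceutical company Teva
•Broad Institute collaboration announced (genomic analysis of anticancer drug resistance)
•Medtronic launches Sugar.IQ
•University of Tokyo adopts WFO
•Manipal Hospitals (India) adopt WFO
•Bumrungrad International Hospital (Thailand) adopts WFO
•Manipal Hospitals present WFO accuracy results
•Jupiter Medical Center adoption
•Gachon University Gil Medical Center adoption
•Pusan National University Hospital adoption
•Konyang University Hospital adoption
•Daegu Catholic University Hospital and Daegu Dongsan Hospital adoption
•Chosun University Hospital adoption
•Chonnam National University Hospital adoption
•Korea Watson Consortium launched
•Ministry of Food and Drug Safety (MFDS) draft AI guideline
•MFDS AI guideline
•Mayo Clinic clinical trial matching results announced
•First WFO paper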
Annals of Oncology (2016) 27 (suppl_9): ix179-ix180. 10.1093/annonc/mdw601
Validation study to assess performance of IBM cognitive
computing system Watson for oncology with Manipal
multidisciplinary tumour board for 1000 consecutive cases: 

An Indian experience
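•Compared the concordance between physicians' and WFO's recommendations for 1,000 cancer patients at Manipal Hospitals, India
•Breast cancer 638, colon cancer 126, rectal cancer 124, lung cancer 112
•Physician-Watson concordance
•Recommended (50%), For consideration (28%), Not recommended (17%)
•5% of the physicians' treatment plans did not appear among Watson's recommendations
•Concordance varied by cancer type
•Rectal cancer (85%), lung cancer (17.8%)
•Triple-negative breast cancer (67.9%), HER2-negative breast cancer (35%)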
WFO in ASCO 2017
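•Results of applying Watson to colon cancer and gastric cancer patients at Gachon University Gil Medical Center
• 340 colon cancer patients (stage II-IV)
• 185 advanced gastric cancer patients (retrospective)
• Concordance with physicians
• Colon cancer patients: 73%
• 250 patients who received adjuvant chemotherapy: 85%
• 90 patients with metastatic disease: 40%
• Gastric cancer patients: 49%
• Trastuzumab/FOLFOX is not reimbursed under the National Health Insurance
• S-1 (tegafur, gimeracil and oteracil) + cisplatin:
• Very routine in Korea; not used in the United States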
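The colon cancer figures on this slide (85% concordance in 250 adjuvant patients, 40% in 90 metastatic patients, 73% overall in 340 patients) can be cross-checked with a patient-weighted average. The sketch below is only a consistency check under the assumption that the overall figure pools the two subgroups; the reported percentages are rounded, so the implied counts are approximate. `pooled_concordance` is a hypothetical helper name.

```python
from typing import Dict, Tuple


def pooled_concordance(subgroups: Dict[str, Tuple[int, float]]) -> float:
    """Patient-weighted concordance across subgroups given as (n_patients, rate)."""
    total_patients = sum(n for n, _ in subgroups.values())
    concordant = sum(n * rate for n, rate in subgroups.values())
    return concordant / total_patients


# Subgroup sizes and rates as reported on the slide.
colon_cancer = {
    "adjuvant": (250, 0.85),
    "metastatic": (90, 0.40),
}

overall = pooled_concordance(colon_cancer)
print(f"Pooled concordance: {overall:.1%}")
```

The pooled value rounds to the 73% reported for all colon cancer patients, which shows how strongly the headline concordance depends on the case mix: the same system looks far less concordant in a cohort dominated by metastatic disease.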
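Principles are needed
•For which patients should Watson's opinion be sought?
•How far should Watson be trusted (for each cancer type)?
•Should Watson's opinion be disclosed to the patient?
•What should be done when Watson and the care team disagree?
•Can the use of Watson be reimbursed by insurance?
The quality of care and treatment outcomes may vary depending on these criteria, yet at present each hospital applies its own criteria.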
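Three types of medical artificial intelligence
•Analyzing complex medical data and deriving insights
•Analyzing and reading medical imaging and pathology data
•Monitoring continuous data for prevention and prediction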
Deep Learning
http://theanalyticsstore.ie/deep-learning/
Radiologist
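•An AI that reads hand X-ray images and calculates the patient's bone age
• Conventionally, physicians read the image by comparing it with standard images, for example with the Greulich-Pyle method
• The AI finds sex- and age-specific patterns in the reference standard images, displays the similarity as a probability, and retrieves the matching standard images
•Can help physicians diagnose precocious puberty or growth retardation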
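Press Release
First approval of a domestically developed artificial intelligence (AI)-based medical device
- AI technology used to read bone age -
The Ministry of Food and Drug Safety (MFDS; Minister Ryu Young-jin) announced that on […] it approved VUNO Med-BoneAge, medical image analysis software that applies AI technology, developed by the domestic medical device company VUNO Inc.
The newly approved VUNO Med-BoneAge is software in which AI analyzes an X-ray image and presents the patient's bone age, helping the physician diagnose precocious puberty or growth retardation from the information presented.
It automates the reading that physicians previously performed by hand, comparing an X-ray of the patient's left hand with reference standard images, and thereby shortens reading time.
The product was selected in […] as a target of the guideline for approval and review of medical devices that apply big data and AI technology, and received tailored support from clinical trial design through approval.
VUNO Med-BoneAge was approved for the purpose of helping clinicians judge a patient's bone age by analyzing an X-ray image of the patient's left hand.
In the analysis, the AI recognizes patterns in the acquired X-ray image, finds sex- and age-specific patterns in the bone age model reference standard images classified by sex (male […], female […]), and displays the similarity as a probability. The physician then combines the probability values, hormone levels, and other information to diagnose precocious puberty or growth retardation.
When the product's accuracy (performance) was evaluated in a clinical trial, its result differed from the bone age judged by physicians by an average of […] months. The product is designed so that the manufacturer periodically updates the image data, allowing the AI to recognize and learn on its own and narrow the gap with physicians.
To date, […] clinical trial plans for AI-based medical devices have been approved, including the newly approved VUNO Med-BoneAge.
The AI-based medical devices with approved clinical trials are software that classifies the type of cerebral infarction from magnetic resonance images ([…]) and software that assists the diagnosis of lung nodules from X-ray images ([…]).
For reference, to support the rapid development of medical devices related to the […] industrial revolution, such as AI, virtual reality, and […] printing, the MFDS operates programs such as the Next-Generation […] Project and the Newly Developed Medical Device Approval Helper, which provide tailored support across the whole process from product research and development to clinical trials and approval.
The MFDS stated that this approval is expected to help analyze and determine each individual's bone age quickly, and that it will continue to actively support the development of advanced medical devices.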
Disclosure: I serve as an advisor to VUNO and hold equity in the company.
This copy is for personal use only.
To order printed copies, contact reprints@rsna.org
ORIGINAL RESEARCH • THORACIC IMAGING
Development and Validation of Deep
Learning–based Automatic Detection
Algorithm for Malignant Pulmonary Nodules
on Chest Radiographs
Ju Gang Nam, MD* • Sunggyun Park, PhD* • Eui Jin Hwang, MD • Jong Hyuk Lee, MD • Kwang-Nam Jin, MD,
PhD • KunYoung Lim, MD, PhD • Thienkai HuyVu, MD, PhD • Jae Ho Sohn, MD • Sangheum Hwang, PhD • Jin
Mo Goo, MD, PhD • Chang Min Park, MD, PhD
From the Department of Radiology and Institute of Radiation Medicine, Seoul National University Hospital and College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul
03080, Republic of Korea (J.G.N., E.J.H., J.M.G., C.M.P.); Lunit Incorporated, Seoul, Republic of Korea (S.P.); Department of Radiology, Armed Forces Seoul Hospital,
Seoul, Republic of Korea (J.H.L.); Department of Radiology, Seoul National University Boramae Medical Center, Seoul, Republic of Korea (K.N.J.); Department of
Radiology, National Cancer Center, Goyang, Republic of Korea (K.Y.L.); Department of Radiology and Biomedical Imaging, University of California, San Francisco,
San Francisco, Calif (T.H.V., J.H.S.); and Department of Industrial & Information Systems Engineering, Seoul National University of Science and Technology, Seoul,
Republic of Korea (S.H.). Received January 30, 2018; revision requested March 20; revision received July 29; accepted August 6. Address correspondence to C.M.P.
(e-mail: cmpark.morphius@gmail.com).
Study supported by SNUH Research Fund and Lunit (06–2016–3000) and by Seoul Research and Business Development Program (FI170002).
*J.G.N. and S.P. contributed equally to this work.
Conflicts of interest are listed at the end of this article.
Radiology 2018; 00:1–11 • https://doi.org/10.1148/radiol.2018180237 • Content codes:
Purpose: To develop and validate a deep learning–based automatic detection algorithm (DLAD) for malignant pulmonary nodules
on chest radiographs and to compare its performance with physicians including thoracic radiologists.
Materials and Methods: For this retrospective study, DLAD was developed by using 43292 chest radiographs (normal radiograph–
to–nodule radiograph ratio, 34067:9225) in 34676 patients (healthy-to-nodule ratio, 30784:3892; 19230 men [mean age, 52.8
years; age range, 18–99 years]; 15446 women [mean age, 52.3 years; age range, 18–98 years]) obtained between 2010 and 2015,
which were labeled and partially annotated by 13 board-certified radiologists, in a convolutional neural network. Radiograph clas-
sification and nodule detection performances of DLAD were validated by using one internal and four external data sets from three
South Korean hospitals and one U.S. hospital. For internal and external validation, radiograph classification and nodule detection
performances of DLAD were evaluated by using the area under the receiver operating characteristic curve (AUROC) and jackknife
alternative free-response receiver-operating characteristic (JAFROC) figure of merit (FOM), respectively. An observer performance
test involving 18 physicians, including nine board-certified radiologists, was conducted by using one of the four external validation
data sets. Performances of DLAD, physicians, and physicians assisted with DLAD were evaluated and compared.
Results: According to one internal and four external validation data sets, radiograph classification and nodule detection perfor-
mances of DLAD were a range of 0.92–0.99 (AUROC) and 0.831–0.924 (JAFROC FOM), respectively. DLAD showed a higher
AUROC and JAFROC FOM at the observer performance test than 17 of 18 and 15 of 18 physicians, respectively (P < .05), and
all physicians showed improved nodule detection performances with DLAD (mean JAFROC FOM improvement, 0.043; range,
0.006–0.190; P < .05).
Conclusion: This deep learning–based automatic detection algorithm outperformed physicians in radiograph classification and nod-
ule detection performance for malignant pulmonary nodules on chest radiographs, and it enhanced physicians’ performances when
used as a second reader.
©RSNA, 2018
Online supplemental material is available for this article.
• 43,292 chest PA radiographs (normal:nodule = 34,067:9,225)
• Labeled/annotated by 13 board-certified radiologists
• DLAD was validated on 1 internal + 4 external data sets
• Seoul National University Hospital / Boramae Medical Center / National Cancer Center / UCSF
• Classification / lesion localization
• AI vs. physicians vs. AI + physicians
• Compared with physicians at various levels
• Non-radiology / radiology residents
• Board-certified radiologists / thoracic radiologists
Nam et al
Figure 1: Images in a 78-year-old female patient with a 1.9-cm part-solid nodule at the left upper lobe. (a) The nodule was faintly visible on the
chest radiograph (arrowheads) and was detected by 11 of 18 observers. (b) At contrast-enhanced CT examination, biopsy confirmed lung adeno-
carcinoma (arrow). (c) DLAD reported the nodule with a confidence level of 2, resulting in its detection by an additional five radiologists and an
elevation in its confidence by eight radiologists.
Figure 2: Images in a 64-year-old male patient with a 2.2-cm lung adenocarcinoma at the left upper lobe. (a) The nodule was faintly visible on
the chest radiograph (arrowheads) and was detected by seven of 18 observers. (b) Biopsy confirmed lung adenocarcinoma in the left upper lobe
on contrast-enhanced CT image (arrow). (c) DLAD reported the nodule with a confidence level of 2, resulting in its detection by an additional two
radiologists and an elevated confidence level of the nodule by two radiologists.
Deep Learning Automatic Detection Algorithm for Malignant Pulmonary Nodules
Table 3: Patient Classification and Nodule Detection at the Observer Performance Test
Slide annotations on the column groups: Test 1 = physician alone; DLAD versus Test 1 = AI vs. physician alone (P value); Test 2 = physician + AI; Test 1 versus Test 2 = physician alone vs. physician + AI (P value).
Observer | Test 1 AUROC | Test 1 JAFROC FOM | DLAD vs Test 1, P (classification) | DLAD vs Test 1, P (detection) | Test 2 AUROC | Test 2 JAFROC FOM | Test 1 vs Test 2, P (classification) | Test 1 vs Test 2, P (detection)
Nonradiology physicians
Observer 1 | 0.77 | 0.716 | <.001 | <.001 | 0.91 | 0.853 | <.001 | <.001
Observer 2 | 0.78 | 0.657 | <.001 | <.001 | 0.90 | 0.846 | <.001 | <.001
Observer 3 | 0.80 | 0.700 | <.001 | <.001 | 0.88 | 0.783 | <.001 | <.001
Group | | 0.691 | | <.001* | | 0.828 | | <.001*
Radiology residents
Observer 4 | 0.78 | 0.767 | <.001 | <.001 | 0.80 | 0.785 | .02 | .03
Observer 5 | 0.86 | 0.772 | .001 | <.001 | 0.91 | 0.837 | .02 | <.001
Observer 6 | 0.86 | 0.789 | .05 | .002 | 0.86 | 0.799 | .08 | .54
Observer 7 | 0.84 | 0.807 | .01 | .003 | 0.91 | 0.843 | .003 | .02
Observer 8 | 0.87 | 0.797 | .10 | .003 | 0.90 | 0.845 | .03 | .001
Observer 9 | 0.90 | 0.847 | .52 | .12 | 0.92 | 0.867 | .04 | .03
Group | | 0.790 | | <.001* | | 0.867 | | <.001*
Board-certified radiologists
Observer 10 | 0.87 | 0.836 | .05 | .01 | 0.90 | 0.865 | .004 | .002
Observer 11 | 0.83 | 0.804 | <.001 | <.001 | 0.84 | 0.817 | .03 | .04
Observer 12 | 0.88 | 0.817 | .18 | .005 | 0.91 | 0.841 | .01 | .01
Observer 13 | 0.91 | 0.824 | >.99 | .02 | 0.92 | 0.836 | .51 | .24
Observer 14 | 0.88 | 0.834 | .14 | .03 | 0.88 | 0.840 | .87 | .23
Group | | 0.821 | | .02* | | 0.840 | | .01*
Thoracic radiologists
Observer 15 | 0.94 | 0.856 | .15 | .21 | 0.96 | 0.878 | .08 | .03
Observer 16 | 0.92 | 0.854 | .60 | .17 | 0.93 | 0.872 | .34 | .02
Observer 17 | 0.86 | 0.820 | .02 | .01 | 0.88 | 0.838 | .14 | .12
Observer 18 | 0.84 | 0.800 | <.001 | <.001 | 0.87 | 0.827 | .02 | .02
Group | | 0.833 | | .08* | | 0.854 | | <.001*
Note.—Observer 4 had 1 year of experience; observers 5 and 6 had 2 years of experience; observers 7–9 had 3 years of experience; observers
10–12 had 7 years of experience; observers 13 and 14 had 8 years of experience; observer 15 had 26 years of experience; observer 16 had 13
years of experience; and observers 17 and 18 had 9 years of experience. Observers 1–3 were 4th-year residents from obstetrics and gynecolo-
Slide annotations on the rows: nonradiology physicians are 4th-year residents in obstetrics and gynecology, orthopedic surgery, and internal medicine; radiology residents are in their 1st, 2nd, and 3rd years; board-certified radiologists have 7 or 8 years of experience; thoracic radiologists have 26, 13, and 9 years of experience.
•Accuracy improves when the AI is used as a second reader
•Classification: 17 of 18 physicians improved (15 of 18 with P < 0.05)
•Nodule detection: 18 of 18 physicians improved (14 of 18 with P < 0.05)
AI (DLAD) alone: AUROC 0.91, JAFROC FOM 0.885
•"AI alone" was mostly more accurate than "board-certified radiologist + AI"
•Classification: better than 6 of the 9
•Nodule detection: better than all 9
Pathology
Biopsy: the supreme court justice that hands down the definitive diagnosis
A B C D
Benign without atypia / Atypia / DCIS (ductal carcinoma in situ) / Invasive carcinoma
Interpretation?
Elmore et al. JAMA 2015
Diagnostic Concordance Among Pathologists
Reading breast cancer pathology data
Figure 4. Participating Pathologists’ Interpretations of Each of the 240 Breast Biopsy Test Cases
[Figure: four panels, each plotting Interpretations, % (0 to 100) against Case, broken down by pathologist interpretation (benign without atypia, atypia, DCIS, invasive carcinoma)]
A: Benign without atypia, 72 cases, 2070 total interpretations
B: Atypia, 72 cases, 2070 total interpretations
C: DCIS, 73 cases, 2097 total interpretations
D: Invasive carcinoma, 23 cases, 663 total interpretations
DCIS indicates ductal carcinoma in situ.
Diagnostic Concordance in Interpreting Breast Biopsies Original Investigation Research
Elmore et al. JAMA 2015
Degree of disagreement among pathologists in interpreting breast cancer biopsies
ISBI Grand Challenge on
Cancer Metastases Detection in Lymph Node
Camelyon16 (200 registrants)
International Symposium on Biomedical Imaging 2016
H&E Image Processing Framework
Train: whole slide image → sampled normal and tumor patches → training data → Convolutional Neural Network
Test: whole slide image → overlapping image patches → Convolutional Neural Network → P(tumor) → tumor prob. map (0.0 to 1.0)
https://blogs.nvidia.com/blog/2016/09/19/deep-learning-breast-cancer-diagnosis/
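The test-time half of this framework (overlapping patches in, tumor probability map out) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the pipeline described in the linked post: `mean_intensity_classifier` is a hypothetical stand-in for the trained CNN, and the nested-list image, patch size, and stride are chosen purely for readability.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image: rows of pixel intensities in [0, 1]


def extract_patch(image: Image, top: int, left: int, size: int) -> Image:
    """Cut a size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def mean_intensity_classifier(patch: Image) -> float:
    """Hypothetical stand-in for the CNN: returns a pseudo P(tumor) in [0, 1]."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)


def tumor_probability_map(
    image: Image,
    patch_size: int,
    stride: int,
    classify: Callable[[Image], float],
) -> List[List[float]]:
    """Slide an overlapping window over the image and score every patch."""
    height, width = len(image), len(image[0])
    heatmap = []
    for top in range(0, height - patch_size + 1, stride):
        row_probs = []
        for left in range(0, width - patch_size + 1, stride):
            row_probs.append(classify(extract_patch(image, top, left, patch_size)))
        heatmap.append(row_probs)
    return heatmap


# Toy 4x4 "slide" (an assumption for illustration): the right half is bright.
toy_slide = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
toy_heatmap = tumor_probability_map(
    toy_slide, patch_size=2, stride=1, classify=mean_intensity_classifier
)
```

Using a stride smaller than the patch size is what makes the patches overlap; in a real system the classifier would be a trained network and the slide would be far too large to hold as a Python list.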
Clinical study on ISBI dataset
Error Rate
Pathologist in competition setting: 3.5%
Pathologists in clinical practice (n = 12): 13% - 26%
Pathologists on micro-metastasis (small tumors): 23% - 42%
Beck Lab Deep Learning Model: 0.65%
Beck Lab’s deep learning model now outperforms pathologist
Andrew Beck, Machine Learning for Healthcare, MIT 2017
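Google's breast pathology reading AI
• The localization score (FROC) for the algorithm reached 89%, which significantly
exceeded the score of 73% for a pathologist with no time constraint.
The AI's sensitivity + the human's specificity
Yun Liu et al. Detecting Cancer Metastases on Gigapixel Pathology Images (2017)
• Google's AI showed a large improvement in sensitivity (92.9%, 88.5%)

•@8FP: the sensitivity achievable when up to 8 false positives (FPs) are tolerated

•FROC: the average of the sensitivities obtained when 1/4, 1/2, 1, 2, 4, and 8 FPs per slide are allowed

•In other words, if a few FPs are tolerated, the AI can reach very high sensitivity

• The human pathologist, by contrast, reached a sensitivity of 73% but a specificity of nearly 100%
•The human pathologist and the AI pathologist are each good at different things

•If the two cooperate, improvements in reading efficiency, consistency, and sensitivity can be expected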
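The FROC score described above (mean sensitivity at 1/4, 1/2, 1, 2, 4, and 8 false positives per slide) can be sketched as below. This is a simplified illustration, not the official challenge evaluation code: it assumes each true-positive detection hits a distinct lesion, ignores score ties, and uses made-up toy detections.

```python
from typing import List, Tuple

# Allowed average false positives per slide, as listed on the slide.
FP_RATES = (0.25, 0.5, 1, 2, 4, 8)

# (confidence score, is_true_positive); assumed one detection per lesion at most.
Detection = Tuple[float, bool]


def sensitivity_at_fp_rate(
    detections: List[Detection],
    n_lesions: int,
    n_slides: int,
    max_fp_per_slide: float,
) -> float:
    """Best sensitivity reachable while average FPs per slide stay within the limit."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_pos = false_pos = 0
    best = 0.0
    for _, is_true_positive in ranked:
        if is_true_positive:
            true_pos += 1
        else:
            false_pos += 1
        if false_pos / n_slides > max_fp_per_slide:
            break  # lowering the threshold further would exceed the FP budget
        best = true_pos / n_lesions
    return best


def froc_score(detections: List[Detection], n_lesions: int, n_slides: int) -> float:
    """Average of the sensitivities at the predefined FP rates."""
    values = [
        sensitivity_at_fp_rate(detections, n_lesions, n_slides, rate)
        for rate in FP_RATES
    ]
    return sum(values) / len(values)


# Toy data (hypothetical): 4 slides containing 4 lesions in total.
toy_detections = [
    (0.9, True), (0.8, False), (0.7, True), (0.6, False),
    (0.5, True), (0.4, False), (0.3, False), (0.2, True),
]
toy_froc = froc_score(toy_detections, n_lesions=4, n_slides=4)
sens_at_8fp = sensitivity_at_fp_rate(toy_detections, 4, 4, 8)
```

On this toy input the sensitivity rises from 0.5 at 1/4 FP per slide to 1.0 at 1 FP per slide and beyond, which is the pattern the slide points to: tolerating a few false positives buys sensitivity.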
Skin cancer reading AI
Diabetic retinopathy reading AI
•Some polyps were detected with only partial appearance.
•detected in both normal and insufficient light condition.
•detected under both qualified and suboptimal bowel preparations.
ARTICLES NATURE BIOMEDICAL ENGINEERING
from patients who underwent colonoscopy examinations up to 2 years later.
Also, we demonstrated high per-image-sensitivity (94.38% and 91.64%) in both the image (dataset A) and video (dataset C) analyses. Datasets A and C included large variations of polyp morphology and image quality (Fig. 3, Supplementary Figs. 2–5 and Supplementary Videos 3 and 4). For images with only flat and iso-
datasets are often small and do not represent the full range of colon conditions encountered in the clinical setting, and there are often discrepancies in the reporting of clinical metrics of success such as sensitivity and specificity (refs. 19, 20, 26). Compared with other metrics such as precision, we believe that sensitivity and specificity are the most appropriate metrics for the evaluation of algorithm performance because of their independence on the ratio of positive to negative
Fig. 3 | Examples of polyp detection for datasets A and C. Polyps of different morphology, including flat isochromatic polyps (left), dome-shaped polyps (second from left, middle), pedunculated polyps (second from right) and sessile serrated adenomatous polyps (right), were detected by the algorithm (as indicated by the green tags in the bottom set of images) in both normal and insufficient light conditions, under both qualified and suboptimal bowel preparations. Some polyps were detected with only partial appearance (middle, second from right). See Supplementary Figs 2–6 for additional examples.
flat isochromatic polyps / dome-shaped polyps / pedunculated polyps / sessile serrated adenomatous polyps
AI that assists polyp detection during colonoscopy
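The excerpt's point that sensitivity and specificity do not depend on the ratio of positive to negative samples, whereas precision does, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 0.9 sensitivity and 0.9 specificity and the sample counts are assumed values, not figures from the paper.

```python
def precision_from_rates(
    n_positive: int,
    n_negative: int,
    sensitivity: float,
    specificity: float,
) -> float:
    """Precision implied by fixed sensitivity/specificity on a given class mix."""
    true_positives = sensitivity * n_positive
    false_positives = (1.0 - specificity) * n_negative
    return true_positives / (true_positives + false_positives)


# Assumed operating point, held constant across both datasets.
SENSITIVITY = 0.9
SPECIFICITY = 0.9

# Balanced dataset: 100 polyp images, 100 polyp-free images.
precision_balanced = precision_from_rates(100, 100, SENSITIVITY, SPECIFICITY)

# Imbalanced dataset: same detector, but polyp-free images are 9x more common.
precision_imbalanced = precision_from_rates(100, 900, SENSITIVITY, SPECIFICITY)
```

With the detector unchanged, precision drops from roughly 0.9 to roughly 0.5 purely because negatives became more common, which is why precision figures are hard to compare across datasets with different class mixes.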
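•Analysis of complex medical data and derivation of insights

•Analysis and reading of medical imaging and pathology data

•Monitoring of continuous data for prevention and prediction
Three types of medical AI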
http://www.rolls-royce.com/about/our-technology/enabling-technologies/engine-health-management.aspx#sense
250 sensors to monitor the “health” of the GE turbines
Fig 1. What can consumer wearables do? Heart rate can be measured with an oximeter built into a ring [3], muscle activity with an electromyographi
sensor embedded into clothing [4], stress with an electodermal sensor incorporated into a wristband [5], and physical activity or sleep patterns via an
accelerometer in a watch [6,7]. In addition, a female’s most fertile period can be identified with detailed body temperature tracking [8], while levels of me
attention can be monitored with a small number of non-gelled electroencephalogram (EEG) electrodes [9]. Levels of social interaction (also known to a
PLOS Medicine 2016
Project Artemis at UOIT
SEPSIS
A targeted real-time early warning score (TREWScore) for septic shock
Katharine E. Henry,1 David N. Hager,2 Peter J. Pronovost,3,4,5 Suchi Saria1,3,5,6*
Sepsis is a leading cause of death in the United States, with mortality highest among patients who develop septic shock. Early aggressive treatment decreases morbidity and mortality. Although automated screening tools can detect patients currently experiencing severe sepsis and septic shock, none predict those at greatest risk of developing shock. We analyzed routinely available physiological and laboratory data from intensive care unit patients and developed “TREWScore,” a targeted real-time early warning score that predicts which patients will develop septic shock. TREWScore identified patients before the onset of septic shock with an area under the ROC (receiver operating characteristic) curve (AUC) of 0.83 [95% confidence interval (CI), 0.81 to 0.85]. At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 and identified patients a median of 28.2 [interquartile range (IQR), 10.6 to 94.2] hours before onset. Of those identified, two-thirds were identified before any sepsis-related organ dysfunction. In comparison, the Modified Early Warning Score, which has been used clinically for septic shock prediction, achieved a lower AUC of 0.73 (95% CI, 0.71 to 0.76). A routine screening protocol based on the presence of two of the systemic inflammatory response syndrome criteria, suspicion of infection, and either hypotension or hyperlactatemia achieved a lower sensitivity of 0.74 at a comparable specificity of 0.64. Continuous sampling of data from the electronic health records and calculation of TREWScore may allow clinicians to identify patients at risk for septic shock and provide earlier interventions that would prevent or mitigate the associated morbidity and mortality.
INTRODUCTION
Seven hundred fifty thousand patients develop severe sepsis and septic shock in the United States each year. More than half of them are admitted to an intensive care unit (ICU), accounting for 10% of all ICU admissions, 20 to 30% of hospital deaths, and $15.4 billion in annual health care costs (1–3). Several studies have demonstrated that morbidity, mortality, and length of stay are decreased when severe sepsis and septic shock are identified and treated early (4–8). In particular, one study showed that mortality from septic shock increased by 7.6% with every hour that treatment was delayed after the onset of hypotension (9).
More recent studies comparing protocolized care, usual care, and early goal-directed therapy (EGDT) for patients with septic shock suggest that usual care is as effective as EGDT (10–12). Some have interpreted this to mean that usual care has improved over time and reflects important aspects of EGDT, such as early antibiotics and early aggressive fluid resuscitation (13). It is likely that continued early identification and treatment will further improve outcomes. However, the
Acute Physiology Score (SAPS II), Sequential Organ Failure Assessment (SOFA) scores, Modified Early Warning Score (MEWS), and Simple Clinical Score (SCS) have been validated to assess illness severity and risk of death among septic patients (14–17). Although these scores are useful for predicting general deterioration or mortality, they typically cannot distinguish with high sensitivity and specificity which patients are at highest risk of developing a specific acute condition.
The increased use of electronic health records (EHRs), which can be queried in real time, has generated interest in automating tools that identify patients at risk for septic shock (18–20). A number of “early warning systems,” “track and trigger” initiatives, “listening applications,” and “sniffers” have been implemented to improve detection and timeliness of therapy for patients with severe sepsis and septic shock (18, 20–23). Although these tools have been successful at detecting patients currently experiencing severe sepsis or septic shock, none predict which patients are at highest risk of developing septic shock.
The adoption of the Affordable Care Act has added to the growing excitement around predictive models derived from electronic health
RESEARCH ARTICLE
puted as new data became avail
when his or her score crossed t
dation set, the AUC obtained f
0.81 to 0.85) (Fig. 2). At a spec
of 0.33], TREWScore achieved a s
a median of 28.2 hours (IQR, 10
Identification of patients b
A critical event in the developme
related organ dysfunction (seve
been shown to increase after th
more than two-thirds (68.8%) o
were identified before any sepsi
tients were identified a median
(Fig. 3B).
Comparison of TREWScore
We evaluated the performance of
methods for the purpose of provid
use of TREWScore. We first com
to MEWS, a general metric used
of catastrophic deterioration (17
oped for tracking sepsis, MEWS
tion of patients at risk for severe
Fig. 2. ROC for detection of septic shock before onset in the validation
set. The ROC curve for TREWScore is shown in blue, with the ROC curve for
MEWS in red. The sensitivity and specificity performance of the routine
screening criteria is indicated by the purple dot. Normal 95% CIs are shown
for TREWScore and MEWS. TPR, true-positive rate; FPR, false-positive rate.
A targeted real-time early warning score (TREWScore)
for septic shock
AUC=0.83
At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 

and identified patients a median of 28.2 hours before onset.
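The two headline numbers here, an AUC and a sensitivity read off at a fixed specificity, can both be computed from risk scores alone. The sketch below shows one way to do so on made-up scores; it is not the TREWScore model or its evaluation code, and the toy patient scores are assumptions for illustration only.

```python
from typing import List


def auc(pos_scores: List[float], neg_scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_at_specificity(
    pos_scores: List[float],
    neg_scores: List[float],
    min_specificity: float,
) -> float:
    """Highest sensitivity over thresholds whose specificity meets the minimum.

    A patient is flagged when score >= threshold.
    """
    best = 0.0
    for threshold in sorted(set(pos_scores + neg_scores)):
        specificity = sum(n < threshold for n in neg_scores) / len(neg_scores)
        sensitivity = sum(p >= threshold for p in pos_scores) / len(pos_scores)
        if specificity >= min_specificity:
            best = max(best, sensitivity)
    return best


# Hypothetical risk scores: patients who later developed septic shock vs. those who did not.
shock_scores = [0.9, 0.8, 0.6, 0.4]
no_shock_scores = [0.7, 0.5, 0.3, 0.1]

toy_auc = auc(shock_scores, no_shock_scores)
toy_sensitivity = sensitivity_at_specificity(shock_scores, no_shock_scores, 0.75)
```

Reporting sensitivity at a stated specificity, as the abstract does, pins down one operating point on the ROC curve, while the AUC summarizes the whole curve.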
March 2019, the Future of Individual Medicine @San Diego
ADA 2018
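•Launched in the US as an iPhone app

•The key question is how cumbersome it is to use

•How long must it be used to be effective: two weeks? A lifetime?

•How will food logging and similar tasks be handled?

•The pricing model also does not appear to have been disclosed yet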
ADA 2018
ADA 2017, San Diego, Courtesy of Taeho Kim (Seoul Medical Center)
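•Analysis of complex medical data and derivation of insights

•Analysis and reading of medical imaging and pathology data

•Monitoring of continuous data for prevention and prediction
Three types of medical AI
Three stages of digital healthcare
•Step 1. Measuring data

•Step 2. Integrating data

•Step 3. Analyzing data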
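Characteristics of the healthcare market/industry
Why is it so hard?
Is the global healthcare market a big market?
YES and NO.
The global healthcare market is large in aggregate: $12 trillion.
(about KRW 13,046 trillion)
However, the healthcare market is made up of the sum of extremely fragmented small markets.
It is impossible to meet the needs of every sub-market.
Needs in the healthcare market are highly segmented from customer to customer.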
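• Healthy people / patients

• 20s / 30s / 40s / 50s / 60s / 70s / 80s

• Male / female

• Underweight / normal weight / overweight

• Family history

• Interest in health

• Ability to pay

• Digital literacy

• B2C / B2B
• It is impossible to meet every need of every customer.

• In the end, there is no choice but to target one segment at a time.

• Then which customers should be chosen?

• Which customer segment has the most desperate need?

• Which customers can we actually offer a solution to?

• Which customers can pay?
Then what should be done?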
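The healthcare market paradox
Healthy people / Severe disease

Acute disease
Willingness to pay: high
Willingness to pay: low
Target customers: many
Target customers: few
Severity
Who pays?
User, decision maker, payer
Company / Customer
Purchase
Payment
Who is the customer?
Patient / Physician
Insurance premium
Claim
Insurer
Payment
Medical care
Payment
Insurance payout
Who is the customer?
Patient / Physician
Company / Insurer
Who is the customer?
Who pays?

Who decides?

Who uses it?
Patient / Physician
Company / Insurer
Who is the payer?
Who pays?

Who decides?

Who uses it?
(reimbursement fee schedule, purchasing team, parents)
Will they pay?
How large is the need
Payment structure +
Stakeholders
Company
Customer
Other companies / Government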
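Unlike a typical industry ecosystem, the healthcare industry ecosystem has a great many stakeholders.
Healthcare company
Civic groups
Insurers
Other companies / Government
Regulators
Patients
Hospitals
HIRA (Health Insurance Review & Assessment Service)
Unlike the general startup ecosystem, the healthcare startup ecosystem has a great many stakeholders.
A new healthcare company inevitably faces a gap between itself and these stakeholders.
New company
Government
Civic groups
Regulators
Insurers
Hospitals
Evidence is needed.
Data, data, data!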
https://www.theranos.com/content/images/news/we-publish-prices.jpg
240 lab tests, less than $15 each
http://graphics.wsj.com/billion-dollar-club/
•Company valuation: $9b (June 2014)

•Total funding raised: $400m

•Elizabeth Holmes herself held a majority stake
The Journal of Clinical Investigation CLINICAL MEDICINE
Introduction
Clinical laboratory testing plays a critical role in health care and evidence-based medicine (1). Lab tests provide essential data that support clinical decisions to screen, diagnose, and treat health conditions (2). Most individuals encounter clinical testing through their health care provider during a routine health assessment or as a patient in a health care facility. However, individuals are increasingly playing more active roles in managing their health, and some now seek direct access to laboratory testing for self-guided assessment or monitoring (3–5).
In the USA, all clinical laboratory testing conducted on humans is regulated by Centers for Medicare & Medicaid Services (CMS) based on guidelines outlined in Clinical Laboratory Improvement Amendments (CLIA) (6). To ensure analytical quality of laboratory methods, certified laboratories are required to participate in periodic proficiency testing using a homogeneous batch of samples that are distributed to each laboratory from a CMS-approved proficiency testing program. These programs assess the total allowable error (TEa) that combines method bias and total imprecision for each analyte. Acceptability criteria are determined by CLIA and/or the appropriate accrediting agency (7).
Direct-to-consumer service models now provide means for individuals to obtain laboratory testing outside traditional health care settings (4, 5). One company implementing this new model is Theranos, which offers a blood testing service that uses capillary tube collection and promises several advantages over traditional venipuncture: lower collection volumes (typically ≤150 μl versus ≥1.5 ml), convenience, and reduced cost, on average about 5-fold less than the 2 largest testing laboratories in the USA (Quest and LabCorp) (8). However, availability of these services varies by state, where access to offerings may be more or less restrictive
BACKGROUND. Clinical laboratory tests are now being prescribed and made directly available to consumers through retail
outlets in the USA. Concerns with these tests have been raised regarding the uncertainty of testing methods used in these
venues and a lack of open, scientific validation of the technical accuracy and clinical equivalency of results obtained through
these services.
METHODS. We conducted a cohort study of 60 healthy adults to compare the uncertainty and accuracy in 22 common clinical
lab tests between one company offering blood tests obtained from finger prick (Theranos) and 2 major clinical testing services
that require standard venipuncture draws (Quest and LabCorp). Samples were collected in Phoenix, Arizona, at an ambulatory
clinic and at retail outlets with point-of-care services.
RESULTS. Theranos flagged tests outside their normal range 1.6× more often than other testing services (P < 0.0001). Of the
22 lab measurements evaluated, 15 (68%) showed significant interservice variability (P < 0.002). We found nonequivalent
lipid panel test results between Theranos and other clinical services. Variability in testing services, sample collection times,
and subjects markedly influenced lab results.
CONCLUSION. While laboratory practice standards exist to control this variability, the disparities between testing services
we observed could potentially alter clinical interpretation and health care utilization. Greater transparency and evaluation of
testing technologies would increase their utility in personalized health management.
FUNDING. This work was supported by the Icahn Institute for Genomics and Multiscale Biology, a gift from the Harris Family
Charitable Foundation (to J.T. Dudley), and grants from the NIH (R01 DK098242 and U54 CA189201, to J.T. Dudley, and R01
AG046170 and U01 AI111598, to E.E. Schadt).
Evaluation of direct-to-consumer low-volume lab tests in healthy adults
Brian A. Kidd,1,2,3 Gabriel Hoffman,1,2 Noah Zimmerman,3 Li Li,1,2,3 Joseph W. Morgan,3 Patricia K. Glowe,1,2,3 Gregory J. Botwin,3 Samir Parekh,4 Nikolina Babic,5 Matthew W. Doust,6 Gregory B. Stock,1,2,3 Eric E. Schadt,1,2 and Joel T. Dudley1,2,3
1 Department of Genetics and Genomic Sciences, 2 Icahn Institute for Genomics and Multiscale Biology, 3 Harris Center for Precision Wellness, 4 Department of Hematology and Medical Oncology, and 5 Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York, USA. 6 Hope Research Institute (HRI), Phoenix, Arizona, USA.
Conflict of interest: J.T. Dudley owns equity in NuMedii Inc. and has received consulting
fees or honoraria from Janssen Pharmaceuticals, GlaxoSmithKline, AstraZeneca, and
LAM Therapeutics.
Role of funding source: Study funding provided by the Icahn Institute for Genomics
and Multiscale Biology and the Harris Center for Precision Wellness at the Icahn
School of Medicine at Mount Sinai. Salaries of B.A. Kidd, J.T. Dudley, and E.E. Schadt
Downloaded from http://www.jci.org on March 28, 2016. http://dx.doi.org/10.1172/JCI86318
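•A paper from Mt Sinai on the accuracy of Theranos

•Conducted around July 2015 over five days in 60 healthy adults

•22 test items were sent to Theranos and to two other testing services, and the results were compared

•In conclusion, the Theranos results were substantially inaccurate

•For cholesterol and similar tests, inaccurate enough to change a physician's diagnosis

•Across the tests overall, Theranos flagged results as outside the normal range 1.6 times more often

•15 of the 22 test items showed a significant difference in results.

•Another problem that cannot be determined from the paper

•Whether Theranos really used the Edison device that it 'claimed' to have developed in-house

•According to a former employee's account reported in the WSJ, by around July 2015

•Theranos was no longer using the Edison device and was instead diluting blood to run it on existing devices from Siemens and others

•As expected(?), Theranos again responded that this was a flawed paper with a conflict of interest
From $4.5b to $0.
To claim something new, you need evidence.
Papers, clinical studies…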
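Moon Yeo-jeong (문여정), Director, IMM Investment (obstetrics and gynecology specialist)
“Which healthcare startups should one invest in?”
“One should invest in startups that know

there is no answer in Korean healthcare.”
Characteristics of the Korean healthcare market
Is a healthcare industry possible in Korea?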
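The market problem
•No persuasive answer to “Why is Korea an attractive market?”
•A domestic market that is too small (+ healthcare is a fragmented market)
• Innovation is hard to attempt, and sustainable business models are few
• Ultimately, companies must consider going overseas
• In practice, however, many end up stuck and able to do neither
The market problem
•Government and patients see healthcare as ‘welfare’ rather than an industry
• Consumers have little sense of spending money on healthcare
• They want ‘perfect care’ that is also ‘cheap or free’
• A perception that one should not make money from healthcare
•Korea is fundamentally a low-trust society
• Government, patients, the medical community, industry: none trust one another
• A structure in which dense regulations are drawn up in advance and expertise is not mutually recognized
The market problem
•Few founders and investors understand the particularities of healthcare
• Regulation, approvals, insurance reimbursement + a complex web of stakeholders
• Few founders understand the market's particularities and have found product-market fit
• Few investors can recognize such founders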
Quiz: What do the following cases have in common?
Results within 6-8 weeks
A little spit is all it takes!
DTC Genetic Testing
Direct-To-Consumer
CellScope’s iPhone-enabled otoscope
AliveCor Heart Monitor (Kardia)
transfer from Share2 to HealthKit as mandated by Dexcom receiver Food and Drug Administration device classification. Once the glucose values reach HealthKit, they are passively shared with the Epic MyChart app (https://www.epic.com/software-phr.php). The MyChart patient portal is a component of the Epic EHR and uses the same database, and the CGM values populate a standard glucose flowsheet in the patient’s chart. This connection is initially established when a pro-
Participation required confirmation of Bluetooth pairing of the CGM receiver to a mobile device, updating the mobile device with the most recent version of the operating system, Dexcom Share2 app, Epic MyChart app, and confirming or establishing a username and password for all accounts, including a parent’s/adolescent’s Epic MyChart account. Setup time averaged 45–60 minutes in addition to the scheduled clinic visit. During this time, there was specific verbal and written notification to the patients/par-
Figure 1: Overview of the CGM data communication bridge architecture.
BRIEF COMMUNICATION
Kumar R B, et al. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv206, Brief Communication
JAMIA 2016
Remote Patients Monitoring
via Dexcom-HealthKit-Epic-Stanford
• New York
• First-time home visit $50; regular visits $200; physical $100
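•Adds an interactive policy using wearables such as Fitbit and smartphones to 'all' insurance products
•A program that enrolls people for 'free' in a $1,000 insurance policy if they provide their wearable data

•Sudden-death insurance offered in partnership with Amica Life and Greenhouse Life Insurance Company

•The content of the data will not change insurance coverage or premium rates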
Beam Technologies
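•Beam Dental sells a new kind of dental insurance built around a smart toothbrush

•Members who sign up for the dental insurance receive a smart toothbrush, toothpaste, and floss by regular free delivery

•Dynamic pricing based on brushing data (with user consent)

•Raised $22.5m from KPCB (Series C)

•Currently operating in 16 US states, with plans to expand to 35 states by year-end using this funding

•“The dental insurance market has fewer regulations and obstacles than general health insurance”
What do they have in common?
Illegal in Korea.
(+ or not eligible for reimbursement)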
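The regulation problem
The regulation problem (1/3)
•Positive-list regulation
• Anything not explicitly permitted by law is illegal
• Regulatory sandboxes and a comprehensive negative-list approach are under discussion
•MFDS (Ministry of Food and Drug Safety): medical device approval
• Has improved considerably of late
• Proactive releases such as guidelines for AI medical devices
•HIRA: new health technology assessment
• Double regulation: a device cannot be sold even after MFDS medical device approval
• The reason it takes longer to launch in the domestic market
• In Korea it is hard to do something new that did not exist before
The regulation problem (2/3)
•Uncertainty in the regulation of medical data
• No clear legal definition of ‘medical data’
• No legal definition of personally identifiable information, de-identification, or re-identification
•Domestic perception of data businesses
• The framing that “big corporations make money off patients' data”
• Perfect protection + maximizing the value of data: both are demanded
The regulation problem (3/3)
•Other major regulations
• Ban on for-profit corporate hospitals (vs. Apple establishing its own clinics)
• Ban on telemedicine (vs. Teladoc)
• Ban on drug delivery (vs. Amazon, Baidu)
• Ban on DTC genetic testing (vs. 23andMe)
• Insurers' health-management services are a gray zone (vs. Oscar, Omada/Noom)
• Ban on ride-sharing services, ban on patient solicitation (vs. Uber Health)
Particularities of Korean healthcare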
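Particularities of Korean healthcare (1/3)
•Single national health insurance & the mandatory designation system
• The government controls the price of every medical service

• The government (HIRA), not clinicians, judges the medical necessity of a given service
• Cases arise where a physician needs something but cannot use it because no reimbursement applies
• The government fundamentally does not trust the medical community (‘low-trust society’)

• Saving health insurance funds is one of the biggest goals
• Very conservative about boldly adopting new innovative technologies
Particularities of Korean healthcare (2/3)
•Moon Jae-in Care: expanded coverage
• All ‘medically necessary’ services are to be covered by health insurance
• Expanded coverage = stronger state control = weaker corporate autonomy = hindered innovation (the opposite of the global direction)
•Low reimbursement (one of the most fundamental problems)
• National health insurance is very conservative in granting reimbursement (e.g. AI medical devices)
• Even when reimbursement is granted, it is low (covers only part of the cost)
Particularities of Korean healthcare (3/3)
•High accessibility
• Korea: same-day care vs. the US: 2-3 weeks after booking
• The reason many US business models do not work in Korea.
•A collapsed healthcare delivery system
• Even a patient with an ordinary cold can visit the Seoul National University Hospital emergency room
• The division of roles among primary, secondary, and tertiary hospitals has broken down (possibly both a crisis and an opportunity)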
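Some possible solutions
•1. Split national health insurance into two or more schemes
• Ultimately every problem in the Korean healthcare market comes back to national health insurance
• Establish an additional national health insurance scheme that adopts innovative technology more actively
• Likely a fundamental solution, but probably impossible in practice
• Runs afoul of the so-called ‘law of public sentiment’
• Issues of ‘mechanical equity’ and ‘political correctness’
• No acceptance that people with money receive better service
• Runs counter to coverage expansion and also conflicts with opposition to ‘healthcare commercialization’
Some possible solutions
•2. B2B2C targeting private insurers
• Private insurers' interest in digital healthcare services has been rising recently
• However, domestic insurers lack the relevant experience, expertise, and data
• Guidelines on health-management services were issued recently
• However, gray zones remain, leaving room for interpretation
• The Ministry of Health and Welfare launched a health-management services task force in early 2018, but it has had no activity so far
• Pushback from civic groups is also a burden
• The framing that ‘insurers make money off policyholders' data’

Some possible solutions
•3. Focus on wellness and healthcare-related services other than medical devices
• Do business in areas not subject to regulation, outside medical acts such as diagnosis and treatment
• Hard to solve medicine's fundamental problems, but a business is at least possible
• B2B for hospitals
• B2C for general patients

• Most digital healthcare companies that have succeeded in Korea fall into this area
• Gangnam Unni from Care Labs (IPO in 2018): an O2O platform for plastic surgery
• Most of DHP's investments also fall into this area
• Hospital chatbots, VR solutions for medical students/physicians, (+ mindfulness meditation)
• O2O platforms: diabetes, hair loss, finding a doctor, (+ caregivers)
Some possible solutions
•4. Going overseas
• Leaving the Korean market, which is small, has no willingness to pay, and lacks adequate regulation,
• a move to enter markets such as the US, Europe, China, Japan, and Southeast Asia first
• Recently some startups start overseas from the outset,
• or, increasingly, launch overseas first after developing the technology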
Feedback/Questions
• Email: yoonsup.choi@gmail.com
• Blog: http://www.yoonsupchoi.com
• Facebook: Yoon Sup Choi

The Future of Medicine, Digital Healthcare + Characteristics of the Healthcare Market

  • 1.
    의료의 미래, 디지털헬스케어 Professor, SAHIST, Sungkyunkwan University Director, Digital Healthcare Institute Yoon Sup Choi, Ph.D.
  • 2.
    “It's in Apple'sDNA that technology alone is not enough. 
 It's technology married with liberal arts.”
  • 3.
    The Convergence ofIT, BT and Medicine
  • 5.
    최윤섭 지음 의료인공지능 표지디자인•최승협 컴퓨터 털 헬 치를만드는 것을 화두로 기업가, 엔젤투자가, 에반 의 대표적인 전문가로, 활 이 분야를 처음 소개한 장 포항공과대학교에서 컴 동 대학원 시스템생명공 취득하였다. 스탠퍼드대 조교수, KT 종합기술원 컨 구원 연구조교수 등을 거 저널에 10여 편의 논문을 국내 최초로 디지털 헬스 윤섭 디지털 헬스케어 연 국내 유일의 헬스케어 스 어 파트너스’의 공동 창업 스타트업을 의료 전문가 관대학교 디지털헬스학과 뷰노, 직토, 3billion, 서지 소울링, 메디히어, 모바일 자문을 맡아 한국에서도 고 있다. 국내 최초의 디 케어 이노베이션』에 활발 을 연재하고 있다. 저서로 와 『그렇게 나는 스스로 •블로그_ http://www •페이스북_ https://w •이메일_ yoonsup.c 최윤섭 의료 인공지능은 보수적인 의료 시스템을 재편할 혁신을 일으키고 있다. 의료 인공지능의 빠른 발전과 광범위한 영향은 전문화, 세분화되며 발전해 온 현대 의료 전문가들이 이해하기가 어려우며, 어디서부 터 공부해야 할지도 막연하다. 이런 상황에서 의료 인공지능의 개념과 적용, 그리고 의사와의 관계를 쉽 게 풀어내는 이 책은 좋은 길라잡이가 될 것이다. 특히 미래의 주역이 될 의학도와 젊은 의료인에게 유용 한 소개서이다. ━ 서준범, 서울아산병원 영상의학과 교수, 의료영상인공지능사업단장 인공지능이 의료의 패러다임을 크게 바꿀 것이라는 것에 동의하지 않는 사람은 거의 없다. 하지만 인공 지능이 처리해야 할 의료의 난제는 많으며 그 해결 방안도 천차만별이다. 흔히 생각하는 만병통치약 같 은 의료 인공지능은 존재하지 않는다. 이 책은 다양한 의료 인공지능의 개발, 활용 및 가능성을 균형 있 게 분석하고 있다. 인공지능을 도입하려는 의료인, 생소한 의료 영역에 도전할 인공지능 연구자 모두에 게 일독을 권한다. ━ 정지훈, 경희사이버대 미디어커뮤니케이션학과 선임강의교수, 의사 서울의대 기초의학교육을 책임지고 있는 교수의 입장에서, 산업화 이후 변하지 않은 현재의 의학 교육 으로는 격변하는 인공지능 시대에 의대생을 대비시키지 못한다는 한계를 절실히 느낀다. 저와 함께 의 대 인공지능 교육을 개척하고 있는 최윤섭 소장의 전문적 분석과 미래 지향적 안목이 담긴 책이다. 인공 지능이라는 미래를 대비할 의대생과 교수, 그리고 의대 진학을 고민하는 학생과 학부모에게 추천한다. ━ 최형진, 서울대학교 의과대학 해부학교실 교수, 내과 전문의 최근 의료 인공지능의 도입에 대해서 극단적인 시각과 태도가 공존하고 있다. 이 책은 다양한 사례와 깊 은 통찰을 통해 의료 인공지능의 현황과 미래에 대해 균형적인 시각을 제공하여, 인공지능이 의료에 본 격적으로 도입되기 위한 토론의 장을 마련한다. 의료 인공지능이 일상화된 10년 후 돌아보았을 때, 이 책 이 그런 시대를 이끄는 길라잡이 역할을 하였음을 확인할 수 있기를 기대한다. ━ 정규환, 뷰노 CTO 의료 인공지능은 다른 분야 인공지능보다 더 본질적인 이해가 필요하다. 단순히 인간의 일을 대신하는 수준을 넘어 의학의 패러다임을 데이터 기반으로 변화시키기 때문이다. 따라서 인공지능을 균형있게 이 해하고, 어떻게 의사와 환자에게 도움을 줄 수 있을지 깊은 고민이 필요하다. 세계적으로 일어나고 있는 이러한 노력의 결과물을 집대성한 이 책이 반가운 이유다. ━ 백승욱, 루닛 대표 의료 인공지능의 최신 동향뿐만 아니라, 의의와 한계, 전망, 그리고 다양한 생각거리까지 주는 책이다. 논쟁이 되는 여러 이슈에 대해서도 저자는 자신의 시각을 명확한 근거에 기반하여 설득력 있게 제시하 고 있다. 개인적으로는 이 책을 대학원 수업 교재로 활용하려 한다. ━ 신수용, 성균관대학교 디지털헬스학과 교수 최윤섭지음 의료인공지능 값 20,000원 ISBN 979-11-86269-99-2 최초의 책! 계 안팎에서 제기 고 있다. 
현재 의 분 커버했다고 자 것인가, 어느 진료 제하고 효용과 안 누가 지는가, 의학 쉬운 언어로 깊이 들이 의료 인공지 적인 용어를 최대 서 다른 곳에서 접 를 접하게 될 것 너무나 빨리 발전 책에서 제시하는 술을 공부하며, 앞 란다. 의사 면허를 취득 저가 도움되면 좋 를 불러일으킬 것 화를 일으킬 수도 슈에 제대로 대응 분은 의학 교육의 예비 의사들은 샌 지능과 함께하는 레이닝 방식도 이 전에 진료실과 수 겠지만, 여러분들 도생하는 수밖에 미래의료학자 최윤섭 박사가 제시하는 의료 인공지능의 현재와 미래 의료 딥러닝과 IBM 왓슨의 현주소 인공지능은 의사를 대체하는가 값 20,000원 ISBN 979-11-86269-99-2 레이닝 방식도 이 전에 진료실과 수 겠지만, 여러분들 도생하는 수밖에 소울링, 메디히어, 모바일 자문을 맡아 한국에서도 고 있다. 국내 최초의 디 케어 이노베이션』에 활발 을 연재하고 있다. 저서로 와 『그렇게 나는 스스로 •블로그_ http://www •페이스북_ https://w •이메일_ yoonsup.c
  • 7.
  • 8.
  • 9.
    “Technology will replace80% of doctors”
  • 10.
    https://www.youtube.com/watch?time_continue=70&v=2HMPRXstSvQ “영상의학과 전문의를 양성하는것을 당장 그만둬야 한다. 5년 안에 딥러닝이 영상의학과 전문의를 능가할 것은 자명하다.” Hinton on Radiology
  • 11.
    https://rockhealth.com/reports/2018-year-end-funding-report-is-digital-health-in-a-bubble/ •2018년에는 $8.1B 가투자되며 역대 최대 규모를 또 한 번 갱신 (전년 대비 42.% 증가) •총 368개의 딜 (전년 359 대비 소폭 증가): 개별 딜의 규모가 커졌음 •전체 딜의 절반이 seed 혹은 series A 투자였음 •‘초기 기업들이 역대 최고로 큰 규모의 투자를’, ‘역대 가장 자주’ 받고 있음
  • 12.
    2010 2011 20122013 2014 2015 2016 2017 2018 Q1 Q2 Q3 Q4 153 283 476 647 608 568 684 851 765 FUNDING SNAPSHOT: YEAR OVER YEAR 5 Deal Count $1.4B $1.7B $1.7B $627M $603M$459M $8.2B $6.2B $7.1B $2.9B $2.3B$2.0B $1.2B $11.7B $2.3B Funding surpassed 2017 numbers by almost $3B, making 2018 the fourth consecutive increase in capital investment and largest since we began tracking digital health funding in 2010. Deal volume decreased from Q3 to Q4, but deal sizes spiked, with $3B invested in Q4 alone. Average deal size in 2018 was $21M, a $6M increase from 2017. $3.0B $14.6B DEALS & FUNDING INVESTORS SEGMENT DETAIL Source: StartUp Health Insights | startuphealth.com/insights Note: Report based on public data through 12/31/18 on seed (incl. accelerator), venture, corporate venture, and private equity funding only. © 2019 StartUp Health LLC •글로벌 투자 추이를 보더라도, 2018년 역대 최대 규모: $14.6B •2015년 이후 4년 연속 증가 중 https://hq.startuphealth.com/posts/startup-healths-2018-insights-funding-report-a-record-year-for-digital-health
  • 13.
    5% 8% 24% 27% 36% Life Science &Health Mobile Enterprise & Data Consumer Commerce 9% 13% 23% 24% 31% Life Science & Health Consumer Enterprise Data & AI Others 2014 2015 Investment of GoogleVentures in 2014-2015
  • 14.
    startuphealth.com/reports Firm 2017 YTDDeals Stage Early Mid Late 1 7 1 7 2 6 2 6 3 5 3 5 3 5 3 5 THE TOP INVESTORS OF 2017 YTD We are seeing huge strides in new investors pouring money into the digital health market, however all the top 10 investors of 2017 year to date are either maintaining or increasing their investment activity. Source: StartUp Health Insights | startuphealth.com/insights Note: Report based on public data on seed, venture, corporate venture and private equity funding only. © 2017 StartUp Health LLC DEALS & FUNDING GEOGRAPHY INVESTORSMOONSHOTS 20 •Google Ventures와 Khosla Ventures가 각각 7개로 공동 1위, •GE Ventures와 Accel Partners가 6건으로 공동 2위를 기록
 •GV 가 투자한 기업 •virtual fitness membership network를 만드는 뉴욕의 ClassPass •Remote clinical trial 회사인 Science 37 •Digital specialty prescribing platform ZappRx 등에 투자.
 •Khosla Ventures 가 투자한 기업 •single-molecule 검사 장비를 만드는 TwoPoreGuys •Mabu라는 AI-powered patient engagement robot 을 만드는 Catalia Health에 투자.
  • 15.
    •최근 3년 동안Merck, J&J, GSK 등의 제약사들의 디지털 헬스케어 분야 투자 급증 •2015-2016년 총 22건의 deal (=2010-2014년의 5년간 투자 건수와 동일) •Merck 가 가장 활발: 2009년부터 Global Health Innovation Fund 를 통해 24건 투자 ($5-7M) •GSK 의 경우 2014년부터 6건 (via VC arm, SR One): including Propeller Health
  • 16.
    헬스케어 넓은 의미의 건강관리에는 해당되지만, 디지털 기술이 적용되지 않고, 전문 의료 영역도 아닌 것 예) 운동, 영양, 수면 디지털 헬스케어 건강 관리 중에 디지털 기술이 사용되는 것 예) 사물인터넷, 인공지능, 3D 프린터, VR/AR 모바일 헬스케어 디지털 헬스케어 중 모바일 기술이 사용되는 것 예) 스마트폰, 사물인터넷, SNS 개인 유전정보분석 암유전체, 질병위험도, 보인자, 약물 민감도 예) 웰니스, 조상 분석 헬스케어 관련 분야 구성도(ver 0.6) 의료 질병 예방, 치료, 처방, 관리 등 전문 의료 영역 원격의료 원격 환자 모니터링 원격진료 전화, 화상, 판독 디지털 치료제 당뇨 예방 앱 중독 치료 앱 ADHD 치료게임
  • 17.
    EDITORIAL OPEN Digital medicine,on its way to being just plain medicine npj Digital Medicine (2018)1:20175 ; doi:10.1038/ s41746-017-0005-1 There are already nearly 30,000 peer-reviewed English-language scientific journals, producing an estimated 2.5 million articles a year.1 So why another, and why one focused specifically on digital medicine? To answer that question, we need to begin by defining what “digital medicine” means: using digital tools to upgrade the practice of medicine to one that is high-definition and far more individualized. It encompasses our ability to digitize human beings using biosensors that track our complex physiologic systems, but also the means to process the vast data generated via algorithms, cloud computing, and artificial intelligence. It has the potential to democratize medicine, with smartphones as the hub, enabling each individual to generate their own real world data and being far more engaged with their health. Add to this new imaging tools, mobile device laboratory capabilities, end-to-end digital clinical trials, telemedicine, and one can see there is a remarkable array of transformative technology which lays the groundwork for a new form of healthcare. As is obvious by its definition, the far-reaching scope of digital medicine straddles many and widely varied expertise. Computer scientists, healthcare providers, engineers, behavioral scientists, ethicists, clinical researchers, and epidemiologists are just some of the backgrounds necessary to move the field forward. But to truly accelerate the development of digital medicine solutions in health requires the collaborative and thoughtful interaction between individuals from several, if not most of these specialties. That is the primary goal of npj Digital Medicine: to serve as a cross-cutting resource for everyone interested in this area, fostering collabora- tions and accelerating its advancement. Current systems of healthcare face multiple insurmountable challenges. 
Patients are not receiving the kind of care they want and need, caregivers are dissatisfied with their role, and in most countries, especially the United States, the cost of care is unsustainable. We are confident that the development of new systems of care that take full advantage of the many capabilities that digital innovations bring can address all of these major issues. Researchers too, can take advantage of these leading-edge technologies as they enable clinical research to break free of the confines of the academic medical center and be brought into the real world of participants’ lives. The continuous capture of multiple interconnected streams of data will allow for a much deeper refinement of our understanding and definition of most pheno- types, with the discovery of novel signals in these enormous data sets made possible only through the use of machine learning. Our enthusiasm for the future of digital medicine is tempered by the recognition that presently too much of the publicized work in this field is characterized by irrational exuberance and excessive hype. Many technologies have yet to be formally studied in a clinical setting, and for those that have, too many began and ended with an under-powered pilot program. In addition, there are more than a few examples of digital “snake oil” with substantial uptake prior to their eventual discrediting.2 Both of these practices are barriers to advancing the field of digital medicine. Our vision for npj Digital Medicine is to provide a reliable, evidence-based forum for all clinicians, researchers, and even patients, curious about how digital technologies can transform every aspect of health management and care. 
Being open source, as all medical research should be, allows for the broadest possible dissemination, which we will strongly encourage, including through advocating for the publication of preprints. And finally, quite paradoxically, we hope that npj Digital Medicine is so successful that in the coming years there will no longer be a need for this journal, or any journal specifically focused on digital medicine. Because if we are able to meet our primary goal of accelerating the advancement of digital medicine, then soon, we will just be calling it medicine. And there are already several excellent journals for that. ACKNOWLEDGEMENTS Supported by the National Institutes of Health (NIH)/National Center for Advancing Translational Sciences grant UL1TR001114 and a grant from the Qualcomm Foundation. ADDITIONAL INFORMATION Competing interests: The authors declare no competing financial interests. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Change history: The original version of this Article had an incorrect Article number of 5 and an incorrect Publication year of 2017. These errors have now been corrected in the PDF and HTML versions of the Article. Steven R. Steinhubl1 and Eric J. Topol1 1 Scripps Translational Science Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA 92037, USA Correspondence: Steven R. Steinhubl (steinhub@scripps.edu) or Eric J. Topol (etopol@scripps.edu) REFERENCES 1. Ware, M. & Mabe, M. The STM report: an overview of scientific and scholarly journal publishing 2015 [updated March]. http://digitalcommons.unl.edu/scholcom/92017 (2015). 2. Plante, T. B., Urrea, B., MacFarlane, Z. T. et al. Validation of the instant blood pressure smartphone app. JAMA Intern. Med. 176, 700–702 (2016). 
Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). © The Author(s) 2018. Received: 19 October 2017. Accepted: 25 October 2017. www.nature.com/npjdigitalmed. Published in partnership with the Scripps Translational Science Institute. What is the future of digital medicine? To become everyday medicine.
  • 18.
    What is the most important factor in digital medicine?
  • 19.
    “Data! Data! Data!” he cried. “I can’t make bricks without clay!” - Sherlock Holmes, “The Adventure of the Copper Beeches”
  • 21.
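    New data are measured, stored, integrated, and analyzed in new ways by new actors. New data: types of data, qualitative and quantitative aspects of data. New ways: wearable devices, smartphones, genetic analysis, artificial intelligence, social media. New actors: users/patients, the general public.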
  • 22.
    The three steps of digital healthcare •Step 1. Data measurement •Step 2. Data integration •Step 3. Data analysis
  • 23.
    Digital Healthcare Industry Landscape (ver. 3). Stages: Data Measurement, Data Integration, Data Interpretation, Treatment. Categories: Smartphone Gadget/Apps, DNA, Artificial Intelligence, 2nd Opinion, Wearables / IoT, EMR/EHR, 3D Printer, Counseling, Data Platform, Accelerator/early-VC, Telemedicine, Device, On Demand (O2O), VR. Digital Healthcare Institute, Director Yoon Sup Choi, Ph.D., yoonsup.choi@gmail.com
  • 24.
    Digital Healthcare Industry Landscape (ver. 3). Stages: Data Measurement, Data Integration, Data Interpretation, Treatment. Categories: Smartphone Gadget/Apps, DNA, Artificial Intelligence, 2nd Opinion, Device, On Demand (O2O), Wearables / IoT, EMR/EHR, 3D Printer, Counseling, Data Platform, Accelerator/early-VC, VR, Telemedicine. Digital Healthcare Institute, Director Yoon Sup Choi, Ph.D., yoonsup.choi@gmail.com
  • 25.
  • 26.
    Smartphone: the origin of healthcare innovation
  • 27.
    Smartphone: the origin of healthcare innovation
  • 28.
    2013? The Election of Pope Benedict / The Election of Pope Francis
  • 29.
    The Election of Pope Francis / The Election of Pope Benedict
  • 32.
  • 33.
    Otoscope, dermatoscope, eye disease, skin cancer, parasites, respiratory, ECG, sleep, diet, activity, fever, menstruation/pregnancy
  • 34.
  • 35.
  • 36.
  • 37.
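    “In the video of your left ear, fluid is visible behind the eardrum. The eardrum is not particularly swollen or abnormally shaped, so there does not appear to be severe inflammation. Given that you had difficulty equalizing pressure while scuba diving, it would also be worth being examined in person by a doctor who can test the movement of the eardrum. ...” Illegal in Korea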
  • 38.
  • 40.
  • 41.
  • 42.
  • 45.
    “Your heartbeat is stable, so there is no need to go to the hospital right away. Still, if anything seems abnormal, see a specialist.” Illegal in Korea
  • 48.
  • 50.
    Habitual snoring for about 30 minutes to 1 hour. But how can this be trusted?
  • 51.
    It records the audio. Has analytical validity against PSG (polysomnography) been demonstrated?
  • 52.
    It records the audio. Has analytical validity against PSG (polysomnography) been demonstrated?
  • 53.
  • 54.
  • 57.
  • 58.
    Fig 1. What can consumer wearables do? Heart rate can be measured with an oximeter built into a ring [3], muscle activity with an electromyographic sensor embedded into clothing [4], stress with an electrodermal sensor incorporated into a wristband [5], and physical activity or sleep patterns via an accelerometer in a watch [6,7]. In addition, a female’s most fertile period can be identified with detailed body temperature tracking [8], while levels of mental attention can be monitored with a small number of non-gelled electroencephalogram (EEG) electrodes [9]. Levels of social interaction (also known to a… PLOS Medicine 2016
  • 59.
  • 60.
  • 62.
  • 63.
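    Apple Watch 4: ECG, arrhythmia, and fall detection; FDA medical device clearance •Cleared as a De Novo medical device (a new type of medical device) •Announced in September, but the arrhythmia-related features were activated in December •Available only on US Apple Watches, not in Korea (a watch purchased in the US works with a Korean App Store ID)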
  • 65.
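    •American College of Cardiology’s 68th Annual Scientific Session •Only 0.5% of all study participants received an irregular pulse notification •Using the Apple Watch and an ECG patch at the same time gave a positive predictive value of 71% •84% of those who received an irregular pulse notification were in atrial fibrillation at that moment •Among those who wore an ECG patch for the following week as follow-up, atrial fibrillation was found in 34% •Of those who received an irregular pulse notification, 57% actually sought medical care (0.3% of the entire cohort)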
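    The last bullet on this slide can be cross-checked with one multiplication. The sketch below uses only the two percentages quoted on the slide (0.5% notified, 57% of those seeking care); the variable names are illustrative and are not taken from the study.

```python
# Cross-check of the percentages quoted on the slide (names are illustrative).
notified = 0.005                 # share of all participants who got a notification
sought_care_if_notified = 0.57   # share of notified participants who sought care

# Share of the whole cohort that both got a notification and sought care.
sought_care_overall = notified * sought_care_if_notified

print(f"{sought_care_overall:.1%} of the whole cohort sought care")
```

    The product is 0.285%, which rounds to the 0.3% stated on the slide, so the two figures are consistent with each other.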
  • 66.
  • 67.
  • 70.
  • 71.
  • 73.
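    The healthcare wearable dilemma. Three factors: sustained use (“Do people keep using it?”), user utility (“What do I gain from using it?”), and accuracy. Decision points: Does it need to change behavior? (YES/NO) Does the utility far outweigh the hassle? (YES/NO) Diabetes paradox: “I know it is good for me, but I still don’t use it.” Utility can be expected only once the device is actually used. Types of utility: financial (“it pays me”), medical (“it cures my illness”), entertainment (“it’s fun”), aesthetic (“it’s pretty”), social (“I make friends”), convenience (“payment is easy”). Accuracy: people do not keep using a device for accuracy alone; medical use requires accuracy; insurers need accuracy before they can rely on the data. Yoon Sup Choi Digital Healthcare Institute, Director Yoon Sup Choi, PhD, yoonsup.choi@gmail.com, www.yoonsupchoi.com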
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
    •2003: Human Genome Project, 13 years (676 weeks), $2,700,000,000 •2007: Dr. Craig Venter’s genome, 4 years (208 weeks), $100,000,000 •2008: Dr. James Watson’s genome, 4 months (16 weeks), $1,000,000 •2009: (Nature Biotechnology), 4 weeks, $48,000 •2013: 1-2 weeks, ~$5,000
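    To make the scale of this cost decline concrete, here is a minimal sketch that computes the fold reduction between consecutive milestones. The costs are the ones listed on the slide; the labels and the helper name `fold_reduction` are illustrative, and the 2013 figure is treated as exactly $5,000 although the slide marks it as approximate.

```python
# (year, label, cost in USD) as listed on the slide.
# Labels are illustrative; the 2013 cost is approximate on the slide.
milestones = [
    (2003, "Human Genome Project", 2_700_000_000),
    (2007, "Craig Venter's genome", 100_000_000),
    (2008, "James Watson's genome", 1_000_000),
    (2009, "Nature Biotechnology report", 48_000),
    (2013, "unlabeled on the slide", 5_000),
]


def fold_reduction(earlier_cost, later_cost):
    """How many times cheaper the later cost is than the earlier one."""
    return earlier_cost / later_cost


# Step-by-step reduction between consecutive milestones.
for (y0, _, c0), (y1, label, c1) in zip(milestones, milestones[1:]):
    print(f"{y0} -> {y1} ({label}): {fold_reduction(c0, c1):,.1f}x cheaper")

# Overall reduction from the first to the last milestone.
overall = fold_reduction(milestones[0][2], milestones[-1][2])
print(f"2003 -> 2013 overall: {overall:,.0f}x cheaper")
```

    Under these figures the overall reduction from 2003 to 2013 is 540,000-fold.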
  • 79.
    The $1000 Genome is Already Here!
  • 80.
    • NovaSeq 5000 and 6000 announced in January 2017 • Pledged to bring WES down to $100 within a few years • WES for 60 people in 2 days (under one hour per person)
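    The "under one hour per person" claim follows directly from the throughput figure. A small sketch of the arithmetic, assuming the 2-day run is a full 48 hours (variable names are illustrative):

```python
# Throughput quoted on the slide: 60 samples in 2 days.
run_hours = 2 * 24        # assumes the 2-day run means a full 48 hours
samples_per_run = 60

hours_per_sample = run_hours / samples_per_run
print(f"{hours_per_sample:.1f} hours per sample")
```

    This gives 0.8 hours (48 minutes) of instrument time per sample, averaged over the run.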
  • 82.
    Results within 6-8 weeks. A little spit is all it takes! DTC Genetic Testing (Direct-To-Consumer)
  • 83.
  • 84.
  • 85.
  • 86.
  • 87.
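    Traits: alcohol flush reaction, ability to taste bitterness, earwax type, eye color, curly hair, lactose tolerance, malaria resistance, likelihood of baldness, muscle performance, blood type, norovirus resistance, HIV resistance, likelihood of nicotine addiction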
  • 88.
  • 89.
  • 90.
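    23andMe Chronicle (timeline figure with Business and Regulation tracks; axis years 2006, 2007, 2008, 2012-2019). Business: 23andMe founded (2006); Google Ventures invests $3.6 million; price cut to $399; price cut to $99; TV advertising begins; 300,000 customers; 500,000 customers; 1 million customers; 1.2 million customers; 2 million customers; 5 million customers; 10 million customers (2019); DTC service launched in the UK; DTC service launched in Canada; Genentech and Pfizer purchase 23andMe data; data collection partnership with Apple ResearchKit; plan for in-house drug development announced; $115m funding (unicorn status); $300M investment from GSK. Regulation: 510(k) submitted to the FDA; FDA 510(k) withdrawn; FDA orders a halt to sales; FDA approval requested for the Bloom syndrome test; FDA authorizes the Bloom syndrome DTC service; DTC service resumed, including carrier status ($199); FDA authorizes DTC disease risk tests and establishes a process for exempting related products from review (2017); FDA Pre-Cert: Commissioner Gottlieb proposes Pre-Cert for DTC disease risk genetic tests; BRCA 1/2 DTC test permitted; FDA Pre-Cert for DTC disease risk genetic tests takes effect (2018). Digital Healthcare Institute, Director Yoon Sup Choi, PhD, yoonsup.choi@gmail.com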
  • 91.
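    •Pre-Cert began to be applied to DTC genetic testing services for disease risk (June 5, 2018) •If a company demonstrates analytical validity of 99% or higher just once, at the outset, •the company is recognized as capable of building accurate genetic analysis services, •and its subsequent services are exempt from premarket review •However, four potentially sensitive types of analysis are excluded from this deregulation: •prenatal diagnosis •cancer predisposition tests (those leading to preventive screening or treatment decisions) •pharmacogenomic tests •tests for genetic factors of dominantly inherited diseases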
  • 92.
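    Korea: limited approval of DTC genetic testing (June 30, 2016) • “Notice permitting direct genetic testing by non-medical institutions enacted, effective June 30” • Pursued as a follow-up to the December 2015 amendment of the Bioethics and Safety Act (amended Dec. 29, 2015; effective June 30, 2016) and to the regulatory improvements announced at the 9th Trade and Investment Promotion Meeting (February 2016) • Private genetic testing companies may directly test 46 genes related to 12 test items, including blood glucose, blood pressure, skin aging, and body mass index http://www.mohw.go.kr/m/noticeView.jsp?MENU_ID=0403&cont_seq=333112&page=1
    Test item (number of genes): gene names
    1. Body mass index (3): FTO, MC4R, BDNF
    2. Triglyceride level (8): GCKR, DOCK7, ANGPTL3, BAZ1B, TBL2, MLXIPL, LOC105375745, TRIB1
    3. Cholesterol (8): CELSR2, SORT1, HMGCR, ABO, ABCA1, MYL2, LIPG, CETP
    4. Blood glucose (8): CDKN2A/B, G6PC2, GCK, GCKR, GLIS3, MTNR1B, DGKB-TMEM195, SLC30A8
    5. Blood pressure (8): NPR3, ATP2B1, NT5C2, CSK, HECTD4, GUCY1A3, CYP17A1, FGF5
    6. Pigmentation (2): OCA2, MC1R
    7. Hair loss (3): chr20p11 (rs1160312, rs2180439), IL2RA, HLA-DQB1
    8. Hair thickness (1): EDAR
    9. Skin aging (1): AGER
    10. Skin elasticity (1): MMP1
    11. Vitamin C level (1): SLC23A1 (SVCT1)
    12. Caffeine metabolism (2): AHR, CYP1A1-CYP1A2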
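    The headline figures on this slide (12 test items, 46 genes) can be checked against the per-item gene counts printed in the table. A minimal sketch, with English item names chosen here for illustration:

```python
# Number of genes per test item, as printed in the table on the slide.
# The English item names are illustrative translations.
genes_per_item = {
    "body mass index": 3,
    "triglyceride level": 8,
    "cholesterol": 8,
    "blood glucose": 8,
    "blood pressure": 8,
    "pigmentation": 2,
    "hair loss": 3,
    "hair thickness": 1,
    "skin aging": 1,
    "skin elasticity": 1,
    "vitamin C level": 1,
    "caffeine metabolism": 2,
}

total_items = len(genes_per_item)
total_genes = sum(genes_per_item.values())
print(total_items, total_genes)  # prints "12 46"
```

    The per-item counts add up to the stated 46. Note that GCKR is listed under both triglyceride level and blood glucose, so 46 is a per-item count rather than a count of distinct gene names.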
  • 93.
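    https://www.23andme.com/slideshow/research/ Genetic research driven by customers’ voluntary participation. Example questions: When you clasp your hands, which thumb ends up on top? Are you a morning person or a night owl? Do you sneeze when exposed to light? Muscle performance; ability to perceive bitter taste; does your face flush after drinking alcohol? Lactase deficiency? 81% of customers voluntarily answer 10 or more questions. 1 million data points accumulate every week. The More Data, The Higher Accuracy!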
  • 94.
    January 13, 2015; January 6, 2015. Data Business
  • 95.
    Step 1. Data measurement •Smartphones •Wearable devices •Personal genome analysis. Patient-generated health data (PGHD)
  • 96.
  • 98.
  • 100.
  • 101.
  • 102.
    Diagram labels: Epic MyChart; Epic EHR; Dexcom CGM; Patients/User; Devices; EHR; Hospital; Withings + Apple Watch; Apps; HealthKit
  • 105.
  • 106.
    Hospital A, Hospital B, Hospital C: interoperability
  • 107.
  • 108.
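    •At launch in January 2018, it was connected to 12 hospitals, including Johns Hopkins and UC San Diego •(As of February 2019) connected to more than 200 hospitals within one year •Integration with the VA has also been announced (with 9 million veterans) •Google Health, launched in 2008, connected to only 12 hospitals in 3 years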
  • 109.
  • 111.
  • 112.
    How to Analyze and Interpret the Big Data?
  • 113.
    Two ways to get insights from the big data (and/or)
  • 114.
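    Telemedicine • Korea appears to be the only country where it is ‘explicitly’ and ‘entirely’ ‘banned’ • Overseas, a large share of new services include telemedicine features • 39 of the top 100 global healthcare services include telemedicine • New models keep emerging as it is combined with other models • Combined with smartphones, wearables, IoT, artificial intelligence, chatbots, and more • What about Korean healthcare 10 years from now?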
  • 115.
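    Terminology: telemedicine; remote consultation; remote patient monitoring; video consultation; telephone consultation; second opinion; data interpretation; telesurgery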
  • 116.
    Telemedicine comes in many forms. •Remote consultation: video visits •Remote consultation: second opinions •Remote consultation: applications •Remote patient monitoring
  • 117.
    Telemedicine comes in many forms. •Remote consultation: video visits •Remote consultation: second opinions •Remote consultation: applications •Remote patient monitoring
  • 118.
  • 122.
    [Figure: Average Time to Appointment (Family Medicine), in days, for 2009, 2014, and 2017. Cities: Boston, LA, Portland, Miami, Atlanta, Denver, Detroit, New York, Seattle, Houston, Philadelphia, Washington DC, San Diego, Dallas, Minneapolis, plus Total. Horizontal axis: 0 to 120 days.]
  • 126.
    Growth of Teladoc, 2013-2018 (three bar charts; values listed from 2013 to 2018) •Revenue ($m): 20, 44, 77.4, 123, 233.3, 417.9 •Visits (k): 127, 299, 575, 952, 1,461, 2,036 •Members (m): 6.2, 8.1, 11.5, 17.5, 19.6, 22.8
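    One way to summarize these three charts is the compound annual growth rate (CAGR) of each series. The sketch below assumes the smallest value in each chart belongs to 2013 and the largest to 2018, which is an interpretation of the flattened figure rather than something the slide states; the function name `cagr` and the variable names are illustrative.

```python
# Chart values, assumed to run from 2013 (first) to 2018 (last).
years = list(range(2013, 2019))
revenue_musd = [20, 44, 77.4, 123, 233.3, 417.9]   # revenue, $ million
visits_k = [127, 299, 575, 952, 1461, 2036]        # visits, thousands
members_m = [6.2, 8.1, 11.5, 17.5, 19.6, 22.8]     # members, millions


def cagr(first, last, periods):
    """Compound annual growth rate over the given number of yearly periods."""
    return (last / first) ** (1 / periods) - 1


periods = len(years) - 1  # five year-over-year steps between 2013 and 2018
revenue_cagr = cagr(revenue_musd[0], revenue_musd[-1], periods)
visits_cagr = cagr(visits_k[0], visits_k[-1], periods)
members_cagr = cagr(members_m[0], members_m[-1], periods)

for name, rate in [("revenue", revenue_cagr),
                   ("visits", visits_cagr),
                   ("members", members_cagr)]:
    print(f"{name}: {rate:.1%} per year")

# Utilization in the last year: visits per 1,000 members.
visits_per_1000_members = visits_k[-1] * 1_000 / (members_m[-1] * 1_000_000) * 1_000
print(f"{visits_per_1000_members:.0f} visits per 1,000 members in {years[-1]}")
```

    Under this reading, revenue and visits grew much faster than membership, which suggests rising utilization per member rather than growth from enrollment alone.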
  • 127.
    Telemedicine comes in many forms. •Remote consultation: video visits •Remote consultation: second opinions •Remote consultation: applications •Remote patient monitoring
  • 128.
    Diagram labels: Epic MyChart; Epic EHR; Dexcom CGM; Patients/User; Devices; EHR; Hospital; Withings + Apple Watch; Apps; HealthKit
  • 129.
    transfer from Share2 to HealthKit as mandated by Dexcom receiver Food and Drug Administration device classification. Once the glucose values reach HealthKit, they are passively shared with the Epic MyChart app (https://www.epic.com/software-phr.php). The MyChart patient portal is a component of the Epic EHR and uses the same database, and the CGM values populate a standard glucose flowsheet in the patient’s chart. This connection is initially established when a provider places an order in a patient’s electronic chart, resulting in a request to the patient within the MyChart app. Once the patient or patient proxy (parent) accepts this connection request on the mobile device, a communication bridge is established between HealthKit and MyChart enabling population of CGM data as frequently as every 5 … Participation required confirmation of Bluetooth pairing of the CGM receiver to a mobile device, updating the mobile device with the most recent version of the operating system, Dexcom Share2 app, Epic MyChart app, and confirming or establishing a username and password for all accounts, including a parent’s/adolescent’s Epic MyChart account. Setup time averaged 45–60 minutes in addition to the scheduled clinic visit. During this time, there was specific verbal and written notification to the patients/parents that the diabetes healthcare team would not be actively monitoring or have real-time access to CGM data, which was out of scope for this pilot. The patients/parents were advised that they should continue to contact the diabetes care team by established means for any urgent questions/concerns. Additionally, patients/parents were advised to maintain updates … Figure 1: Overview of the CGM data communication bridge architecture. BRIEF COMMUNICATION. Kumar R B, et al. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv206 •Continuous glucose monitoring data from Apple HealthKit and a Dexcom CGM device were integrated with the EHR •The study reports improved glucose management in patients with diabetes •Conducted by Stanford Children’s Health and Stanford School of Medicine with 10 pediatric patients with type 1 diabetes (288 readings/day) •EHR-based data analysis and visualization improved data review and patient communication •Compared with the conventional model of in-person visits, patients can respond to glucose changes in real time. JAMIA 2016. Remote Patients Monitoring via Dexcom-HealthKit-Epic-Stanford
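    The "288 readings/day" figure on this slide implies a fixed sampling interval, which can be worked out directly. A minimal sketch (variable names are illustrative; the 10-patient count is the one given on the slide):

```python
# Sampling interval implied by the readings-per-day figure on the slide.
MINUTES_PER_DAY = 24 * 60
readings_per_day = 288
patients = 10

interval_minutes = MINUTES_PER_DAY / readings_per_day
daily_readings_all_patients = readings_per_day * patients

print(f"one reading every {interval_minutes:.0f} minutes")
print(f"{daily_readings_all_patients} readings per day across {patients} patients")
```

    This works out to one glucose reading every 5 minutes per patient, in line with the excerpt's statement that the bridge populates CGM data "as frequently as every 5".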
  • 130.
    Some in the medical community are even calling for remote patient monitoring to be legalized.
  • 132.
    No choice but to bring AI into medicine
  • 133.
    Martin Duggan, “IBM Watson Health - Integrated Care & the Evolution to Cognitive Computing”
  • 134.
    Copyright 2016 American Medical Association. All rights reserved. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. Varun Gulshan, PhD; Lily Peng, MD, PhD; Marc Coram, PhD; Martin C. Stumpe, PhD; Derek Wu, BS; Arunachalam Narayanaswamy, PhD; Subhashini Venugopalan, MS; Kasumi Widner, MS; Tom Madams, MEng; Jorge Cuadros, OD, PhD; Ramasamy Kim, OD, DNB; Rajiv Raman, MS, DNB; Philip C. Nelson, BS; Jessica L. Mega, MD, MPH; Dale R. Webster, PhD IMPORTANCE Deep learning is a family of computational methods that allow an algorithm to program itself by learning from a large set of examples that demonstrate the desired behavior, removing the need to specify rules explicitly. Application of these methods to medical imaging requires further assessment and validation. OBJECTIVE To apply deep learning to create an algorithm for automated detection of diabetic retinopathy and diabetic macular edema in retinal fundus photographs. DESIGN AND SETTING A specific type of neural network optimized for image classification called a deep convolutional neural network was trained using a retrospective development data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy, diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists and ophthalmology senior residents between May and December 2015. The resultant algorithm was validated in January and February 2016 using 2 separate data sets, both graded by at least 7 US board-certified ophthalmologists with high intragrader consistency. EXPOSURE Deep learning–trained algorithm. MAIN OUTCOMES AND MEASURES The sensitivity and specificity of the algorithm for detecting referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy, referable diabetic macular edema, or both, were generated based on the reference standard of the majority decision of the ophthalmologist panel. 
The algorithm was evaluated at 2 operating points selected from the development set, one selected for high specificity and another for high sensitivity. RESULTS The EyePACS-1 data set consisted of 9963 images from 4997 patients (mean age, 54.4 years; 62.2% women; prevalence of RDR, 683/8878 fully gradable images [7.8%]); the Messidor-2 data set had 1748 images from 874 patients (mean age, 57.6 years; 42.6% women; prevalence of RDR, 254/1745 fully gradable images [14.6%]). For detecting RDR, the algorithm had an area under the receiver operating curve of 0.991 (95% CI, 0.988-0.993) for EyePACS-1 and 0.990 (95% CI, 0.986-0.995) for Messidor-2. Using the first operating cut point with high specificity, for EyePACS-1, the sensitivity was 90.3% (95% CI, 87.5%-92.7%) and the specificity was 98.1% (95% CI, 97.8%-98.5%). For Messidor-2, the sensitivity was 87.0% (95% CI, 81.1%-91.0%) and the specificity was 98.5% (95% CI, 97.7%-99.1%). Using a second operating point with high sensitivity in the development set, for EyePACS-1 the sensitivity was 97.5% and specificity was 93.4% and for Messidor-2 the sensitivity was 96.1% and specificity was 93.9%. CONCLUSIONS AND RELEVANCE In this evaluation of retinal fundus photographs from adults with diabetes, an algorithm based on deep machine learning had high sensitivity and specificity for detecting referable diabetic retinopathy. Further research is necessary to determine the feasibility of applying this algorithm in the clinical setting and to determine whether use of the algorithm could lead to improved care and outcomes compared with current ophthalmologic assessment. JAMA. doi:10.1001/jama.2016.17216 Published online November 29, 2016. 
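    The abstract reports sensitivity and specificity, but what a positive or negative result means for an individual patient also depends on prevalence. The sketch below applies Bayes' rule to the operating points and the rounded prevalences quoted in the abstract. The resulting predictive values are derived here for illustration and are not reported in the abstract; they would only hold in a population with the same prevalence. The function and variable names are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# (sensitivity, specificity, prevalence) taken from the abstract;
# prevalence uses the rounded percentages it quotes (7.8% and 14.6%).
operating_points = {
    "EyePACS-1, high specificity": (0.903, 0.981, 0.078),
    "EyePACS-1, high sensitivity": (0.975, 0.934, 0.078),
    "Messidor-2, high specificity": (0.870, 0.985, 0.146),
    "Messidor-2, high sensitivity": (0.961, 0.939, 0.146),
}

results = {name: predictive_values(*params) for name, params in operating_points.items()}
for name, (ppv, npv) in results.items():
    print(f"{name}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

    Because referable diabetic retinopathy is uncommon in both data sets, the high-sensitivity operating points trade a noticeably lower positive predictive value for fewer missed cases, which is the usual design choice for a screening tool.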
Author Affiliations: Google Inc, Mountain View, California (Gulshan, Peng, Coram, Stumpe, Wu, Narayanaswamy, Venugopalan, Widner, Madams, Nelson, Webster); Department of Computer Science, University of Texas, Austin (Venugopalan); EyePACS LLC, San Jose, California (Cuadros); School of Optometry, Vision Science Graduate Group, University of California, Berkeley (Cuadros); Aravind Medical Research Foundation, Aravind Eye Care System, Madurai, India (Kim); Shri Bhagwan Mahavir Vitreoretinal Services, Sankara Nethralaya, Chennai, Tamil Nadu, India (Raman); Verily Life Sciences, Mountain View, California (Mega); Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts (Mega). Corresponding Author: Lily Peng, MD, PhD, Google Research, 1600 Amphitheatre Way, Mountain View, CA 94043 (lhpeng@google.com). Research. JAMA | Original Investigation | INNOVATIONS IN HEALTH CARE DELIVERY. Ophthalmology. LETTERS https://doi.org/10.1038/s41591-018-0335-9 1 Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China. 2 Institute for Genomic Medicine, Institute of Engineering in Medicine, and Shiley Eye Institute, University of California, San Diego, La Jolla, CA, USA. 3 Hangzhou YITU Healthcare Technology Co. Ltd, Hangzhou, China. 4 Department of Thoracic Surgery/Oncology, First Affiliated Hospital of Guangzhou Medical University, China State Key Laboratory and National Clinical Research Center for Respiratory Disease, Guangzhou, China. 5 Guangzhou Kangrui Co. Ltd, Guangzhou, China. 6 Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, China. 7 Veterans Administration Healthcare System, San Diego, CA, USA. 8 These authors contributed equally: Huiying Liang, Brian Tsui, Hao Ni, Carolina C. S. 
Valentim, Sally L. Baxter, Guangjian Liu. *e-mail: kang.zhang@gmail.com; xiahumin@hotmail.com Artificial intelligence (AI)-based methods have emerged as powerful tools to transform medical care. Although machine learning classifiers (MLCs) have already demonstrated strong performance in image-based diagnoses, analysis of diverse and massive electronic health record (EHR) data remains chal- lenging. Here, we show that MLCs can query EHRs in a manner similar to the hypothetico-deductive reasoning used by physi- cians and unearth associations that previous statistical meth- ods have not found. Our model applies an automated natural language processing system using deep learning techniques to extract clinically relevant information from EHRs. In total, 101.6 million data points from 1,362,559 pediatric patient visits presenting to a major referral center were analyzed to train and validate the framework. Our model demonstrates high diagnostic accuracy across multiple organ systems and is comparable to experienced pediatricians in diagnosing com- mon childhood diseases. Our study provides a proof of con- cept for implementing an AI-based system as a means to aid physicians in tackling large amounts of data, augmenting diag- nostic evaluations, and to provide clinical decision support in cases of diagnostic uncertainty or complexity. Although this impact may be most evident in areas where healthcare provid- ers are in relative shortage, the benefits of such an AI system are likely to be universal. Medical information has become increasingly complex over time. The range of disease entities, diagnostic testing and biomark- ers, and treatment modalities has increased exponentially in recent years. Subsequently, clinical decision-making has also become more complex and demands the synthesis of decisions from assessment of large volumes of data representing clinical information. 
In the current digital age, the electronic health record (EHR) represents a massive repository of electronic data points representing a diverse array of clinical information1–3 . Artificial intelligence (AI) methods have emerged as potentially powerful tools to mine EHR data to aid in disease diagnosis and management, mimicking and perhaps even augmenting the clinical decision-making of human physicians1 . To formulate a diagnosis for any given patient, physicians fre- quently use hypotheticodeductive reasoning. Starting with the chief complaint, the physician then asks appropriately targeted questions relating to that complaint. From this initial small feature set, the physician forms a differential diagnosis and decides what features (historical questions, physical exam findings, laboratory testing, and/or imaging studies) to obtain next in order to rule in or rule out the diagnoses in the differential diagnosis set. The most use- ful features are identified, such that when the probability of one of the diagnoses reaches a predetermined level of acceptability, the process is stopped, and the diagnosis is accepted. It may be pos- sible to achieve an acceptable level of certainty of the diagnosis with only a few features without having to process the entire feature set. Therefore, the physician can be considered a classifier of sorts. In this study, we designed an AI-based system using machine learning to extract clinically relevant features from EHR notes to mimic the clinical reasoning of human physicians. In medicine, machine learning methods have already demonstrated strong per- formance in image-based diagnoses, notably in radiology2 , derma- tology4 , and ophthalmology5–8 , but analysis of EHR data presents a number of difficult challenges. These challenges include the vast quantity of data, high dimensionality, data sparsity, and deviations Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence Huiying Liang1,8 , Brian Y. 
Tsui 2,8 , Hao Ni3,8 , Carolina C. S. Valentim4,8 , Sally L. Baxter 2,8 , Guangjian Liu1,8 , Wenjia Cai 2 , Daniel S. Kermany1,2 , Xin Sun1 , Jiancong Chen2 , Liya He1 , Jie Zhu1 , Pin Tian2 , Hua Shao2 , Lianghong Zheng5,6 , Rui Hou5,6 , Sierra Hewett1,2 , Gen Li1,2 , Ping Liang3 , Xuan Zang3 , Zhiqi Zhang3 , Liyan Pan1 , Huimin Cai5,6 , Rujuan Ling1 , Shuhua Li1 , Yongwang Cui1 , Shusheng Tang1 , Hong Ye1 , Xiaoyan Huang1 , Waner He1 , Wenqing Liang1 , Qing Zhang1 , Jianmin Jiang1 , Wei Yu1 , Jianqun Gao1 , Wanxing Ou1 , Yingmin Deng1 , Qiaozhen Hou1 , Bei Wang1 , Cuichan Yao1 , Yan Liang1 , Shu Zhang1 , Yaou Duan2 , Runze Zhang2 , Sarah Gibson2 , Charlotte L. Zhang2 , Oulan Li2 , Edward D. Zhang2 , Gabriel Karin2 , Nathan Nguyen2 , Xiaokang Wu1,2 , Cindy Wen2 , Jie Xu2 , Wenqin Xu2 , Bochu Wang2 , Winston Wang2 , Jing Li1,2 , Bianca Pizzato2 , Caroline Bao2 , Daoman Xiang1 , Wanting He1,2 , Suiqin He2 , Yugui Zhou1,2 , Weldon Haw2,7 , Michael Goldbaum2 , Adriana Tremoulet2 , Chun-Nan Hsu 2 , Hannah Carter2 , Long Zhu3 , Kang Zhang 1,2,7 * and Huimin Xia 1 * NATURE MEDICINE | www.nature.com/naturemedicine Pediatrics. ARTICLES https://doi.org/10.1038/s41591-018-0177-5 1 Applied Bioinformatics Laboratories, New York University School of Medicine, New York, NY, USA. 2 Skirball Institute, Department of Cell Biology, New York University School of Medicine, New York, NY, USA. 3 Department of Pathology, New York University School of Medicine, New York, NY, USA. 4 School of Mechanical Engineering, National Technical University of Athens, Zografou, Greece. 5 Institute for Systems Genetics, New York University School of Medicine, New York, NY, USA. 6 Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY, USA. 7 Center for Biospecimen Research and Development, New York University, New York, NY, USA. 
8 Department of Population Health and the Center for Healthcare Innovation and Delivery Science, New York University School of Medicine, New York, NY, USA. 9 These authors contributed equally to this work: Nicolas Coudray, Paolo Santiago Ocampo. *e-mail: narges.razavian@nyumc.org; aristotelis.tsirigos@nyumc.org A ccording to the American Cancer Society and the Cancer Statistics Center (see URLs), over 150,000 patients with lung cancer succumb to the disease each year (154,050 expected for 2018), while another 200,000 new cases are diagnosed on a yearly basis (234,030 expected for 2018). It is one of the most widely spread cancers in the world because of not only smoking, but also exposure to toxic chemicals like radon, asbestos and arsenic. LUAD and LUSC are the two most prevalent types of non–small cell lung cancer1 , and each is associated with discrete treatment guidelines. In the absence of definitive histologic features, this important distinc- tion can be challenging and time-consuming, and requires confir- matory immunohistochemical stains. Classification of lung cancer type is a key diagnostic process because the available treatment options, including conventional chemotherapy and, more recently, targeted therapies, differ for LUAD and LUSC2 . Also, a LUAD diagnosis will prompt the search for molecular biomarkers and sensitizing mutations and thus has a great impact on treatment options3,4 . For example, epidermal growth factor receptor (EGFR) mutations, present in about 20% of LUAD, and anaplastic lymphoma receptor tyrosine kinase (ALK) rearrangements, present in<5% of LUAD5 , currently have tar- geted therapies approved by the Food and Drug Administration (FDA)6,7 . Mutations in other genes, such as KRAS and tumor pro- tein P53 (TP53) are very common (about 25% and 50%, respec- tively) but have proven to be particularly challenging drug targets so far5,8 . Lung biopsies are typically used to diagnose lung cancer type and stage. 
Virtual microscopy of stained images of tissues is typically acquired at magnifications of 20×to 40×, generating very large two-dimensional images (10,000 to>100,000 pixels in each dimension) that are oftentimes challenging to visually inspect in an exhaustive manner. Furthermore, accurate interpretation can be difficult, and the distinction between LUAD and LUSC is not always clear, particularly in poorly differentiated tumors; in this case, ancil- lary studies are recommended for accurate classification9,10 . To assist experts, automatic analysis of lung cancer whole-slide images has been recently studied to predict survival outcomes11 and classifica- tion12 . For the latter, Yu et al.12 combined conventional thresholding and image processing techniques with machine-learning methods, such as random forest classifiers, support vector machines (SVM) or Naive Bayes classifiers, achieving an AUC of ~0.85 in distinguishing normal from tumor slides, and ~0.75 in distinguishing LUAD from LUSC slides. More recently, deep learning was used for the classi- fication of breast, bladder and lung tumors, achieving an AUC of 0.83 in classification of lung tumor types on tumor slides from The Cancer Genome Atlas (TCGA)13 . Analysis of plasma DNA values was also shown to be a good predictor of the presence of non–small cell cancer, with an AUC of ~0.94 (ref. 14 ) in distinguishing LUAD from LUSC, whereas the use of immunochemical markers yields an AUC of ~0.94115 . Here, we demonstrate how the field can further benefit from deep learning by presenting a strategy based on convolutional neural networks (CNNs) that not only outperforms methods in previously Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning Nicolas Coudray 1,2,9 , Paolo Santiago Ocampo3,9 , Theodore Sakellaropoulos4 , Navneet Narula3 , Matija Snuderl3 , David Fenyö5,6 , Andre L. 
Moreira3,7 , Narges Razavian 8 * and Aristotelis Tsirigos 1,3 * Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and subtype of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep convolutional neural network (inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most commonly mutated genes in LUAD. We found that six of them (STK11, EGFR, FAT1, SETBP1, KRAS and TP53) can be predicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH. NATURE MEDICINE | www.nature.com/naturemedicine Pathology. ARTICLES https://doi.org/10.1038/s41551-018-0301-3 1 Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, China. 2 Shanghai Wision AI Co., Ltd, Shanghai, China. 3 Beth Israel Deaconess Medical Center and Harvard Medical School, Center for Advanced Endoscopy, Boston, MA, USA. *e-mail: gary.samsph@gmail.com Colonoscopy is the gold-standard screening test for colorectal cancer1–3 , one of the leading causes of cancer death in both the United States4,5 and China6 . 
Colonoscopy can reduce the risk of death from colorectal cancer through the detection of tumours at an earlier, more treatable stage as well as through the removal of precancerous adenomas3,7 . Conversely, failure to detect adenomas may lead to the development of interval cancer. Evidence has shown that each 1.0% increase in adenoma detection rate (ADR) leads to a 3.0% decrease in the risk of interval colorectal cancer8 . Although more than 14million colonoscopies are performed in the United States annually2 , the adenoma miss rate (AMR) is estimated to be 6–27%9 . Certain polyps may be missed more fre- quently, including smaller polyps10,11 , flat polyps12 and polyps in the left colon13 . There are two independent reasons why a polyp may be missed during colonoscopy: (i) it was never in the visual field or (ii) it was in the visual field but not recognized. Several hardware innovations have sought to address the first problem by improv- ing visualization of the colonic lumen, for instance by providing a larger, panoramic camera view, or by flattening colonic folds using a distal-cap attachment. The problem of unrecognized polyps within the visual field has been more difficult to address14 . Several studies have shown that observation of the video monitor by either nurses or gastroenterology trainees may increase polyp detection by up to 30%15–17 . Ideally, a real-time automatic polyp-detection system could serve as a similarly effective second observer that could draw the endoscopist’s eye, in real time, to concerning lesions, effec- tively creating an ‘extra set of eyes’ on all aspects of the video data with fidelity. Although automatic polyp detection in colonoscopy videos has been an active research topic for the past 20 years, per- formance levels close to that of the expert endoscopist18–20 have not been achieved. 
Early work in automatic polyp detection has focused on applying deep-learning techniques to polyp detection, but most published works are small in scale, with small development and/or training validation sets19,20.

Here, we report the development and validation of a deep-learning algorithm, integrated with a multi-threaded processing system, for the automatic detection of polyps during colonoscopy. We validated the system in two image studies and two video studies. Each study contained two independent validation datasets.

Results

We developed a deep-learning algorithm using 5,545 colonoscopy images from colonoscopy reports of 1,290 patients that underwent a colonoscopy examination in the Endoscopy Center of Sichuan Provincial People’s Hospital between January 2007 and December 2015. Out of the 5,545 images used, 3,634 images contained polyps (65.54%) and 1,911 images did not contain polyps (34.46%). For algorithm training, experienced endoscopists annotated the presence of each polyp in all of the images in the development dataset. We validated the algorithm on four independent datasets. Datasets A and B were used for image analysis, and datasets C and D were used for video analysis.

Dataset A contained 27,113 colonoscopy images from colonoscopy reports of 1,138 consecutive patients who underwent a colonoscopy examination in the Endoscopy Center of Sichuan Provincial People’s Hospital between January and December 2016 and who were found to have at least one polyp. Out of the 27,113 images, 5,541 images contained polyps (20.44%) and 21,572 images did not contain polyps (79.56%). All polyps were confirmed histologically after biopsy. Dataset B is a public database (CVC-ClinicDB;

Development and validation of a deep-learning algorithm for the detection of polyps during colonoscopy

Pu Wang1, Xiao Xiao2, Jeremy R. Glissen Brown3, Tyler M.
Berzin3, Mengtian Tu1, Fei Xiong1, Xiao Hu1, Peixi Liu1, Yan Song1, Di Zhang1, Xue Yang1, Liangping Li1, Jiong He2, Xin Yi2, Jingjia Liu2 and Xiaogang Liu1*

The detection and removal of precancerous polyps via colonoscopy is the gold standard for the prevention of colon cancer. However, the detection rate of adenomatous polyps can vary significantly among endoscopists. Here, we show that a machine-learning algorithm can detect polyps in clinical colonoscopies, in real time and with high sensitivity and specificity. We developed the deep-learning algorithm by using data from 1,290 patients, and validated it on newly collected 27,113 colonoscopy images from 1,138 patients with at least one detected polyp (per-image-sensitivity, 94.38%; per-image-specificity, 95.92%; area under the receiver operating characteristic curve, 0.984), on a public database of 612 polyp-containing images (per-image-sensitivity, 88.24%), on 138 colonoscopy videos with histologically confirmed polyps (per-image-sensitivity of 91.64%; per-polyp-sensitivity, 100%), and on 54 unaltered full-range colonoscopy videos without polyps (per-image-specificity, 95.40%). By using a multi-threaded processing system, the algorithm can process at least 25 frames per second with a latency of 76.80 ± 5.60 ms in real-time video analysis. The software may aid endoscopists while performing colonoscopies, and help assess differences in polyp and adenoma detection performance among endoscopists.

NATURE BIOMEDICAL ENGINEERING | VOL 2 | OCTOBER 2018 | 741–748 | www.nature.com/natbiomedeng

Gastroenterology

Wang P, et al. Gut 2019;0:1–7.
doi:10.1136/gutjnl-2018-317500

Endoscopy

ORIGINAL ARTICLE

Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study

Pu Wang,1 Tyler M Berzin,2 Jeremy Romek Glissen Brown,2 Shishira Bharadwaj,2 Aymeric Becq,2 Xun Xiao,1 Peixi Liu,1 Liangping Li,1 Yan Song,1 Di Zhang,1 Yi Li,1 Guangre Xu,1 Mengtian Tu,1 Xiaogang Liu1

To cite: Wang P, Berzin TM, Glissen Brown JR, et al. Gut Epub ahead of print: [please include Day Month Year]. doi:10.1136/gutjnl-2018-317500

► Additional material is published online only. To view please visit the journal online (http://dx.doi.org/10.1136/gutjnl-2018-317500).

1 Department of Gastroenterology, Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, China
2 Center for Advanced Endoscopy, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA

Correspondence to Xiaogang Liu, Department of Gastroenterology, Sichuan Academy of Medical Sciences and Sichuan Provincial People’s Hospital, Chengdu, China; Gary.samsph@gmail.com

Received 30 August 2018. Revised 4 February 2019. Accepted 13 February 2019.

© Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

ABSTRACT

Objective: The effect of colonoscopy on colorectal cancer mortality is limited by several factors, among them a certain miss rate, leading to limited adenoma detection rates (ADRs). We investigated the effect of an automatic polyp detection system based on deep learning on polyp detection rate and ADR.

Design: In an open, non-blinded trial, consecutive patients were prospectively randomised to undergo diagnostic colonoscopy with or without assistance of a real-time automatic polyp detection system providing a simultaneous visual notice and sound alarm on polyp detection. The primary outcome was ADR.
Results: Of 1058 patients included, 536 were randomised to standard colonoscopy, and 522 were randomised to colonoscopy with computer-aided diagnosis. The artificial intelligence (AI) system significantly increased ADR (29.1% vs 20.3%, p<0.001) and the mean number of adenomas per patient (0.53 vs 0.31, p<0.001). This was due to a higher number of diminutive adenomas found (185 vs 102; p<0.001), while there was no statistical difference in larger adenomas (77 vs 58, p=0.075). In addition, the number of hyperplastic polyps was also significantly increased (114 vs 52, p<0.001).

Conclusions: In a low prevalent ADR population, an automatic polyp detection system during colonoscopy resulted in a significant increase in the number of diminutive adenomas detected, as well as an increase in the rate of hyperplastic polyps. The cost–benefit ratio of such effects has to be determined further.

Trial registration number: ChiCTR-DDD-17012221; Results.

INTRODUCTION

Colorectal cancer (CRC) is the second and third-leading causes of cancer-related deaths in men and women respectively.1 Colonoscopy is the gold standard for screening CRC.2 3 Screening colonoscopy has allowed for a reduction in the incidence and mortality of CRC via the detection and removal of adenomatous polyps.4–8 Additionally, there is evidence that with each 1.0% increase in adenoma detection rate (ADR), there is an associated 3.0% decrease in the risk of interval CRC.9 10 However, polyps can be missed, with reported miss rates of up to 27% due to both polyp and operator characteristics.11 12

Unrecognised polyps within the visual field is an important problem to address.11 Several studies have shown that assistance by a second observer increases the polyp detection rate (PDR), but such a strategy remains controversial in terms of increasing the ADR.13–15 Ideally, a real-time automatic polyp detection system, with performance close to that of expert endoscopists, could assist the endoscopist in detecting lesions that
might correspond to adenomas in a more consistent and reliable way

Significance of this study

What is already known on this subject?
► Colorectal adenoma detection rate (ADR) is regarded as a main quality indicator of (screening) colonoscopy and has been shown to correlate with interval cancers. Reducing adenoma miss rates by increasing ADR has been a goal of many studies focused on imaging techniques and mechanical methods.
► Artificial intelligence has been recently introduced for polyp and adenoma detection as well as differentiation and has shown promising results in preliminary studies.

What are the new findings?
► This represents the first prospective randomised controlled trial examining an automatic polyp detection during colonoscopy and shows an increase of ADR by 50%, from 20% to 30%.
► This effect was mainly due to a higher rate of small adenomas found.
► The detection rate of hyperplastic polyps was also significantly increased.

How might it impact on clinical practice in the foreseeable future?
► Automatic polyp and adenoma detection could be the future of diagnostic colonoscopy in order to achieve stable high adenoma detection rates.
► However, the effect on ultimate outcome is still unclear, and further improvements such as polyp differentiation have to be implemented.

Gut: first published as 10.1136/gutjnl-2018-317500 on 27 February 2019.

Gastroenterology

Impact of Deep Learning Assistance on the Histopathologic Review of Lymph Nodes for Metastatic Breast Cancer

David F. Steiner, MD, PhD,* Robert MacDonald, PhD,* Yun Liu, PhD,* Peter Truszkowski, MD,* Jason D.
Hipp, MD, PhD, FCAP,* Christopher Gammage, MS,* Florence Thng, MS,† Lily Peng, MD, PhD,* and Martin C. Stumpe, PhD*

Abstract: Advances in the quality of whole-slide images have set the stage for the clinical use of digital images in anatomic pathology. Along with advances in computer image analysis, this raises the possibility for computer-assisted diagnostics in pathology to improve histopathologic interpretation and clinical care. To evaluate the potential impact of digital assistance on interpretation of digitized slides, we conducted a multireader multicase study utilizing our deep learning algorithm for the detection of breast cancer metastasis in lymph nodes. Six pathologists reviewed 70 digitized slides from lymph node sections in 2 reader modes, unassisted and assisted, with a washout period between sessions. In the assisted mode, the deep learning algorithm was used to identify and outline regions with high likelihood of containing tumor. Algorithm-assisted pathologists demonstrated higher accuracy than either the algorithm or the pathologist alone. In particular, algorithm assistance significantly increased the sensitivity of detection for micrometastases (91% vs. 83%, P=0.02). In addition, average review time per image was significantly shorter with assistance than without assistance for both micrometastases (61 vs. 116 s, P=0.002) and negative images (111 vs. 137 s, P=0.018). Lastly, pathologists were asked to provide a numeric score regarding the difficulty of each image classification. On the basis of this score, pathologists considered the image review of micrometastases to be significantly easier when interpreted with assistance (P=0.0005). Utilizing a proof of concept assistant tool, this study demonstrates the potential of a deep learning algorithm to improve pathologist accuracy and efficiency in a digital pathology workflow.
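Several of the abstracts on these slides report sensitivity and specificity, so a minimal sketch of how those two quantities are computed from confusion-matrix counts may help. The counts in the usage example are made up for illustration and are not taken from any of the studies; the function names are hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly positive cases that were detected (recall)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of truly negative cases that were correctly cleared."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative cases")
    return true_negatives / total


if __name__ == "__main__":
    # Hypothetical reader study: 100 positive slides, 200 negative slides.
    print(sensitivity(true_positives=91, false_negatives=9))
    print(specificity(true_negatives=190, false_positives=10))
```

Sensitivity depends only on the positive cases and specificity only on the negative ones, which is why a study can report a gain in one without a loss in the other.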
Key Words: artificial intelligence, machine learning, digital pathology, breast cancer, computer aided detection

(Am J Surg Pathol 2018;00:000–000)

The regulatory approval and gradual implementation of whole-slide scanners has enabled the digitization of glass slides for remote consults and archival purposes.1 Digitization alone, however, does not necessarily improve the consistency or efficiency of a pathologist’s primary workflow. In fact, image review on a digital medium can be slightly slower than on glass, especially for pathologists with limited digital pathology experience.2 However, digital pathology and image analysis tools have already demonstrated potential benefits, including the potential to reduce inter-reader variability in the evaluation of breast cancer HER2 status.3,4 Digitization also opens the door for assistive tools based on Artificial Intelligence (AI) to improve efficiency and consistency, decrease fatigue, and increase accuracy.5

Among AI technologies, deep learning has demonstrated strong performance in many automated image-recognition applications.6–8 Recently, several deep learning–based algorithms have been developed for the detection of breast cancer metastases in lymph nodes as well as for other applications in pathology.9,10 Initial findings suggest that some algorithms can even exceed a pathologist’s sensitivity for detecting individual cancer foci in digital images. However, this sensitivity gain comes at the cost of increased false positives, potentially limiting the utility of such algorithms for automated clinical use.11 In addition, deep learning algorithms are inherently limited to the task for which they have been specifically trained. While we have begun to understand the strengths of these algorithms (such as exhaustive search) and their weaknesses (sensitivity to poor optical focus, tumor mimics; manuscript under review), the potential clinical utility of such algorithms has not been thoroughly examined.
While an accurate algorithm alone will not necessarily aid pathologists or improve clinical interpretation, these benefits may be achieved through thoughtful and appropriate integration of algorithm predictions into the clinical workflow.8

From the *Google AI Healthcare; and †Verily Life Sciences, Mountain View, CA. D.F.S., R.M., and Y.L. are co-first authors (equal contribution). Work done as part of the Google Brain Healthcare Technology Fellowship (D.F.S. and P.T.). Conflicts of Interest and Source of Funding: D.F.S., R.M., Y.L., P.T., J.D.H., C.G., F.T., L.P., M.C.S. are employees of Alphabet and have Alphabet stock. Correspondence: David F. Steiner, MD, PhD, Google AI Healthcare, 1600 Amphitheatre Way, Mountain View, CA 94043 (e-mail: davesteiner@google.com). Supplemental Digital Content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal’s website, www.ajsp.com. Copyright © 2018 The Author(s). Published by Wolters Kluwer Health, Inc. This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.

ORIGINAL ARTICLE. Am J Surg Pathol, Volume 00, Number 00, 2018. www.ajsp.com

Pathology

SEPSIS

A targeted real-time early warning score (TREWScore) for septic shock

Katharine E. Henry,1 David N. Hager,2 Peter J. Pronovost,3,4,5 Suchi Saria1,3,5,6*

Sepsis is a leading cause of death in the United States, with mortality highest among patients who develop septic shock. Early aggressive treatment decreases morbidity and mortality. Although automated screening tools can detect patients currently experiencing severe sepsis and septic shock, none predict those at greatest risk of developing shock.
We analyzed routinely available physiological and laboratory data from intensive care unit patients and developed “TREWScore,” a targeted real-time early warning score that predicts which patients will develop septic shock. TREWScore identified patients before the onset of septic shock with an area under the ROC (receiver operating characteristic) curve (AUC) of 0.83 [95% confidence interval (CI), 0.81 to 0.85]. At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 and identified patients a median of 28.2 [interquartile range (IQR), 10.6 to 94.2] hours before onset. Of those identified, two-thirds were identified before any sepsis-related organ dysfunction. In comparison, the Modified Early Warning Score, which has been used clinically for septic shock prediction, achieved a lower AUC of 0.73 (95% CI, 0.71 to 0.76). A routine screening protocol based on the presence of two of the systemic inflammatory response syndrome criteria, suspicion of infection, and either hypotension or hyperlactatemia achieved a lower sensitivity of 0.74 at a comparable specificity of 0.64. Continuous sampling of data from the electronic health records and calculation of TREWScore may allow clinicians to identify patients at risk for septic shock and provide earlier interventions that would prevent or mitigate the associated morbidity and mortality.

INTRODUCTION

Seven hundred fifty thousand patients develop severe sepsis and septic shock in the United States each year. More than half of them are admitted to an intensive care unit (ICU), accounting for 10% of all ICU admissions, 20 to 30% of hospital deaths, and $15.4 billion in annual health care costs (1–3). Several studies have demonstrated that morbidity, mortality, and length of stay are decreased when severe sepsis and septic shock are identified and treated early (4–8).
In particular, one study showed that mortality from septic shock increased by 7.6% with every hour that treatment was delayed after the onset of hypotension (9).

More recent studies comparing protocolized care, usual care, and early goal-directed therapy (EGDT) for patients with septic shock suggest that usual care is as effective as EGDT (10–12). Some have interpreted this to mean that usual care has improved over time and reflects important aspects of EGDT, such as early antibiotics and early aggressive fluid resuscitation (13). It is likely that continued early identification and treatment will further improve outcomes. However, the best approach to managing patients at high risk of developing septic shock before the onset of severe sepsis or shock has not been studied. Methods that can identify ahead of time which patients will later experience septic shock are needed to further understand, study, and improve outcomes in this population.

General-purpose illness severity scoring systems such as the Acute Physiology and Chronic Health Evaluation (APACHE II), Simplified Acute Physiology Score (SAPS II), Sequential Organ Failure Assessment (SOFA) scores, Modified Early Warning Score (MEWS), and Simple Clinical Score (SCS) have been validated to assess illness severity and risk of death among septic patients (14–17). Although these scores are useful for predicting general deterioration or mortality, they typically cannot distinguish with high sensitivity and specificity which patients are at highest risk of developing a specific acute condition.

The increased use of electronic health records (EHRs), which can be queried in real time, has generated interest in automating tools that identify patients at risk for septic shock (18–20).
A number of “early warning systems,” “track and trigger” initiatives, “listening applications,” and “sniffers” have been implemented to improve detection and timeliness of therapy for patients with severe sepsis and septic shock (18, 20–23). Although these tools have been successful at detecting patients currently experiencing severe sepsis or septic shock, none predict which patients are at highest risk of developing septic shock.

The adoption of the Affordable Care Act has added to the growing excitement around predictive models derived from electronic health data in a variety of applications (24), including discharge planning (25), risk stratification (26, 27), and identification of acute adverse events (28, 29). For septic shock in particular, promising work includes that of predicting septic shock using high-fidelity physiological signals collected directly from bedside monitors (30, 31), inferring relationships between predictors of septic shock using Bayesian networks (32), and using routine measurements for septic shock prediction (33–35). No current prediction models that use only data routinely stored in the EHR predict septic shock with high sensitivity and specificity many hours before onset. Moreover, when learning predictive risk scores, current methods (34, 36, 37) often have not accounted for the censoring effects of clinical interventions on patient outcomes (38). For instance, a patient with severe sepsis who received fluids and never developed septic shock would be treated as a negative case, despite the possibility that he or she might have developed septic shock in the absence of such treatment and therefore could be considered a positive case up until the

1 Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA. 2 Division of Pulmonary and Critical Care Medicine, Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.
3 Armstrong Institute for Patient Safety and Quality, Johns Hopkins University, Baltimore, MD 21202, USA. 4 Department of Anesthesiology and Critical Care Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD 21202, USA. 5 Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA. 6 Department of Applied Math and Statistics, Johns Hopkins University, Baltimore, MD 21218, USA. *Corresponding author. E-mail: ssaria@cs.jhu.edu

RESEARCH ARTICLE

www.ScienceTranslationalMedicine.org 5 August 2015 Vol 7 Issue 299 299ra122

An Algorithm Based on Deep Learning for Predicting In-Hospital Cardiac Arrest

Joon-myoung Kwon, MD;* Youngnam Lee, MS;* Yeha Lee, PhD; Seungwoo Lee, BS; Jinsik Park, MD, PhD

Background: In-hospital cardiac arrest is a major burden to public health, which affects patient safety. Although traditional track-and-trigger systems are used to predict cardiac arrest early, they have limitations, with low sensitivity and high false-alarm rates. We propose a deep learning–based early warning system that shows higher performance than the existing track-and-trigger systems.

Methods and Results: This retrospective cohort study reviewed patients who were admitted to 2 hospitals from June 2010 to July 2017. A total of 52 131 patients were included. Specifically, a recurrent neural network was trained using data from June 2010 to January 2017. The result was tested using the data from February to July 2017. The primary outcome was cardiac arrest, and the secondary outcome was death without attempted resuscitation. As comparative measures, we used the area under the receiver operating characteristic curve (AUROC), the area under the precision–recall curve (AUPRC), and the net reclassification index. Furthermore, we evaluated sensitivity while varying the number of alarms.
The deep learning–based early warning system (AUROC: 0.850; AUPRC: 0.044) significantly outperformed a modified early warning score (AUROC: 0.603; AUPRC: 0.003), a random forest algorithm (AUROC: 0.780; AUPRC: 0.014), and logistic regression (AUROC: 0.613; AUPRC: 0.007). Furthermore, the deep learning–based early warning system reduced the number of alarms by 82.2%, 13.5%, and 42.1% compared with the modified early warning system, random forest, and logistic regression, respectively, at the same sensitivity.

Conclusions: An algorithm based on deep learning had high sensitivity and a low false-alarm rate for detection of patients with cardiac arrest in the multicenter study.

(J Am Heart Assoc. 2018;7:e008678. DOI: 10.1161/JAHA.118.008678.)

Key Words: artificial intelligence • cardiac arrest • deep learning • machine learning • rapid response system • resuscitation

In-hospital cardiac arrest is a major burden to public health, which affects patient safety.1–3 More than a half of cardiac arrests result from respiratory failure or hypovolemic shock, and 80% of patients with cardiac arrest show signs of deterioration in the 8 hours before cardiac arrest.4–9 However, 209 000 in-hospital cardiac arrests occur in the United States each year, and the survival discharge rate for patients with cardiac arrest is 20% worldwide.10,11 Rapid response systems (RRSs) have been introduced in many hospitals to detect cardiac arrest using the track-and-trigger system (TTS).12,13

Two types of TTS are used in RRSs.
For the single-parameter TTS (SPTTS), cardiac arrest is predicted if any single vital sign (eg, heart rate [HR], blood pressure) is out of the normal range.14 The aggregated weighted TTS calculates a weighted score for each vital sign and then finds patients with cardiac arrest based on the sum of these scores.15 The modified early warning score (MEWS) is one of the most widely used approaches among all aggregated weighted TTSs (Table 1)16; however, traditional TTSs including MEWS have limitations, with low sensitivity or high false-alarm rates.14,15,17 Sensitivity and false-alarm rate interact: Increased sensitivity creates higher false-alarm rates and vice versa.

Current RRSs suffer from low sensitivity or a high false-alarm rate. An RRS was used for only 30% of patients before unplanned intensive care unit admission and was not used for 22.8% of patients, even if they met the criteria.18,19

From the Departments of Emergency Medicine (J.-m.K.) and Cardiology (J.P.), Mediplex Sejong Hospital, Incheon, Korea; VUNO, Seoul, Korea (Youngnam L., Yeha L., S.L.). *Dr Kwon and Mr Youngnam Lee contributed equally to this study. Correspondence to: Joon-myoung Kwon, MD, Department of Emergency medicine, Mediplex Sejong Hospital, 20, Gyeyangmunhwa-ro, Gyeyang-gu, Incheon 21080, Korea. E-mail: kwonjm@sejongh.co.kr Received January 18, 2018; accepted May 31, 2018. © 2018 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
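The two track-and-trigger styles described above can be contrasted in a short sketch. Everything numeric here is hypothetical: the normal ranges, the scoring bands and the alarm threshold are placeholders chosen for illustration and are not the MEWS values from the paper's Table 1, which is not reproduced on this slide.

```python
from typing import Callable, Dict, Tuple

# Hypothetical normal ranges for the single-parameter style (illustrative only).
NORMAL_RANGES: Dict[str, Tuple[float, float]] = {
    "heart_rate": (50, 110),
    "systolic_bp": (90, 180),
    "resp_rate": (9, 24),
}


def score_heart_rate(hr: float) -> int:
    if hr < 40 or hr > 130:
        return 3
    if hr > 110:
        return 2
    if hr < 50 or hr > 100:
        return 1
    return 0


def score_systolic_bp(sbp: float) -> int:
    if sbp < 70:
        return 3
    if sbp < 80 or sbp >= 200:
        return 2
    if sbp < 100:
        return 1
    return 0


def score_resp_rate(rr: float) -> int:
    if rr >= 30:
        return 3
    if rr >= 21 or rr < 9:
        return 2
    if rr >= 15:
        return 1
    return 0


# Hypothetical per-vital scoring bands for the aggregated weighted style.
SCORERS: Dict[str, Callable[[float], int]] = {
    "heart_rate": score_heart_rate,
    "systolic_bp": score_systolic_bp,
    "resp_rate": score_resp_rate,
}


def single_parameter_alarm(vitals: Dict[str, float]) -> bool:
    """Single-parameter TTS: alarm if ANY vital sign leaves its normal range."""
    return any(not (NORMAL_RANGES[k][0] <= v <= NORMAL_RANGES[k][1])
               for k, v in vitals.items())


def aggregated_score(vitals: Dict[str, float]) -> int:
    """Aggregated weighted TTS: sum of the per-vital scores."""
    return sum(SCORERS[k](v) for k, v in vitals.items())


def aggregated_alarm(vitals: Dict[str, float], threshold: int = 4) -> bool:
    """Alarm when the summed score reaches a (hypothetical) threshold."""
    return aggregated_score(vitals) >= threshold


if __name__ == "__main__":
    patient = {"heart_rate": 115, "systolic_bp": 95, "resp_rate": 24}
    print(single_parameter_alarm(patient), aggregated_score(patient))
```

Lowering `threshold` or narrowing the normal ranges raises sensitivity and the false-alarm rate together, which is the trade-off the text describes.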
DOI: 10.1161/JAHA.118.008678 Journal of the American Heart Association

ORIGINAL RESEARCH

Infectious Diseases
Cardiology

BRIEF COMMUNICATION OPEN

Digital biomarkers of cognitive function

Paul Dagum1

To identify digital biomarkers associated with cognitive function, we analyzed human–computer interaction from 7 days of smartphone use in 27 subjects (ages 18–34) who received a gold standard neuropsychological assessment. For several neuropsychological constructs (working memory, memory, executive function, language, and intelligence), we found a family of digital biomarkers that predicted test scores with high correlations (p < 10⁻⁴). These preliminary results suggest that passive measures from smartphone use could be a continuous ecological surrogate for laboratory-based neuropsychological assessment.

npj Digital Medicine (2018) 1:10; doi:10.1038/s41746-018-0018-4

INTRODUCTION

By comparison to the functional metrics available in other disciplines, conventional measures of neuropsychiatric disorders have several challenges. First, they are obtrusive, requiring a subject to break from their normal routine, dedicating time and often travel. Second, they are not ecological and require subjects to perform a task outside of the context of everyday behavior. Third, they are episodic and provide sparse snapshots of a patient only at the time of the assessment. Lastly, they are poorly scalable, taxing limited resources including space and trained staff.

In seeking objective and ecological measures of cognition, we attempted to develop a method to measure memory and executive function not in the laboratory but in the moment, day-to-day. We used human–computer interaction on smartphones to identify digital biomarkers that were correlated with neuropsychological performance.
RESULTS

In 2014, 27 participants (ages 27.1 ± 4.4 years, education 14.1 ± 2.3 years, M:F 8:19) volunteered for neuropsychological assessment and a test of the smartphone app. Smartphone human–computer interaction data from the 7 days following the neuropsychological assessment showed a range of correlations with the cognitive scores. Table 1 shows the correlation between each neurocognitive test and the cross-validated predictions of the supervised kernel PCA constructed from the biomarkers for that test. Figure 1 shows each participant test score and the digital biomarker prediction for (a) digits backward, (b) symbol digit modality, (c) animal fluency, (d) Wechsler Memory Scale-3rd Edition (WMS-III) logical memory (delayed free recall), (e) brief visuospatial memory test (delayed free recall), and (f) Wechsler Adult Intelligence Scale-4th Edition (WAIS-IV) block design. Construct validity of the predictions was determined using pattern matching that computed a correlation of 0.87 with p < 10⁻⁵⁹ between the covariance matrix of the predictions and the covariance matrix of the tests.

Table 1. Fourteen neurocognitive assessments covering five cognitive domains and dexterity were performed by a neuropsychologist.
Shown are the group mean and standard deviation, range of score, and the correlation between each test and the cross-validated prediction constructed from the digital biomarkers for that test

Cognitive predictions | Mean (SD) | Range | R (predicted), p-value
Working memory
Digits forward | 10.9 (2.7) | 7–15 | 0.71 ± 0.10, 10⁻⁴
Digits backward | 8.3 (2.7) | 4–14 | 0.75 ± 0.08, 10⁻⁵
Executive function
Trail A | 23.0 (7.6) | 12–39 | 0.70 ± 0.10, 10⁻⁴
Trail B | 53.3 (13.1) | 37–88 | 0.82 ± 0.06, 10⁻⁶
Symbol digit modality | 55.8 (7.7) | 43–67 | 0.70 ± 0.10, 10⁻⁴
Language
Animal fluency | 22.5 (3.8) | 15–30 | 0.67 ± 0.11, 10⁻⁴
FAS phonemic fluency | 42 (7.1) | 27–52 | 0.63 ± 0.12, 10⁻³
Dexterity
Grooved pegboard test (dominant hand) | 62.7 (6.7) | 51–75 | 0.73 ± 0.09, 10⁻⁴
Memory
California verbal learning test (delayed free recall) | 14.1 (1.9) | 9–16 | 0.62 ± 0.12, 10⁻³
WMS-III logical memory (delayed free recall) | 29.4 (6.2) | 18–42 | 0.81 ± 0.07, 10⁻⁶
Brief visuospatial memory test (delayed free recall) | 10.2 (1.8) | 5–12 | 0.77 ± 0.08, 10⁻⁵
Intelligence scale
WAIS-IV block design | 46.1 (12.8) | 12–61 | 0.83 ± 0.06, 10⁻⁶
WAIS-IV matrix reasoning | 22.1 (3.3) | 12–26 | 0.80 ± 0.07, 10⁻⁶
WAIS-IV vocabulary | 40.6 (4.0) | 31–50 | 0.67 ± 0.11, 10⁻⁴

Received: 5 October 2017. Revised: 3 February 2018. Accepted: 7 February 2018.

1 Mindstrong Health, 248 Homer Street, Palo Alto, CA 94301, USA
Correspondence: Paul Dagum (paul@mindstronghealth.com)
www.nature.com/npjdigitalmed

Psychiatry

PRECISION MEDICINE

Identification of type 2 diabetes subgroups through topological analysis of patient similarity

Li Li,1 Wei-Yi Cheng,1 Benjamin S. Glicksberg,1 Omri Gottesman,2 Ronald Tamler,3 Rong Chen,1 Erwin P. Bottinger,2 Joel T. Dudley1,4*

Type 2 diabetes (T2D) is a heterogeneous complex disease affecting more than 29 million Americans alone with a rising prevalence trending toward steady increases in the coming decades. Thus, there is a pressing clinical need to improve early prevention and clinical management of T2D and its complications.
Clinicians have understood that patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related complications. We used a precision medicine approach to characterize the complexity of T2D patient populations based on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We successfully identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was characterized by T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer malignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases, neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent T2D subtypes to identify subtype-specific genetic markers and identified 1279, 1227, and 1338 single-nucleotide polymorphisms (SNPs) that mapped to 425, 322, and 437 unique genes specific to subtypes 1, 2, and 3, respectively. By assessing the human disease–SNP association for each subtype, the enriched phenotypes and biological functions at the gene level for each subtype matched with the disease comorbidities and clinical differences that we identified through EMRs. Our approach demonstrates the utility of applying the precision medicine paradigm in T2D and the promise of extending the approach to the study of other complex, multifactorial diseases.

INTRODUCTION

Type 2 diabetes (T2D) is a complex, multifactorial disease that has emerged as an increasingly prevalent worldwide health concern associated with high economic and physiological burdens. An estimated 29.1 million Americans (9.3% of the population) had some form of diabetes in 2012 (up 13% from 2010), with T2D representing up to 95% of all diagnosed cases (1, 2).
Risk factors for T2D include obesity, family history of diabetes, physical inactivity, ethnicity, and advanced age (1, 2). Diabetes and its complications now rank among the leading causes of death in the United States (2). In fact, diabetes is the leading cause of nontraumatic foot amputation, adult blindness, and need for kidney dialysis, and multiplies risk for myocardial infarction, peripheral artery disease, and cerebrovascular disease (3–6). The total estimated direct medical cost attributable to diabetes in the United States in 2012 was $176 billion, with an estimated $76 billion attributable to hospital inpatient care alone. There is a great need to improve understanding of T2D and its complex factors to facilitate prevention, early detection, and improvements in clinical management.
A more precise characterization of T2D patient populations can enhance our understanding of T2D pathophysiology (7, 8). Current clinical definitions classify diabetes into three major subtypes: type 1 diabetes (T1D), T2D, and maturity-onset diabetes of the young. Other subtypes based on phenotype bridge the gap between T1D and T2D, for example, latent autoimmune diabetes in adults (LADA) (7) and ketosis-prone T2D. The current categories indicate that the traditional definition of diabetes, especially T2D, might comprise additional subtypes with distinct clinical characteristics. A recent analysis of the longitudinal Whitehall II cohort study demonstrated improved assessment of cardiovascular risks when subgrouping T2D patients according to glucose concentration criteria (9). Genetic association studies reveal that the genetic architecture of T2D is profoundly complex (10–12). Identified T2D-associated risk variants exhibit allelic heterogeneity and directional differentiation among populations (13, 14).
The apparent clinical and genetic complexity and heterogeneity of T2D patient populations suggest that there are opportunities to refine the current, predominantly symptom-based, definition of T2D into additional subtypes (7).
Because etiological and pathophysiological differences exist among T2D patients, we hypothesize that a data-driven analysis of a clinical population could identify new T2D subtypes and factors. Here, we develop a data-driven, topology-based approach to (i) map the complexity of patient populations using clinical data from electronic medical records (EMRs) and (ii) identify new, emergent T2D patient subgroups with subtype-specific clinical and genetic characteristics. We apply this approach to a data set comprising matched EMRs and genotype data from more than 11,000 individuals. Topological analysis of these data revealed three distinct T2D subtypes that exhibited distinct patterns of clinical characteristics and disease comorbidities. Further, we identified genetic markers associated with each T2D subtype and performed gene- and pathway-level analysis of subtype genetic associations. Biological and phenotypic features enriched in the genetic analysis corroborated clinical disparities observed among subgroups.
Our findings suggest that data-driven, topological analysis of patient co…
Specialty: Endocrinology
LETTER: Dermatologist-level classification of skin cancer with deep neural networks (Specialty: Dermatology)
FOCUS | LETTERS: Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network (Specialty: Cardiology)
Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization (Specialty: Obstetrics and gynecology)
ORIGINAL ARTICLE: Watson for Oncology and breast cancer treatment recommendations: agreement with an expert multidisciplinary tumor board (Specialty: Oncology)
ACCEPTED MANUSCRIPT: paper on out-of-hospital cardiac arrest (OHCA); the title and author list are not legible in this transcript (Specialty: Emergency medicine)
  • 135.
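    Three types of medical AI
    •Analyzing complex medical data and deriving insights
    •Analyzing and reading medical imaging and pathology data
    •Monitoring continuous data for prevention and prediction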
  • 137.
    Jeopardy! In 2011, IBM Watson faced two human champions in the quiz show and won by a decisive margin.
  • 140.
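    IBM Watson Health Chronicle
    Timeline axis: 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018
    Tracks: academia and the medical community / industry
    Events (year placement is not preserved in this transcript):
    •Jeopardy! win
    •Memorial Sloan Kettering (MSK) Cancer Center, New York: collaboration (lung cancer)
    •MD Anderson: collaboration (leukemia)
    •MD Anderson pilot results presented at ASCO
    •New York Genome Center: collaboration (glioblastoma analysis)
    •Cleveland Clinic: collaboration (cancer genome analysis)
    •Mayo Clinic: collaboration (clinical trial matching)
    •Mayo Clinic clinical trial matching results presented
    •Broad Institute collaboration announced (genome analysis, anticancer drug resistance)
    •Manipal Hospitals (India) adopt WFO
    •Manipal Hospitals present WFO accuracy results
    •First WFO paper
    •University of Tokyo adoption (WFO)
    •Bumrungrad International Hospital (Thailand) adopts WFO
    •Jupiter Medical Center adoption
    •Gachon University Gil Medical Center adoption
    •Pusan National University Hospital adoption
    •Konyang University Hospital adoption
    •Daegu Catholic University Medical Center and Daegu Dongsan Hospital adoption
    •Chosun University Hospital adoption
    •Chonnam National University Hospital adoption
    •Korea Watson Consortium launched
    •MFDS (Ministry of Food and Drug Safety) AI guideline draft
    •MFDS AI guideline
    •Watson Fund invests in Welltok
    •GeneMD wins the Watson Mobile Developer Challenge
    •IBM Korea establishes a Watson business unit
    •Watson Health launched
    •Phytel and Explorys acquired
    •Partnerships with J&J, Apple, and Medtronic
    •Partnership with Epic Systems and Mayo Clinic (EHR analysis)
    •Watson Fund invests in Modernizing Medicine
    •Watson Fund invests in Pathway Genomics
    •Pathway Genomics OME closed alpha service begins
    •Truven Health acquired
    •Merge Healthcare acquired
    •Sleep study via Apple ResearchKit begins
    •Blood glucose management app demonstrated with Medtronic
    •Medtronic launches Sugar.IQ
    •Partnership with the pharmaceutical company Teva
    •Partnership with Under Armour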
  • 142.
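    Annals of Oncology (2016) 27 (suppl_9): ix179-ix180. 10.1093/annonc/mdw601
    Validation study to assess performance of IBM cognitive computing system Watson for oncology with Manipal multidisciplinary tumour board for 1000 consecutive cases: An Indian experience
    •Compared the "concordance rate" between physicians' recommendations and those of WFO (Watson for Oncology) for 1,000 cancer patients at Manipal Hospitals, India
    •Breast cancer 638, colon cancer 126, rectal cancer 124, lung cancer 112 patients
    •Physician-Watson concordance, by WFO category
      •Recommended (50%), for consideration (28%), not recommended (17%)
      •5% of the physicians' treatment plans were not offered by Watson at all
    •Concordance differed by cancer type
      •Rectal cancer (85%), lung cancer (17.8%)
      •Triple-negative breast cancer (67.9%), HER2-negative breast cancer (35%)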
  • 143.
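    WFO at ASCO 2017
    •Results of applying Watson to colon cancer and gastric cancer patients at Gachon University Gil Medical Center (retrospective)
      • Colon cancer patients (stage II-IV): 340
      • Advanced gastric cancer patients: 185
    • Concordance with physicians
      • Colon cancer patients: 73%
        • 250 patients who received adjuvant chemotherapy: 85%
        • 90 patients with metastatic disease: 40%
      • Gastric cancer patients: 49%
        • Trastuzumab/FOLFOX is not reimbursed by the Korean national health insurance
        • S-1 (tegafur, gimeracil, and oteracil) + cisplatin: very routine in Korea, not used in the United States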
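    The overall colon cancer concordance can be cross-checked against the two subgroup figures. A minimal sketch, assuming the reported 73% is the patient-weighted average of the adjuvant subgroup (250 patients, 85%) and the metastatic subgroup (90 patients, 40%); the function name is illustrative and not taken from the study:

```python
def pooled_rate(groups):
    """Patient-weighted average of subgroup rates.

    groups: list of (n_patients, rate) pairs, with rate in [0, 1].
    """
    total = sum(n for n, _ in groups)
    return sum(n * rate for n, rate in groups) / total


# Subgroup figures quoted on the slide (colon cancer, stage II-IV).
colon_groups = [(250, 0.85), (90, 0.40)]
overall = pooled_rate(colon_groups)
print(f"{overall:.1%}")  # pooled concordance over all 340 patients
```

    Under that assumption the pooled value rounds to 73%, so the headline figure is dominated by the larger adjuvant group and hides the much lower concordance in metastatic disease.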
  • 144.
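    Principles are needed
    •For which patients should Watson be consulted?
    •How much should Watson be trusted (for each cancer type)?
    •Should Watson's opinion be disclosed to the patient?
    •What should be done when Watson and the medical team disagree?
    •Can the use of Watson be reimbursed by insurance?
    The quality of care and treatment outcomes can differ depending on these criteria, yet at present each hospital applies its own.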
  • 145.
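    Three types of medical AI
    •Analyzing complex medical data and deriving insights
    •Analyzing and reading medical imaging and pathology data
    •Monitoring continuous data for prevention and prediction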
  • 146.
  • 148.
  • 149.
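    •AI that reads a hand X-ray image and calculates the patient's bone age
      • Until now, physicians have read bone age by comparing the X-ray with standard images, for example with the Greulich-Pyle method
      • The AI finds sex- and age-specific patterns in the reference standard images, displays the similarity as a probability, and retrieves the matching standard images
    •It can help physicians diagnose precocious puberty or growth retardation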
  • 150.
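    Press release
    First approval of a domestically developed artificial intelligence (AI)-based medical device: AI technology used to read bone age
    The Ministry of Food and Drug Safety (Minister Ryu Young-jin) announced that on […] it approved VUNO Med-BoneAge, medical image analysis software that applies AI technology, developed by the Korean medical device company VUNO Inc.
    The newly approved VUNO Med-BoneAge is software in which the AI analyzes an X-ray image and presents the patient's bone age, helping the physician use that information to diagnose precocious puberty or growth retardation.
    Until now, physicians read bone age manually by comparing the X-ray image of the patient's left hand with reference standard images; the software automates this step and shortens reading time.
    Since […], the product had been selected as a target of the guideline for approval and review of medical devices that apply big data and AI technology, and it received tailored support from clinical trial design through approval.
    VUNO Med-BoneAge was approved for the purpose of analyzing an X-ray image of the patient's left hand to help medical professionals determine the patient's bone age.
    In the analysis, the AI recognizes patterns in the acquired X-ray image, finds sex- and age-specific patterns in the bone age model reference standard images classified by sex (male: […], female: […]), and displays the similarity as a probability. The physician then combines the probability values with other information such as hormone levels to diagnose precocious puberty or growth retardation.
    In the clinical trial evaluating the product's accuracy (performance), the result differed from the bone age determined by physicians by […] months on average. The product is designed so that the manufacturer periodically updates the image data, allowing the AI to keep learning and to narrow the gap with physicians.
    Including the newly approved VUNO Med-BoneAge, […] clinical trial plans for AI-based medical devices have been approved to date. The AI-based medical devices with approved clinical trials are software that classifies cerebral infarction types from magnetic resonance images ([…]) and software that assists the diagnosis of pulmonary nodules from X-ray images ([…]).
    For reference, to support rapid development of medical devices related to the […] industry, such as AI, virtual reality, and […] printing, the ministry operates programs including the Next-Generation […] Project and the New Medical Device Approval Helper, which provide tailored support across the whole process from research and development through clinical trials and approval.
    The ministry stated that this approval is expected to help analyze and determine each individual's bone age quickly, and that it will continue to actively support the development of advanced medical devices.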
  • 151.
    Disclosure: I serve as an advisor to VUNO and hold an equity stake in the company.
  • 152.
    ORIGINAL RESEARCH • THORACIC IMAGING
    Development and Validation of Deep Learning–based Automatic Detection Algorithm for Malignant Pulmonary Nodules on Chest Radiographs
    Ju Gang Nam, MD* • Sunggyun Park, PhD* • Eui Jin Hwang, MD • Jong Hyuk Lee, MD • Kwang-Nam Jin, MD, PhD • KunYoung Lim, MD, PhD • Thienkai Huy Vu, MD, PhD • Jae Ho Sohn, MD • Sangheum Hwang, PhD • Jin Mo Goo, MD, PhD • Chang Min Park, MD, PhD
    From the Department of Radiology and Institute of Radiation Medicine, Seoul National University Hospital and College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Republic of Korea (J.G.N., E.J.H., J.M.G., C.M.P.); Lunit Incorporated, Seoul, Republic of Korea (S.P.); Department of Radiology, Armed Forces Seoul Hospital, Seoul, Republic of Korea (J.H.L.); Department of Radiology, Seoul National University Boramae Medical Center, Seoul, Republic of Korea (K.N.J.); Department of Radiology, National Cancer Center, Goyang, Republic of Korea (K.Y.L.); Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, Calif (T.H.V., J.H.S.); and Department of Industrial Information Systems Engineering, Seoul National University of Science and Technology, Seoul, Republic of Korea (S.H.).
    Received January 30, 2018; revision requested March 20; revision received July 29; accepted August 6. Address correspondence to C.M.P. (e-mail: cmpark.morphius@gmail.com). Study supported by SNUH Research Fund and Lunit (06–2016–3000) and by Seoul Research and Business Development Program (FI170002). *J.G.N. and S.P. contributed equally to this work. Conflicts of interest are listed at the end of this article.
    Radiology 2018; 00:1–11 • https://doi.org/10.1148/radiol.2018180237
    Purpose: To develop and validate a deep learning–based automatic detection algorithm (DLAD) for malignant pulmonary nodules on chest radiographs and to compare its performance with physicians including thoracic radiologists.
    Materials and Methods: For this retrospective study, DLAD was developed by using 43292 chest radiographs (normal radiograph–to–nodule radiograph ratio, 34067:9225) in 34676 patients (healthy-to-nodule ratio, 30784:3892; 19230 men [mean age, 52.8 years; age range, 18–99 years]; 15446 women [mean age, 52.3 years; age range, 18–98 years]) obtained between 2010 and 2015, which were labeled and partially annotated by 13 board-certified radiologists, in a convolutional neural network. Radiograph classification and nodule detection performances of DLAD were validated by using one internal and four external data sets from three South Korean hospitals and one U.S. hospital. For internal and external validation, radiograph classification and nodule detection performances of DLAD were evaluated by using the area under the receiver operating characteristic curve (AUROC) and jackknife alternative free-response receiver-operating characteristic (JAFROC) figure of merit (FOM), respectively. An observer performance test involving 18 physicians, including nine board-certified radiologists, was conducted by using one of the four external validation data sets. Performances of DLAD, physicians, and physicians assisted with DLAD were evaluated and compared.
    Results: According to one internal and four external validation data sets, radiograph classification and nodule detection performances of DLAD were a range of 0.92–0.99 (AUROC) and 0.831–0.924 (JAFROC FOM), respectively.
    DLAD showed a higher AUROC and JAFROC FOM at the observer performance test than 17 of 18 and 15 of 18 physicians, respectively (P < .05), and all physicians showed improved nodule detection performances with DLAD (mean JAFROC FOM improvement, 0.043; range, 0.006–0.190; P < .05).
    Conclusion: This deep learning–based automatic detection algorithm outperformed physicians in radiograph classification and nodule detection performance for malignant pulmonary nodules on chest radiographs, and it enhanced physicians’ performances when used as a second reader.
    ©RSNA, 2018. Online supplemental material is available for this article.
    • 43,292 chest PA radiographs (normal:nodule = 34,067:9,225)
    • Labeled/annotated by 13 board-certified radiologists
    • DLAD was validated on 1 internal + 4 external data sets
      • Seoul National University Hospital / Boramae Medical Center / National Cancer Center / UCSF
    • Classification / lesion localization
    • AI vs. physicians vs. AI + physicians
    • Compared with physicians at various levels
      • Non-radiology physicians / radiology residents
      • Board-certified radiologists / thoracic radiologists
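    AUROC, the classification metric reported above, can be read as the probability that a randomly chosen nodule radiograph receives a higher score than a randomly chosen normal one. A minimal pure-Python sketch of that rank-based definition; the scores are made-up illustrative values, not data from the study, and this is not the evaluation code the authors used:

```python
def auroc(pos_scores, neg_scores):
    """Probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs for 4 nodule and 4 normal radiographs.
nodule = [0.9, 0.8, 0.6, 0.4]
normal = [0.7, 0.3, 0.2, 0.1]
print(auroc(nodule, normal))
```

    In this toy example 14 of the 16 nodule/normal pairs are ordered correctly. An AUROC of 0.5 corresponds to random ordering and 1.0 to perfect separation.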
  • 153.
    Nam et al
    Figure 1: Images in a 78-year-old female patient with a 1.9-cm part-solid nodule at the left upper lobe. (a) The nodule was faintly visible on the chest radiograph (arrowheads) and was detected by 11 of 18 observers. (b) At contrast-enhanced CT examination, biopsy confirmed lung adenocarcinoma (arrow). (c) DLAD reported the nodule with a confidence level of 2, resulting in its detection by an additional five radiologists and an elevation in its confidence by eight radiologists.
    Figure 2: Images in a 64-year-old male patient with a 2.2-cm lung adenocarcinoma at the left upper lobe. (a) The nodule was faintly visible on the chest radiograph (arrowheads) and was detected by seven of 18 observers. (b) Biopsy confirmed lung adenocarcinoma in the left upper lobe on contrast-enhanced CT image (arrow). (c) DLAD reported the nodule with a confidence level of 2, resulting in its detection by an additional two radiologists and an elevated confidence level of the nodule by two radiologists.
  • 154.
    Deep Learning Automatic Detection Algorithm for Malignant Pulmonary Nodules
    Table 3: Patient Classification and Nodule Detection at the Observer Performance Test
    Slide annotations on the column groups: Test 1 = physician alone; DLAD versus Test 1 = AI vs. physician alone (P value); Test 2 = physician + AI; Test 1 versus Test 2 = physician vs. physician + AI (P value)
    Observer | Test 1 AUROC | Test 1 JAFROC FOM | DLAD vs Test 1 P (classification) | DLAD vs Test 1 P (detection) | Test 2 AUROC | Test 2 JAFROC FOM | Test 1 vs Test 2 P (classification) | Test 1 vs Test 2 P (detection)
    Nonradiology physicians
    Observer 1 | 0.77 | 0.716 | <.001 | <.001 | 0.91 | 0.853 | <.001 | <.001
    Observer 2 | 0.78 | 0.657 | <.001 | <.001 | 0.90 | 0.846 | <.001 | <.001
    Observer 3 | 0.80 | 0.700 | <.001 | <.001 | 0.88 | 0.783 | <.001 | <.001
    Group | - | 0.691 | - | <.001* | - | 0.828 | - | <.001*
    Radiology residents
    Observer 4 | 0.78 | 0.767 | <.001 | <.001 | 0.80 | 0.785 | .02 | .03
    Observer 5 | 0.86 | 0.772 | .001 | <.001 | 0.91 | 0.837 | .02 | <.001
    Observer 6 | 0.86 | 0.789 | .05 | .002 | 0.86 | 0.799 | .08 | .54
    Observer 7 | 0.84 | 0.807 | .01 | .003 | 0.91 | 0.843 | .003 | .02
    Observer 8 | 0.87 | 0.797 | .10 | .003 | 0.90 | 0.845 | .03 | .001
    Observer 9 | 0.90 | 0.847 | .52 | .12 | 0.92 | 0.867 | .04 | .03
    Group | - | 0.790 | - | <.001* | - | 0.867 | - | <.001*
    Board-certified radiologists
    Observer 10 | 0.87 | 0.836 | .05 | .01 | 0.90 | 0.865 | .004 | .002
    Observer 11 | 0.83 | 0.804 | <.001 | <.001 | 0.84 | 0.817 | .03 | .04
    Observer 12 | 0.88 | 0.817 | .18 | .005 | 0.91 | 0.841 | .01 | .01
    Observer 13 | 0.91 | 0.824 | >.99 | .02 | 0.92 | 0.836 | .51 | .24
    Observer 14 | 0.88 | 0.834 | .14 | .03 | 0.88 | 0.840 | .87 | .23
    Group | - | 0.821 | - | .02* | - | 0.840 | - | .01*
    Thoracic radiologists
    Observer 15 | 0.94 | 0.856 | .15 | .21 | 0.96 | 0.878 | .08 | .03
    Observer 16 | 0.92 | 0.854 | .60 | .17 | 0.93 | 0.872 | .34 | .02
    Observer 17 | 0.86 | 0.820 | .02 | .01 | 0.88 | 0.838 | .14 | .12
    Observer 18 | 0.84 | 0.800 | <.001 | <.001 | 0.87 | 0.827 | .02 | .02
    Group | - | 0.833 | - | .08* | - | 0.854 | - | <.001*
    Note: Observer 4 had 1 year of experience; observers 5 and 6 had 2 years of experience; observers 7–9 had 3 years of experience; observers 10–12 had 7 years of experience; observers 13 and 14 had 8 years of experience; observer 15 had 26 years of experience; observer 16 had 13 years of experience; and observers 17 and 18 had 9 years of experience. Observers 1–3 were 4th-year residents from obstetrics and gynecolo…
    Slide annotations on the rows: non-radiology physicians (4th-year residents in obstetrics and gynecology, orthopedic surgery, and internal medicine); radiology residents (1st, 2nd, and 3rd year); board-certified radiologists (7 and 8 years of experience); thoracic radiologists (26, 13, and 9 years of experience)
    •Using the AI as a second reader improves accuracy
    •Classification: 17 of 18 physicians improved (15 of 18 with P < 0.05)
    •Nodule detection: 18 of 18 physicians improved (14 of 18 with P < 0.05)
  • 155.
    Deep Learning Automatic Detection Algorithm for Malignant Pulmonary Nodules
    Table 3: Patient Classification and Nodule Detection at the Observer Performance Test (the same table as on the previous slide, with one row added by the slide author)
    AI alone | AUROC 0.91 | JAFROC FOM 0.885
    •"AI alone" was in most cases more accurate than "radiologist + AI"
    •Classification: better than 6 of the 9 radiologists
    •Nodule detection: better than all 9 radiologists
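    The two counts on this slide can be reproduced from the Test 2 (physician assisted by DLAD) columns of Table 3 for the nine radiologists, observers 10–18. A small sketch; the table values are rounded, and the classification count of 6 is reached only if the tie at 0.91 is counted in the algorithm's favor, which is an assumption made here rather than something the slide states:

```python
# Test 2 results (radiologist + DLAD) for observers 10-18, from Table 3.
test2_auroc = [0.90, 0.84, 0.91, 0.92, 0.88, 0.96, 0.93, 0.88, 0.87]
test2_fom = [0.865, 0.817, 0.841, 0.836, 0.840, 0.878, 0.872, 0.838, 0.827]

# Standalone DLAD performance annotated on the slide.
dlad_auroc, dlad_fom = 0.91, 0.885

# Classification: count with and without the tie at 0.91.
at_least_as_good_cls = sum(dlad_auroc >= a for a in test2_auroc)
strictly_better_cls = sum(dlad_auroc > a for a in test2_auroc)

# Nodule detection: strict comparison on JAFROC FOM.
strictly_better_det = sum(dlad_fom > f for f in test2_fom)

print(at_least_as_good_cls, strictly_better_cls, strictly_better_det)
```

    With the rounded values, the algorithm is strictly ahead of five radiologists on AUROC and level with a sixth, and ahead of all nine on JAFROC FOM.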
  • 156.
  • 157.
    Reading breast cancer pathology data
    Panels A, B, C, D: Benign without atypia / Atypia / DCIS (ductal carcinoma in situ) / Invasive carcinoma. Interpretation?
    Elmore et al. JAMA 2015, Diagnostic Concordance Among Pathologists
  • 158.
    Disagreement among pathologists in reading breast cancer biopsies
    Figure 4. Participating Pathologists’ Interpretations of Each of the 240 Breast Biopsy Test Cases
    Panels (x axis: case; y axis: interpretations, %):
    A. Benign without atypia: 72 cases, 2070 total interpretations
    B. Atypia: 72 cases, 2070 total interpretations
    C. DCIS: 73 cases, 2097 total interpretations
    D. Invasive carcinoma: 23 cases, 663 total interpretations
    Legend (pathologist interpretation): Benign without atypia / Atypia / DCIS / Invasive carcinoma. DCIS indicates ductal carcinoma in situ.
    Elmore et al. JAMA 2015, Diagnostic Concordance in Interpreting Breast Biopsies (Original Investigation)
  • 159.
    ISBI Grand Challenge on Cancer Metastases Detection in Lymph Node
  • 161.
  • 162.
    International Symposium on Biomedical Imaging 2016
    H&E Image Processing Framework
    Train: whole slide image → sample patches → training data (normal / tumor) → Convolutional Neural Network
    Test: whole slide image → overlapping image patches → Convolutional Neural Network → P(tumor) → tumor probability map (0.0 to 1.0)
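    The test-time half of this framework (cut the slide into overlapping patches, score each patch, assemble the scores into a tumor probability map) can be sketched as follows. This is a toy illustration in pure Python: the image is a small grid of numbers, and `classify_patch` is a hypothetical stand-in for the trained convolutional neural network, not the model used in the challenge:

```python
def classify_patch(patch):
    """Hypothetical stand-in for the CNN: returns P(tumor) in [0, 1].

    Here it is simply the mean pixel value of the patch.
    """
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


def tumor_probability_map(image, patch=2, stride=1):
    """Slide a patch x patch window over the image with the given stride.

    A stride smaller than the patch size makes neighbouring patches overlap.
    Returns a 2-D list with one P(tumor) value per window position.
    """
    height, width = len(image), len(image[0])
    prob_map = []
    for top in range(0, height - patch + 1, stride):
        row = []
        for left in range(0, width - patch + 1, stride):
            window = [r[left:left + patch] for r in image[top:top + patch]]
            row.append(classify_patch(window))
        prob_map.append(row)
    return prob_map


# Toy 3x4 "slide": 1.0 marks tumor-like pixels, 0.0 normal ones.
slide = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
heatmap = tumor_probability_map(slide)
for row in heatmap:
    print(row)
```

    On a real whole slide image the window would be a few hundred pixels wide and the number of patches would run into the hundreds of thousands, which is why the per-patch classifier is applied in batches on a GPU in practice.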
  • 163.
  • 164.
    Clinical study on ISBI dataset
    Reader | Error rate
    Pathologist in competition setting | 3.5%
    Pathologists in clinical practice (n = 12) | 13% - 26%
    Pathologists on micro-metastasis (small tumors) | 23% - 42%
    Beck Lab deep learning model | 0.65%
    Beck Lab’s deep learning model now outperforms pathologists
    Andrew Beck, Machine Learning for Healthcare, MIT 2017
  • 165.
    Google's AI for reading breast cancer pathology
    • The localization score (FROC) for the algorithm reached 89%, which significantly exceeded the score of 73% for a pathologist with no time constraint.
  • 166.
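    The AI's sensitivity + the human's specificity
    Yun Liu et al. Detecting Cancer Metastases on Gigapixel Pathology Images (2017)
    • Google's AI achieved a large improvement in sensitivity (92.9%, 88.5%)
      •@8FP: the sensitivity achievable when up to 8 false positives (FPs) are tolerated
      •FROC: the mean of the sensitivities when 1/4, 1/2, 1, 2, 4, and 8 FPs per slide are allowed
      •In other words, if a few FPs are tolerated, the AI can reach very high sensitivity
    • Human pathologists reach a sensitivity of 73%, but a specificity of almost 100%
    •Human pathologists and the AI pathologist are good at different things
    •If the two cooperate, improvements in reading efficiency, consistency, and sensitivity can be expected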
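    The FROC summary score described above is the mean of the sensitivities reached when 1/4, 1/2, 1, 2, 4, and 8 false positives per slide are allowed. A minimal sketch, assuming each candidate detection carries a confidence score and a true/false label and that each tumor is detected at most once; the detections below are invented for illustration and the function names are hypothetical:

```python
FP_RATES = [0.25, 0.5, 1, 2, 4, 8]  # allowed false positives per slide


def sensitivity_at(detections, n_tumors, n_slides, fp_per_slide):
    """Best sensitivity reachable without exceeding the allowed FP rate.

    detections: list of (confidence, is_true_positive) pairs.
    """
    allowed_fp = fp_per_slide * n_slides
    tp = fp = 0
    # Lower the confidence threshold step by step, most confident first.
    for _, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            tp += 1
        else:
            fp += 1
            if fp > allowed_fp:
                break  # one false positive too many: stop here
    return tp / n_tumors


def froc_score(detections, n_tumors, n_slides):
    """Mean sensitivity over the six FP-per-slide operating points."""
    sens = [sensitivity_at(detections, n_tumors, n_slides, r) for r in FP_RATES]
    return sum(sens) / len(sens)


# Invented example: 4 slides containing 4 tumors in total.
dets = [(0.95, True), (0.90, False), (0.80, True), (0.70, True),
        (0.60, False), (0.50, False), (0.40, True), (0.30, False)]
print(sensitivity_at(dets, 4, 4, 0.25), froc_score(dets, 4, 4))
```

    The "@8FP" figure on the slide corresponds to a single call such as `sensitivity_at(..., fp_per_slide=8)`, while the FROC score averages over all six operating points.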
  • 167.
  • 168.
  • 169.
    AI to assist polyp detection in colonoscopy
    •Some polyps were detected with only partial appearance.
    •Polyps were detected in both normal and insufficient light conditions.
    •Polyps were detected under both qualified and suboptimal bowel preparations.
    Excerpt (ARTICLES, NATURE BIOMEDICAL ENGINEERING):
    … from patients who underwent colonoscopy examinations up to 2 years later. Also, we demonstrated high per-image sensitivity (94.38% and 91.64%) in both the image (dataset A) and video (dataset C) analyses. Datasets A and C included large variations of polyp morphology and image quality (Fig. 3, Supplementary Figs. 2–5 and Supplementary Videos 3 and 4). For images with only flat and iso…
    … datasets are often small and do not represent the full range of colon conditions encountered in the clinical setting, and there are often discrepancies in the reporting of clinical metrics of success such as sensitivity and specificity (refs 19, 20, 26). Compared with other metrics such as precision, we believe that sensitivity and specificity are the most appropriate metrics for the evaluation of algorithm performance because of their independence on the ratio of positive to negative …
    Fig. 3 | Examples of polyp detection for datasets A and C. Polyps of different morphology, including flat isochromatic polyps (left), dome-shaped polyps (second from left, middle), pedunculated polyps (second from right) and sessile serrated adenomatous polyps (right), were detected by the algorithm (as indicated by the green tags in the bottom set of images) in both normal and insufficient light conditions, under both qualified and suboptimal bowel preparations. Some polyps were detected with only partial appearance (middle, second from right). See Supplementary Figs 2–6 for additional examples.
    Panel labels: flat isochromatic polyps / dome-shaped polyps / pedunculated polyps / sessile serrated adenomatous polyps
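    The excerpt's point that sensitivity and specificity do not depend on the ratio of positive to negative images, while precision does, can be checked with a small worked example. A sketch with hypothetical numbers: one detector with fixed 94% sensitivity and 95% specificity, evaluated on two invented datasets that differ only in how common polyp images are:

```python
def confusion_counts(n_pos, n_neg, sensitivity, specificity):
    """Expected confusion-matrix counts for a detector with fixed rates."""
    tp = n_pos * sensitivity
    fn = n_pos - tp
    tn = n_neg * specificity
    fp = n_neg - tn
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of positive calls that are truly positive."""
    return tp / (tp + fp)


# Same detector, two hypothetical datasets of 1,000 images each.
balanced = confusion_counts(500, 500, 0.94, 0.95)  # 50% polyp images
rare = confusion_counts(50, 950, 0.94, 0.95)       # 5% polyp images

precision_balanced = precision(balanced[0], balanced[1])
precision_rare = precision(rare[0], rare[1])
print(round(precision_balanced, 3), round(precision_rare, 3))
```

    Sensitivity and specificity are identical in both runs by construction, yet precision falls from roughly 0.95 to roughly 0.50 when polyp images become rare, which is why precision figures from datasets with different class ratios are hard to compare.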
  • 170.
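    Three types of medical AI
    •Analyzing complex medical data and deriving insights
    •Analyzing and reading medical imaging and pathology data
    •Monitoring continuous data for prevention and prediction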
  • 173.
  • 174.
    Fig 1. What can consumer wearables do? Heart rate can be measured with an oximeter built into a ring [3], muscle activity with an electromyographic sensor embedded into clothing [4], stress with an electrodermal sensor incorporated into a wristband [5], and physical activity or sleep patterns via an accelerometer in a watch [6,7]. In addition, a female’s most fertile period can be identified with detailed body temperature tracking [8], while levels of mental attention can be monitored with a small number of non-gelled electroencephalogram (EEG) electrodes [9]. Levels of social interaction (also known to a…
    PLOS Medicine 2016
  • 175.
  • 177.
    SEPSIS
    A targeted real-time early warning score (TREWScore) for septic shock
    Katharine E. Henry,1 David N. Hager,2 Peter J. Pronovost,3,4,5 Suchi Saria1,3,5,6 *
    Sepsis is a leading cause of death in the United States, with mortality highest among patients who develop septic shock. Early aggressive treatment decreases morbidity and mortality. Although automated screening tools can detect patients currently experiencing severe sepsis and septic shock, none predict those at greatest risk of developing shock. We analyzed routinely available physiological and laboratory data from intensive care unit patients and developed “TREWScore,” a targeted real-time early warning score that predicts which patients will develop septic shock. TREWScore identified patients before the onset of septic shock with an area under the ROC (receiver operating characteristic) curve (AUC) of 0.83 [95% confidence interval (CI), 0.81 to 0.85]. At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 and identified patients a median of 28.2 [interquartile range (IQR), 10.6 to 94.2] hours before onset. Of those identified, two-thirds were identified before any sepsis-related organ dysfunction. In comparison, the Modified Early Warning Score, which has been used clinically for septic shock prediction, achieved a lower AUC of 0.73 (95% CI, 0.71 to 0.76). A routine screening protocol based on the presence of two of the systemic inflammatory response syndrome criteria, suspicion of infection, and either hypotension or hyperlactatemia achieved a lower sensitivity of 0.74 at a comparable specificity of 0.64. Continuous sampling of data from the electronic health records and calculation of TREWScore may allow clinicians to identify patients at risk for septic shock and provide earlier interventions that would prevent or mitigate the associated morbidity and mortality.
    INTRODUCTION
    Seven hundred fifty thousand patients develop severe sepsis and septic shock in the United States each year. More than half of them are admitted to an intensive care unit (ICU), accounting for 10% of all ICU admissions, 20 to 30% of hospital deaths, and $15.4 billion in annual health care costs (1–3). Several studies have demonstrated that morbidity, mortality, and length of stay are decreased when severe sepsis and septic shock are identified and treated early (4–8). In particular, one study showed that mortality from septic shock increased by 7.6% with every hour that treatment was delayed after the onset of hypotension (9).
    More recent studies comparing protocolized care, usual care, and early goal-directed therapy (EGDT) for patients with septic shock suggest that usual care is as effective as EGDT (10–12). Some have interpreted this to mean that usual care has improved over time and reflects important aspects of EGDT, such as early antibiotics and early aggressive fluid resuscitation (13). It is likely that continued early identification and treatment will further improve outcomes. However, the … Acute Physiology Score (SAPS II), Sequential Organ Failure Assessment (SOFA) scores, Modified Early Warning Score (MEWS), and Simple Clinical Score (SCS) have been validated to assess illness severity and risk of death among septic patients (14–17). Although these scores are useful for predicting general deterioration or mortality, they typically cannot distinguish with high sensitivity and specificity which patients are at highest risk of developing a specific acute condition.
    The increased use of electronic health records (EHRs), which can be queried in real time, has generated interest in automating tools that identify patients at risk for septic shock (18–20).
    A number of “early warning systems,” “track and trigger” initiatives, “listening applications,” and “sniffers” have been implemented to improve detection and timeliness of therapy for patients with severe sepsis and septic shock (18, 20–23). Although these tools have been successful at detecting patients currently experiencing severe sepsis or septic shock, none predict which patients are at highest risk of developing septic shock.
    The adoption of the Affordable Care Act has added to the growing excitement around predictive models derived from electronic health …
    RESEARCH ARTICLE. http://stm.sciencemag.org/
  • 178.
    RESEARCH ARTICLE: A targeted real-time early warning score (TREWScore) for septic shock
    (Cropped page excerpt: results on the AUC in the validation set, identification of patients before sepsis-related organ dysfunction, and comparison of TREWScore with MEWS.)
    Fig. 2. ROC for detection of septic shock before onset in the validation set. The ROC curve for TREWScore is shown in blue, with the ROC curve for MEWS in red. The sensitivity and specificity performance of the routine screening criteria is indicated by the purple dot. Normal 95% CIs are shown for TREWScore and MEWS. TPR, true-positive rate; FPR, false-positive rate.
    AUC = 0.83
    At a specificity of 0.67, TREWScore achieved a sensitivity of 0.85 and identified patients a median of 28.2 hours before onset.
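    The "median hours before onset" figure is a lead time. A minimal sketch of the bookkeeping behind such a number, assuming a patient is flagged the first time a continuously updated risk score reaches a threshold; the threshold, scores, and times below are hypothetical, and this is not the TREWScore model itself, only the thresholding and lead-time arithmetic around any such score:

```python
from statistics import median


def first_alert_hour(series, threshold):
    """Hour of the first observation whose score reaches the threshold.

    series: list of (hour, score) pairs in chronological order.
    Returns None if the score never reaches the threshold.
    """
    for hour, score in series:
        if score >= threshold:
            return hour
    return None


def lead_times(patients, threshold):
    """Hours from first alert to shock onset, for patients alerted before onset."""
    leads = []
    for series, onset_hour in patients:
        alert = first_alert_hour(series, threshold)
        if alert is not None and alert < onset_hour:
            leads.append(onset_hour - alert)
    return leads


# Hypothetical patients: (risk-score time series, hour of septic shock onset).
patients = [
    ([(0, 0.1), (6, 0.3), (12, 0.7)], 40),  # first alert at hour 12
    ([(0, 0.2), (4, 0.6), (8, 0.9)], 14),   # first alert at hour 4
    ([(0, 0.1), (5, 0.2), (10, 0.3)], 20),  # never alerted
]
leads = lead_times(patients, threshold=0.5)
print(leads, median(leads))
```

    Raising the threshold would reduce false alarms (higher specificity) but shorten or lose some of these lead times, which is the trade-off the ROC curve summarizes.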
  • 179.
    March 2019, the Future of Individual Medicine @San Diego
  • 181.
  • 182.
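    •Launched in the United States as an iPhone app
    •The key question is how cumbersome it is to use
    •How long does it need to be used to be effective: two weeks? a lifetime?
    •How will food logging and similar tasks be handled?
    •The pricing model does not appear to have been disclosed yet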
  • 183.
  • 184.
    ADA 2017, San Diego, Courtesy of Taeho Kim (Seoul Medical Center)
  • 185.
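    Three types of medical AI
    •Analyzing complex medical data and deriving insights
    •Analyzing and reading medical imaging and pathology data
    •Monitoring continuous data for prevention and prediction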
  • 186.
    Three steps of digital healthcare
    •Step 1. Measuring data
    •Step 2. Integrating data
    •Step 3. Analyzing data
  • 187.
    Characteristics of the healthcare market and industry: why is it so hard?
  • 188.
    Is the global healthcare market a big market? YES and NO.
  • 190.
    In total, the global healthcare market is large: 12 trillion US dollars (13,046 trillion won).
  • 191.
    However, the healthcare market is the sum of many extremely fragmented small markets.
  • 192.
    It is impossible to meet the needs of every sub-market.
  • 193.
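    Needs in the healthcare market are highly segmented from customer to customer.
    • Healthy people / patients
    • 20s / 30s / 40s / 50s / 60s / 70s / 80s
    • Male / female
    • Underweight / normal weight / overweight
    • Family history
    • Interest in health
    • Ability to pay
    • Digital literacy
    • B2C / B2B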
  • 194.
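    So what should we do?
    • It is impossible to meet the needs of every customer.
    • In the end, the only option is to target one segment at a time.
    • Then which customers should we choose?
      • Which customer segment has the most desperate need?
      • Which customers can we actually offer a solution to?
      • Which customers are able to pay?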
  • 195.
    The healthcare market paradox
    Axis: severity (healthy people / acute illness / severe illness)
    Willingness to pay: high / low
    Target customers: many / few
  • 196.
  • 197.
  • 198.
  • 199.
    Who is the customer?
    Patient / physician / company / insurer
    Who pays? Who decides? Who uses it?
  • 200.
    Who is the payer?
    Patient / physician / company / insurer
    Who pays? Who decides? Who uses it?
  • 201.
    Will they pay?
    How great the need is + the payment structure (reimbursement fee schedule, purchasing team, parents)
  • 202.
  • 203.
    Company / customers / other companies / government
    Unlike a typical industry ecosystem, the healthcare industry ecosystem has numerous stakeholders.
  • 204.
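    Healthcare company / civic groups / insurers / other companies / government / regulators / patients / hospitals / Health Insurance Review and Assessment Service (HIRA)
    Unlike a typical startup ecosystem, the healthcare startup ecosystem has numerous stakeholders.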
  • 205.
    A new healthcare company inevitably faces a gap between itself and these stakeholders.
    New company / government / civic groups / regulators / insurers / hospitals
  • 206.
  • 208.
  • 209.
    http://graphics.wsj.com/billion-dollar-club/
    •Company valuation: $9b (June 2014)
    •Total funding raised: $400m
    •Elizabeth Holmes herself holds a majority stake
  • 210.
    The Journal of Clinical Investigation, CLINICAL MEDICINE
    Evaluation of direct-to-consumer low-volume lab tests in healthy adults
    Brian A. Kidd,1,2,3 Gabriel Hoffman,1,2 Noah Zimmerman,3 Li Li,1,2,3 Joseph W. Morgan,3 Patricia K. Glowe,1,2,3 Gregory J. Botwin,3 Samir Parekh,4 Nikolina Babic,5 Matthew W. Doust,6 Gregory B. Stock,1,2,3 Eric E. Schadt,1,2 and Joel T. Dudley1,2,3
    1 Department of Genetics and Genomic Sciences, 2 Icahn Institute for Genomics and Multiscale Biology, 3 Harris Center for Precision Wellness, 4 Department of Hematology and Medical Oncology, and 5 Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York, USA. 6 Hope Research Institute (HRI), Phoenix, Arizona, USA.
    BACKGROUND. Clinical laboratory tests are now being prescribed and made directly available to consumers through retail outlets in the USA. Concerns with these tests have been raised regarding the uncertainty of testing methods used in these venues and a lack of open, scientific validation of the technical accuracy and clinical equivalency of results obtained through these services.
    METHODS. We conducted a cohort study of 60 healthy adults to compare the uncertainty and accuracy in 22 common clinical lab tests between one company offering blood tests obtained from finger prick (Theranos) and 2 major clinical testing services that require standard venipuncture draws (Quest and LabCorp). Samples were collected in Phoenix, Arizona, at an ambulatory clinic and at retail outlets with point-of-care services.
    RESULTS. Theranos flagged tests outside their normal range 1.6× more often than other testing services (P < 0.0001). Of the 22 lab measurements evaluated, 15 (68%) showed significant interservice variability (P < 0.002). We found nonequivalent lipid panel test results between Theranos and other clinical services. Variability in testing services, sample collection times, and subjects markedly influenced lab results.
    CONCLUSION. While laboratory practice standards exist to control this variability, the disparities between testing services we observed could potentially alter clinical interpretation and health care utilization. Greater transparency and evaluation of testing technologies would increase their utility in personalized health management.
    FUNDING. This work was supported by the Icahn Institute for Genomics and Multiscale Biology, a gift from the Harris Family Charitable Foundation (to J.T. Dudley), and grants from the NIH (R01 DK098242 and U54 CA189201, to J.T. Dudley, and R01 AG046170 and U01 AI111598, to E.E. Schadt).
    Introduction
    Clinical laboratory testing plays a critical role in health care and evidence-based medicine (1). Lab tests provide essential data that support clinical decisions to screen, diagnose, and treat health conditions (2). Most individuals encounter clinical testing through their health care provider during a routine health assessment or as a patient in a health care facility. However, individuals are increasingly playing more active roles in managing their health, and some now seek direct access to laboratory testing for self-guided assessment or monitoring (3–5).
    In the USA, all clinical laboratory testing conducted on humans is regulated by Centers for Medicare & Medicaid Services (CMS) based on guidelines outlined in Clinical Laboratory Improvement Amendments (CLIA) (6). To ensure analytical quality of laboratory methods, certified laboratories are required to participate in periodic proficiency testing using a homogeneous batch of samples that are distributed to each laboratory from a CMS-approved proficiency testing program. These programs assess the total allowable error (TEa) that combines method bias and total imprecision for each analyte. Acceptability criteria are determined by CLIA and/or the appropriate accrediting agency (7).
    Direct-to-consumer service models now provide means for individuals to obtain laboratory testing outside traditional health care settings (4, 5). One company implementing this new model is Theranos, which offers a blood testing service that uses capillary tube collection and promises several advantages over traditional venipuncture: lower collection volumes (typically ≤150 μl versus ≥1.5 ml), convenience, and reduced cost (on average about 5-fold less than the 2 largest testing laboratories in the USA, Quest and LabCorp) (8). However, availability of these services varies by state, where access to offerings may be more or less restrictive …
    Conflict of interest: J.T. Dudley owns equity in NuMedii Inc. and has received consulting fees or honoraria from Janssen Pharmaceuticals, GlaxoSmithKline, AstraZeneca, and LAM Therapeutics.
    Role of funding source: Study funding provided by the Icahn Institute for Genomics and Multiscale Biology and the Harris Center for Precision Wellness at the Icahn School of Medicine at Mount Sinai. Salaries of B.A. Kidd, J.T. Dudley, and E.E. Schadt …
    http://dx.doi.org/10.1172/JCI86318
    •A paper from Mount Sinai on the accuracy of Theranos
      •Conducted around July 2015 in 60 healthy adults over 5 days
      •22 test items were submitted to Theranos and to two other testing services, and the results were compared
    •In conclusion, the Theranos results were considerably inaccurate
      •For items such as cholesterol, inaccurate enough to change a physician's diagnosis
      •Across the tests overall, Theranos flagged results as outside the normal range 1.6 times more often
      •15 of the 22 test items showed a significant difference in results
    •Another problem that cannot be determined from the paper
      •Whether Theranos actually used the Edison device that it "claimed" to have developed in-house
      •According to a former employee's account in the WSJ, by around July 2015 Theranos was no longer using the Edison device
      •and was instead diluting blood samples and running them on existing devices from Siemens and others
    •As might be expected, Theranos again responded that the paper was flawed because of a conflict of interest
  • 211.
  • 212.
    To claim something new, you need evidence: papers, clinical studies…
  • 213.
    Yeo-jung Moon, Director at IMM Investment (board-certified OB-GYN): “Which healthcare startups should you invest in?”
  • 214.
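    “Which healthcare startups should you invest in?”
    Yeo-jung Moon, Director at IMM Investment (board-certified OB-GYN): “You should invest in startups that know there is no easy answer in Korean healthcare.”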
  • 215.
    Characteristics of the Korean healthcare market: Is a healthcare industry viable in Korea?
  • 216.
  • 217.
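    Market problems
    • No convincing answer to “Why is Korea an attractive market?”
    • The domestic market is too small (and healthcare is a fragmented market)
        • Innovation is hard to attempt, and sustainable business models are few
        • Companies are eventually forced to consider going abroad
        • In practice, however, many end up stuck, unable to commit either way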
  • 218.
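    Market problems
    • The government and patients see healthcare as ‘welfare’ rather than as an industry
        • Consumers have little sense that healthcare is something they pay for
        • They want care that is ‘cheap or even free’ and yet ‘perfect’
        • There is a perception that no one should make money from healthcare
    • Korea is fundamentally a low-trust society
        • Government, patients, the medical community, and industry do not trust one another
        • The result is dense regulation written in advance and a structure in which no party recognizes the others' expertise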
  • 219.
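    Market problems
    • Few founders and investors understand what makes healthcare different
        • Regulation, approvals, and insurance reimbursement, plus a complex web of stakeholders
        • Few founders understand these market specifics and have found product-market fit
        • Few investors can recognize such founders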
  • 220.
  • 221.
    DTC (Direct-To-Consumer) Genetic Testing: Results within 6-8 weeks. A little spit is all it takes!
  • 222.
  • 223.
  • 227.
    JAMIA 2016: Remote Patient Monitoring via Dexcom-HealthKit-Epic-Stanford
    Kumar R B, et al. J Am Med Inform Assoc 2016;0:1–6. doi:10.1093/jamia/ocv206, Brief Communication. Downloaded from http://jamia.oxfordjournals.org/ by guest on April 7, 2016.

    transfer from Share2 to HealthKit as mandated by Dexcom receiver Food and Drug Administration device classification. Once the glucose values reach HealthKit, they are passively shared with the Epic MyChart app (https://www.epic.com/software-phr.php). The MyChart patient portal is a component of the Epic EHR and uses the same database, and the CGM values populate a standard glucose flowsheet in the patient’s chart. This connection is initially established when a pro-

    Participation required confirmation of Bluetooth pairing of the CGM receiver to a mobile device, updating the mobile device with the most recent version of the operating system, Dexcom Share2 app, Epic MyChart app, and confirming or establishing a username and password for all accounts, including a parent’s/adolescent’s Epic MyChart account. Setup time averaged 45–60 minutes in addition to the scheduled clinic visit. During this time, there was specific verbal and written notification to the patients/par-

    Figure 1: Overview of the CGM data communication bridge architecture.
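
    The data path described in the excerpt above (CGM receiver, Dexcom Share2 app, HealthKit, Epic MyChart app, glucose flowsheet in the Epic EHR) can be sketched as a chain of forwarding stages. This is only an illustrative model of the flow, not the actual Dexcom, Apple, or Epic interfaces: the `GlucoseReading` and `Stage` classes, the `build_bridge` helper, and the sample value are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GlucoseReading:
    """One CGM sample. Field names are illustrative, not a vendor schema."""
    timestamp: datetime
    mg_dl: int


@dataclass
class Stage:
    """One hop in the bridge: records each reading, then forwards it."""
    name: str
    downstream: Optional["Stage"] = None
    received: List[GlucoseReading] = field(default_factory=list)

    def push(self, reading: GlucoseReading) -> None:
        self.received.append(reading)
        if self.downstream is not None:
            self.downstream.push(reading)


def build_bridge() -> Tuple[Stage, List[Stage]]:
    """Wire the hops in the order the excerpt describes (hypothetical model)."""
    flowsheet = Stage("Epic EHR glucose flowsheet")
    mychart = Stage("Epic MyChart app", flowsheet)
    healthkit = Stage("HealthKit", mychart)
    share2 = Stage("Dexcom Share2 app", healthkit)
    receiver = Stage("CGM receiver", share2)
    return receiver, [receiver, share2, healthkit, mychart, flowsheet]


if __name__ == "__main__":
    entry, stages = build_bridge()
    # Made-up sample value, used only to show the reading traversing every hop.
    entry.push(GlucoseReading(datetime(2016, 4, 7, 8, 0), 112))
    for stage in stages:
        print(stage.name, len(stage.received))
```

    Modelling each hop as its own object makes the point of the slide visible: the clinic only sees the data once every intermediate account and app pairing is in place, which is consistent with the 45–60 minute setup time the excerpt reports.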
  • 229.
    • New York • First-time home visit $50; regular visits $200; physical $100
  • 232.
    • Adds an interactive policy based on wearables such as Fitbit and on smartphones to 'all' insurance products
  • 233.
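    • A program that enrolls users in a $1,000 insurance policy for 'free' in exchange for their wearable data
    • Insurance against sudden death, offered in partnership with Amica Life and Greenhouse Life Insurance Company
    • The contents of the data will not change the coverage or the premium rate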
  • 234.
  • 235.
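    • Beam Dental sells a new kind of dental insurance built around a smart toothbrush
    • Policyholders receive regular free shipments of a smart toothbrush, toothpaste, and floss
    • Dynamic pricing based on brushing data (with the user's consent)
    • Raised $22.5m from KPCB (Series C)
    • Currently available in 16 US states, with plans to expand to 35 states by year-end using this investment
    • “The dental insurance market has fewer regulations and obstacles than general health insurance.”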
  • 236.
  • 237.
    Illegal in Korea. (Or, alternatively, not reimbursed.)
  • 238.
  • 239.
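    Regulatory problems (1/3)
    • Positive-list regulation
        • Anything not explicitly permitted by law is illegal
        • A regulatory sandbox and comprehensive negative-list regulation are under discussion
    • MFDS (Ministry of Food and Drug Safety): medical device approval
        • Has improved considerably in recent years
        • Has proactively issued guidance such as the guideline for AI medical devices
    • HIRA (Health Insurance Review and Assessment Service): new health technology assessment
        • Double regulation: a device cannot be sold even after it receives MFDS approval
        • This is why launching in the domestic market takes longer
        • Doing something genuinely new is difficult in Korea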
  • 240.
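    Regulatory problems (2/3)
    • Uncertainty in regulation of medical data
        • No clear legal definition of ‘medical data’
        • No legal definition of personally identifiable information, de-identification, or re-identification
    • Domestic perception of data businesses
        • The framing that “large corporations make money from patients' data”
        • Perfect protection and maximum data value are both demanded at once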
  • 241.
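    Regulatory problems (3/3)
    • Other major regulations
        • For-profit hospitals are banned (vs. Apple establishing its own hospital)
        • Telemedicine is banned (vs. Teladoc)
        • Delivery of medicines is banned (vs. Amazon, Baidu)
        • DTC genetic testing is banned (vs. 23andMe)
        • Health management services by insurers are a gray zone (vs. Oscar, Omada/Noom)
        • Ride-sharing services and patient solicitation are banned (vs. Uber Health)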
  • 242.
  • 243.
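    Distinctive features of Korean healthcare (1/3)
    • A single national health insurance with mandatory provider designation
        • The government controls the price of every medical service
        • The government (HIRA), not the clinician, judges whether a given service is medically necessary
        • Physicians sometimes cannot use something they need because it is not reimbursed
        • The government fundamentally does not trust the medical community (‘low-trust society’)
        • Saving health insurance funds is one of the top goals
        • Very conservative about boldly adopting new, innovative technology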
  • 244.
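    Distinctive features of Korean healthcare (2/3)
    • Moon Jae-in Care: expanded coverage
        • All ‘medically essential’ services are to be covered by national health insurance
        • Expanded coverage = more state control = less corporate autonomy = hindered innovation (the opposite of the global direction)
    • Low reimbursement rates (one of the most fundamental problems)
        • Reimbursement decisions under national health insurance are very conservative (e.g., AI medical devices)
        • Even when reimbursement is granted, the rate is low (it covers only part of the cost)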
  • 245.
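    Distinctive features of Korean healthcare (3/3)
    • High accessibility
        • Same-day care in Korea vs. 2-3 weeks after booking in the US
        • This is why many US business models do not hold up in Korea
    • A collapsed healthcare delivery system
        • Even a patient with a common cold can walk into the Seoul National University Hospital emergency room
        • The division of roles among primary, secondary, and tertiary hospitals has broken down (a crisis, but possibly also an opportunity)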
  • 246.
  • 247.
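    A few possible solutions
    • 1. Split national health insurance into two or more schemes
        • In the end, every problem in the Korean healthcare market comes back to national health insurance
        • Create an additional national health insurance scheme that adopts innovative technology more actively
        • Probably the fundamental solution, but realistically impossible
        • It runs afoul of the so-called ‘law of public sentiment’
        • Issues of ‘mechanical equity’ and ‘political correctness’
        • The public does not accept that people with money receive better service
        • It runs counter to coverage expansion and collides with opposition to the ‘commercialization of healthcare’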
  • 248.
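    A few possible solutions
    • 2. B2B2C through private insurers
        • Private insurers have recently shown growing interest in digital healthcare services
        • However, domestic insurers lack the relevant experience, expertise, and data
        • A guideline on health management services was issued recently
        • Gray zones remain, however, and there is room for interpretation
        • The Ministry of Health and Welfare launched a health management services task force in early 2018, but it has shown no activity so far
        • Pushback from civic groups is also a burden
        • The framing that ‘insurers make money from policyholders' data’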


  • 249.
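    A few possible solutions
    • 3. Focus on wellness and healthcare-adjacent services rather than medical devices
        • Operate in areas outside regulated medical acts such as diagnosis and treatment
        • This rarely solves the fundamental problems of medicine, but a business is at least possible
        • B2B for hospitals
        • B2C for general patients
        • Most digital healthcare companies that have succeeded in Korea fall into this category
        • Gangnam Unni by Care Labs (IPO in 2018): an O2O platform for plastic surgery
        • Most of DHP's investments also fall into this category
        • Hospital chatbots, VR solutions for medical students and physicians (plus mindfulness meditation)
        • O2O platforms: diabetes, hair loss, doctor search (plus caregivers)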
  • 250.
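    A few possible solutions
    • 4. Go abroad
        • Leave the Korean market, which is small, has little willingness to pay, and lacks adequate regulation
        • A growing move to enter the US, Europe, China, Japan, Southeast Asia, and other markets first
        • Some startups now start overseas from day one
        • More companies develop the technology and then launch overseas first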
  • 252.
    Feedback/Questions • Email: yoonsup.choi@gmail.com • Blog: http://www.yoonsupchoi.com • Facebook: Yoon Sup Choi