Extracting Webtoon Images
1101551, Department of Computer and Information Engineering, Kang Taek-hyun (강택현)
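How the Scraping Works
• The gethtml function fetches the source of the page we want to view and saves it to a file.
• The saved source is read back in, and the extractimgs function uses the regular expression exp to scan it for every part that contains an image.
• The scanned images are saved to files.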
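The scanning step can be tried on its own, without touching the network. The sketch below is written for Python 3 and runs a pattern of the same shape as exp over a made-up HTML fragment. The sample markup and its file names are assumptions for illustration and do not come from a real Naver page.

```python
import re

# Pattern of the same shape as exp in the slides, with literal dots escaped.
IMG_PATTERN = re.compile(
    r'<img +src="(http://imgcomic\.naver\.net/webtoon/[0-9]+/[0-9]+/.+?\.jpg)"'
)


def extractimgs(html):
    """Return every webtoon image URL captured from the HTML text."""
    return IMG_PATTERN.findall(html)


# Hypothetical page fragment: two webtoon images and one unrelated banner.
SAMPLE_HTML = (
    '<div>'
    '<img src="http://imgcomic.naver.net/webtoon/637931/2/sample_1.jpg" alt="">'
    '<img src="http://example.com/banner.jpg">'
    '<img src="http://imgcomic.naver.net/webtoon/637931/2/sample_2.jpg">'
    '</div>'
)

# Print the captured URLs, one per line.
for url in extractimgs(SAMPLE_HTML):
    print(url)
```

Only the two URLs on the image host are captured; the banner is skipped because its host does not match the pattern.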
소스 # coding: utf-8
import re, sys
import urllib2
def savewebtoon(titleID, i):
URL =
'http://comic.naver.com/webtoon/detail.nhn?titleId=‘+str(titleID)+’&n
o='+str(i)
htmlfullname = 'game_'+str(i)+'.html'
html = gethtml(URL)
savefile(html, htmlfullname)
f = open(htmlfullname, 'r')
html = f.read()
f.close()
imgs = extractimgs(html)
if len(imgs) == 0:
print >> sys.stderr, "No images!"
filenum = 1
for img in imgs:
URL =
'http://imgcomic.naver.com/webtoon/637931/2/20141015232622_846
0c8c1462a1fca3df61710b8842bfd_IMAG01_'+ str(filenum)+'.jpg'
saveimg(URL,filenum)
filenum+=1
return 0
def extractimgs(html):
    # Capture the src of every <img> tag served from the webtoon image host.
    exp = re.compile(r'<img +src="(http://imgcomic\.naver\.net/webtoon/[0-9]+/[0-9]+/.+?\.jpg)"')
    imgs = exp.findall(html)
    return imgs

def savefile(contents, filename):
    # Write the page source to a file.
    f = open(filename, 'w')
    f.write(contents)
    f.close()
    return 0

def gethtml(url):
    # Fetch the raw page source.
    response = urllib2.urlopen(url)
    return response.read()

def saveimg(URL, filenum):
    # Download one image and write it to disk in binary mode.
    filename = 'episode' + str(filenum) + '.jpg'
    f = open(filename, 'wb')
    response = urllib2.urlopen(URL).read()
    f.write(response)
    f.close()
    return 0

def main():
    titleID = 637931
    end = 11
    # Process episodes 1 through end - 1.
    for i in range(1, end):
        savewebtoon(titleID, i)
    return 0

if __name__ == '__main__':
    sys.exit(main())
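
For readers on Python 3, where urllib2 no longer exists, the same flow can be sketched with urllib.request. This is a rough port under stated assumptions, not the author's code. The fetch parameter, the episode_url and image_filename helpers, and the per-episode file names (which keep one episode's images from overwriting another's) are introduced here for illustration. The page layout the pattern expects may also have changed since the slides were written.

```python
import re
import sys
import urllib.request

# Same pattern shape as exp in the slides, with literal dots escaped.
IMG_PATTERN = re.compile(
    r'<img +src="(http://imgcomic\.naver\.net/webtoon/[0-9]+/[0-9]+/.+?\.jpg)"'
)


def episode_url(title_id, no):
    """Build the episode page URL (hypothetical helper)."""
    return ('http://comic.naver.com/webtoon/detail.nhn?titleId='
            + str(title_id) + '&no=' + str(no))


def gethtml(url):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def image_filename(no, filenum):
    """Name files per episode so later episodes do not overwrite earlier ones
    (an assumption; the slides use 'episode' + filenum + '.jpg')."""
    return 'episode{}_{}.jpg'.format(no, filenum)


def savewebtoon(title_id, no, fetch=gethtml):
    """Download every image of one episode and return the saved file names.

    fetch is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    html = fetch(episode_url(title_id, no)).decode('utf-8', errors='replace')
    imgs = IMG_PATTERN.findall(html)
    if not imgs:
        print('No images!', file=sys.stderr)
    saved = []
    for filenum, img_url in enumerate(imgs, start=1):
        filename = image_filename(no, filenum)
        with open(filename, 'wb') as f:
            f.write(fetch(img_url))
        saved.append(filename)
    return saved
```

Calling savewebtoon(637931, 1) would perform real network requests and write files into the current directory. Passing a stand-in for fetch, such as a function that returns canned bytes, lets the flow be checked offline.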
