Python Web Crawling Practice Example #4

KellyEnLab 2020. 3. 2. 14:12

This web crawling practice example takes a search term, searches Naver blogs, and prints the title and link of each result up to the page you specify.

from bs4 import BeautifulSoup
import urllib.request
import urllib.parse


def get_soup(target_url):
    # Download the page and parse it with Python's built-in HTML parser.
    html = urllib.request.urlopen(target_url).read()
    return BeautifulSoup(html, 'html.parser')


def extract_data(soup):
    # Each element with the 'sh_blog_title' class carries the post title
    # and link in its 'title' and 'href' attributes.
    for link in soup.find_all(class_='sh_blog_title'):
        print(link.attrs['title'])
        print(link.attrs['href'])
        print()


# quote_plus makes the search term safe to embed in a query string.
query = urllib.parse.quote_plus(input('Enter a search term: '))
last_page = int(input('How many pages should be crawled? '))

# The 'start' parameter is the 1-based index of the first result and
# advances by 10 per page, so page n begins at (n - 1) * 10 + 1.
for page in range(1, last_page + 1):
    start = (page - 1) * 10 + 1
    target_url = f'https://search.naver.com/search.naver?date_from=&date_option=0&date_to=&dup_remove=1&nso=&post_blogurl=&post_blogurl_without=&query={query}&sm=tab_pge&srchby=all&st=sim&where=post&start={start}'
    soup = get_soup(target_url)

    print(f'*********** Results for page {page} ***************')
    print()
    extract_data(soup)
    print()
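
The page-to-offset arithmetic and the query string in the listing above can be separated into small functions that are easy to check without touching the network. The following is only a sketch: `page_to_start`, `build_search_url`, `BASE_URL`, and `RESULTS_PER_PAGE` are hypothetical names introduced here, and leaving out the empty parameters (`date_from`, `date_to`, `nso`, and so on) assumes that Naver treats a missing parameter the same as an empty one.

```python
from urllib.parse import urlencode

BASE_URL = 'https://search.naver.com/search.naver'
# Assumption: taken from the step of 10 used for 'start' in the listing above.
RESULTS_PER_PAGE = 10


def page_to_start(page):
    """Return the 1-based index of the first result on the given page."""
    if page < 1:
        raise ValueError('page numbers start at 1')
    return (page - 1) * RESULTS_PER_PAGE + 1


def build_search_url(query, page):
    """Build the blog-search URL for one results page (hypothetical helper)."""
    params = {
        'where': 'post',
        'query': query,  # urlencode applies quote_plus to each value by default
        'st': 'sim',
        'sm': 'tab_pge',
        'srchby': 'all',
        'dup_remove': 1,
        'date_option': 0,
        'start': page_to_start(page),
    }
    return f'{BASE_URL}?{urlencode(params)}'


if __name__ == '__main__':
    # Print the URLs for the first three pages; no request is sent.
    for page in range(1, 4):
        print(page, build_search_url('python crawling', page))
```

Building the query with `urlencode` removes the need to call `quote_plus` by hand and keeps each parameter on its own line, which makes the long URL easier to read and change.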
 

License: Attribution, NonCommercial, NoDerivatives
