▶ Quotes to Scrape
A site that collects quotes for scraping practice.
It serves as a kind of demo site.
* Site
Quotes to Scrape
quotes.toscrape.com
# example.py
import requests
from bs4 import BeautifulSoup
url = 'https://quotes.toscrape.com/tag/love/'
# Download: use the url to fetch the data that contains the HTML
response = requests.get(url)
# Get the HTML document as text
html_text = response.text
# Prints <class 'str'>
print(type(html_text))
# Parsing the raw string by hand is very complicated to code.
# Let's use a library!
soup = BeautifulSoup(html_text, 'html.parser')
# Prints <class 'bs4.BeautifulSoup'>
print(type(soup))
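# 1. find
# - Returns the first element with the given tag
main = soup.find('title')
print(main)
# 2. find_all
# - Returns every element with the given tag
# - The result comes back as a list (a ResultSet, which subclasses list)
a_tags = soup.find_all('a')
print(a_tags)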
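# 3. Select one element with a CSS selector
# - Returns the first element that matches the selector
# - Searching by the span tag would also work, but
#   the quote text is marked with the text class,
#   so searching by class is the better choice.
word = soup.select_one('.text')
print(f'첫 번째 글 = {word.text}')  # the label means "first quote"
# 4. Select multiple elements with a CSS selector
# - Finds every quote
words = soup.select('.text')
for w in words:
    print(f'글 : {w.text}')  # the label means "quote"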
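The loop above prints only the quote text. As a side note, separate from the script itself, here is a minimal sketch of collecting the author and tags alongside each quote. It parses an inline HTML string instead of downloading the page, and the markup is an assumption: the `quote` container class, the `author` class, and the helper name `extract_quotes` are hypothetical stand-ins, while `.text` and `a.tag` follow the classes that appear in this post.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a stand-in for one page of quotes, not the site's real HTML.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">“First sample quote.”</span>
  <small class="author">Author One</small>
  <a class="tag" href="/tag/love/page/1/">love</a>
  <a class="tag" href="/tag/life/page/1/">life</a>
</div>
<div class="quote">
  <span class="text">“Second sample quote.”</span>
  <small class="author">Author Two</small>
  <a class="tag" href="/tag/love/page/1/">love</a>
</div>
"""


def extract_quotes(html_text):
    """Return one dict per quote container found in html_text."""
    soup = BeautifulSoup(html_text, 'html.parser')
    results = []
    # Search inside each container so text, author, and tags stay paired.
    for block in soup.select('.quote'):
        text = block.select_one('.text')
        author = block.select_one('.author')
        # select_one returns None when nothing matches, so guard before reading.
        results.append({
            'text': text.get_text(strip=True) if text else None,
            'author': author.get_text(strip=True) if author else None,
            'tags': [a.get_text(strip=True) for a in block.select('a.tag')],
        })
    return results


quotes = extract_quotes(SAMPLE_HTML)
for q in quotes:
    print(q['author'], q['tags'], q['text'])
```

Calling `select_one` and `select` on each container rather than on the whole `soup` is what keeps a quote matched with its own author and tags. If the real page uses different class names, only the selectors need to change.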
# Run it
python example.py
$ python example.py
<class 'str'>
<class 'bs4.BeautifulSoup'>
<title>Quotes to Scrape</title>
[<a href="/" style="text-decoration: none">Quotes to Scrape</a>, <a href="/login">Login</a>, <a href="/tag/love/page/1/">love</a>, <a href="/author/Andre-Gide">(about)</a>, <a class="tag" href="/tag/life/page/1/">life</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/author/Marilyn-Monroe">(about)</a>, <a class="tag" href="/tag/friends/page/1/">friends</a>, <a class="tag" href="/tag/heartbreak/page/1/">heartbreak</a>, <a class="tag" href="/tag/inspirational/page/1/">inspirational</a>, <a class="tag" href="/tag/life/page/1/">life</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a class="tag" href="/tag/sisters/page/1/">sisters</a>, <a href="/author/Bob-Marley">(about)</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/author/Elie-Wiesel">(about)</a>, <a class="tag" href="/tag/activism/page/1/">activism</a>, <a class="tag" href="/tag/apathy/page/1/">apathy</a>, <a class="tag" href="/tag/hate/page/1/">hate</a>, <a class="tag" href="/tag/indifference/page/1/">indifference</a>, <a class="tag" href="/tag/inspirational/page/1/">inspirational</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a class="tag" href="/tag/opposite/page/1/">opposite</a>, <a class="tag" href="/tag/philosophy/page/1/">philosophy</a>, <a href="/author/Friedrich-Nietzsche">(about)</a>, <a class="tag" href="/tag/friendship/page/1/">friendship</a>, <a class="tag" href="/tag/lack-of-friendship/page/1/">lack-of-friendship</a>, <a class="tag" href="/tag/lack-of-love/page/1/">lack-of-love</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a class="tag" href="/tag/marriage/page/1/">marriage</a>, <a class="tag" href="/tag/unhappy-marriage/page/1/">unhappy-marriage</a>, <a href="/author/Pablo-Neruda">(about)</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a class="tag" href="/tag/poetry/page/1/">poetry</a>, <a href="/author/Marilyn-Monroe">(about)</a>, <a class="tag" href="/tag/girls/page/1/">girls</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/author/Marilyn-Monroe">(about)</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/author/James-Baldwin">(about)</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/author/Jane-Austen">(about)</a>, <a class="tag" href="/tag/friendship/page/1/">friendship</a>, <a class="tag" href="/tag/love/page/1/">love</a>, <a href="/tag/love/page/2/">Next <span aria-hidden="true">→</span></a>, <a class="tag" href="/tag/love/" style="font-size: 28px">love</a>, <a class="tag" href="/tag/inspirational/" style="font-size: 26px">inspirational</a>, <a class="tag" href="/tag/life/" style="font-size: 26px">life</a>, <a class="tag" href="/tag/humor/" style="font-size: 24px">humor</a>, <a class="tag" href="/tag/books/" style="font-size: 22px">books</a>, <a class="tag" href="/tag/reading/" style="font-size: 14px">reading</a>, <a class="tag" href="/tag/friendship/" style="font-size: 10px">friendship</a>, <a class="tag" href="/tag/friends/" style="font-size: 8px">friends</a>, <a class="tag" href="/tag/truth/" style="font-size: 8px">truth</a>, <a class="tag" href="/tag/simile/" style="font-size: 6px">simile</a>, <a href="https://www.goodreads.com/quotes">GoodReads.com</a>, <a class="zyte" href="https://www.zyte.com">Zyte</a>]
첫 번째 글 = “It is better to be hated for what you are than to be loved for what you are not.”
글 : “It is better to be hated for what you are than to be loved for what you are not.”
글 : “This life is what you make it. No matter what, you're going to mess up sometimes, it's a universal truth. But the good part is you get to decide how you're going to mess it up. Girls will be your friends - they'll act like it anyway. But just remember, some come, some go. The ones that stay with you through everything - they're your true best friends. Don't let go of them. Also remember, sisters make the best friends in the world. As for lovers, well, they'll come and go too. And baby, I hate to say it, most of them - actually pretty much all of them are going to break your heart, but you can't give up because if you give up, you'll never find your soulmate. You'll never find that half who makes you whole and that goes for everything. Just because you fail once, doesn't mean you're gonna fail at everything. Keep trying, hold on, and always, always, always believe in yourself, because if you don't, then who will, sweetie? So keep your head high, keep your chin up, and most importantly, keep smiling, because life's a beautiful thing and there's so much to smile about.”
글 : “You may not be her first, her last, or her only. She loved before she may love again. But if she loves you now, what else matters? She's not perfect—you aren't either, and the two of you may never be perfect together but if she can make you laugh, cause you to think twice, and admit to being human and making mistakes, hold onto her and give her the most you can. She may not be thinking about you every second of the day, but she will give you a part of her that she knows you can break—her heart. So don't hurt her, don't change her, don't analyze and don't expect more than she can give. Smile when she makes you happy, let her know when she makes you mad, and miss her when she's not there.”
글 : “The opposite of love is not hate, it's indifference. The opposite of art is not ugliness, it's indifference. The opposite of faith is not heresy, it's indifference. And the opposite of life is not death, it's indifference.”
글 : “It is not a lack of love, but a lack of friendship that makes unhappy marriages.”
글 : “I love you without knowing how, or when, or from where. I love you simply, without problems or pride: I love you in this way because I do not know any other way of loving but this, in which there is no I or you, so intimate that your hand upon my chest is my hand, so intimate that when I fall asleep your eyes close.”
글 : “If you can make a woman laugh, you can make her do anything.”
글 : “The real lover is the man who can thrill you by kissing your forehead or smiling into your eyes or just staring into space.”
글 : “Love does not begin and end the way we seem to think it does. Love is a battle, love is a war; love is a growing up.”
글 : “There is nothing I would not do for those who are really my friends. I have no notion of loving people by halves, it is not my nature.”
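The link list in the output above includes a `Next` link pointing at `/tag/love/page/2/`. A rough sketch of turning such a relative link into the full URL of the next page follows. It works on an inline HTML snippet rather than a live download; the helper name `find_next_url` and the choice to match the link by its visible text are assumptions made for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'https://quotes.toscrape.com/tag/love/'

# Snippet modeled on two of the links printed in the output above.
SAMPLE_HTML = """
<a href="/author/Jane-Austen">(about)</a>
<a href="/tag/love/page/2/">Next <span aria-hidden="true">→</span></a>
"""


def find_next_url(html_text, base_url):
    """Return the absolute URL of the 'Next' link, or None if there is none."""
    soup = BeautifulSoup(html_text, 'html.parser')
    for a in soup.find_all('a', href=True):
        # get_text() also picks up the arrow inside the nested span,
        # so compare only the start of the visible text.
        if a.get_text(strip=True).startswith('Next'):
            # urljoin resolves the relative href against the page URL.
            return urljoin(base_url, a['href'])
    return None


next_url = find_next_url(SAMPLE_HTML, BASE_URL)
print(next_url)
```

A crawler could keep requesting the returned URL and stop once the helper returns `None`, which would signal that the last page has no `Next` link.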