EDA) Selenium Basics
EDA 2023. 3. 10. 22:58
04. Selenium_Basic_1
Selenium Basic
1. Using the selenium webdriver
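In [20]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 passes the driver path through a Service object;
# executable_path is deprecated there and was removed in later 4.x releases
driver = webdriver.Chrome(service=Service("../driver/chromedriver.exe"))
driver.get("https://pinkwink.kr")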
In [3]:
# Quit the browser entirely (closes every window and tab)
driver.quit()

In [2]:
# Maximize the window
driver.maximize_window()

In [3]:
# Minimize the window
driver.minimize_window()

In [7]:
# Set the window size (width, height in pixels)
driver.set_window_size(600, 600)

In [8]:
# Refresh the page
driver.refresh()

In [9]:
# Go back
driver.back()

In [10]:
# Go forward
driver.forward()
In [11]:
# Click an element
from selenium.webdriver.common.by import By

first_content = driver.find_element(
    By.CSS_SELECTOR,
    "#content > div.cover-masonry > div > ul > li:nth-child(1)",
)
first_content.click()
In [12]:
# Open a new tab
# execute_script runs JavaScript (not Java) inside the current page
driver.execute_script('window.open("https://www.naver.com")')

In [13]:
# Switch tabs (index 0 is the first tab)
driver.switch_to.window(driver.window_handles[0])

In [14]:
len(driver.window_handles)

Out[14]:
2

In [15]:
# Close a tab (closes only the current tab, one at a time)
driver.close()
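The tab cells above open a tab with `window.open`, pick a tab by its index in `driver.window_handles`, and close the current tab. Selenium does not move focus to a newly opened tab by itself, and after `driver.close()` the driver has to be switched to a remaining handle before further commands. A minimal sketch of both steps follows; `open_in_new_tab` and `close_current_tab` are hypothetical helper names, not part of Selenium, and they assume `driver` is an already running WebDriver.

```python
def open_in_new_tab(driver, url):
    """Open url in a new tab and move the driver's focus to it (hypothetical helper)."""
    before = set(driver.window_handles)
    # window.open runs as JavaScript in the current page; url arrives as arguments[0]
    driver.execute_script("window.open(arguments[0])", url)
    # The new tab is whichever handle did not exist before the call
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("no new tab appeared (the pop-up may have been blocked)")
    driver.switch_to.window(new_handles[-1])
    return new_handles[-1]


def close_current_tab(driver, fallback_index=0):
    """Close the focused tab, then switch to one that is still open (hypothetical helper)."""
    driver.close()
    remaining = driver.window_handles
    driver.switch_to.window(remaining[fallback_index])
    return remaining[fallback_index]
```

With the driver from the cells above, `open_in_new_tab(driver, "https://www.naver.com")` followed by `close_current_tab(driver)` should leave the focus back on the first tab.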
2. Scrolling the page

In [21]:
# Scrollable height (length) of the page, in pixels
driver.execute_script("return document.body.scrollHeight")

Out[21]:
5857

In [22]:
# Scroll to the bottom of the page
driver.execute_script('window.scrollTo(0,document.body.scrollHeight)')

In [23]:
# Save a screenshot of the currently visible area
driver.save_screenshot('./last_height.png')

Out[23]:
True

In [24]:
# Scroll back to the top of the page
driver.execute_script('window.scrollTo(0,0)')

In [25]:
# Scroll to a specific tag
from selenium.webdriver import ActionChains

some_tag = driver.find_element(
    By.CSS_SELECTOR,
    '#content > div:nth-child(5) > div > ul > li:nth-child(1)',
)
action = ActionChains(driver)  # bind the action chain to the driver currently in use
action.move_to_element(some_tag).perform()

In [26]:
driver.quit()
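The `scrollHeight` query and the `window.scrollTo` call from this section can be combined for pages that keep loading content as you scroll. The sketch below is one possible way to do that, not something the original notebook contains: `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical names, and it assumes the page grows `document.body.scrollHeight` when new content arrives.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until document.body.scrollHeight stops growing (hypothetical helper)."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is the real bottom
        last_height = new_height
    return last_height
```

The `max_rounds` cap is there so that a page that never stops growing cannot keep the loop running forever.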
3. Entering a search term
- CSS_SELECTOR
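In [27]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

In [28]:
# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service("../driver/chromedriver.exe"))
driver.get('https://www.naver.com')

In [29]:
# Find the search box, clear it, and type the keyword
keyword = driver.find_element(By.CSS_SELECTOR, '#query')
keyword.clear()
keyword.send_keys('파이썬')  # "Python" in Korean

In [30]:
# Click the search button
search_btn = driver.find_element(By.CSS_SELECTOR, '#search_btn')
search_btn.click()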
- XPATH
'//': search from the top of the document through descendants at any depth
'*': any tag name
'/': a direct child tag
'div[1]': the first div among its sibling divs
//*[@id="main_pack"]/section[2]/div/div[2]/panel-list/div/ul/li[1]/div/div/a

In [32]:
driver.find_element(By.XPATH, '//*[@id="query"]').send_keys('xpath')
driver.find_element(By.XPATH, '//*[@id="search_btn"]').click()

In [33]:
driver.quit()
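The XPath pieces listed above (`//`, `*`, `/`, `div[1]`) can be tried without a browser. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and needs a leading `.` where the browser form starts with `//`. The markup in `DOC` is invented for illustration and is not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a page with a search box
DOC = """
<html>
  <body>
    <div id="header">
      <div class="logo"/>
      <div class="search">
        <input id="query" type="text"/>
        <button id="search_btn">Search</button>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# .//*[@id='query'] : any tag (*), at any depth (//), whose id attribute is "query"
query = root.find(".//*[@id='query']")

# .//*[@id='header']/div[2]/button : from the element with id "header",
# step to its 2nd div child (div[2]), then to that div's button child
button = root.find(".//*[@id='header']/div[2]/button")

print(query.tag, button.text)  # prints: input Search
```

Positions in XPath count from 1, so `div[2]` is the second `div` child (the one with class `search` here), not the third.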
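In [34]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

In [35]:
# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service('../driver/chromedriver.exe'))
driver.get('http://pinkwink.kr')

In [36]:
# 1. Select the magnifier (search) button
from selenium.webdriver import ActionChains

search_tag = driver.find_element(By.CSS_SELECTOR, '.search')
action = ActionChains(driver)
action.click(search_tag)
action.perform()

In [37]:
# 2. Type the keyword into the search box that opened, then click the search button
driver.find_element(By.CSS_SELECTOR, '#header > div.search.on > input[type=text]').send_keys('딥러닝')  # "deep learning" in Korean
driver.find_element(By.XPATH, '//*[@id="header"]/div[2]/button').click()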
4. selenium + beautifulsoup

In [38]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

In [39]:
# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service("../driver/chromedriver.exe"))
driver.get("http://pinkwink.kr")

In [ ]:
# Get the HTML source of the page currently shown
driver.page_source

In [41]:
from bs4 import BeautifulSoup

# Hand the HTML rendered by the browser to BeautifulSoup for parsing
req = driver.page_source
soup = BeautifulSoup(req, "html.parser")

In [ ]:
soup.select("div.inner > ul")

In [43]:
contents = soup.select("div.inner > ul")
len(contents)

Out[43]:
8

In [44]:
contents[7]
Out[44]:
<ul>
  <li>
    <a href="/1403">
      <figure>
        <img alt="" src="//i1.daumcdn.net/thumb/C600x600/?fname=https://blog.kakaocdn.net/dna/cfy4Yc/btrU01Kt0ae/AAAAAAAAAAAAAAAAAAAAAHsJNAIdQtH1rMvlI6o1fU3V3jdKnzXFKUHDuVSZj-Cb/?credential=yqXZFxpELC7KVnFOS48ylbz2pIh7yKj8&expires=1780239599&allow_ip=&allow_referer=&signature=15u2%2BCFkMA5FsKbDGQfllweJpuE%3D"/>
      </figure>
      <span class="title">2022년 PinkWink 결산</span>
    </a>
  </li>
  <li>
    <a href="/1402">
      <figure>
        <img alt="" src="//i1.daumcdn.net/thumb/C600x600/?fname=https://blog.kakaocdn.net/dna/eyxII3/btrSIsqsgMN/AAAAAAAAAAAAAAAAAAAAANWEUB5zWiJb1SA1QpgybZthMskki6htFPz4KLEr3qSt/?credential=yqXZFxpELC7KVnFOS48ylbz2pIh7yKj8&expires=1780239599&allow_ip=&allow_referer=&signature=9V3enRZ6LxnUYpGEM927ttaTxLs%3D"/>
      </figure>
      <span class="title">2022년 한국로봇학회로 부터 감사패를 받았습니다.</span>
    </a>
  </li>
  <li>
    <a href="/1387">
      <figure>
        <img alt="" src="//i1.daumcdn.net/thumb/C600x600/?fname=https://blog.kakaocdn.net/dna/cOUfdg/btrG3rvJTRK/AAAAAAAAAAAAAAAAAAAAANH3-FoDIh5lqhkEWtbdy-Lje5zG4VkuVh7lP6Ez_QP8/?credential=yqXZFxpELC7KVnFOS48ylbz2pIh7yKj8&expires=1780239599&allow_ip=&allow_referer=&signature=zoDYdRIdvZuKW2DDj4mnnJAVXpg%3D"/>
      </figure>
      <span class="title">핑크랩 옆에는 못골 한옥 어린이 도서관이 있어요</span>
    </a>
  </li>
  <li>
    <a href="/1386">
      <figure>
        <img alt="" src="//i1.daumcdn.net/thumb/C600x600/?fname=https://blog.kakaocdn.net/dna/QJf5U/btrDWmdKo52/AAAAAAAAAAAAAAAAAAAAAMzn5Tst9qvBNiHPvLcjOfjJa3tLUKI5i8wA8HMulB7x/?credential=yqXZFxpELC7KVnFOS48ylbz2pIh7yKj8&expires=1780239599&allow_ip=&allow_referer=&signature=ndeTFqcqh8c3nWZbXqYo6XHkoV4%3D"/>
      </figure>
      <span class="title">핑크윙크 PinkWink가 핑크랩 PinkLAB을 만들었습니다.</span>
    </a>
  </li>
</ul>
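Each `<li>` in the output above holds one post: a relative link in `href` and the post title in `span.title`. A minimal sketch of turning one such `<ul>` into (title, absolute URL) pairs follows. `extract_posts` and `base_url` are hypothetical names that are not in the original notebook; the function assumes it receives a BeautifulSoup tag such as `contents[7]` and relies only on the tag's `select`, `select_one`, `get`, and `get_text` methods.

```python
from urllib.parse import urljoin


def extract_posts(ul_tag, base_url):
    """Return (title, absolute_url) pairs from a <ul> of post links (hypothetical helper)."""
    posts = []
    for a in ul_tag.select("li > a"):
        href = a.get("href")
        if not href:
            continue  # skip anchors that carry no link
        title_tag = a.select_one("span.title")
        title = title_tag.get_text(strip=True) if title_tag is not None else ""
        # hrefs such as "/1403" are relative, so resolve them against the site root
        posts.append((title, urljoin(base_url, href)))
    return posts
```

With the notebook's variables, `extract_posts(contents[7], "http://pinkwink.kr")` should yield four pairs, the first linking to `http://pinkwink.kr/1403`.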