# How to fix Selenium not working properly in Java
What to do when org.openqa.selenium.NoSuchSessionException keeps being thrown.
## PH
  • 2024-09-05 : First posting.
## TOC

## Don't set the WebDriver path in Java; add the WebDriver directory to the Windows PATH environment variable instead

```
System.setProperty("webdriver.chrome.driver", FileMap.preFilePath + "/Recoeve/webdriver/chromedriver.exe");
curChromeOptions.setBinary(FileMap.preFilePath + "/Recoeve/webdriver/chromedriver.exe");
```/

I tried running with the settings above, but it kept failing with Session=null. Once I added the WebDriver directory to the Windows PATH environment variable and set the ChromeOptions below, it finally worked. (Maybe the system property above should only go as far as the directory? =ㅂ=;; No idea.) Side note: `ChromeOptions.setBinary` is for the Chrome browser executable, not chromedriver, so the second line above may well have been part of the problem.

```
curChromeOptions.addArguments("--disable-notifications", "--headless=new", "--remote-debugging-pipe", "--remote-allow-origins=*", "--no-sandbox", "--disable-dev-shm-usage", "--port=" + curPort);
```/

I had tried just about everything without success, until I happened to follow a link in the error log. It said "Use the PATH environment variable", so I tried that and it worked. ChatGPT, Claude, and Gemini were all useless ㅡ,.ㅡ;;;; They should have just given me the answer, but they kept going off on tangents and I wasted a huge amount of time.

## Open Source: sharing Vert.x Java code that asynchronously extracts the heads (title, h1, h2, and so on) of a given web page

```[.scrollable,lang-java]
package recoeve.http;

import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BiConsumer;
import java.util.function.Function;

import org.openqa.selenium.By;
import org.openqa.selenium.InvalidElementStateException;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.NoSuchSessionException;
import org.openqa.selenium.StaleElementReferenceException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Context;
import io.vertx.core.Promise;
import io.vertx.core.Vertx;
import io.vertx.core.VertxException;
import io.vertx.core.http.HttpServerResponse;
import io.vertx.ext.web.client.WebClient;
import io.vertx.ext.web.client.WebClientOptions;

import recoeve.db.RecoeveDB;
import recoeve.db.StrArray;

public class RecoeveWebClient extends AbstractVerticle {
  public static final WebClientOptions options = new WebClientOptions()
      .setMaxHeaderSize(20000)
      .setFollowRedirects(true);
  public static final int MIN_PORT = 50000;
  public static final int MAX_PORT = 60000;
  public static final int DEFAULT_MAX_DRIVERS = 5;
  public static final int UNTIL_TOP = 20; // Read at most this many matched elements.
  public static final Map<String, String> hostCSSMap; // Host-specific CSS selectors.

  static {
    hostCSSMap = new HashMap<>(10);
    hostCSSMap.put("blog.naver.com", ".se-fs-, .se-ff-");
    hostCSSMap.put("m.blog.naver.com", ".se-fs-, .se-ff-");
    hostCSSMap.put("apod.nasa.gov", "center>b:first-child");
  }

  public RecoeveDB db;
  public WebClient webClient;
  public long timeoutMilliSecs = 7000L;
  public long findPerMilliSecs = 500L;
  public ChromeOptions curChromeOptions;
  public int maxDrivers;
  private final ConcurrentLinkedQueue<TimestampedDriver> driverPool;
  public final long driverTimeout = 300000; // 5 minutes in milliseconds
  public final int RECURSE_MAX = 100;
  public int recurseCount;
  public int curPort;

  public RecoeveWebClient(Vertx vertx, Context context, RecoeveDB db) {
    this.vertx = vertx;
    this.context = context;
    this.db = db;
    curPort = MIN_PORT;
    webClient = WebClient.create(vertx, options);
    maxDrivers = context.config().getInteger("maxDrivers", DEFAULT_MAX_DRIVERS);
    driverPool = new ConcurrentLinkedQueue<>();
    recurseCount = 0;
  }

  @Override
  public void start(Promise<Void> startPromise) {
    startPromise.complete(); // Nothing to initialize; report success so deployment can finish.
  }

  /** A pooled driver together with the time it was put into the pool. */
  private static class TimestampedDriver {
    public final WebDriver driver;
    public final long timestamp;

    public TimestampedDriver(WebDriver driver, long timestamp) {
      this.driver = driver;
      this.timestamp = timestamp;
    }
  }

  /** Builds fresh ChromeOptions, advances the port counter, and starts a new ChromeDriver. */
  private WebDriver createDriver() {
    curChromeOptions = new ChromeOptions();
    curChromeOptions.addArguments("--disable-notifications", "--headless=new", "--remote-debugging-pipe",
        "--remote-allow-origins=*", "--no-sandbox", "--disable-dev-shm-usage", "--port=" + curPort);
    curChromeOptions.setAcceptInsecureCerts(true);
    curChromeOptions.setBrowserVersion("128.0.6613.114");
    curChromeOptions.setExperimentalOption("detach", true);
    curPort++;
    if (curPort > MAX_PORT) {
      curPort = MIN_PORT;
    }
    return new ChromeDriver(curChromeOptions);
  }

  /** Returns the first pooled driver that has not timed out (closing stale ones), or null. */
  private WebDriver pollFreshDriver() {
    TimestampedDriver timestampedDriver;
    while ((timestampedDriver = driverPool.poll()) != null) {
      if (System.currentTimeMillis() - timestampedDriver.timestamp > driverTimeout) {
        closeDriver(timestampedDriver.driver);
      } else {
        recurseCount = 0;
        return timestampedDriver.driver;
      }
    }
    return null;
  }

  private WebDriver getDriver() throws RuntimeException {
    recurseCount++;
    if (recurseCount >= RECURSE_MAX) {
      recurseCount = 0;
      throw new RuntimeException("Error: Too many recursive calls!");
    }
    WebDriver pooled = pollFreshDriver();
    if (pooled != null) {
      return pooled;
    }
    if (driverPool.size() < maxDrivers) {
      try {
        driverPool.add(new TimestampedDriver(createDriver(), System.currentTimeMillis()));
      } catch (RuntimeException err) {
        System.out.println("Failed to create new WebDriver: " + err);
      }
    } else {
      cleanupDrivers();
      driverPool.add(new TimestampedDriver(createDriver(), System.currentTimeMillis()));
    }
    pooled = pollFreshDriver();
    if (pooled != null) {
      return pooled;
    }
    return createDriver();
  }

  private void closeDriver(WebDriver driver) {
    try {
      driver.quit(); // quit() ends the session and the chromedriver process; close() only closes the window.
    } catch (Exception e) {
      System.out.println("Error closing WebDriver: " + e);
    }
  }

  private synchronized void releaseDriver(WebDriver driver) {
    if (driver != null) {
      if (driverPool.size() < maxDrivers) {
        driverPool.offer(new TimestampedDriver(driver, System.currentTimeMillis()));
      } else {
        closeDriver(driver);
      }
    }
  }

  public void cleanupDrivers() {
    TimestampedDriver timestampedDriver;
    while ((timestampedDriver = driverPool.poll()) != null) {
      closeDriver(timestampedDriver.driver);
    }
  }

  /** Resolves to the last redirected URL of originalURI, or to originalURI itself if there is none. */
  public CompletableFuture<String> redirected(String originalURI) {
    CompletableFuture<String> completableFuture = new CompletableFuture<>();
    try {
      webClient.headAbs(originalURI)
          .send()
          .onSuccess(response -> {
            List<String> followedURIs = response.followedRedirects();
            if (response.statusCode() >= 200 && response.statusCode() < 300 && !followedURIs.isEmpty()) {
              String fullURI = followedURIs.get(followedURIs.size() - 1);
              System.out.println("The last redirected URL: " + fullURI);
              completableFuture.complete(fullURI);
            } else {
              // No redirect or a non-2xx status: fall back so the future always completes.
              completableFuture.complete(originalURI);
            }
          })
          .onFailure(throwable -> {
            System.out.println("Returning originalURI: " + throwable.getMessage());
            completableFuture.complete(originalURI);
          });
    } catch (VertxException err) {
      RecoeveDB.err(err);
      completableFuture.completeExceptionally(err);
    }
    return completableFuture;
  }

  /** Polls the page until at least one matched element has non-empty text, or the timeout fires. */
  public CompletableFuture<String> asyncFindTitle(WebDriver chromeDriver, String cssSelector) throws Exception {
    CompletableFuture<String> cfElements = new CompletableFuture<>();
    if (cssSelector == null) {
      cfElements.completeExceptionally(new Exception("\nError: cssSelector is null."));
      return cfElements;
    }
    // Keep the timer id local: concurrent calls must not overwrite each other's id.
    long periodicId = vertx.setPeriodic(findPerMilliSecs, id -> {
      try {
        List<WebElement> elements = chromeDriver.findElements(By.cssSelector(cssSelector));
        if (elements != null && !elements.isEmpty()) {
          StringBuilder sb = new StringBuilder();
          boolean someIsNotEmpty = false;
          for (int i = 0; i < Math.min(UNTIL_TOP, elements.size()); i++) {
            String text = elements.get(i).getText().replaceAll("\\s", " ").trim();
            if (!text.isEmpty()) {
              someIsNotEmpty = true;
              sb.append("\n").append(cssSelector).append("-").append(i).append("\t").append(StrArray.enclose(text));
            }
          }
          if (someIsNotEmpty) {
            vertx.cancelTimer(id);
            cfElements.complete(sb.toString());
          }
        }
      } catch (NoSuchElementException | StaleElementReferenceException | InvalidElementStateException
          | VertxException err) {
        System.out.println(err);
      }
    });
    vertx.setTimer(timeoutMilliSecs, id -> {
      vertx.cancelTimer(periodicId);
      cfElements.complete("\nError: timeout " + timeoutMilliSecs + "ms.");
    });
    return cfElements;
  }

  /** Polls until every matched element (up to UNTIL_TOP) has text; on timeout, returns whatever is there. */
  public CompletableFuture<String> asyncFindTitleUntilEveryIsFound(WebDriver chromeDriver, String cssSelector)
      throws Exception {
    CompletableFuture<String> cfElements = new CompletableFuture<>();
    if (cssSelector == null) {
      cfElements.completeExceptionally(new Exception("\nError: cssSelector is null."));
      return cfElements;
    }
    long periodicId = vertx.setPeriodic(findPerMilliSecs, id -> {
      try {
        List<WebElement> elements = chromeDriver.findElements(By.cssSelector(cssSelector));
        if (elements != null && !elements.isEmpty()) {
          StringBuilder sb = new StringBuilder();
          for (int i = 0; i < Math.min(UNTIL_TOP, elements.size()); i++) {
            String text = elements.get(i).getText();
            if (text.isEmpty()) {
              return; // Not everything is rendered yet; try again on the next tick.
            }
            sb.append("\n").append(cssSelector).append("-").append(i).append("\t").append(StrArray.enclose(text));
          }
          vertx.cancelTimer(id);
          cfElements.complete(sb.toString());
        }
      } catch (NoSuchElementException | StaleElementReferenceException | InvalidElementStateException
          | VertxException err) {
        System.out.println(err);
      }
    });
    vertx.setTimer(timeoutMilliSecs, id -> {
      vertx.cancelTimer(periodicId);
      if (cfElements.isDone()) {
        return; // Already answered by the periodic check.
      }
      try {
        List<WebElement> elements = chromeDriver.findElements(By.cssSelector(cssSelector));
        if (elements != null && !elements.isEmpty()) {
          StringBuilder sb = new StringBuilder();
          for (int i = 0; i < Math.min(UNTIL_TOP, elements.size()); i++) {
            String text = elements.get(i).getText().replaceAll("\\s", " ").trim();
            if (!text.isEmpty()) {
              sb.append("\n").append(cssSelector).append("-").append(i).append("\t").append(StrArray.enclose(text));
            }
          }
          cfElements.complete(sb.toString());
        } else {
          cfElements.complete("\nError: timeout " + timeoutMilliSecs + "ms.");
        }
      } catch (RuntimeException err) {
        // Complete anyway so the HTTP response is not left hanging.
        cfElements.complete("\nError: " + err.getMessage());
      }
    });
    return cfElements;
  }

  /** Streams the heads found on uri to resp as chunked plain text. */
  public void findTitles(String uri, String uriHost, HttpServerResponse resp) {
    resp.putHeader("Content-Type", "text/plain; charset=utf-8")
        .setChunked(true);
    resp.write("\nuri\t" + StrArray.enclose(uri));
    String conciseURI = null;
    if (RecoeveDB.getutf8mb4Length(uri) > 255) {
      conciseURI = db.getConciseURI(uri);
    }
    if (conciseURI != null) {
      resp.write("\nconciseURI\t" + StrArray.enclose(conciseURI));
    }

    Function<String, String> applyFn = (result) -> {
      return result;
    };

    try {
      WebDriver chromeDriver = getDriver();
      vertx.setTimer(200, id -> {
        try {
          chromeDriver.get((new URI(uri.trim())).toString());
          CompletableFuture<String> findTitle = asyncFindTitle(chromeDriver, "title, h1, h2")
              .thenApply(applyFn);
          CompletableFuture<String> findHostSpecific = asyncFindTitle(chromeDriver, hostCSSMap.get(uriHost))
              .thenApply(applyFn);
          CompletableFuture<String> findTitleUntil = asyncFindTitleUntilEveryIsFound(chromeDriver, "title, h1, h2")
              .thenApply(applyFn);
          CompletableFuture<String> findHostSpecificUntil = asyncFindTitleUntilEveryIsFound(chromeDriver,
              hostCSSMap.get(uriHost))
              .thenApply(applyFn);
          CompletableFuture<Void> allOf = CompletableFuture.allOf(findTitle, findHostSpecific, findTitleUntil,
              findHostSpecificUntil);

          // Write each result as its own chunk as soon as it is ready.
          BiConsumer<String, Throwable> writeChunk = (result, error) -> {
            if (error == null) {
              try {
                result = result.trim();
                if (result.isEmpty()) {
                  result = "\nError: Empty result.";
                  System.out.println(result);
                }
                resp.write(result, Recoeve.ENCODING);
              } catch (Exception e) {
                result = "\nError: writing chunk: " + e.getMessage();
                System.err.println(result);
                resp.write(result, Recoeve.ENCODING);
              }
            } else {
              result = "\nError: in future: " + error.getMessage();
              System.err.println(result);
              resp.write(result, Recoeve.ENCODING);
            }
          };
          findTitle.whenComplete(writeChunk);
          findHostSpecific.whenComplete(writeChunk);
          findTitleUntil.whenComplete(writeChunk);
          findHostSpecificUntil.whenComplete(writeChunk);

          // End the response and return the driver to the pool once all four lookups are done.
          allOf.whenComplete((v, error) -> {
            String errorMsg = "";
            if (error != null) {
              errorMsg = "\nError in futures: " + error.getMessage();
              System.err.println(errorMsg);
            }
            if (!resp.ended()) {
              resp.end(errorMsg);
            }
            releaseDriver(chromeDriver);
          });
        } catch (NoSuchSessionException e) {
          resp.end("\nError: No valid session. Please try again.: " + e.getMessage());
        } catch (Exception e) {
          resp.end("\nError: " + e.getMessage());
        }
      });
    } catch (NoSuchSessionException e) {
      resp.end("\nError: No valid session. Please try again.");
    } catch (RuntimeException e) {
      resp.end("\n" + e.getMessage());
    } catch (Exception e) {
      resp.end("\nError: " + e.getMessage());
    }
  }

  @Override
  public void stop(Promise<Void> stopPromise) {
    cleanupDrivers();
    stopPromise.complete();
  }

  public static void main(String... args) {
    MainVerticle verticle = new MainVerticle();
    try {
      verticle.start();
      verticle.getVertx().setTimer(10, id -> {
        WebDriver chromeDriver = verticle.recoeveWebClient.getDriver();
        chromeDriver.get("https://www.youtube.com/watch?v=1MhugHxbhGE");
        try {
          verticle.recoeveWebClient.asyncFindTitle(chromeDriver, "h1")
              .thenAccept(result -> {
                System.out.println(result);
              });
        } catch (Exception err0) {
          System.out.println("Error: " + err0.getMessage());
        }
      });
    } catch (Exception err) {
      System.out.println("Error: " + err.getMessage());
    }
  }
}
```/

## RRA
  1. Unable to Locate Driver Error
    Troubleshooting missing path to driver executable.
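
Following the "Use the PATH environment variable" advice from the reference above, it can help to confirm that the driver is actually visible on PATH before starting Selenium. Below is a minimal sketch using only the Java standard library. The class name `DriverPathCheck` and the method `findOnPath` are hypothetical names made up for this illustration; they are not part of Selenium or of the code shared in this post.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not a Selenium API): looks for an executable in the directories of a PATH-style string.
class DriverPathCheck {
  static Optional<Path> findOnPath(String pathValue, String executableName) {
    if (pathValue == null || pathValue.isEmpty()) {
      return Optional.empty();
    }
    // PATH entries are separated by ';' on Windows and ':' elsewhere.
    for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
      if (dir.isEmpty()) {
        continue;
      }
      try {
        Path candidate = Paths.get(dir, executableName);
        if (Files.isRegularFile(candidate)) {
          return Optional.of(candidate);
        }
      } catch (InvalidPathException e) {
        // Skip malformed PATH entries instead of failing the whole lookup.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    boolean isWindows = System.getProperty("os.name").toLowerCase().contains("win");
    String name = isWindows ? "chromedriver.exe" : "chromedriver";
    Optional<Path> found = findOnPath(System.getenv("PATH"), name);
    System.out.println(found.map(p -> "Found: " + p).orElse(name + " is not on PATH"));
  }
}
```

If this reports that the driver is not on PATH, the directory containing chromedriver (the directory, not the .exe file itself) is probably what still needs to be added to the environment variable. Note that a running JVM or IDE usually has to be restarted before it sees a changed PATH.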