{"id":2364,"date":"2018-06-06T11:01:34","date_gmt":"2018-06-06T02:01:34","guid":{"rendered":"https:\/\/pasero.net\/~mako\/blog\/?p=2364"},"modified":"2018-12-09T16:36:20","modified_gmt":"2018-12-09T07:36:20","slug":"%e3%83%8b%e3%83%a5%e3%83%bc%e3%82%b9%e3%81%ae%e3%82%b9%e3%82%af%e3%83%ac%e3%82%a4%e3%83%94%e3%83%b3%e3%82%b0%e3%81%a7%e3%82%bf%e3%82%a4%e3%83%94%e3%83%b3%e3%82%b0%e7%b7%b4%e7%bf%92","status":"publish","type":"post","link":"https:\/\/pasero.net\/~mako\/blog\/s\/2364","title":{"rendered":"Typing Practice with Scraped News"},"content":{"rendered":"<p>As expected, <a href=\"https:\/\/pasero.net\/~mako\/blog\/s\/2360\">typing practice that just echoes back a few hiragana characters<\/a> got boring quickly, so I had to come up with different material. To keep it from getting boring, the source data needs to be either huge or frequently updated. Aozora Bunko, maybe? But I wonder how much of it suits an elementary school student. Frequently updated means news, but that does not seem suited to elementary school students either... or so I thought, until I found something that fits perfectly: <a href=\"https:\/\/www3.nhk.or.jp\/news\/easy\/\" >NHK NEWS WEB EASY<\/a>. 
Each article has about ten sentences of roughly 50 characters each. The meaning is easy to follow and the amount is just right. It appears that <a href=\"https:\/\/www.nhk.or.jp\/school-blog\/1000\/188994.html\" >a lot of effort goes into producing it<\/a>.<\/p>\r\n<p>So I set out to fetch these articles and use them as typing material, but since this is far from my day job, I had to start by gathering information. What I ended up doing hardly deserves to be called \"scraping\", but as a first step in that direction I am recording it here.<\/p>\r\n<h3>Preparing the environment<\/h3>\r\n<p>The essential parts of the NHK NEWS WEB EASY pages appear to be generated by JavaScript, so a plain <code>requests.get(url)<\/code> in Python returns something different from the HTML 
source you see in the browser. The first step is therefore to be able to fetch the page as the browser actually renders it.<\/p>\r\n<p>Install the Debian package chromium-driver.<\/p>\r\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\nsudo apt-get install chromium-driver\r\n<\/pre>\r\n<p>To drive it from Python, install the <a href=\"http:\/\/selenium-python.readthedocs.io\/index.html\" >Selenium<\/a> library.<\/p>\r\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\npip install selenium\r\n<\/pre>\r\n<p>To extract the needed parts from the fetched HTML, I use <a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/\" >BeautifulSoup4<\/a><sup>[<a href=\"#footnote_1_2364\" id=\"identifier_1_2364\" class=\"footnote-link footnote-identifier-link\" title=\"Python's built-in html.parser can do this to some extent as well. Selenium also appears to have similar functionality.\">1<\/a>]<\/sup>.<\/p>\r\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\npip install beautifulsoup4\r\n<\/pre>\r\n<p style=\"text-indent: 
0em\">That command installs it.<\/p>\r\n<h3>Structure of the news site<\/h3>\r\n<p>As is common on news sites, each article's URL is named with what looks like a string of digits, and the front page is updated with new ones daily.<\/p>\r\n<p>Using the browser's developer tools, let's look at the structure of the <a href=\"https:\/\/www3.nhk.or.jp\/news\/easy\/\" >NHK NEWS WEB EASY<\/a> front page, <code>https:\/\/www3.nhk.or.jp\/news\/easy\/<\/code>.<\/p>\r\n<pre class=\"brush: xml; title: ; notranslate\" title=\"\">\r\n&lt;div class=&quot;top-news-list__pickup news-list-item&quot; id=&quot;js-news-pickup&quot;&gt;\r\n  ...\r\n  &lt;h1 class=&quot;news-list-item__title is-pickup&quot;&gt;\r\n  &lt;a href=&quot;.\/k10011463631000\/k10011463631000.html&quot;&gt;&lt;em class=&quot;title&quot;&gt;&lt;ruby&gt;\u65e5\u672c&lt;rt&gt;\u306b\u3063\u307d\u3093&lt;\/rt&gt;&lt;\/ruby&gt;\u306e&lt;ruby&gt;\u4e8c\u9178\u5316\u70ad\u7d20&lt;rt&gt;\u306b\u3055\u3093\u304b\u305f\u3093\u305d&lt;\/rt&gt;&lt;\/ruby&gt;\u306e&lt;ruby&gt;\u6fc3\u5ea6&lt;rt&gt;\u306e\u3046\u3069&lt;\/rt&gt;&lt;\/ruby&gt;\u304c&lt;ruby&gt;\u4eca&lt;rt&gt;\u3044\u307e&lt;\/rt&gt;&lt;\/ruby&gt;\u307e\u3067\u3067\u3044\u3061\u3070\u3093&lt;ruby&gt;\u9ad8&lt;rt&gt;\u305f\u304b&lt;\/rt&gt;&lt;\/ruby&gt;\u304f\u306a\u308b&lt;\/em&gt;&lt;time class=&quot;time&quot;&gt;6\u67085\u65e5 11\u664230\u5206&lt;\/time&gt;&lt;\/a&gt;\r\n  &lt;\/h1&gt;\r\n&lt;\/div&gt;<\/pre>\r\n<p>The article featured prominently at the top is in <code>&lt;div id=\"js-news-pickup\"&gt;<\/code>, 
and the URL of the individual article page comes from the <code>&lt;a&gt;<\/code> inside the <code>&lt;h1&gt;<\/code> inside it.<\/p>\r\n<p>Looking at that individual article page in the developer tools the same way shows that the article body is in <code>&lt;div id=\"js-article-body\"&gt;<\/code>. That is the part to extract.<\/p>\r\n<h3>Scraping<\/h3>\r\n<p>The code I used as a reference (or rather, copied almost verbatim) is the section \"<a href=\"https:\/\/qiita.com\/Azunyan1111\/items\/b161b998790b1db2ff7a#%E5%8B%95%E4%BD%9C%E3%81%99%E3%82%8B%E3%82%B3%E3%83%BC%E3%83%89-2\" >JavaScript\u306b\u3088\u308b\u63cf\u753b\u306b\u5bfe\u5fdc\u3059\u308b<\/a>\" (handling JavaScript rendering) of \"Python Web\u30b9\u30af\u30ec\u30a4\u30d4\u30f3\u30b0 \u30c6\u30af\u30cb\u30c3\u30af\u96c6\" (a collection of Python web scraping techniques).<\/p>\r\n<p>At the spot marked by the comment \"\u30d6\u30e9\u30a6\u30b6\u3092\u8d77\u52d5\u3059\u308b\" (launch the browser) in that source, I <a href=\"https:\/\/catcherweb.com\/selenium-chromedriver\/\" >had to specify the path to the driver executable<\/a>. For the Debian package:<\/p>\r\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndriver = webdriver.Chrome(executable_path='\/usr\/bin\/chromedriver', chrome_options=options)\r\n<\/pre>\r\n<p style=\"text-indent: 
0em\">That is the line I used.<\/p>\r\n<h4>Extraction<\/h4>\r\n<p>The first request:<\/p>\r\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n        # Access the page with the browser\r\n        siteurl = &quot;https:\/\/www3.nhk.or.jp\/news\/easy\/&quot;\r\n        driver.get(siteurl)\r\n\r\n        ...\r\n\r\n        # Parse the HTML so BeautifulSoup can work with it\r\n        soup = BeautifulSoup(html, &quot;html.parser&quot;)\r\n\r\n        # Extract a specific element by id\r\n        href = soup.select_one(&quot;#js-news-pickup h1 a&quot;).get('href')\r\n<\/pre>\r\n<p style=\"text-indent: 0em\">This yields the URL of the individual article. The second request:<\/p>\r\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n        driver.get(newsurl)\r\n        html = driver.page_source.encode('utf-8')\r\n        soup = BeautifulSoup(html, &quot;html.parser&quot;)\r\n        # Remove the ruby readings (rt elements)\r\n        for s in soup(&#x5B;'rt']):\r\n            s.decompose()\r\n\r\n        ...\r\n\r\n        # Body of the news article\r\n        text = soup.select_one(&quot;#js-article-body&quot;).text\r\n<\/pre>\r\n<p style=\"text-indent: 0em\">This extracts the article body.<\/p>\r\n<h4>Removing ruby<\/h4>\r\n<p>The article body is marked up like this:<\/p>\r\n<pre class=\"brush: xml; title: ; notranslate\" title=\"\">\r\n...&lt;p&gt;&lt;span class=&quot;colorC&quot;&gt;&lt;ruby&gt;\u6c17\u8c61\u5e81&lt;rt&gt;\u304d\u3057\u3087\u3046\u3061\u3087\u3046&lt;\/rt&gt;&lt;\/ruby&gt;&lt;\/span&gt;\u306f\u300c\u3082\u3063\u3068&lt;a href=&quot;javascript:void(0)&quot; class=&quot;dicWin&quot; id=&quot;id-0000&quot;&gt;&lt;ruby&gt;&lt;span 
class=&quot;under&quot;&gt;\u4e8c\u9178\u5316\u70ad\u7d20&lt;\/span&gt;&lt;rt&gt;\u306b\u3055\u3093\u304b\u305f\u3093\u305d&lt;\/rt&gt;&lt;\/ruby&gt;&lt;\/a&gt;\u3092&lt;ruby&gt;\u51fa&lt;rt&gt;\u3060&lt;\/rt&gt;&lt;\/ruby&gt;\u3055\u306a\u3044\u3088\u3046\u306b\u3057\u306a\u3051\u308c\u3070\u306a\u308a\u307e\u305b\u3093\u300d\u3068&lt;ruby&gt;\u8a71&lt;rt&gt;\u306f\u306a&lt;\/rt&gt;&lt;\/ruby&gt;\u3057\u3066\u3044\u307e\u3059\u3002&lt;\/p&gt;...\r\n<\/pre>\r\n<p style=\"text-indent: 0em\">With markup like the above, simply stripping the tags with BeautifulSoup's <code>.text<\/code> gives:<\/p>\r\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\u6c17\u8c61\u5e81\u304d\u3057\u3087\u3046\u3061\u3087\u3046\u306f\u300c\u3082\u3063\u3068\u4e8c\u9178\u5316\u70ad\u7d20\u306b\u3055\u3093\u304b\u305f\u3093\u305d\u3092\u51fa\u3060\u3055\u306a\u3044\u3088\u3046\u306b\u3057\u306a\u3051\u308c\u3070\u306a\u308a\u307e\u305b\u3093\u300d\u3068\u8a71\u306f\u306a\u3057\u3066\u3044\u307e\u3059\u3002\r\n<\/pre>\r\n<p style=\"text-indent: 0em\">The ruby readings get mixed into the body text, and the result is not a proper sentence. The <code>rt<\/code> elements have to be removed beforehand.<\/p>\r\n<p>Doing this made me realize that <code>rt<\/code> is a different kind of tag (element) from the others. I plan to write about that in <a href=\"https:\/\/pasero.net\/~mako\/blog\/s\/2370\" >another post<\/a>.<\/p>\r\n<h4>Splitting into sentences<\/h4>\r\n<p><a 
href=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?ssl=1\" data-rel=\"lightbox-image-0\" data-rl_title=\"\" data-rl_caption=\"\" title=\"\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2365\" data-permalink=\"https:\/\/pasero.net\/~mako\/blog\/s\/2364\/scraping\" data-orig-file=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?fit=605%2C682&amp;ssl=1\" data-orig-size=\"605,682\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"\u30cb\u30e5\u30fc\u30b9\u3067\u30bf\u30a4\u30d4\u30f3\u30b0\u7df4\u7fd2\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?fit=266%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?fit=605%2C682&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?resize=266%2C300&#038;ssl=1\" alt=\"\" width=\"266\" height=\"300\" class=\"alignright size-medium wp-image-2365\" srcset=\"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?resize=266%2C300&amp;ssl=1 266w, https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?resize=170%2C192&amp;ssl=1 170w, https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/06\/scraping.png?w=605&amp;ssl=1 605w\" sizes=\"auto, (max-width: 266px) 100vw, 266px\" 
\/><\/a>Split the article at each \"\u3002\" and turn it into a list. I want to keep the \"\u3002\" itself, so a plain <code>split<\/code> will not do, since it drops the delimiter. Assuming that an NHK news article will never be missing the final \"\u3002\" at the very end, I use:<\/p>\r\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n        lines = re.findall(&quot;.*?\u3002&quot;, text)\r\n<\/pre>\r\n<p style=\"text-indent: 0em\">All that remains is to have Errbot show these one sentence at a time. With this, you can practice typing by echoing the bot's sentences back to it in an XMPP chat.<\/p>\r\n<p style=\"margin-top:2em\">Still, being able to do this much just by piecing existing things together really made me feel what a convenient world it has become.<\/p>\r\n<ol class=\"footnotes\"><li id=\"footnote_1_2364\" class=\"footnote\">Python's built-in <code>html.parser<\/code> can do this to some extent as well. Selenium also appears to have similar functionality.<span class=\"footnote-back-link-wrapper\"><a href=\"#identifier_1_2364\" class=\"footnote-link 
footnote-back-link\">&#8593;<\/a><\/span><\/li><\/ol>","protected":false},"excerpt":{"rendered":"As expected, typing practice that just echoes back a few hiragana characters got boring quickly, so I had to come up with different material. To keep it from getting boring, the source data needs to be either huge or frequently updated. Aozora Bunko, maybe? But for an elementary school student&hellip; <a class=\"more-link\" href=\"https:\/\/pasero.net\/~mako\/blog\/s\/2364\">Continue reading<span class=\"screen-reader-text\">: Typing Practice with Scraped News<\/span> <span class=\"meta-nav\" aria-hidden=\"true\">&rarr;<\/span><\/a>","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"blog: 
\u30cb\u30e5\u30fc\u30b9\u306e\u30b9\u30af\u30ec\u30a4\u30d4\u30f3\u30b0\u3067\u30bf\u30a4\u30d4\u30f3\u30b0\u7df4\u7fd2","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[77,75],"tags":[88,90,42,89,36,34,7],"class_list":["post-2364","post","type-post","status-publish","format-standard","hentry","category-internet","category-software","tag-errbot","tag-html","tag-jabber","tag-python","tag-xmpp","tag-internet","tag-software"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pLxlV-C8","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":652,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/652","url_meta":{"origin":2364,"position":0},"title":"\u30ea\u30e2\u30fc\u30c8\u306b\u7f6e\u3044\u3066\u3042\u308b\u97f3\u697d\u30d5\u30a1\u30a4\u30eb\u3092\u30ed\u30fc\u30ab\u30ebPC\u306e\u30b9\u30d4\u30fc\u30ab\u30fc\u3067\u9cf4\u3089\u3059","author":"Mako","date":"2014\u5e7410\u67081\u65e5(\u6c34)","format":false,"excerpt":"\u4ee5\u524d\u306b\u300c\u30a4\u30f3\u30bf\u30fc\u30cd\u30c3\u30c8\u30e9\u30b8\u30aa\u3092 FM \u30e9\u30b8\u30aa\u3067\u8074\u304f\u300d\u3068\u3044\u3046\u306e\u3092\u66f8\u304d\u307e\u3057\u305f\u3002\u5c11\u3057\u96e2\u308c\u305f\u5834\u6240\u306b\u3042\u308b\u30de\u30b7\u30f3\u2026","rel":"","context":"\u30bd\u30d5\u30c8\u30a6\u30a7\u30a2","block_context":{"text":"\u30bd\u30d5\u30c8\u30a6\u30a7\u30a2","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/software"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2187,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/2187","url_meta":{"origin":2364,"position":1},"title":"\u74b0\u5883\u5909\u6570 CHROMIUM_FLAGS \u3068 Chromium \u306e\u30aa\u30d7\u30b7\u30e7\u30f3 
--enable-remote-extensions","author":"Mako","date":"2017\u5e742\u67085\u65e5(\u65e5)","format":false,"excerpt":"\u81ea\u5206\u7528\u30e1\u30e2\u3002 \u307b\u304b\u306f\u3069\u3046\u304b\u77e5\u3089\u306a\u3044\u304c\u5c11\u306a\u304f\u3068\u3082\u3053\u308c\u3092\u66f8\u3044\u3066\u3044\u308b\u6642\u70b9\u306e Debian (Stretch\u2026","rel":"","context":"Debian","block_context":{"text":"Debian","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/debian"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2360,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/2360","url_meta":{"origin":2364,"position":2},"title":"XMPP \u30c1\u30e3\u30c3\u30c8\u3068 bot \u3067\u30bf\u30a4\u30d4\u30f3\u30b0\u7df4\u7fd2","author":"Mako","date":"2018\u5e745\u670831\u65e5(\u6728)","format":false,"excerpt":"\u30b9\u30a6\u3061\u3083\u3093 (\u4eee\u540d\u30014\u5e74\u751f) \u304c\u30d1\u30bd\u30b3\u30f3\u3067\u3084\u308a\u305f\u3044\u3053\u3068\u306e\u3072\u3068\u3064\u304c\u300c\u5b57\u3092\u6253\u3066\u308b\u3088\u3046\u306b\u306a\u308b\u300d\u3053\u3068\u3002 \u5b66\u6821\u2026","rel":"","context":"Jabber\/XMPP","block_context":{"text":"Jabber\/XMPP","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/xmpp-2"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2018\/05\/typingexercise-266x300.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":160,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/160","url_meta":{"origin":2364,"position":3},"title":"AT5IONT-I \u3067\u7d44\u3080 PC \u306b Debian \u3092\u30a4\u30f3\u30b9\u30c8\u30fc\u30eb","author":"Mako","date":"2010\u5e7410\u670815\u65e5(\u91d1)","format":false,"excerpt":"AT5IONT-I \u3067 PC \u3092\u7d44\u3080 PC 
\u306e1\u53f0\u306e\u30d5\u30a1\u30f3\u304c\u3046\u308b\u3055\u304f\u3066\u3057\u304b\u305f\u304c\u306a\u304f\u306a\u3063\u305f\u3002\u305d\u308c\u3060\u3051\u4ea4\u2026","rel":"","context":"Debian","block_context":{"text":"Debian","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/debian"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2676,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/2676","url_meta":{"origin":2364,"position":4},"title":"USB \u30a6\u30a7\u30d6\u30ab\u30e1\u30e9\u3092\u30bd\u30d5\u30c8\u7684\u306b\u7121\u52b9\u5316\u3059\u308b","author":"Mako","date":"2026\u5e741\u670827\u65e5(\u706b)","format":false,"excerpt":"OS \u306f Debian Linux\u3001\u30c7\u30b9\u30af\u30c8\u30c3\u30d7\u74b0\u5883\u306f XFCE \u3067\u3001\u30a6\u30a7\u30d6\u30ab\u30e1\u30e9\u3092\u4f7f\u7528\u3057\u3066\u3044\u308b\u3002\u5fc5\u2026","rel":"","context":"Debian","block_context":{"text":"Debian","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/debian"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2190,"url":"https:\/\/pasero.net\/~mako\/blog\/s\/2190","url_meta":{"origin":2364,"position":5},"title":"\u30aa\u30f3\u30e9\u30a4\u30f3\u30b9\u30c8\u30ec\u30fc\u30b8 hubiC \u3092\u30d0\u30c3\u30af\u30a2\u30c3\u30d7\u306b\u4f7f\u3063\u3066\u307f\u308b","author":"Mako","date":"2017\u5e742\u670818\u65e5(\u571f)","format":false,"excerpt":"PC \u306e\u30c7\u30fc\u30bf\u306e\u30d0\u30c3\u30af\u30a2\u30c3\u30d7\u306f\u3001\u5b85\u5185\u306b\u8907\u6570\u306e PC 
\u304c\u3042\u308b\u306e\u3067\u76f8\u4e92\u306b\u30b3\u30d4\u30fc\u3059\u308b\u3068\u3044\u3046\u5b89\u76f4\u306a\u65b9\u6cd5\u3092\u3068\u3063\u2026","rel":"","context":"Debian","block_context":{"text":"Debian","link":"https:\/\/pasero.net\/~mako\/blog\/s\/category\/debian"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/pasero.net\/~mako\/blog\/wp-content\/uploads\/2017\/02\/hubiC1-244x300.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/posts\/2364","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/comments?post=2364"}],"version-history":[{"count":0,"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/posts\/2364\/revisions"}],"wp:attachment":[{"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/media?parent=2364"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/categories?post=2364"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pasero.net\/~mako\/blog\/wp-json\/wp\/v2\/tags?post=2364"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}