Full Code of virus-warnning/twnews for AI

develop 4c7ef4360184 cached
64 files
470.9 KB
154.2k tokens
199 symbols
1 requests
Download .txt
Showing preview only (517K chars total). Download the full file or copy to clipboard to get everything.
Repository: virus-warnning/twnews
Branch: develop
Commit: 4c7ef4360184
Files: 64
Total size: 470.9 KB

Directory structure:
gitextract_2u5gsm6r/

├── .github/
│   └── ISSUE_TEMPLATE/
│       ├── bug_report.md
│       ├── feature_request.md
│       ├── improvement.md
│       └── magic.md
├── .gitignore
├── .pylintrc
├── LICENSE
├── PULL_REQUEST_TEMPLATE.md
├── README.md
├── README.rst
├── bin/
│   ├── busm_failed.py
│   ├── fin_schedule.py
│   ├── getnews.sh
│   ├── publish.py
│   └── weekly.py
├── docs/
│   ├── SOUP_NOTES.md
│   └── changelog.md
├── postman/
│   ├── TWMKT.postman_collection.json
│   └── TWMKT.postman_environment.json
├── requirements.txt
├── setup.py
└── twnews/
    ├── __init__.py
    ├── __main__.py
    ├── cache.py
    ├── common.py
    ├── conf/
    │   ├── logging.yaml
    │   ├── news-soup.json
    │   └── usage.txt
    ├── exceptions.py
    ├── finance/
    │   ├── __init__.py
    │   ├── broker.py
    │   ├── tdcc.py
    │   ├── ticksmap.py
    │   ├── tpex.py
    │   └── twse.py
    ├── res/
    │   ├── ticksmap-captcha.xrc
    │   ├── ticksmap-locations.geojson
    │   └── ticksmap-template.html
    ├── samples/
    │   ├── appledaily.html.xz
    │   ├── chinatimes.html.xz
    │   ├── cna.html.xz
    │   ├── ettoday.html.xz
    │   ├── ltn.html.xz
    │   ├── setn.html.xz
    │   └── udn.html.xz
    ├── search.py
    ├── soup.py
    └── tests/
        ├── __init__.py
        ├── search/
        │   ├── __init__.py
        │   ├── test_search_appledaily.py
        │   ├── test_search_cna.py
        │   ├── test_search_ettoday.py
        │   ├── test_search_ltn.py
        │   ├── test_search_setn.py
        │   └── test_search_udn.py
        └── soup/
            ├── __init__.py
            ├── test_soup_appledaily.py
            ├── test_soup_chinatimes.py
            ├── test_soup_cna.py
            ├── test_soup_common.py
            ├── test_soup_ettoday.py
            ├── test_soup_ltn.py
            ├── test_soup_setn.py
            └── test_soup_udn.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: 回報錯誤
title: 摘要一下問題吧
labels: bug
assignees: virus-warnning

---

**問題描述**

盡可能清楚地描述狀況,讓我觀落陰不會走火入魔

**重現方法**

用下列步驟可以讓程式爆掉
1. python3 -m twnews ...
2. ...

**預期行為**

我本來覺得應該會這樣 ...

**實際行為**

結果變成這樣 ...
```bash
$ python3 -m twnews soup 
---------------------------------------------------------------------------
路徑: https://tw.news.appledaily.com/local/realtime/20181025/1453825
頻道: appledaily
標題: 男子疑久病厭世 學校圍牆上吊輕生亡
日期: 2018-10-25 12:03:00
記者: None
內文:
台北市北投區西安街二段
```

**系統環境**

 - Python: [至少要 3.5 以上喔]
 - OS: [例如 Ubuntu 18.04]


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: 許個願望
title: 想要一個神奇的功能
labels: enhancement
assignees: virus-warnning

---

**許願動機**

描述一下這個願望能滿足什麼需求,或是對大家有什麼貢獻,現有功能無法滿足的原因等

**具體用法**

描述一下這個願望想做成什麼樣的形式,例如 function 定義,或是 command 用法


================================================
FILE: .github/ISSUE_TEMPLATE/improvement.md
================================================
---
name: Improvement
about: 現有功能改善
title: OOO效能改善
labels: improvement
assignees: virus-warnning

---

**問題摘要**

描述一下現有功能的問題,例如效能不佳,誤判率太高

**改善方式**

例如調整 SQL 語法,調整 Regex Pattern,或是改用OO演算法等


================================================
FILE: .github/ISSUE_TEMPLATE/magic.md
================================================
---
name: Magic
about: 變個魔法 (不提供訪客使用)
title: 變個魔法
labels: magic
assignees: virus-warnning

---

**咒語內容**

希望機器人可以自己發現八卦,然後自動告訴主人

**施法細節**

* [ ] 首先要 OOO
* [ ] 然後要 XXX


================================================
FILE: .gitignore
================================================
# Python 3
*.pyc
__pycache__

# Testing report
.coverage

# setup.py
dist
build
twnews.egg-info

# virtualenv
sandbox

# vim
*.swp

# MacOS
.DS_Store


================================================
FILE: .pylintrc
================================================
[MASTER]

# A comma-separated list of package or module names from where C extensions may
# be loaded. Extensions are loading into the active Python interpreter and may
# run arbitrary code.
extension-pkg-whitelist=

# Add files or directories to the blacklist. They should be base names, not
# paths.
ignore=CVS

# Add files or directories matching the regex patterns to the blacklist. The
# regex matches against base names, not paths.
ignore-patterns=

# Python code to execute, usually for sys.path manipulation such as
# pygtk.require().
#init-hook=

# Use multiple processes to speed up Pylint. Specifying 0 will auto-detect the
# number of processors available to use.
jobs=1

# Control the amount of potential inferred values when inferring a single
# object. This can help the performance when dealing with large functions or
# complex, nested conditions.
limit-inference-results=100

# List of plugins (as comma separated values of python modules names) to load,
# usually to register additional checkers.
load-plugins=

# Pickle collected data for later comparisons.
persistent=yes

# Specify a configuration file.
#rcfile=

# When enabled, pylint would attempt to guess common misconfiguration and emit
# user-friendly hints instead of false-positive error messages.
suggestion-mode=yes

# Allow loading of arbitrary C extensions. Extensions are imported into the
# active Python interpreter and may run arbitrary code.
unsafe-load-any-extension=no


[MESSAGES CONTROL]

# Only show warnings with the listed confidence levels. Leave empty to show
# all. Valid levels: HIGH, INFERENCE, INFERENCE_FAILURE, UNDEFINED.
confidence=

# Disable the message, report, category or checker with the given id(s). You
# can either give multiple identifiers separated by comma (,) or put this
# option multiple times (only on the command line, not in the configuration
# file where it should appear only once). You can also use "--disable=all" to
# disable everything first and then reenable specific checks. For example, if
# you want to run only the similarities checker, you can use "--disable=all
# --enable=similarities". If you want to run only the classes checker, but have
# no Warning level messages displayed, use "--disable=all --enable=classes
# --disable=W".
disable=print-statement,
        parameter-unpacking,
        unpacking-in-except,
        old-raise-syntax,
        backtick,
        long-suffix,
        old-ne-operator,
        old-octal-literal,
        import-star-module-level,
        non-ascii-bytes-literal,
        raw-checker-failed,
        bad-inline-option,
        locally-disabled,
        file-ignored,
        suppressed-message,
        useless-suppression,
        deprecated-pragma,
        use-symbolic-message-instead,
        apply-builtin,
        basestring-builtin,
        buffer-builtin,
        cmp-builtin,
        coerce-builtin,
        execfile-builtin,
        file-builtin,
        long-builtin,
        raw_input-builtin,
        reduce-builtin,
        standarderror-builtin,
        unicode-builtin,
        xrange-builtin,
        coerce-method,
        delslice-method,
        getslice-method,
        setslice-method,
        no-absolute-import,
        old-division,
        dict-iter-method,
        dict-view-method,
        next-method-called,
        metaclass-assignment,
        indexing-exception,
        raising-string,
        reload-builtin,
        oct-method,
        hex-method,
        nonzero-method,
        cmp-method,
        input-builtin,
        round-builtin,
        intern-builtin,
        unichr-builtin,
        map-builtin-not-iterating,
        zip-builtin-not-iterating,
        range-builtin-not-iterating,
        filter-builtin-not-iterating,
        using-cmp-argument,
        eq-without-hash,
        div-method,
        idiv-method,
        rdiv-method,
        exception-message-attribute,
        invalid-str-codec,
        sys-max-int,
        bad-python3-import,
        deprecated-string-function,
        deprecated-str-translate-call,
        deprecated-itertools-function,
        deprecated-types-field,
        next-method-defined,
        dict-items-not-iterating,
        dict-keys-not-iterating,
        dict-values-not-iterating,
        deprecated-operator-function,
        deprecated-urllib-function,
        xreadlines-attribute,
        deprecated-sys-function,
        exception-escape,
        comprehension-escape,
        duplicate-code,
        fixme

# Enable the message, report, category or checker with the given id(s). You can
# either give multiple identifier separated by comma (,) or put this option
# multiple time (only on the command line, not in the configuration file where
# it should appear only once). See also the "--disable" option for examples.
enable=c-extension-no-member


[REPORTS]

# Python expression which should return a note less than 10 (10 is the highest
# note). You have access to the variables errors warning, statement which
# respectively contain the number of errors / warnings messages and the total
# number of statements analyzed. This is used by the global evaluation report
# (RP0004).
evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)

# Template used to display messages. This is a python new-style format string
# used to format the message information. See doc for all details.
#msg-template=

# Set the output format. Available formats are text, parseable, colorized, json
# and msvs (visual studio). You can also give a reporter class, e.g.
# mypackage.mymodule.MyReporterClass.
output-format=text

# Tells whether to display a full report or only the messages.
reports=no

# Activate the evaluation score.
score=yes


[REFACTORING]

# Maximum number of nested blocks for function / method body
max-nested-blocks=5

# Complete name of functions that never returns. When checking for
# inconsistent-return-statements if a never returning function is called then
# it will be considered as an explicit return statement and no message will be
# printed.
never-returning-functions=sys.exit


[LOGGING]

# Format style used to check logging format string. `old` means using %
# formatting, while `new` is for `{}` formatting.
logging-format-style=old

# Logging modules to check that the string format arguments are in logging
# function parameter format.
logging-modules=logging


[SPELLING]

# Limits count of emitted suggestions for spelling mistakes.
max-spelling-suggestions=4

# Spelling dictionary name. Available dictionaries: none. To make it working
# install python-enchant package..
spelling-dict=

# List of comma separated words that should not be checked.
spelling-ignore-words=

# A path to a file that contains private dictionary; one word per line.
spelling-private-dict-file=

# Tells whether to store unknown words to indicated private dictionary in
# --spelling-private-dict-file option instead of raising a message.
spelling-store-unknown-words=no


[MISCELLANEOUS]

# List of note tags to take in consideration, separated by a comma.
notes=FIXME,
      XXX,
      TODO


[TYPECHECK]

# List of decorators that produce context managers, such as
# contextlib.contextmanager. Add to this list to register other decorators that
# produce valid context managers.
contextmanager-decorators=contextlib.contextmanager

# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
generated-members=

# Tells whether missing members accessed in mixin class should be ignored. A
# mixin class is detected if its name ends with "mixin" (case insensitive).
ignore-mixin-members=yes

# Tells whether to warn about missing members when the owner of the attribute
# is inferred to be None.
ignore-none=yes

# This flag controls whether pylint should warn about no-member and similar
# checks whenever an opaque object is returned when inferring. The inference
# can return multiple potential results while evaluating a Python object, but
# some branches might not be evaluated, which results in partial inference. In
# that case, it might be useful to still emit no-member and other checks for
# the rest of the inferred objects.
ignore-on-opaque-inference=yes

# List of class names for which member attributes should not be checked (useful
# for classes with dynamically set attributes). This supports the use of
# qualified names.
ignored-classes=optparse.Values,thread._local,_thread._local

# List of module names for which member attributes should not be checked
# (useful for modules/projects where namespaces are manipulated during runtime
# and thus existing member attributes cannot be deduced by static analysis. It
# supports qualified module names, as well as Unix pattern matching.
ignored-modules=

# Show a hint with possible names when a member name was not found. The aspect
# of finding the hint is based on edit distance.
missing-member-hint=yes

# The minimum edit distance a name should have in order to be considered a
# similar match for a missing member name.
missing-member-hint-distance=1

# The total number of similar names that should be taken in consideration when
# showing a hint for a missing member.
missing-member-max-choices=1


[VARIABLES]

# List of additional names supposed to be defined in builtins. Remember that
# you should avoid defining new builtins when possible.
additional-builtins=

# Tells whether unused global variables should be treated as a violation.
allow-global-unused-variables=yes

# List of strings which can identify a callback function by name. A callback
# name must start or end with one of those strings.
callbacks=cb_,
          _cb

# A regular expression matching the name of dummy variables (i.e. expected to
# not be used).
dummy-variables-rgx=_+$|(_[a-zA-Z0-9_]*[a-zA-Z0-9]+?$)|dummy|^ignored_|^unused_

# Argument names that match this expression will be ignored. Default to name
# with leading underscore.
ignored-argument-names=_.*|^ignored_|^unused_

# Tells whether we should check for unused import in __init__ files.
init-import=no

# List of qualified module names which can have objects that can redefine
# builtins.
redefining-builtins-modules=six.moves,past.builtins,future.builtins,builtins,io


[FORMAT]

# Expected format of line ending, e.g. empty (any line ending), LF or CRLF.
expected-line-ending-format=

# Regexp for a line that is allowed to be longer than the limit.
ignore-long-lines=^\s*(# )?<?https?://\S+>?$

# Number of spaces of indent required inside a hanging or continued line.
indent-after-paren=4

# String used as indentation unit. This is usually "    " (4 spaces) or "\t" (1
# tab).
indent-string='    '

# Maximum number of characters on a single line.
max-line-length=100

# Maximum number of lines in a module.
max-module-lines=1000

# List of optional constructs for which whitespace checking is disabled. `dict-
# separator` is used to allow tabulation in dicts, etc.: {1  : 1,\n222: 2}.
# `trailing-comma` allows a space between comma and closing bracket: (a, ).
# `empty-line` allows space-only lines.
no-space-check=trailing-comma,
               dict-separator

# Allow the body of a class to be on the same line as the declaration if body
# contains single statement.
single-line-class-stmt=no

# Allow the body of an if to be on the same line as the test if there is no
# else.
single-line-if-stmt=no


[SIMILARITIES]

# Ignore comments when computing similarities.
ignore-comments=yes

# Ignore docstrings when computing similarities.
ignore-docstrings=yes

# Ignore imports when computing similarities.
ignore-imports=no

# Minimum lines number of a similarity.
min-similarity-lines=4


[BASIC]

# Naming style matching correct argument names.
argument-naming-style=snake_case

# Regular expression matching correct argument names. Overrides argument-
# naming-style.
#argument-rgx=

# Naming style matching correct attribute names.
attr-naming-style=snake_case

# Regular expression matching correct attribute names. Overrides attr-naming-
# style.
#attr-rgx=

# Bad variable names which should always be refused, separated by a comma.
bad-names=foo,
          bar,
          baz,
          toto,
          tutu,
          tata

# Naming style matching correct class attribute names.
class-attribute-naming-style=any

# Regular expression matching correct class attribute names. Overrides class-
# attribute-naming-style.
#class-attribute-rgx=

# Naming style matching correct class names.
class-naming-style=PascalCase

# Regular expression matching correct class names. Overrides class-naming-
# style.
#class-rgx=

# Naming style matching correct constant names.
const-naming-style=UPPER_CASE

# Regular expression matching correct constant names. Overrides const-naming-
# style.
#const-rgx=

# Minimum line length for functions/classes that require docstrings, shorter
# ones are exempt.
docstring-min-length=-1

# Naming style matching correct function names.
function-naming-style=snake_case

# Regular expression matching correct function names. Overrides function-
# naming-style.
#function-rgx=

# Good variable names which should always be accepted, separated by a comma.
good-names=i,
           j,
           k,
           ex,
           Run,
           _

# Include a hint for the correct naming format with invalid-name.
include-naming-hint=no

# Naming style matching correct inline iteration names.
inlinevar-naming-style=any

# Regular expression matching correct inline iteration names. Overrides
# inlinevar-naming-style.
#inlinevar-rgx=

# Naming style matching correct method names.
method-naming-style=snake_case

# Regular expression matching correct method names. Overrides method-naming-
# style.
#method-rgx=

# Naming style matching correct module names.
module-naming-style=snake_case

# Regular expression matching correct module names. Overrides module-naming-
# style.
#module-rgx=

# Colon-delimited sets of names that determine each other's naming style when
# the name regexes allow several styles.
name-group=

# Regular expression which should only match function or class names that do
# not require a docstring.
no-docstring-rgx=^_

# List of decorators that produce properties, such as abc.abstractproperty. Add
# to this list to register other decorators that produce valid properties.
# These decorators are taken in consideration only for invalid-name.
property-classes=abc.abstractproperty

# Naming style matching correct variable names.
variable-naming-style=snake_case

# Regular expression matching correct variable names. Overrides variable-
# naming-style.
#variable-rgx=


[STRING]

# This flag controls whether the implicit-str-concat-in-sequence should
# generate a warning on implicit string concatenation in sequences defined over
# several lines.
check-str-concat-over-line-jumps=no


[IMPORTS]

# Allow wildcard imports from modules that define __all__.
allow-wildcard-with-all=no

# Analyse import fallback blocks. This can be used to support both Python 2 and
# 3 compatible code, which means that the block might have code that exists
# only in one or another interpreter, leading to false positives when analysed.
analyse-fallback-blocks=no

# Deprecated modules which should not be used, separated by a comma.
deprecated-modules=optparse,tkinter.tix

# Create a graph of external dependencies in the given file (report RP0402 must
# not be disabled).
ext-import-graph=

# Create a graph of every (i.e. internal and external) dependencies in the
# given file (report RP0402 must not be disabled).
import-graph=

# Create a graph of internal dependencies in the given file (report RP0402 must
# not be disabled).
int-import-graph=

# Force import order to recognize a module as part of the standard
# compatibility libraries.
known-standard-library=

# Force import order to recognize a module as part of a third party library.
known-third-party=enchant


[CLASSES]

# List of method names used to declare (i.e. assign) instance attributes.
defining-attr-methods=__init__,
                      __new__,
                      setUp

# List of member names, which should be excluded from the protected access
# warning.
exclude-protected=_asdict,
                  _fields,
                  _replace,
                  _source,
                  _make

# List of valid names for the first argument in a class method.
valid-classmethod-first-arg=cls

# List of valid names for the first argument in a metaclass class method.
valid-metaclass-classmethod-first-arg=cls


[DESIGN]

# Maximum number of arguments for function / method.
max-args=5

# Maximum number of attributes for a class (see R0902).
max-attributes=7

# Maximum number of boolean expressions in an if statement.
max-bool-expr=5

# Maximum number of branch for function / method body.
max-branches=12

# Maximum number of locals for function / method body.
max-locals=15

# Maximum number of parents for a class (see R0901).
max-parents=7

# Maximum number of public methods for a class (see R0904).
max-public-methods=20

# Maximum number of return / yield for function / method body.
max-returns=6

# Maximum number of statements in function / method body.
max-statements=50

# Minimum number of public methods for a class (see R0903).
min-public-methods=2


[EXCEPTIONS]

# Exceptions that will emit a warning when being caught. Defaults to
# "BaseException, Exception".
overgeneral-exceptions=BaseException,
                       Exception


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Raymond Wu (小璋丸)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: PULL_REQUEST_TEMPLATE.md
================================================
**目的與用途**

描述一下這個 PR 的目的與用途

**被刪除的功能**

如果有刪除某些功能,請說明刪除原因以及效益

**被刪除的測試程式**

如果有刪除測試程式,請說明為何不再需要測試,以及對應的替代方案


================================================
FILE: README.md
================================================
台灣新聞拆拆樂的用來蒐集或處理大眾感興趣的資訊,目前著重在新聞和市場情報,希望能藉由這個工具讓其他人變出各種好玩的衍生應用,目前提供這些功能:

* 將新聞網頁拆出標題,內文,報導時間,記者
* 關鍵字查新聞,並且可接續拆新聞或新聞數
* 蒐集三大法人,信用交易,借券賣出,股權分散表等市場資訊

## 為什麼做這個?

最早的開發動機,是希望能做個工具能快速找到最新的凶宅,用來輔助我的另一個專案[鄉民風水師](https://geomancer.tacosync.com/),實現資料自動更新,所以做了拆站工具和搜尋爬蟲。做了一段時間也開始有話題性,就修了一些大家反應的問題。然後希望維護這專案的負擔不要太重,又寫了一些能發財的東西,然後就變成現在的樣子。如果想了解更多以前的事,可以參考[版本異動紀錄](docs/changelog.md)。

如果你是來找發財工具,另一個乏人問津的專案[群益證券聽牌機](https://github.com/virus-warnning/skcom)也歡迎來關注一下。 

## 怎麼用?

**系統需求**

* OS: 不限
* Python: 3.5 ~ 3.7

**安裝**

```sh
# Windows (只有 Python 3.x 的環境)
pip install twnews

# macOS, Linux (Python 2.x 與 3.x 並存的環境)
pip3 install twnews
```

**體驗**

為了方便有緣人使用,開發了一些 console 工具讓人可以不用寫程式就立即體驗

例如朝聖一則凶宅新聞:

```bash
python3 -m twnews soup https://www.setn.com/News.aspx?NewsID=490688&From=Search&Key=%E5%8F%B0%E4%B8%AD%E6%9C%80%E7%8C%9B%E5%85%87%E5%AE%85%EF%BC%81%E9%80%A36%E5%90%8D%E6%88%BF%E5%AE%A2%E5%90%8C%E4%BD%8D%E7%BD%AE%E4%B8%8A%E5%90%8A
---------------------------------------------------------------------------
原始路徑: https://www.setn.com/News.aspx?NewsID=490688
最終路徑: https://www.setn.com/m/news.aspx?newsid=490688
頻道: setn
標題: 堪稱台中最猛兇宅!連續6名房客上吊 竟都死在相同位置
日期: 2019-02-07 20:00:00
記者: 林盈君
內文:
記者林盈君/台中報導許多人不論是租屋、購屋,都擔心會碰上兇宅,有位職業為房仲的網友,就在PTT上,分享他曾經遇過的詭異事件!
他說約在幾年前,接到一個同行介紹的案子,是一間位於台中沙鹿的頂樓老公寓,當時他看到房屋價格時,簡直低到讓他不敢相信,詢問過
屋主之後,屋主表示屋內曾有人上吊死亡,而且是「連續6名房客」,且竟然都在「同一位置」選擇輕生!▲該公寓曾發生6名租客上吊事件
。(圖/示意圖/翻攝pixabay)據悉,幾年前曾有一名房客,在屋內上吊自殺,屋主之後請了法師來超渡誦經,就低價再次租給其他房客
,不料6個月後,該房客卻遲遲沒有交租金,就連簡訊電話都無回應,他只好請警察協助開門,沒想到進到屋內就發現,房客已經上吊死亡
,而且死亡的位置,跟第一個房客上吊的地方「一模一樣」,實在令人毛骨悚然。▲許多人認為這根本是「抓交替」。(圖/示意圖/翻攝
pixabay)在那之後,屋主連續租給其他幾位房客,但卻頻頻發生意外,6位租客全都選擇在屋內上吊自殺,上吊的位置也全部一模一樣,
這樣詭異的情況,也讓轄區員警懷疑,屋主是不是殺害租客後,用上吊來掩蓋罪行,讓屋主相當崩潰!屋主曾無奈表示「1、2個就算了,是
連續6個租客你知道嗎?」、「這6個租客我根本不知道為什麼他們最後都上吊死了,而且上吊的位置全都一模一樣」。更加詭異的是,連續
6名房客上吊身亡後,屋主有2年多都沒再把房子出租,屋內空無一人,但對面鄰居卻跟他表示,看到有人進出他的房子,樓下的住戶也說,
上面常常傳來很用力踩地板、拖著椅子及摔東西的聲音,甚至連一樓的住戶,都說看到房子的陽台上有人,但等到屋主拿鑰匙進屋查看時,
發現從門口到陽台都是厚厚的灰塵,完全沒有任何人走動的跡象。因為發生過命案,再加上接連詭異的情況,讓對面鄰居和樓下住戶都陸續
搬走,也有房仲找投資客來看房,但大家都覺得這房子太邪門,認為根本是「抓交替」。後來有位投資客表示意願想買,但到簽約當天屋主
卻反悔,表示不想讓對方成為上吊身亡的「第七個人」,所以直到現在,那間房子依然空在那裡,也沒有其他房客入住。
★ 三立新聞網提醒您:勇敢求救並非弱者,生命一定可以找到出路。透過守門123步驟-1問2應3轉介,你我都可以成為自殺防治守門人。
※ 安心專線:0800-788-995(0800-請幫幫-救救我)
※ 張老師專線:1980
※ 生命線專線:1995
※ 反霸凌專線:0800-200-885
有效內容率: 2.16%
---------------------------------------------------------------------------
```

## 深入了解

* [新聞 Console 工具](https://github.com/virus-warnning/twnews/wiki/01.%E6%96%B0%E8%81%9E-Console-%E5%B7%A5%E5%85%B7)
* [新聞 API](https://github.com/virus-warnning/twnews/wiki/02.%E6%96%B0%E8%81%9E-API)
* [股市情報蒐集工具](https://github.com/virus-warnning/twnews/wiki/03.%E8%82%A1%E5%B8%82%E6%83%85%E5%A0%B1%E8%92%90%E9%9B%86%E5%B7%A5%E5%85%B7)


================================================
FILE: README.rst
================================================
台灣新聞拆拆樂 (twnews) 用來分解台灣各大新聞網站,取出重要的純文字內容

功能
====

- 支援蘋果日報、中時電子報、中央社、東森新聞雲、自由時報、三立新聞網、聯合新聞網
- 使用行動版網頁與快取機制節省流量
- 盡可能找出記者姓名
- 利用 BeautifulSoup 的 CSS selector 功能搭配設定檔分解,容易同步網站改版
- 解決 Python for Windows CP950 編碼問題,節省處理鳥事的時間
- 每週執行自動化測試檢查分解程式的時效性

0.3.2 新功能

- 更新東森新聞爬蟲設定
- 解決自由地產、自由時尚因 lxml 版本太舊而拆失敗的問題
- 新增證交所資料蒐集程式 (三大法人、信用交易、借券賣出、鉅額交易、ETF淨值溢價率)
- 改善集保中心資料蒐集程式 (股權分散表)

0.3.1 新功能

- 跟進中時電子報的排版變更
- 支援蘋果地產
- 股權分散表資料表結構改善
- 股權分散表資料表日期改為 ISO 格式

0.3.0 新功能

- 股權分散表 CSV 檔蒐集程式
- 修正自由時報娛樂新聞分解問題 `#50 <https://github.com/virus-warnning/twnews/issues/50>`_ (回報者: `CpOuyang <https://github.com/CpOuyang>`_)

0.2.4 新功能

- 改善記者姓名辨識能力
- 自由時報分類新聞自動切換手機版
- 修正自由時報部分新聞無法取得日期問題
- 修正中國時報部分新聞無法取得記者問題
- 修正蘋果日報搜尋功能自動翻頁問題
- 增加測試項目

安裝
====

.. code:: bash

  pip3 install twnews

工具程式
========

.. code:: bash

  # 拆新聞
  python3 -m twnews soup https://tw.news.appledaily.com/local/realtime/20181025/1453825

  # 搜尋
  python3 -m twnews search 韓國瑜 udn

  # 搜尋 + 拆
  python3 -m twnews snsp 酒駕

  # 統計關鍵字出現在標題的次數
  python3 -m twnews cpkw 柯文哲

  # 查看用法
  python3 -m twnews help

範例 - 分解新聞
===============

.. code:: python

  from twnews.soup import NewsSoup

  nsoup = NewsSoup('https://tw.news.appledaily.com/local/realtime/20181025/1453825')
  print('頻道: {}'.format(nsoup.channel))
  print('標題: {}'.format(nsoup.title()))
  print('日期: {}'.format(nsoup.date()))
  print('記者: {}'.format(nsoup.author()))
  print('內文:')
  print(nsoup.contents())
  print('有效內容率: {:.2f}%'.format(nsoup.effective_text_rate() * 100))

.. code:: text

  頻道: appledaily
  標題: 男子疑久病厭世 學校圍牆上吊輕生亡│即時新聞│20181025│蘋果日報
  日期: 2018-10-25 12:03:00
  記者: 江宏倫
  內文:
  台北市北投區西安街二段,昨晚10時許,1名游姓男子(約80歲)坐在學校圍牆邊上吊輕生,路過民眾驚見嚇得趕緊報案,警消趕抵,時發現輕生男子已經沒有生命跡象,緊急送醫搶救仍宣告不治,警方初步調查排除外力介入,輕生原因仍有待釐清。

  警消表示,抵達現場時,發現游男坐在某國中圍牆邊上吊輕生,立即將他救下,但已無呼吸心跳,立即進行CPR並送醫搶救,家屬接獲通知趕抵醫院,同意放棄急救。警方調查,年約80多歲的游男,疑似因長期洗腎又患有心臟疾病、糖尿病才會想不開,現場並無打鬥痕跡,初步已排除外力介入,詳細輕生原因仍待調查釐清。(突發中心江宏倫/台北報導)《蘋果》關心你自殺解決不了問題,卻留給家人無比悲痛。請珍惜生命。再給自己一次機會自殺防治諮詢安心專線:0800-788995(24小時) 生命線協談專線:1995 張老師專線:1980出版時間02:07更新時間12:03


  >>加入蘋果日報粉絲團94即時94狂!
  有效內容率: 1.37%

範例 - 關鍵字搜尋 + 分解新聞
============================

.. code:: python

  from twnews.search import NewsSearch

  nsearch = NewsSearch(
    'ltn',
    limit=10,
    beg_date='2018-08-03', # 自由時報的日期範圍只能在 90 天以內
    end_date='2018-11-01'
  )
  nsoups = nsearch.by_keyword('上吊', title_only=True).to_soup_list()

  for (i, nsoup) in enumerate(nsoups):
      print('{:03d}: {}'.format(i, nsoup.path))
      if nsoup.title() is not None:
          print('     記者: {} / 日期: {}'.format(nsoup.author(), nsoup.date()))
          print('     標題: {}'.format(nsoup.title()))
          print('     {} ...'.format(nsoup.contents()[0:30]))
      else:
          print('     新聞分解失敗,無法識別 DOM 結構')

.. code:: text

  000: http://m.ltn.com.tw/news/society/breakingnews/2581807
       記者: None / 日期: 2018-10-15 23:51:00
       標題: 疑因病厭世 男子國小圖書館上吊身亡
       〔即時新聞/綜合報導〕台北市萬華區的老松國小今(15)日早上 ...
  001: http://m.ltn.com.tw/news/society/breakingnews/2579780
       記者: None / 日期: 2018-10-13 16:52:00
       標題: 汐止五指山驚傳男子上吊 水管繞頸陳屍樹林
       〔記者林嘉東、吳昇儒/新北報導〕台北市郭姓男子今天午後被發現 ...
  002: http://m.ltn.com.tw/news/entertainment/breakingnews/2579590
       新聞分解失敗,無法識別 DOM 結構
  003: http://m.ltn.com.tw/news/society/breakingnews/2577987
       記者: 謝武雄 / 日期: 2018-10-11 18:10:00
       標題: 議員尿急樹林解放赫見白骨 男子上吊這天正好滿七...
       [記者謝武雄/桃園報導]桃園市大園選區市議員游吾和昨天在臉書 ...
  004: http://m.ltn.com.tw/news/entertainment/breakingnews/2577596
       新聞分解失敗,無法識別 DOM 結構
  005: http://m.ltn.com.tw/news/society/breakingnews/2570595
       記者: 吳仁捷 / 日期: 2018-10-04 13:40:00
       標題: 疑借貸千萬翻身失敗 公墓上吊嚇壞爬山男
       〔記者吳仁捷/新北報導〕章姓男子今天上午到新北市樹林大同山區 ...
  006: http://m.ltn.com.tw/news/entertainment/breakingnews/2567740
       新聞分解失敗,無法識別 DOM 結構
  007: http://m.ltn.com.tw/news/life/breakingnews/2567637
       記者: None / 日期: 2018-10-01 23:35:00
       標題: 「肉粽」難送! 員林三合院連5人在「同條樑」上吊
       〔即時新聞/綜合報導〕在彰化沿海一帶,為上吊身亡者「送肉棕」 ...
  008: http://m.ltn.com.tw/news/society/breakingnews/2561962
       記者: None / 日期: 2018-09-26 11:08:00
       標題: 男子北美館樹林上吊亡 警到場調查
       〔即時新聞/綜合報導〕今天上午10時許,台北市立美術館停車場 ...
  009: http://m.ltn.com.tw/news/society/breakingnews/2561566
       記者: 黃良傑 / 日期: 2018-09-25 18:05:00
       標題: 美籍女師上吊租屋處身亡 美籍男友:房內發現遺書
       〔記者黃良傑/高雄報導〕一名美籍女老師今午被男友發現陳屍租屋 ...


================================================
FILE: bin/busm_failed.py
================================================
import os
import sys
import threading

sys.path.append(os.path.realpath('.'))

import twnews.common as cm

# 這一行搭配 logger.error() 可以讓程式在 Mac 環境爆掉
#   objc[70893]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
#   objc[70893]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
# 但目前找不到具體原因
import twnews.finance.twse as twse

def main():
    if os.fork() > 0:
        exit(0)

    logger = cm.get_logger('finance')
    logger.error('test')

if __name__ == '__main__':
    main()


================================================
FILE: bin/fin_schedule.py
================================================
import logging
import os
import schedule
import signal
import sys
import threading
import time
from datetime import datetime, timedelta

sys.path.append(os.path.realpath('.'))

import twnews.common as cm
import twnews.finance.twse as twse
import twnews.finance.tpex as tpex
import twnews.finance.tdcc as tdcc

class JustDaemon:
    """
    定型化 Daemon
    TODO: 分割成獨立套件
    """

    def __init__(self, init_task=None, loop_task=None, stdout='/dev/null', stderr='/dev/null', background=True):
        if background and os.fork() > 0:
            exit(0)

        self.init_task = init_task if init_task is not None else self.do_nothing
        self.loop_task = loop_task if loop_task is not None else self.do_nothing
        self.stdout = stdout
        self.stderr = stderr
        self.background = background
        self.close_requested = False

    def do_nothing(self):
        pass

    def on_quit(self, signum, frame):
        self.close_requested = True

    def listen_sys_signals(self):
        ACCEPTED_SIGNALS = [
            signal.SIGHUP,  # 1
            signal.SIGINT,  # 2
            signal.SIGQUIT, # 3
            signal.SIGABRT, # 6
            signal.SIGTERM  # 15
        ]
        for sig in ACCEPTED_SIGNALS:
            signal.signal(sig, self.on_quit)

    def stream_redirect(self):
        if self.background:
            outpath = os.path.expanduser(self.stdout)
            sys.stdout = open(outpath, 'w')
            errpath = os.path.expanduser(self.stderr)
            sys.stderr = open(errpath, 'w')

    def stream_flush(self):
        if self.background:
            sys.stdout.flush()
            sys.stderr.flush()

    def stream_close(self):
        if self.background:
            sys.stdout.close()
            sys.stderr.close()

    def pidfile_create(self):
        # 產生 pid file
        self.pid_file = os.path.expanduser('~/.twnews/fin_schedule.pid')
        pid_base = os.path.dirname(self.pid_file)
        if not os.path.isdir(pid_base):
            os.makedirs(pid_base)
        with open(self.pid_file, 'w') as pid_stream:
            pid_stream.write('%s' % os.getpid())

    def pidfile_remove(self):
        # 移除 pid file
        os.remove(self.pid_file)

    def run(self):
        self.listen_sys_signals()
        self.stream_redirect()
        self.pidfile_create()

        attrs = {}
        self.init_task(attrs)
        while not self.close_requested:
            self.loop_task(attrs)
            self.stream_flush()

            # Make the next waking near 0 second.
            t = time.time()
            delay = 1 - (t - int(t))
            time.sleep(delay)

        self.pidfile_remove()
        self.stream_close()

class ScheduleDaemon(JustDaemon):
    """
    排程型 Daemon
    TODO: 分割成獨立套件
    """

    def __init__(self, schedule_table, stdout='/dev/null', stderr='/dev/null', background=True):
        self.schedule_table = schedule_table
        super().__init__(
            init_task = self.init_task,
            loop_task = self.loop_task,
            stdout = stdout,
            stderr = stderr,
            background = background
        )

    def init_task(self, attrs):
        for run_at in self.schedule_table:
            func = self.schedule_table[run_at]['func']
            args = self.schedule_table[run_at]['args']
            weekend = True
            if 'weekend' in self.schedule_table[run_at]:
                weekend = self.schedule_table[run_at]['weekend']

            if weekend:
                schedule.every().day.at(run_at).do(self.run_parallel, func, args)
            else:
                schedule.every().monday.at(run_at).do(self.run_parallel, func, args)
                schedule.every().tuesday.at(run_at).do(self.run_parallel, func, args)
                schedule.every().wednesday.at(run_at).do(self.run_parallel, func, args)
                schedule.every().thursday.at(run_at).do(self.run_parallel, func, args)
                schedule.every().friday.at(run_at).do(self.run_parallel, func, args)

    def loop_task(self, attrs):
        schedule.run_pending()

    def run_parallel(self, func, args):
        th = threading.Thread(target=func, args=args)
        th.start()

def im_fine():
    logger = cm.get_logger('finance')
    logger.info('I\'m fine. (%s)', __file__)

def tpe_at(timestr):
    hh_adjust = (time.altzone / -3600) - 8
    hh = (int(timestr[0:2]) + hh_adjust) % 24
    mm = int(timestr[3:5])
    return '%02d:%02d' % (hh, mm)

def main():
    # 注意!! args 要用 list 不可以用 tuple,否則傳遞一個字串時,會把每個字元當成一個參數傳遞
    ScheduleDaemon(
        schedule_table = {
            # 每日健康回報
            tpe_at('09:30'): { 'func': im_fine, 'args': [] },
            tpe_at('14:00'): { 'func': im_fine, 'args': [] },
            # 證交所
            tpe_at('14:09'): { 'func': twse.sync_dataset, 'args': ['borrowed'], 'weekend': False },
            tpe_at('15:57'): { 'func': twse.sync_dataset, 'args': ['etfnet'], 'weekend': False },
            tpe_at('16:44'): { 'func': twse.sync_dataset, 'args': ['institution'], 'weekend': False },
            tpe_at('17:33'): { 'func': twse.sync_dataset, 'args': ['block'], 'weekend': False },
            tpe_at('21:41'): { 'func': twse.sync_dataset, 'args': ['margin'], 'weekend': False },
            tpe_at('21:42'): { 'func': twse.sync_dataset, 'args': ['selled'], 'weekend': False },
            # 櫃買中心
            tpe_at('16:49'): { 'func': tpex.sync_dataset, 'args': ['institution'], 'weekend': False },
            tpe_at('17:48'): { 'func': tpex.sync_dataset, 'args': ['block'], 'weekend': False },
            tpe_at('21:47'): { 'func': tpex.sync_dataset, 'args': ['margin'], 'weekend': False },
            tpe_at('21:48'): { 'func': tpex.sync_dataset, 'args': ['selled'], 'weekend': False },
            # 集保中心
            tpe_at('07:01'): { 'func': tdcc.sync_dataset, 'args': [] },
        },
        # background = False
    ).run()

if __name__ == '__main__':
    main()


================================================
FILE: bin/getnews.sh
================================================
# 製作新聞離線範本
SAMPLE_DIR='twnews/samples'
USER_AGENT='Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36'

wget -U "$USER_AGENT" -O $SAMPLE_DIR/appledaily.html https://tw.news.appledaily.com/local/realtime/20160521/867195
wget -U "$USER_AGENT" -O $SAMPLE_DIR/chinatimes.html https://www.chinatimes.com/realtimenews/20180916001767-260402
wget -U "$USER_AGENT" -O $SAMPLE_DIR/cna.html https://www.cna.com.tw/news/asoc/201603190029-1.aspx
wget -U "$USER_AGENT" -O $SAMPLE_DIR/ettoday.html https://www.ettoday.net/news/20171209/1069025.htm
wget -U "$USER_AGENT" -O $SAMPLE_DIR/ltn.html https://m.ltn.com.tw/news/life/breakingnews/2504351
wget -U "$USER_AGENT" -O $SAMPLE_DIR/setn.html https://www.setn.com/m/News.aspx?NewsID=350370
wget -U "$USER_AGENT" -O $SAMPLE_DIR/udn.html https://udn.com/news/story/7315/3705102

# $WGET -O $SAMPLE_DIR/judicial-cp950.html http://aomp.judicial.gov.tw/abbs/wkw/WHD2ASHOW.jsp?rowid=%2Fsld%2F10703%2F09162840289.020
# iconv -f cp950 -t utf8 $SAMPLE_DIR/judicial-cp950.html > $SAMPLE_DIR/judicial.html
# rm -f $SAMPLE_DIR/judicial-cp950.html

xz -f $SAMPLE_DIR/*.html


================================================
FILE: bin/publish.py
================================================
#!/usr/bin/env python3
#
# 必要工具套件:
# * pylint
# * rstcheck
# * setuptools

import configparser
import os
import re
import subprocess
import sys

def get_wheel():
    """ 製作 wheel 檔案與取得檔名 """
    comp = subprocess.run(['python3', 'setup.py', 'bdist_wheel'], capture_output=True)
    if comp.returncode != 0:
        return False

    wheel = False
    stdout = comp.stdout.decode('utf-8').split('\n')
    for line in stdout:
        match = re.search('dist/twnews-.+\.whl', line)
        if match is not None:
            wheel = match.group(0)
            break

    return wheel

def get_latest_python():
    """ 選擇 pyenv 環境中 3.5 ~ 3.8 的最新版本 """
    detected_ver = {
        '3.5': -1,
        '3.6': -1,
        '3.7': -1,
        '3.8': -1
    }

    comp = subprocess.run(['pyenv', 'versions'], capture_output=True)
    stdout = comp.stdout.decode('utf-8').strip().split('\n')
    for line in stdout:
        match = re.match('  (\d\.\d)\.(\d+)', line)
        if match is not None:
            minor_ver = match.group(1)
            patch_ver = int(match.group(2))
            if minor_ver in detected_ver and \
                patch_ver > detected_ver[minor_ver]:
                detected_ver[minor_ver] = patch_ver

    latest_ver = []
    for minor_ver in detected_ver:
        if detected_ver[minor_ver] > -1:
            full_ver = '%s.%d' % (minor_ver, detected_ver[minor_ver])
            latest_ver.append(full_ver)

    return latest_ver

def test_in_virtualenv(pyver, wheel):
    """ 配置 virtualenv 並執行測試程式 """

    # 安裝 virtualenv
    src = os.path.expanduser('~/.pyenv/versions/%s/bin/python' % pyver)
    dst = 'sandbox/%s' % pyver
    comp = subprocess.run(['virtualenv', '-p', src, dst])
    if comp.returncode != 0:
        return False

    # 測試 wheel 包
    # 1. 安裝 wheel
    # 2. 安裝 green
    # 3. 執行測試程式
    os.chdir(dst)
    wheel = '../../' + wheel
    comp = subprocess.run(['bin/pip', 'install', wheel, 'green'])
    if comp.returncode == 0:
        comp = subprocess.run(['bin/green', '-vv', 'twnews'])
    os.chdir('../..')

    return (comp.returncode == 0)

def wheel_check():
    """ 檢查 wheel 是否能正常運作在各個 Python 版本環境上 """

    """
    print('檢查 logging.ini')
    config = configparser.ConfigParser()
    config.read('twnews/conf/logging.ini')
    if config['handler_stdout']['level'] != 'CRITICAL':
        print('handler_stdout 忘記切換成 CRITICAL level')
        exit(1)
    """

    print('檢查程式碼品質')
    ret = os.system('pylint -f colorized twnews')
    if ret != 0:
        print('檢查沒通過,停止封裝')
        exit(ret)

    print('檢查 README.rst')
    ret = os.system('rstcheck README.rst')
    if ret != 0:
        print('檢查沒通過,停止封裝')
        exit(ret)

    print('偵測可用的測試環境')
    os.system('rm -rf sandbox/*')
    wheel = get_wheel()
    latest_python = get_latest_python()
    if len(latest_python) == 0:
        print('沒有任何可用的測試環境')
        exit(1)

    for pyver in latest_python:
        print('測試 Python %s' % pyver)
        test_in_virtualenv(pyver, wheel)

def upload_to_pypi(test=False):
    """ 上傳 wheel 到 PyPi """

    # 檢查 ~/.pypirc 是否存在
    if not os.path.isfile(os.path.expanduser('~/.pypirc')):
        print('缺少 pypi 設定檔 ~/.pypirc')
        print('參考: https://gist.github.com/ibrahim12/c6a296c1e8f409dbed2f')

    # 重新產生 wheel 與上傳前確認
    wheel = get_wheel()
    prompt = '準備上傳的檔案是 %s, 確定上傳嗎 [y/n]? ' % wheel
    print(prompt, end='', flush=True)
    ans = sys.stdin.readline().strip()
    if ans != 'y':
        print('取消上傳')
        return

    # 上傳 wheel
    cmd = ['twine', 'upload']
    if test:
        cmd.append('--repository')
        cmd.append('testpypi')
    cmd.append('--verbose')
    cmd.append(wheel)
    comp = subprocess.run(cmd)
    if comp.returncode == 0:
        print('上傳成功')
    else:
        print('上傳失敗')

def main():
    # 確保不在 repo 目錄也能正常執行
    TWNEWS_HOME = os.path.realpath(os.path.dirname(__file__) + '/..')
    os.chdir(TWNEWS_HOME)

    action = 'wheel'
    if len(sys.argv) > 1:
        action = sys.argv[1]

    if action == 'release':
        upload_to_pypi()
    elif action == 'test':
        upload_to_pypi(True)
    elif action == 'wheel':
        wheel_check()
    else:
        print('Unknown action "%s".' % action)
        exit(1)

if __name__ == '__main__':
    main()


================================================
FILE: bin/weekly.py
================================================
#!/usr/bin/env python3

import json
import os
import os.path
import re

import smtplib
from email.header import Header
from email.mime.text import MIMEText

# 載入隱私設定
conf = None
CONF_PATH = os.path.expanduser('~/.twnews/weekly.json')
if os.path.isfile(CONF_PATH):
    with open(CONF_PATH, 'r') as conf_file:
        conf = json.load(conf_file)

if conf is None:
    print('Cannot load "{}".'.format(CONF_PATH))
    exit(1)

# 測試
working_dir = os.path.realpath(os.path.dirname(__file__) + '/..')
cmd = 'cd {} && python3 -m unittest discover -v 2>&1 1>/dev/null'.format(working_dir)
pipe = os.popen(cmd, 'r')
detail = pipe.read()
ret = pipe.close()

# 組信
if ret is None:
    subject = '[twnews] 每週測試 - 成功 ⭕️'
    contents = re.sub(r'\n\s+\| ', '\n', '''
        | <h3 style="color:#00f;">每週測試成功</h3>
        | <p>詳細內容:</p>
        | <pre style="border:1px solid #aaa; background:#eee; padding:10px;">
        | {}
        | </pre>
        ''', re.M).strip().format(detail)
else:
    subject = '[twnews] 每週測試 - 失敗 ❌'
    contents = re.sub(r'\n\s+\| ', '\n', '''
        | <h3 style="color:#f00;">每週測試失敗 ({})</h3>
        | <p>詳細內容:</p>
        | <pre style="border:1px solid #aaa; background:#eee; padding:10px;">
        | {}
        | </pre>
        ''', re.M).strip().format(ret, detail)

msg = MIMEText(contents, 'html', 'utf-8')
msg['Subject'] = Header(subject)
msg['From'] = '{} <{}>'.format(Header(conf['from_name']).encode(), conf['from_mail'])
msg['To'] = '{} <{}>'.format(Header(conf['to_name']).encode(), conf['to_mail'])
smtp_data = msg.as_string()

# 發信
try:
    server = smtplib.SMTP(conf['smtp_host'], conf['smtp_port'])
    server.set_debuglevel(1)
    server.starttls()
    server.login(conf['smtp_user'], conf['smtp_pass'])
    server.sendmail(conf['from_mail'], conf['to_mail'], smtp_data)
    server.close()
except Exception as ex:
    print('SMTP 異常:', ex)


================================================
FILE: docs/SOUP_NOTES.md
================================================
# 新聞網站技術細節

### 新聞分解

頻道 | RWD | 轉址 | 記者欄
---- | ---- | ---- | ----
蘋果 | 半殘 | - | X
中時 | 完整 | - | O
中央社 | 完整 | - | X
東森新聞雲 | 半殘 | - | X
自由時報 | 無 | X | X
三立新聞網 | 無 | O | X
聯合新聞網 | 完整 | - | O

* 中時幾乎都有註明記者
* 聯合記者欄可能不存在
* 三立偵測到網址與裝置類型不符會發出 HTTP 302 轉址,要關閉自動跟隨參數避免程式誤判
* 完整 RWD 表示行動版和桌面版使用同一個網址,且 DOM 結構差異不大
* 半殘 RWD 表示行動版和桌面版使用同一個網址,但 DOM 結構差異很大
* 無 RWD 表示行動版和桌面版不在同一個網址上

### 新聞搜尋

頻道 | 介面 | 翻頁參數 | 日期範圍參數
---- | ---- | ---- | ----
蘋果 | 傳統網頁 | O | O
中時 | 沒提供 | - | -
中央社 | API | O | X
東森新聞雲 | 傳統網頁 | O | 特殊用法
自由時報 | 傳統網頁 | O | O
三立新聞網 | 傳統網頁 | O | X
聯合新聞網 | 傳統網頁 | O | X

* 蘋果、自由,可以用同一個模式運作
* 三立、聯合、東森,可以用同一個模式運作
* 中央社有提供 JSON API,不用拆網頁,要獨立一個模式運作
* 自由的時間區間只能設定三個月內,應該是全文檢索引擎效能不足
* 東森新聞雲雖然有提供時間範圍 daydiff(1~3) 但是不好用,當作沒這參數吧
* 中時直接掛 Google 自訂搜尋,這部份有點瞎

自由時報聲明的查詢範圍是 3 個月,實測結果有些微差異,應該是相減需 ≤90 天
```
# 2018-08-02 ~ 2018-11-01 沒有搜尋結果
nsearch = NewsSearch('ltn', '2018-08-02', '2018-11-01', 100)
# 2018-08-03 ~ 2018-11-01 有搜尋結果
nsearch = NewsSearch('ltn', '2018-08-03', '2018-11-01', 100)
```

### 新聞台搜尋效能比較

測試方法為下列指令連續跑 3 次,取單頁處理時間

```bash
python3 -m twnews sncp
```

&nbsp; | 1st | 2nd | 3rd
---- | ---- | ---- | ----
蘋果日報 | 0.40 | 0.03 | 0.03
中央社 | 0.21 | 0.14 | 0.15
東森新聞雲 | 0.11 | 0.13 | 0.08
自由時報 | 0.79 | 0.02 | 0.02
三立新聞網 | 0.21 | 0.02 | 0.02
聯合新聞網 | 1.76 | 0.06 | 0.06

* 沒快取的狀況下最慢的依序是: 聯合 > 自由 > 蘋果
* 中央社和東森可能沒有快取機制,其餘都有查詢快取
* 三立的行動搜尋介面,頁籤沒有最後一頁的連結,只有桌面版有提供


================================================
FILE: docs/changelog.md
================================================
## 0.3.x

0.3 的開發主軸是財經資料蒐集,針對證交所,櫃買中心,集保中心,公開資訊觀測站等地方取得有價值資料,並且匯入到 SQLite 方便使用。

### 0.3.3

* 櫃買中心三大法人,融資融券,鉅額交易 - #67
* 證交所爬蟲穩定性改善 - #69
* 集保中心爬蟲 Log Level 調整 - #73
* LZMA 壓縮快取檔 - #77 #78 #79
* 依新聞台名稱分配快取目錄 - #76
* 處理蘋果日報亂碼問題 - #80 #81
* 事務性工作 - #85 #74 #70

### 0.3.2

* 爬證交所三大法人,信用交易,借券賣出,鉅額交易,ETF折溢價率 - #56 #68
* 股權分散表爬蟲重構 - #66
* 修正新聞爬蟲對自由天下與自由時尚的相容問題 - #61
* 修正蘋果日報內文無法取得問題 - #62
* 修正東森時尚日期格式問題 - #63
* 事務性工作 - #72

### 0.3.1

* 股權分散表改用等級分表 - #57
* 股權分散表日期改用 ISO 格式 - #58
* 修正蘋果地產無法使用 https 問題 - #54
* 修正中時網頁結構改變問題 - #59
* 事務性工作 - #60

### 0.3.0

* 爬股權分散表 - #51
* 解決 // 開頭網址無法正確處理的問題 - #50 (回報者: [CpOuyang](//github.com/CpOuyang))
* 東森測試程式增加 proxy 參數解決檔日本 linode 問題 - #52
* 事務性工作 - #55

## 0.2.x

0.2 的開發主軸是新聞搜尋,針對各家新聞網站的搜尋功能,設計成行為一致的 API,簡化找新聞的需求,也解決拆新聞過程中發現的問題,列入測試樣本。

### 0.2.5

* Issue Template - #46

### 0.2.4

* 搜尋功能測試程式 - #37
* 增加 proxy 設計,讓東森新聞雲在日本 linode 也能使用 - #27
* 解決自由時報存在多種日期格式問題 - #34
* 強化記者姓名的辨識能力 - #40
* 被自動轉址時,自動切換分解新聞設定 - #43
* 修正蘋果日報搜尋的翻頁參數錯誤 - #44
* 事務性工作 - #45

### 0.2.3

* 解決各家新聞的子頻道編排相容性問題 - #31
* 新聞爬蟲重構 - #39
* 程式碼導入 pylint 檢查,遵循 PEP8 規範 - #38
* 事務性工作 - #42

### 0.2.2

* 修正 isinstance 用錯類別的問題 - #38

### 0.2.1

* 新聞標題統計功能 - #33
* 蘋果日報搜尋改用 XHR - #30
* 事務性工作 - #32

### 0.2.0

* 新聞關鍵字搜尋功能 - #9

# 0.1.x

0.1 的開發主軸是設計一個通用型的爬蟲,透過爬蟲設定值依照不同的新聞網站拆解 HTML 結構,取得標題,內文,記者,日期等資訊。

### 0.1.10

* 解決 Python 3.7 for Windows 相容性問題 - #28
* 解決 Python 3.5 3.6 for Windows 相容性問題 - #29

### 0.1.9

* 作者欄位加強測試 - #7
* 不理想 URL 測試 - #10
* 壓縮快取檔 - #15
* 壓縮範本檔 - #16
* 使用檔案儲存 log - #17
* 提供 URL 參數直接在 console 拆新聞 - #20
* 新聞爬蟲定期測試程式 - #24
* Python 3.5 相容設計 - #25
* 事務性工作 - #26

### 0.1.8

* 爬蟲支援中時電子報 - #4
* 事務性工作 - #12 #13 #14

### 0.1.7

* 清除快取前,先偵測目錄是否存在 - #2
* 變更快取目錄 - #5
* 移除 unicode_escape() - #6
* 計算有效內容率 - #8
* 變更頻果日報標題 CSS Selector - #11
* 事務性工作 - #3


================================================
FILE: postman/TWMKT.postman_collection.json
================================================
{
	"info": {
		"_postman_id": "96e57888-4ea0-406b-9f9b-774e3cf3896f",
		"name": "0xFE-TWMKT",
		"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json"
	},
	"item": [
		{
			"name": "[證交] 00 三大法人",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "http://www.twse.com.tw/fund/T86?response={{TWSE_EXFORMAT}}&date={{TWSE_EXDATE}}&selectType=ALL",
					"protocol": "http",
					"host": [
						"www",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"fund",
						"T86"
					],
					"query": [
						{
							"key": "response",
							"value": "{{TWSE_EXFORMAT}}"
						},
						{
							"key": "date",
							"value": "{{TWSE_EXDATE}}"
						},
						{
							"key": "selectType",
							"value": "ALL"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[證交] 01 融資融券",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "http://www.twse.com.tw/exchangeReport/MI_MARGN?response={{TWSE_EXFORMAT}}&date={{TWSE_EXDATE}}&selectType=ALL",
					"protocol": "http",
					"host": [
						"www",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"exchangeReport",
						"MI_MARGN"
					],
					"query": [
						{
							"key": "response",
							"value": "{{TWSE_EXFORMAT}}"
						},
						{
							"key": "date",
							"value": "{{TWSE_EXDATE}}"
						},
						{
							"key": "selectType",
							"value": "ALL"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[證交] 02 鉅額交易",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "http://www.twse.com.tw/block/BFIAUU?response={{TWSE_EXFORMAT}}&date={{TWSE_EXDATE}}&selectType=S",
					"protocol": "http",
					"host": [
						"www",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"block",
						"BFIAUU"
					],
					"query": [
						{
							"key": "response",
							"value": "{{TWSE_EXFORMAT}}"
						},
						{
							"key": "date",
							"value": "{{TWSE_EXDATE}}"
						},
						{
							"key": "selectType",
							"value": "S"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[證交] 03 融券借券",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "http://www.twse.com.tw/exchangeReport/TWT93U?response={{TWSE_EXFORMAT}}&date={{TWSE_EXDATE}}",
					"protocol": "http",
					"host": [
						"www",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"exchangeReport",
						"TWT93U"
					],
					"query": [
						{
							"key": "response",
							"value": "{{TWSE_EXFORMAT}}"
						},
						{
							"key": "date",
							"value": "{{TWSE_EXDATE}}"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[證交] 04 可借券賣出",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "http://www.twse.com.tw/SBL/TWT96U?response=csv",
					"protocol": "http",
					"host": [
						"www",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"SBL",
						"TWT96U"
					],
					"query": [
						{
							"key": "response",
							"value": "csv"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[證交] 05 ETF 折溢價",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://mis.twse.com.tw/stock/data/all_etf.txt",
					"protocol": "https",
					"host": [
						"mis",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"stock",
						"data",
						"all_etf.txt"
					]
				}
			},
			"response": []
		},
		{
			"name": "[櫃買] 00 三大法人明細",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://www.tpex.org.tw/web/stock/3insti/daily_trade/3itrade_hedge_result.php?l=zh_tw&o=json&se=EW&t=D&d={{TPEX_EXPORT_DATE}}&s=0,asc",
					"protocol": "https",
					"host": [
						"www",
						"tpex",
						"org",
						"tw"
					],
					"path": [
						"web",
						"stock",
						"3insti",
						"daily_trade",
						"3itrade_hedge_result.php"
					],
					"query": [
						{
							"key": "l",
							"value": "zh_tw"
						},
						{
							"key": "o",
							"value": "json"
						},
						{
							"key": "se",
							"value": "EW"
						},
						{
							"key": "t",
							"value": "D"
						},
						{
							"key": "d",
							"value": "{{TPEX_EXPORT_DATE}}"
						},
						{
							"key": "s",
							"value": "0,asc"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[櫃買] 01 融資融券明細",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://www.tpex.org.tw/web/stock/margin_trading/margin_balance/margin_bal_result.php?l=zh_tw&o=json&d={{TPEX_EXPORT_DATE}}",
					"protocol": "https",
					"host": [
						"www",
						"tpex",
						"org",
						"tw"
					],
					"path": [
						"web",
						"stock",
						"margin_trading",
						"margin_balance",
						"margin_bal_result.php"
					],
					"query": [
						{
							"key": "l",
							"value": "zh_tw"
						},
						{
							"key": "o",
							"value": "json"
						},
						{
							"key": "d",
							"value": "{{TPEX_EXPORT_DATE}}"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[櫃買] 02 鉅額交易明細",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://www.tpex.org.tw/web/stock/block_trade/daily_qutoes/block_day_download.php?l=zh_tw&d={{TPEX_EXPORT_DATE}}&s=0,asc,0&charset=UTF-8",
					"protocol": "https",
					"host": [
						"www",
						"tpex",
						"org",
						"tw"
					],
					"path": [
						"web",
						"stock",
						"block_trade",
						"daily_qutoes",
						"block_day_download.php"
					],
					"query": [
						{
							"key": "l",
							"value": "zh_tw"
						},
						{
							"key": "d",
							"value": "{{TPEX_EXPORT_DATE}}"
						},
						{
							"key": "s",
							"value": "0,asc,0"
						},
						{
							"key": "charset",
							"value": "UTF-8"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[櫃買] 03 當沖明細",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://www.tpex.org.tw/web/stock/trading/intraday_stat/intraday_trading_stat_download.php?l=zh-tw&d={{TPEX_EXPORT_DATE}}&s=0,asc,0&charset=UTF-8",
					"protocol": "https",
					"host": [
						"www",
						"tpex",
						"org",
						"tw"
					],
					"path": [
						"web",
						"stock",
						"trading",
						"intraday_stat",
						"intraday_trading_stat_download.php"
					],
					"query": [
						{
							"key": "l",
							"value": "zh-tw"
						},
						{
							"key": "d",
							"value": "{{TPEX_EXPORT_DATE}}"
						},
						{
							"key": "s",
							"value": "0,asc,0"
						},
						{
							"key": "charset",
							"value": "UTF-8"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[櫃買] 04 融券借券",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://www.tpex.org.tw/web/stock/margin_trading/margin_sbl/margin_sbl_download.php?l=zh-tw&d={{TPEX_EXPORT_DATE}}&s=0,asc,0&charset=utf-8",
					"protocol": "https",
					"host": [
						"www",
						"tpex",
						"org",
						"tw"
					],
					"path": [
						"web",
						"stock",
						"margin_trading",
						"margin_sbl",
						"margin_sbl_download.php"
					],
					"query": [
						{
							"key": "l",
							"value": "zh-tw"
						},
						{
							"key": "d",
							"value": "{{TPEX_EXPORT_DATE}}"
						},
						{
							"key": "s",
							"value": "0,asc,0"
						},
						{
							"key": "charset",
							"value": "utf-8"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[集保] 01 股權分散表最新週報",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://smart.tdcc.com.tw/opendata/getOD.ashx?id=1-5",
					"protocol": "https",
					"host": [
						"smart",
						"tdcc",
						"com",
						"tw"
					],
					"path": [
						"opendata",
						"getOD.ashx"
					],
					"query": [
						{
							"key": "id",
							"value": "1-5"
						}
					]
				}
			},
			"response": []
		},
		{
			"name": "[集保] 02 股權分散表\b結算日期",
			"request": {
				"method": "POST",
				"header": [
					{
						"key": "Content-Type",
						"name": "Content-Type",
						"type": "text",
						"value": "application/x-www-form-urlencoded"
					},
					{
						"key": "Referer",
						"type": "text",
						"value": "https://www.tdcc.com.tw/smWeb/QryStockAjax.do"
					}
				],
				"body": {
					"mode": "urlencoded",
					"urlencoded": [
						{
							"key": "REQ_OPR",
							"value": "qrySelScaDates",
							"type": "text"
						}
					]
				},
				"url": {
					"raw": "https://www.tdcc.com.tw/smWeb/QryStockAjax.do",
					"protocol": "https",
					"host": [
						"www",
						"tdcc",
						"com",
						"tw"
					],
					"path": [
						"smWeb",
						"QryStockAjax.do"
					]
				}
			},
			"response": []
		},
		{
			"name": "[集保] 03 股權分散表個股",
			"request": {
				"method": "POST",
				"header": [
					{
						"key": "Content-Type",
						"name": "Content-Type",
						"value": "application/x-www-form-urlencoded",
						"type": "text"
					},
					{
						"key": "Referer",
						"value": "https://www.tdcc.com.tw/smWeb/QryStockAjax.do",
						"type": "text"
					}
				],
				"body": {
					"mode": "urlencoded",
					"urlencoded": [
						{
							"key": "scaDate",
							"value": "20190119",
							"type": "text"
						},
						{
							"key": "SqlMethod",
							"value": "StockNo",
							"type": "text"
						},
						{
							"key": "REQ_OPR",
							"value": "SELECT",
							"type": "text"
						},
						{
							"key": "clkStockNo",
							"value": "3049",
							"type": "text"
						}
					]
				},
				"url": {
					"raw": "https://www.tdcc.com.tw/smWeb/QryStockAjax.do",
					"protocol": "https",
					"host": [
						"www",
						"tdcc",
						"com",
						"tw"
					],
					"path": [
						"smWeb",
						"QryStockAjax.do"
					]
				}
			},
			"response": []
		},
		{
			"name": "[分點] 00 券商地址 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[分點] 01 股務代理 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[財報] 00 會計師查核",
			"request": {
				"method": "POST",
				"header": [
					{
						"key": "Content-Type",
						"name": "Content-Type",
						"value": "application/x-www-form-urlencoded",
						"type": "text"
					}
				],
				"body": {
					"mode": "urlencoded",
					"urlencoded": [
						{
							"key": "TYPEK",
							"value": "sii",
							"type": "text"
						},
						{
							"key": "year",
							"value": "107",
							"type": "text"
						},
						{
							"key": "season",
							"value": "04",
							"type": "text"
						},
						{
							"key": "isQuery",
							"value": "Y",
							"type": "text"
						},
						{
							"key": "off",
							"value": "1",
							"type": "text"
						},
						{
							"key": "firstin",
							"value": "1",
							"type": "text"
						},
						{
							"key": "step",
							"value": "1",
							"type": "text"
						},
						{
							"key": "encodeURIComponent",
							"value": "1",
							"type": "text"
						}
					]
				},
				"url": {
					"raw": "https://mops.twse.com.tw/mops/web/t163sb14",
					"protocol": "https",
					"host": [
						"mops",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"mops",
						"web",
						"t163sb14"
					]
				}
			},
			"response": []
		},
		{
			"name": "[財報] 01 資產負債 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[財報] 02 損益 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[財報] 03 現金流量 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[財報] 04 股東權益 - TODO",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": ""
				}
			},
			"response": []
		},
		{
			"name": "[大股東] 00 上市大股東股權異動",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://siis.twse.com.tw/publish/sii/106IRB110_01.HTM",
					"protocol": "https",
					"host": [
						"siis",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"publish",
						"sii",
						"106IRB110_01.HTM"
					]
				}
			},
			"response": []
		},
		{
			"name": "[大股東] 01 上櫃大股東股權異動",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://siis.twse.com.tw/publish/otc/107IRB110_12.HTM",
					"protocol": "https",
					"host": [
						"siis",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"publish",
						"otc",
						"107IRB110_12.HTM"
					]
				}
			},
			"response": []
		},
		{
			"name": "[大股東] 02 興櫃大股東股權異動",
			"request": {
				"method": "GET",
				"header": [],
				"url": {
					"raw": "https://siis.twse.com.tw/publish/rotc/106IRB110_01.HTM",
					"protocol": "https",
					"host": [
						"siis",
						"twse",
						"com",
						"tw"
					],
					"path": [
						"publish",
						"rotc",
						"106IRB110_01.HTM"
					]
				}
			},
			"response": []
		}
	],
	"event": [
		{
			"listen": "prerequest",
			"script": {
				"id": "2a60f749-9ad5-4f69-91c7-981502d34dd4",
				"type": "text/javascript",
				"exec": [
					""
				]
			}
		},
		{
			"listen": "test",
			"script": {
				"id": "716c094f-8f7b-42e6-98ee-e62977e4633d",
				"type": "text/javascript",
				"exec": [
					""
				]
			}
		}
	]
}

================================================
FILE: postman/TWMKT.postman_environment.json
================================================
{
	"id": "e09139ab-0991-4315-82ef-93998a367e54",
	"name": "TWMKT",
	"values": [
		{
			"key": "TWSE_EXFORMAT",
			"value": "json",
			"enabled": true
		},
		{
			"key": "TWSE_EXDATE",
			"value": "20191015",
			"enabled": true
		},
		{
			"key": "TPEX_EXPORT_DATE",
			"value": "108/10/01",
			"enabled": true
		}
	],
	"_postman_variable_scope": "environment",
	"_postman_exported_at": "2019-10-23T04:15:07.097Z",
	"_postman_exported_using": "Postman/7.9.0"
}

================================================
FILE: requirements.txt
================================================
beautifulsoup4>=4.7.1
busm>=0.9.0
lxml>=4.3.3
requests>=2.21.0
pandas>=0.24.2
PyYAML>=5.1.2
# 只有 ticksmap 需要用到 wxWidget 
# wxPython>=4.0.4


================================================
FILE: setup.py
================================================
from setuptools import setup, find_packages
import twnews.common as common

# Load reStructedText description.
# Online Editor   - http://rst.ninjs.org/
# Quick Reference - http://docutils.sourceforge.net/docs/user/rst/quickref.html
readme = open('README.md', 'r')
longdesc = readme.read()
readme.close()

# See
# https://packaging.python.org/tutorials/packaging-projects/
# https://python-packaging.readthedocs.io/en/latest/non-code-files.html
setup(
    name='twnews',
    version=common.VERSION,
    description='To tear down news web pages in Taiwan.',
    long_description=longdesc,
    long_description_content_type='text/markdown',
    packages=find_packages(),
    url='https://github.com/virus-warnning/twnews',
    license='MIT',
    author='Raymond Wu',
    package_data={
        'twnews': ['conf/*', 'res/*', 'samples/*', 'tests/soup/*', 'tests/search/*']
    },
    install_requires=[
        'beautifulsoup4>=4.7.1',
        'busm>=0.9.0',
        'lxml>=4.3.3',
        'requests>=2.21.0',
        'pandas>=0.24.2',
        'PyYAML>=5.1.2'
    ],
    python_requires='>=3.5'
)


================================================
FILE: twnews/__init__.py
================================================
"""
twnews 套件載入前作業,用來解決 Windows 環境會發生的編碼問題
"""

import sys
import locale
import _locale

# 以下是魔法不要亂改
if locale.getpreferredencoding() == 'cp950':
    # pylint: disable=protected-access, global-statement

    # 編碼是 CP950 就強制轉 UTF-8
    _locale._getdefaultlocale = (lambda *args: ['zh_TW', 'utf-8'])

    # 置換 Python 3.5 的 print() 避免轉碼錯誤
    VERSION = sys.version_info
    if VERSION.major == 3 and VERSION.minor == 5:
        NATIVE_PRINT = __builtins__['print']
        def _replaced_print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False):
            """
            替換用的 print()
            """
            global NATIVE_PRINT
            filtered = []
            for obj in objects:
                if isinstance(obj, str):
                    filtered.append(obj.encode('cp950', 'ignore').decode('cp950'))
                else:
                    filtered.append(obj)
            NATIVE_PRINT(*filtered, sep=sep, end=end, file=file, flush=flush)
        __builtins__['print'] = _replaced_print


================================================
FILE: twnews/__main__.py
================================================
"""
工具程式
"""

import sys
import locale
import os.path
from datetime import datetime
from twnews.common import get_logger, VERSION
from twnews.soup import NewsSoup
from twnews.search import NewsSearch

def soup(path):
    """
    分解新聞
    """
    print('-' * 75)
    nsoup = NewsSoup(path)
    print('原始路徑: {}'.format(path))
    print('最終路徑: {}'.format(nsoup.path))
    print('頻道: {}'.format(nsoup.channel))
    print('標題: {}'.format(nsoup.title()))
    ndt = nsoup.date()
    if ndt is not None:
        print('日期: {}'.format(ndt.strftime('%Y-%m-%d %H:%M:%S')))
    else:
        print('日期: None')
    print('記者: {}'.format(nsoup.author()))
    print('內文:')
    print(nsoup.contents())
    print('有效內容率: {:.2f}%'.format(nsoup.effective_text_rate() * 100))
    print('-' * 75)

def search_and_list(keyword, channel):
    """
    搜尋,然後列出新聞標題
    """
    print('測試搜尋')
    nsearch = NewsSearch(channel, limit=10)
    results = nsearch.by_keyword(keyword).to_dict_list()
    logger = get_logger()

    for (i, result) in enumerate(results):
        try:
            print('{:03d}: {}'.format(i, result['title']))
            print('     日期: {}'.format(result['date']))
            print('     連結: {}'.format(result['link']))
        except ValueError as ex:
            logger.error('例外類型: %s', type(ex).__name__)
            logger.error(ex)

def search_and_soup(keyword, channel):
    """
    搜尋,然後分解新聞
    """
    print('測試搜尋與分解, 搜尋中 ...', end='', flush=True)
    logger = get_logger()
    nsearch = NewsSearch(channel, limit=10)
    nsoups = nsearch.by_keyword(keyword).to_soup_list()
    print('\r測試搜尋與分解' + ' ' * 20, flush=True)

    for (i, nsoup) in enumerate(nsoups):
        try:
            print('{:03d}: {}'.format(i, nsoup.path))
            print('     記者: {} / 日期: {}'.format(nsoup.author(), nsoup.date()))
            print('     標題: {}'.format(nsoup.title()))
            print('     {} ...'.format(nsoup.contents(30)), flush=True)
        except ValueError as ex:
            logger.error('例外類型: %s', type(ex).__name__)
            logger.error(ex)

def search_and_compare_performance(keyword):
    """
    測試各新聞台搜尋效能
    """
    print('測試各新聞台搜尋效能')
    summary = {}

    for channel in ['appledaily', 'cna', 'ettoday', 'ltn', 'setn', 'udn']:
        print()
        print(channel)
        print('-' * 60)
        summary[channel] = []
        for repeat in range(3):
            nsearch = NewsSearch(channel, limit=100)
            nsearch.by_keyword(keyword)
            results = nsearch.to_dict_list()
            total = len(results)
            tpp = nsearch.elapsed() / nsearch.pages()
            tpr = nsearch.elapsed() / total
            summary[channel].append(tpp)
            msg = '{:03d}: {:.3f} 秒/頁, {:.3f} 秒/筆, 共 {} 頁, 總耗時: {:.3f} 秒'
            print(msg.format(repeat, tpp, tpr, nsearch.pages(), nsearch.elapsed()))
        print('-' * 60)

    print()
    print('Markdown 摘要表:')
    print()
    print('&nbsp; | 1st | 2nd | 3rd')
    print('---- | ---- | ---- | ----')
    for (channel, samples) in summary.items():
        print(channel, end='')
        for sample in samples:
            print(' | {:.3f}'.format(sample), end='')
        print()
    print()

def compare_keyword(keyword):
    """
    比較關鍵字在各媒體的出現次數
    """
    print('比較上個月 "{}" 在各媒體標題出現次數'.format(keyword))
    now = datetime.now()
    nts = now.timestamp()
    nts = nts - nts % 86400
    day_lmon = datetime.fromtimestamp(nts - 86400 * now.day).day
    beg_date = datetime(now.year, now.month - 1, 1).strftime('%Y-%m-%d')
    end_date = datetime(now.year, now.month - 1, day_lmon).strftime('%Y-%m-%d')
    print('時間區間: {} ~ {}'.format(beg_date, end_date))

    media = {
        'appledaily': '  蘋果',
        'cna': '中央社',
        'ettoday': '  東森',
        'ltn': '  自由',
        'setn': '  三立',
        'udn': '  聯合'
    }

    for (channel, name) in media.items():
        nsearch = NewsSearch(
            channel,
            beg_date=beg_date,
            end_date=end_date,
            limit=999
        )
        results = nsearch.by_keyword(keyword, title_only=True).to_dict_list()
        msg = '{}: {}'.format(name, len(results))
        print(msg, flush=True)

def usage():
    """
    使用說明
    """
    print('twnews {} (預設編碼: {})'.format(
        VERSION, locale.getpreferredencoding()))
    print()
    usage_path = os.path.dirname(__file__) + '/conf/usage.txt'
    with open(usage_path, 'r') as usage_file:
        print(usage_file.read())

def get_cmd_param(index, default=None):
    """
    取得 shell 參數
    """
    if len(sys.argv) > index:
        return sys.argv[index]
    return default

def main():
    """
    main()
    """
    action = get_cmd_param(1)
    if action == 'soup':
        keyword = get_cmd_param(
            2, 'https://tw.news.appledaily.com/local/realtime/20181025/1453825')
        soup(keyword)
    elif action == 'search':
        keyword = get_cmd_param(2, '酒駕')
        channel = get_cmd_param(3, 'appledaily')
        search_and_list(keyword, channel)
    elif action == 'snsp':
        keyword = get_cmd_param(2, '酒駕')
        channel = get_cmd_param(3, 'appledaily')
        search_and_soup(keyword, channel)
    elif action == 'sncp':
        keyword = get_cmd_param(2, '酒駕')
        search_and_compare_performance(keyword)
    elif action == 'cpkw':
        keyword = get_cmd_param(2, '酒駕')
        compare_keyword(keyword)
    else:
        if action != 'help':
            print('動作名稱錯誤')
            print()
        usage()

if __name__ == '__main__':
    main()


================================================
FILE: twnews/cache.py
================================================
"""
快取處理模組
"""

import os
import lzma
import json

class DateCache:
    """
    處理日期命名快取
    """

    def __init__(self, category, item, data_format):
        """
        建立日期命名快取
        """
        self.category = category
        self.item = item
        self.data_format = data_format

    def get_path(self, datestr):
        """
        產生快取檔路徑
        """
        cache_dir = os.path.expanduser('~/.twnews/cache/' + self.category)
        if not os.path.isdir(cache_dir):
            os.makedirs(cache_dir)
        return '%s/%s-%s.%s.xz' % (cache_dir, self.item, datestr, self.data_format)

    def has(self, datestr):
        """
        檢查快取檔是否存在
        """
        cache_path = self.get_path(datestr)
        return os.path.isfile(cache_path)

    def load(self, datestr):
        """
        載入快取檔
        """
        content = None
        cache_path = self.get_path(datestr)
        with lzma.open(cache_path, 'rt') as f_cache:
            if self.data_format == 'json':
                content = json.load(f_cache)
            else:
                content = f_cache.read()
        return content

    def save(self, datestr, content):
        """
        儲存快取檔
        """
        cache_path = self.get_path(datestr)
        with lzma.open(cache_path, 'wt') as f_cache:
            if self.data_format == 'json':
                json.dump(content, f_cache)
            else:
                f_cache.write(content)


================================================
FILE: twnews/common.py
================================================
"""
twnews 共用項目
"""

import json
import os
import os.path
import logging
import logging.config
import socket
import yaml

import requests
import requests.packages.urllib3.util.connection as urllib3_conn

# 強制 requests 使用 IPv4
urllib3_conn.allowed_gai_family = lambda: socket.AF_INET

# pylint: disable=global-statement
__LOGGER_LOADED = False
__ALLCONF = None
__SESSION = {
    "direct": None,
    "proxy": None
}

VERSION = '0.3.3'

def found_socks5():
    """
    檢查是否有 SOCKS 5 Proxy
    """
    found = False
    with socket.socket(socket.AF_INET) as sock:
        try:
            sock.connect(('localhost', 9050))
            found = True
        except socket.error:
            pass
        finally:
            sock.close()
    return found

def get_package_dir():
    """
    取得套件根目錄,用來定位套件內資源
    """
    return os.path.dirname(__file__)

def get_logger(name='news'):
    """
    取得 logger 如果已經存在就使用現有的
    """
    global __LOGGER_LOADED

    if not __LOGGER_LOADED:
        logdir = os.path.expanduser('~/.twnews/log')
        if not os.path.isdir(logdir):
            os.makedirs(logdir)

        cfg_yaml = '{}/conf/logging.yaml'.format(get_package_dir())
        with open(cfg_yaml, 'r') as cfg_file:
            cfg_dict = yaml.load(cfg_file, Loader=yaml.SafeLoader)
            for hname in cfg_dict['handlers']:
                handler = cfg_dict['handlers'][hname]
                if 'filename' in handler and handler['filename'].startswith('~'):
                    handler['filename'] = os.path.expanduser(handler['filename'])
            logging.config.dictConfig(cfg_dict)
            __LOGGER_LOADED = True

    logger = None
    if __LOGGER_LOADED:
        logger = logging.getLogger(name)

    return logger

def get_session(proxy_first):
    """
    取得 requests session 如果已經存在就使用現有的
    """
    global __SESSION
    logger = get_logger('common')

    if proxy_first and found_socks5():
        logger.debug('透過 proxy 連線')
        session_type = 'proxy'
    else:
        session_type = 'direct'

    if __SESSION[session_type] is None:
        logger.debug('建立新的 session')
        user_agent = 'Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) ' \
            + 'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36'
        __SESSION[session_type] = requests.Session()
        __SESSION[session_type].headers.update({
            "Accept": "text/html,application/xhtml+xml,application/xml",
            "Accept-Encoding": "gzip, deflate",
            "Accept-Language": "zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7",
            "Cache-Control": "max-age=0",
            "Connection": "keep-alive",
            "User-Agent": user_agent
        })
        if session_type == 'proxy':
            __SESSION[session_type].proxies = {
                'http': 'socks5h://localhost:9050',
                'https': 'socks5h://localhost:9050'
            }
    else:
        logger.debug('使用現有 session')

    return __SESSION[session_type]

def get_all_conf():
    """
    取得完整設定
    """
    global __ALLCONF

    if __ALLCONF is None:
        soup_cfg = '{}/conf/news-soup.json'.format(get_package_dir())
        with open(soup_cfg, 'r') as conf_file:
            __ALLCONF = json.load(conf_file)

    return __ALLCONF

def detect_channel(path):
    """
    偵測路徑對應的新聞頻道
    """
    all_conf = get_all_conf()
    for channel in all_conf:
        if channel in path:
            return channel
    return ''

def get_channel_conf(channel, action=None):
    """
    載入新聞台設定
    """
    all_conf = get_all_conf()
    if channel in all_conf:
        chconf = all_conf[channel]
        if action is None:
            return chconf
        if action in chconf:
            return chconf[action]
    return None

def get_cache_dir(category):
    """
    取得快取目錄
    """
    logger = get_logger()
    cache_dir = os.path.expanduser('~/.twnews/cache/' + category)
    if not os.path.isdir(cache_dir):
        logger.debug('建立快取目錄: %s', cache_dir)
        os.makedirs(cache_dir)
    logger.debug('使用快取目錄: %s', cache_dir)
    return cache_dir


================================================
FILE: twnews/conf/logging.yaml
================================================
version: 1
disable_existing_loggers: true
formatters:
  standard:
    datefmt: '%Y-%m-%d %H:%M:%S'
    format: '[%(asctime)s] %(levelname)-7s | %(message)s'
  simple:
    datefmt: '%H:%M:%S'
    format: '[%(asctime)s] %(name)-7s | %(levelname)-7s | %(message)s'
handlers:
  common_log:
    level: DEBUG
    formatter: standard
    class: logging.handlers.TimedRotatingFileHandler
    filename: '~/.twnews/log/common.log'
    when: 'D'
  news_log:
    level: DEBUG
    formatter: standard
    class: logging.handlers.TimedRotatingFileHandler
    filename: '~/.twnews/log/news.log'
    when: 'D'
  finance_log:
    level: DEBUG
    formatter: standard
    class: logging.handlers.TimedRotatingFileHandler
    filename: '~/.twnews/log/finance.log'
    when: 'D'
  stdout:
    level: ERROR
    formatter: simple
    class: logging.StreamHandler
    stream: ext://sys.stdout
  telegram:
    level: INFO
    formatter: simple
    class: busm.BusmHandler
    subject: finance crawler
loggers:
  common:
    level: DEBUG
    handlers:
      - common_log
      - stdout
  news:
    level: DEBUG
    handlers:
      - news_log
      - stdout
  finance:
    level: DEBUG
    handlers:
      - finance_log
      - telegram
      - stdout


================================================
FILE: twnews/conf/news-soup.json
================================================
{
  "what's that": {
    "name": "頻道名稱",
    "layout_list": [
      { "name": "特殊排版名稱", "prefix": "特殊排版網址" }
    ],
    "mobile": {
      "title_node": "標題所在的 CSS selector",
      "date_node": "日期時間所在的 CSS Selector",
      "date_format": "日期時間的格式,會用 datetime.strptime 處理",
      "author_node": "記者所在的 CSS selector,空字串表示記者姓名在內文中,程式會自動偵測",
      "article_node": "內文所在的 CSS selector"
    },
    "search": {
      "url": "搜尋網址形式",
      "begin_date_format": "開始日期參數形式",
      "end_date_format": "結束日期參數形式",
      "result_node": "搜尋結果位置",
      "title_node": "標題位置,相對於搜尋結果",
      "link_node": "連結位置,相對於搜尋結果",
      "date_node": "日期位置,相對於搜尋結果",
      "date_format": "日期格式"
    }
  },
  "appledaily": {
    "name": "蘋果日報",
    "layout_list": [
      { "layout": "home", "prefix": "http://home.appledaily.com.tw/" }
    ],
    "mobile": {
      "title_node": "article.nm-content > div.nm-article > header > h2",
      "date_node": "article.nm-content > div.nm-article > header > div.time-and-share > time",
      "date_format": "建立時間:%Y/%m/%d %H:%M",
      "author_node": "",
      "article_node": "article.nm-content > div.nm-article > div.nm-article-body > div.text"
    },
    "home": {
      "title_node": "#maincontent > section > div.ncbox_cont > h1",
      "date_node": "#maincontent > section > div.ncbox_cont > div.nctimeshare > time",
      "date_format": "%Y年%m月%d日",
      "author_node": "",
      "article_node": "#maincontent > section > div.ncbox_cont > div.articulum > p"
    },
    "search": {
      "url": "https://tw.appledaily.com/search/ajaxresult/page/${PAGE}?querystrS=${KEYWORD}&sort=time&searchType=all",
      "begin_date_format": "&dateStart=%Y/%m/%d",
      "end_date_format": "&dateEnd=%Y/%m/%d",
      "result_node": "",
      "title_node": "title",
      "link_node": "sharing > url",
      "date_node": "pubDate",
      "date_format": "%Y%m%d"
    }
  },
  "chinatimes": {
    "name": "中時電子報",
    "layout_list": [],
    "mobile": {
      "title_node": "article.article-box h1.article-title",
      "date_node": "article.article-box div.meta-info > time",
      "date_with_children": true,
      "date_format": "%H:%M%Y/%m/%d",
      "author_node": "article.article-box div.author > a",
      "article_node": "article.article-box div.article-body > p"
    }
  },
  "cna": {
    "name": "中央社",
    "layout_list": [],
    "mobile": {
      "title_node": "div.centralContent > h1",
      "date_node": "div.centralContent > div.timeBox > div.updatetime > span",
      "date_format": "%Y/%m/%d %H:%M",
      "author_node": "",
      "article_node": "div.centralContent > div.paragraph > p"
    },
    "search": {
      "url": "https://www.cna.com.tw/cna2018api/api/simplelist/searchkeyword/${KEYWORD}/pageidx/${PAGE}/",
      "result_node": "result > SimpleItems",
      "link_node": "PageUrl",
      "title_node": "HeadLine",
      "date_node": "CreateTime",
      "date_format": "%Y/%m/%d %H:%M"
    }
  },
  "ettoday": {
    "name": "東森新聞雲",
    "layout_list": [
      { "layout": "fashion", "prefix": "https://fashion.ettoday.net/" },
      { "layout": "game", "prefix": "https://game.ettoday.net/" },
      { "layout": "health", "prefix": "https://health.ettoday.net/" },
      { "layout": "game", "prefix": "https://house.ettoday.net/" },
      { "layout": "game", "prefix": "https://pets.ettoday.net/" },
      { "layout": "speed", "prefix": "https://speed.ettoday.net/" },
      { "layout": "game", "prefix": "https://sports.ettoday.net/" },
      { "layout": "game", "prefix": "https://star.ettoday.net/" },
      { "layout": "health", "prefix": "https://travel.ettoday.net/" }
    ],
    "mobile": {
      "title_node": "div.subject_news > header > h1",
      "date_node": "span.date > time",
      "date_format": "%Y年%m月%d日 %H:%M",
      "author_node": "",
      "article_node": "div.subject_news > div.story > p"
    },
    "fashion": {
      "title_node": "header > h1",
      "date_node": "time.date",
      "date_format": "%Y/%m/%d %H:%M",
      "author_node": "",
      "article_node": "article div.story > p"
    },
    "game": {
      "title_node": "article h1.title",
      "date_node": "time",
      "date_format": "%Y年%m月%d日 %H:%M",
      "author_node": "",
      "article_node": "article div.story > p"
    },
    "health": {
      "title_node": "header > h1",
      "date_node": "time",
      "date_format": "時間: %Y年%m月%d日 %H:%M",
      "author_node": "",
      "article_node": "article div.story > p"
    },
    "speed": {
      "title_node": "article h1.title",
      "date_node": "time",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "",
      "article_node": "article div.story > p"
    },
    "search": {
      "url": "https://www.ettoday.net/news_search/doSearch.php?keywords=${KEYWORD}&page=${PAGE}&daydiff=3&idx=1",
      "last_page": "div.page_nav > div.menu_page > p.info",
      "page_pattern": "共(\\d+)頁",
      "result_node": "#result-list > div.archive",
      "link_node": "div.box_2 > h2 > a",
      "title_node": "div.box_2 > h2 > a",
      "date_node": "div.box_2 > p.detail > span.date",
      "date_pattern": "\\d{4}-\\d{2}-\\d{2}",
      "date_format": "%Y-%m-%d"
    }
  },
  "ltn": {
    "name": "自由時報",
    "layout_list": [
      { "layout": "3c", "prefix": "https://3c.ltn.com.tw" },
      { "layout": "auto", "prefix": "https://auto.ltn.com.tw" },
      { "layout": "ec", "prefix": "https://ec.ltn.com.tw" },
      { "layout": "ent", "prefix": "https://ent.ltn.com.tw" },
      { "layout": "estate", "prefix": "https://estate.ltn.com.tw" },
      { "layout": "food", "prefix": "https://food.ltn.com.tw" },
      { "layout": "istyle", "prefix": "https://istyle.ltn.com.tw" },
      { "layout": "market", "prefix": "https://market.ltn.com.tw" },
      { "layout": "playing", "prefix": "https://playing.ltn.com.tw" },
      { "layout": "sports", "prefix": "https://sports.ltn.com.tw" },
      { "layout": "talk", "prefix": "https://talk.ltn.com.tw" }
    ],
    "mobile": {
      "title_node": "div.com > h2",
      "date_node": "div.com > div.h2_else > div.time",
      "date_format": [ "%Y-%m-%d %H:%M", "%Y-%m-%d" ],
      "author_node": "",
      "article_node": "div.com > div.boxTitle > p"
    },
    "3c": {
      "title_node": "section.nbox > h1",
      "date_node": "section.nbox > span.writer > span:nth-of-type(1)",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "section.nbox > span.writer > span:nth-of-type(2)",
      "article_node": "div.boxTitle > p"
    },
    "auto": {
      "title_node": "section > div.title_box > h1",
      "date_node": "section > div.title_box > div.unit_else",
      "date_format": "%Y/%m/%d",
      "author_node": "section > div.title_box > div.unit_else",
      "article_node": "section > div.detail > p"
    },
    "ec": {
      "title_node": "article.com > h1",
      "date_node": "article.com > div.h1_else > div.time",
      "date_format": "%Y-%m-%d %H:%M:%S",
      "author_node": "",
      "article_node": "article.com > div.textbody > p"
    },
    "ent": {
      "title_node": "div.news_content > h1",
      "date_node": "div.news_content > div.author > div.date",
      "date_format": "%Y/%m/%d %H:%M",
      "author_node": "",
      "article_node": "div.news_content > p"
    },
    "estate": {
      "title_node": "div.container > h1",
      "date_node": "div.container > p.author > span.time",
      "date_format": "%Y/%m/%d %H:%M",
      "author_node": "div.container > p.author",
      "article_node": "div.container > div.wordright > p"
    },
    "food": {
      "title_node": "section > div.context > h2",
      "date_node": "section > div.context > b.date",
      "date_format": "%Y/%m/%d",
      "author_node": "",
      "article_node": "section > div.context > p"
    },
    "istyle": {
      "title_node": "div.content > div.title_box > h1",
      "date_node": "div.content > div.title_box > div.unit_else",
      "date_format": "%b. %d %Y",
      "author_node": "div.content > div.title_box > div.unit_else",
      "article_node": "div.content > div.article_content > p"
    },
    "market": {
      "title_node": "div.com > h2",
      "date_node": "div.com > div.h2_else > div.time",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "",
      "article_node": "div.com > div.h2_else > p"
    },
    "playing": {
      "title_node": "div.left_content > div.article_header > h1",
      "date_node": "div.left_content > div.article_header > div.author > span:nth-of-type(2)",
      "date_format": "%Y-%m-%d",
      "author_node": "div.left_content > div.article_header > div.author > span:nth-of-type(1)",
      "article_node": "div.left_content > div.article_body > div.text > p"
    },
    "sports": {
      "title_node": "div.news_content > h1",
      "date_node": "div.news_content > div.author > div.date",
      "date_format": "%Y/%m/%d %H:%M",
      "author_node": "",
      "article_node": "div.news_content > p"
    },
    "talk": {
      "title_node": "div.conbox > h1",
      "date_node": "div.conbox > div.top_share > div.writer > a > span:nth-of-type(2) > div.mobile_none",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "div.conbox > div.top_share > div.writer > a > span:nth-of-type(2)",
      "article_node": "div.conbox > div.cont > p"
    },
    "search": {
      "url": "https://m.ltn.com.tw/search/${PAGE}?q=${KEYWORD}",
      "begin_date_format": "&start=%Y-%m-%d",
      "end_date_format": "&end=%Y-%m-%d",
      "result_node": "ul.news > li",
      "link_node": "a",
      "title_node": "a > p",
      "date_node": "a > span",
      "date_format": "%Y-%m-%d"
    }
  },
  "setn": {
    "name": "三立新聞網",
    "layout_list": [
      { "layout": "entertainment", "prefix": "https://www.setn.com/e/" }
    ],
    "mobile": {
      "title_node": "section.news-all-area > h1",
      "date_node": "section.news-all-area > div.page-date > time",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "",
      "article_node": "#ckuse > article > p"
    },
    "entertainment": {
      "title_node": "div.content1 > div.title > h1",
      "date_node": "div.content1 > div.title > div.titleBtnBlock > div.time",
      "date_format": "%Y/%m/%d %H:%M:%S",
      "author_node": "",
      "article_node": "div.Content2 > p"
    },
    "search": {
      "url": "https://www.setn.com/m/search.aspx?q=${KEYWORD}&p=${PAGE}",
      "result_node": "div.news-area",
      "link_node": "div.news-info > div.news-word > a",
      "title_node": "div.news-info > div.news-word > a",
      "date_node": "div.news-info > div.lable-date",
      "date_format": "%Y/%m/%d %H:%M"
    }
  },
  "udn": {
    "name": "聯合新聞網",
    "layout_list": [
      { "layout": "autos", "prefix": "https://autos.udn.com" },
      { "layout": "game", "prefix": "https://game.udn.com" },
      { "layout": "global", "prefix": "https://global.udn.com" },
      { "layout": "game", "prefix": "https://health.udn.com" },
      { "layout": "house", "prefix": "https://house.udn.com" },
      { "layout": "nba", "prefix": "https://nba.udn.com" },
      { "layout": "opinion", "prefix": "https://opinion.udn.com" },
      { "layout": "running", "prefix": "https://running.udn.com" },
      { "layout": "stars", "prefix": "https://stars.udn.com" },
      { "layout": "style", "prefix": "https://style.udn.com" },
      { "layout": "vision", "prefix": "https://vision.udn.com" }
    ],
    "mobile": {
      "title_node": "#story_body_content > h1",
      "date_node": "#story_bady_info > div.story_bady_info_author > span",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "#story_bady_info > div.story_bady_info_author > a",
      "article_node": "#story_body_content > p"
    },
    "autos": {
      "title_node": "#story_art_title",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M:%S",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story_body_content > p"
    },
    "game": {
      "title_node": "#story_art_title",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story_body_content > p"
    },
    "global": {
      "title_node": "#story_art_title",
      "date_node": "#story_bady_info",
      "date_format": "%Y/%m/%d",
      "author_node": "",
      "article_node": "#story_body > div.story_body_content > p"
    },
    "house": {
      "title_node": "#story_art_title",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M:%S",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story_body_content > p"
    },
    "nba": {
      "title_node": "#story_body_content > h1",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story_body_content > span > p"
    },
    "opinion": {
      "title_node": "#container > main > h1",
      "date_node": ".story_bady_info_author > time",
      "date_format": "%Y/%m/%d",
      "author_node": ".story_bady_info_author > .story_bady_info > a",
      "article_node": "#container > main > p"
    },
    "stars": {
      "title_node": "#story_art_title",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story > p"
    },
    "style": {
      "title_node": "#story_art_title > font",
      "date_node": "#shareBar > div.shareBar__info > div.shareBar__info--author > span",
      "date_format": "%Y-%m-%d %H:%M",
      "author_node": "#shareBar > div.shareBar__info > div.shareBar__info--author",
      "article_node": "#story-main > p"
    },
    "search": {
      "url": "https://udn.com/search/result/2/${KEYWORD}/${PAGE}",
      "last_page": "#result_list > div.pagelink > span.total",
      "page_pattern": "共 (\\d+) 頁",
      "result_node": "#search_content > dt",
      "link_node": "a",
      "title_node": "a > h2",
      "date_node": "a > span.cat",
      "date_pattern": "\\d{4}/\\d{2}/\\d{2}",
      "date_format": "%Y/%m/%d"
    }
  }
}


================================================
FILE: twnews/conf/usage.txt
================================================
用法:
  python3 -m twnews 動作

動作列表:
  soup   [路徑]           分解一則新聞,路徑可以是新聞網址或檔案路徑
  search [關鍵字] [頻道]  關鍵字搜尋新聞,取10筆,列出搜尋結果
  snsp   [關鍵字] [頻道]  關鍵字搜尋新聞,取10筆,讀取新聞,分解新聞
  sncp   [關鍵字]         關鍵字搜尋新聞,取100筆,比較各新聞網站效能
  cpkw   [關鍵字]         關鍵字搜尋上個月的新聞,統計各家媒體標題出現關鍵字的次數
  help                    顯示這個訊息

頻道列表:
  appledaily  蘋果日報
  cna         中央社
  ettoday     東森新聞雲
  ltn         自由時報
  setn        三立新聞網
  udn         聯合新聞網

預設值:
  [路徑]    https://tw.news.appledaily.com/local/realtime/20181025/1453825 (這是一則上吊新聞)
  [頻道]    appledaily
  [關鍵字]  酒駕

* 中時電子報雖然不支援搜尋功能,但是分解新聞功能有作用


================================================
FILE: twnews/exceptions.py
================================================
"""
例外模組
"""

class SyncException(Exception):
    """
    同步資料時觸發的例外
    """
    def __init__(self, reason):
        super().__init__()
        self.reason = reason

class NetworkException(SyncException):
    """
    因網路問題觸發的例外
    """

class InvalidDataException(SyncException):
    """
    因無效資料問題觸發的例外
    """


================================================
FILE: twnews/finance/__init__.py
================================================
"""
財經資料蒐集工具共用模組
"""

import os
import sqlite3
import sys

from requests.exceptions import RequestException

import twnews.common as common
from twnews.exceptions import NetworkException

DDL_LIST = [
    # 三大法人
    '''
    CREATE TABLE IF NOT EXISTS `institution` (
        trading_date TEXT,
        security_id TEXT,
        security_name TEXT,
        foreign_trend INTEGER,
        stic_trend INTEGER,
        dealer_trend INTEGER,
        PRIMARY KEY (`trading_date`, `security_id`)
    );
    ''',
    # 融資融券
    '''
    CREATE TABLE IF NOT EXISTS `margin` (
        trading_date TEXT,
        security_id TEXT,
        security_name TEXT,
        buying_balance INTEGER,
        selling_balance INTEGER,
        PRIMARY KEY (`trading_date`, `security_id`)
    );
    ''',
    # 借券賣出
    '''
    CREATE TABLE IF NOT EXISTS `short_sell` (
        trading_date TEXT,
        security_id TEXT,
        security_name TEXT,
        borrowed INTEGER,
        selled INTEGER,
        PRIMARY KEY (`trading_date`, `security_id`)
    );
    ''',
    # 鉅額交易
    '''
    CREATE TABLE IF NOT EXISTS `block` (
        trading_date TEXT,
        security_id TEXT,
        security_name TEXT,
        tick_rank INTEGER,
        tick_type TEXT,
        close REAL,
        volume INTEGER,
        total INTEGER,
        PRIMARY KEY (`trading_date`, `security_id`, `tick_rank`)
    );
    ''',
    # ETF 淨值折溢價
    '''
    CREATE TABLE IF NOT EXISTS `etf_offset` (
        trading_date TEXT,
        security_id TEXT,
        security_name TEXT,
        close REAL,
        net REAL,
        offset REAL,
        PRIMARY KEY (`trading_date`, `security_id`)
    );
    '''
]

# 股權分散
DDL_DIST = '''
CREATE TABLE IF NOT EXISTS level{:02d} (
    `trading_date` TEXT NOT NULL,
    `security_id` TEXT NOT NULL,
    `numof_holders` INTEGER NOT NULL,
    `numof_stocks` INTEGER NOT NULL,
    `percentof_stocks` REAL NOT NULL,
    PRIMARY KEY(`trading_date`, `security_id`)
);
'''

REPEAT_LIMIT = 3
REPEAT_INTERVAL = 5

def get_connection(rebuild=False):
    """
    自動產生財經資料庫與取得連線
    """
    db_dir = os.path.expanduser('~/.twnews/db')
    if not os.path.isdir(db_dir):
        os.makedirs(db_dir)

    db_path = db_dir + '/finance.sqlite'
    if rebuild:
        os.remove(db_path)
    db_ready = os.path.isfile(db_path)

    db_conn = sqlite3.connect(db_path)
    if not db_ready:
        # 產生籌碼資料表
        for ddl in DDL_LIST:
            db_conn.execute(ddl)

        # 產生各級股權分散表, 有 1~17 級
        for level in range(1, 18):
            ddl = DDL_DIST.format(level)
            db_conn.execute(ddl)

        db_conn.commit()

    return db_conn

def get_argument(index, default=''):
    """
    取得 shell 參數, 或使用預設值
    """
    if len(sys.argv) <= index:
        return default
    return sys.argv[index]

def fucking_get(hook, url, params):
    """
    共用 HTTP GET 處理邏輯
    """
    session = common.get_session(False)
    try:
        resp = session.get(url, params=params)
        if resp.status_code != 200:
            msg = 'Got HTTP error, status code: %d' % resp.status_code
            raise NetworkException(msg)
        dataset = hook(resp)
    except RequestException as ex:
        msg = 'Cannot get response, exception type: %s' % type(ex).__name__
        raise NetworkException(msg)

    return dataset


================================================
FILE: twnews/finance/broker.py
================================================
"""
券商與分公司資訊與門牌定位

已知問題:
* 有筆紀錄無法成功定位
  1. 8770 大鼎
  2. 1115 台灣企銀-太平
  3. 779m 國票-中港
  4. 9636 富邦-中壢
  這 4 筆用工人智慧取經緯度,其餘自動化
* TGOS 服務查詢 500 次的時候,會拒絕之後的 request
  可能需要重置 requests session,現階段程式無法一口氣完成 9xx 筆定位
"""

import os
import re
import json
import sqlite3
import time
import urllib.parse

import requests
from bs4 import BeautifulSoup
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Column, Integer, String, Float, create_engine
from pyproj import Transformer

Base = declarative_base()

class TradingLocation(Base):
    """
    分點資料
    """
    __tablename__ = 'trading_locs'
    id = Column(String(4), primary_key=True)
    name = Column(String(100), nullable=False)
    address = Column(String(100), nullable=False)
    phone = Column(String(100), nullable=False)
    parent = Column(String(100), nullable=False, default=-1)
    lat = Column(Float(), nullable=False, default=0.0)
    lng = Column(Float(), nullable=False, default=0.0)
    started_at = Column(String(100), nullable=False)

def date_conv(hw_date):
    """
    民國日期轉西元日期 ISO 格式
    """
    patterns = [
        r'^(\d{2,3})(\d{2})(\d{2})$',
        r'^(\d{2,3})/(\d{2})/(\d{2})$',
    ]

    ad_date = None
    for pattern in patterns:
        m = re.match(pattern, hw_date)
        if not m:
            continue
        d_yy = int(m[1]) + 1911
        d_mm = int(m[2])
        d_dd = int(m[3])
        ad_date = '%4d-%02d-%02d' % (d_yy, d_mm, d_dd)
        break

    return ad_date

def visit_branch(session, parent_id):
    """
    爬分公司
    """
    url = 'https://www.twse.com.tw/brokerService/brokerServiceAudit'
    params = {
        'stkNo': parent_id,
        'showType': 'list',
        'focus': 6
    }
    resp = requests.get(url, params=params)
    if resp.status_code != 200:
        exit(1)

    branches = []
    soup = BeautifulSoup(resp.text, 'lxml')
    table = soup.select('#table6 > table')[0]
    for row in table.select('tr'):
        cols = row.select('td')
        if not cols:
            continue
        if cols[0].has_attr('colspan'):
            break

        loc = TradingLocation()
        loc.id = cols[0].text.strip()
        loc.name = cols[1].text.strip()
        loc.address = cols[3].text.strip()
        loc.phone = cols[4].text.strip()
        loc.started_at = date_conv(cols[2].text.strip())
        loc.parent = parent_id
        session.merge(loc)

def visit_parent(session):
    """
    爬母公司
    """
    resp = requests.get('https://www.twse.com.tw/zh/brokerService/brokerServiceAudit')
    if resp.status_code != 200:
        exit(1)

    db_path = os.path.expanduser('~/.twnews/db/finance.sqlite')
    db_conn = sqlite3.connect(db_path)
    soup = BeautifulSoup(resp.text, 'lxml')
    table = soup.select('#table2 > table')[0]
    for row in table.select('tr'):
        cols = row.select('td')
        if not cols:
            continue

        loc = TradingLocation()
        loc.id = cols[0].text.strip()
        loc.name = cols[1].text.strip()
        loc.address = cols[3].text.strip()
        loc.phone = cols[4].text.strip()
        loc.started_at = date_conv(cols[2].text.strip())
        session.merge(loc)

        visit_branch(session, loc.id)

def geocode_single(address, req_session, sid, transformer):
    """
    處理單筆地址定位
    """
    # 騙到 EPSG:3826 座標
    time.sleep(0.5)
    url = 'https://map.tgos.tw/TGOSCloud/Generic/Project/GHTGOSViewer_Map.ashx'
    params = {
        'method': 'querymoiaddr',
        'address': address,
        'useoddeven': False,
        'sid': sid
    }
    resp = req_session.post(url, data=params)
    if resp.status_code != 200:
        # print(resp.status_code)
        return (0.0, 0.0)

    try:
        respjson = json.loads(resp.text)
    except:
        return (0.0, 0.0)

    if not respjson['AddressList']:
        # print(json.dumps(respjson, indent=2))
        return (0.0, 0.0)

    result = json.loads(resp.text)['AddressList'][0]
    point = transformer.transform(result['X'], result['Y'])
    return point

def geocode(orm_session):
    # 座標轉換器,只需要做一次 (需要 pyproj)
    transformer = Transformer.from_crs("EPSG:3826", "EPSG:4326")

    # 做個給掰的 session
    # 目前發現偽裝真實用戶只能查 500 個地址,可能需要重置連線才能處理更多資料
    req_session = requests.Session()
    req_session.headers.update({
        'Accept': 'text/html,application/xhtml+xml,application/xml',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
        'Cache-Control': 'max-age=0',
        'Connection': 'keep-alive',
        'User-Agent': 'Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) ' \
            + 'AppleWebKit/537.36 (KHTML, like Gecko) ' \
            + 'Chrome/46.0.2490.76 Mobile Safari/537.36'
    })

    # 騙到 pagekey, 只需要做一次
    url = 'https://map.tgos.tw/TGOSCloud/Web/Map/TGOSViewer_Map.aspx'
    resp = req_session.get(url)
    if resp.status_code != 200:
        print(resp.status_code)
        exit(1)

    match = re.search(r"sircPAGEKEY\s?='([^\']+)';", resp.text)
    if not match:
        print('無法取得 pagekey')
        exit(1)

    pagekey = urllib.parse.unquote(match[1])
    # print(pagekey)

    # 加上這一行其他的 request 才不會被擋掉
    req_session.headers['Referer'] = url

    # 騙到 sid, 只需要做一次
    url = 'https://map.tgos.tw/TGOSCloud/Generic/Utility/UG_Handler.ashx'
    params = {
        'method': 'GetSessionID',
        'pagekey': pagekey
    }
    resp = req_session.post(url, params=params)
    if resp.status_code != 200:
        print(resp.status_code)
        exit(1)

    fields = json.loads(resp.text)
    sid = fields['id']
    # print(sid)

    q = orm_session.query(TradingLocation).filter(
        TradingLocation.lat == 0.0,
        TradingLocation.lng == 0.0
    )
    for loc in q.all():
        slices = loc.address.split('、')
        m = re.search('^.+號', slices[0])
        if m:
            fixed_address = m[0]
        else:
            fixed_address = slices[0] + '號'
        (lat, lng) = geocode_single(fixed_address, req_session, sid, transformer)
        print(loc.id, fixed_address, lat, lng)
        loc.lat = lat
        loc.lng = lng
        orm_session.merge(loc)
        #orm_session.commit()
    #address = '台北市114內湖區石潭路151號'
    #geocode_single(address, req_session, sid, transformer)

def main():
    db_repl = 'sqlite:///' + os.path.expanduser('~/.twnews/db/finance.sqlite')
    engine = create_engine(db_repl, echo=True)
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()
    # 爬券商分點
    # visit_parent(session)
    # 券商分點做地理定位
    geocode(session)
    session.commit()

if __name__ == '__main__':
    main()


================================================
FILE: twnews/finance/tdcc.py
================================================
"""
集保中心資料蒐集模組
"""

import os
import re
import sqlite3

import pandas

import twnews.common as common
from twnews.finance import get_argument, fucking_get, get_connection
from twnews.cache import DateCache

def import_dist(csv_date='latest'):
    """
    匯入指定日期的股權分散表到資料庫
    """
    logger = common.get_logger('finance')
    csv_dir = common.get_cache_dir('tdcc')

    if csv_date == 'latest':
        max_date = ''
        for filename in os.listdir(csv_dir):
            match = re.match(r'dist-(\d{8}).csv.xz', filename)
            if match is not None:
                if max_date < match.group(1):
                    max_date = match.group(1)
        csv_date = max_date

    csv_file = '{}/dist-{}.csv.xz'.format(csv_dir, csv_date)
    iso_date = re.sub(r'(\d{4})(\d{2})(\d{2})', r'\1-\2-\3', csv_date)
    if not os.path.isfile(csv_file):
        logger.error('沒有 TDCC %s 的股權分散表檔案: %s', iso_date, csv_file)
        return

    db_conn = get_connection()
    col_names = [
        'trading_date',
        'security_id',
        'level',
        'numof_holders',
        'numof_stocks',
        'percentof_stocks'
    ]
    # Pandas 會自動偵測 extension 解壓縮, 不需要自幹
    dfrm = pandas.read_csv(csv_file, skiprows=1, header=None, names=col_names)
    # print(df.head(3))
    # print(df.tail(3))
    sql_template = '''
    INSERT INTO level%02d (
        trading_date, security_id,
        numof_holders, numof_stocks, percentof_stocks
    ) VALUES (?,?,?,?,?);
    '''
    affected = -1
    for index, row in dfrm.iterrows():
        sql = sql_template % row['level']
        try:
            db_conn.execute(sql, (
                iso_date,
                row['security_id'],
                row['numof_holders'],
                row['numof_stocks'],
                row['percentof_stocks']
            ))
            if index > 0 and index % 5000 == 0:
                logger.debug('已儲存 TDCC %s 的股權分散資料 %d 筆', iso_date, index)
            affected = index
        except sqlite3.IntegrityError:
            affected = 0
            break

    if affected > 0:
        logger.info('已匯入 TDCC %s 的股權分散資料 %d 筆', iso_date, affected)
    else:
        logger.warning('已匯入過 TDCC %s 的股權分散資料', iso_date)
    db_conn.commit()
    db_conn.close()

def rebuild_dist():
    """
    重建股權分散表資料庫
    """
    # 清除現有資料
    db_conn = get_connection()
    for level in range(1, 18):
        sql = 'DELETE FROM level%02d;' % level
        db_conn.execute(sql)
    db_conn.commit()
    db_conn.execute('VACUUM') # cannot VACUUM from within a transaction
    db_conn.close()

    # 確認可以重建的日期
    csv_dir = common.get_cache_dir('tdcc')
    date_list = []
    for filename in os.listdir(csv_dir):
        match = re.match(r'dist-(\d{8}).csv.xz', filename)
        if match is not None:
            date_list.append(match.group(1))

    # 依日期順序重建資料
    date_list.sort()
    for csv_date in date_list:
        import_dist(csv_date)

def backup_dist(refresh=False):
    """
    備份最新的股權分散表
    """
    url = 'https://smart.tdcc.com.tw/opendata/getOD.ashx'
    params = {
        'id': '1-5'
    }
    def hook(resp):
        logger = common.get_logger('finance')
        # 確認統計日期
        csv = resp.text
        dt_beg = csv.find('\n') + 1
        dt_end = csv.find(',', dt_beg)
        csv_date = csv[dt_beg:dt_end]
        date_cache = DateCache('tdcc', 'dist', 'csv')
        changed = refresh or not date_cache.has(csv_date)

        # 製作備份檔
        if changed:
            date_cache.save(csv_date, csv)
            logger.info('已更新 TDCC %s 的股權分散表', csv_date)
        else:
            logger.info('已存在 TDCC %s 的股權分散表, 不需更新', csv_date)

        return changed

    return fucking_get(hook, url, params)

def sync_dataset():
    """
    暫時寫成這個形式方便排程用
    """
    changed = backup_dist()
    if changed:
        import_dist()

    # 測試時,即使快取存在也重新匯入一次
    # backup_dist()
    # import_dist()

def main():
    """
    下載最新的股權分散表,轉檔到資料庫:
      python3 -m twnews.finance.tdcc
    使用既有的 CSV 檔案重建股權分散表資料庫:
      python3 -m twnews.finance.tdcc rebuild
    """
    action = get_argument(1, 'update')
    logger = common.get_logger('finance')

    if action == 'update':
        sync_dataset()
    elif action == 'rebuild':
        rebuild_dist()
    else:
        logger.error('無法識別的動作 %s', action)

if __name__ == '__main__':
    main()


================================================
FILE: twnews/finance/ticksmap.py
================================================
# pylint: disable=all

import io
import json
import os
import re
import requests
import sys
import subprocess
from datetime import datetime

import wx
import wx.xrc
from bs4 import BeautifulSoup

from twnews.cache import DateCache

class CaptchaDialog(wx.App):
    """
    驗證碼詢問 GUI
    """

    def __init__(self, captcha_stream):
        self.captcha_code = False
        self.captcha_stream = captcha_stream
        super().__init__(self)

    def OnInit(self):
        """
        載入 GUI 與 Enter 事件配置
        """
        try:
            path = os.path.realpath(os.path.dirname(__file__) + '/../res/ticksmap-captcha.xrc')
            res = wx.xrc.XmlResource(path)
            self.frame = res.LoadFrame(None, 'main_frame')

            cpi = wx.xrc.XRCCTRL(self.frame, 'captcha_image', 'wxStaticBitmap')
            im = wx.Image(self.captcha_stream)
            bm = wx.Bitmap(im)
            cpi.SetBitmap(bm)

            self.cpc = wx.xrc.XRCCTRL(self.frame, 'captcha_code', 'wxTextCtrl')
            self.cpc.Bind(wx.EVT_KEY_UP, self.OnKeyPress, id=wx.xrc.XRCID('captcha_code'))

            self.frame.Centre()
            self.frame.Show()
        except:
            return False

        return True

    def OnKeyPress(self, event):
        """
        文字輸入框按下 Enter 後送出
        """
        if event.GetKeyCode() == 13:
            self.captcha_code = self.cpc.GetValue()
            self.frame.Close()

def handle_captcha(captcha_stream):
    """
    用 wxPython 介面詢問驗證碼
    """
    app = CaptchaDialog(captcha_stream)
    app.MainLoop()
    return app.captcha_code

def load_soup(security_id, date_str):
    """
    分點進出 HTML 載入作業,含快取管理與下載流程
    TODO: 加強錯誤處理
    """

    dc = DateCache('bsr.twse', security_id, 'html')

    # 偵測快取
    if date_str != 'latest':
        if dc.has(date_str):
            soup = BeautifulSoup(dc.load(date_str), 'lxml')
            return soup
        return None

    # 無快取狀況下載分點進出網頁
    session = requests.Session()
    resp = session.get('https://bsr.twse.com.tw/bshtm/bsMenu.aspx')
    if resp.status_code != 200:
        return None

    soup = BeautifulSoup(resp.text, 'lxml')
    nodes = soup.select('form input')
    params = {}
    for node in nodes:
        name = node.attrs['name']

        # 忽略鉅額交易的 radio button
        if name in ('RadioButton_Excd', 'Button_Reset'):
            continue

        if 'value' in node.attrs:
            params[node.attrs['name']] = node.attrs['value']
        else:
            params[node.attrs['name']] = ''

    # 找 captcha 圖片
    captcha_image = soup.select('#Panel_bshtm img')[0]['src']
    m = re.search(r'guid=(.+)', captcha_image)
    if m is None:
        return None

    # 顯示 captcha 圖片
    url = 'https://bsr.twse.com.tw/bshtm/' + captcha_image
    resp = requests.get(url)
    if resp.status_code != 200:
        return None

    captcha_stream = io.BytesIO(resp.content)
    captcha_code = handle_captcha(captcha_stream)
    if captcha_code == False:
        return None

    params['CaptchaControl1'] = captcha_code
    params['TextBox_Stkno'] = security_id

    # 送出
    resp = session.post('https://bsr.twse.com.tw/bshtm/bsMenu.aspx', data=params)
    if resp.status_code != 200:
        print('任務失敗: %d' % resp.status_code)
        return None

    soup = BeautifulSoup(resp.text, 'lxml')
    nodes = soup.select('#HyperLink_DownloadCSV')
    if len(nodes) == 0:
        print('任務失敗,沒有下載連結')
        return None

    # 下載分點進出 CSV
    # 分點進出 CSV 沒有日期
    # HTML https://bsr.twse.com.tw/bshtm/bsContent.aspx?v=t (其實任意參數都是 HTML)
    # CSV  https://bsr.twse.com.tw/bshtm/bsContent.aspx
    url = 'https://bsr.twse.com.tw/bshtm/bsContent.aspx?v=t'
    resp = session.get(url)
    if resp.status_code != 200:
        print('任務失敗,無法下載分點進出 CSV')
        return None

    # 拆解分點進出網頁,找日期欄,然後存入快取
    soup = BeautifulSoup(resp.text, 'lxml')
    root_tables = soup.select('#sp_HtmlCode > table')

    # 取日期資訊,製作快取檔
    # TODO: 簡化與容錯處理
    meta_path = 'tr > td > table > tr:nth-of-type(1) > td > table'
    meta_table = root_tables[0].select(meta_path)[0]
    date_item = meta_table.select('#receive_date')[0]
    date_str = date_item.get_text().strip().replace('/', '')
    dc.save(date_str, resp.text)
    return soup

def parse_tick_node(row):
    """
    撮合值的 td 取出 (分點 ID, 買量, 賣量)
    """
    cols = row.select('td')
    rank = cols[0].get_text().strip()
    if rank == '':
        return (False, False, False)
    loc_id = cols[1].get_text().strip()[0:4]
    bid_vol = int(cols[3].get_text().strip().replace(',', ''))
    ask_vol = int(cols[4].get_text().strip().replace(',', ''))
    return (loc_id, bid_vol, ask_vol)

def main():
    """
    分點進出表下載工具
    """
    if len(sys.argv) < 2:
        exit(1)

    if len(sys.argv) == 3:
        datestr = sys.argv[2]
    else:
        datestr = 'latest'

    print('* 載入分點進出明細', flush=True)
    soup = load_soup(sys.argv[1], datestr)
    if soup is None:
        exit(1)

    # 取最頂層的表格,數目應該是頁數的 2 倍
    # * 偶數表是 meta data 與 detail
    # * 奇數表是 pagination
    print('* 製作熱值圖 HTML', flush=True)
    root_tables = soup.select('#sp_HtmlCode > table')
    meta_path = 'tr > td > table > tr:nth-of-type(1) > td > table'
    meta_table = root_tables[0].select(meta_path)[0]
    trading_date = meta_table.select('#receive_date')[0] \
        .get_text() \
        .strip() \
        .replace('/', '-')
    (security_id, security_name) = meta_table.select('#stock_id')[0] \
        .get_text() \
        .strip() \
        .split('\xa0') # 注意這裡是 ascii 160, &nbsp; 不能使用一般空白字元切

    # 處理根層級的偶數表,蒐集撮合紀錄
    tick_nodes = []
    for even in range(0, len(root_tables), 2):
        # 偶數表中取撮合紀錄表,左表奇數次交易,右表偶數次交易
        sel = '#table2 > tr > td > table'
        ticks_table = root_tables[even].select(sel)
        even_ticks = ticks_table[0].select('tr')
        odd_ticks = ticks_table[1].select('tr')
        i = 1
        while i < len(even_ticks):
            tick_nodes.append(even_ticks[i])
            tick_nodes.append(odd_ticks[i])
            i += 1

    # 分點買賣明細加總
    volsum = {}
    for tick_node in tick_nodes:
        (loc_id, bid, ask) = parse_tick_node(tick_node)
        if loc_id != False:
            if loc_id not in volsum:
                volsum[loc_id] = { 'bid': bid, 'ask': ask }
            else:
                volsum[loc_id]['bid'] += bid
                volsum[loc_id]['ask'] += ask

    # 置入 geojson
    path = os.path.realpath(os.path.dirname(__file__) + '/../res/ticksmap-locations.geojson')
    with open(path, 'r') as ffc:
        fc = json.load(ffc)
        filtered_features = []
        for feature in fc['features']:
            loc_id = feature['properties']['id']
            if loc_id in volsum:
                feature['properties']['bid'] = volsum[loc_id]['bid']
                feature['properties']['ask'] = volsum[loc_id]['ask']
                filtered_features.append(feature)
        fc['features'] = filtered_features

    # 置入 heatmap 模板
    path = os.path.realpath(os.path.dirname(__file__) + '/../res/ticksmap-template.html')
    with open(path, 'r') as tplf:
        leaflet = os.path.expanduser('~/Desktop/ticksmap-{}-{}.html'.format(
            security_id,
            trading_date.replace('-', '')
        ))
        screenshot = os.path.expanduser('~/Desktop/ticksmap-{}-{}.png'.format(
            security_id,
            trading_date.replace('-', '')
        ))

        # 將分點進出資料套版,生成 leaflet 熱值圖網頁
        fcstr = json.dumps(fc)
        template = tplf.read()
        template = template.replace('__FEATURE_COLLECTION__', fcstr, 1)
        template = template.replace('__SECURITY_ID__', security_id, 1)
        template = template.replace('__SECURITY_NAME__', security_name, 1)
        html = template.replace('__DATE__', trading_date, 1)
        with open(leaflet, 'w') as out:
            out.write(html)

        # 用 headless browser 將 leaflet 網頁轉換成圖檔
        # TODO: 自動偵測瀏覽器路徑
        #   * Windows 使用者 Chrome
        #   * Windows 系統 Chrome
        #   * Windows Brave
        #   * Chrome for MacOS
        chrome_bin = '/Applications/Brave Browser.app/Contents/MacOS/Brave Browser'
        # chrome_bin = r'C:\Program Files (x86)\BraveSoftware\Brave-Browser\Application\brave.exe' # ok!

        # 相關參數參考 https://peter.sh/experiments/chromium-command-line-switches/
        cmd = [
            chrome_bin,
            '--headless',
            '--hide-scrollbars',
            '--window-size=1000x850',
            '--screenshot=' + screenshot,
            leaflet
        ]
        print('* 製作熱值圖 PNG', flush=True)
        subprocess.run(cmd, stderr=subprocess.DEVNULL)

        # 用預設軟體開啟圖檔
        #         MacOS: open
        #  Windows + PS: start-process -file
        # print(os.path.realpath(screenshot))
        # cmd = [ 'powershell', 'start-process', '-file', os.path.realpath(screenshot) ]
        cmd = [ 'open', os.path.realpath(screenshot) ]
        # print(cmd)
        subprocess.run(cmd)

if __name__ == '__main__':
    main()


================================================
FILE: twnews/finance/tpex.py
================================================
"""
櫃買中心資料蒐集模組
"""

from datetime import datetime
import io
import re
import sqlite3
import sys
import time

import pandas

import twnews.common as common
from twnews.finance import get_argument, fucking_get, get_connection, REPEAT_LIMIT, REPEAT_INTERVAL
from twnews.cache import DateCache
from twnews.exceptions import InvalidDataException, NetworkException

URL_BASE = 'https://www.tpex.org.tw/web/stock/'

def download_margin(datestr):
    """
    下載信用交易資料集
    """
    url = URL_BASE + 'margin_trading/margin_balance/margin_bal_result.php'
    params = {
        'd': datestr,
        'l': 'zh_tw',
        'o': 'json'
    }
    def hook(resp):
        dataset = resp.json()
        if dataset['iTotalRecords'] == 0:
            raise InvalidDataException('日期格式錯誤,或是 %s 的資料尚未產出' % datestr)
        return dataset
    return fucking_get(hook, url, params)

def download_block(datestr):
    """
    下載鉅額交易資料集
    """
    url = URL_BASE + 'block_trade/daily_qutoes/block_day_download.php'
    params = {
        'd': datestr,
        'l': 'zh_tw',
        's': '0,asc,0',
        'charset': 'UTF-8'
    }
    def hook(resp):
        # TODO: 需要檢查資料完整性,任意日期都有資料
        dataset = resp.text
        return dataset
    return fucking_get(hook, url, params)

def download_institution(datestr):
    """
    下載三大法人資料集
    """
    url = URL_BASE + '3insti/daily_trade/3itrade_hedge_result.php'
    params = {
        'd': datestr,
        'l': 'zh_tw',
        'o': 'json',
        's': '0,asc',
        't': 'D',
        'se': 'EW'
    }
    def hook(resp):
        dataset = resp.json()
        # 取標題內日期,轉換成與輸入參數相同的格式
        title = dataset['reportTitle']
        date_in_title = re.sub('[年月]', '/', title[:title.find(' ') - 1])
        # 參數日期與標題日期相同才視為有效資料
        if datestr != date_in_title:
            raise InvalidDataException('日期格式錯誤,或是 %s 的資料尚未產出' % datestr)
        return dataset
    return fucking_get(hook, url, params)

def download_selled(datestr):
    """
    下載已借券賣出
    """
    url = URL_BASE + 'margin_trading/margin_sbl/margin_sbl_download.php'
    params = {
        'd': datestr,
        'l': 'zh-tw',
        's': '0,asc,0',
        'charset': 'utf-8'
    }
    def hook(resp):
        # TODO: 非交易日還是有資料,不過只有 header 和 footer
        obuf = io.StringIO()
        ibuf = io.StringIO(resp.text)

        # 過濾 header, footer & 欄位名稱消除空白字元
        enabled = False
        line = ibuf.readline()
        while line:
            if not enabled:
                enabled = line.startswith('股票代號')
                if enabled:
                    line = line.replace(' ', '')
            else:
                enabled = line.startswith('"')
            if enabled:
                obuf.write(line)
            line = ibuf.readline()

        dataset = obuf.getvalue()
        ibuf.close()
        obuf.close()
        return dataset
    return fucking_get(hook, url, params)

def import_margin(dbcon, trading_date, dataset):
    """
    匯入信用交易資料集
    """
    sql = '''
        INSERT INTO `margin` (
            trading_date, security_id, security_name,
            buying_balance, selling_balance
        ) VALUES (?,?,?,?,?)
    '''
    for detail in dataset['aaData']:
        # TODO
        security_id = detail[0]
        security_name = detail[1].strip()
        buying_balance = int(detail[6].replace(',', ''))
        selling_balance = int(detail[14].replace(',', ''))
        dbcon.execute(sql, (
            trading_date,
            security_id,
            security_name,
            buying_balance,
            selling_balance
        ))

def import_block(dbcon, trading_date, dataset):
    """
    匯入鉅額交易資料集
    """
    col_names = ['交易型態', '交割期別', '代號', '名稱', '成交價格(元)', '成交股數', '成交值(元)', '成交時間']
    # Pandas 只能吃 file-like 參數,需要把 dataset 轉成 StringIO
    dfrm = pandas.read_csv(
        io.StringIO(dataset),
        engine='python', sep=',',
        skiprows=3, skipfooter=1, names=col_names
    )

    sql = '''
        INSERT INTO `block` (
            trading_date, security_id, security_name,
            tick_rank, tick_type,
            close, volume, total
        ) VALUES (?,?,?,?,?,?,?,?)
    '''
    tick_rank = {}

    for _index, row in dfrm.iterrows():
        security_id = row['代號']
        security_name = row['名稱']
        tick_type = row['交易型態']
        close = float(row['成交價格(元)'])
        volume = int(row['成交股數'].replace(',', ''))
        total = int(row['成交值(元)'].replace(',', ''))

        if security_id not in tick_rank:
            tick_rank[security_id] = 1
        else:
            tick_rank[security_id] += 1

        dbcon.execute(sql, (
            trading_date,
            security_id,
            security_name,
            tick_rank[security_id],
            tick_type,
            close,
            volume,
            total
        ))

def import_institution(dbcon, trading_date, dataset):
    """
    匯入三大法人資料集
    """
    sql = '''
        INSERT INTO `institution` (
            trading_date, security_id, security_name,
            foreign_trend, stic_trend, dealer_trend
        ) VALUES (?,?,?,?,?,?)
    '''
    for detail in dataset['aaData']:
        security_id = detail[0]
        security_name = detail[1].strip()
        # 外資 + 外資自營 + 陸資
        foreign_trend = int(detail[10].replace(',', '')) // 1000
        # 投信
        stic_trend = int(detail[13].replace(',', '')) // 1000
        # 自營投資 + 自營避險
        dealer_trend = int(detail[22].replace(',', '')) // 1000
        dbcon.execute(sql, (
            trading_date, security_id, security_name,
            foreign_trend, stic_trend, dealer_trend
        ))

def import_selled(dbcon, trading_date, dataset):
    """
    匯入借券賣出
    """
    sql = '''
        UPDATE `short_sell` SET `security_name`=?,`selled`=?
        WHERE `trading_date`=? AND `security_id`=?
    '''
    dfrm = pandas.read_csv(io.StringIO(dataset), sep=',')
    for _, row in dfrm.iterrows():
        security_id = row['股票代號']
        security_name = row['股票名稱'].strip()
        balance = int(row['借券賣出當日餘額'].replace(',', ''))
        dbcon.execute(sql, (security_name, balance, trading_date, security_id))

def sync_dataset(dsitem, trading_date='latest'):
    """
    同步資料集共用流程

    * HTTP 日期格式: 108/05/29
    * DB, Cache 日期格式: 2019-05-29
    """
    if trading_date == 'latest':
        trading_date = datetime.today().strftime('%Y-%m-%d')

    logger = common.get_logger('finance')
    dtm = re.match(r'(\d{4})-(\d{2})-(\d{2})', trading_date)
    tokens = [
        str(int(dtm.group(1)) - 1911),
        dtm.group(2),
        dtm.group(3)
    ]
    datestr = '/'.join(tokens)
    data_format = 'csv' if dsitem in ['block', 'selled'] else 'json'
    this_mod = sys.modules[__name__]

    daily_cache = DateCache('tpex', dsitem, data_format)
    if daily_cache.has(trading_date):
        # 載入快取資料集
        logger.info('套用 TPEX %s 的 %s 快取', trading_date, dsitem)
        dataset = daily_cache.load(trading_date)
    else:
        # 下載資料集
        dataset = None
        repeat = 0
        hookfunc = getattr(this_mod, 'download_' + dsitem)
        while dataset is None and repeat < REPEAT_LIMIT:
            repeat += 1
            if repeat > 1:
                time.sleep(REPEAT_INTERVAL)
            try:
                logger.info('下載 TPEX %s 的 %s', trading_date, dsitem)
                dataset = hookfunc(datestr)
                logger.debug('儲存 TPEX %s 的 %s', trading_date, dsitem)
                daily_cache.save(trading_date, dataset)
            except InvalidDataException as ex:
                logger.error(
                    '無法取得 TPEX %s 的 %s (重試: %d, %s)',
                    trading_date, dsitem, repeat, ex.reason
                )
                repeat = REPEAT_LIMIT
            except NetworkException as ex:
                logger.error(
                    '無法取得 TPEX %s 的 %s (重試: %d, %s)',
                    trading_date, dsitem, repeat, ex.reason
                )

    if dataset is None:
        return

    # return

    # 匯入資料庫
    # pylint: disable=bare-except
    dbcon = get_connection()
    hookfunc = getattr(this_mod, 'import_' + dsitem)
    try:
        hookfunc(dbcon, trading_date, dataset)
        logger.info('匯入 TPEX %s 的 %s', trading_date, dsitem)
    except sqlite3.IntegrityError as ex:
        logger.warning('已經匯入過 TPEX %s 的 %s', trading_date, dsitem)
    except:
        logger.error('無法匯入 TPEX %s 的 %s', trading_date, dsitem)
    dbcon.commit()
    dbcon.close()

def main():
    """
    python3 -m twnews.finance.tpex {action}
    python3 -m twnews.finance.tpex {action} {date}
    """
    action = get_argument(1)
    trading_date = get_argument(2, 'latest')

    if trading_date != 'latest' and re.match(r'^\d{4}-\d{2}-\d{2}$', trading_date) is None:
        print('日期格式錯誤')
        return

    if action in ['block', 'margin', 'institution', 'selled']:
        sync_dataset(action, trading_date)
    else:
        print('參數錯誤')

if __name__ == '__main__':
    main()


================================================
FILE: twnews/finance/twse.py
================================================
"""
證交所資料蒐集模組
"""

from datetime import datetime
import io
import re
import sqlite3
import sys
import time

import pandas

import twnews.common as common
from twnews.finance import get_argument, fucking_get, get_connection, REPEAT_LIMIT, REPEAT_INTERVAL
from twnews.cache import DateCache
from twnews.exceptions import NetworkException, InvalidDataException

def download_margin(datestr):
    """
    下載信用交易資料集
    """
    url = 'http://www.twse.com.tw/exchangeReport/MI_MARGN'
    params = {
        'date': datestr,
        'response': 'json',
        'selectType': 'ALL'
    }
    def hook(resp):
        dataset = resp.json()
        status = dataset['stat']
        if status == 'OK':
            if not dataset['data']:
                raise InvalidDataException('可能尚未結算或是非交易日')
        else:
            raise NetworkException(status)
        return dataset
    return fucking_get(hook, url, params)

def download_block(datestr):
    """
    下載鉅額交易資料集
    """
    url = 'http://www.twse.com.tw/block/BFIAUU'
    params = {
        'date': datestr,
        'response': 'json',
        'selectType': 'S'
    }
    def hook(resp):
        dataset = resp.json()
        status = dataset['stat']
        if status == 'OK':
            if not dataset['data']:
                raise InvalidDataException('可能尚未結算或是非交易日')
        else:
            raise NetworkException(status)
        return dataset
    return fucking_get(hook, url, params)

def download_institution(datestr):
    """
    下載三大法人資料集
    """
    url = 'http://www.twse.com.tw/fund/T86'
    params = {
        'date': datestr,
        'response': 'json',
        'selectType': 'ALL'
    }
    def hook(resp):
        # TODO: 需要驗證資料完整性
        dataset = resp.json()
        status = dataset['stat']
        if status != 'OK':
            raise NetworkException(status)
        return dataset
    return fucking_get(hook, url, params)

def download_borrowed(datestr):
    """
    下載可借券賣出資料集 (只有當天資料)
    TODO: 待確認資料切換時間
    """
    url = 'http://www.twse.com.tw/SBL/TWT96U'
    params = {
        'response': 'csv'
    }
    def hook(resp):
        dataset = resp.text
        line1 = dataset[:dataset.find('\r\n')]
        match = re.search(r'(\d{3})年(\d{2})月(\d{2})日', line1)
        if match is not None:
            dtup = (
                int(match.group(1)) + 1911,
                match.group(2),
                match.group(3)
            )
            dsdate = '%04d%s%s' % dtup
            if dsdate != datestr:
                raise InvalidDataException('資料日期 %s 與指定日期不同' % dsdate)
        else:
            raise NetworkException('無法取得 CSV 內的日期字串')
        return dataset
    return fucking_get(hook, url, params)

def download_selled(datestr):
    """
    下載已借券賣出
    """
    url = 'http://www.twse.com.tw/exchangeReport/TWT93U'
    params = {
        'date': datestr,
        'response': 'json'
    }
    def hook(resp):
        dataset = resp.json()
        status = dataset['stat']
        if status == 'OK':
            if not dataset['data']:
                raise InvalidDataException('可能尚未結算或非交易日')
        else:
            raise NetworkException(status)
        return dataset
    return fucking_get(hook, url, params)

def download_etfnet(datestr):
    """
    下載 ETF 淨值折溢價率
    """
    url = 'https://mis.twse.com.tw/stock/data/all_etf.txt'
    params = {}
    def hook(resp):
        dataset = resp.json()
        dsdate = dataset['a1'][1]['msgArray'][0]['i']
        if datestr != dsdate:
            raise InvalidDataException('資料日期 %s 與指定日期不同' % dsdate)
        return dataset
    return fucking_get(hook, url, params)

def import_margin(dbcon, trading_date, dataset):
    """
    匯入信用交易資料集
    """
    sql = '''
        INSERT INTO `margin` (
            trading_date, security_id, security_name,
            buying_balance, selling_balance
        ) VALUES (?,?,?,?,?)
    '''
    for detail in dataset['data']:
        security_id = detail[0]
        security_name = detail[1].strip()
        buying_balance = int(detail[6].replace(',', ''))
        selling_balance = int(detail[12].replace(',', ''))
        dbcon.execute(sql, (
            trading_date,
            security_id,
            security_name,
            buying_balance,
            selling_balance
        ))

def import_block(dbcon, trading_date, dataset):
    """
    匯入鉅額交易資料集
    """
    sql = '''
        INSERT INTO `block` (
            trading_date, security_id, security_name,
            tick_rank, tick_type,
            close, volume, total
        ) VALUES (?,?,?,?,?,?,?,?)
    '''
    tick_rank = {}
    for trade in dataset['data']:
        if trade[0] == '總計':
            break
        security_id = trade[0]
        security_name = trade[1]
        tick_type = trade[2]
        close = float(trade[3].replace(',', ''))
        volume = int(trade[4].replace(',', ''))
        total = int(trade[5].replace(',', ''))
        if security_id not in tick_rank:
            tick_rank[security_id] = 1
        else:
            tick_rank[security_id] += 1
        dbcon.execute(sql, (
            trading_date,
            security_id,
            security_name,
            tick_rank[security_id],
            tick_type,
            close,
            volume,
            total
        ))

def import_institution(dbcon, trading_date, dataset):
    """
    匯入三大法人資料集
    """
    sql = '''
        INSERT INTO `institution` (
            trading_date, security_id, security_name,
            foreign_trend, stic_trend, dealer_trend
        ) VALUES (?,?,?,?,?,?)
    '''
    for detail in dataset['data']:
        security_id = detail[0]
        security_name = detail[1].strip()
        foreign_trend = int(detail[4].replace(',', '')) // 1000
        stic_trend = int(detail[10].replace(',', '')) // 1000
        dealer_trend = int(detail[11].replace(',', '')) // 1000
        dbcon.execute(sql, (
            trading_date, security_id, security_name,
            foreign_trend, stic_trend, dealer_trend
        ))

def import_borrowed(dbcon, trading_date, dataset):
    """
    匯入可借券賣出資料集 (使用 pandas 處理 CSV)
    """
    sql = '''
        INSERT INTO `short_sell` (
            trading_date, security_id, borrowed
        ) VALUES (?,?,?)
    '''
    col_names = ['sec1', 'vol1', 'sec2', 'vol2', 'shit']
    # Pandas 只能吃 file-like 參數,需要把 dataset 轉成 StringIO
    dfrm = pandas.read_csv(io.StringIO(dataset), sep=',', skiprows=3, header=None, names=col_names)
    cnt = 0
    for _, row in dfrm.iterrows():
        # 匯入左側資料
        security_id = row['sec1'].strip('="')
        borrowed = int(row['vol1'].replace(',', ''))
        dbcon.execute(sql, (trading_date, security_id, borrowed))
        cnt += 1
        # 匯入右側資料
        security_id = row['sec2'].strip('="')
        if security_id != '_':
            borrowed = int(row['vol2'].replace(',', ''))
            dbcon.execute(sql, (trading_date, security_id, borrowed))
            cnt += 1

def import_selled(dbcon, trading_date, dataset):
    """
    匯入已借券賣出資料集
    """
    sql = '''
        UPDATE `short_sell` SET `security_name`=?, `selled`=?
        WHERE `trading_date`=? AND `security_id`=?
    '''
    for detail in dataset['data']:
        security_id = detail[0]
        security_name = detail[1].strip()
        balance = int(detail[12].replace(',', ''))
        if security_id != '':
            # TODO: 如果 WHERE 條件不成立,沒更新到資料,應該要產生 Exception 觸發錯誤回報
            dbcon.execute(sql, (
                security_name, balance,
                trading_date, security_id
            ))

def import_etfnet(dbcon, trading_date, dataset):
    """
    匯入 ETF 淨值折溢價率
    """
    # 來源資料轉換 key/value 形式
    # 相同發行者的 ETF 會放一組
    etf_dict = {}
    for fund in dataset['a1']:
        if 'msgArray' in fund:
            for etf in fund['msgArray']:
                etf_dict[etf['a']] = etf

    # 依證券代碼順序處理
    sql = '''
        INSERT INTO `etf_offset` (
            trading_date, security_id, security_name,
            close, net, offset
        ) VALUES (?,?,?,?,?,?)
    '''
    for k in sorted(etf_dict.keys()):
        etf = etf_dict[k]
        dbcon.execute(sql, (trading_date, etf['a'], etf['b'], etf['e'], etf['f'], etf['g']))

def sync_dataset(dsitem, trading_date='latest'):
    """
    同步資料集共用流程
    """
    if trading_date == 'latest':
        trading_date = datetime.today().strftime('%Y-%m-%d')

    logger = common.get_logger('finance')
    datestr = trading_date.replace('-', '')
    data_format = 'csv' if dsitem == 'borrowed' else 'json'
    this_mod = sys.modules[__name__]

    daily_cache = DateCache('twse', dsitem, data_format)
    if daily_cache.has(datestr):
        # 載入快取資料集
        logger.debug('套用 TWSE %s 的 %s 快取', trading_date, dsitem)
        dataset = daily_cache.load(datestr)
    else:
        # 下載資料集
        dataset = None
        repeat = 0
        hookfunc = getattr(this_mod, 'download_' + dsitem)
        while dataset is None and repeat < REPEAT_LIMIT:
            repeat += 1
            if repeat > 1:
                time.sleep(REPEAT_INTERVAL)
            try:
                logger.info('下載 TWSE %s 的 %s', trading_date, dsitem)
                dataset = hookfunc(datestr)
                logger.debug('儲存 TWSE %s 的 %s', trading_date, dsitem)
                daily_cache.save(datestr, dataset)
            except InvalidDataException as ex:
                logger.error(
                    '無法取得 TWSE %s 的 %s (重試: %d, %s)',
                    trading_date, dsitem, repeat, ex.reason
                )
                repeat = REPEAT_LIMIT
            except NetworkException as ex:
                logger.error(
                    '無法取得 TWSE %s 的 %s (重試: %d, %s)',
                    trading_date, dsitem, repeat, ex.reason
                )

    if dataset is None:
        return

    # 匯入資料庫
    # pylint: disable=bare-except
    dbcon = get_connection()
    hookfunc = getattr(this_mod, 'import_' + dsitem)
    try:
        hookfunc(dbcon, trading_date, dataset)
        logger.info('匯入 TWSE %s 的 %s', trading_date, dsitem)
    except sqlite3.IntegrityError as ex:
        logger.warning('已經匯入過 TWSE %s 的 %s', trading_date, dsitem)
    except:
        # TODO: ex.args[0] 不確定是否可靠, 需要再確認
        logger.error('無法匯入 TWSE %s 的 %s', trading_date, dsitem)
    dbcon.commit()
    dbcon.close()

def main():
    """
    python3 -m twnews.finance.twse {action}
    python3 -m twnews.finance.twse {action} {date}
    """
    action = get_argument(1)
    trading_date = get_argument(2, 'latest')

    if trading_date != 'latest' and re.match(r'^\d{4}-\d{2}-\d{2}$', trading_date) is None:
        print('日期格式錯誤')
        return

    if action in ['block', 'margin', 'institution', 'borrowed', 'selled', 'etfnet']:
        sync_dataset(action, trading_date)
    else:
        print('參數錯誤')

if __name__ == '__main__':
    main()


================================================
FILE: twnews/res/ticksmap-captcha.xrc
================================================
<?xml version="1.0" encoding="utf-8"?>
<resource version="2.9.0.5">
  <object name="main_frame" class="wxFrame">
    <title>輸入驗證碼</title>
    <object class="wxBoxSizer">
      <orient>wxVERTICAL</orient>
      <minsize>230,160</minsize>
      <object class="spacer">
        <size>200,15</size>
      </object>
      <object class="sizeritem">
        <flag>wxALIGN_CENTER</flag>
        <object name="captcha_image" class="wxStaticBitmap">
          <size>200,60</size>
        </object>
      </object>
      <object class="spacer">
        <size>200,15</size>
      </object>
      <object class="sizeritem">
        <flag>wxALIGN_CENTER</flag>
        <object name="captcha_code" class="wxTextCtrl">
          <size>200,25</size>
        </object>
      </object>
      <object class="spacer">
        <size>200,15</size>
      </object>
      <object class="sizeritem">
        <flag>wxALIGN_CENTER</flag>
        <object class="wxStaticText">
          <size>200,30</size>
        	<style>wxALIGN_RIGHT</style>
        	<label>完成後按 Enter 送出</label>
        </object>
      </object>
    </object>
  </object>
</resource>


================================================
FILE: twnews/res/ticksmap-locations.geojson
================================================
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55632732452477,
          25.04158688504074
        ]
      },
      "properties": {
        "id": "1020",
        "name": "\u5408\u5eab",
        "phone": "02-27319987"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51080823591725,
          25.043664306140116
        ]
      },
      "properties": {
        "id": "1030",
        "name": "\u571f\u9280",
        "phone": "02-23483948"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51310710381462,
          25.043913459588925
        ]
      },
      "properties": {
        "id": "1040",
        "name": "\u81fa\u9280\u8b49\u5238",
        "phone": "02-23882188"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51028779488004,
          25.051170155854706
        ]
      },
      "properties": {
        "id": "1110",
        "name": "\u53f0\u7063\u4f01\u9280",
        "phone": "02-25597171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53342667008538,
          25.052281416537795
        ]
      },
      "properties": {
        "id": "1160",
        "name": "\u65e5\u76db",
        "phone": "02-25048888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51167945998723,
          25.042175310370748
        ]
      },
      "properties": {
        "id": "1230",
        "name": "\u5f70\u9280",
        "phone": "02-23619654"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55309086689374,
          25.03297338266213
        ]
      },
      "properties": {
        "id": "1260",
        "name": "\u5b8f\u9060",
        "phone": "02-27008899"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54368267948932,
          25.047179768694402
        ]
      },
      "properties": {
        "id": "1360",
        "name": "\u6e2f\u5546\u9ea5\u683c\u7406",
        "phone": "02-27347500"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54931009689535,
          25.029235814800124
        ]
      },
      "properties": {
        "id": "1380",
        "name": "\u53f0\u7063\u532f\u7acb",
        "phone": "02-2326-8188"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54925496369476,
          25.02601029425199
        ]
      },
      "properties": {
        "id": "1440",
        "name": "\u7f8e\u6797",
        "phone": "02-23763766"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54925496369476,
          25.02601029425199
        ]
      },
      "properties": {
        "id": "1470",
        "name": "\u53f0\u7063\u6469\u6839\u58eb\u4e39\u5229",
        "phone": "02-27302888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54925496369476,
          25.02601029425199
        ]
      },
      "properties": {
        "id": "1480",
        "name": "\u7f8e\u5546\u9ad8\u76db",
        "phone": "02-27304000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54517955306842,
          25.058078123771878
        ]
      },
      "properties": {
        "id": "1520",
        "name": "\u745e\u58eb\u4fe1\u8cb8",
        "phone": "02-27156388"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55371954979228,
          25.037410864660483
        ]
      },
      "properties": {
        "id": "1530",
        "name": "\u6e2f\u5546\u5fb7\u610f\u5fd7",
        "phone": "02-21922888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56608187757948,
          25.03877471028857
        ]
      },
      "properties": {
        "id": "1560",
        "name": "\u6e2f\u5546\u91ce\u6751",
        "phone": "02-21769999"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56669383519447,
          25.04064778336209
        ]
      },
      "properties": {
        "id": "1570",
        "name": "\u6e2f\u5546\u6cd5\u570b\u8208\u696d",
        "phone": "02-21750800"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56608187757948,
          25.03877471028857
        ]
      },
      "properties": {
        "id": "1590",
        "name": "\u82b1\u65d7\u74b0\u7403",
        "phone": "8726-9000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5688646287331,
          25.03826461521507
        ]
      },
      "properties": {
        "id": "1650",
        "name": "\u65b0\u52a0\u5761\u5546\u745e\u9280",
        "phone": "02-87227200"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51309587724606,
          25.043075555632672
        ]
      },
      "properties": {
        "id": "2180",
        "name": "\u4e9e\u6771",
        "phone": "02-23618600"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54367316718195,
          25.05204949387523
        ]
      },
      "properties": {
        "id": "2200",
        "name": "\u5143\u5927\u671f\u8ca8",
        "phone": "02-27176000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54918468731599,
          25.028912522276457
        ]
      },
      "properties": {
        "id": "2210",
        "name": "\u7fa4\u76ca\u671f\u8ca8",
        "phone": "02-2700-2888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51706465414371,
          25.050152610866082
        ]
      },
      "properties": {
        "id": "5050",
        "name": "\u5927\u5c55",
        "phone": "02-25551234"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54614850573897,
          25.051999380423336
        ]
      },
      "properties": {
        "id": "5110",
        "name": "\u5bcc\u9686",
        "phone": "02-27182788"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53568958854905,
          25.057707199312865
        ]
      },
      "properties": {
        "id": "5260",
        "name": "\u5927\u6176",
        "phone": "02-25084888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.22860157863632,
          24.955210757911725
        ]
      },
      "properties": {
        "id": "5320",
        "name": "\u9ad8\u6a4b",
        "phone": "03-4224243"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52468602293277,
          25.04876928202976
        ]
      },
      "properties": {
        "id": "5380",
        "name": "\u7b2c\u4e00\u91d1",
        "phone": "02-25636262"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.6888876522262,
          24.163771451198038
        ]
      },
      "properties": {
        "id": "5460",
        "name": "\u5bf6\u76db",
        "phone": "04-22309377"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51701086522118,
          25.032492670595666
        ]
      },
      "properties": {
        "id": "5600",
        "name": "\u6c38\u8208",
        "phone": "02-23218200"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.82198887585135,
          24.56031630372757
        ]
      },
      "properties": {
        "id": "5660",
        "name": "\u65e5\u9032",
        "phone": "037-332266"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56518798473458,
          25.050297600264386
        ]
      },
      "properties": {
        "id": "5850",
        "name": "\u7d71\u4e00",
        "phone": "02-27478266"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30056660961714,
          22.634887195832352
        ]
      },
      "properties": {
        "id": "5860",
        "name": "\u76c8\u6ea2",
        "phone": "07-2888516"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.6107618870506,
          23.976696137468185
        ]
      },
      "properties": {
        "id": "5870",
        "name": "\u5149\u9686",
        "phone": "038-352181"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54394897702572,
          25.039444340340598
        ]
      },
      "properties": {
        "id": "5920",
        "name": "\u5143\u5bcc",
        "phone": "02-23255818"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.6818310173108,
          23.976856378820973
        ]
      },
      "properties": {
        "id": "5960",
        "name": "\u65e5\u8302",
        "phone": "049-2353266"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54427997732583,
          25.05284928794247
        ]
      },
      "properties": {
        "id": "6010",
        "name": "\u7287\u4e9e\u8b49\u5238",
        "phone": "02-27180101"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.68151648556015,
          24.13779326319005
        ]
      },
      "properties": {
        "id": "6110",
        "name": "\u53f0\u4e2d\u9280",
        "phone": "04-22268588"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.61617924984901,
          25.0589172348979
        ]
      },
      "properties": {
        "id": "6160",
        "name": "\u4e2d\u570b\u4fe1\u8a17",
        "phone": "02-66392000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29800724636863,
          22.623099703721376
        ]
      },
      "properties": {
        "id": "6210",
        "name": "\u65b0\u767e\u738b",
        "phone": "07-2118888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.5242747909776,
          23.87335182058557
        ]
      },
      "properties": {
        "id": "6380",
        "name": "\u5149\u548c",
        "phone": "04-8886500"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.30300040827895,
          24.993786703883426
        ]
      },
      "properties": {
        "id": "6450",
        "name": "\u6c38\u5168",
        "phone": "03-3352155"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.45795929347729,
          25.008461659152243
        ]
      },
      "properties": {
        "id": "6460",
        "name": "\u5927\u660c",
        "phone": "02-29689685"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.518143901973,
          25.045820830940887
        ]
      },
      "properties": {
        "id": "6480",
        "name": "\u798f\u90a6",
        "phone": "02-23836888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.30631297712581,
          24.992110225995972
        ]
      },
      "properties": {
        "id": "6620",
        "name": "\u5168\u6cf0",
        "phone": "03-3353359"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53258288735728,
          25.04201384619974
        ]
      },
      "properties": {
        "id": "6910",
        "name": "\u5fb7\u4fe1",
        "phone": "02-23939988"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.44149706381457,
          25.1694911587292
        ]
      },
      "properties": {
        "id": "6950",
        "name": "\u798f\u52dd",
        "phone": "02-26251818"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53098268243339,
          25.043042093948422
        ]
      },
      "properties": {
        "id": "7000",
        "name": "\u5146\u8c50",
        "phone": "02-23278988"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.20167088614014,
          22.99998813402813
        ]
      },
      "properties": {
        "id": "7030",
        "name": "\u81f4\u548c",
        "phone": "06-2219777"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.72673628232234,
          24.25488700661086
        ]
      },
      "properties": {
        "id": "7070",
        "name": "\u8c50\u8fb2",
        "phone": "04-25281000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.20649400143107,
          24.952791292516785
        ]
      },
      "properties": {
        "id": "7080",
        "name": "\u77f3\u6a4b",
        "phone": "03-4927373"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.3247967662418,
          22.620387790266808
        ]
      },
      "properties": {
        "id": "7670",
        "name": "\u91d1\u6e2f",
        "phone": "07-7253922"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5152164425769,
          25.012069955709205
        ]
      },
      "properties": {
        "id": "7750",
        "name": "\u5317\u57ce",
        "phone": "02-29283456"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51396094014368,
          25.06958687115064
        ]
      },
      "properties": {
        "id": "7790",
        "name": "\u570b\u7968",
        "phone": "02-25288988"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52250680552487,
          25.055200268481965
        ]
      },
      "properties": {
        "id": "8150",
        "name": "\u53f0\u65b0",
        "phone": "02-2181-5888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.32446616282999,
          22.58870591508886
        ]
      },
      "properties": {
        "id": "8380",
        "name": "\u5b89\u6cf0",
        "phone": "07-8122789"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56677327562443,
          25.032587487648936
        ]
      },
      "properties": {
        "id": "8440",
        "name": "\u6469\u6839\u5927\u901a",
        "phone": "02-87862288"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56521045118325,
          25.04281373711842
        ]
      },
      "properties": {
        "id": "8450",
        "name": "\u5eb7\u548c",
        "phone": "02-87871888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.44447000080147,
          23.475370003551532
        ]
      },
      "properties": {
        "id": "8490",
        "name": "\u842c\u6cf0",
        "phone": "05-2289911"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.5375320581055,
          24.259971290645026
        ]
      },
      "properties": {
        "id": "8520",
        "name": "\u4e2d\u8fb2",
        "phone": "04-26572801"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51309791903128,
          25.043609091008246
        ]
      },
      "properties": {
        "id": "8560",
        "name": "\u65b0\u5149",
        "phone": "02-23118181"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53511049086421,
          25.052216677516075
        ]
      },
      "properties": {
        "id": "8580",
        "name": "\u806f\u90a6\u5546\u9280",
        "phone": "02-25040066"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56413596337005,
          25.059200351508807
        ]
      },
      "properties": {
        "id": "8710",
        "name": "\u967d\u4fe1",
        "phone": "02-27629288"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5584161,
          25.0330058
        ]
      },
      "properties": {
        "id": "8770",
        "name": "\u5927\u9f0e",
        "phone": "02-87862666"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5476906789493,
          25.057382833887658
        ]
      },
      "properties": {
        "id": "8840",
        "name": "\u7389\u5c71",
        "phone": "02-27131313"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54912154770592,
          25.02199616910617
        ]
      },
      "properties": {
        "id": "8880",
        "name": "\u570b\u6cf0",
        "phone": "02-23269888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56354738285579,
          25.04027198666496
        ]
      },
      "properties": {
        "id": "8890",
        "name": "\u5927\u548c\u570b\u6cf0",
        "phone": "02-27239698"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56460399421985,
          25.033524181174343
        ]
      },
      "properties": {
        "id": "8900",
        "name": "\u6cd5\u9280\u5df4\u9ece",
        "phone": "02-87297000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56111164490846,
          25.03425396521133
        ]
      },
      "properties": {
        "id": "8960",
        "name": "\u9999\u6e2f\u4e0a\u6d77\u532f\u8c50",
        "phone": "02-66312899"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5476906789493,
          25.057382833887658
        ]
      },
      "properties": {
        "id": "9100",
        "name": "\u7fa4\u76ca\u91d1\u9f0e",
        "phone": "02-87898888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55150597524369,
          25.084875603296265
        ]
      },
      "properties": {
        "id": "9200",
        "name": "\u51f1\u57fa",
        "phone": "02-21818888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55091534806718,
          25.05773398758345
        ]
      },
      "properties": {
        "id": "9300",
        "name": "\u83ef\u5357\u6c38\u660c",
        "phone": "02-25456888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55228341974629,
          25.038171253374973
        ]
      },
      "properties": {
        "id": "9600",
        "name": "\u5bcc\u90a6",
        "phone": "02-87716888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54367316718195,
          25.05204949387523
        ]
      },
      "properties": {
        "id": "9800",
        "name": "\u5143\u5927",
        "phone": "02-27177777"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51313466224889,
          25.046858563293043
        ]
      },
      "properties": {
        "id": "9A00",
        "name": "\u6c38\u8c50\u91d1",
        "phone": "02-2311-4345"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.67932882032501,
          24.138023049092705
        ]
      },
      "properties": {
        "id": "1021",
        "name": "\u5408\u5eab- \u53f0\u4e2d",
        "phone": "04-22255141"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.20785217188497,
          22.998048342742422
        ]
      },
      "properties": {
        "id": "1022",
        "name": "\u5408\u5eab-\u53f0\u5357",
        "phone": "06-2260148"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.28415335084445,
          22.624858545156123
        ]
      },
      "properties": {
        "id": "1023",
        "name": "\u5408\u5eab-\u9ad8\u96c4",
        "phone": "07-5319755"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.44830999896347,
          23.480500006287315
        ]
      },
      "properties": {
        "id": "1024",
        "name": "\u5408\u5eab-\u5609\u7fa9",
        "phone": "05-2220016"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.74243480777429,
          25.129790883787546
        ]
      },
      "properties": {
        "id": "1025",
        "name": "\u5408\u5eab-\u57fa\u9686",
        "phone": "02-24288468"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.54226127991265,
          24.08201552308815
        ]
      },
      "properties": {
        "id": "1028",
        "name": "\u5408\u5eab - \u5f70\u5316",
        "phone": "04-7295528"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.36076902006839,
          22.63198133614876
        ]
      },
      "properties": {
        "id": "1029",
        "name": "\u5408\u5eab - \u9cf3\u5c71",
        "phone": "07-7905555"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.97270020694022,
          24.804291620951954
        ]
      },
      "properties": {
        "id": "102A",
        "name": "\u5408\u5eab-\u65b0\u7af9",
        "phone": "03-5262711"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53180372709332,
          25.052251466892706
        ]
      },
      "properties": {
        "id": "102C",
        "name": "\u5408\u5eab-\u81ea\u5f37",
        "phone": "02-25218815"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.31150586549468,
          24.991279043189472
        ]
      },
      "properties": {
        "id": "102E",
        "name": "\u5408\u5eab-\u6843\u5712",
        "phone": "03-347-3456"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.66157747025434,
          24.165313145938327
        ]
      },
      "properties": {
        "id": "102F",
        "name": "\u5408\u5eab-\u897f\u53f0\u4e2d",
        "phone": "04-2316-6368"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.49800733486221,
          25.06144447654486
        ]
      },
      "properties": {
        "id": "102G",
        "name": "\u5408\u5eab-\u4e09\u91cd",
        "phone": "02-2977-9588"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54533682918803,
          25.04842521151739
        ]
      },
      "properties": {
        "id": "102Q",
        "name": "\u5408\u5eab-\u570b\u969b\u8b49\u5238",
        "phone": "02-27319987"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.67990128465617,
          24.13825322494037
        ]
      },
      "properties": {
        "id": "1031",
        "name": "\u571f\u9280-\u53f0\u4e2d",
        "phone": "04-22266845"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.20339440181245,
          22.992252120248214
        ]
      },
      "properties": {
        "id": "1032",
        "name": "\u571f\u9280-\u53f0\u5357",
        "phone": "06-2200558"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.27129411775813,
          22.873571988641256
        ]
      },
      "properties": {
        "id": "1033",
        "name": "\u571f\u9280-\u9ad8\u96c4",
        "phone": "07-5319591"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.44979999686525,
          23.47966000124929
        ]
      },
      "properties": {
        "id": "1034",
        "name": "\u571f\u9280-\u5609\u7fa9",
        "phone": "05-2224930"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.96659466996886,
          24.80375868702233
        ]
      },
      "properties": {
        "id": "1035",
        "name": "\u571f\u9280-\u65b0\u7af9",
        "phone": "03-5213211"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.31489456142324,
          23.3332966299898
        ]
      },
      "properties": {
        "id": "1036",
        "name": "\u571f\u9280-\u7389\u91cc",
        "phone": "03-8884900"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.60528118307762,
          23.98263045084194
        ]
      },
      "properties": {
        "id": "1037",
        "name": "\u571f\u9280-\u82b1\u84ee",
        "phone": "038-330723"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5474650234814,
          25.024833568124606
        ]
      },
      "properties": {
        "id": "1038",
        "name": "\u571f\u9280-\u548c\u5e73",
        "phone": "02-27001555"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52724587319906,
          25.097343788113815
        ]
      },
      "properties": {
        "id": "1039",
        "name": "\u571f\u9280-\u58eb\u6797",
        "phone": "02-28323842"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.3164644400523,
          22.635521544152244
        ]
      },
      "properties": {
        "id": "103A",
        "name": "\u571f\u9280-\u5efa\u570b",
        "phone": "07-2224911"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.54183356049035,
          24.080770728752796
        ]
      },
      "properties": {
        "id": "103B",
        "name": "\u571f\u9280-\u5f70\u5316",
        "phone": "04-7201777"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.414848991017,
          23.351634208408967
        ]
      },
      "properties": {
        "id": "103C",
        "name": "\u571f\u9280-\u767d\u6cb3",
        "phone": "06-6852888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.60640542181336,
          25.05446942001287
        ]
      },
      "properties": {
        "id": "103F",
        "name": "\u571f\u9280-\u5357\u6e2f",
        "phone": "02-26539955"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.35732650742509,
          22.629326183169592
        ]
      },
      "properties": {
        "id": "1041",
        "name": "\u81fa\u9280-\u9cf3\u5c71",
        "phone": "07-7452126"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.19675565514052,
          22.99330465321226
        ]
      },
      "properties": {
        "id": "1042",
        "name": "\u81fa\u9280-\u81fa\u5357",
        "phone": "06-2202507"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51851236194777,
          25.062495687748147
        ]
      },
      "properties": {
        "id": "1043",
        "name": "\u81fa\u9280-\u6c11\u6b0a",
        "phone": "02-25529458"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.9702534128656,
          24.80177054430763
        ]
      },
      "properties": {
        "id": "1045",
        "name": "\u81fa\u9280-\u65b0\u7af9",
        "phone": "03-5266799"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.6204925306737,
          24.34941595522094
        ]
      },
      "properties": {
        "id": "104A",
        "name": "\u81fa\u9280-\u81fa\u4e2d",
        "phone": "04-22238370"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29537003211789,
          22.61962931052201
        ]
      },
      "properties": {
        "id": "104C",
        "name": "\u81fa\u9280-\u9ad8\u96c4",
        "phone": "07-2711434"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52639082254834,
          25.028249598276087
        ]
      },
      "properties": {
        "id": "104D",
        "name": "\u81fa\u9280-\u91d1\u5c71",
        "phone": "02-23418678"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.67896972188629,
          24.143131876821887
        ]
      },
      "properties": {
        "id": "1111",
        "name": "\u53f0\u7063\u4f01\u9280-\u53f0\u4e2d",
        "phone": "04-2270160"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.19802574994664,
          22.99270074166405
        ]
      },
      "properties": {
        "id": "1112",
        "name": "\u53f0\u7063\u4f01\u9280-\u53f0\u5357",
        "phone": "06-2238967"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30653312289412,
          22.64079219600887
        ]
      },
      "properties": {
        "id": "1113",
        "name": "\u53f0\u7063\u4f01\u9280-\u4e5d\u5982",
        "phone": "07-3137171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.45106999990244,
          23.479200007141184
        ]
      },
      "properties": {
        "id": "1114",
        "name": "\u53f0\u7063\u4f01\u9280-\u5609\u7fa9",
        "phone": "05-2278043"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.7178125,
          24.1301842
        ]
      },
      "properties": {
        "id": "1115",
        "name": "\u53f0\u7063\u4f01\u9280-\u592a\u5e73",
        "phone": "04-22757802"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.489969232361,
          22.670347729445268
        ]
      },
      "properties": {
        "id": "1116",
        "name": "\u53f0\u7063\u4f01\u9280-\u5c4f\u6771",
        "phone": "08-7327171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.00969324648646,
          24.826725088982517
        ]
      },
      "properties": {
        "id": "1117",
        "name": "\u53f0\u7063\u4f01\u9280-\u7af9\u5317",
        "phone": "03-5527171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.71535072688161,
          24.251532446909962
        ]
      },
      "properties": {
        "id": "1118",
        "name": "\u53f0\u7063\u4f01\u9280-\u8c50\u539f",
        "phone": "04-25253485"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29587875997149,
          22.795320846840365
        ]
      },
      "properties": {
        "id": "1119",
        "name": "\u53f0\u7063\u4f01\u9280-\u5ca1\u5c71",
        "phone": "07-6227171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.43975670187386,
          23.52144527714178
        ]
      },
      "properties": {
        "id": "111A",
        "name": "\u53f0\u7063\u4f01\u9280-\u6c11\u96c4",
        "phone": "05-2207171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5184056211577,
          25.0529491430657
        ]
      },
      "properties": {
        "id": "111B",
        "name": "\u53f0\u7063\u4f01\u9280-\u5efa\u6210",
        "phone": "02-25506998"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30159407232479,
          22.632742219225428
        ]
      },
      "properties": {
        "id": "111C",
        "name": "\u53f0\u7063\u4f01\u9280-\u4e09\u6c11",
        "phone": "07-2877171"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51309704441478,
          25.043398910926587
        ]
      },
      "properties": {
        "id": "111D",
        "name": "\u81fa\u7063\u4f01\u9280-\u53f0\u5317",
        "phone": "02-23880971"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.19929070410133,
          25.06746781989051
        ]
      },
      "properties": {
        "id": "111E",
        "name": "\u81fa\u7063\u4f01\u9280-\u6843\u5712",
        "phone": "03-3311007"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30872276956832,
          22.63465156760871
        ]
      },
      "properties": {
        "id": "111F",
        "name": "\u53f0\u7063\u4f01\u9280-\u5317\u9ad8\u96c4",
        "phone": "07-2360116"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.47399617905653,
          25.015708971866616
        ]
      },
      "properties": {
        "id": "111G",
        "name": "\u81fa\u7063\u4f01\u9280-\u57d4\u5898",
        "phone": "02-89515671"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51313594465451,
          25.046588077225692
        ]
      },
      "properties": {
        "id": "1161",
        "name": "\u65e5\u76db-\u5fe0\u5b5d",
        "phone": "02-23117676"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.211231212601,
          22.998331684858304
        ]
      },
      "properties": {
        "id": "1162",
        "name": "\u65e5\u76db-\u53f0\u5357",
        "phone": "06-2208088"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.65802322556158,
          24.157501330064473
        ]
      },
      "properties": {
        "id": "1163",
        "name": "\u65e5\u76db-\u53f0\u4e2d",
        "phone": "04-23278989"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.58702324772196,
          25.06885851523333
        ]
      },
      "properties": {
        "id": "1164",
        "name": "\u65e5\u76db-\u5167\u6e56",
        "phone": "02-87927888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.45884157473422,
          25.008598217597026
        ]
      },
      "properties": {
        "id": "1165",
        "name": "\u65e5\u76db-\u677f\u6a4b",
        "phone": "02-29693333"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.510193268056,
          25.001498418233083
        ]
      },
      "properties": {
        "id": "1166",
        "name": "\u65e5\u76db-\u96d9\u548c",
        "phone": "02-29206868"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.43782999882556,
          23.46793000485439
        ]
      },
      "properties": {
        "id": "1167",
        "name": "\u65e5\u76db-\u5609\u7fa9",
        "phone": "05-2860777"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29160481593269,
          22.631515517996363
        ]
      },
      "properties": {
        "id": "1168",
        "name": "\u65e5\u76db-\u9ad8\u96c4",
        "phone": "07-2816888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52582340111636,
          25.092753962167233
        ]
      },
      "properties": {
        "id": "1169",
        "name": "\u65e5\u76db-\u58eb\u6797",
        "phone": "02-28831515"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56731982932799,
          24.98833379889497
        ]
      },
      "properties": {
        "id": "116A",
        "name": "\u65e5\u76db-\u6728\u67f5",
        "phone": "02-22342616"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.2251226719558,
          24.957608037965695
        ]
      },
      "properties": {
        "id": "116B",
        "name": "\u65e5\u76db-\u4e2d\u58e2",
        "phone": "03-4273888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.29970803171479,
          24.95993352049299
        ]
      },
      "properties": {
        "id": "116C",
        "name": "\u65e5\u76db-\u516b\u5fb7",
        "phone": "03-3622888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.49804393990259,
          25.062408436133325
        ]
      },
      "properties": {
        "id": "116E",
        "name": "\u65e5\u76db-\u4e09\u91cd",
        "phone": "02-29845858"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.5707048949148,
          23.957347491571056
        ]
      },
      "properties": {
        "id": "116F",
        "name": "\u65e5\u76db-\u54e1\u6797",
        "phone": "04-8326118"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54063427936396,
          24.991821165066195
        ]
      },
      "properties": {
        "id": "116G",
        "name": "\u65e5\u76db-\u666f\u7f8e",
        "phone": "02-9353966"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.97270020694022,
          24.804291620951954
        ]
      },
      "properties": {
        "id": "116H",
        "name": "\u65e5\u76db-\u65b0\u7af9",
        "phone": "03-5227868"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52944637920923,
          25.051881061427295
        ]
      },
      "properties": {
        "id": "116I",
        "name": "\u65e5\u76db-\u5357\u4eac",
        "phone": "02-25215678"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.47906797537641,
          25.019361137984383
        ]
      },
      "properties": {
        "id": "116J",
        "name": "\u65e5\u76db-\u677f\u6a4b\u4e2d\u5c71",
        "phone": "02-29559000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.60888846070146,
          23.977371477505788
        ]
      },
      "properties": {
        "id": "116K",
        "name": "\u65e5\u76db-\u82b1\u84ee",
        "phone": "038-346888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.64954856076642,
          24.15305289496997
        ]
      },
      "properties": {
        "id": "116L",
        "name": "\u65e5\u76db-\u5927\u58a9",
        "phone": "04-23288898"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.48957035317076,
          22.675317402185232
        ]
      },
      "properties": {
        "id": "116M",
        "name": "\u65e5\u76db-\u5c4f\u6771",
        "phone": "08-7663888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.2337621228569,
          22.998759871949794
        ]
      },
      "properties": {
        "id": "116N",
        "name": "\u65e5\u76db-\u6c38\u5eb7",
        "phone": "06-3113888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.55725922914377,
          25.028875092781018
        ]
      },
      "properties": {
        "id": "116P",
        "name": "\u65e5\u76db-\u4fe1\u7fa9",
        "phone": "02-27390369"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.36746868801708,
          23.89997613328988
        ]
      },
      "properties": {
        "id": "116Q",
        "name": "\u65e5\u76db-\u4e8c\u6797",
        "phone": "04-8951808"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.4460097674286,
          25.034632171027447
        ]
      },
      "properties": {
        "id": "116S",
        "name": "\u65e5\u76db-\u65b0\u838a",
        "phone": "02-22019789"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.2022110994658,
          24.96883782461372
        ]
      },
      "properties": {
        "id": "116U",
        "name": "\u65e5\u76db-\u6843\u5712",
        "phone": "03-3553188"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.54105159248628,
          23.70872990093559
        ]
      },
      "properties": {
        "id": "116V",
        "name": "\u65e5\u76db-\u6597\u516d",
        "phone": "05-5351010"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.3623775474523,
          22.625268942966677
        ]
      },
      "properties": {
        "id": "116W",
        "name": "\u65e5\u76db-\u9cf3\u5c71",
        "phone": "07-7462488"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.75337274685407,
          24.761014320229368
        ]
      },
      "properties": {
        "id": "116X",
        "name": "\u65e5\u76db-\u5b9c\u862d",
        "phone": "039-313588"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.5001116224282,
          25.001599923068472
        ]
      },
      "properties": {
        "id": "116Z",
        "name": "\u65e5\u76db-\u4e2d\u548c",
        "phone": "02-22493366"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.99475736353257,
          24.801203341946138
        ]
      },
      "properties": {
        "id": "116b",
        "name": "\u65e5\u76db-\u5712\u5340",
        "phone": "03-5717666"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.71972816217485,
          24.253464932865967
        ]
      },
      "properties": {
        "id": "116c",
        "name": "\u65e5\u76db-\u8c50\u539f",
        "phone": "(04)25229688"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.37227801178788,
          24.935912457393407
        ]
      },
      "properties": {
        "id": "116d",
        "name": "\u65e5\u76db-\u4e09\u5cfd",
        "phone": "02-26745568"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.01920664134106,
          24.83387110944215
        ]
      },
      "properties": {
        "id": "116e",
        "name": "\u65e5\u76db-\u7af9\u5317",
        "phone": "03-5585988"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.54422854381538,
          25.05047535925724
        ]
      },
      "properties": {
        "id": "116f",
        "name": "\u65e5\u76db-\u5fa9\u8208",
        "phone": "02-87718888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.21607720943548,
          24.87067940488312
        ]
      },
      "properties": {
        "id": "116g",
        "name": "\u65e5\u76db-\u9f8d\u6f6d",
        "phone": "03-4802000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.42308778712945,
          24.991153315140735
        ]
      },
      "properties": {
        "id": "116i",
        "name": "\u65e5\u76db-\u6a39\u6797",
        "phone": "02-26869968"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.9074091385529,
          24.685669888167702
        ]
      },
      "properties": {
        "id": "116j",
        "name": "\u65e5\u76db-\u982d\u4efd",
        "phone": "037-663888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29503071447735,
          22.6681474940093
        ]
      },
      "properties": {
        "id": "116k",
        "name": "\u65e5\u76db-\u5317\u9ad8\u96c4",
        "phone": "07-5501768"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.49992112043795,
          24.11062777812132
        ]
      },
      "properties": {
        "id": "116m",
        "name": "\u65e5\u76db-\u548c\u7f8e",
        "phone": "04-7565988"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.47125800045569,
          25.027392998861323
        ]
      },
      "properties": {
        "id": "116r",
        "name": "\u65e5\u76db-\u6587\u5316",
        "phone": "02-22569999"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.44161556510949,
          24.974302992121746
        ]
      },
      "properties": {
        "id": "116s",
        "name": "\u65e5\u76db-\u571f\u57ce",
        "phone": "02-82613366"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.53342667008538,
          25.052281416537795
        ]
      },
      "properties": {
        "id": "116x",
        "name": "\u65e5\u76db-\u570b\u969b\u8b49\u5238",
        "phone": "25615888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30649233569413,
          22.63455164010488
        ]
      },
      "properties": {
        "id": "1232",
        "name": "\u5f70\u9280-\u4e03\u8ce2",
        "phone": "07-2355658"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.66208752306628,
          24.121981890403184
        ]
      },
      "properties": {
        "id": "1233",
        "name": "\u5f70\u9280-\u53f0\u4e2d",
        "phone": "04-22660011"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.56476131199557,
          25.05958175606323
        ]
      },
      "properties": {
        "id": "1261",
        "name": "\u5b8f\u9060-\u6c11\u751f",
        "phone": "02-27657933"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.20264341500024,
          24.967683409334388
        ]
      },
      "properties": {
        "id": "1262",
        "name": "\u5b8f\u9060-\u6843\u5712",
        "phone": "03-3463456"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.20273029005736,
          22.99233203740094
        ]
      },
      "properties": {
        "id": "126D",
        "name": "\u5b8f\u9060-\u53f0\u5357",
        "phone": "06-2236677"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.48291408703106,
          25.00131097700671
        ]
      },
      "properties": {
        "id": "126H",
        "name": "\u5b8f\u9060-\u4e2d\u548c",
        "phone": "02-32346738"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.64630481795642,
          24.15414050969535
        ]
      },
      "properties": {
        "id": "126L",
        "name": "\u5b8f\u9060-\u53f0\u4e2d",
        "phone": "04-2255-6687"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.31371491262017,
          22.630301980875437
        ]
      },
      "properties": {
        "id": "126Q",
        "name": "\u5b8f\u9060-\u9ad8\u96c4",
        "phone": "07-2299788"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51136020869055,
          25.04467234443559
        ]
      },
      "properties": {
        "id": "126U",
        "name": "\u5b8f\u9060-\u9928\u524d",
        "phone": "02-23121122"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30845491756992,
          23.03593096673082
        ]
      },
      "properties": {
        "id": "126X",
        "name": "\u5b8f\u9060-\u65b0\u5316",
        "phone": "03-5102233"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.45108742668148,
          24.995064535148288
        ]
      },
      "properties": {
        "id": "2181",
        "name": "\u4e9e\u6771-\u677f\u6a4b",
        "phone": "02-89665558"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.01286162348,
          24.7852175964651
        ]
      },
      "properties": {
        "id": "2184",
        "name": "\u4e9e\u6771-\u65b0\u7af9",
        "phone": "03-5641818"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.22162554325557,
          22.985519366879632
        ]
      },
      "properties": {
        "id": "2185",
        "name": "\u4e9e\u6771-\u53f0\u5357",
        "phone": "06-2358999"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.30512137628241,
          22.61333762243996
        ]
      },
      "properties": {
        "id": "2186",
        "name": "\u4e9e\u6771-\u9ad8\u96c4",
        "phone": "07-5369888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.68375742175336,
          24.141871905883995
        ]
      },
      "properties": {
        "id": "2187",
        "name": "\u4e9e\u6771-\u53f0\u4e2d",
        "phone": "04-22218600"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51309587724606,
          25.043075555632672
        ]
      },
      "properties": {
        "id": "218A",
        "name": "\u4e9e\u6771-\u570b\u969b\u8b49\u5238",
        "phone": "23618600"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.20402113257764,
          23.00375477914528
        ]
      },
      "properties": {
        "id": "5058",
        "name": "\u5927\u5c55-\u53f0\u5357",
        "phone": "06-2208866"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.52367473697689,
          25.049246066329747
        ]
      },
      "properties": {
        "id": "5112",
        "name": "\u5bcc\u9686-\u9577\u5b89",
        "phone": "02-25631168"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.47547887366542,
          25.081490869215976
        ]
      },
      "properties": {
        "id": "5261",
        "name": "\u5927\u6176-\u8606\u6d32",
        "phone": "02-22897766"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.14713887966792,
          24.907989183750434
        ]
      },
      "properties": {
        "id": "5262",
        "name": "\u5927\u6176-\u694a\u6885",
        "phone": "03-4759977"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.43273544806505,
          25.06140485790723
        ]
      },
      "properties": {
        "id": "5263",
        "name": "\u5927\u6176-\u6cf0\u5c71",
        "phone": "02-22966688"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.29995803297672,
          22.6180418487384
        ]
      },
      "properties": {
        "id": "5264",
        "name": "\u5927\u6176-\u9ad8\u96c4",
        "phone": "07-3312288"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.82395733068195,
          24.565902960893695
        ]
      },
      "properties": {
        "id": "5265",
        "name": "\u5927\u6176-\u82d7\u6817",
        "phone": "037-262888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.20665210622606,
          24.956619656393652
        ]
      },
      "properties": {
        "id": "5266",
        "name": "\u5927\u6176-\u4e2d\u58e2",
        "phone": "03-4912588"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.7452648092833,
          25.131951261037777
        ]
      },
      "properties": {
        "id": "5267",
        "name": "\u5927\u6176-\u57fa\u9686",
        "phone": "02-24281122"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.19654067148683,
          23.001448188353876
        ]
      },
      "properties": {
        "id": "5268",
        "name": "\u5927\u6176-\u53f0\u5357",
        "phone": "06-2232233"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.68565611605638,
          24.171121900936875
        ]
      },
      "properties": {
        "id": "5269",
        "name": "\u5927\u6176-\u53f0\u4e2d",
        "phone": "04-22375888"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.51192366259788,
          24.990736483988634
        ]
      },
      "properties": {
        "id": "526A",
        "name": "\u5927\u6176-\u4e2d\u548c",
        "phone": "02-89411188"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.49748640397175,
          25.067048077165925
        ]
      },
      "properties": {
        "id": "526K",
        "name": "\u5927\u6176-\u5bcc\u9806",
        "phone": "02-29818889"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.46148067540592,
          25.084623639808186
        ]
      },
      "properties": {
        "id": "526L",
        "name": "\u5927\u6176-\u9577\u69ae",
        "phone": "02-82828289"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.21744862230926,
          24.859226111192125
        ]
      },
      "properties": {
        "id": "5321",
        "name": "\u9ad8\u6a4b-\u9f8d\u6f6d",
        "phone": "03-4807889"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.2229708455698,
          24.957053320201805
        ]
      },
      "properties": {
        "id": "5322",
        "name": "\u9ad8\u6a4b-\u4e2d\u58e2",
        "phone": "03-4270889"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.25635682087655,
          24.973402789288105
        ]
      },
      "properties": {
        "id": "5323",
        "name": "\u9ad8\u6a4b-\u5167\u58e2",
        "phone": "03-4351010"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.57982188951588,
          23.96310714236305
        ]
      },
      "properties": {
        "id": "5381",
        "name": "\u7b2c\u4e00\u91d1-\u54e1\u6797",
        "phone": "04-8339000"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.66367629618756,
          24.146974902856645
        ]
      },
      "properties": {
        "id": "5382",
        "name": "\u7b2c\u4e00\u91d1-\u53f0\u4e2d",
        "phone": "04-23103648"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.33264111605224,
          22.628160096230733
        ]
      },
      "properties": {
        "id": "5383",
        "name": "\u7b2c\u4e00\u91d1-\u9ad8\u96c4",
        "phone": "07-7262378"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          120.21105304066558,
          22.98608682149472
        ]
      },
      "properties": {
        "id": "5384",
        "name": "\u7b2c\u4e00\u91d1-\u53f0\u5357",
        "phone": "06-2148111"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          121.3096701383553,
          24.990159572682227
        ]
      },
      "properties": {
        "id": "5385",
        "name": "\u7b2c\u4e00\u91d1-\u6843\u5712",
        "phone": "03-3382881"
    
Download .txt
gitextract_2u5gsm6r/

├── .github/
│   └── ISSUE_TEMPLATE/
│       ├── bug_report.md
│       ├── feature_request.md
│       ├── improvement.md
│       └── magic.md
├── .gitignore
├── .pylintrc
├── LICENSE
├── PULL_REQUEST_TEMPLATE.md
├── README.md
├── README.rst
├── bin/
│   ├── busm_failed.py
│   ├── fin_schedule.py
│   ├── getnews.sh
│   ├── publish.py
│   └── weekly.py
├── docs/
│   ├── SOUP_NOTES.md
│   └── changelog.md
├── postman/
│   ├── TWMKT.postman_collection.json
│   └── TWMKT.postman_environment.json
├── requirements.txt
├── setup.py
└── twnews/
    ├── __init__.py
    ├── __main__.py
    ├── cache.py
    ├── common.py
    ├── conf/
    │   ├── logging.yaml
    │   ├── news-soup.json
    │   └── usage.txt
    ├── exceptions.py
    ├── finance/
    │   ├── __init__.py
    │   ├── broker.py
    │   ├── tdcc.py
    │   ├── ticksmap.py
    │   ├── tpex.py
    │   └── twse.py
    ├── res/
    │   ├── ticksmap-captcha.xrc
    │   ├── ticksmap-locations.geojson
    │   └── ticksmap-template.html
    ├── samples/
    │   ├── appledaily.html.xz
    │   ├── chinatimes.html.xz
    │   ├── cna.html.xz
    │   ├── ettoday.html.xz
    │   ├── ltn.html.xz
    │   ├── setn.html.xz
    │   └── udn.html.xz
    ├── search.py
    ├── soup.py
    └── tests/
        ├── __init__.py
        ├── search/
        │   ├── __init__.py
        │   ├── test_search_appledaily.py
        │   ├── test_search_cna.py
        │   ├── test_search_ettoday.py
        │   ├── test_search_ltn.py
        │   ├── test_search_setn.py
        │   └── test_search_udn.py
        └── soup/
            ├── __init__.py
            ├── test_soup_appledaily.py
            ├── test_soup_chinatimes.py
            ├── test_soup_cna.py
            ├── test_soup_common.py
            ├── test_soup_ettoday.py
            ├── test_soup_ltn.py
            ├── test_soup_setn.py
            └── test_soup_udn.py
Download .txt
SYMBOL INDEX (199 symbols across 30 files)

FILE: bin/busm_failed.py
  function main (line 15) | def main():

FILE: bin/fin_schedule.py
  class JustDaemon (line 17) | class JustDaemon:
    method __init__ (line 23) | def __init__(self, init_task=None, loop_task=None, stdout='/dev/null',...
    method do_nothing (line 34) | def do_nothing(self):
    method on_quit (line 37) | def on_quit(self, signum, frame):
    method listen_sys_signals (line 40) | def listen_sys_signals(self):
    method stream_redirect (line 51) | def stream_redirect(self):
    method stream_flush (line 58) | def stream_flush(self):
    method stream_close (line 63) | def stream_close(self):
    method pidfile_create (line 68) | def pidfile_create(self):
    method pidfile_remove (line 77) | def pidfile_remove(self):
    method run (line 81) | def run(self):
  class ScheduleDaemon (line 100) | class ScheduleDaemon(JustDaemon):
    method __init__ (line 106) | def __init__(self, schedule_table, stdout='/dev/null', stderr='/dev/nu...
    method init_task (line 116) | def init_task(self, attrs):
    method loop_task (line 133) | def loop_task(self, attrs):
    method run_parallel (line 136) | def run_parallel(self, func, args):
  function im_fine (line 140) | def im_fine():
  function tpe_at (line 144) | def tpe_at(timestr):
  function main (line 150) | def main():

FILE: bin/publish.py
  function get_wheel (line 14) | def get_wheel():
  function get_latest_python (line 30) | def get_latest_python():
  function test_in_virtualenv (line 58) | def test_in_virtualenv(pyver, wheel):
  function wheel_check (line 81) | def wheel_check():
  function upload_to_pypi (line 117) | def upload_to_pypi(test=False):
  function main (line 147) | def main():

FILE: twnews/__init__.py
  function _replaced_print (line 20) | def _replaced_print(*objects, sep=' ', end='\n', file=sys.stdout, flush=...

FILE: twnews/__main__.py
  function soup (line 13) | def soup(path):
  function search_and_list (line 34) | def search_and_list(keyword, channel):
  function search_and_soup (line 52) | def search_and_soup(keyword, channel):
  function search_and_compare_performance (line 72) | def search_and_compare_performance(keyword):
  function compare_keyword (line 108) | def compare_keyword(keyword):
  function usage (line 141) | def usage():
  function get_cmd_param (line 152) | def get_cmd_param(index, default=None):
  function main (line 160) | def main():

FILE: twnews/cache.py
  class DateCache (line 9) | class DateCache:
    method __init__ (line 14) | def __init__(self, category, item, data_format):
    method get_path (line 22) | def get_path(self, datestr):
    method has (line 31) | def has(self, datestr):
    method load (line 38) | def load(self, datestr):
    method save (line 51) | def save(self, datestr, content):

FILE: twnews/common.py
  function found_socks5 (line 29) | def found_socks5():
  function get_package_dir (line 44) | def get_package_dir():
  function get_logger (line 50) | def get_logger(name='news'):
  function get_session (line 77) | def get_session(proxy_first):
  function get_all_conf (line 113) | def get_all_conf():
  function detect_channel (line 126) | def detect_channel(path):
  function get_channel_conf (line 136) | def get_channel_conf(channel, action=None):
  function get_cache_dir (line 149) | def get_cache_dir(category):

FILE: twnews/exceptions.py
  class SyncException (line 5) | class SyncException(Exception):
    method __init__ (line 9) | def __init__(self, reason):
  class NetworkException (line 13) | class NetworkException(SyncException):
  class InvalidDataException (line 18) | class InvalidDataException(SyncException):

FILE: twnews/finance/__init__.py
  function get_connection (line 92) | def get_connection(rebuild=False):
  function get_argument (line 120) | def get_argument(index, default=''):
  function fucking_get (line 128) | def fucking_get(hook, url, params):

FILE: twnews/finance/broker.py
  class TradingLocation (line 31) | class TradingLocation(Base):
  function date_conv (line 45) | def date_conv(hw_date):
  function visit_branch (line 67) | def visit_branch(session, parent_id):
  function visit_parent (line 100) | def visit_parent(session):
  function geocode_single (line 127) | def geocode_single(address, req_session, sid, transformer):
  function geocode (line 158) | def geocode(orm_session):
  function main (line 229) | def main():

FILE: twnews/finance/tdcc.py
  function import_dist (line 15) | def import_dist(csv_date='latest'):
  function rebuild_dist (line 81) | def rebuild_dist():
  function backup_dist (line 107) | def backup_dist(refresh=False):
  function sync_dataset (line 136) | def sync_dataset():
  function main (line 148) | def main():

FILE: twnews/finance/ticksmap.py
  class CaptchaDialog (line 18) | class CaptchaDialog(wx.App):
    method __init__ (line 23) | def __init__(self, captcha_stream):
    method OnInit (line 28) | def OnInit(self):
    method OnKeyPress (line 52) | def OnKeyPress(self, event):
  function handle_captcha (line 60) | def handle_captcha(captcha_stream):
  function load_soup (line 68) | def load_soup(security_id, date_str):
  function parse_tick_node (line 159) | def parse_tick_node(row):
  function main (line 172) | def main():

FILE: twnews/finance/tpex.py
  function download_margin (line 21) | def download_margin(datestr):
  function download_block (line 38) | def download_block(datestr):
  function download_institution (line 55) | def download_institution(datestr):
  function download_selled (line 79) | def download_selled(datestr):
  function import_margin (line 115) | def import_margin(dbcon, trading_date, dataset):
  function import_block (line 139) | def import_block(dbcon, trading_date, dataset):
  function import_institution (line 184) | def import_institution(dbcon, trading_date, dataset):
  function import_selled (line 208) | def import_selled(dbcon, trading_date, dataset):
  function sync_dataset (line 223) | def sync_dataset(dsitem, trading_date='latest'):
  function main (line 294) | def main():

FILE: twnews/finance/twse.py
  function download_margin (line 19) | def download_margin(datestr):
  function download_block (line 40) | def download_block(datestr):
  function download_institution (line 61) | def download_institution(datestr):
  function download_borrowed (line 80) | def download_borrowed(datestr):
  function download_selled (line 107) | def download_selled(datestr):
  function download_etfnet (line 127) | def download_etfnet(datestr):
  function import_margin (line 141) | def import_margin(dbcon, trading_date, dataset):
  function import_block (line 164) | def import_block(dbcon, trading_date, dataset):
  function import_institution (line 200) | def import_institution(dbcon, trading_date, dataset):
  function import_borrowed (line 221) | def import_borrowed(dbcon, trading_date, dataset):
  function import_selled (line 247) | def import_selled(dbcon, trading_date, dataset):
  function import_etfnet (line 266) | def import_etfnet(dbcon, trading_date, dataset):
  function sync_dataset (line 289) | def sync_dataset(dsitem, trading_date='latest'):
  function main (line 350) | def main():

FILE: twnews/search.py
  function visit_dict (line 17) | def visit_dict(dict_node, path):
  function filter_duplicated (line 29) | def filter_duplicated(results):
  class NewsSearchException (line 52) | class NewsSearchException(Exception):
  class NewsSearch (line 57) | class NewsSearch:
    method __init__ (line 62) | def __init__(self, channel, limit=25, beg_date=None, end_date=None, pr...
    method by_keyword (line 122) | def by_keyword(self, keyword, title_only=False):
    method to_dict_list (line 186) | def to_dict_list(self):
    method to_soup_list (line 192) | def to_soup_list(self):
    method elapsed (line 202) | def elapsed(self):
    method pages (line 208) | def pages(self):
    method __flip_to_end_date (line 214) | def __flip_to_end_date(self, keyword):
    method __load_page (line 299) | def __load_page(self, keyword, page):
    method __result_nodes (line 334) | def __result_nodes(self):
    method __parse_title_node (line 344) | def __parse_title_node(self, result_node):
    method __parse_date_node (line 356) | def __parse_date_node(self, result_node):
    method __parse_link_node (line 377) | def __parse_link_node(self, result_node):

FILE: twnews/soup.py
  function get_cache_filepath (line 20) | def get_cache_filepath(channel, uri):
  function url_follow_redirection (line 28) | def url_follow_redirection(url, proxy_first):
  function url_force_https (line 70) | def url_force_https(url):
  function url_force_ltn_mobile (line 88) | def url_force_ltn_mobile(url):
  function soup_from_website (line 107) | def soup_from_website(url, channel, refresh, proxy_first):
  function soup_from_file (line 149) | def soup_from_file(file_path):
  function scan_author (line 170) | def scan_author(article):
  class NewsSoup (line 202) | class NewsSoup:
    method __init__ (line 209) | def __init__(self, path, refresh=False, proxy_first=False):
    method __get_soup (line 251) | def __get_soup(self):
    method title (line 281) | def title(self):
    method date_raw (line 306) | def date_raw(self):
    method date (line 334) | def date(self):
    method author (line 361) | def author(self):
    method contents (line 406) | def contents(self, limit=0):
    method effective_text_rate (line 434) | def effective_text_rate(self):

FILE: twnews/tests/search/test_search_appledaily.py
  class TestAppleDaily (line 9) | class TestAppleDaily(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/search/test_search_cna.py
  class TestCna (line 9) | class TestCna(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/search/test_search_ettoday.py
  class TestEttoday (line 9) | class TestEttoday(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/search/test_search_ltn.py
  class TestLtn (line 9) | class TestLtn(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/search/test_search_setn.py
  class TestSetn (line 9) | class TestSetn(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/search/test_search_udn.py
  class TestUdn (line 9) | class TestUdn(unittest.TestCase):
    method setUp (line 14) | def setUp(self):
    method test_01_filter_title (line 18) | def test_01_filter_title(self):
    method test_02_search_and_soup (line 27) | def test_02_search_and_soup(self):

FILE: twnews/tests/soup/test_soup_appledaily.py
  class TestAppleDaily (line 12) | class TestAppleDaily(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 20) | def test_01_sample(self):
    method test_02_mobile (line 32) | def test_02_mobile(self):
    method test_03_layouts (line 44) | def test_03_layouts(self):

FILE: twnews/tests/soup/test_soup_chinatimes.py
  class TestChinatimes (line 12) | class TestChinatimes(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 21) | def test_01_sample(self):
    method test_02_mobile (line 33) | def test_02_mobile(self):

FILE: twnews/tests/soup/test_soup_cna.py
  class TestCna (line 12) | class TestCna(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 21) | def test_01_sample(self):
    method test_02_mobile (line 33) | def test_02_mobile(self):

FILE: twnews/tests/soup/test_soup_common.py
  class TestCommon (line 14) | class TestCommon(unittest.TestCase):
    method setUp (line 19) | def setUp(self):
    method test_01_cache (line 23) | def test_01_cache(self):
    method test_02_soup_from_website (line 46) | def test_02_soup_from_website(self):
    method test_03_soup_from_file (line 87) | def test_03_soup_from_file(self):
    method test_04_author_in_contents (line 143) | def test_04_author_in_contents(self):
    method test_05_author_in_node (line 193) | def test_05_author_in_node(self):

FILE: twnews/tests/soup/test_soup_ettoday.py
  class TestEttoday (line 11) | class TestEttoday(unittest.TestCase):
    method setUp (line 16) | def setUp(self):
    method test_01_sample (line 19) | def test_01_sample(self):
    method test_02_mobile (line 31) | def test_02_mobile(self):
    method test_03_layouts (line 43) | def test_03_layouts(self):

FILE: twnews/tests/soup/test_soup_ltn.py
  class TestLtn (line 12) | class TestLtn(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 20) | def test_01_sample(self):
    method test_02_mobile (line 32) | def test_02_mobile(self):
    method test_03_layouts (line 44) | def test_03_layouts(self):
    method test_04_switch_mobile (line 141) | def test_04_switch_mobile(self):
    method test_05_multiple_date_format (line 153) | def test_05_multiple_date_format(self):
    method test_06_http_redirect (line 165) | def test_06_http_redirect(self):

FILE: twnews/tests/soup/test_soup_setn.py
  class TestSetn (line 12) | class TestSetn(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 21) | def test_01_sample(self):
    method test_02_mobile (line 33) | def test_02_mobile(self):
    method test_03_layouts (line 44) | def test_03_layouts(self):

FILE: twnews/tests/soup/test_soup_udn.py
  class TestUdn (line 12) | class TestUdn(unittest.TestCase):
    method setUp (line 17) | def setUp(self):
    method test_01_sample (line 21) | def test_01_sample(self):
    method test_02_mobile (line 33) | def test_02_mobile(self):
    method test_03_layouts (line 44) | def test_03_layouts(self):
Condensed preview — 64 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (566K chars).
[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 557,
    "preview": "---\nname: Bug report\nabout: 回報錯誤\ntitle: 摘要一下問題吧\nlabels: bug\nassignees: virus-warnning\n\n---\n\n**問題描述**\n\n盡可能清楚地描述狀況,讓我觀落陰不會"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 215,
    "preview": "---\nname: Feature request\nabout: 許個願望\ntitle: 想要一個神奇的功能\nlabels: enhancement\nassignees: virus-warnning\n\n---\n\n**許願動機**\n\n描述一"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/improvement.md",
    "chars": 189,
    "preview": "---\nname: Improvement\nabout: 現有功能改善\ntitle: OOO效能改善\nlabels: improvement\nassignees: virus-warnning\n\n---\n\n**問題摘要**\n\n描述一下現有功"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/magic.md",
    "chars": 168,
    "preview": "---\nname: Magic\nabout: 變個魔法 (不提供訪客使用)\ntitle: 變個魔法\nlabels: magic\nassignees: virus-warnning\n\n---\n\n**咒語內容**\n\n希望機器人可以自己發現八卦,"
  },
  {
    "path": ".gitignore",
    "chars": 150,
    "preview": "# Python 3\n*.pyc\n__pycache__\n\n# Testing report\n.coverage\n\n# setup.py\ndist\nbuild\ntwnews.egg-info\n\n# virtualenv\nsandbox\n\n#"
  },
  {
    "path": ".pylintrc",
    "chars": 17520,
    "preview": "[MASTER]\n\n# A comma-separated list of package or module names from where C extensions may\n# be loaded. Extensions are lo"
  },
  {
    "path": "LICENSE",
    "chars": 1073,
    "preview": "MIT License\n\nCopyright (c) 2018 Raymond Wu (小璋丸)\n\nPermission is hereby granted, free of charge, to any person obtaining "
  },
  {
    "path": "PULL_REQUEST_TEMPLATE.md",
    "chars": 110,
    "preview": "**目的與用途**\n\n描述一下這個 PR 的目的與用途\n\n**被刪除的功能**\n\n如果有刪除某些功能,請說明刪除原因以及效益\n\n**被刪除的測試程式**\n\n如果有刪除測試程式,請說明為何不再需要測試,以及對應的替代方案\n"
  },
  {
    "path": "README.md",
    "chars": 2648,
    "preview": "台灣新聞拆拆樂的用來蒐集或處理大眾感興趣的資訊,目前著重在新聞和市場情報,希望能藉由這個工具讓其他人變出各種好玩的衍生應用,目前提供這些功能:\n\n* 將新聞網頁拆出標題,內文,報導時間,記者\n* 關鍵字查新聞,並且可接續拆新聞或新聞數\n* "
  },
  {
    "path": "README.rst",
    "chars": 4243,
    "preview": "台灣新聞拆拆樂 (twnews) 用來分解台灣各大新聞網站,取出重要的純文字內容\n\n功能\n====\n\n- 支援蘋果日報、中時電子報、中央社、東森新聞雲、自由時報、三立新聞網、聯合新聞網\n- 使用行動版網頁與快取機制節省流量\n- 盡可能找出記"
  },
  {
    "path": "bin/busm_failed.py",
    "chars": 731,
    "preview": "import os\nimport sys\nimport threading\n\nsys.path.append(os.path.realpath('.'))\n\nimport twnews.common as cm\n\n# 這一行搭配 logge"
  },
  {
    "path": "bin/fin_schedule.py",
    "chars": 5938,
    "preview": "import logging\nimport os\nimport schedule\nimport signal\nimport sys\nimport threading\nimport time\nfrom datetime import date"
  },
  {
    "path": "bin/getnews.sh",
    "chars": 1187,
    "preview": "# 製作新聞離線範本\nSAMPLE_DIR='twnews/samples'\nUSER_AGENT='Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWe"
  },
  {
    "path": "bin/publish.py",
    "chars": 4252,
    "preview": "#!/usr/bin/env python3\n#\n# 必要工具套件:\n# * pylint\n# * rstcheck\n# * setuptools\n\nimport configparser\nimport os\nimport re\nimpor"
  },
  {
    "path": "bin/weekly.py",
    "chars": 1875,
    "preview": "#!/usr/bin/env python3\n\nimport json\nimport os\nimport os.path\nimport re\n\nimport smtplib\nfrom email.header import Header\nf"
  },
  {
    "path": "docs/SOUP_NOTES.md",
    "chars": 1362,
    "preview": "# 新聞網站技術細節\n\n### 新聞分解\n\n頻道 | RWD | 轉址 | 記者欄\n---- | ---- | ---- | ----\n蘋果 | 半殘 | - | X\n中時 | 完整 | - | O\n中央社 | 完整 | - | X\n東森新"
  },
  {
    "path": "docs/changelog.md",
    "chars": 1771,
    "preview": "## 0.3.x\n\n0.3 的開發主軸是財經資料蒐集,針對證交所,櫃買中心,集保中心,公開資訊觀測站等地方取得有價值資料,並且匯入到 SQLite 方便使用。\n\n### 0.3.3\n\n* 櫃買中心三大法人,融資融券,鉅額交易 - #67\n*"
  },
  {
    "path": "postman/TWMKT.postman_collection.json",
    "chars": 13581,
    "preview": "{\n\t\"info\": {\n\t\t\"_postman_id\": \"96e57888-4ea0-406b-9f9b-774e3cf3896f\",\n\t\t\"name\": \"0xFE-TWMKT\",\n\t\t\"schema\": \"https://schem"
  },
  {
    "path": "postman/TWMKT.postman_environment.json",
    "chars": 459,
    "preview": "{\n\t\"id\": \"e09139ab-0991-4315-82ef-93998a367e54\",\n\t\"name\": \"TWMKT\",\n\t\"values\": [\n\t\t{\n\t\t\t\"key\": \"TWSE_EXFORMAT\",\n\t\t\t\"value"
  },
  {
    "path": "requirements.txt",
    "chars": 139,
    "preview": "beautifulsoup4>=4.7.1\nbusm>=0.9.0\nlxml>=4.3.3\nrequests>=2.21.0\npandas>=0.24.2\nPyYAML>=5.1.2\n# 只有 ticksmap 需要用到 wxWidget "
  },
  {
    "path": "setup.py",
    "chars": 1093,
    "preview": "from setuptools import setup, find_packages\nimport twnews.common as common\n\n# Load reStructedText description.\n# Online "
  },
  {
    "path": "twnews/__init__.py",
    "chars": 1010,
    "preview": "\"\"\"\ntwnews 套件載入前作業,用來解決 Windows 環境會發生的編碼問題\n\"\"\"\n\nimport sys\nimport locale\nimport _locale\n\n# 以下是魔法不要亂改\nif locale.getprefer"
  },
  {
    "path": "twnews/__main__.py",
    "chars": 5525,
    "preview": "\"\"\"\n工具程式\n\"\"\"\n\nimport sys\nimport locale\nimport os.path\nfrom datetime import datetime\nfrom twnews.common import get_logger"
  },
  {
    "path": "twnews/cache.py",
    "chars": 1431,
    "preview": "\"\"\"\n快取處理模組\n\"\"\"\n\nimport os\nimport lzma\nimport json\n\nclass DateCache:\n    \"\"\"\n    處理日期命名快取\n    \"\"\"\n\n    def __init__(self,"
  },
  {
    "path": "twnews/common.py",
    "chars": 4082,
    "preview": "\"\"\"\ntwnews 共用項目\n\"\"\"\n\nimport json\nimport os\nimport os.path\nimport logging\nimport logging.config\nimport socket\nimport yaml"
  },
  {
    "path": "twnews/conf/logging.yaml",
    "chars": 1226,
    "preview": "version: 1\ndisable_existing_loggers: true\nformatters:\n  standard:\n    datefmt: '%Y-%m-%d %H:%M:%S'\n    format: '[%(ascti"
  },
  {
    "path": "twnews/conf/news-soup.json",
    "chars": 14483,
    "preview": "{\n  \"what's that\": {\n    \"name\": \"頻道名稱\",\n    \"layout_list\": [\n      { \"name\": \"特殊排版名稱\", \"prefix\": \"特殊排版網址\" }\n    ],\n    "
  },
  {
    "path": "twnews/conf/usage.txt",
    "chars": 573,
    "preview": "用法:\n  python3 -m twnews 動作\n\n動作列表:\n  soup   [路徑]           分解一則新聞,路徑可以是新聞網址或檔案路徑\n  search [關鍵字] [頻道]  關鍵字搜尋新聞,取10筆,列出搜尋結果"
  },
  {
    "path": "twnews/exceptions.py",
    "chars": 313,
    "preview": "\"\"\"\n例外模組\n\"\"\"\n\nclass SyncException(Exception):\n    \"\"\"\n    同步資料時觸發的例外\n    \"\"\"\n    def __init__(self, reason):\n        sup"
  },
  {
    "path": "twnews/finance/__init__.py",
    "chars": 3303,
    "preview": "\"\"\"\n財經資料蒐集工具共用模組\n\"\"\"\n\nimport os\nimport sqlite3\nimport sys\n\nfrom requests.exceptions import RequestException\n\nimport twne"
  },
  {
    "path": "twnews/finance/broker.py",
    "chars": 6679,
    "preview": "\"\"\"\n券商與分公司資訊與門牌定位\n\n已知問題:\n* 有筆紀錄無法成功定位\n  1. 8770 大鼎\n  2. 1115 台灣企銀-太平\n  3. 779m 國票-中港\n  4. 9636 富邦-中壢\n  這 4 筆用工人智慧取經緯度,其餘"
  },
  {
    "path": "twnews/finance/tdcc.py",
    "chars": 4307,
    "preview": "\"\"\"\n集保中心資料蒐集模組\n\"\"\"\n\nimport os\nimport re\nimport sqlite3\n\nimport pandas\n\nimport twnews.common as common\nfrom twnews.financ"
  },
  {
    "path": "twnews/finance/ticksmap.py",
    "chars": 8952,
    "preview": "# pylint: disable=all\n\nimport io\nimport json\nimport os\nimport re\nimport requests\nimport sys\nimport subprocess\nfrom datet"
  },
  {
    "path": "twnews/finance/tpex.py",
    "chars": 8927,
    "preview": "\"\"\"\n櫃買中心資料蒐集模組\n\"\"\"\n\nfrom datetime import datetime\nimport io\nimport re\nimport sqlite3\nimport sys\nimport time\n\nimport pand"
  },
  {
    "path": "twnews/finance/twse.py",
    "chars": 10797,
    "preview": "\"\"\"\n證交所資料蒐集模組\n\"\"\"\n\nfrom datetime import datetime\nimport io\nimport re\nimport sqlite3\nimport sys\nimport time\n\nimport panda"
  },
  {
    "path": "twnews/res/ticksmap-captcha.xrc",
    "chars": 1127,
    "preview": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resource version=\"2.9.0.5\">\n  <object name=\"main_frame\" class=\"wxFrame\">\n    <ti"
  },
  {
    "path": "twnews/res/ticksmap-locations.geojson",
    "chars": 281125,
    "preview": "{\n  \"type\": \"FeatureCollection\",\n  \"features\": [\n    {\n      \"type\": \"Feature\",\n      \"geometry\": {\n        \"type\": \"Poi"
  },
  {
    "path": "twnews/res/ticksmap-template.html",
    "chars": 7157,
    "preview": "<!DOCTYPE html>\n<html>\n<head>\n<title>分點進出買賣對照</title>\n<script src=\"https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.5.1/"
  },
  {
    "path": "twnews/search.py",
    "chars": 12930,
    "preview": "\"\"\"\n新聞搜尋模組\n\"\"\"\n\nimport re\nimport time\nimport urllib.parse\nfrom string import Template\nfrom datetime import datetime\n\nfro"
  },
  {
    "path": "twnews/soup.py",
    "chars": 13915,
    "preview": "\"\"\"\n新聞分解模組\n\"\"\"\n\nimport io\nimport re\nimport copy\nimport lzma\nimport hashlib\nimport os\nimport os.path\nfrom datetime import"
  },
  {
    "path": "twnews/tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "twnews/tests/search/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "twnews/tests/search/test_search_appledaily.py",
    "chars": 1019,
    "preview": "\"\"\"\n蘋果日報搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestAppleDaily(unittest.Te"
  },
  {
    "path": "twnews/tests/search/test_search_cna.py",
    "chars": 792,
    "preview": "\"\"\"\n中央社搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestCna(unittest.TestCase):"
  },
  {
    "path": "twnews/tests/search/test_search_ettoday.py",
    "chars": 879,
    "preview": "\"\"\"\n東森新聞雲搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestEttoday(unittest.Test"
  },
  {
    "path": "twnews/tests/search/test_search_ltn.py",
    "chars": 796,
    "preview": "\"\"\"\n自由時報搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestLtn(unittest.TestCase)"
  },
  {
    "path": "twnews/tests/search/test_search_setn.py",
    "chars": 802,
    "preview": "\"\"\"\n三立新聞網搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestSetn(unittest.TestCas"
  },
  {
    "path": "twnews/tests/search/test_search_udn.py",
    "chars": 797,
    "preview": "\"\"\"\n聯合新聞網搜尋測試\n\"\"\"\n\nimport unittest\nfrom twnews.search import NewsSearch\n\n#@unittest.skip\nclass TestUdn(unittest.TestCase"
  },
  {
    "path": "twnews/tests/soup/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "twnews/tests/soup/test_soup_appledaily.py",
    "chars": 2210,
    "preview": "\"\"\"\n蘋果日報分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSou"
  },
  {
    "path": "twnews/tests/soup/test_soup_chinatimes.py",
    "chars": 1265,
    "preview": "\"\"\"\n中時電子報分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSo"
  },
  {
    "path": "twnews/tests/soup/test_soup_cna.py",
    "chars": 1226,
    "preview": "\"\"\"\n中央社分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSoup"
  },
  {
    "path": "twnews/tests/soup/test_soup_common.py",
    "chars": 7512,
    "preview": "\"\"\"\n分解共用程式測試\n\n為了節省爬蟲開發時間,原則上 NewsSoup 盡量不丟例外,有問題以 logging 機制為主\n\"\"\"\n\n# pylint: disable=wildcard-import,unused-wildcard-im"
  },
  {
    "path": "twnews/tests/soup/test_soup_ettoday.py",
    "chars": 4348,
    "preview": "\"\"\"\n東森新聞雲分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSo"
  },
  {
    "path": "twnews/tests/soup/test_soup_ltn.py",
    "chars": 6804,
    "preview": "\"\"\"\n自由時報單元分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsS"
  },
  {
    "path": "twnews/tests/soup/test_soup_setn.py",
    "chars": 2059,
    "preview": "\"\"\"\n三立新聞網分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSo"
  },
  {
    "path": "twnews/tests/soup/test_soup_udn.py",
    "chars": 4546,
    "preview": "\"\"\"\n聯合新聞網分解測試\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport unittest\nimport twnews.common\nfrom twnews.soup import NewsSo"
  }
]

// ... and 7 more files (download for full content)

About this extraction

This page contains the full source code of the virus-warnning/twnews GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 64 files (470.9 KB), approximately 154.2k tokens, and a symbol index with 199 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Copied to clipboard!