Repository: AnuoF/TranslateTool
Branch: master
Commit: 4dd07c4332c4
Files: 21
Total size: 57.2 KB

Directory structure:
gitextract_edc4_0lw/

├── .gitignore
├── .vscode/
│   └── settings.json
├── README.md
└── src/
    └── TranslateTool/
        ├── .vs/
        │   └── TranslateTool/
        │       ├── v15/
        │       │   └── .suo
        │       └── v16/
        │           └── .suo
        ├── TranslateTool/
        │   ├── Doc_In/
        │   │   └── I have a dream.txt
        │   ├── Logger.py
        │   ├── Main.py
        │   ├── Py4Js.py
        │   ├── T_Base.py
        │   ├── T_Docx.py
        │   ├── T_Pdf_PyPDF2.py
        │   ├── T_Pdf_pdfminer.py
        │   ├── T_Txt.py
        │   ├── TranslateFunc.py
        │   ├── TranslateFuncT.py
        │   ├── TranslateTool.pyproj
        │   ├── log/
        │   │   └── log.txt
        │   └── testdata/
        │       ├── I have a dream.docx
        │       └── I have a dream.txt
        └── TranslateTool.sln

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/


================================================
FILE: .vscode/settings.json
================================================
{
    "python.pythonPath": "C:\\Anaconda3\\envs\\tfenv\\python.exe"
}

================================================
FILE: README.md
================================================
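# Automatic Document Translation in Python #
## Purpose ##
This article describes the requirements analysis and implementation of automatic document translation in Python. 

## Requirements Analysis ##
At work I often need to read English documents. My English is not great, so I sometimes fall back on Google or Baidu Translate, but the documents tend to be long and copying and pasting is tedious. That gave me an idea: could a document be translated automatically and the result written to a new document? The idea seemed feasible, so I rolled up my sleeves and got to work.  
### Why Python ###
Why Python:  
1) Because life is short;  
2) Because I happened to be learning AI.  

Enough preamble; on to the main topic.  
### Functional Requirements ###

· The user can choose Google, Baidu, or Youdao translation; other interfaces may be added later;  
· PDF, Word, TXT, and other document types are translated, with the program detecting the document type automatically;  
· The translated output can be written as PDF, Word, or TXT (not implemented);  
· A log file is kept.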

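The first requirement, letting the user choose a translation service, can be sketched as a small lookup table. This is only an illustration under assumptions: `pick_translator` and the `backends` dictionary are hypothetical names that do not appear in the repository, and the stand-in functions below do not call any real service.

```python
def pick_translator(name, backends):
    '''Return the translation function registered under `name` (case-insensitive).'''
    try:
        return backends[name.lower()]
    except KeyError:
        raise ValueError('unknown translation backend: ' + name)


# Stand-in functions (assumption); real ones would call the Google,
# Baidu, or Youdao interfaces described later in this README.
backends = {
    'google': lambda text: '[google] ' + text,
    'baidu': lambda text: '[baidu] ' + text,
    'youdao': lambda text: '[youdao] ' + text,
}

translate = pick_translator('Baidu', backends)
print(translate('hello'))
```

Adding another service would then only require one more entry in the dictionary.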
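## Key Methods ##

### Extracting Document Content ###
#### Reading TXT Documents ####
Reading a TXT document is simple: Python's built-in open() is all that is needed. The code is shown below:  

    # Read a TXT document
    def read_txt(path):
        '''Read a TXT document, returning its whole content at once'''
        content = ''
        with open(path) as f:
            content = f.read()
        return content
    # readline() can also be used to read one line at a time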

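The line-by-line alternative mentioned in the comment above might look like the sketch below. It is not code from the repository: the name `read_txt_lines` and the explicit `utf-8` encoding are assumptions made for illustration.

```python
def read_txt_lines(path, encoding='utf-8'):
    '''Yield the non-empty lines of a text file one at a time (hypothetical helper).'''
    with open(path, encoding=encoding) as f:
        for line in f:            # iterating a file object reads it line by line
            line = line.strip()   # drop the newline and surrounding whitespace
            if line:
                yield line
```

Reading one line at a time keeps memory use small for long documents and gives a natural unit to send to the translation interface.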
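#### Reading Word Documents ####
Reading Word documents is also fairly simple with the third-party library python-docx (install it with pip install python-docx). Example code:  


    import docx    # install: pip install python-docx

    def translate(self):
        '''Translate'''

        # Open the source document
        doc = docx.Document(self.fullName)

        # Create the new Word document in memory
        new_doc = docx.Document()

        # Walk through every paragraph
        for para in doc.paragraphs:
            # Translate
            trans = baidu_translate(para.text)
            # Write the original and the translation to the new document
            new_doc.add_paragraph(para.text)
            new_doc.add_paragraph(trans)

        # Save to the local file
        new_doc.save(self.new_fullPath)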
       
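#### Reading PDF Documents ####
Reading PDF documents also requires a third-party library. The main options are PyPDF2 and pdfminer; I looked into both, and each has its strengths.  
PyPDF2 is relatively simple to use, but in my experience it works well only with English text and handles Chinese poorly. pdfminer is somewhat more complex to use (only relatively so), but it supports multiple languages as well as charts, images, and so on, and is the more capable of the two. Both approaches are implemented in the code; examples follow:  
**PyPDF2**

    # install: pip install pypdf2
    from PyPDF2.pdf import PdfFileReader

    def translate(self):
        '''Read the PDF content, translate it, and write it to a txt file'''
        # The with statement closes the file even if translation fails
        with open(self.fullPath,'rb') as f:
            pdf = PdfFileReader(f)

            for i in range(0,pdf.getNumPages()):
                extractedText = pdf.getPage(i).extractText()
                content = extractedText.split('\n')
                content = self.removeBlankFromList(content)

                # In the joined text, a gap of more than one space between words is treated as a line break
                content_list = self.enter_symbol(content)

                for line in content_list:
                    trans = baidu_translate(line)
                    self.write(line + '\n')
                    self.write(trans)

        Logger().write(self.fileName + ' translated; new document: ' + self.new_fullPath)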

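The line-break rule in the PyPDF2 example (a gap of more than one space marks a break) can also be written with a regular expression. This is a swapped-in technique, not the repository's method: `enter_symbol` in the repository replaces runs of spaces with a sentinel string and then splits on it, whereas the hypothetical `split_on_wide_gaps` below treats a run of any length from two spaces upward as one break.

```python
import re

def split_on_wide_gaps(text):
    '''Split text wherever two or more consecutive spaces occur (hypothetical helper).'''
    parts = re.split(r' {2,}', text)
    # Drop empty pieces that arise from leading or trailing gaps
    return [p.strip() for p in parts if p.strip()]

print(split_on_wide_gaps('Hello world.   Next line  here'))
```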
**pdfminer**

    # install: pip install pdfminer3k
    from pdfminer.pdfparser import PDFParser,PDFDocument
    from pdfminer.pdfinterp import PDFResourceManager,PDFPageInterpreter
    from pdfminer.layout import LAParams,LTTextBoxHorizontal
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.pdfinterp import PDFTextExtractionNotAllowed

    def translate(self):
        '''Read the PDF content, translate it, and write it to a txt file'''

        # Open the local PDF file in binary read mode; the with statement closes it afterwards
        with open(self.fullPath,'rb') as fp:
            # Create a PDF parser from the file object
            parser_pdf = PDFParser(fp)
            # Create a PDF document object
            doc_pdf = PDFDocument()
            # Connect the parser and the document object
            parser_pdf.set_document(doc_pdf)
            doc_pdf.set_parser(parser_pdf)
            # Supply the password here, e.g. doc_pdf.initialize("123456"); without one, the default empty string is used
            doc_pdf.initialize()

            # Check whether the document allows text extraction; if not, it cannot be translated
            if not doc_pdf.is_extractable:
                Logger().write(self.fileName + ': no usable text could be extracted; translation stopped.')
                return

            # Create a PDF resource manager to share resources
            rsrcmgr = PDFResourceManager()
            # Create the layout analysis parameters
            laparams = LAParams()
            # Create the aggregator
            device = PDFPageAggregator(rsrcmgr,laparams=laparams)
            # Create a PDF page interpreter
            interpreter = PDFPageInterpreter(rsrcmgr,device)

            # Loop over the pages, handling one page at a time
            for page in doc_pdf.get_pages():
                # Read the page with the interpreter
                interpreter.process_page(page)
                # Fetch the result from the aggregator
                layout = device.get_result()

                # layout is an LTPage object holding the objects parsed from this page,
                # typically LTTextBox, LTFigure, LTImage, LTTextBoxHorizontal, and so on;
                # text is obtained by calling get_text() on the text objects
                for out in layout:
                    # Only text boxes have get_text(); images and the like do not
                    if isinstance(out,LTTextBoxHorizontal):
                        content = out.get_text()
                        trans = baidu_translate(content)
                        self.write(content)
                        self.write(trans)

        Logger().write(self.fileName + ' translated; new document: ' + self.new_fullPath)


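### Calling the Translation Interfaces ###
A Python web scraper makes it easy to fetch data, and that is the "trick" used here to implement translation. Thanks are due to the providers of these interfaces and, as the saying goes, to CCTV and Tieling TV.
#### Baidu Translate ####

Baidu Translate has anti-scraping measures that block desktop scrapers; fortunately the mobile endpoint still works. The code is shown below:  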
	
    import urllib.request
    import urllib.parse
    import json

    # Baidu translation
    def baidu_translate(content,type=1):
        '''Translate English text to Chinese via Baidu Translate'''
        baidu_url = 'http://fanyi.baidu.com/basetrans'
        data = {}

        data['from'] = 'en'
        data['to'] = 'zh'
        data['query'] = content
        data['transtype'] = 'translang'
        data['simple_means_flag'] = '3'
        data['sign'] = '94582.365127'
        data['token'] = 'ec980ef090b173ebdff2eea5ffd9a778'
        data = urllib.parse.urlencode(data).encode('utf-8')

        # A mobile User-Agent is sent because desktop scrapers are blocked
        headers = {"User-Agent":"Mozilla/5.0 (Linux; Android 5.1.1; Nexus 6 Build/LYZ28E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Mobile Safari/537.36"}
        baidu_re = urllib.request.Request(baidu_url, data, headers)
        baidu_response = urllib.request.urlopen(baidu_re)
        baidu_html = baidu_response.read().decode('utf-8')
        target2 = json.loads(baidu_html)

        # Join the translated segments, one per line
        trans = target2['trans']
        ret = ''
        for item in trans:
            ret += item['dst'] + '\n'

        return ret


#### Google Translate ####

First, a class is needed that runs the JavaScript which computes the tk token sent with each request:

	
	import execjs  
	  
	class Py4Js():  
	      
	    def __init__(self):  
	        self.ctx = execjs.compile(""" 
	        function TL(a) { 
	        var k = ""; 
	        var b = 406644; 
	        var b1 = 3293161072; 
	         
	        var jd = "."; 
	        var $b = "+-a^+6"; 
	        var Zb = "+-3^+b+-f"; 
	     
	        for (var e = [], f = 0, g = 0; g < a.length; g++) { 
	            var m = a.charCodeAt(g); 
	            128 > m ? e[f++] = m : (2048 > m ? e[f++] = m >> 6 | 192 : (55296 == (m & 64512) && g + 1 < a.length && 56320 == (a.charCodeAt(g + 1) & 64512) ? (m = 65536 + ((m & 1023) << 10) + (a.charCodeAt(++g) & 1023), 
	            e[f++] = m >> 18 | 240, 
	            e[f++] = m >> 12 & 63 | 128) : e[f++] = m >> 12 | 224, 
	            e[f++] = m >> 6 & 63 | 128), 
	            e[f++] = m & 63 | 128) 
	        } 
	        a = b; 
	        for (f = 0; f < e.length; f++) a += e[f], 
	        a = RL(a, $b); 
	        a = RL(a, Zb); 
	        a ^= b1 || 0; 
	        0 > a && (a = (a & 2147483647) + 2147483648); 
	        a %= 1E6; 
	        return a.toString() + jd + (a ^ b) 
	    }; 
	     
	    function RL(a, b) { 
	        var t = "a"; 
	        var Yb = "+"; 
	        for (var c = 0; c < b.length - 2; c += 3) { 
	            var d = b.charAt(c + 2), 
	            d = d >= t ? d.charCodeAt(0) - 87 : Number(d), 
	            d = b.charAt(c + 1) == Yb ? a >>> d: a << d; 
	            a = b.charAt(c) == Yb ? a + d & 4294967295 : a ^ d 
	        } 
	        return a 
	    } 
	    """)  
	          
	    def getTk(self,text):  
	        return self.ctx.call("TL",text)  

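It is called as follows:

    import requests    # install: pip install requests
    from Py4Js import Py4Js

    # Google translation
    def google_translate(content):
        '''Translate English text to Chinese via Google Translate'''
        # Check the length first so that no token is computed for rejected input
        if len(content) > 4891:
            print("The text exceeds the length limit!")
            return

        js = Py4Js()
        tk = js.getTk(content)
        param = {'tk': tk, 'q': content}

        # Build the URL from adjacent string literals so that no line breaks or spaces end up in it
        url = ("http://translate.google.cn/translate_a/single?client=t&sl=en"
               "&tl=zh-CN&hl=zh-CN&dt=at&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss"
               "&dt=t&ie=UTF-8&oe=UTF-8&clearbtn=1&otf=1&pc=1&srcrom=0&ssel=0&tsel=0&kc=2")
        result = requests.get(url, params=param)

        # The response is JSON, which parses into a nested list
        trans = result.json()[0]
        ret = ''
        for item in trans:
            if item[0] is not None:
                ret += item[0]

        return ret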

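Since google_translate refuses text longer than 4891 characters, a long passage has to be cut into smaller pieces before it is sent. The sketch below shows one way to do that; `split_for_translation` is a hypothetical helper that is not part of the repository, and cutting at the last ". " inside the window is an assumption about what makes a sensible boundary.

```python
def split_for_translation(text, limit=4891):
    '''Split text into chunks of at most `limit` characters, preferring sentence ends.'''
    chunks = []
    while len(text) > limit:
        # Look for the last sentence-ending ". " inside the allowed window
        cut = text.rfind('. ', 0, limit)
        if cut == -1:
            cut = limit      # no sentence boundary found: hard cut at the limit
        else:
            cut += 1         # keep the period with the preceding chunk
        piece = text[:cut].strip()
        if piece:
            chunks.append(piece)
        text = text[cut:].strip()
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be passed to the translation function in turn and the results joined.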
#### Youdao Translate ####
The Youdao implementation is shown below:

    import urllib.request
    import urllib.parse
    import json

    # Youdao translation
    def youdao_translate(content):
        '''Translate text via the Youdao Translate interface'''
        youdao_url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
        data = {}

        data['i']= content
        data['from'] = 'AUTO'
        data['to'] = 'AUTO'
        data['smartresult'] = 'dict'
        data['client'] = 'fanyideskweb'
        data['salt'] = '1525141473246'
        data['sign'] = '47ee728a4465ef98ac06510bf67f3023'
        data['doctype'] = 'json'
        data['version'] = '2.1'
        data['keyfrom'] = 'fanyi.web'
        data['action'] = 'FY_BY_CLICKBUTTION'
        data['typoResult'] = 'false'
        data = urllib.parse.urlencode(data).encode('utf-8')

        youdao_response = urllib.request.urlopen(youdao_url, data)
        youdao_html = youdao_response.read().decode('utf-8')
        target = json.loads(youdao_html)

        # translateResult is a list of paragraphs, each a list of translated segments
        trans = target['translateResult']
        ret = ''
        for paragraph in trans:
            line = ''
            for segment in paragraph:
                # Append every segment; plain assignment would keep only the last one
                line += segment['tgt']
            ret += line + '\n'

        return ret

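### Writing Documents ###
#### Writing TXT Documents ####
Writing a TXT document is simple. The code is shown below:

    # Write a TXT document
    def write_txt(path,content):
        '''Append content to a TXT document'''
        with open(path,'a+',encoding='utf-8') as f:
            f.write(content)

#### Writing Word Documents ####
Word documents are written with the python-docx library mentioned above. The code in the Reading Word Documents section already shows the details and is very simple, so it is not repeated here.
#### Writing PDF Documents ####
For PDF output I had PyPDF2 and pdfminer in mind, PyPDF2 being the simpler of the two, although pdfminer itself only extracts content and does not write PDF files. This script implements only Word and TXT output for translated documents, which keeps the result easy to edit. If you need PDF output, you are welcome to explore it yourself.

That covers the key techniques. Next come the implementation and a comparison of the results.

## Implementation ##
The implementation is just a matter of writing the code. Once the approach is clear it comes together naturally, and any unfamiliar syntax can be looked up on Google or Baidu. I am a Python beginner myself, so the code merely expresses the approach and is nothing special; practice makes perfect. For that reason no code is pasted here. If you want to see my ugly code, I will share it on GitHub (jokingly known as the world's largest dating site for programmers), and you are welcome to take it.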
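### Results ###
I prepared the English text of Martin Luther King's famous speech "I Have a Dream" in three different formats, as shown below:
![](https://i.imgur.com/XyX2UMy.png)  
Run the Python script, as shown below:  
![](https://i.imgur.com/4W721ir.png)  
Check the generated documents, as shown below:  
![](https://i.imgur.com/okLXGmM.png)  
Before-and-after comparison **(TXT)**
![](https://i.imgur.com/5PZw7ko.png)  
Before-and-after comparison **(Word)**
![](https://i.imgur.com/O09nMEB.png)
Before-and-after comparison **(PDF)**
![](https://i.imgur.com/6IWdcBq.png)

Well, that is about it.

## Sharing ##
Project address: [https://github.com/AnuoF/TranslateTool](https://github.com/AnuoF/TranslateTool)  
If you have any questions, feel free to contact me.  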

Allen   
June 23,2018  
Chengdu

================================================
FILE: src/TranslateTool/TranslateTool/Doc_In/I have a dream.txt
================================================
Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation. This momentous decree came as a great beacon light of hope to millions of Negro slaves who had been seared in the flames of withering injustice. It came as a joyous daybreak to end the long night of bad captivity.
    But one hundred years later, the Negro still is not free. One hundred years later, the life of the Negro is still sadly crippled by the manacles of segregation and the chains of discrimination. One hundred years later, the Negro lives on a lonely island of poverty in the midst of a vast ocean of material prosperity. One hundred years later, the Negro is still languished in the corners of American society and finds himself an exile in his own land. So we’ve come here today to dramatize a shameful condition.



    I am not unmindful that some of you have come here out of great trials and tribulations. Some of you have come fresh from narrow jail cells. Some of you have come from areas where your quest for freedom left you battered by the storms of persecution and staggered by the winds of police brutality. You have been the veterans of creative suffering. Continue to work with the faith that unearned suffering is redemptive.
Go back to Mississippi, go back to Alabama, go back to South Carolina, go back to Georgia, go back to Louisiana, go back to the slums and ghettos of our northern cities, knowing that somehow this situation can and will be changed. Let us not wallow in the valley of despair.



I say to you today, my friends, so even though we face the difficulties of today and tomorrow, I still have a dream. It is a dream deeply rooted in the American dream.
    I have a dream that one day this nation will rise up, live up to the true meaning of its creed: “We hold these truths to be self-evident; that all men are created equal.”
    I have a dream that one day on the red hills of Georgia the sons of former slaves and the sons of former slave-owners will be able to sit down together at the table of brotherhood.
    I have a dream that one day even the state of Mississippi, a state sweltering with the heat of injustice, sweltering with the heat of oppression, will be transformed into an oasis of freedom and justice.
    I have a dream that my four children will one day live in a nation where they will not be judged by the color if their skin but by the content of their character.
    I have a dream today.



    I have a dream that one day down in Alabama with its governor having his lips dripping with the words of interposition and nullification, one day right down in Alabama little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers.
    I have a dream today.
    I have a dream that one day every valley shall be exalted, every hill and mountain shall be made low, the rough places will be made plain, and the crooked places will be made straight, and the glory of the Lord shall be revealed, and all flesh shall see it together.
This is our hope. This is the faith that I go back to the South with. With this faith we will be able to hew out of the mountain of despair a stone of hope. With this faith we will be able to transform the jangling discords of our nation into a beautiful symphony of brotherhood. With this faith we will be able to work together, to pray together, to struggle together, to go to jail together, to stand up for freedom together, knowing that we will be free one day.
    This will be the day when all of God’s children will be able to sing with new meaning. 
My country, ’ tis of thee,
Sweet land of liberty,
Of thee I sing:
Land where my fathers died,
Land of the pilgrims’ pride,
From every mountainside
Let freedom ring.
And if America is to be a great nation this must become true. So let freedom ring from the prodigious hilltops of New Hampshire.
Let freedom ring from the mighty mountains of New York!
Let freedom ring from the heightening Alleghenies of Pennsylvania!
Let freedom ring from the snowcapped Rockies of Colorado!
Let freedom ring from the curvaceous slops of California!
But not only that; let freedom ring from Stone Mountain of Georgia!
Let freedom ring from Lookout Mountain of Tennessee!
Let freedom ring from every hill and molehill of Mississippi!
From every mountainside, let freedom ring!
When we let freedom ring, when we let it ring from every village and every hamlet, from every state and every city, we will be able to speed up that day when all of God’s children, black men and white men, Jews and Gentiles, Protestants and Catholics, will be able to join hands and sing in the words of the old Negro spiritual, “Free at last! free at last! thank God almighty, we are free at last!”





================================================
FILE: src/TranslateTool/TranslateTool/Logger.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   Logger.py
 
@Time    :   June 20,2018
 
@Desc    :   This module implements log recording and maintenance, using the singleton pattern.
 
'''

import os
import datetime


# Logger class
class Logger(object):
    '''Records and maintains the log; implemented as a singleton'''
    
    def __init__(self):
        '''Set up the log directory and the log file path'''

        # Directory containing this script
        path = os.path.split(os.path.realpath(__file__))[0]
        path = os.path.join(path,'log')
        if not os.path.exists(path):
            os.makedirs(path)

        self.log_path = os.path.join(path,'log.txt')


    def __new__(cls,*args,**kwargs):
        '''Implement the singleton pattern'''

        if not hasattr(Logger,"_instance"):
            Logger._instance = object.__new__(cls)
        return Logger._instance


    def write(self,content):
        '''Write a log entry'''

        # Prepend a timestamp
        time_str = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        content = '[' + time_str + '] -> ' + content + '\n'

        # Print the message
        print(content)
        # Append it to the log file
        with open(self.log_path,'a+',encoding='utf-8') as f:
            f.write(content)
    

    def delete_old_log(self):
        '''Delete the existing log file, if any'''

        # __init__ has already created the log directory and set self.log_path,
        # so only the file itself needs to be removed
        if os.path.exists(self.log_path):
            os.remove(self.log_path)


================================================
FILE: src/TranslateTool/TranslateTool/Main.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   Main.py
 
@Time    :   June 21,2018
 
@Desc    :   Program entry module.
 
'''


import os
from Logger import *
from T_Txt import TxtTranslate
from T_Pdf_pdfminer import PdfTranslate
from T_Docx import DocxTranslate
from T_Pdf_PyPDF2 import PdfTranslagePyPDF2


def run():
    '''Translate every supported document found in Doc_In'''

    # Collect the documents
    fileList = get_doc()
    if not fileList:
        # No files found
        write_log('No documents found in the specified path; nothing to translate, exiting.')
        return

    msg = 'Found ' + str(len(fileList)) + ' document(s)'
    write_log(msg)

    # Create the output directory
    if not os.path.exists(get_path('Doc_Out')):
        os.makedirs(get_path('Doc_Out'))

    for doc in fileList:
        write_log('Start translating ' + doc)
        # Translate a single document
        translate_doc(doc)

    write_log('Translation finished; see the documents in the Doc_Out folder')


def translate_doc(doc):
    '''Translate a single document'''

    path = get_path('Doc_In',doc)
    translate = get_translate(doc,path)
    if not translate:
        write_log('Failed to map file ' + doc + ' to a translator object!')
        return

    translate.translate()


def get_translate(doc,path):
    '''Return a translator object chosen by file extension'''

    translator = None
    extend_str = os.path.splitext(doc)[1]

    if extend_str == '.txt':
        translator = TxtTranslate(doc,path)
    elif extend_str == '.doc' or extend_str == '.docx':
        translator = DocxTranslate(doc,path)
    elif extend_str == '.pdf':
        #translator = PdfTranslagePyPDF2(doc,path)
        translator = PdfTranslate(doc,path)

    return translator


def get_doc():
    '''Collect the supported files in the input directory and return them as a list'''

    path = get_path('Doc_In')
    if not os.path.exists(path):
        return None

    # Named docs rather than list, to avoid shadowing the built-in
    docs = []
    for name in os.listdir(path):
        extend_str = os.path.splitext(name)[1]
        if extend_str in ('.txt', '.pdf', '.doc', '.docx'):
            docs.append(name)

    return docs


def get_path(*paths):
    '''Build a path under the script directory, in the manner of os.path.join()'''

    path = os.path.split(os.path.realpath(__file__))[0]
    return os.path.join(path,*paths)


def write_log(msg):
    '''Print a message and write it to the log'''

    # Write to the log file
    Logger().write(msg)


# Run when started as a script
if __name__ == '__main__':
    run()

================================================
FILE: src/TranslateTool/TranslateTool/Py4Js.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   Py4Js.py
 
@Time    :   June 20,2018
 
@Desc    :   This module defines some commonly used functional methods.
 
'''

import execjs  # install: pip install PyExecJS
  
class Py4Js():  
      
    def __init__(self):  
        self.ctx = execjs.compile(""" 
        function TL(a) { 
        var k = ""; 
        var b = 406644; 
        var b1 = 3293161072; 
         
        var jd = "."; 
        var $b = "+-a^+6"; 
        var Zb = "+-3^+b+-f"; 
     
        for (var e = [], f = 0, g = 0; g < a.length; g++) { 
            var m = a.charCodeAt(g); 
            128 > m ? e[f++] = m : (2048 > m ? e[f++] = m >> 6 | 192 : (55296 == (m & 64512) && g + 1 < a.length && 56320 == (a.charCodeAt(g + 1) & 64512) ? (m = 65536 + ((m & 1023) << 10) + (a.charCodeAt(++g) & 1023), 
            e[f++] = m >> 18 | 240, 
            e[f++] = m >> 12 & 63 | 128) : e[f++] = m >> 12 | 224, 
            e[f++] = m >> 6 & 63 | 128), 
            e[f++] = m & 63 | 128) 
        } 
        a = b; 
        for (f = 0; f < e.length; f++) a += e[f], 
        a = RL(a, $b); 
        a = RL(a, Zb); 
        a ^= b1 || 0; 
        0 > a && (a = (a & 2147483647) + 2147483648); 
        a %= 1E6; 
        return a.toString() + jd + (a ^ b) 
    }; 
     
    function RL(a, b) { 
        var t = "a"; 
        var Yb = "+"; 
        for (var c = 0; c < b.length - 2; c += 3) { 
            var d = b.charAt(c + 2), 
            d = d >= t ? d.charCodeAt(0) - 87 : Number(d), 
            d = b.charAt(c + 1) == Yb ? a >>> d: a << d; 
            a = b.charAt(c) == Yb ? a + d & 4294967295 : a ^ d 
        } 
        return a 
    } 
    """)  
          
    def getTk(self,text):  
        return self.ctx.call("TL",text)  


================================================
FILE: src/TranslateTool/TranslateTool/T_Base.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   T_Base.py
 
@Time    :   June 21,2018
 
@Desc    :   This module defines the translation base class.
 
'''


# Base class for translators
class Translate(object):

    def translate(self):
        '''Translate the document; overridden by subclasses'''
        pass




================================================
FILE: src/TranslateTool/TranslateTool/T_Docx.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   T_Docx.py
 
@Time    :   June 21,2018
 
@Desc    :   This module implements translation of .docx documents.
 
'''


from T_Base import Translate
from TranslateFunc import *
import os
import docx    # install: pip install python-docx
from Logger import *


class DocxTranslate(Translate):

    def __init__(self, fileName, fullPath):
        '''Constructor'''
        # fileName: file name
        # fullPath: full path

        self.fileName = fileName
        self.fullName = fullPath
        self.prepare()


    def translate(self):
        '''Translate'''

        # Open the source document
        doc = docx.Document(self.fullName)

        # Create the new Word document in memory
        new_doc = docx.Document()

        i = 0
        # Walk through every paragraph
        for para in doc.paragraphs:
            content = para.text.strip()    # drop surrounding whitespace

            if content != '':
                # Translate
                ret = translate_func(content)
                trans = ret if ret else 'Translation failed'
                # Write the original and the translation to the new document
                new_doc.add_paragraph(content)
                new_doc.add_paragraph(trans)
                i += 1
                print(i,end=' ',flush=True)
                # Save to the local file (done after every translated paragraph)
                new_doc.save(self.new_fullPath)
        
        Logger().write(self.fileName + ' translated; new document: ' + self.new_fullPath)


    def prepare(self):
        '''Prepare the output file name and path'''

        # If the output file name already exists, append an increasing number to it
        name, ext = os.path.splitext(self.fileName)
        file_name = name + ext
        path = self.get_path('Doc_Out',file_name)
        i = 1

        while os.path.exists(path):   # loop until an unused file name is found
            file_name = name + str(i) + ext
            path = self.get_path('Doc_Out',file_name)
            i = i + 1

        self.new_fileName = file_name
        self.new_fullPath = path


    def get_path(self,*paths):
        '''Build a path under the script directory'''

        path = os.path.split(os.path.realpath(__file__))[0]
        return os.path.join(path,*paths)

================================================
FILE: src/TranslateTool/TranslateTool/T_Pdf_PyPDF2.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   T_Pdf_PyPDF2.py
 
@Time    :   June 23,2018
 
@Desc    :   This module implements translation of .pdf documents: it reads the PDF, translates the content, and writes it to a txt document.
 
'''


import os
from TranslateFunc import *
from T_Base import Translate
from Logger import *
# install: pip install pypdf2
from PyPDF2.pdf import PdfFileReader

class PdfTranslagePyPDF2(Translate):
    '''PDF document translation module'''

    def __init__(self, fileName, fullPath):
        '''Constructor'''
        # fileName: file name
        # fullPath: full path

        self.fileName = fileName
        self.fullPath = fullPath
        self.prepare()


    def translate(self):
        '''Read the PDF content, translate it, and write it to a txt file'''
        # The with statement closes the file even if translation fails
        with open(self.fullPath,'rb') as f:
            pdf = PdfFileReader(f)

            index = 0
            for i in range(0,pdf.getNumPages()):
                extractedText = pdf.getPage(i).extractText()
                content = extractedText.split('\n')
                content = self.removeBlankFromList(content)

                # In the joined text, a gap of more than one space between words is treated as a line break
                content_list = self.enter_symbol(content)

                for line in content_list:
                    line = line.strip()
                    if line:
                        ret = translate_func(line)
                        trans = ret if ret else 'Translation failed'
                        self.write(line + '\n')
                        self.write(trans)
                        index += 1
                        print(index,end=' ',flush=True)

        Logger().write(self.fileName + ' translated; new document: ' + self.new_fullPath)


    def removeBlankFromList(self,list_old):
        '''Drop empty strings from the list and join the rest into one string'''

        list_new = []
        for i in list_old:
            if i != '':
                list_new.append(i)
        return self.combine_list2str(list_new)


    def combine_list2str(self,list_old):
        '''Join a list of strings into a single string'''

        return ''.join(list_old)


    def enter_symbol(self,content_old):
        '''Handle line breaks: split at runs of two to ten spaces and return the list of lines'''
        
        content = content_old
        # Replace the longest runs first so that shorter patterns do not break them up
        for i in range(10,1,-1):
            symbol = ' ' * i
            content = content.replace(symbol,'A34EN')
        return content.split('A34EN')


    def prepare(self):
        '''Prepare the output file name and path'''

        # If the output file name already exists, append an increasing number to it
        file_name = os.path.splitext(self.fileName)[0] + '_pdf.txt'
        path = self.get_path('Doc_Out',file_name)
        i = 1

        while os.path.exists(path):   # loop until an unused file name is found
            file_name = os.path.splitext(self.fileName)[0] + str(i) + '_pdf.txt'
            path = self.get_path('Doc_Out',file_name)
            i = i + 1

        self.new_fileName = file_name
        self.new_fullPath = path


    def get_path(self,*paths):
        '''Build a path under the script directory, in the manner of os.path.join()'''

        path = os.path.split(os.path.realpath(__file__))[0]
        return os.path.join(path,*paths)


    def write(self,content):
        '''Write to the output file'''

        #content = str(content.encode('utf-8'))
        # 'a+' opens the file for appending
        with open(self.new_fullPath,'a+',encoding='utf-8') as f:
            f.write(content)

================================================
FILE: src/TranslateTool/TranslateTool/T_Pdf_pdfminer.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
 
'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   T_Pdf_pdfminer.py
 
@Time    :   June 21,2018
 
@Desc    :   This module implements translation of .pdf documents: it reads the PDF, translates the content, and writes it to a txt document.
 
'''


from T_Base import Translate
from Logger import *
from TranslateFunc import *
import os
import time

# install: pip install pdfminer3k
from pdfminer.pdfparser import PDFParser,PDFDocument
from pdfminer.pdfinterp import PDFResourceManager,PDFPageInterpreter
from pdfminer.layout import LAParams,LTTextBoxHorizontal
from pdfminer.converter import PDFPageAggregator
from pdfminer.pdfinterp import PDFTextExtractionNotAllowed


class PdfTranslate(Translate):
    '''PDF document translation module'''

    def __init__(self, fileName, fullPath):
        '''Constructor'''
        # fileName: file name
        # fullPath: full path

        self.fileName = fileName
        self.fullPath = fullPath
        self.prepare()


    def translate(self):
        '''Read the PDF content, translate it, and write it to a txt file'''

        # Open the local PDF file in binary read mode; the with statement closes it afterwards
        with open(self.fullPath,'rb') as fp:
            # Create a PDF parser from the file object
            parser_pdf = PDFParser(fp)
            # Create a PDF document object
            doc_pdf = PDFDocument()
            # Connect the parser and the document object
            parser_pdf.set_document(doc_pdf)
            doc_pdf.set_parser(parser_pdf)
            # Supply the password here, e.g. doc_pdf.initialize("123456"); without one, the default empty string is used
            doc_pdf.initialize()

            # Check whether the document allows text extraction; if not, it cannot be translated
            if not doc_pdf.is_extractable:
                Logger().write(self.fileName + ': no usable text could be extracted; translation stopped.')
                return

            # Create a PDF resource manager to share resources
            rsrcmgr = PDFResourceManager()
            # Create the layout analysis parameters
            laparams = LAParams()
            # Create the aggregator
            device = PDFPageAggregator(rsrcmgr,laparams=laparams)
            # Create a PDF page interpreter
            interpreter = PDFPageInterpreter(rsrcmgr,device)

            i = 0
            # Loop over the pages, handling one page at a time
            for page in doc_pdf.get_pages():
                # Read the page with the interpreter
                interpreter.process_page(page)
                # Fetch the result from the aggregator
                layout = device.get_result()

                # layout is an LTPage object holding the objects parsed from this page,
                # typically LTTextBox, LTFigure, LTImage, LTTextBoxHorizontal, and so on;
                # text is obtained by calling get_text() on the text objects
                for out in layout:
                    # Only text boxes have get_text(); images and the like do not
                    if isinstance(out,LTTextBoxHorizontal):
                        content = out.get_text().strip()
                        if content:
                            to_trans_content = content.replace("\r\n","")
                            ret = translate_func(to_trans_content)
                            trans = ret if ret else 'Translation failed'

                            self.write(content)
                            self.write(trans)
                            i += 1
                            print(i,end=' ',flush=True)

                # Pause between pages
                time.sleep(2)

        Logger().write(self.fileName + ' translated; new document: ' + self.new_fullPath)


    def prepare(self):
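        '''Prepare: work out the output file name and path'''

        # Check whether the output file name already exists; if so, append a counter to the name
        file_name = os.path.splitext(self.fileName)[0] + '_pdf.txt'
        path = self.get_path('Doc_Out',file_name)
        i = 1

        while os.path.exists(path):   # loop until an unused file name is found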
            file_name = os.path.splitext(self.fileName)[0] + str(i) + '_pdf.txt'
            path = self.get_path('Doc_Out',file_name)
            i = i + 1

        self.new_fileName = file_name
        self.new_fullPath = path


    def get_path(self,*paths):
        '''Build a path under this file's directory; modeled on os.path.join()'''

        path = os.path.split(os.path.realpath(__file__))[0]
        if len(paths):
            for i in range(len(paths)):
                path = os.path.join(path,paths[i])

        return path


    def write(self,content):
        '''Write to the output file'''

        # https://www.cnblogs.com/themost/p/6603409.html
        # 'a+' opens the file for appending; txt files open as GBK by default here, so the encoding must be set to utf-8
        with open(self.new_fullPath,'a+',encoding='utf-8') as f:
            f.write(content)
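
The `prepare` and `get_path` methods above repeat the same "find an unused output name" logic in every translator class. A minimal sketch of how that logic might be factored into one function follows. The name `unique_output_path` and its parameters are hypothetical and do not exist in this repository; the naming scheme (stem, then counter, then suffix) copies what `prepare` does.

```python
import os


def unique_output_path(base_dir, source_name, suffix='_pdf.txt'):
    """Return a path under base_dir that does not exist yet.

    Hypothetical helper: mirrors the counter scheme used by prepare(),
    i.e. name_pdf.txt, name1_pdf.txt, name2_pdf.txt, ...
    """
    stem = os.path.splitext(source_name)[0]
    candidate = os.path.join(base_dir, stem + suffix)
    counter = 1
    # Keep appending a growing counter until the name is free
    while os.path.exists(candidate):
        candidate = os.path.join(base_dir, stem + str(counter) + suffix)
        counter += 1
    return candidate
```

Called as `unique_output_path(out_dir, 'report.pdf')`, it would pick `report_pdf.txt` first and `report1_pdf.txt` once that file exists. The check and the later write are not atomic, so this sketch, like the original, assumes a single process writes to the output folder.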

================================================
FILE: src/TranslateTool/TranslateTool/T_Txt.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   T_Txt.py
 
@Time    :   June 21,2018
 
@Desc    :   This module translates .txt documents.
 
'''


from T_Base import Translate
from Logger import Logger
from TranslateFunc import *
import os


class TxtTranslate(Translate):
    
    def __init__(self, fileName,path):
        '''Constructor; path is the absolute path of the file'''

        self.fileName = fileName
        self.path = path
        self.prepare()


    def translate(self):
        '''Translate a txt document'''

        '''Flow: start -> extract -> translate -> write -> finish'''

        f = open(self.path,encoding='utf-8')
        line = f.readline()   # raises UnicodeDecodeError if the file is not UTF-8, e.g. "'utf-8' codec can't decode byte 0xa1 in position 795: invalid start byte"
        temp_line = ''   # accumulated text
        
        i = 0
        while line:
            # If the line just read ends with a newline, translate the accumulated text and write it to the output file
            # If it does not end with a newline, keep accumulating text
            if line.endswith('\n'):
                temp_line += line

                temp_line = temp_line.strip()
                if temp_line:
                    ret = translate_func(temp_line)
                    trans = ret if ret else '翻译失败'
                    self.write(temp_line)
                    self.write(trans + '\n')
                    i += 1
                    print(i,end=' ',flush=True)

                temp_line = ''
            else:  # readline() actually returns a whole paragraph of text, not just one visual line
                temp_line += line

            line = f.readline()

        if temp_line:  # make sure any text left over after accumulation still gets translated
            ret = translate_func(temp_line)
            trans = ret if ret else '翻译失败'
            self.write(temp_line)
            self.write(trans + '\n')
            i += 1
            print(i,end=' ',flush=True)

        f.close()
        Logger().write(self.fileName + '翻译完成，新文档：' + self.new_path)


    def prepare(self):
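        '''Prepare: work out the output file name and path'''

        # Check whether the output file name already exists; if so, append a counter to the name
        file_name = os.path.splitext(self.fileName)[0] + '.txt'
        path = self.get_path('Doc_Out',file_name)
        i = 1

        while os.path.exists(path):   # loop until an unused file name is found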
            file_name = os.path.splitext(self.fileName)[0] + str(i) + '.txt'
            path = self.get_path('Doc_Out',file_name)
            i = i + 1

        self.new_path = path
        self.new_fileName = file_name


    def get_path(self,*paths):  # without self here, paths would contain the class instance and os.path.join would fail
        '''Build a path under this file's directory; modeled on os.path.join()'''

        path = os.path.split(os.path.realpath(__file__))[0]
        if len(paths):
            for i in range(len(paths)):
                path = os.path.join(path,paths[i])

        return path
    

    def write(self,content):
        '''Write to the output file'''

        # 'a+' opens the file for appending
        with open(self.new_path,'a+',encoding='utf-8') as f:
            f.write(content)
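
The `readline()` loop in `translate` above amounts to "treat every non-blank line as one paragraph". A minimal sketch of that reading step as a generator follows, kept separate from translation and writing. `iter_paragraphs` is a hypothetical name, not part of this repository, and unlike the original it also strips a final line that lacks a trailing newline.

```python
import io


def iter_paragraphs(stream):
    """Yield each non-blank line of a text stream, stripped of surrounding whitespace.

    Hypothetical helper: in the source files each paragraph sits on one
    physical line, so iterating line by line is assumed to be enough.
    """
    for line in stream:
        text = line.strip()
        if text:
            yield text


# Usage with an in-memory stream standing in for an opened .txt file
sample = io.StringIO("First paragraph.\n\n    Second paragraph.\nLast line without newline")
paragraphs = list(iter_paragraphs(sample))
```

Each yielded string could then be passed to `translate_func` and written out, which would keep the file handling inside a single `with open(...)` block.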

================================================
FILE: src/TranslateTool/TranslateTool/TranslateFunc.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   TranslateFunc.py
 
@Time    :   June 21,2018
 
@Desc    :   Implements the web-scraping translation functions.
 
'''

import urllib.request
import urllib.parse
import json
import requests   # pip install requests
from Py4Js import *
from Logger import *


# Baidu Translate method
def baidu_translate(content,type=1):
    '''Translate via Baidu'''
    # Requires sign and Cookie, and query/sign/Cookie are tied to one another. This problem is still unresolved as of 2019/3/4.

    baidu_url = "https://fanyi.baidu.com/v2transapi"
    data = {}
    data['from'] = 'zh'
    data['to'] = 'en'
    data['query'] = "妈妈"    # hard-coded test query (content is not used); alternative: 奶奶
    data['transtype'] = 'translang'     # enter/realtime/translang : testing shows the value here makes no difference
    data['simple_means_flag'] = '3'
    data['sign'] = '64344.268393'  # 422672.167969
    data['token'] = 'b08c1ff5373f47e4a5b62c59d38d4f63'
    data = urllib.parse.urlencode(data).encode('utf-8')

    headers = {}
    headers['User-Agent'] = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36"
    #headers['Referer'] = 'https://fanyi.baidu.com/translate?aldtype=16047&query=&keyfrom=baidu&smartresult=dict&lang=auto2zh'   # optional
    headers['Cookie'] ='BAIDUID=09546BF566035455B10181ED1091C5B1:FG=1; BIDUPSID=09546BF566035455B10181ED1091C5B1; PSTM=1551086333; BDUSS=1mMmtwVjc1RnBtR05mMUljOC13amFabjBZRn5HY2JKRUJnWkV1bTBMVWdoWjFjQVFBQUFBJCQAAAAAAAAAAAEAAAAACYVHUmF5X0EzNGVuAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACD4dVwg-HVcYW; MCITY=-75%3A; pgv_pvi=4361364480; BDRCVFR[feWj1Vr5u3D]=I67x6TjHwwYf0; delPer=0; PSINO=3; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; pgv_si=s991542272; locale=zh; REALTIME_TRANS_SWITCH=1; FANYI_WORD_SWITCH=1; HISTORY_SWITCH=1; SOUND_SPD_SWITCH=1; SOUND_PREFER_SWITCH=1; H_PS_PSSID=26525_1440_25809_21122_20697_28585_26350_28603_28415; Hm_lvt_64ecd82404c51e03dc91cb9e8c025574=1551663602,1551670352; Hm_lpvt_64ecd82404c51e03dc91cb9e8c025574=1551670352; from_lang_often=%5B%7B%22value%22%3A%22en%22%2C%22text%22%3A%22%u82F1%u8BED%22%7D%2C%7B%22value%22%3A%22zh%22%2C%22text%22%3A%22%u4E2D%u6587%22%7D%5D; to_lang_often=%5B%7B%22value%22%3A%22zh%22%2C%22text%22%3A%22%u4E2D%u6587%22%7D%2C%7B%22value%22%3A%22en%22%2C%22text%22%3A%22%u82F1%u8BED%22%7D%5D'

    baidu_re = urllib.request.Request(baidu_url, data, headers=headers)
    baidu_response = urllib.request.urlopen(baidu_re)
    baidu_html = baidu_response.read().decode('utf-8')
    target2 = json.loads(baidu_html)

    trans = target2['trans_result']['data']   # list of entries, each carrying a 'dst' field
    ret = ''
    for i in range(len(trans)):
        ret += trans[i]['dst'] + '\n'

    if ret:
        return (True,ret)
    else:
        return (False,ret)


    #baidu_url = 'http://fanyi.baidu.com/basetrans'
    #data = {}

    #data['from'] = 'en'
    #data['to'] = 'zh'
    #data['query'] = content
    #data['transtype'] = 'translang'
    #data['simple_means_flag'] = '3'
    #data['sign'] = '94582.365127'
    #data['token'] = 'ec980ef090b173ebdff2eea5ffd9a778'
    #data = urllib.parse.urlencode(data).encode('utf-8')

    #headers = {"User-Agent":"Mozilla/5.0 (Linux; Android 5.1.1; Nexus 6 Build/LYZ28E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Mobile Safari/537.36"}
    #baidu_re = urllib.request.Request(baidu_url, data, headers)
    #baidu_response = urllib.request.urlopen(baidu_re)
    #baidu_html = baidu_response.read().decode('utf-8')
    #target2 = json.loads(baidu_html)

    #trans = target2['trans']
    #ret = ''
    #for i in range(len(trans)):
    #    ret += trans[i]['dst'] + '\n'

    #if ret:
    #    return (True,ret)
    #else:
    #    return (False,ret)


# Youdao Translate method
def youdao_translate(content):
    '''Translate via the Youdao interface'''

    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&sessionFrom=https://www.baidu.com/link'
    data = {
        'from':'AUTO',
        'to':'AUTO',
        'smartresult':'dict',
        'client':'fanyideskweb',
        'salt':'1500092479607',
        'sign':'d9f9a3aa0a7b34241b3fe30505e5d436',
        'doctype':'json',
        'version':'2.1',
        'keyfrom':'fanyi.web',
        'action':'FY_BY_CL1CKBUTTON',
        'typoResult':'true'}

    data['i'] = content

    data = urllib.parse.urlencode(data).encode('utf-8')
    wy = urllib.request.urlopen(url,data)
    html = wy.read().decode('utf-8')

    ta = json.loads(html)
    ret = ta['translateResult'][0][0]['tgt']

    if ret:
        return (True,ret)
    else:
        return (False,ret)

    # The code below must not be used; it is unreachable because the function has already returned above
    youdao_url = 'http://fanyi.youdao.com/translate'  
    data = {}
    
    data['i']= content
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '1525141473246'
    data['sign'] = 'd9f9a3aa0a7b34241b3fe30505e5d436'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_CLICKBUTTION'
    data['typoResult'] = 'false'
    data = urllib.parse.urlencode(data).encode('utf-8')

    youdao_response = urllib.request.urlopen(youdao_url, data)
    youdao_html = youdao_response.read().decode('utf-8')
    target = json.loads(youdao_html)

    trans = target['translateResult']
    ret = ''
    for i in range(len(trans)):
        line = ''
        for j in range(len(trans[i])):
            line = trans[i][j]['tgt']
        ret += line + '\n'

    return ret


# Google Translate method
def google_translate(content):
    '''Translate via Google (English -> Simplified Chinese)'''

    if len(content) > 4891:
        print("翻译的长度超过限制！！！")
        return (False,'')   # keep the (ok, text) shape that translate_func expects

    js = Py4Js()
    tk = js.getTk(content)

    param = {'tk': tk, 'q': content}

    # Build the URL from adjacent literals so that no newlines or spaces end up inside it
    url = ("http://translate.google.cn/translate_a/single?client=t&sl=en"
           "&tl=zh-CN&hl=zh-CN&dt=at&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss"
           "&dt=t&ie=UTF-8&oe=UTF-8&clearbtn=1&otf=1&pc=1&srcrom=0&ssel=0&tsel=0&kc=2")
    result = requests.get(url, params=param)

    # The response is JSON, which parses into a nested list
    trans = result.json()[0]
    ret = ''
    for i in range(len(trans)):
        line = trans[i][0]
        if line is not None:
            ret += line

    if ret:
        return (True,ret)
    else:
        return (False,ret)


def translate_func(content):
    '''Combined translation across Baidu, Google and Youdao (Baidu is currently disabled)'''

    funcs = [google_translate, youdao_translate]    # baidu_translate,google_translate,youdao_translate
    count = 0

    # Call the translation functions in turn; return as soon as one succeeds, or give up after 9 failed attempts.
    while True:
        for i in range(len(funcs)):
            ret = (False,'')
            try:
                ret = funcs[i](content)
            except Exception:
                Logger().write("调用 %s 方法出现异常。" % funcs[i].__name__)

            # Guard against a backend returning an empty value instead of a tuple
            if ret and ret[0]:
                return ret[1]
            else:
                count += 1
                if count >= 9:
                    Logger().write("以下内容尝试9次仍翻译失败，内容【 %s 】" % content)
                    return ''
                else:
                    continue
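
`translate_func` above cycles through the backends and gives up after nine failed attempts. A minimal sketch of the same round-robin fallback follows, written without the nested `while`/`for` and without any network access. `first_success` and `max_attempts` are hypothetical names; the backends are assumed to return an `(ok, text)` tuple as the functions in this file do.

```python
import itertools


def first_success(funcs, content, max_attempts=9):
    """Try each backend in turn, cycling, until one succeeds or attempts run out.

    Hypothetical helper: returns the translated text, or '' if every
    attempt fails or raises.
    """
    attempts = 0
    for func in itertools.cycle(funcs):
        if attempts >= max_attempts:
            break
        attempts += 1
        try:
            ok, text = func(content)
        except Exception:
            # A failing backend counts as one attempt; move on to the next one
            continue
        if ok:
            return text
    return ''


# Stub backends standing in for google_translate / youdao_translate
def always_fails(content):
    raise RuntimeError("backend unavailable")


def upper_case(content):
    return (True, content.upper())


result = first_success([always_fails, upper_case], "hello")
```

Logging is left out here; in the real function each failure would still be written through `Logger().write(...)`.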


================================================
FILE: src/TranslateTool/TranslateTool/TranslateFuncT.py
================================================
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

'''
@Author  :   Allen
 
@License :   (C) Copyright 2018, Allen's Studio
 
@Contact :   188512936@qq.com
 
@Software:   VS2017
 
@File    :   TranslateFuncT.py
 
@Time    :   June 21,2018
 
@Desc    :   Implements the web-scraping translation functions.
 
'''

import urllib.request
import urllib.parse
import json
import requests   # pip install requests
from Py4Js import *
from Logger import *


# Baidu Translate method
def baidu_translate(content,type=1):
    '''Translate via Baidu'''
    # Requires sign and Cookie, and query/sign/Cookie are tied to one another. This problem is still unresolved as of 2019/3/4.

    baidu_url = "https://fanyi.baidu.com/v2transapi"
    data = {}
    data['from'] = 'zh'
    data['to'] = 'en'
    data['query'] = "妈妈"    # hard-coded test query (content is not used); alternative: 奶奶
    data['transtype'] = 'translang'     # enter/realtime/translang : testing shows the value here makes no difference
    data['simple_means_flag'] = '3'
    data['sign'] = '64344.268393'  # 422672.167969
    data['token'] = 'b08c1ff5373f47e4a5b62c59d38d4f63'
    data = urllib.parse.urlencode(data).encode('utf-8')

    headers = {}
    headers['User-Agent'] = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36"
    #headers['Referer'] = 'https://fanyi.baidu.com/translate?aldtype=16047&query=&keyfrom=baidu&smartresult=dict&lang=auto2zh'   # optional
    headers['Cookie'] ='BAIDUID=09546BF566035455B10181ED1091C5B1:FG=1; BIDUPSID=09546BF566035455B10181ED1091C5B1; PSTM=1551086333; BDUSS=1mMmtwVjc1RnBtR05mMUljOC13amFabjBZRn5HY2JKRUJnWkV1bTBMVWdoWjFjQVFBQUFBJCQAAAAAAAAAAAEAAAAACYVHUmF5X0EzNGVuAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACD4dVwg-HVcYW; MCITY=-75%3A; pgv_pvi=4361364480; BDRCVFR[feWj1Vr5u3D]=I67x6TjHwwYf0; delPer=0; PSINO=3; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; pgv_si=s991542272; locale=zh; REALTIME_TRANS_SWITCH=1; FANYI_WORD_SWITCH=1; HISTORY_SWITCH=1; SOUND_SPD_SWITCH=1; SOUND_PREFER_SWITCH=1; H_PS_PSSID=26525_1440_25809_21122_20697_28585_26350_28603_28415; Hm_lvt_64ecd82404c51e03dc91cb9e8c025574=1551663602,1551670352; Hm_lpvt_64ecd82404c51e03dc91cb9e8c025574=1551670352; from_lang_often=%5B%7B%22value%22%3A%22en%22%2C%22text%22%3A%22%u82F1%u8BED%22%7D%2C%7B%22value%22%3A%22zh%22%2C%22text%22%3A%22%u4E2D%u6587%22%7D%5D; to_lang_often=%5B%7B%22value%22%3A%22zh%22%2C%22text%22%3A%22%u4E2D%u6587%22%7D%2C%7B%22value%22%3A%22en%22%2C%22text%22%3A%22%u82F1%u8BED%22%7D%5D'

    baidu_re = urllib.request.Request(baidu_url, data, headers=headers)
    baidu_response = urllib.request.urlopen(baidu_re)
    baidu_html = baidu_response.read().decode('utf-8')
    target2 = json.loads(baidu_html)

    trans = target2['trans_result']['data']   # list of entries, each carrying a 'dst' field
    ret = ''
    for i in range(len(trans)):
        ret += trans[i]['dst'] + '\n'

    if ret:
        return (True,ret)
    else:
        return (False,ret)


    #baidu_url = 'http://fanyi.baidu.com/basetrans'
    #data = {}

    #data['from'] = 'en'
    #data['to'] = 'zh'
    #data['query'] = content
    #data['transtype'] = 'translang'
    #data['simple_means_flag'] = '3'
    #data['sign'] = '94582.365127'
    #data['token'] = 'ec980ef090b173ebdff2eea5ffd9a778'
    #data = urllib.parse.urlencode(data).encode('utf-8')

    #headers = {"User-Agent":"Mozilla/5.0 (Linux; Android 5.1.1; Nexus 6 Build/LYZ28E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Mobile Safari/537.36"}
    #baidu_re = urllib.request.Request(baidu_url, data, headers)
    #baidu_response = urllib.request.urlopen(baidu_re)
    #baidu_html = baidu_response.read().decode('utf-8')
    #target2 = json.loads(baidu_html)

    #trans = target2['trans']
    #ret = ''
    #for i in range(len(trans)):
    #    ret += trans[i]['dst'] + '\n'

    #if ret:
    #    return (True,ret)
    #else:
    #    return (False,ret)


# Youdao Translate method
def youdao_translate(content):
    '''Translate via the Youdao interface'''

    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&sessionFrom=https://www.baidu.com/link'
    #url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    data = {
        'from':'Auto',
        'to':'Auto',
        'smartresult':'dict',
        'client':'fanyideskweb',
        'salt':'1500092479607',
        'sign':'d9f9a3aa0a7b34241b3fe30505e5d436',
        'doctype':'json',
        'version':'2.1',
        'keyfrom':'fanyi.web',
        'action':'FY_BY_CL1CKBUTTON',
        'typoResult':'true'}

    data['i'] = content

    data = urllib.parse.urlencode(data).encode('utf-8')
    wy = urllib.request.urlopen(url,data)
    html = wy.read().decode('utf-8')

    ta = json.loads(html)
    ret = ta['translateResult'][0][0]['tgt']

    if ret:
        return (True,ret)
    else:
        return (False,ret)

    # The code below must not be used; it is unreachable because the function has already returned above
    youdao_url = 'http://fanyi.youdao.com/translate'  
    data = {}
    
    data['i']= content
    data['from'] = 'Auto'
    data['to'] = 'Auto'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '1525141473246'
    data['sign'] = 'd9f9a3aa0a7b34241b3fe30505e5d436'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_CLICKBUTTION'
    data['typoResult'] = 'false'
    data = urllib.parse.urlencode(data).encode('utf-8')

    youdao_response = urllib.request.urlopen(youdao_url, data)
    youdao_html = youdao_response.read().decode('utf-8')
    target = json.loads(youdao_html)

    trans = target['translateResult']
    ret = ''
    for i in range(len(trans)):
        line = ''
        for j in range(len(trans[i])):
            line = trans[i][j]['tgt']
        ret += line + '\n'

    return ret


# Google Translate method
def google_translate(content):
    '''Translate via Google (Simplified Chinese -> English)'''

    if len(content) > 4891:
        print("翻译的长度超过限制！！！")
        return (False,'')   # keep the (ok, text) shape that translate_func expects

    js = Py4Js()
    tk = js.getTk(content)

    param = {'tk': tk, 'q': content}

    # Build the URL from adjacent literals so that no newlines or spaces end up inside it
    url = ("http://translate.google.cn/translate_a/single?client=t&sl=zh-CN&tl=en&hl=en"
           "&dt=at&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss"
           "&dt=t&ie=UTF-8&oe=UTF-8&clearbtn=1&otf=1&pc=1&srcrom=0&ssel=0&tsel=0&kc=2")
    result = requests.get(url, params=param)

    # The response is JSON, which parses into a nested list
    trans = result.json()[0]
    ret = ''
    for i in range(len(trans)):
        line = trans[i][0]
        if line is not None:
            ret += line

    if ret:
        return (True,ret)
    else:
        return (False,ret)


def translate_func(content):
    '''Combined translation across Baidu, Google and Youdao (Baidu is currently disabled)'''

    funcs = [google_translate, youdao_translate]    # baidu_translate,google_translate,youdao_translate
    count = 0

    # Call the translation functions in turn; return as soon as one succeeds, or give up after 9 failed attempts.
    while True:
        for i in range(len(funcs)):
            ret = (False,'')
            try:
                ret = funcs[i](content)
            except Exception:
                Logger().write("调用 %s 方法出现异常。" % funcs[i].__name__)

            # Guard against a backend returning an empty value instead of a tuple
            if ret and ret[0]:
                return ret[1]
            else:
                count += 1
                if count >= 9:
                    Logger().write("以下内容尝试9次仍翻译失败，内容【 %s 】" % content)
                    return ''
                else:
                    continue


================================================
FILE: src/TranslateTool/TranslateTool/TranslateTool.pyproj
================================================
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>c8bd1066-8f3d-42e0-9cc2-40853f4770e9</ProjectGuid>
    <ProjectHome>.</ProjectHome>
    <StartupFile>Main.py</StartupFile>
    <SearchPath>
    </SearchPath>
    <WorkingDirectory>.</WorkingDirectory>
    <OutputPath>.</OutputPath>
    <Name>TranslateTool</Name>
    <RootNamespace>TranslateTool</RootNamespace>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)' == 'Debug' ">
    <DebugSymbols>true</DebugSymbols>
    <EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)' == 'Release' ">
    <DebugSymbols>true</DebugSymbols>
    <EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="T_Docx.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="Logger.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="Main.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="T_Pdf_pdfminer.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="Py4Js.py" />
    <Compile Include="TranslateFunc.py" />
    <Compile Include="T_Base.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="T_Pdf_PyPDF2.py">
      <SubType>Code</SubType>
    </Compile>
    <Compile Include="T_Txt.py">
      <SubType>Code</SubType>
    </Compile>
  </ItemGroup>
  <Import Project="$(MSBuildExtensionsPath32)\Microsoft\VisualStudio\v$(VisualStudioVersion)\Python Tools\Microsoft.PythonTools.targets" />
  <!-- Uncomment the CoreCompile target to enable the Build command in
       Visual Studio and specify your pre- and post-build commands in
       the BeforeBuild and AfterBuild targets below. -->
  <!--<Target Name="CoreCompile" />-->
  <Target Name="BeforeBuild">
  </Target>
  <Target Name="AfterBuild">
  </Target>
</Project>

================================================
FILE: src/TranslateTool/TranslateTool/log/log.txt
================================================
[2019-09-02 09:30:41] -> 提取到 3 个文档
[2019-09-02 09:30:41] -> 开始翻译Article1.txt
[2019-09-02 09:31:58] -> Article1.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article1.txt
[2019-09-02 09:31:58] -> 开始翻译Article2.txt
[2019-09-02 09:31:59] -> Article2.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article2.txt
[2019-09-02 09:31:59] -> 开始翻译Article3.txt
[2019-09-02 09:32:00] -> Article3.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article3.txt
[2019-09-02 09:32:00] -> 翻译完成，请查看Doc_Out文件夹下面的文档
[2019-09-02 09:32:06] -> 提取到 3 个文档
[2019-09-02 09:32:06] -> 开始翻译Article1.txt
[2019-09-02 09:33:02] -> Article1.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article11.txt
[2019-09-02 09:33:02] -> 开始翻译Article2.txt
[2019-09-02 09:35:52] -> 提取到 3 个文档
[2019-09-02 09:35:52] -> 开始翻译Article1.txt
[2019-09-02 09:36:01] -> Article1.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article1.txt
[2019-09-02 09:36:01] -> 开始翻译Article2.txt
[2019-09-02 09:36:06] -> Article2.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article2.txt
[2019-09-02 09:36:06] -> 开始翻译Article3.txt
[2019-09-02 09:36:11] -> Article3.txt翻译完成，新文档：F:\Git\TranslateTool\src\TranslateTool\TranslateTool\Doc_Out\Article3.txt
[2019-09-02 09:36:11] -> 翻译完成，请查看Doc_Out文件夹下面的文档


================================================
FILE: src/TranslateTool/TranslateTool/testdata/I have a dream.txt
================================================
Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation. This momentous decree came as a great beacon light of hope to millions of Negro slaves who had been seared in the flames of withering injustice. It came as a joyous daybreak to end the long night of bad captivity.
    But one hundred years later, the Negro still is not free. One hundred years later, the life of the Negro is still sadly crippled by the manacles of segregation and the chains of discrimination. One hundred years later, the Negro lives on a lonely island of poverty in the midst of a vast ocean of material prosperity. One hundred years later, the Negro is still languished in the corners of American society and finds himself an exile in his own land. So we’ve come here today to dramatize a shameful condition.



    I am not unmindful that some of you have come here out of great trials and tribulations. Some of you have come fresh from narrow jail cells. Some of you have come from areas where your quest for freedom left you battered by the storms of persecution and staggered by the winds of police brutality. You have been the veterans of creative suffering. Continue to work with the faith that unearned suffering is redemptive.
Go back to Mississippi, go back to Alabama, go back to South Carolina, go back to Georgia, go back to Louisiana, go back to the slums and ghettos of our northern cities, knowing that somehow this situation can and will be changed. Let us not wallow in the valley of despair.



I say to you today, my friends, so even though we face the difficulties of today and tomorrow, I still have a dream. It is a dream deeply rooted in the American dream.
    I have a dream that one day this nation will rise up, live up to the true meaning of its creed: “We hold these truths to be self-evident; that all men are created equal.”
    I have a dream that one day on the red hills of Georgia the sons of former slaves and the sons of former slave-owners will be able to sit down together at the table of brotherhood.
    I have a dream that one day even the state of Mississippi, a state sweltering with the heat of injustice, sweltering with the heat of oppression, will be transformed into an oasis of freedom and justice.
    I have a dream that my four children will one day live in a nation where they will not be judged by the color if their skin but by the content of their character.
    I have a dream today.



    I have a dream that one day down in Alabama with its governor having his lips dripping with the words of interposition and nullification, one day right down in Alabama little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers.
    I have a dream today.
    I have a dream that one day every valley shall be exalted, every hill and mountain shall be made low, the rough places will be made plain, and the crooked places will be made straight, and the glory of the Lord shall be revealed, and all flesh shall see it together.
This is our hope. This is the faith that I go back to the South with. With this faith we will be able to hew out of the mountain of despair a stone of hope. With this faith we will be able to transform the jangling discords of our nation into a beautiful symphony of brotherhood. With this faith we will be able to work together, to pray together, to struggle together, to go to jail together, to stand up for freedom together, knowing that we will be free one day.
    This will be the day when all of God’s children will be able to sing with new meaning. 
My country, ’ tis of thee,
Sweet land of liberty,
Of thee I sing:
Land where my fathers died,
Land of the pilgrims’ pride,
From every mountainside
Let freedom ring.
And if America is to be a great nation this must become true. So let freedom ring from the prodigious hilltops of New Hampshire.
Let freedom ring from the mighty mountains of New York!
Let freedom ring from the heightening Alleghenies of Pennsylvania!
Let freedom ring from the snowcapped Rockies of Colorado!
Let freedom ring from the curvaceous slops of California!
But not only that; let freedom ring from Stone Mountain of Georgia!
Let freedom ring from Lookout Mountain of Tennessee!
Let freedom ring from every hill and molehill of Mississippi!
From every mountainside, let freedom ring!
When we let freedom ring, when we let it ring from every village and every hamlet, from every state and every city, we will be able to speed up that day when all of God’s children, black men and white men, Jews and Gentiles, Protestants and Catholics, will be able to join hands and sing in the words of the old Negro spiritual, “Free at last! free at last! thank God almighty, we are free at last!”





================================================
FILE: src/TranslateTool/TranslateTool.sln
================================================

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15
VisualStudioVersion = 15.0.27428.2027
MinimumVisualStudioVersion = 10.0.40219.1
Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "TranslateTool", "TranslateTool\TranslateTool.pyproj", "{C8BD1066-8F3D-42E0-9CC2-40853F4770E9}"
EndProject
Global
	GlobalSection(SolutionConfigurationPlatforms) = preSolution
		Debug|Any CPU = Debug|Any CPU
		Release|Any CPU = Release|Any CPU
	EndGlobalSection
	GlobalSection(ProjectConfigurationPlatforms) = postSolution
		{C8BD1066-8F3D-42E0-9CC2-40853F4770E9}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
		{C8BD1066-8F3D-42E0-9CC2-40853F4770E9}.Release|Any CPU.ActiveCfg = Release|Any CPU
	EndGlobalSection
	GlobalSection(SolutionProperties) = preSolution
		HideSolutionNode = FALSE
	EndGlobalSection
	GlobalSection(ExtensibilityGlobals) = postSolution
		SolutionGuid = {7B4F213A-CCBB-4999-8529-34DC5A058F60}
	EndGlobalSection
EndGlobal


SYMBOL INDEX (50 symbols across 10 files)

FILE: src/TranslateTool/TranslateTool/Logger.py
  class Logger (line 26) | class Logger(object):
    method __init__ (line 29) | def __init__(self):
    method __new__ (line 41) | def __new__(cls,*args,**kwargs):
    method write (line 49) | def write(self,content):
    method delete_old_log (line 63) | def delete_old_log():

FILE: src/TranslateTool/TranslateTool/Main.py
  function run (line 30) | def run():
  function translate_doc (line 55) | def translate_doc(doc):
  function get_translate (line 67) | def get_translate(doc,path):
  function get_doc (line 86) | def get_doc():
  function get_path (line 103) | def get_path(*paths):
  function write_log (line 114) | def write_log(msg):

FILE: src/TranslateTool/TranslateTool/Py4Js.py
  class Py4Js (line 23) | class Py4Js():
    method __init__ (line 25) | def __init__(self):
    method getTk (line 67) | def getTk(self,text):

FILE: src/TranslateTool/TranslateTool/T_Base.py
  class Translate (line 23) | class Translate(object):
    method translate (line 25) | def translate(doc):

FILE: src/TranslateTool/TranslateTool/T_Docx.py
  class DocxTranslate (line 29) | class DocxTranslate(Translate):
    method __init__ (line 31) | def __init__(self, fileName, fullPath):
    method translate (line 41) | def translate(self):
    method prepare (line 70) | def prepare(self):
    method get_path (line 87) | def get_path(self,*paths):

FILE: src/TranslateTool/TranslateTool/T_Pdf_PyPDF2.py
  class PdfTranslagePyPDF2 (line 29) | class PdfTranslagePyPDF2(Translate):
    method __init__ (line 32) | def __init__(self, fileName, fullPath):
    method translate (line 42) | def translate(self):
    method removeBlankFromList (line 70) | def removeBlankFromList(self,list_old):
    method combine_list2str (line 80) | def combine_list2str(self,list_old):
    method enter_symbol (line 89) | def enter_symbol(self,content_old):
    method prepare (line 100) | def prepare(self):
    method get_path (line 117) | def get_path(self,*paths):
    method write (line 128) | def write(self,content):

FILE: src/TranslateTool/TranslateTool/T_Pdf_pdfminer.py
  class PdfTranslate (line 36) | class PdfTranslate(Translate):
    method __init__ (line 39) | def __init__(self, fileName, fullPath):
    method translate (line 49) | def translate(self):
    method prepare (line 106) | def prepare(self):
    method get_path (line 123) | def get_path(self,*paths):
    method write (line 134) | def write(self,content):

FILE: src/TranslateTool/TranslateTool/T_Txt.py
  class TxtTranslate (line 28) | class TxtTranslate(Translate):
    method __init__ (line 30) | def __init__(self, fileName,path):
    method translate (line 38) | def translate(self):
    method prepare (line 81) | def prepare(self):
    method get_path (line 98) | def get_path(self,*paths):  # If self is omitted from the parameters here, paths will include the class instance itself, causing joi...
    method write (line 109) | def write(self,content):

FILE: src/TranslateTool/TranslateTool/TranslateFunc.py
  function baidu_translate (line 30) | def baidu_translate(content,type=1):
  function youdao_translate (line 96) | def youdao_translate(content):
  function google_translate (line 161) | def google_translate(content):
  function translate_func (line 191) | def translate_func(content):

FILE: src/TranslateTool/TranslateTool/TranslateFuncT.py
  function baidu_translate (line 30) | def baidu_translate(content,type=1):
  function youdao_translate (line 96) | def youdao_translate(content):
  function google_translate (line 162) | def google_translate(content):
  function translate_func (line 195) | def translate_func(content):
