Repository: ADT109119/WhisperGUI
Branch: main
Commit: 24a34f397e70
Files: 10
Total size: 14.2 KB

Directory structure:
gitextract_zebwciym/
├── .gitignore
├── LICENSE
├── README.md
├── main.py
├── requirements.txt
├── run.bat
├── setup.bat
├── update.bat
├── version.txt
└── 使用說明.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
venv/
output/
model/
config.json

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2023 ADT109119

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
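
================================================
FILE: README.md
================================================
# WhisperGUI

![pic](https://user-images.githubusercontent.com/106337749/221340883-4b437d03-97fc-42ee-821e-dd04096323fe.png)

> This project is a simple Whisper GUI plus quick-install package that I put together in about two hours. It lets you run Whisper without typing commands, and it saves people who would rather not install a pile of dependencies by hand from doing so (you still need to install Python and FFmpeg yourself).

> Update 2023.03.30: Some users have reported that the GPU is not detected. My current guess is that this is related to the CUDA Toolkit. If you have not installed the CUDA Toolkit, try installing the NVIDIA CUDA Toolkit first. If it is installed and the GPU still does not show up, try editing line 22 of the .bat file so that the CUDA version of the Torch build it installs matches the CUDA Toolkit version on your machine.

> Update 2023.08.12: I have tested packaging this project as an executable. There is no need to run setup.bat to create a virtual environment: download the archive, unzip it, and run **main.exe** inside. You can download it from the releases page and try it out. (In testing, the packaged version did not require a manual CUDA Toolkit installation.)

[OpenAI Whisper](https://github.com/openai/whisper)

## Purpose

This project makes it quick to set up a working Whisper environment, and lets most users operate Whisper entirely through this GUI without typing commands.

## Features

The following Whisper features are currently supported:

1. Select multiple audio files
1. Choose the output location
1. Choose the model
1. Choose the recognition language
1. Choose the device (CPU or a specific GPU)
1. Translate subtitles into English

## Screenshots

> Workflow

Select files > select the device that runs the model > finished

![Select model](https://user-images.githubusercontent.com/106337749/218459288-0fd24ee4-4ed6-49c9-a3f4-1fd97976a89d.png)
![Select device](https://user-images.githubusercontent.com/106337749/218459323-faaf2d8d-0a68-4bfc-a6e3-62e45b94ad0f.png)
![Finished](https://user-images.githubusercontent.com/106337749/218460468-a801fe68-0f01-479d-a4bd-4f04eea1af41.png)

## Installation

> Install Python 3.7 or later and FFmpeg yourself first.

The steps below walk you through running this project on your computer.

### Get the project

```bash
git clone git@github.com:ADT109119/WhisperGUI.git
```

**Or click Download ZIP on the GitHub page**

### Confirm that Python and FFmpeg are installed

```bash
python --version
ffmpeg -version
```

### Run setup.bat

Run setup.bat in the project folder and wait for the virtual environment setup to finish.

### Run the project

Run run.bat in the project folder. If no error is reported, the GUI window appears.

## Folders

- model - where models are stored
- output - default output folder
- venv - virtual environment folder

...

## Built with

- Python
- tkinter
- ttkbootstrap

## Contact the author

You can reach me through:

- [Email: 2.jerry32262686@gmail.com](mailto:2.jerry32262686@gmail.com)

...

## License

This project is under the MIT License. See [LICENSE](https://github.com/ADT109119/WhisperGUI/blob/main/LICENSE) for further details.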
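
## Example: GUI options as Whisper flags

As a rough sketch of how the features listed in this README correspond to Whisper command-line flags, the helper below builds an argument list for one audio file. The function name `build_whisper_args` and its parameters are hypothetical and are not part of this project; the flags themselves (`--model`, `--device`, `-f srt`, `--language`, `--output_dir`, `--task translate`) are the ones main.py passes to the `whisper` command. The device mapping assumes, as in main.py, that index 0 of the device list is the CPU and index n is CUDA device n-1.

```python
from typing import List, Optional


def build_whisper_args(
    audio_path: str,
    model: str = "small",
    device_index: int = 0,
    language: Optional[str] = None,
    output_dir: Optional[str] = None,
    translate: bool = False,
) -> List[str]:
    """Return a Whisper CLI argument list for one audio file (illustrative only)."""
    # Index 0 is the CPU entry; index n >= 1 maps to the (n-1)-th CUDA device.
    device = "cpu" if device_index == 0 else "cuda:%d" % (device_index - 1)
    args = ["whisper", audio_path, "--model", model, "--device", device, "-f", "srt"]
    if language is not None:
        # None stands for automatic language detection, so no flag is added.
        args += ["--language", language]
    if output_dir:
        args += ["--output_dir", output_dir]
    if translate:
        args += ["--task", "translate"]
    return args


if __name__ == "__main__":
    print(build_whisper_args("talk.mp3", device_index=1, language="Chinese", translate=True))
```

Passing a list like this to `subprocess.run` avoids manual quoting, which matters for language names that contain a space, such as "Haitian Creole".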

================================================
FILE: main.py
================================================
import os
import easygui
import tkinter as tk
from tkinter import filedialog
import tkinter.ttk
from tkinter.constants import *
from tkinter import messagebox
import subprocess
import json
import torch
import time
import urllib.request
import ttkbootstrap as ttk
# from ttkbootstrap.constants import *
#easygui.fileopenbox()
import webbrowser
import threading

def callback(url):
    # Open a URL in the default web browser.
    webbrowser.open_new(url)

def selectPhotoFolder():
    # Ask for the subtitle output folder ("字幕存放資料夾") and persist the choice.
    outputDir = easygui.diropenbox("字幕存放資料夾")
    output_dir.delete(0, 'end')
    output_dir.insert(0, outputDir)
    config["outputDir"] = outputDir
    saveConfig("outputDir", outputDir)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

def selectAudioFile():
    # Append every selected audio file to the list box.
    paths = filedialog.askopenfilenames()
    for path in paths:
        displayAudioFilePath.insert(END, path)
    # File = easygui.fileopenbox("選擇音檔檔案")
    # displayAudioFilePath.delete(0, 'end')
    # displayAudioFilePath.insert(0, File)

def detectAvailableDevice():
    # Add each CUDA device name after the initial 'cpu' entry.
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            devices.append(torch.cuda.get_device_name(i))
        usingDevice['value'] = devices
    return

def deviceChange(index, value, op):
    #print("%s %s %s"%(index, value, op))
    dev = devices.index(usingDevice.get())
    config["usingDevice"] = dev
    saveConfig("usingDevice", dev)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)
    if dev > 0:
        print("cuda:%s"%(dev-1))
    else:
        print("cpu")

def deviceDecode():
    # Index 0 is the CPU; index n (n >= 1) is CUDA device n-1.
    dev = devices.index(usingDevice.get())
    if dev > 0:
        return("cuda:%s"%(dev-1))
    else:
        return("cpu")

def languageChange(index, value, op):
    transcribeLanguage = languages.index(usingLanguage.get())
    config["transcribeLanguage"] = transcribeLanguage
    saveConfig("transcribeLanguage", transcribeLanguage)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

def modelChange(index, value, op):
    model = models.index(usingModel.get())
    config["usingModel"] = model
    saveConfig("usingModel", model)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

def versionCheck(ori):
    # fetch app version
    try:
        url = 'https://raw.githubusercontent.com/ADT109119/WhisperGUI/main/version.txt'
        response = urllib.request.urlopen(url)
        # strip() guards against a trailing newline in version.txt.
        fetchVersion = response.read().decode('utf-8').strip()
        if fetchVersion != "ver 1.9.1":
            # "A new version is available ... go to GitHub to download it?"
            checkVisitGithub = messagebox.askquestion(title="有新版本", message="目前最新版本為%s\n請問您是否想前往GitHub下載最新版本"%(fetchVersion))
            if checkVisitGithub == 'yes':
                callback("https://github.com/ADT109119/WhisperGUI")
        elif ori == 1:
            # "You are on the latest version"
            messagebox.showinfo(title="訊息", message="目前版本為最新")
    except Exception:
        # "Unable to fetch version information"
        print("無法獲取版本資訊")

def saveConfig(key, value):
    # Update one setting and rewrite config.json.
    global config
    config[key] = value
    with open('config.json', 'w', encoding='utf8') as f:
        json.dump(config, f)

def start_process():
    # Run transcription on a worker thread so the GUI stays responsive.
    t = threading.Thread(target=process)
    t.start()

def process():
    # File = displayAudioFilePath.get(0)
    if displayAudioFilePath.size() == 0:
        # "Error": "No audio file selected"
        messagebox.showerror(title="錯誤", message='未選擇音檔')
        return 0

    start = time.time()
    baseCommandStr = "venv\\Scripts\\whisper"

    # Build and run one whisper command per selected audio file.
    for i in range(displayAudioFilePath.size()):
        path = displayAudioFilePath.get(i)
        print(path)
        commandStr = baseCommandStr + ' "%s"'%path

        languageInput = usingLanguage.get()
        if languageInput != "自動偵測":
            # Quoted so that names containing a space (e.g. "Haitian Creole") stay one argument.
            commandStr = commandStr + ' --language "%s" '%languageInput

        deviceInput = deviceDecode()
        commandStr = commandStr + " --device %s --fp16 False"%deviceInput
        commandStr = commandStr + " --model %s "%usingModel.get()
        commandStr = commandStr + " -f srt"

        output_dirInput = output_dir.get()
        if outputToTheSamePathAsInputVar.get()=="1":
            # Write next to the input file instead of the shared output folder.
            output_dirInput = "/".join(path.split("/")[:-1])
        # print(output_dirInput)
        if output_dirInput != "":
            commandStr = commandStr + ' --output_dir "%s" '%output_dirInput

        commandStr = commandStr + " --model_dir %s "%("model")

        if translateToEnglishVar.get() == '1':
            commandStr = commandStr + " --task %s "%("translate")

        if initial_prompt.get() != "":
            commandStr = commandStr + ' --initial_prompt "%s" '%initial_prompt.get()

        # print(os.system("echo %s"%commandStr))
        # print(commandStr)
        processButton["state"] = "disabled"
        # Run inside the loop so every selected file is processed (indentation inferred).
        out = subprocess.Popen(commandStr)
        (out, err) = out.communicate()

    end = time.time()
    # "Done. Time taken: %.2f seconds"
    messagebox.showinfo(title="訊息", message="處理完成\n花費時間%.2f秒"%(end-start))
    processButton["state"] = "normal"
    saveConfig("prompt", initial_prompt.get())
    #outputPreviewVar.set(out)
    #print(out)

# config
config = {
    "outputDir": os.getcwd() + "\\output",
    "usingModel": 2,
    "usingDevice": 0,
    "transcribeLanguage": 0,
    "autoCheckVersion": True,
    "prompt": ""
}

# Create config.json with defaults on first run; otherwise load the saved settings.
if not os.path.exists(".\\config.json"):
    with open('config.json', 'w', encoding='utf8') as f:
        json.dump(config, f)
else:
    with open('config.json', 'r', encoding='utf8') as f:
        config = json.load(f)

if config["autoCheckVersion"] == True:
    versionCheck(0)

heightFix_1 = 70

window = tk.Tk()
window.title('WhisperGUI By The Walking Fish')
window.geometry('580x330')
window.resizable(False, False)

# Audio file list ("選擇音檔" = "Select audio files")
label1 = tk.Label(text='選擇音檔')
label1.place(x=0, y=10)
displayAudioFilePath = tk.Listbox(width=60, height=5)
displayAudioFilePath.place(x=80, y=10)
selectAudioFileButton = ttk.Button(text='+添加', command=selectAudioFile)
selectAudioFileButton.place(x=510, y=20)
selectAudioFileButton = ttk.Button(text='-刪除', bootstyle='danger', command=lambda x=displayAudioFilePath: x.delete("active"))
selectAudioFileButton.place(x=510, y=50)

# Output folder ("字幕存放資料夾" = "Subtitle output folder")
label2 = tk.Label(text='字幕存放資料夾')
label2.place(x=0, y=30+heightFix_1)
output_dir = tk.Entry(width=55)
output_dir.place(x=120, y=30+heightFix_1)
output_dir.insert(0, config["outputDir"])
selectPhotoPathButton = tk.Button(text='....', command=selectPhotoFolder)
selectPhotoPathButton.place(x=500, y=30+heightFix_1)

# Checkbox: write each output next to its input file.
outputToTheSamePathAsInputVar = tk.StringVar()
outputToTheSamePathAsInput = tk.Checkbutton(text="檔案輸出到與個別輸入檔案相同位置", variable=outputToTheSamePathAsInputVar, onvalue="1", offvalue="0")
outputToTheSamePathAsInput.deselect()
outputToTheSamePathAsInput.place(x=300, y=60+heightFix_1)

# Model selector ("使用模型" = "Model")
label_usingModel = tk.Label(text='使用模型')
label_usingModel.place(x=0, y=60+heightFix_1)
var = tk.StringVar()
var.trace_add("write", modelChange)
usingModel = tkinter.ttk.Combobox(window, textvariable=var)
models = ['tiny', 'base', 'small', 'medium', 'large', 'large-v1', 'large-v2', 'large-v3', 'turbo']
usingModel['value'] = models
usingModel.current(config["usingModel"])
usingModel.place(x=60, y=60+heightFix_1)

# Device selector ("使用裝置" = "Device")
label_usingDevice = tk.Label(text='使用裝置')
label_usingDevice.place(x=0, y=90+heightFix_1)
deviceVar = tk.StringVar()
deviceVar.trace_add("write", deviceChange)
usingDevice = tkinter.ttk.Combobox(window, textvariable=deviceVar)
devices = ['cpu']
usingDevice['value'] = devices
detectAvailableDevice()
usingDevice.current(config["usingDevice"])
usingDevice.place(x=60, y=90+heightFix_1)

# Language selector ("辨識語言" = "Recognition language"; "自動偵測" = "Auto-detect")
label_language = tk.Label(text='辨識語言')
label_language.place(x=0, y=120+heightFix_1)
languageVar = tk.StringVar()
languageVar.trace_add("write", languageChange)
usingLanguage = tkinter.ttk.Combobox(window, textvariable=languageVar)
languages = [
    '自動偵測',
    "Afrikaans","Albanian","Amharic","Arabic","Armenian","Assamese","Azerbaijani",
    "Bashkir","Basque","Belarusian","Bengali","Bosnian","Breton","Bulgarian","Burmese",
    "Castilian","Catalan","Chinese","Croatian","Czech","Danish","Dutch",
    "English","Estonian","Faroese","Finnish","Flemish","French",
    "Galician","Georgian","German","Greek","Gujarati",
    "Haitian","Haitian Creole","Hausa","Hawaiian","Hebrew","Hindi","Hungarian",
    "Icelandic","Indonesian","Italian","Japanese","Javanese",
    "Kannada","Kazakh","Khmer","Korean",
    "Lao","Latin","Latvian","Letzeburgesch","Lingala","Lithuanian","Luxembourgish",
    "Macedonian","Malagasy","Malay","Malayalam","Maltese","Maori","Marathi",
    "Moldavian","Moldovan","Mongolian","Myanmar",
    "Nepali","Norwegian","Nynorsk","Occitan",
    "Panjabi","Pashto","Persian","Polish","Portuguese","Punjabi","Pushto",
    "Romanian","Russian",
    "Sanskrit","Serbian","Shona","Sindhi","Sinhala","Sinhalese","Slovak","Slovenian",
    "Somali","Spanish","Sundanese","Swahili","Swedish",
    "Tagalog","Tajik","Tamil","Tatar","Telugu","Thai","Tibetan","Turkish","Turkmen",
    "Ukrainian","Urdu","Uzbek","Valencian","Vietnamese","Welsh","Yiddish","Yoruba",
]
usingLanguage['value'] = languages
usingLanguage.current(config["transcribeLanguage"])
usingLanguage.place(x=60, y=120+heightFix_1)

# Checkbox: translate the output subtitles into English.
translateToEnglishVar = tk.StringVar()
translateToEnglish = tk.Checkbutton(text="將輸出字幕翻譯為英文", variable=translateToEnglishVar, onvalue="1", offvalue="0")
translateToEnglish.deselect()
translateToEnglish.place(x=0, y=150+heightFix_1)

# Initial prompt ("內容提示詞" = "Content prompt")
initial_prompt_label = tk.Label(text='內容提示詞')
initial_prompt_label.place(x=0, y=180+heightFix_1)
initial_prompt_var = tk.StringVar(value=config.get("prompt"))
initial_prompt = tk.Entry(width=55, textvariable=initial_prompt_var)
initial_prompt.pack()
initial_prompt.place(x=80, y=180+heightFix_1)

label_copyright = tk.Label(text='MIT License')
label_copyright.place(x=0, y=210+heightFix_1)
label_author = tk.Label(text='製作: The Walking Fish')
# The event sequence was lost in extraction; a left-click binding is assumed here.
label_author.bind("<Button-1>", lambda e: callback("https://www.youtube.com/@the_walking_fish"))
label_author.place(x=0, y=230+heightFix_1)

# Run button ("執行" = "Run")
processButton = tk.Button(text='執行', width=20, command=start_process)
processButton.place(anchor='center', x=290, y=240+heightFix_1)

# menu
menu = tk.Menu(window)
settingMenu = tk.Menu(menu)
autoCheckVar = tk.BooleanVar()
autoCheckVar.trace_add("write", lambda index, value, op:
    saveConfig("autoCheckVersion", autoCheckVar.get()))
autoCheckVar.set(config["autoCheckVersion"])
# "自動檢查版本" = "Check for updates automatically"; "檢查版本" = "Check version"
settingMenu.add_checkbutton(label="自動檢查版本", variable=autoCheckVar)
settingMenu.add_separator()
settingMenu.add_command(label="檢查版本", command=lambda: versionCheck(1))
menu.add_cascade(label="setting", menu=settingMenu)
window.config(menu=menu)

window.mainloop()

================================================
FILE: requirements.txt
================================================
easygui
ttkbootstrap
openai-whisper
requests

================================================
FILE: run.bat
================================================
venv\Scripts\python main.py

================================================
FILE: setup.bat
================================================
@echo off

goto :DOES_PYTHON_EXIST

:DOES_PYTHON_EXIST
python -V | find /v "Python" >NUL 2>NUL && (goto :PYTHON_DOES_NOT_EXIST)
python -V | find "Python" >NUL 2>NUL && (goto :PYTHON_DOES_EXIST)
goto :EOF

:PYTHON_DOES_NOT_EXIST
echo Python is not installed on your system.
echo Now opening the download URL.
start "" "https://www.microsoft.com/store/productId/9PJPW5LDXLZ5"
PAUSE
goto :EOF

:PYTHON_DOES_EXIST
:: This will retrieve Python 3.8.0 for example.
for /f "delims=" %%V in ('python -V') do @set ver=%%V
echo Congrats, %ver% is installed...
python -m venv .\venv
venv\Scripts\pip install torch==1.13.1 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
venv\Scripts\pip install -U -r requirements.txt
PAUSE
goto :EOF

================================================
FILE: update.bat
================================================
venv\Scripts\pip install -U openai-whisper

================================================
FILE: version.txt
================================================
ver 1.9.1

================================================
FILE: 使用說明.txt
================================================
1. Open setup.bat first. It automatically creates the virtual environment and installs the required libraries.
2. Open run.bat to start using the program.
3. Find the resulting files in the output folder.