Full Code of ADT109119/WhisperGUI for AI

Repository: ADT109119/WhisperGUI
Branch: main
Commit: 24a34f397e70
Files: 10
Total size: 14.2 KB

Directory structure:
gitextract_zebwciym/

├── .gitignore
├── LICENSE
├── README.md
├── main.py
├── requirements.txt
├── run.bat
├── setup.bat
├── update.bat
├── version.txt
└── 使用說明.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
venv/
output/
model/
config.json

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2023 ADT109119

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# WhisperGUI

![pic](https://user-images.githubusercontent.com/106337749/221340883-4b437d03-97fc-42ee-821e-dd04096323fe.png)

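> I put this project together in about 2 hours: a simple WhisperGUI plus a quick-install package. It lets you operate Whisper quickly without typing commands, and it lets people who would rather not install a pile of dependencies by hand get started quickly (you still have to install Python and FFmpeg yourself).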

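> Update 2023.03.30: Some users recently reported that the GPU is not detected. My current guess is that this is related to the CUDA Toolkit. If you have not installed the CUDA Toolkit, try installing the Nvidia CUDA Toolkit first. If it is installed and the GPU still does not show up, try editing line 22 of the bat file (the torch install line in setup.bat) so that the Torch CUDA version matches the CUDA Toolkit version on your machine.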
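
As a rough way to see what PyTorch detects before changing the CUDA version in setup.bat, the sketch below lists devices with the same calls main.py uses (`torch.cuda.is_available`, `device_count`, `get_device_name`) and reports the CUDA version the installed torch build targets. The function names `list_devices` and `torch_cuda_build` are illustrative and not part of this project, and the import is guarded so the script still runs where torch is missing.

```python
def list_devices():
    """Return device labels in the GUI's order: CPU first, then each CUDA GPU."""
    labels = ["cpu"]
    try:
        import torch  # guarded: the sketch should still run without torch
    except ImportError:
        return labels
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            labels.append("cuda:%d (%s)" % (i, torch.cuda.get_device_name(i)))
    return labels


def torch_cuda_build():
    """Return the CUDA version torch was built for, or None (CPU build or no torch)."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    print("devices:", list_devices())
    print("torch CUDA build:", torch_cuda_build())
```

Running it with `venv\Scripts\python` inspects the torch that setup.bat installed. If only `cpu` is listed on a machine with an Nvidia GPU, comparing the reported CUDA build with the installed CUDA Toolkit version is a reasonable first check.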

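> Update 2023.08.12: I have tested packaging this project as an executable. There is no need to run setup.bat to create a virtual environment: download it, unzip it, and run the **main.exe** inside. You can get it from releases and try it out. (In testing, the packaged version does not require installing the CUDA Toolkit manually.)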

[OpenAI Whisper](https://github.com/openai/whisper)

## Purpose

This project makes it quick to set up a working Whisper environment, and it lets most users run Whisper entirely through this GUI without typing commands.


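## Features

The following Whisper features are currently supported:
1. Select multiple audio files
1. Choose the output location
1. Choose the model
1. Choose the recognition language
1. Choose the device (CPU or a specific GPU)
1. Translate subtitles into English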
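
Each option in the list corresponds to a flag on the `whisper` command that main.py launches. The sketch below is a minimal illustration of that mapping, not the project's actual code: `build_whisper_args` is a hypothetical helper, the flags are the ones that appear in main.py, and it returns an argument list instead of a single command string so that values containing spaces need no manual quoting.

```python
import subprocess


def build_whisper_args(audio_path, model="small", device="cpu", language=None,
                       output_dir="output", translate=False,
                       whisper_exe="venv\\Scripts\\whisper"):
    """Build the argument list for one audio file (hypothetical helper)."""
    args = [
        whisper_exe, audio_path,
        "--model", model,
        "--device", device,
        "--fp16", "False",
        "-f", "srt",             # subtitle output format
        "--model_dir", "model",  # folder where models are stored
    ]
    if language:                 # None means automatic language detection
        args += ["--language", language]
    if output_dir:
        args += ["--output_dir", output_dir]
    if translate:                # translate the subtitles into English
        args += ["--task", "translate"]
    return args


if __name__ == "__main__":
    args = build_whisper_args("talk.mp3", language="Haitian Creole", translate=True)
    print(args)
    # To actually run it (requires the venv created by setup.bat):
    # subprocess.run(args, check=True)
```

Passing a list to `subprocess` keeps a language name such as `Haitian Creole` as one argument.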


## Screenshots

> Walkthrough: select files > select the device that runs the model > run complete

![Select model](https://user-images.githubusercontent.com/106337749/218459288-0fd24ee4-4ed6-49c9-a3f4-1fd97976a89d.png)
![Select device](https://user-images.githubusercontent.com/106337749/218459323-faaf2d8d-0a68-4bfc-a6e3-62e45b94ad0f.png)
![Run complete](https://user-images.githubusercontent.com/106337749/218460468-a801fe68-0f01-479d-a4bd-4f04eea1af41.png)

## Installation

> Install Python 3.7 or later and FFmpeg yourself first.

The following steps walk you through running this project on your computer.

### Get the project

```bash
git clone git@github.com:ADT109119/WhisperGUI.git
```

**Or click Download ZIP on the GitHub page.**

### Confirm that Python and FFmpeg are installed

```bash
python --version
ffmpeg -version
```
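
The same check can be scripted. This is a minimal sketch using only the standard library; `missing_tools` is an illustrative name, and it only tests whether the executables are on `PATH`, not which versions they are.

```python
import shutil


def missing_tools(names=("python", "ffmpeg")):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Python and FFmpeg are both available.")
```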

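### Run setup.bat

Run setup.bat in the project folder and wait for the virtual environment to finish setting up.

### Run the project

Run run.bat in the project folder. If no errors are reported, the GUI will appear.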

## Folders

- model - where models are stored
- output - default output folder
- venv - virtual environment folder
...
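
To find a result in the output folder, the sketch below computes where the subtitle for a given audio file should land. It assumes that Whisper names each `.srt` file after the audio file's base name, which matches the `whisper` CLI's behavior as far as is known but is not stated in this repository. `expected_srt_path` is a hypothetical helper.

```python
from pathlib import Path


def expected_srt_path(audio_path, output_dir="output"):
    """Guess the subtitle path: <output_dir>/<audio base name>.srt (assumed naming)."""
    return Path(output_dir) / (Path(audio_path).stem + ".srt")


if __name__ == "__main__":
    print(expected_srt_path("recordings/talk.mp3"))
```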

## Tech stack

- Python
- tkinter
- ttkbootstrap

## Contact the author

You can reach me through the following:

- [Email: 2.jerry32262686@gmail.com](mailto:2.jerry32262686@gmail.com)
...

## License
This project is under the MIT License. See [LICENSE](https://github.com/ADT109119/WhisperGUI/blob/main/LICENSE) for further details.


================================================
FILE: main.py
================================================
import os
import easygui
import tkinter as tk
from tkinter import filedialog
import tkinter.ttk
from tkinter.constants import *
from tkinter import messagebox
import subprocess
import json

import torch
import time
import urllib.request

import ttkbootstrap as ttk
# from ttkbootstrap.constants import *

#easygui.fileopenbox()

import webbrowser

import threading

def callback(url):
    webbrowser.open_new(url)


def selectPhotoFolder():
    outputDir = easygui.diropenbox("字幕存放資料夾")
    if not outputDir:
        return  # dialog cancelled (diropenbox returns None): keep the current folder
    output_dir.delete(0, 'end')
    output_dir.insert(0, outputDir)
    # saveConfig updates the in-memory config and writes config.json
    saveConfig("outputDir", outputDir)


def selectAudioFile():
    paths = filedialog.askopenfilenames()
    for path in paths:
        displayAudioFilePath.insert(END, path)
    # File = easygui.fileopenbox("選擇音檔檔案")
    # displayAudioFilePath.delete(0, 'end')
    # displayAudioFilePath.insert(0, File)
    
def detectAvailableDevice():
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            devices.append(torch.cuda.get_device_name(i))
        
    usingDevice['value'] = devices
    return

def deviceChange(index, value, op):
    #print("%s %s %s"%(index, value, op))
    dev = devices.index(usingDevice.get())
    config["usingDevice"] = dev
    saveConfig("usingDevice", dev)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

    if dev > 0:
        print("cuda:%s"%(dev-1))
    else:
        print("cpu")

def deviceDecode():
    dev = devices.index(usingDevice.get())
    if dev > 0:
        return("cuda:%s"%(dev-1))
    else:
        return("cpu")

def languageChange(index, value, op):
    transcribeLanguage = languages.index(usingLanguage.get())
    config["transcribeLanguage"] = transcribeLanguage
    saveConfig("transcribeLanguage", transcribeLanguage)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

def modelChange(index, value, op):
    model = models.index(usingModel.get())
    config["usingModel"] = model
    saveConfig("usingModel", model)
    # with open('config.json', 'w', encoding='utf8') as f:
    #     json.dump(config, f)

def versionCheck(ori): # fetch app version
    try:
        url = 'https://raw.githubusercontent.com/ADT109119/WhisperGUI/main/version.txt'
        response = urllib.request.urlopen(url)
        fetchVersion = response.read().decode('utf-8').strip()  # ignore a trailing newline in version.txt
        if fetchVersion != "ver 1.9.1":
            checkVisitGithub = messagebox.askquestion(title="有新版本", message="目前最新版本為%s\n請問您是否想前往GitHub下載最新版本"%(fetchVersion))
            if checkVisitGithub == 'yes':
                callback("https://github.com/ADT109119/WhisperGUI")
        elif ori == 1:
            messagebox.showinfo(title="訊息", message="目前版本為最新")

    except Exception:  # network or decoding failure; do not swallow KeyboardInterrupt/SystemExit
        print("無法獲取版本資訊")

def saveConfig(key, value):
    global config
    config[key] = value
    with open('config.json', 'w', encoding='utf8') as f:
        json.dump(config, f)

def start_process():
    t = threading.Thread(target=process)
    t.start()

def process():
    # File = displayAudioFilePath.get(0)
    if displayAudioFilePath.size() == 0:
        messagebox.showerror(title="錯誤", message='未選擇音檔')
        return 0

    start = time.time()

    baseCommandStr = "venv\\Scripts\\whisper"
    
    for i in range(displayAudioFilePath.size()):
        path = displayAudioFilePath.get(i)
        print(path)
        commandStr = baseCommandStr + ' "%s"'%path

        languageInput = usingLanguage.get()

        if languageInput != "自動偵測":  # "自動偵測" = auto-detect; quote names with spaces such as "Haitian Creole"
            commandStr = commandStr + ' --language "%s" '%languageInput

        deviceInput = deviceDecode()
        commandStr = commandStr + " --device %s --fp16 False"%deviceInput

        commandStr = commandStr + " --model %s "%usingModel.get()

        commandStr = commandStr + " -f srt"

        output_dirInput = output_dir.get()

        if outputToTheSamePathAsInputVar.get()=="1":
            output_dirInput = os.path.dirname(path)  # folder that contains the input file
            # print(output_dirInput)

        if output_dirInput != "":
            commandStr = commandStr + ' --output_dir "%s" '%output_dirInput

        commandStr = commandStr + " --model_dir %s "%("model")

        if translateToEnglishVar.get() == '1':
            commandStr = commandStr + " --task %s "%("translate")


        if initial_prompt.get() != "":
            commandStr = commandStr + ' --initial_prompt "%s" '%initial_prompt.get()

        # print(os.system("echo %s"%commandStr))
        # print(commandStr)
        processButton["state"] = "disabled"
        
        out = subprocess.Popen(commandStr)
        (out, err) = out.communicate()
    

    end = time.time()

    messagebox.showinfo(title="訊息", message="處理完成\n花費時間%.2f秒"%(end-start))
    processButton["state"] = "normal"
    saveConfig("prompt", initial_prompt.get())
    #outputPreviewVar.set(out)
    #print(out)

# config
config = {
    "outputDir": os.getcwd() + "\\output",
    "usingModel": 2,
    "usingDevice": 0,
    "transcribeLanguage": 0,
    "autoCheckVersion": True,
    "prompt": ""
}


if not os.path.exists(".\\config.json"):
    with open('config.json', 'w', encoding='utf8') as f:
        json.dump(config, f)
else:
    with open('config.json', 'r', encoding='utf8') as f:
        config.update(json.load(f))  # keep defaults for keys missing from older config files

if config["autoCheckVersion"]:
    versionCheck(0)


heightFix_1 = 70

window = tk.Tk()
window.title('WhisperGUI By The Walking Fish')
window.geometry('580x330')
window.resizable(False, False)

label1 = tk.Label(text='選擇音檔')
label1.place(x=0, y=10)
displayAudioFilePath = tk.Listbox(width=60, height=5)
displayAudioFilePath.place(x=80, y=10)
selectAudioFileButton = ttk.Button(text='＋添加', command=selectAudioFile)
selectAudioFileButton.place(x=510, y=20)
deleteAudioFileButton = ttk.Button(text='－刪除', bootstyle='danger', command=lambda x=displayAudioFilePath: x.delete("active"))
deleteAudioFileButton.place(x=510, y=50)



label2 = tk.Label(text='字幕存放資料夾')
label2.place(x=0, y=30+heightFix_1)
output_dir = tk.Entry(width=55)
output_dir.place(x=120, y=30+heightFix_1)
output_dir.insert(0, config["outputDir"])
selectPhotoPathButton = tk.Button(text='....', command=selectPhotoFolder)
selectPhotoPathButton.place(x=500, y=30+heightFix_1)

outputToTheSamePathAsInputVar = tk.StringVar()
outputToTheSamePathAsInput = tk.Checkbutton(text="檔案輸出到與個別輸入檔案相同位置", variable=outputToTheSamePathAsInputVar, onvalue="1", offvalue="0")
outputToTheSamePathAsInput.deselect()
outputToTheSamePathAsInput.place(x=300, y=60+heightFix_1)

label_usingModel = tk.Label(text='使用模型')
label_usingModel.place(x=0, y=60+heightFix_1)
var = tk.StringVar()
var.trace_add("write", modelChange)
usingModel = tkinter.ttk.Combobox(window, textvariable=var)
models = ['tiny', 'base', 'small', 'medium', 'large', 'large-v1', 'large-v2', 'large-v3', 'turbo']
usingModel['value'] = models
usingModel.current(config["usingModel"])
usingModel.place(x=60, y=60+heightFix_1)

label_usingDevice = tk.Label(text='使用裝置')
label_usingDevice.place(x=0, y=90+heightFix_1)
deviceVar = tk.StringVar()
deviceVar.trace_add("write", deviceChange)
usingDevice = tkinter.ttk.Combobox(window, textvariable=deviceVar)
devices = ['cpu']
usingDevice['value'] = devices
detectAvailableDevice()
usingDevice.current(config["usingDevice"])
usingDevice.place(x=60, y=90+heightFix_1)

label_language = tk.Label(text='辨識語言')
label_language.place(x=0, y=120+heightFix_1)
languageVar = tk.StringVar()
languageVar.trace_add("write", languageChange)
usingLanguage = tkinter.ttk.Combobox(window, textvariable=languageVar)
languages = ['自動偵測', "Afrikaans","Albanian","Amharic","Arabic","Armenian","Assamese","Azerbaijani","Bashkir","Basque","Belarusian","Bengali","Bosnian","Breton","Bulgarian","Burmese","Castilian","Catalan","Chinese","Croatian","Czech","Danish","Dutch","English","Estonian","Faroese","Finnish","Flemish","French","Galician","Georgian","German","Greek","Gujarati","Haitian","Haitian Creole","Hausa","Hawaiian","Hebrew","Hindi","Hungarian","Icelandic","Indonesian","Italian","Japanese","Javanese","Kannada","Kazakh","Khmer","Korean","Lao","Latin","Latvian","Letzeburgesch","Lingala","Lithuanian","Luxembourgish","Macedonian","Malagasy","Malay","Malayalam","Maltese","Maori","Marathi","Moldavian","Moldovan","Mongolian","Myanmar","Nepali","Norwegian","Nynorsk","Occitan","Panjabi","Pashto","Persian","Polish","Portuguese","Punjabi","Pushto","Romanian","Russian","Sanskrit","Serbian","Shona","Sindhi","Sinhala","Sinhalese","Slovak","Slovenian","Somali","Spanish","Sundanese","Swahili","Swedish","Tagalog","Tajik","Tamil","Tatar","Telugu","Thai","Tibetan","Turkish","Turkmen","Ukrainian","Urdu","Uzbek","Valencian","Vietnamese","Welsh","Yiddish","Yoruba"]
usingLanguage['value'] = languages
usingLanguage.current(config["transcribeLanguage"])
usingLanguage.place(x=60, y=120+heightFix_1)

translateToEnglishVar = tk.StringVar()
translateToEnglish = tk.Checkbutton(text="將輸出字幕翻譯為英文", variable=translateToEnglishVar, onvalue="1", offvalue="0")
translateToEnglish.deselect()
translateToEnglish.place(x=0, y=150+heightFix_1)

initial_prompt_label = tk.Label(text='內容提示詞')
initial_prompt_label.place(x=0, y=180+heightFix_1)
initial_prompt_var = tk.StringVar(value=config.get("prompt"))
initial_prompt = tk.Entry(width=55, textvariable=initial_prompt_var)
initial_prompt.pack()
initial_prompt.place(x=80, y=180+heightFix_1)

label_copyright = tk.Label(text='MIT License')
label_copyright.place(x=0, y=210+heightFix_1)
label_author = tk.Label(text='製作: The Walking Fish')
label_author.bind("<Button-1>", lambda e: callback("https://www.youtube.com/@the_walking_fish"))
label_author.place(x=0, y=230+heightFix_1)

processButton = tk.Button(text='執行', width=20, command=start_process)
processButton.place(anchor='center', x=290, y=240+heightFix_1)

# menu
menu = tk.Menu(window)
settingMenu = tk.Menu(menu)
autoCheckVar = tk.BooleanVar()
autoCheckVar.trace_add("write", lambda index, value, op: saveConfig("autoCheckVersion", autoCheckVar.get()))
autoCheckVar.set(config["autoCheckVersion"])
settingMenu.add_checkbutton(label="自動檢查版本", variable=autoCheckVar)
settingMenu.add_separator()
settingMenu.add_command(label="檢查版本", command=lambda: versionCheck(1))
menu.add_cascade(label="setting", menu=settingMenu)
window.config(menu=menu)

window.mainloop()


================================================
FILE: requirements.txt
================================================
easygui
ttkbootstrap
openai-whisper
requests

================================================
FILE: run.bat
================================================
venv\Scripts\python main.py


================================================
FILE: setup.bat
================================================
@echo off
goto :DOES_PYTHON_EXIST


:DOES_PYTHON_EXIST
python -V | find /v "Python" >NUL 2>NUL && (goto :PYTHON_DOES_NOT_EXIST)
python -V | find "Python"    >NUL 2>NUL && (goto :PYTHON_DOES_EXIST)
goto :EOF

:PYTHON_DOES_NOT_EXIST
echo Python is not installed on your system.
echo Now opening the download URL.
start "" "https://www.microsoft.com/store/productId/9PJPW5LDXLZ5"
PAUSE
goto :EOF

:PYTHON_DOES_EXIST
:: This will retrieve Python 3.8.0 for example.
for /f "delims=" %%V in ('python -V') do @set ver=%%V
echo Congrats, %ver% is installed...
python -m venv .\venv
venv\Scripts\pip install torch==1.13.1 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
venv\Scripts\pip install -U -r requirements.txt

PAUSE
goto :EOF




================================================
FILE: update.bat
================================================
venv\Scripts\pip install -U openai-whisper

================================================
FILE: version.txt
================================================
ver 1.9.1

================================================
FILE: 使用說明.txt
================================================
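1. Run setup.bat first. It automatically creates the virtual environment and installs all the required libraries.
2. Run run.bat to start using the program.
3. Find the resulting files in the output folder.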

SYMBOL INDEX (12 symbols across 1 files)

FILE: main.py
  function callback (line 24) | def callback(url):
  function selectPhotoFolder (line 28) | def selectPhotoFolder():
  function selectAudioFile (line 38) | def selectAudioFile():
  function detectAvailableDevice (line 46) | def detectAvailableDevice():
  function deviceChange (line 54) | def deviceChange(index, value, op):
  function deviceDecode (line 67) | def deviceDecode():
  function languageChange (line 74) | def languageChange(index, value, op):
  function modelChange (line 81) | def modelChange(index, value, op):
  function versionCheck (line 88) | def versionCheck(ori): # fetch app version
  function saveConfig (line 103) | def saveConfig(key, value):
  function start_process (line 109) | def start_process():
  function process (line 113) | def process():

Condensed preview — 10 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (18K chars).

[
  {
    "path": ".gitignore",
    "chars": 32,
    "preview": "venv/\noutput/\nmodel/\nconfig.json"
  },
  {
    "path": "LICENSE",
    "chars": 1066,
    "preview": "MIT License\n\nCopyright (c) 2023 ADT109119\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\n"
  },
  {
    "path": "README.md",
    "chars": 1831,
    "preview": "# WhisperGUI\n\n![pic](https://user-images.githubusercontent.com/106337749/221340883-4b437d03-97fc-42ee-821e-dd04096323fe."
  },
  {
    "path": "main.py",
    "chars": 10669,
    "preview": "import os\r\nimport easygui\r\nimport tkinter as tk\r\nfrom tkinter import filedialog\r\nimport tkinter.ttk\r\nfrom tkinter.consta"
  },
  {
    "path": "requirements.txt",
    "chars": 46,
    "preview": "easygui\r\nttkbootstrap\r\nopenai-whisper\nrequests"
  },
  {
    "path": "run.bat",
    "chars": 29,
    "preview": "venv\\Scripts\\python main.py\r\n"
  },
  {
    "path": "setup.bat",
    "chars": 787,
    "preview": "@echo off\r\ngoto :DOES_PYTHON_EXIST\r\n\r\n\r\n:DOES_PYTHON_EXIST\r\npython -V | find /v \"Python\" >NUL 2>NUL && (goto :PYTHON_DOE"
  },
  {
    "path": "update.bat",
    "chars": 42,
    "preview": "venv\\Scripts\\pip install -U openai-whisper"
  },
  {
    "path": "version.txt",
    "chars": 9,
    "preview": "ver 1.9.1"
  },
  {
    "path": "使用說明.txt",
    "chars": 73,
    "preview": "1.先打開setup.bat，此檔案會自動創建虛擬環境，以及安裝所需要的各種函式庫\r\n2.打開run.bat，開始使用\r\n3.在輸出資料夾找到檔案"
  }
]

About this extraction

This page contains the full source code of the ADT109119/WhisperGUI GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 10 files (14.2 KB), approximately 4.6k tokens, and a symbol index with 12 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
